Automating actions on Windows — my first steps with AutoIt

I recently spent quite some time getting to know Playwright, a great tool for automating virtually any operation performed in a browser. Playwright is well suited to automated testing of web applications as well as to screen scraping (for data), tactical integration, RPA, prototyping and even customizing third-party applications for my personal use. I am now looking to extend my NodeJS/Playwright programs, which execute in the context of browser-based applications, with a complementary tool for automating tasks on Windows.

One thing I would like to be able to do is schedule my NodeJS/Playwright applications to be executed periodically and automatically. I have learned that Windows Task Scheduler can be used for this. Another thing I want to be able to do is take automated control of Windows applications. I cannot or do not want to program all actions in NodeJS. I want to be able to make automated use of the tools I use on Windows myself, for example to automate editing PowerPoint slides, Word documents or video files.

I learned about AutoIt, an automation tool for Windows GUI interactions. AutoHotkey is another option, a fork from AutoIt with better support for “hot keys” (or shortcuts). UI.Vision (fka Kantu) is yet another option. I decided to give AutoIt a try first.

Download and Install AutoIt

The downloaded file is an .exe file that can be run to perform the installation using a simple next-next-finish wizard.

After installation, these options are available:

On the file system, a directory with examples has been installed:

These scripts are ready to run. They also provide clear, concrete examples of what the AutoIt scripts look like.

A fairly extensive help system is installed as well, accessible in the AutoIt3/AutoItX directory:

Online documentation with the same contents is available at https://www.autoitscript.com/autoit3/docs/ .

First Steps

The tutorials in the help system help you create a few simple first scripts that, for example, display a message window, run Notepad to create a new file, or install WinZip, all fully automatically. The help system provides many more script snippets: from creating screen captures to creating, opening and manipulating Word and Excel documents, and from interactions with the file system to commands that play sound files. AutoIt also provides a set of functions to create your own GUIs, for gathering input from a user before running a script and for providing feedback during the execution of a script. These GUIs range from extremely simple to potentially quite sophisticated.
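To give an impression of the "extremely simple" end of that GUI range, here is a minimal sketch of an input window built with AutoIt's GUI functions. The window title, control sizes and variable names are made up for illustration and do not come from the tutorials.

```autoit
#include <GUIConstantsEx.au3>

; Create a small window with one input field and an OK button (sizes are arbitrary)
Local $hGui = GUICreate("Script input", 300, 100)
Local $idInput = GUICtrlCreateInput("", 10, 10, 280, 20)
Local $idOk = GUICtrlCreateButton("OK", 110, 50, 80, 25)
GUISetState(@SW_SHOW, $hGui)

; Message loop: wait until the user clicks OK or closes the window
While 1
    Switch GUIGetMsg()
        Case $GUI_EVENT_CLOSE
            ExitLoop
        Case $idOk
            ; Read the value typed by the user and echo it back
            MsgBox(0, "You typed", GUICtrlRead($idInput))
            ExitLoop
    EndSwitch
WEnd

GUIDelete($hGui)
```

A real script would typically use the value read from the input field to drive the rest of the automation instead of just showing it.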

I am still in my very early days with AutoIt. I have been able to create a few simple scripts that already do nice things, more than I was ever able to accomplish on Windows before. For example:

  • take a screenshot of a specific Window, create a new Word document and paste the screenshot into that document — just a few lines of code
  • open a screen recording application, have it point at the right Window and start recording; again, just a few lines of code

I can think of many more interesting things that combine desktop automation (with AutoIt) and browser automation (where Playwright would be my first option).

Create and Run a First Script — Create Hello World document with Notepad

So our first script should have a Hello World flavor, should it not?

Open the SciTE Script editor.

Type this script:

; Run the Notepad application
Run("notepad.exe")
; Wait for Notepad to become active - by waiting explicitly on a Window with the specified title
WinWaitActive("Untitled - Notepad")
; Now that the Notepad window is active type some text
Send("Hello World{ENTER}Live from Notepad{ENTER}")

Press F5 (or from the menu: Tools | Go) to execute the script. Here is the result (as expected):

To also save this file and close the application, add a few lines to the script:

; press Ctrl + Shift + S - the short cut for the File | Save As menu option
Send("^S")
; press ALT+N (!n) to activate the file name entry field, then type the file name
; and press ALT+S (!s) to activate the Save button; the ! causes the ALT key to be pressed
; lowercase letters are used after ! because an uppercase letter would add SHIFT to the combination
Send("!nhelloworld.txt!s")
Sleep(2000)
; close Notepad.exe - by closing the window; note: the window can no longer be accessed through its original title
; there are several ways to identify a window - the title is but one; however, when multiple Notepad windows are running, this way would not suffice
WinClose("[CLASS:Notepad]", "")

Note how the characters ^ (for CTRL) and ! (for ALT) are used to insert shortcut keys in calls to the Send() function. Whenever an application makes its controls (fields, buttons and so on) accessible through shortcut keys, that route is the easiest one for programmatic manipulation. Running this script saves the text as helloworld.txt and closes Notepad.
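As a small sketch of the Send() notation, the lines below show the modifier prefixes next to the ways of typing such a character literally. The typed text and key combinations are illustrative only, and the window title assumes an English-language classic Notepad, as in the script above.

```autoit
Run("notepad.exe")
WinWaitActive("Untitled - Notepad")

; Modifier prefixes: ^ = CTRL, ! = ALT, + = SHIFT, # = Windows key
Send("Hello{!}{ENTER}")   ; {!} types a literal exclamation mark, {ENTER} presses Enter
Send("Still here!", 1)    ; flag 1 = raw mode: every character is typed as-is
Send("^a")                ; CTRL+A selects all text
Send("+{LEFT}")           ; SHIFT+Left arrow adjusts the selection by one character
```

Raw mode is convenient when the text to type comes from elsewhere (a file, user input) and may contain ^, !, + or # characters.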

Note: It turns out that (of course!) VS Code also has extensions for AutoIt.

Script to take screenshot and paste into freshly created Word Document

  • take a screenshot of a specific window and save that screenshot to a file
  • create a new Word application object (run Word) and create a new document
  • add a picture from the file created from the screenshot to the Word document
  • remove the screenshot-image-file from the file system

The code that performs these actions is shown below:

#include <MsgBoxConstants.au3>
#include <Word.au3>
#include <ScreenCapture.au3>
Run("notepad.exe")
; Wait for Notepad to become active - by waiting explicitly on a Window with the specified title
Local $notepadHWnd = WinWaitActive("Untitled - Notepad")
Local $tempScreencaptureFileName = @MyDocumentsDir & "\screenshot.jpg"
; Capture full screen
_ScreenCapture_Capture($tempScreencaptureFileName )
; create a new Word application object
Local $oWord = _Word_Create()
; Add a new empty document
Local $oDoc = _Word_DocAdd($oWord)
; position ourselves at the end of the document (slightly redundant in a fresh, new and empty document)
Local $oRange = _Word_DocRangeSet($oDoc, -1, Default, 4, Default, 4)
; add screenshot from file into new Word document at the "range" location
_Word_DocPictureAdd($oDoc, $tempScreencaptureFileName , Default, Default, $oRange)
; remove the temporary screenshot file
FileDelete ( $tempScreencaptureFileName )
; inform the user about our success!
MsgBox($MB_SYSTEMMODAL, "Word UDF: _Word_DocAdd Example", "A new empty document has successfully been added.")

Hopefully the commands and their comments are self-explanatory.

The resulting Word document with the message box still showing:

There are several ways to take a screenshot of a more specific area. The function _ScreenCapture_Capture can be passed coordinates to specify a region: the left and top position of the rectangle followed by its right and bottom position, for example 0, 0, 200, 500. Alternatively, to capture only the contents of a specific window, the handle for that window can be passed to the function _ScreenCapture_CaptureWnd. A window handle is returned by several functions that wait for windows, such as WinWaitActive. The function WinGetHandle can also be used to look one up.
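A minimal sketch of both ideas, assuming a Notepad window is already open. The file name region.jpg is made up for this illustration.

```autoit
#include <ScreenCapture.au3>

; Capture the rectangle with top-left corner (0, 0) and bottom-right corner (200, 500)
_ScreenCapture_Capture(@MyDocumentsDir & "\region.jpg", 0, 0, 200, 500)

; Look up the handle of an already running Notepad window by its window class
Local $hWnd = WinGetHandle("[CLASS:Notepad]")
If @error Then
    ConsoleWrite("No Notepad window found" & @CRLF)
Else
    ConsoleWrite("Notepad window handle: " & $hWnd & @CRLF)
EndIf
```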

For example, to capture the contents of a running Notepad application, the code would be:

See how the function WinWaitActive — that waits for the Notepad application window — returns a handle. This handle is used in the call to _ScreenCapture_CaptureWnd to restrict the screen capture to only this window.
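A sketch of what this could look like, reusing the variable names from the earlier script. Only the capture step is shown, and the window title assumes an English-language classic Notepad.

```autoit
#include <ScreenCapture.au3>

Run("notepad.exe")
; WinWaitActive returns the handle of the window it waited for
Local $notepadHWnd = WinWaitActive("Untitled - Notepad")
Local $tempScreencaptureFileName = @MyDocumentsDir & "\screenshot.jpg"
; Capture only the Notepad window, identified by its handle
_ScreenCapture_CaptureWnd($tempScreencaptureFileName, $notepadHWnd)
```

The rest of the earlier script (creating the Word document and adding the picture) can stay as it is.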

The resulting Word document now becomes:

Identifying Windows using AutoIt Windows Info

AutoIt ships with a convenient utility, called AutoIt Windows Info:

This utility provides an overview of details of various types of components: windows, fields, buttons and other controls as well as the text displayed in the window (and useable in AutoIt to identify the right window). Simply click on the Finder Tool icon and drag to the window or button or field you are interested in. Then release the mouse button. The properties are refreshed as shown below. When you double click a property value, it is copied to the clipboard and can easily be pasted into the AutoIt script you are coding.

Here you can see the exact title of a window, and the various ways in which controls can be identified: by name but also by class, instance and ID, and finally by position. AutoIt Windows Info is very useful for quickly gathering the information you need to pinpoint the Windows objects you want to manipulate in the AutoIt scripts you create.
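As a sketch of how such properties can be used in a script, the example below addresses Notepad's text area by class and instance. It assumes the classic Notepad, whose text area is an Edit control; newer Notepad versions may report a different control class, so check the values in the info tool first.

```autoit
Run("notepad.exe")
; Keep the handle: it identifies this exact window even after its title changes
Local $hWnd = WinWaitActive("Untitled - Notepad")

; Address the control by class and instance, as reported by the info tool
Local $sControl = "[CLASS:Edit; INSTANCE:1]"

; Set the text of the control directly, without simulated keystrokes
ControlSetText($hWnd, "", $sControl, "Text set through the control itself")

; Read the text back from the same control
ConsoleWrite(ControlGetText($hWnd, "", $sControl) & @CRLF)
```

Working with the handle instead of the title or class also avoids ambiguity when several Notepad windows are open.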

I have also added another tool to my toolkit: MPos. This is a tiny tool that constantly displays the current mouse position. MPos is very convenient for determining the X and Y coordinates where AutoIt should click on a component when that component cannot easily be identified in a different way.
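A minimal sketch of clicking at a position found this way. The coordinates 400, 300 are hypothetical placeholders for values read from MPos.

```autoit
; Hypothetical coordinates, as read from MPos
Local $iX = 400, $iY = 300

; 1 = interpret coordinates as absolute screen coordinates (AutoIt's default)
Opt("MouseCoordMode", 1)

; Move to the position and perform a single left click there
MouseClick("left", $iX, $iY)

; Report where the mouse pointer ended up
Local $aPos = MouseGetPos()
ConsoleWrite("Mouse is at " & $aPos[0] & ", " & $aPos[1] & @CRLF)
```

Clicking by position is fragile when windows move or are resized, so it is best kept as a fallback for components that cannot be identified otherwise.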

Resources

Comparing AutoHotkey and AutoIt: https://ui.vision/blog/ahk-vs-autoit/ (also compares Kantu, now known as UI.Vision) and, from a long time ago, https://stackoverflow.com/questions/1686975/choosing-a-windows-automation-scripting-language-autoit-vs-autohotkey#:~:text=AutoHotkey%20includes%20a%20DLL%20file,all%20in%20its%20initial%20download.

Download AutoIt: https://www.autoitscript.com/site/autoit/downloads/

Online Documentation for AutoIt: https://www.autoitscript.com/autoit3/docs/

UI.Vision (fka Kantu) — https://ui.vision/rpa — OpenSource, limited number of scripts and number of actions per script

MPos — desktop tool for locating mouse cursor — https://sourceforge.net/projects/mpos/

Originally published at https://technology.amis.nl on January 27, 2021.

Lucas Jellema is solution architect and CTO at AMIS, The Netherlands. He is Oracle ACE Director, Groundbreaker Ambassador, JavaOne Rockstar and programmer.
