Automating Flash, AJAX, Popups and more using Ruby, Watir and Sikuli

Jonathan Kohl pointed me at Sikuli, a Python-based tool for automating applications using image recognition.  Unlike most tools, which attempt to identify objects via public APIs, Sikuli looks at the pixels on the screen and attempts to identify objects based on how they look.

This isn’t exactly a new approach, as commercial tools have had this feature for a long time as a means of creating custom objects.  So after playing with Sikuli, I wondered whether I could take advantage of it as a library to augment my Watir scripts in Ruby.

It turns out the answer is ‘yes’, with a caveat.  You need to use JRuby (you could probably do it in standard Ruby using the Ruby-Java bridge, but that looked a lot harder), and you also need to use Watir-Webdriver, a new implementation of Watir’s API built on WebDriver, which is used in a number of other automation frameworks.

Below is a simple example script and instructions to get you started.  It navigates to a website, then clicks on the Flash control there.  So far I’ve only tested this on Windows.  It should work on OSX and Linux, but perhaps not quite so easily (I’m waiting on some feedback).  Check the ‘Install OpenCV’ instructions on the page that describes calling Sikuli from other tools.

Sikuli can be used to automate Flash components and any challenging AJAX elements of your web application, to dismiss pop-ups, or probably even to inspect visual elements of the page (though I’d want to do this minimally).  It’s a little slow, but it is an interesting and immediately useful add-on to Watir or your favourite Java-based testing tool.

# Install Java, or install the JRuby/JRE bundle at the next step
# Install JRuby 1.5.1
# Install Sikuli
# Install watir-webdriver (e.g. jgem install watir-webdriver)
# Add OpenSSL support: jgem install jruby-openssl
# Copy sikuli-script.jar to \jruby-1.5.1\lib
# Get the test image: download it and put it in the image folder as below
# See "How to use Sikuli Script in your JAVA programs" for examples

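require 'rubygems'
require 'watir-webdriver'
require 'java'

# Sikuli classes from sikuli-script.jar (copied into JRuby's lib folder above)
java_import "org.sikuli.script.SikuliScript"
java_import "org.sikuli.script.Region"
java_import "org.sikuli.script.Screen"

# start_page must hold the URL of the page under test
$browser = Watir::Browser.new :ie
$browser.goto start_page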
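
The script above stops once the page has loaded, so here is a rough sketch of how the click on the Flash control might look. It is not the original code. It assumes the Sikuli `Screen` object responds to `exists(image, timeout)` and `click(image, modifiers)`, which is my understanding of the Sikuli Java API. The helper name `click_image`, the `FakeScreen` stand-in and the image path are all hypothetical, and the stand-in is only there so the sketch runs without Sikuli installed.

```ruby
# Sketch only: click an on-screen image through a Sikuli-like object.
# `screen` is anything that responds to exists(image, timeout) and
# click(image, modifiers). Under JRuby this would be Screen.new (assumption).

# Hypothetical helper: returns true if the image was found and clicked.
def click_image(screen, image, timeout = 10)
  match = screen.exists(image, timeout) # nil when the image never appears
  return false if match.nil?

  screen.click(image, 0) # 0 = no modifier keys held down
  true
end

# Stand-in for Sikuli's Screen so the sketch runs under plain Ruby.
class FakeScreen
  attr_reader :clicked

  def initialize(visible_images)
    @visible_images = visible_images
    @clicked = []
  end

  # Mimics exists: returns something truthy if the image is "on screen".
  def exists(image, _timeout)
    @visible_images.include?(image) ? image : nil
  end

  # Mimics click: records which image was clicked.
  def click(image, _modifiers)
    @clicked << image
    1
  end
end

# Hypothetical image path; use the test image you saved earlier.
screen = FakeScreen.new(['images/flash_button.png'])
puts click_image(screen, 'images/flash_button.png')
puts click_image(screen, 'images/missing.png')
```

With the real library you would pass `Screen.new` instead of the stand-in. Checking `exists` first means a missing image returns false rather than raising, which is handy when a pop-up only appears some of the time.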


9 comments on “Automating Flash, AJAX, Popups and more using Ruby, Watir and Sikuli”

  1. silvere says:

    very good article.
    Any idea on how to build a testing framework on sikuli+web driver+mysqldb+jython?


  2. Jared says:

    Are you offering me a job? 🙂

    I’m not much of a Python/Jython expert, though I’d imagine the approach is going to be quite similar to the JRuby one (ie. drop the Sikuli jar file somewhere that Jython/Java can find it, then require the appropriate Sikuli packages). I was really just proving a concept with this experiment. I’m also not sure what Python/Jython database libraries are out there.

    If you have a more specific question, feel free to contact me directly about the problem you’re trying to solve. I will collect some thoughts on growing automation as well, but can’t promise that I’ll have that post ready any time soon.

  3. Daniel says:

    Would you mind explaining why one should use Watir-Webdriver over regular Watir?

    Thanks for a very informative post.

  4. Jared says:

    I think at the moment, unless you want to run your scripts under JRuby, then there’s probably not a compelling reason, given that the bindings to webdriver for Ruby look a bit underdone at this point. My understanding is that Watir 2.0 is probably going to be built on webdriver.

    JRuby/Watir-Webdriver/Sikuli is an interesting combination for now in that it extends Ruby’s automation capabilities a lot, as well as solving some problems in a much simpler way. This would be the main reason for me to consider watir-webdriver at this point, as I don’t have the time to look at the selenium-server stuff.

  5. Daniel says:

    Thank you for responding to my question. I agree that the JRuby/Watir-Webdriver/Sikuli combination is really interesting, and a beautiful example of combining libraries written in and/or accessible from Ruby/JRuby, Java, C++ and Jython. I can see myself dropping into Sikuli for testing web app features that, while perhaps supported by Watir, require an inordinate amount of effort to implement.

    I was initially uneasy because of the variety of different “Watirs” under development (official Watir, vapir which Ethan works very hard on, watir-webdriver which, as you mentioned, seems slated to become Watir 2.0, plus headless Celerity), which is what prompted my question. The mainstream Watir framework suits my current needs wonderfully, so your answer is further validation for me to keep coding and stop looking about for the time being 🙂

  6. Jared says:

    I notice that a version in French beat me to the Sikuli/JRuby punch (but not the Watir combination).

    It provides some more examples anyway, particularly using the script part of the library to start a browser (and the fact that it’s on the Mac).

  7. Chuckvdl says:

    FYI: when installing this on 10/21/11 I ran into an issue getting watir-webdriver installed, due to a dependency of selenium-webdriver on v1.0.9 of FFI (it pulled in 1.0.10, which was apparently too new). I had to use the -v switch on jgem install to manually pull down v1.0.9 of FFI first, and then was able to install the watir-webdriver stuff.

  8. David says:

    Just wanted to mention for those looking to have a test framework to automate something like

    sikuli+web driver (or just watir)+mysqldb+jython (or jruby, or other platform)

    one option to look at is Robot Framework. That framework is Jython/Python based, but supports a remote library interface to work with test libraries (i.e. modules) built on other platforms. And there are already some prebuilt libraries (in varying stages of maturity) for it that are useful:

    watir, Selenium, Selenium2, Sikuli, database access libraries (in Python or Java/Jython)

    I also built a wrapper library for Sikuli that can be used via the command line, integrated with Robot Framework or other tools (over XML-RPC), or called from Java as an API.

  9. DavidTao says:

    We built our test automation libraries with Sikuli on Jython (Python + Java) 20 months ago. It took a lot of work and sweat. Our Jython expert spent almost 3 months designing and creating the whole test library and integrating it with Robot Framework.
    Sikuli is an excellent tool. It tests the UI the way a human views it, especially for web, Flash, image and moving-graphics applications.
    But Python alone is not enough, unless you call a Python image library.
    The best way is to use Jython to call Java image classes for images of text, because Sikuli is not good at recognizing text images from the screen.

    Screen resolution is a big problem: you have to use the same resolution in the development and run environments.
