Win32-Screenshot - From Cradle To …

Win32-Screenshot is a Ruby library for capturing screenshots on the Windows operating system. In this post i’m going to write about the history of this gem, since it just reached version 1.0.0 and i think it deserves some extra attention given the changes it has gone through.

Pre-Historic Times

The gem was originally created by Aslak Hellesøy, now a lead developer of the popular BDD library Cucumber. Let’s see when he made the first commit to the Win32-Screenshot project:

C:\win32screenshot>git log --format=medium | tail -n 5
commit bd172fcc53c9a5a2dc96d5c8b71a2595fb16741b
Author: Aslak Hellesøy 
Date:   Wed Nov 29 17:27:13 2006 +0000

    Importing

It’s over 4 years old! Judging by the commit message “Importing”, i’d say that the project started even earlier and was imported into Git in this way. Too bad that git-svn or something similar was not used to preserve the earlier history. The library was at version 0.0.1 at the time. Rails was already rocking at version 1.1.6.

Final Publicity

Browsing the Git history shows that version 0.0.2 was ready on December 2, 2006 and was also published as a gem on RubyGems. The API was quite simplistic, but also limited:
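A minimal sketch of how such a call might have been used. The `OldScreenshot` module and its dummy bytes are stand-ins invented for illustration, not the gem's real code; only the three-element return value is taken from the description in this post.

```ruby
require "tmpdir"

# Hypothetical stand-in for the 0.0.2 API. Only the shape of the return
# value ([width, height, bmp]) comes from the post; the rest is assumed.
module OldScreenshot
  def self.foreground
    width = 2
    height = 2
    # Fake bitmap bytes: a "BM" marker followed by zeroed pixel data.
    bmp = "BM".b + ("\x00".b * (width * height * 3))
    [width, height, bmp]
  end
end

# Every call hands back three values, wanted or not.
width, height, bmp = OldScreenshot.foreground

# The bitmap is a binary string, so File can write it out directly.
path = File.join(Dir.tmpdir, "foreground.bmp")
File.open(path, "wb") { |file| file.write(bmp) }
```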

There were only three simple methods in the public API - #foreground, #desktop and #window. All of them returned an array of three elements - width, height and bmp. The first two were obviously the width and height of the screenshot taken. The last one, bmp, was a blob of the bitmap in string format, which allowed it to be written to a file with Ruby’s File class as demonstrated above.

Inconveniences

If we think about the problems of that version’s API, then the main one for me was the three-element array returned from the methods. I mostly wanted just one of the returned values, and it was obviously the one holding the bitmap blob. So i had to do something like this:
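A rough sketch of the kind of workarounds this forced, again with a hypothetical `OldScreenshot` stand-in rather than the real gem. The three variants are assumptions based on the array return value and on the Array#[] and Array#last methods mentioned later in this post.

```ruby
# Hypothetical stand-in returning [width, height, bmp] like the old API did.
module OldScreenshot
  def self.foreground
    [2, 2, "BM-fake-bitmap-bytes"]
  end
end

# Variant 1: destructure and throw away the first two values.
_, _, bmp_a = OldScreenshot.foreground

# Variant 2: index straight into the returned array.
bmp_b = OldScreenshot.foreground[2]

# Variant 3: rely on the bitmap being the last element.
bmp_c = OldScreenshot.foreground.last
```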

I didn’t like any of the tricks above. Ruby is meant to be beautiful and this was just not it.

Usability Limitations

The limitations of the API might not have been so obvious, but i’m sure that anyone who used it eventually stumbled upon them. There was nothing wrong with the methods #foreground and #desktop since they did what you’d like them to do and there was no way that you’d like them to do anything else (well, maybe you’d have liked #desktop to take a screenshot of the actual Windows Desktop and not of the entire visible screen, but that is just how the Windows API itself works).

The #window method, on the other hand, only supported searching for windows by a regular expression, so if you wanted to take a screenshot of a window with an exact title you had to use something like /^exact title$/. And you couldn’t do much if you had multiple windows with the exact same title, since the title was the only thing you could search by. Maybe you got lucky and got the correct window each time. Maybe you didn’t.
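The title-matching problem can be shown in plain Ruby. The window titles below are made up, and the lookup is only an illustration of regular-expression matching, not the gem's actual search code.

```ruby
# Hypothetical window titles as they might appear on a desktop.
titles = ["Notepad", "Untitled - Notepad", "Notepad", "Notepad++"]

# A loose pattern matches every title that contains the word.
loose = titles.select { |title| title =~ /Notepad/ }

# Anchoring the pattern narrows it down to the exact title...
exact = titles.select { |title| title =~ /^Notepad$/ }

# ...but with duplicate titles there is still no way to say which window
# is meant; a first-match lookup simply picks whichever comes first.
first_match = titles.index { |title| title =~ /^Notepad$/ }
```

Note that in Ruby ^ and $ anchor at line boundaries, so \A and \z are the stricter choice for matching a whole string.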

There were also internal methods called #get_hwnd and #capture_hwnd, which allowed you to get the handle of a window and take a screenshot of it. The only problem was that #get_hwnd could search for windows only by a regular expression of the title, just like #window did. If you got the handle to your window from somewhere else then you had no problems capturing the correct window. Lucky you.

Technical Limitations

The main technical limitation of Win32-Screenshot 0.0.2 was that it used the Windows API via Ruby/DL. The problem is that DL is part of the Ruby standard library and isn’t compatible between different Ruby versions, which automatically meant that Win32-Screenshot was not usable across different versions of Ruby either. DL also caused occasional segmentation faults.

The other big technical limitation was the need to use the RMagick gem to save screenshots in any format other than the bulky bitmap (bmp). If you’ve used RMagick on Windows then you ought to know that installing it can be quite painful.

Allow me to fill you in on the details in case you’re not aware of the complexity of that procedure. You have to install ImageMagick. And not just any version will do: you have to install a very specific version of ImageMagick to make your experience a little more pleasant. You can get that specific version from the RMagick website. The provided 24 MB compressed file also includes the gem for that specific version, which you can install with the usual `gem install` command.

This all works as long as you’re using the Ruby 1.8.x MRI version. As soon as you’d like to start using 1.9.x, you’re pretty much on your own. You have to compile the RMagick gem yourself. You ought to know that this is not an easy task under Windows. Yes, you can do it with the help of DevKit, but it’s still far from the simplest thing you’ll do in a day or two.

Imagine how hard it is to start using the Win32-Screenshot gem to its full potential if every new user has to complete that process manually instead of just executing `gem install win32screenshot`.

In addition to the problems above, version 0.0.2 also had a small bug which made capturing fail for screenshots of certain sizes.

Birthday Party Which Got Cancelled?

I have considered the possibility that most of the problems described above didn’t have any easy solutions at that time because the libraries and tools which exist today were missing, but the biggest problem was that all of these problems persisted until May of 2010 when i started to clean things up. I guess Aslak just moved on to other platforms like OS X and had neither the time nor the will to make it any better.

Browsing further through the Git history reveals one interesting commit:

C:\win32screenshot>git log --format=medium c8e6f9 | head -n 5
commit c8e6f9e20b957e52d1ddf5d5e0416f7f04d7b12b
Author: Aslak Hellesøy 
Date:   Thu Jan 18 18:36:15 2007 +0000

    Releasing 0.0.3

But version 0.0.3 is nowhere to be found on RubyGems.

Pretty much the only difference in that version was a small change in the API - instead of returning a three-element array, a block had to be used for all the methods:
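A minimal sketch of what a block-based call of that shape might have looked like. `BlockScreenshot` is a hypothetical stand-in; the only thing taken from this post is that the same three values now arrived as block parameters instead of as a returned array.

```ruby
# Hypothetical stand-in: yields width, height and bmp instead of returning them.
module BlockScreenshot
  def self.foreground
    yield 2, 2, "BM-fake-bitmap-bytes"
  end
end

saved = nil

# All three block parameters have to be listed just to reach the bitmap,
# because it is the last value yielded.
BlockScreenshot.foreground do |width, height, bmp|
  saved = bmp
end
```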

I liked it at first, but now i’m thinking that it made the API even worse, because it was no longer possible to use the Array#[] and Array#last methods to avoid declaring variables for width and height. You just had to declare those variables every time. It was like having to pay your bills, only much more often. I’m pretty sure that you understand how uncomfortable that makes you feel.

Change Of Winds

If i recall correctly, i started using the library at the end of 2007. I didn’t know much about Ruby at the time so i just used it. I noticed that sometimes capturing a screenshot failed and other times a segmentation fault greeted me with a big smile, but it didn’t happen too often. I used the library occasionally over a period of two years before these things really started to bug me.

I started by looking for a newer version of the gem, but was disappointed since the latest version was still 0.0.2. At least RubyGems told me so. Then i found out that someone had offered a patch for the bug causing the erratic capture failures. I forked the Win32-Screenshot project, applied the patch and made a pull request to Aslak. Aslak didn’t want to accept the pull request, but offered to let me maintain the gem myself. I was surprised by that offer and took it. Looking back, it was the right thing to do in terms of reviving Win32-Screenshot. Aslak could have just ignored my pull request and things would probably be the same today.

Surfing On The Code

That essentially meant i could do whatever i liked with the library. I added the patch for the bug which caused screenshot capturing to fail and started replacing the fragile Test-Unit tests with more robust RSpec specs.

The next step was to replace Ruby/DL with something else because of those segmentation faults. At first i didn’t think that there would be any problems with using the win32-api gem. All usages of Ruby/DL were replaced one by one with win32-api until i realized that if i wanted to use Ruby 1.9 then this solution wouldn’t work either, since there wasn’t any pre-built win32-api gem for these versions of Ruby. At this point i discovered the wonderful ruby-ffi gem, which also promised to give me support for JRuby. What a bargain! Again, all usages of win32-api were replaced with ruby-ffi, and support for all MRI versions of Ruby and for JRuby was achieved.

I started to like how the code was getting more beautiful with each day and released version 0.0.4 on May 26, 2010. This version addressed the three critical show stoppers - failures when capturing screenshots, segmentation faults and incompatibility between different Ruby versions. It felt almost perfect. I called it “A Complete Overhaul”.

Highway To Better

After the release of version 0.0.4 i started to polish the code and add some additional methods to the API like #window_area and its friends, which made it possible to capture an area of the window specified by coordinates. Also, an exact window title could now be used by specifying a string instead of a regular expression. I also got some contributions from Roger Pack. I always love contributions, thank you!

From that point forward new versions were released quite often, until 0.0.8. At that moment the API was more or less the same as it was in 0.0.3 (the need to use blocks was still there), just with more methods in the arsenal. That included methods which no longer made sense in the context of “capturing a screenshot”. For example, there were some utility methods like #all_desktop_windows and #window_process_id, which to my mind just didn’t fit in that gem.

Cleaning Leaves

That’s when i realized that i wanted to create a completely separate library for searching for windows and retrieving information about them. I started development of RAutomation. I’ve written about that library in my previous post “Automating Windows and Their Controls With Ruby”.

Extermination Of Bad Bacteria

As written above, the bad dependencies of Win32-Screenshot included ImageMagick and RMagick. I tried to find a way to remove the need for dependencies which have to be installed separately from the gem. My research into possible options ended with the unfortunate news that there just weren’t any fast enough solutions for saving files in PNG format with pure Ruby. The next best idea was to use MiniMagick, which is a Ruby library wrapping ImageMagick’s executables for image manipulation and saving to different formats. So if i could bundle all the necessary ImageMagick libraries and binaries with the gem i’d be on the road.

You know what? I managed to do it! Starting from version 1.0.0 all necessary parts of ImageMagick are bundled with the gem, which means that you need neither ImageMagick nor RMagick installed to save screenshots in BMP, GIF, JPG or PNG format with Win32-Screenshot! Now you just have to have Ruby installed and you can just execute `gem install win32screenshot` and not worry about anything else. Awesome, isn’t it?

The New Beginning

Removing the ImageMagick and RMagick dependencies made me think that maybe, just maybe, i could make Win32-Screenshot even better.

The creation of RAutomation allowed me to delete a lot of code from Win32-Screenshot. For example, i was able to delete all the code related to searching for windows - “Screenshot capturing functionality shouldn’t know anything about how to find windows”, i thought. All that code, in a better form of course, ended up in RAutomation. I just had to set RAutomation as a dependency of Win32-Screenshot and start using it in the code.

Using RAutomation gave me the idea of changing the API of Win32-Screenshot quite drastically to make it better, at the cost of breaking backwards compatibility as a side effect. I started to tinker with the code, trying out different ways the new API could look. The result i came up with has only one method in the public API, whose parameters decide what to take a screenshot of, and an Image class which holds the bitmap data and can save it to the disk. Check out the following examples:
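A sketch of the general shape of such an API, assuming one entry-point method plus an image object that picks its output format from the file extension. `SketchScreenshot`, `SketchImage` and all their methods are hypothetical names made up for this illustration and are not the gem's real classes; no actual capturing or format conversion happens here.

```ruby
require "tmpdir"

# Hypothetical image class: holds bitmap data and knows how to save it.
class SketchImage
  # The four formats named in this post, keyed by file extension.
  FORMATS = { ".bmp" => :bmp, ".gif" => :gif, ".jpg" => :jpg, ".png" => :png }.freeze

  attr_reader :bitmap

  def initialize(bitmap)
    @bitmap = bitmap
  end

  # Derive the output format from the file extension alone.
  def format_for(path)
    FORMATS.fetch(File.extname(path).downcase) do
      raise ArgumentError, "unsupported image format: #{path}"
    end
  end

  # A real implementation would convert the bitmap to the chosen format;
  # this sketch only writes the raw bytes and reports the format it picked.
  def write(path)
    format = format_for(path)
    File.binwrite(path, bitmap)
    format
  end
end

# Hypothetical single entry point: the parameters decide what gets captured.
module SketchScreenshot
  def self.take(target, options = {})
    SketchImage.new("fake #{target} bitmap #{options.inspect}")
  end
end

image = SketchScreenshot.take(:window, title: /Notepad/)
format = image.write(File.join(Dir.tmpdir, "notepad.png"))
```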

The integration with RAutomation makes it possible to use the same nice API for specifying which window to take a screenshot of. I’ve also made it possible to pass a RAutomation window object directly as a parameter, which makes it all even more flexible without adding much complexity to Win32-Screenshot’s own code. Also, the #write method used in the examples above works as you might expect - if you give the image file a “.png” extension then it will be saved in PNG format. There is no need to state explicitly anywhere that you really want it saved in PNG format and not just with a “.png” extension.

From Cradle to Adulthood

There is this other popular saying, “From Cradle To Grave”, but this post is not about that at all, even if the title might have suggested it. This is about the birth of Win32-Screenshot and its journey to adulthood. It made it to version 1.0.0 on December 16th of 2010. Since it now follows the “Semantic Versioning” rules, i felt that the backwards-incompatible changes to its API called for increasing the major version number. There’s also a statement in the “Semantic Versioning” guidelines which made it just right to increase the version: “Version 1.0.0 defines the public API”. I’m declaring to the world that this is the public API and i’m really happy with the results code-wise and API-wise.

Thank you, Aslak, for making it all possible! Happy holidays with screenshottin’ and don’t hesitate to give me any feedback!

One last thing - guess what happens if the following code is executed right now on my machine:

Of course, a glimpse of The Awesomeness will be saved to my disk along with The Awesomeness itself. It’s almost as scary as running a VM inside another VM!

Screenshot Of The Awesomeness