Big XML files, REXML and learning about stream parsers

After taking the easy route and building some XML-checking test scripts using Ruby and REXML’s DOM access, I decided that I really didn’t want my computer grinding to a halt for a whole day while it parsed a gig and a half of XML. So it was time to try a streaming parser. Unfortunately, the REXML website seemed to be unavailable, which led me to this very nice tutorial on Jan Vereecken’s blog:

http://www.janvereecken.com/2007/4/11/event-driven-xml-parser-in-ruby

I’m pretty sure it’s nicer than the one on the REXML site, but I will have to wait and see.

Anyway, thanks Jan!
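
For anyone who hasn’t seen the event-driven style before, here is a minimal sketch of what it can look like with REXML’s stream listener. It is not taken from Jan’s tutorial or from my own scripts: the ElementCounter class and the count_elements helper are names I’ve made up for illustration. The point is that the parser calls back once per tag as it reads, so the whole document never has to sit in memory.

```ruby
require 'rexml/document'
require 'rexml/streamlistener'

# Hypothetical listener: counts how often each element name occurs.
class ElementCounter
  include REXML::StreamListener

  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  # Called by REXML once for every opening tag it encounters.
  def tag_start(name, _attributes)
    @counts[name] += 1
  end
end

# Hypothetical helper: source may be a String or an open File/IO.
def count_elements(source)
  listener = ElementCounter.new
  REXML::Document.parse_stream(source, listener)
  listener.counts
end
```

For a big file you would pass an open File rather than a String, for example File.open('big.xml') { |f| count_elements(f) }.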

Ruby, windows, command lines and problems

I’ve been building tools for web service testing using Ruby and its SOAP libraries. I hope to write more on this later, but for now, a pointer to a simple problem that took up far too much time.

My test toolkit has three small programs, each providing different services. The first can be passed a list of named test conditions. It queries the database and returns identifiers for data which matches the test condition of interest. This list of identifiers is dumped to a file. The file is passed to the oracle program as input, generating expected results for the items requested. The list of identifiers is also passed to the web service. The output of the oracle and the web service are in the same format, so it’s then a simple case of automatically comparing the two outputs as files, using diff.

I can also use these tools interactively to run ad-hoc queries on the database and web service, so these tools give me a nice interface for exploratory testing, as well as being able to automate and integrate with the build if need be.

The simple batch file to execute all tests looks something like this:

————
rem 1. Create the list of things to request from the web service
get_test_data_items > datalist.txt

rem 2. Generate expected results using the list created, and dump the output to a file
generate_expected_results < datalist.txt > expected.txt

rem 3. Query the web service for the list of items
query_webservice < datalist.txt > actual.txt

rem 4. Check that actual matches expected using the Unix diff utility
diff actual.txt expected.txt
————
Note that the first three commands are Ruby scripts, so Windows kindly lets me omit the ‘.rb’ extension.

It turns out that this is a bad thing. Letting Windows figure out the file association means that the command line fails to send the specified file as standard input to the Ruby script.

The telltale error message for this problem is this:

D:/test/query_webservice.rb:32:in `gets': Bad file descriptor (Errno::EBADF)
        from D:/test/query_webservice.rb:32

The simple workaround is to bypass the ruby file association and explicitly invoke ruby:

ruby get_test_data_items.rb > datalist.txt
ruby generate_expected_results.rb < datalist.txt > expected.txt
ruby query_webservice.rb < datalist.txt > actual.txt

Now it all works.

There’s a more detailed description of the problem here: http://mail.python.org/pipermail/python-bugs-list/2004-August/024920.html
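
For context, the failing call is just an ordinary read from standard input. The sketch below is a guess at the general shape of such a script, not the real query_webservice.rb: the read_identifiers helper and the one-identifier-per-line file format are assumptions for illustration.

```ruby
# Hypothetical helper: reads one identifier per line from the given IO.
# With the default argument it reads from standard input, which is where
# the Errno::EBADF above is raised when the script is started through the
# Windows file association instead of an explicit "ruby script.rb".
def read_identifiers(io = $stdin)
  identifiers = []
  while (line = io.gets)
    id = line.strip
    identifiers << id unless id.empty?  # skip blank lines
  end
  identifiers
end
```

Taking the IO as a parameter also means the same code can be exercised from a test with a StringIO instead of a redirected file.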

As a side note, I’m calling ‘diff’ from the GNU utilities for Win32 (http://unxutils.sourceforge.net/) package, a collection of Unix utilities to make the Windows command line a little friendlier. Laziness is what got me into this problem in the first place. In the spirit of laziness, I’ve also installed a bash shell for Windows. By configuring bash to keep a history of the last 5000 commands, I get automatic logging of my test activities as well.

Nifty…

Here’s a cool little tester-valuable link courtesy of Mike Kelly’s Rational Functional Tester tutorials:

http://labs.google.com/sets

Generate test data, or ideas if you’re stuck.

Sorry!

I thought I was done with ranting about automation tools for a while, but I couldn’t resist this quote from my former boss’ blog:

“Tools that let programmers create software by manipulating icons and graphics shapes on screen have a long and sometimes successful history… But these have generally served as layers of shortcuts on top of the same old text-based code, and sooner or later, to fix any really hard problems, the programmer would end up elbow-deep in that code anyway.”

There’s more good software-development food for thought in his full post, so why not check it out here?

More Joys of Vendorscript

Someone recently suggested to me that the selection of VBScript for an automation language is because it’s easier for testers. This is slightly rant-y, but I promise there are a few helpful tips in here just to make up for a mild dose of language-warring.

Exhibit A.

I can understand that there’s a chance that a tester (if they came from a business role) might have some familiarity with VBScript as a result of writing office macros, but really…

Which would you rather write (and read)?

Dim newList()
newListNextItem = 0

For i = 1 To UBound(list)
    ReDim Preserve newList(newListNextItem)
    newList(newListNextItem) = list(i).GetROProperty("text")
    newListNextItem = newListNextItem + 1
Next

OR

newList = []
list.each do |item|
  newList.push item.text
end

Of course, you can (and probably should) implement your own class to do something like the above. Below is how things might look if you do.

Set newList = New DynamicList
Do While list.hasNextItem
  newList.add(list.nextItem)
Loop

You will, however, quickly encounter the issue detailed in Exhibit C.

Exhibit B.

Is it easier to have to change the way that you assign a variable based upon the type of the thing being assigned?


a="Fred"
Set b=New Fred
c=a
Set c=b

OR


a="Fred"
b=Fred.new
c=a
c=b

The net effect of this is to have functions which need to be aware of exactly what kind of thing they are returning, then have them do this:

If isObject(a) then
   Set functionReturnValue = a
Else
   FunctionReturnValue = a
End if

Exhibit C.

We include a reference to file B from file A.

File A contains:


Set a = New a

File B contains:


Class A
End Class

For some reason this doesn’t work, so just to check, we try a different method to include our Class definition. Now we are told that the class already exists.

But wait, if you somehow thought to create a special factory function to give you an instance of the class, it all works.

So now we change File B to:

Function newA
   Set newA = New A
End Function

Class A
End Class

Now why didn’t I think of that?

Rant off. I’ll be working with different languages and tools for a while now… But do we really have to suffer this as testers? I’ve never spent so much time in a debugger, ever. Productivity tool? I leave that to you to decide.

And in case you didn’t read this the first time around, discover the joys of Visual Basic over here:

http://adamv.com/dev/articles/hatevbs/vbscript

Terry Horwath has also done some further investigation of the namespace issue (Exhibit C) over at SQA Forums.

On corporately mandated tools and vendorscripts…

I have been riding the ups and downs as I transfer my previous test automation framework learnings to one of the big vendor automation tools. I’d resisted criticism, but today I have to say, loudly, it sucks.

Rather than waste my energy, I simply direct you to read this (http://adamv.com/dev/articles/hatevbs/) (make sure you follow the link at the end). It refers to ASP, but I found almost all of this applies.

The joy of unproductivity tools…

Thinking tools

While looking for advice on improving my critical thinking, I came across this article -

http://www.csse.monash.edu.au/~ctwardy/Papers/reasonpaper.pdf

Interestingly, there are a bunch of tools listed here that claim to help with the technique of argument mapping. I haven’t had a chance to try any of these yet. Hopefully, encouraging others to check them out will yield feedback faster than trying them all on my own!

Links to experience reports would be most welcome.

The simple things in life…

Does your system accept real world data? Does it restrict the lengths of fields and/or prevent certain characters from being entered? How do you know when you are allowing the right kinds of data?

While chatting with colleagues about the NOTAG bug and some of the features of the system we are working on (it is vehicle related), it became clear that many testers never bother to confirm whether field validations on everyday data items are actually appropriate.

For instance, many systems in Australia are developed on the East coast of Australia. Now, I haven’t had to check this out for a while, and my records are buried in some old notes, but the important point is that in Australia’s two most populous states the longest license plate you can have is six characters.

Most systems I have tested that collected vehicle registration details happily assumed that this held true for the rest of the country as well. And for a while, I think this was true. Western Australia has, however, had nine-character license plates for at least a couple of years now. Some other states allow seven. I know of more than a few systems that probably aren’t going to accept your HIWAY2HEL plate.
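
To make the failure concrete, here is a small sketch of the kind of length check I mean. It is not code from any system I’ve tested. The constant names and the plate_accepted? helper are invented for illustration, and the limits are simply the figures quoted above from memory, so check them before relying on them.

```ruby
# Hypothetical limits, taken from the figures in this post (unverified).
MAX_PLATE_LENGTH_EAST_COAST = 6
MAX_PLATE_LENGTH_NATIONAL   = 9

# Hypothetical validator: ignores spaces and case, then checks the length.
def plate_accepted?(plate, max_length)
  normalised = plate.delete(' ').upcase
  !normalised.empty? && normalised.length <= max_length
end

# A system built around the six-character assumption turns away a
# legitimate nine-character plate.
plate_accepted?('HIWAY2HEL', MAX_PLATE_LENGTH_EAST_COAST)  # => false
plate_accepted?('HIWAY2HEL', MAX_PLATE_LENGTH_NATIONAL)    # => true
```

The interesting test isn’t whether the length check works, it’s whether the number it checks against was ever researched.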

So here’s my question: In systems that you are testing, how many of the everyday data fields have you researched to find out whether the specified constraints were actually appropriate?

I’m thinking of things like -

- A phone number. There must be at least a few people out there with global roaming.
- An address.
- An international postcode.
- A surname. On a credit card. Here’s your credit card Mr. Saravanalanganingham. Or not.
- An email address.
- A URL.

While you’re at it, it might be a good idea to check what characters are valid in that email address or URL before you plan testing for that validation feature. The days when I am not allowed to enter my .info email address are thankfully much fewer now.
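
My guess is that the .info problem usually comes from a pattern that assumes top-level domains are two or three letters long. The sketch below shows that assumption next to a looser check. Both patterns and their names are made up for illustration, and neither is the grammar from the RFC, which allows far more than either of them.

```ruby
# Hypothetical "common sense" pattern: assumes a 2-3 letter top-level domain.
NAIVE_EMAIL  = /\A[^@\s]+@[^@\s]+\.[a-z]{2,3}\z/i

# Hypothetical looser pattern: something@something.something, nothing more.
LOOSER_EMAIL = /\A[^@\s]+@[^@\s]+\.[^@\s.]+\z/

def accepted_by?(pattern, address)
  pattern.match?(address)
end

accepted_by?(NAIVE_EMAIL,  'me@example.com')   # => true
accepted_by?(NAIVE_EMAIL,  'me@example.info')  # => false, a real address rejected
accepted_by?(LOOSER_EMAIL, 'me@example.info')  # => true
```

Whichever pattern a system uses, the tester’s job is to ask where its rules came from and to try real addresses that break them.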

Google is usually the place to start when you’re looking for this kind of information, but there are other resources too. We also make assumptions about the behaviour of other things, and there are references for those too -

- The printed white pages and yellow pages can be a great source for finding real addresses and names, as can city street maps. The Australian white pages has a bunch of other handy references, including international dialling examples and a list of all postcodes.
- Speaking of postcodes, you can find a complete list of Australian postcodes and cities here.
- Curious to know what you might need to handle in that email or URL field? Read the RFC.
- Want to know who to blame for that website not working on your browser of choice? On another recent project, we couldn’t help but feel that keyboard navigation was a bit clunky on a select list control. There were also differences in IE and Firefox. It turned out that neither browser was actually behaving correctly. How did we know? We looked at the spec…www.w3c.org is your friend.

RFC.net is also a great resource for performance testers, or any time you need to work with the guts of the internet – HTTP and TCP/IP. On a recent project using AJAX, this was a vital resource for troubleshooting caching issues in our application. Mozilla/Firefox’s livehttpheaders extension was a great help too. It monitors the HTTP headers of webpages as they are received by your browser.

One final question: How many of you have been bitten when perfectly legitimate data couldn’t be accepted by a software system you needed to use? Care to share some stories?

Do you get annoyed with…

…tools for thinking workers which don’t work the way they think, and don’t help support thinking?

Which tools have you used that don’t leave you with this feeling?

Which tools make you feel this way?

My list-

- Mercury products I’ve used
- Rational Robot (for performance testing)
- Lots of programming languages. Korean makes sense to me. C# doesn’t. Or rather, it does, but I don’t understand why you would choose to build a new language that way. Not very poka-yoke.

It’s time for new tools.


About me

I'm Jared Quinert, a testing consultant located in Melbourne, Australia. With over fifteen years of experience, I specialise in agile testing, context-driven testing and intelligent toolsmithing with a focus on business outcomes over process. As one of the most experienced agile testers in Australia, I've been diving in hands-on since 2003 to discover how to build successful whole-team approaches to software development.