Hi Folks, it's me again. You know, Matt. I'm a software tester.
And I test through the GUI.
No - wait - stop. Don't spill that soda, and don't slam your fist into the keyboard.
Yes, I've heard about another approach that goes behind the GUI. I know, I could use a table-based tool like FitNesse, and, yes, I even agree that the GUI layer is brittle. With that approach, I might just poke at each feature (or new story) once to make sure the message is passed from the GUI layer to the server, then keep the layer so thin there isn't room for future regression problems.
Yet I choose to do otherwise - both writing 'automated tests' to drive a browser and doing extensive exploratory testing at the GUI level. Here are a few reasons why:
1) Testing Eye Candy. Consider, for example, wikipages in Socialtext. One of the features is 'watch'; it's a button I can click on any given page, and I'll get emails when the page changes. I can also bring up a list of my watched pages and sort by last modified date, and so on. So here's what watching a page looks like. For example, consider this page header:
See the watch button? Let's click it.
When I click the watch button, two things are supposed to happen. First, "watch" should change to "stop watching." Second, the star should turn gold.
Now, when I started this blog post, I was going to say that one common point of failure is the JavaScript itself. So we could write a FitNesse test that says "when the watch on/off toggle is flipped in business logic, the watched-ness changes in permanent storage. On reload, the watched-ness persists." But that doesn't cover the case where the JavaScript gets busted and the dang text values just don't change, or the star doesn't change to gold or back to grey.
Except, of course, I found that in this case, the star doesn't change to gold. It's a bug. Whoopsie. Glad I tested in the GUI.
That's actually what happened folks; the feature is incorrect in production, and I just filed three bugs around capitalization, stars and persistence. Only one of them was a production issue; I caught the rest in development.
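To make that gap concrete, here is a minimal sketch in Python. It is not Socialtext code; `WatchStore`, `render_header` and the two check functions are hypothetical names invented for illustration, and the rendering bug is planted on purpose. It only shows how a behind-the-GUI check can pass while the thing on the screen is still wrong.

```python
# Hypothetical model for illustration only; none of these names come from a real product.

class WatchStore:
    """Stands in for the business logic plus permanent storage."""

    def __init__(self):
        self._watched = set()

    def toggle(self, user, page):
        key = (user, page)
        if key in self._watched:
            self._watched.remove(key)
        else:
            self._watched.add(key)

    def is_watched(self, user, page):
        return (user, page) in self._watched


def render_header(store, user, page):
    """Stands in for the GUI layer. Deliberately buggy: the star never turns gold."""
    watched = store.is_watched(user, page)
    return {
        "label": "stop watching" if watched else "watch",
        "star": "grey",  # planted bug: should be "gold" when the page is watched
    }


def below_gui_check(store, user, page):
    """What a table-driven, behind-the-GUI test can observe."""
    store.toggle(user, page)
    return store.is_watched(user, page)


def gui_check(store, user, page):
    """What a test (or a person) looking at the rendered header can observe."""
    header = render_header(store, user, page)
    expected = {"label": "stop watching", "star": "gold"}
    return [name for name in expected if header[name] != expected[name]]


store = WatchStore()
logic_ok = below_gui_check(store, "matt", "Home")
gui_failures = gui_check(store, "matt", "Home")
print("business logic ok:", logic_ok)
print("GUI failures:", gui_failures)
```

The storage-level check reports success, and only the check against the rendered header notices the star.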
2) Some defects are just plain rendering defects. Consider a textbox that takes an input and re-displays it -- say your status in Facebook. Now what happens if you type in an extremely long status - or one with no spaces in it? It's possible the browser doesn't know how to break the line, and the text overflows the 'natural' boundaries in the margins. This is the kind of bug a human will see, and, sometimes, it occurs by regression. Now testing for an extensive list of visual bugs each time gets expensive (times 5 browsers), so at Socialtext we record a test script, then play it back while a human watches. The result is testing at 4x or 5x the speed of what a human typing would be.
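The inputs for that kind of check are easy to generate. Below is a small Python sketch, assuming a text box roughly `width` characters wide; the function name, the labels and the example URL are all made up for illustration and are not part of any real tool. It builds the awkward strings; a recorded script would type them in, and a human would still do the looking.

```python
def rendering_stress_inputs(width=80):
    """Return labeled strings that tend to expose line-breaking bugs.

    `width` is an assumed rough column width for the text box under test.
    """
    return {
        # Very long, but with spaces the browser can break on.
        "long_with_spaces": " ".join(["status"] * (width * 2)),
        # Very long with no break opportunities at all.
        "long_no_spaces": "x" * (width * 10),
        # Ordinary sentence with one unbreakable word in the middle.
        "long_word_in_sentence": "I am " + "very" * width + " tired",
        # URL-shaped text, a common real-world source of overflow.
        "long_url_like": "http://example.com/" + "a" * (width * 5),
    }


for label, text in rendering_stress_inputs().items():
    has_space = any(ch.isspace() for ch in text)
    print(label, len(text), "has spaces" if has_space else "no spaces")
```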
3) The browsers behave differently. Because the web has no reference implementation, each browser company must interpret the standard itself. That each web browser has a different interpretation of JavaScript means that code that works under IE6 might not work under Firefox 3 - and a trivial change a dev makes and tests under FF3 might just break IE, Safari, Chrome or Opera. (This is basically an argument that problem #2 is worse than you thought it would be.)
The traditional answer to this is "ha ha! Just make the GUI super-thin, and don't mess with that fancy-schmancy JavaScript - or test with jsUnit!" Well, we'll have to have some JavaScript to do things like flip the watched-ness in real time without submitting the page, or to have a "rich text" interface where what the user sees is what they get. And those jsUnit-style tools tend to have problems, like being bound to a specific browser variant. The tools may mature, but you know what? To paraphrase Bob Martin, "end to end testing is farther than you think."
4) Testing brand-new technologies. Consider Enterprise JavaBeans in 1999, the web in 2002, Flash or Silverlight today. At each step we've got some new, wonderful technology enabler that does lots of neat GUI stuff. Automated tools for those technologies will probably come ... eventually. And yes, we could wrap ourselves around the axle trying to automate the technologies, or we could just test it.
When a process decision results in faster time to market and fewer defects (because recording and running an individual test takes less time), I am reluctant to call that a problem.
Conclusions
All this started out because of one little tweet by "Uncle" Bob Martin: "This is 2010. Why are programmers still testing through the GUI?" My friend, Dawn Cannan, pointed out to me that the post was asking why /programmers/ were testing through the GUI. Her assumption being that maybe UncleBob makes a distinction between programmers and testers -- and maybe that's a fair point.
Perhaps I was responding to a statement UncleBob wasn't making at all.
Which, when I think about it, is pretty ironic. You see I wanted to end this post with my big concern - that folks asserting that GUI testing isn't necessary are sort of like doctors prescribing without examining the patient. They are making broad, general claims when every situation might be different.
I've just provided some examples where a specific niche of web-based applications might want to be tested through the GUI, but there are lots of different kinds of applications. Some batch-oriented applications have no GUI, and some Create Read Update Delete database-backed apps might do just fine with the 'poking' strategy I outlined above.
My assertion is that what we actually need to do is to learn the domain, study where the bugs come from, and construct strategies that are the most effective for our current problem. For our work at Socialtext, I do find that GUI testing, below-the-GUI testing (through REST) for business-logic operations, and code-level developer tests all contribute to a balanced approach to risk management in testing.
So hmm. A balanced breakfast.
Perhaps Mr. Martin wasn't suggesting that testers avoid the GUI at all, even for regression testing, but I certainly have heard that voice in the past. It's likely that UncleBob meant "Programmers, why do you persist in writing automated tests that drive the GUI?", but I've got answers to that question.
It's possible he even means "(to the exclusion of other tests like TDD, business-logic tests, and exploratory testing)."
That I agree with.
It's tough that Twitter limits responses to 140 characters, so I wrote out a real explanation that, I hope, does a serious job of answering questions about GUI testing.
There are good people, doing good work, who simply have a different perspective on testing and test automation. I hope I've addressed some of their concerns, and this post could be a beginning of a discussion. Or, in other words, when people ask "why do you x?", I believe the best reply is:
Come. Let us reason together.