High-Impact, Minimal-Effort Cross-Browser Testing


Cross-browser testing is time-consuming and laborious. This, in turn, makes it expensive and prone to human error… so, naturally, we want to do as little of it as possible.

This is not a statement we should be ashamed of. Developers are lazy by nature: adhering to the DRY principle, writing scripts to automate things we’d otherwise have to do by hand, making use of third-party libraries — being lazy is what makes us good developers.

The traditional approach to cross-browser testing doesn’t align well with these ideals. Either you make a half-hearted attempt at manual testing or you expend a lot of effort on doing it “properly”: testing in all of the major browsers used by your audience, gradually moving to older or more obscure browsers in order to say you’ve tested them.

In this article, I'll describe a testing strategy that I hope is not only less labour-intensive but also more effective at catching bugs: a realistic strategy, more relevant and valuable than simply "test ALL the things!", drawing upon my experience as a test engineer in the BBC's Visual Journalism Unit.

What’s Your Objective?

Before diving blindly into cross-browser testing, decide what you hope to get out of it. Cross-browser testing can be summarized as having two main objectives:

  1. Discovering bugs
    This entails trying to break your application to find bugs to fix.
  2. Sanity-checking
    This involves verifying that the majority of your audience receives the expected experience.

These two aims conflict with each other. On the one hand, I know that I can verify the experience of over 50% of our UK audience just by testing in Chrome (desktop), Chrome (Android) and Safari (iOS 8). On the other hand, if my objective is to find bugs, then I’ll want to throw my web app at the most problematic browsers that we have to actively support: in our case, Internet Explorer (IE) 8 and native Android Browser 2.

Users of these two browsers make up a dwindling percentage of our audience (currently around 1.8%), which makes testing in these browsers a poor use of our time if our objective is to sanity-check. But they’re great browsers to test in if you want to see how mangled your well-engineered application can become when thrown at an obscure rendering engine.

Traditional testing strategies understandably put more emphasis on testing in popular browsers, because if a bug affects 50% of your audience, you’ll want to know about it! However, a disproportionate number of bugs exist in the older browsers, which under the traditional testing model would not come to light until towards the end of testing. To make matters worse, fixing those obscure bugs will shake your confidence in the validity of your application in popular browsers, because you tested those browsers against an earlier version of your product; you now have to expend additional effort going back and reverifying that those browsers get the expected experience.

Both bug-finding and sanity-checking are important and are best tackled incrementally, in what I like to call the three-phase attack.

Three-Phase Attack

Imagine you’re in a war zone. You know that the bad guys are hunkered down in HQ on the other side of the city. At your disposal are a crack team of battle-hardened guerrillas and a large group of lightly armed local militia. You launch a three-phase attack to take back the city:

  1. Reconnaissance
    Send a spy into the enemy’s HQ to get a feel for where the bad guys might be hiding, how many of them there are and the general state of the battlefield.
  2. Raid
    Send your crack team right into the heart of enemy territory, eliminating the majority of bad guys in one fierce surprise attack.
  3. Clearance
    Send in the local militia to pick off the remaining baddies and secure the area.

You can bring that same military strategy and discipline to your cross-browser testing:

  1. Reconnaissance
    Conduct exploratory tests in a popular browser on a development machine. Get a feel for where the bugs might be hiding. Fix any bugs encountered.
  2. Raid
    Manually test in a small number of problematic browsers, which will likely reveal the most bugs. Fix the bugs encountered.
  3. Clearance
    Check that the most popular browsers amongst your audience get the expected experience (i.e. sanity-check).

Whether you’re on the battlefield or testing devices, the phases start off with minimal time invested and grow as the operation becomes more stable. You can do only so much reconnaissance — you should be able to spot some bugs in very little time. The raid is more intense and time-consuming but delivers worthwhile results and significantly stabilizes the battlefield (your application). The clearance phase is the most laborious of all, and you still need to keep your wits about you in case an unspotted baddie comes out of nowhere — but it is a necessary step to be able to confidently claim that the battlefield is now safe and that the general population can move back in.

The first two phases in our attack fulfill our first objective: discovering bugs. When you’re confident that the application is robust, you’ll want to move on to phase three: testing in the minimum number of browsers that match the majority of your audience’s browsing behaviors, fulfilling objective number two (sanity-checking). You can then say with quantitatively backed confidence that your application works for x% of your audience.

I do feel the need to emphasize that the bugs have to be fixed between phases. If you attempt to race through the steps and just document the bugs encountered, this strategy isn’t going to save you any time at all, for reasons that will become clear later in this post. Set aside a block of time (ideally, several days) to allow for discovering and fixing bugs across the phases; this is unlikely to fit into a single afternoon.

Set-Up: Know Your Enemy

Don’t enter into war lightly. Before testing, you’ll want to find out how users are accessing your content.

Find out your audience statistics (from Google Analytics or whatever tool you use), and get the data into a spreadsheet in a format that you can read and understand. It should be easy to order your spreadsheet by browser popularity. You’ll want to be able to see each browser and operating system combination, with an associated percentage of the total market share.

Simplify Your Browser Usage Statistics

In the Visual Journalism Unit, we immediately narrow down the list by removing every browser that makes up less than 0.05% of our audience (you can adjust this threshold according to your quantity of users). It just isn’t feasible to concentrate on anything with a lower market share. We then break our statistics down into three categories: desktop, portable (mobile and tablet), and in-app browsers.

As web developers, we don’t particularly care which OS the desktop browser is running on — it’s very rare that a browser bug will appear on one OS but not another. We also don’t particularly care whether someone is using Firefox 40 or Firefox 39: The differences between versions are negligible, and the updating of versions is free and often automatic. To facilitate our understanding of the browser statistics, we merge all desktop browser versions — except IE. We know that older versions of IE are both problematic and widespread, so we need to track their usage figures.

Desktop browsers

  • Chrome, Firefox, Safari, Opera, Edge: all versions merged
  • IE: versions tracked individually (IE 11, IE 10, IE 9, IE 8)

We merge all desktop browser versions except IE.
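To give a feel for what this merging looks like in practice, here is a rough sketch in JavaScript. The input shape ({ browser, version, share } rows), the threshold default and the function name are my own inventions for illustration; your analytics tool's export will look different.

    // Merge desktop browser versions (except IE) and drop anything below a
    // market-share threshold. The data shape here is hypothetical.
    function simplifyDesktopStats(rows, threshold) {
      threshold = threshold || 0.05;
      var merged = {};

      rows.forEach(function (row) {
        // Track IE versions individually; merge every other browser's versions.
        var key = row.browser === 'IE' ? 'IE ' + row.version : row.browser;
        merged[key] = (merged[key] || 0) + row.share;
      });

      return Object.keys(merged)
        .filter(function (key) { return merged[key] >= threshold; })
        .sort(function (a, b) { return merged[b] - merged[a]; })
        .map(function (key) { return { browser: key, share: merged[key] }; });
    }

    // Example with made-up figures:
    // simplifyDesktopStats([
    //   { browser: 'Chrome', version: '45', share: 20.1 },
    //   { browser: 'Chrome', version: '44', share: 5.3 },
    //   { browser: 'IE',     version: '8',  share: 1.2 },
    //   { browser: 'Opera',  version: '12', share: 0.03 }
    // ]);
    // → [ { browser: 'Chrome', share: 25.4 }, { browser: 'IE 8', share: 1.2 } ]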

A similar argument applies to portable OS browsers. We don’t particularly care about the version of mobile Chrome or Firefox, because these are regularly and easily updated — so, we merge the versions. But again, we care about the different versions of IE; so, we log their versions separately.

A device’s OS version is irrelevant if we’re talking about Android; what’s important is the version of the native Android Browser it is associated with. On the other hand, which version of iOS a device is running is very relevant because Safari versions are intrinsically linked to the OS. Then, there are a whole host of native browsers for other devices, whose market shares are significantly smaller than those of Apple and Android.

Portable browsers

  • Chrome, Firefox: all versions merged
  • Android Browser: versions tracked individually (4.*, 2.*, 1.*)
  • Safari on iOS: versions tracked individually (iOS 9, 8, 7, 6)
  • IE and Edge: versions tracked individually (Edge, IE 11, IE 10, IE 9)
  • Other native browsers: Opera Mini, Amazon Silk, BlackBerry browser, PlayBook browser

Finally, we have a new wave of browsers rapidly rising in popularity: in-app browsers, primarily implemented on social media platforms. This is still new ground for us, so we’re keen on listing all of the in-app browser platforms and their respective operating systems.

In-app browsers

  • Facebook for Android, Facebook for iPhone
  • Twitter for Android, Twitter for iPhone

When you’re done, your spreadsheet should look a little like this (ignore the “Priority” column for now — we’ll get to that later):

[Figure: The BBC Visual Journalism Unit’s UK browser usage statistics and priorities as of August 2015.]

You are now ready to embark upon the three-phase attack.

1. Reconnaissance: Find Browser-Agnostic Bugs

Long before you even think about whipping out a device to test on, do the easiest thing you possibly can: Open up your web app in your favorite browser. Unless you’re a complete masochist, this is likely to be Chrome or Firefox, both of which are stable and support modern features. The aim of this first stage is to find browser-agnostic bugs: bugs that are likely to affect many or all browsers — in other words, implementation errors.

Some of these browser-agnostic bugs might in themselves be the root cause of browser-specific bugs; so, by fixing the browser-agnostic bugs before we even begin cross-browser testing, we should be faced with fewer bugs overall. I like to call this the melting iceberg effect. We’re melting away the bugs hidden under the surface, saving us from crashing and drowning in the ocean — and we don’t even realize we’re doing it.

Below is a short list of things you can do in your development browser to discover browser-agnostic bugs:

  • Try resizing to view responsiveness. Was there a funky breakpoint anywhere?
  • Zoom in and out. Have the background positions of your image sprite been knocked askew?
  • See how the application behaves with JavaScript turned off. Do you still get the core content?
  • See how the application looks with CSS turned off. Do the semantics of the markup still make sense?
  • Try turning off both JavaScript and CSS. Are you getting an acceptable experience?
  • Try interacting with the application using only your keyboard. Is it possible to navigate and see all of the content?
  • Try throttling your connection and see how quickly the application loads. How big is the page load?
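Most of these checks are quickest to do by hand in your browser's developer tools, but the no-JavaScript and page-weight checks can be roughly scripted. Here is a small Node.js sketch; the URL and the "core content" strings are placeholders, and fetching the raw HTML is only an approximation of browsing with JavaScript disabled.

    // Fetch the raw HTML (no JavaScript executed), check that the core
    // content is present, and report the size of the HTML payload.
    var https = require('https');

    function checkCoreContent(url, mustContain) {
      https.get(url, function (res) {
        var bytes = 0;
        var body = '';
        res.on('data', function (chunk) {
          bytes += chunk.length;
          body += chunk;
        });
        res.on('end', function () {
          console.log('HTML payload: ' + (bytes / 1024).toFixed(1) + ' KB');
          var allPresent = mustContain.every(function (text) {
            return body.indexOf(text) !== -1;
          });
          console.log(allPresent
            ? 'Core content present without JavaScript'
            : 'Core content missing without JavaScript');
        });
      });
    }

    // Placeholder URL and content strings:
    checkCoreContent('https://example.com/my-app/', ['Headline', 'Key figure']);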

Before moving on to phase two, fix the bugs you’ve encountered. It should now be clear why: If we don’t fix the browser-agnostic bugs, we’ll only end up reporting a lot of browser-specific bugs later on. This acts as noise, obfuscating the root cause of the problem.

Be lazy. Fix the browser-agnostic bugs. Then you’ll be ready to launch the second phase of the attack.

2. Raid: Test In High-Risk Browsers First

When we fix bugs, we have to be careful that our fixes don’t introduce more bugs. This means we need to be smart about how we test.

Every change we make carries risk. Tweaking the CSS to fix the padding in Safari might break the padding in Firefox. Optimizing that bit of JavaScript to run more smoothly in Chrome might break it completely in IE. To be truly confident that new changes haven’t broken the experience in any of the browsers we’ve already tested in, we have to go back and test in those same browsers again. Developers are wary of cross-browser testing because whenever they make a fix, they end up right back at square one.

Consider the following table, where the cross icon means the browser has the bug.

|                     | Bug 1 | Bug 2 | Bug 3 | Bug 4 | Bug 5 |
| Low-risk browser    |       |   ✗   |       |       |       |
| Medium-risk browser |       |   ✗   |       |   ✗   |       |
| High-risk browser   |   ✗   |       |   ✗   |   ✗   |   ✗   |

Let’s say we’re to test our content in ascending order of risk: low-risk browser, medium-risk browser, then high-risk browser.

Testing the low-risk browser first, we’d find and fix bug number 2. When we move to the medium-risk browser, bug 2 is already fixed, but we discover a new bug: number 4. We change our code to fix the bug — but how can we be sure we haven’t now broken something in the low-risk browser? We can’t be completely sure, so we have to go back and test in that browser again to verify that everything still works as expected.

Now, we move on to the high-risk browser and find bugs 1, 3 and 5, requiring significant reworking to fix. Once these have been fixed, what do we have to do? Go back and test the medium- and low-risk browsers again. This is a lot of duplication of effort. We’ve had to test our three browsers a total of six times.

Now let’s consider what would have happened if we had tested our content in descending order of risk.

Right off the bat, we’d find bugs 1, 3, 4 and 5 in the high-risk browser. After fixing those bugs, we’d move straight on to the medium-risk browser and discover bug 2. As before, this fix may have indirectly broken something, so we need to go back to the high-risk browser and retest. Finally, we test in the low-risk browser and discover no new bugs. In this case, we’ve tested our three browsers on a total of four different occasions, which is a big reduction in the amount of time required to effectively discover and fix the same number of bugs and to validate the behavior of the same number of browsers.

Sometimes, we’re encouraged to test in the most popular browsers first, to quickly verify the experience of the majority of the audience. However, popular browsers are likely to be low-risk browsers. You know you have to support a given high-risk browser, so get that browser out of the way right at the beginning. Don’t waste effort testing browsers that are less likely to yield bugs, because when you switch to browsers that do yield more bugs, you’ll only have to go back to those low-risk browsers again.

IDENTIFYING PROBLEMATIC BROWSERS

So what is a high-risk browser? The answer is a little woolly and depends on the browser features your application makes use of. If your JavaScript uses Array.prototype.indexOf, it’ll break in IE 8. If your app uses position: fixed, you’ll want to check it in Safari on iOS 7.
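For the indexOf example, the defensive fix is to detect the missing feature and patch it rather than assume support. Here is a deliberately simplified sketch (the full polyfill on MDN handles negative start indexes and sparse arrays):

    // IE 8 has no Array.prototype.indexOf, so provide one if it's missing.
    // Simplified for illustration; it ignores negative fromIndex values.
    if (!Array.prototype.indexOf) {
      Array.prototype.indexOf = function (searchElement, fromIndex) {
        for (var i = fromIndex || 0; i < this.length; i++) {
          if (this[i] === searchElement) {
            return i;
          }
        }
        return -1;
      };
    }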

Can I Use is an invaluable resource and a good place to start, but this is really one of those areas that come from experience and developer intuition. If you roll out web apps on a regular basis, you’ll know which browsers throw up problems time and time again, and you can refine your testing strategy to accommodate this.

The brilliant thing about finding bugs in problematic browsers is that, quite often, they propagate. If there’s a bug in IE 9, chances are that it exists in IE 8. If something looks funky in Safari on iOS 7, it’ll probably look even funkier on iOS 6. Notice a pattern here? The older browsers tend to be the problematic ones. That should help you come up with a pretty good list of problematic browsers.

That being said, back these up with usage statistics. For example, IE 6 is a very problematic browser, but we don’t bother testing it because its total market share is too low. The time spent fixing IE 6-specific bugs would not be worth the effort for the small number of users whose experience would be improved.

Find your lowest common denominator. What’s the crappiest browser you have to support? What’s the minimum experience you’re expected to provide to that browser? Get the minimum experience working in the lowest common denominator, and you’ll go a long way to improving the experience in your other browsers.

This way of thinking isn’t relevant to all projects. For instance, if you have an experimental 3D WebGL canvas project with image fallbacks, many older browsers will just get the fallback image; if we do our raid tests in the older browsers, we likely won’t find many bugs. What we’d want to do instead is choose our problematic browser based on the application at hand. In this case, IE 9 might be a good problematic browser to choose because it is the first version of IE that supports the canvas element. Use your intuition and try to preempt bugs.

DIVERSIFY YOUR PROBLEMATIC BROWSERS

Browsers and browser versions are only one part of the equation: Hardware is a significant consideration, too. You’ll want to test your application in a variety of screen sizes and pixel densities, and switch between portrait and landscape modes.

Grouping together related browsers can be tempting because there is a perceived discount on the cost of effort. If you already have VirtualBox open to test IE 8, now might seem like a good time to test in IE 9, 10 and 11, too. However, if you’re in the early stages of testing your web app, you’ll want to fight this temptation and instead choose three browser-hardware combinations that are markedly different from one another, to get as much coverage over the total bug space as you possibly can.

As of October 2015, here are my problematic browsers of choice:

  • IE 8 on a Windows XP VM;
  • native Android Browser 2 on a mid-range Android tablet;
  • Safari on an iPhone 4 running iOS 6;
  • Opera Mini (only really worth testing with content that should work without JavaScript, such as datapics).

FIXING BUGS IN BAD BROWSERS MAKES CODE MORE RESILIENT IN GOOD BROWSERS

Often, you’ll find that the bugs that arise in these problematic browsers are the unexpected result of poor code on your part. You may have awkwardly styled a div to look like a button or hacked in a setTimeout before triggering some arbitrary behavior; better solutions exist for both of these. By fixing the bugs that are symptoms of bad code, you’ll likely be fixing bugs in other browsers before you even see them. There’s that melting iceberg effect again.

By checking for feature support, rather than assuming that a browser supports something, you’ll be fixing that painful bug in IE 8, but you’ll also be making your code more robust in other harsh environments. By providing that image fallback for Opera Mini, you’ll be promoting progressive enhancement and, as a byproduct, you’ll be improving your product even for users of browsers that cut the mustard. For example, a mobile device might lose its 3G connection with only half of your application’s assets downloaded: Now, the user will still get a meaningful experience where they wouldn’t have before.
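A concrete example of this approach is the "cuts the mustard" test popularised by BBC News: a handful of feature checks decide whether a browser gets the enhanced, JavaScript-driven experience or is left with the core HTML and CSS. The exact checks vary by project, and the function below is a hypothetical placeholder:

    // Core experience for everyone; enhanced experience only for browsers
    // that pass the feature checks ("cut the mustard").
    if ('querySelector' in document &&
        'localStorage' in window &&
        'addEventListener' in window) {
      // Modern-enough browser: load the JavaScript-driven enhancements.
      loadEnhancedExperience(); // hypothetical function defined elsewhere
    }
    // Older browsers (IE 8, Android Browser 2, Opera Mini and the like)
    // simply keep the core HTML and CSS experience.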

Take care, though: If you’re not careful, fixing bugs in obscure browsers can make your code worse. Resist the temptation to sniff the user-agent string to conditionally deliver content to specific browsers, for instance. That might fix the bug, but the practice is completely unsustainable. Don’t compromise the integrity of good code to support awkward browsers.

With the browser-specific bugs fixed, you’re ready for the final phase of the attack.

3. Clearance: Sanity-Checking

You’ve now tested your app in the harshest browsers you have to support, hopefully catching the majority of bugs. But your application isn’t bug-free yet. I’m constantly surprised by how differently even the latest versions of Chrome and Firefox render the same content.

It’s that old 80:20 rule. Figuratively, you’ve fixed 80% of the bugs after testing 20% of the browsers. Now, what you want to do is verify the experience of 80% of your audience by testing a different 20% of browsers.

PRIORITIZING THE BROWSERS

The simple and obvious approach now is to tackle each browser in descending order of market share. If Chrome desktop happens to be the browser with the highest share among your audience, followed by Safari on iOS 8, followed by IE 11, then testing in that order makes sense, right?

That’s largely a fair system, and I don’t want to overcomplicate this step if your resources are already stretched. However, the fact is that not all browsers are created equal. My team groups browsers according to a decision tree, which takes into account browser usage, ease of upgrading and whether the browser is the operating system’s default.

Up to now, your spreadsheet has had a column for the browsers and a column for their market shares. You now need a third column, designating the priority of each browser. Truth be told, this prioritization work should have been done before launching the three-phase attack, but it makes more sense to describe it here because the priorities aren’t really needed until the clearance phase.

Here is our decision tree:

[Figure: The BBC Visual Journalism Unit’s testing priority decision tree.]

We’ve designed our decision tree so that priority 1 browsers (P1) cover approximately 70% of our audience. P1 and P2 browsers combined cover approximately 90% of our audience. Finally, P1, P2 and P3 browsers give us almost complete audience coverage. We aim to test in all of the P1 browsers, followed by P2, followed by P3, in descending order of market share.
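If you don't have a decision tree of your own yet, a crude first pass is to assign priorities from cumulative market share alone. The sketch below does exactly that and nothing more; it deliberately ignores the ease-of-upgrading and default-browser criteria that our real tree takes into account, so treat it as a starting point rather than a faithful reimplementation.

    // Assign P1 until roughly 70% of the audience is covered, P2 until
    // roughly 90%, and P3 to the rest. Input rows are { name, share } pairs.
    function assignPriorities(browsers) {
      var sorted = browsers.slice().sort(function (a, b) {
        return b.share - a.share;
      });
      var cumulative = 0;

      return sorted.map(function (browser) {
        var coveredSoFar = cumulative;
        cumulative += browser.share;
        var priority = coveredSoFar < 70 ? 'P1' : coveredSoFar < 90 ? 'P2' : 'P3';
        return { name: browser.name, share: browser.share, priority: priority };
      });
    }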

As you can see in the spreadsheet earlier in this post, we have just five P1 browsers. The fact that we can verify the experience of over 70% of our audience so quickly means we have little excuse not to retest in those browsers if the code base changes. As we move down to the P2 and P3 browsers, we have to expend ever-increasing effort to verify the experience of an ever-diminishing audience size; so, we have to set more realistic testing ideals for the lower-priority browsers. Here is a guideline:

  • P1
    We must sanity-check in these browsers before signing off on the application. If we make small changes to our code, then we should sanity-check in these browsers again.
  • P2
    We should sanity-check in these browsers before signing off on the application. If we make large changes to our code, then we should sanity-check in these browsers again.
  • P3
    We should sanity-check in these browsers once, but only if we have time.
  • P0
    These are the problematic browsers used in the raid phase of the attack. They are no longer applicable in our testing.

Don’t forget about the importance of diversifying your hardware. If you’re able to test on a multitude of screen sizes and on devices with varying hardware capabilities while following this list, then do so.

Summary: The Three-Phase Attack

[Figure: Three-phase attack overview.]

Once you’ve put in the effort of knowing your enemy (i.e. simplifying your audience statistics and grouping browsers into priorities), you’re able to attack in three steps:

  1. Reconnaissance
    Conduct exploratory testing in your favorite browser, to find browser-agnostic bugs.
  2. Raid
    Test in your most problematic supported browsers on a variety of hardware, to find the majority of bugs.
  3. Clearance
    Verify the experience on the most widely used and strategically important browsers, to say with quantitatively backed confidence that your application works.