Limit Consumption of Brazil Nuts and WebDriver Tests for Optimum Health

End-To-End (E2E) Testing and Selenium Toxicity

Staying away from Brazil nuts is easy. However, web developers are at risk of consuming a dangerously high level of Selenium – the browser automation tool. For those of you not familiar with browser automation tools, they mimic a user navigating a web browser by clicking links, pressing keys, and waiting for pages to load.

Selenium is a powerful tool, and it can provide huge benefits for many projects. However, in my experience, over-reliance on browser-based test strategies and naive test implementations can harm the health of your build.

Among developers, blaming the Selenium framework for failing tests is easy, but it is our responsibility to consume the correct amount of Selenium for optimum health benefits. We can’t blame any framework (or Brazil nuts) when we choose to consume them irresponsibly.

Symptoms of Selenium Toxicity in E2E Testing

My primary complaints with E2E test suites:

They take too long to execute.
The tests are non-deterministic (individual tests will fail intermittently without any code changes).

Fortunately, strategies to improve runtime speed also help mitigate the risk of intermittent test failures.

Browser-based tests are relatively slow compared to most automated tests. The cause of this slowness is obvious: they require more orchestration than other kinds of tests. For example, the entire application must be running, and the WebDriver framework must launch a browser and simulate user interactions such as clicking buttons and typing text. This process leads to the following maxim:

The fewer interactions required, the faster the test execution becomes.

Let’s examine how we can get ourselves into trouble by applying this maxim naively with the following case study.

UI Test Automation Case Study – Gmail Settings

Imagine we work for the Gmail development team and want to perform test automation for the following settings page that allows users to add support for various keyboard layouts.

The steps required to access this page are:

Navigate to https://mail.google.com
Log in to the Gmail app
Click the Gear icon to open the Settings menu
Click the Settings menu item to navigate to the Settings page
Click the Edit Tools link to launch the Input Tools control shown in the above figure

Here are a few ways we can get ourselves into trouble:

Symptom 1 – The false economy of reusing login sessions and assumed start pages

The first step of every test is to log into the application. This step provides no value to our current test case, so we are tempted to skip it. The Selenium APIs are easy to use, but it is still tedious to write code to perform a login and page navigation. A common but naive approach is to identify a “first” test that will log in, while all subsequent tests assume that a valid user session already exists. Additionally, tests are further ordered so that the next test can begin on the page where the previous test ended.

Don’t do that. The above strategies introduce tightly coupled tests. These tests are brittle because a failure in one test can have a cascading effect and cause subsequent tests to fail. If one test fails before arriving at the desired end page, all future tests will fail because they only know how to run from their predetermined start page. Also, these tests rely on strict ordering, so they cannot be run in isolation or in parallel, even though parallel execution would likely yield far greater reductions in execution time. This kind of optimization sacrifices minutes and hours of execution time to save seconds.

Do this instead:

Ensure that your tests are independent and can be run in any order.
Perform login operations via the remote API instead of instrumenting the UI.
Bypass the authentication or authorization services for tests that don’t require these services.
Only drive the UI for the operations under test; perform navigation and create test data with the application’s API.
Enhance the API to make any of the above recommendations possible and robust.
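
A minimal sketch of the API-login idea, assuming the application exposes a session endpoint and accepts a session cookie. The SessionApi protocol, create_session, the cookie name "session", and the URLs are hypothetical. The get and add_cookie methods mirror the methods of the same name on Selenium's Python WebDriver, so a real driver could be passed where the sketch uses a stand-in.

```python
from typing import Protocol


class Browser(Protocol):
    """The two WebDriver methods this sketch relies on."""

    def get(self, url: str) -> None: ...
    def add_cookie(self, cookie_dict: dict) -> None: ...


class SessionApi(Protocol):
    """Hypothetical application API that issues a session token."""

    def create_session(self, user: str, password: str) -> str: ...


def open_logged_in(browser: Browser, api: SessionApi, base_url: str,
                   path: str, user: str, password: str) -> None:
    """Authenticate through the API, then open the page under test directly."""
    token = api.create_session(user, password)  # no login form is driven
    # Browsers only accept cookies for the domain currently loaded,
    # so visit the application once before setting the cookie.
    browser.get(base_url)
    browser.add_cookie({"name": "session", "value": token})  # assumed cookie name
    browser.get(base_url + path)


class RecordingBrowser:
    """Stand-in for a real driver so the sketch runs without a browser."""

    def __init__(self) -> None:
        self.visited = []
        self.cookies = []

    def get(self, url: str) -> None:
        self.visited.append(url)

    def add_cookie(self, cookie_dict: dict) -> None:
        self.cookies.append(cookie_dict)


class FakeSessionApi:
    """Stand-in for the hypothetical session endpoint."""

    def create_session(self, user: str, password: str) -> str:
        return f"token-for-{user}"
```

Because every test calls a helper like open_logged_in in its own setup, no test depends on a session or start page left behind by another test.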

Symptom 2 – Cascading wait statements

Browser-based tests spend much of their time waiting for page elements to become visible or enabled. Many user interactions trigger partial page updates, field lookups, validations, and fetching remote data. If these partial updates fail, browser-based frameworks will wait idly because they have no way of knowing that something went wrong. This waiting causes failing tests to take much longer than passing tests.

For the Input Tools page above, imagine that we want to test that a certain keyboard layout can be added to the top of the list. We’ll have to instruct Selenium to take the following steps:

Find and select the correct input format from the list element labelled All input tools
Find and click the button to add the item to the Selected input tools list
Find and select the newly added keyboard layout in the Selected input tools list
Find and click the up arrow repeatedly to move the item to the top of the list

Depending on the way the page is implemented, we may need to wait for the following events:

The contents of the All input tools list box must be loaded before we can find the desired layout
The selected keyboard layout must appear in the Selected input tools list before we can change its position in the list
The OK button must be enabled before we can click it

Naive tests may not check the status of each interaction to determine whether to continue with the test case. All actions that follow will wait for a condition that cannot be met and will time out once the maximum wait time has elapsed.

Don’t do that. Tests that time out take several times longer to execute than tests that pass or those that fail due to failed assertions. This can lead to unreasonably high execution times. I have worked with test suites that complete in 20 minutes when successful, but take 2-4 hours when unsuccessful. Developers will not run test suites that take several hours during feature development, so more bugs reach test environments, where they are more expensive and time-consuming to fix.

Do this instead:

Validate every remote operation and fail the test as soon as possible
Minimize the number of partial page refreshes for each test
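
The first recommendation can be sketched as a wait helper that checks for a known failure signal on every poll, rather than only for success. This is a standard-library sketch, not a Selenium API: wait_for, StepFailed, and the ready and failed callables are hypothetical names. In a real suite the callables would wrap element lookups, for example whether an error banner is displayed.

```python
import time
from typing import Callable


class StepFailed(AssertionError):
    """The page reported a failure, so further waiting cannot succeed."""


def wait_for(ready: Callable[[], bool], failed: Callable[[], bool],
             timeout: float = 5.0, poll: float = 0.05) -> None:
    """Return once ready() is true; raise at once if failed() is true."""
    deadline = time.monotonic() + timeout
    while True:
        if failed():
            # Fail fast: do not burn the remaining timeout on a lost cause.
            raise StepFailed("failure signal seen while waiting")
        if ready():
            return
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} s")
        time.sleep(poll)
```

With a helper of this shape, a failed partial update ends the test on the next poll instead of after the full timeout, and the raised exception stops the remaining steps from waiting in turn.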

Symptom 3 – Creating Test Data via the UI

Some test authors are tempted to create test data using the GUI, particularly if there is already code to validate the behaviour of data input. You may have heard the rationale that extra validation of the data entry pages comes for free with this approach.

Let’s assume we want to run a test where we switch keyboard layouts in Gmail. To do so, we need to select at least two input tools before the test can proceed. We may be tempted to use Selenium to navigate to the Input Tools menu and select a few input tools as part of our test setup.

Don’t do that. Each test should verify one and only one feature of a system. Bundling multiple steps together makes it more difficult to determine the actual cause of failure. Additionally, test cases that are coupled together cannot be split apart and run in parallel. This “optimization” is tight coupling in disguise and creates bottlenecks to performance gains in the future.

Do this instead:

Improve the application API to allow the creation of all necessary test data
Create test data during the setup phase of each test, ideally in a dedicated and ephemeral environment. How to isolate and load test data is beyond the scope of this post.
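
A sketch of per-test setup, assuming the application API can set a user's input tools directly. InMemorySettingsApi is a hypothetical stand-in for that API (a real suite would call the application over the network), and the layout names are illustrative. Each test gets a fresh API handle and a unique user, so tests share no state.

```python
import unittest
import uuid


class InMemorySettingsApi:
    """Hypothetical stand-in for the application's settings API."""

    def __init__(self):
        self._tools = {}

    def set_input_tools(self, user, tools):
        self._tools[user] = list(tools)

    def get_input_tools(self, user):
        return list(self._tools.get(user, []))

    def delete_user_settings(self, user):
        self._tools.pop(user, None)


class SwitchKeyboardLayoutTest(unittest.TestCase):
    def setUp(self):
        # Fresh API handle and a unique user per test: no shared state.
        self.api = InMemorySettingsApi()
        self.user = f"test-user-{uuid.uuid4().hex}"
        # Precondition created through the API, not by clicking through the UI.
        self.api.set_input_tools(self.user, ["English Dvorak", "French"])
        # Remove the data again even if the test fails.
        self.addCleanup(self.api.delete_user_settings, self.user)

    def test_two_layouts_are_available_to_switch_between(self):
        # A real test would now drive the browser only for the switch itself.
        self.assertEqual(self.api.get_input_tools(self.user),
                         ["English Dvorak", "French"])
```

Saved in a test module, the class can be run with python -m unittest. A failure in the input tools settings page no longer breaks this test, because the setup never touches that page.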

In Conclusion

By this point, our E2E tests run a little faster and more reliably. We have:

Restructured tests to remove unnecessary browser-based interactions
Bypassed authorization and authentication for tests that don’t require it
Created test data more directly instead of through GUI interactions
Removed any dependencies between tests so each test can be run in isolation

By removing all the GUI interactions listed above, we can bypass many sources of test failures and timeouts. Our tests run in a more predictable time frame and with more consistent results. Life is good.