Overcoming UI Automation Challenges

UI automation isn’t an easy task, as any UI feature test owner responsible for automating their area can tell you. The two biggest challenges are getting more test coverage and higher test reliability out of the automation. Let's discuss these challenges in more detail.

From the test coverage perspective:

Each automated test consists of two parts: the code to drive the product and the code to validate the test result. Many product teams have a coverage goal of 70% or higher. Without help from developers' unit tests, it is actually quite difficult to reach this goal with UI automation tests alone, for three main reasons: test hooks, result validation, and cost/complexity.

1) Some UI controls can't be seen (or found) by the UI automation framework being used, either because of technical limitations in the framework or because accessibility support for those controls is not implemented properly in the product code. Either way, this often prevents common tests from being automated until the problem is resolved.

2) Test result validation is often not easy, especially for UI tests or UI changes that normally require manual verification from testers. Many teams try to solve this problem by developing tools or using techniques such as screenshot comparison and video comparison (a minimal screenshot-comparison sketch appears after this list). But these tools can't fully replace human testers, because they tend to produce higher rates of false positive or false negative results in automated tests. As a result, many UI tests like these simply won't be automated.

3) Complex UI test scenarios involve more steps, which adds cost to the automation effort and can introduce more instability into the tests. From an efficiency perspective, it is sometimes much cheaper to run such tests manually than to spend time automating them, especially in a short product cycle. In addition, many error messages are difficult to trigger even manually, which makes them even harder to automate. For these reasons, we are often unable to achieve the test coverage we want from automated tests alone.
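To make point 2 above more concrete, here is a minimal screenshot-comparison sketch in Python using the Pillow library. The file names and the pixel-count threshold are assumptions for illustration only; in practice the threshold has to be tuned per scenario, which is exactly why this kind of check produces false positives and false negatives.

```python
# Minimal screenshot-comparison sketch using Pillow.
# File names and the threshold below are hypothetical.
from PIL import Image, ImageChops

def screenshots_match(baseline_path, actual_path, max_diff_pixels=50):
    baseline = Image.open(baseline_path).convert("RGB")
    actual = Image.open(actual_path).convert("RGB")

    if baseline.size != actual.size:
        return False  # a different window size already counts as a mismatch

    # Pixel-wise difference; getbbox() is None when the images are identical.
    diff = ImageChops.difference(baseline, actual)
    if diff.getbbox() is None:
        return True

    # Count pixels that differ in any channel and compare to the threshold.
    differing = sum(1 for px in diff.getdata() if px != (0, 0, 0))
    return differing <= max_diff_pixels

# Example (hypothetical files):
# assert screenshots_match("expected_dialog.png", "actual_dialog.png")
```

Anti-aliasing, font rendering, and theme differences all shift that threshold, which is why a human tester still outperforms this kind of check for many UI changes.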

From the pass rate (reliability) perspective:

Ideally, tests fail only when the product breaks. But this is not always the case for complex projects. No matter how hard they try, test owners still need to spend time fixing their tests as part of their daily routine. This is, of course, a headache they try to avoid, and unfortunately there is no quick or easy remedy. Here is why:

1) Timing issues are among the most common and annoying problems we encounter in UI automation. The product code and the test code usually run in different processes, which sometimes causes them to get out of sync. Timing issues are not easy to handle or debug; some testers run into situations where a test keeps failing even though it has been "fixed" over and over again. Every UI state change in a product creates a potential timing issue for UI automation. To handle timing properly, test code often relies on waiters (UI event listeners), ETW events, polling, or even "sleeps" (a minimal polling helper is sketched after this list). Since timing issues can be difficult to reproduce, it may take a few tries before the root cause is found and the proper fix is applied.

2) Each UI automation test is designed and written around expected product behaviors, the UI tree structure, and control properties. If a developer changes a product behavior or UI layout, or modifies a control property, that can easily break automated tests, and testers then need to spend time debugging and fixing the failures. If the product has an intermittent issue, figuring out the root cause of a test failure is even tougher; test logs, product logs, screenshots, and videos of the failing test state are needed for debugging, and sometimes additional test logging has to be added. In the early stages of product development, when neither the product nor the test automation is stable, the test code is assumed "guilty" by default until proven "innocent" whenever a test fails.

3) We all know that there is no bug-free software out there. So it shouldn't be a surprise that issues in the operating system, network, backend services, lab settings, external tools, or other layers account for many automation test failures. Because these are external issues, testers have to implement less-than-ideal workarounds to keep their automation running, which is very unfortunate. Keep in mind that workarounds for external issues don't always hold up, because they usually only band-aid the problem.
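To make the timing discussion in point 1 concrete, here is a minimal polling helper. It is a hedged sketch in plain Python rather than any particular framework's waiter: it re-checks a condition until it holds or a timeout expires, instead of sleeping for a fixed amount of time. The usage line at the bottom refers to a hypothetical object model method.

```python
# Minimal polling helper: retry a condition until it holds or a timeout
# expires, instead of using a fixed sleep. Plain Python; not tied to any
# specific UI automation framework.
import time

class WaitTimeoutError(AssertionError):
    pass

def wait_until(condition, timeout=10.0, interval=0.25, message="condition not met"):
    """Poll `condition` (a zero-argument callable) until it returns a truthy
    value or `timeout` seconds have passed."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise WaitTimeoutError(f"{message} after {timeout}s")
        time.sleep(interval)

# Hypothetical usage, waiting on a dialog object from your own object model:
# wait_until(lambda: save_dialog.is_visible(), timeout=15,
#            message="Save dialog never appeared")
```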

UI automation will remain one of the biggest challenges, and a significant cost, for test teams in product groups in the coming years. Below are some ideas for mitigating the UI automation challenges discussed above:

1) Choose a UI automation framework that fits your automation requirements. Four major criteria can be used when testers write prototype tests during framework evaluation:

- Performance: how fast your tests can be run

- API support: how easily you can build your own OMs (object models) and tests, and how many of your major test scenarios can be automated with the framework

- Reliability: how stable your tests can be (are your tests flaky because of the framework?)

- Maintainability: how easily you can update your existing test OMs when there is a UI/behavior change in the product

2) Work closely with your developers to add the testability your automation needs, such as accessibility support and automation IDs for UI controls. Searching for UI controls by their automation IDs is usually the preferred approach, because automation IDs don't change across localized builds, which reduces the work needed to run the tests on non-English builds.
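As a hedged illustration of this recommendation, here is a small Selenium-flavored sketch. Selenium is used only as a stand-in (UIA, WinAppDriver, and similar frameworks expose the same idea through automation or accessibility IDs), and the URL and element IDs are hypothetical.

```python
# Locating a control by a stable ID vs. by its visible (localized) text.
# Selenium is a stand-in; the URL and IDs below are hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/settings")

# Preferred: the automation ID is the same on every localized build.
save_button = driver.find_element(By.ID, "settings-save-button")

# Fragile: the visible label changes per language ("Save", "Guardar", ...).
# save_button = driver.find_element(By.XPATH, "//button[text()='Save']")

save_button.click()
driver.quit()
```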

3) Make sure to use a waiter or other waiting logic every time a test step triggers a UI state change or introduces a delay in the UI.
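For example, instead of a fixed sleep after clicking, a framework waiter can block until the new UI state is actually reachable. Here is a minimal sketch using Selenium's explicit waits; the page, element IDs, and timeout are assumptions for illustration.

```python
# Wait for a UI state change after a test step, instead of sleeping.
# The URL, element IDs, and timeout are hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver = webdriver.Chrome()
driver.get("https://example.com/editor")

driver.find_element(By.ID, "publish-button").click()

# Block until the confirmation banner appears (or fail after 10 seconds),
# rather than time.sleep(5) and hoping that was long enough.
banner = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.ID, "publish-confirmation"))
)
assert "Published" in banner.text
driver.quit()
```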

4) Understand your product behaviors and pay attention to detail. Again, UI automation tests are written based on expected product behaviors. In other words, any unexpected product behavior could potentially break your tests. Your tests will become more reliable if you can properly handle product behaviors in various situations.

5) UI automation is a long-term investment, and it isn't cheap. So do not try to automate every test; we can't afford to do that anyway, since we always have other tasks to do. Before you decide to automate a test, ask yourself: is it worth the effort, and what is the gain versus the cost? Almost anything can be automated; the difference is the cost. Here are three areas to start with: a) BVTs (Build Verification Tests) and performance tests; b) common P1 test cases; c) test cases that provide high coverage at reasonable cost (under two days). After that, we can continue to automate more test cases as time allows.

6) Find and leverage (or perhaps develop) more tools to help validate results for UI tests.

7) Identify the root cause of a test failure and fix it completely, in the right place. Do not just band-aid the problem; for example, remove existing sleeps and avoid adding new ones!

8) If a test failure is caused by an unexpected product behavior, push the development team to fix the issue. Do not try to work around it in test code, as that will hide a product bug.

9) Write fewer automated tests and focus on achieving a higher test pass rate in the early stages of product development; that will help reduce test maintenance costs. Once the product has stabilized, write more test automation and focus on the code coverage goal.

10) Abstract your test cases away from UI changes to reduce the test maintenance cost. Please see the article on Object Model Design.
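As a hedged sketch of that abstraction, here is a minimal object model in the page-object style. Selenium is again just a stand-in, and the URL, element IDs, and status text are hypothetical. Test cases call the OM's methods rather than touching locators directly, so a UI change only requires updating the OM, not every test.

```python
# Minimal object-model (page-object) sketch: tests depend on SettingsPage's
# methods, not on locators, so a UI change is fixed in one place.
# Selenium is a stand-in; the URL and element IDs are hypothetical.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

class SettingsPage:
    URL = "https://example.com/settings"
    USERNAME_FIELD = (By.ID, "settings-username")
    SAVE_BUTTON = (By.ID, "settings-save-button")
    STATUS_LABEL = (By.ID, "settings-status")

    def __init__(self, driver, timeout=10):
        self.driver = driver
        self.wait = WebDriverWait(driver, timeout)

    def open(self):
        self.driver.get(self.URL)
        return self

    def change_username(self, new_name):
        field = self.driver.find_element(*self.USERNAME_FIELD)
        field.clear()
        field.send_keys(new_name)
        self.driver.find_element(*self.SAVE_BUTTON).click()
        return self

    def status_text(self):
        # Waiter built into the OM, so every test gets it for free.
        return self.wait.until(
            EC.visibility_of_element_located(self.STATUS_LABEL)
        ).text

# The test case only talks to the object model:
def test_rename_user():
    driver = webdriver.Chrome()
    try:
        page = SettingsPage(driver).open()
        page.change_username("new-name")
        assert "Saved" in page.status_text()
    finally:
        driver.quit()
```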

 

Wei X. Feng, Microsoft

Comments

  • Anonymous
    March 05, 2012
    Very good information!

  • Anonymous
    April 30, 2014
    I completely agree with all this

  • Anonymous
    February 11, 2015
    Good summary! Regarding false positive results in automated regression, do you know of any benchmark for how many tests in an entire automated regression run can fail due to this type of issue? For example, we kick off 1000 tests and 300 fail. Of those 300 failed tests, only 100 failed due to a defect; 200 failed due to script design issues, test data issues, changes in the apps under test, and environmental issues. That is 20% of all run tests. I found another article saying that 20% false positives can be considered a benchmark for any automation framework. My team is now targeting 10% false positives in our Selenium framework, but I would like to hear from other teams to see how we perform.