(My) Definitions

When I say web UI automation…

… I mean:

  • Control a web browser and access a web page’s elements…
  • Programmatically :)

DOM Object

DOM stands for Document Object Model and “is a cross-platform and language-independent application programming interface that treats an HTML, XHTML, or XML document as a tree structure wherein each node is an object representing a part of the document. The objects can be manipulated programmatically and any visible changes occurring as a result may then be reflected in the display of the document [1].”


Figure: Document Object Model. By Birger Eriksson, own work, CC BY-SA 3.0.
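
To make the "tree of objects" idea concrete, here is a minimal sketch that parses a small document into a tree and walks it. It uses Python's standard-library `xml.etree.ElementTree` as a stand-in for a browser's DOM, so it only works on well-formed markup; the `PAGE` fragment and the `outline` helper are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed XHTML fragment used only for illustration.
PAGE = """
<html>
  <head><title>Demo</title></head>
  <body>
    <h1>Hello</h1>
    <a href="/about">About</a>
  </body>
</html>
"""


def outline(node, depth=0):
    """Return one indented line per element, visiting the tree depth-first."""
    lines = ["  " * depth + node.tag]
    for child in node:
        lines.extend(outline(child, depth + 1))
    return lines


root = ET.fromstring(PAGE)
tree_lines = outline(root)
print("\n".join(tree_lines))

# Each node is an object that can be changed programmatically.
heading = root.find("./body/h1")
heading.text = "Hello, DOM"
```

A real browser exposes a much richer DOM than this, but the shape is the same: one root node, nested children, and objects you can read and modify.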

When you’re automating a web application, your job is to locate all elements that you need to interact with:

  • Fields
  • Buttons
  • Menus
  • Images
  • Spinners
  • Dropdowns
  • Lists

Locating Web Elements

There are different schemes for locating a web element, each with a different level of complexity and reliability:

  • Text
  • Value
  • CSS attributes
  • Tag name
  • XPath [2]

Google’s web page and source code for the search field.
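
The schemes listed above can be sketched roughly as follows. This is only an approximation under stated assumptions: it runs against a hypothetical search-form fragment (not Google's real markup) and uses the limited XPath subset in Python's standard-library `xml.etree.ElementTree` instead of a real browser driver, so every lookup is written as an ElementTree path expression.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment loosely modelled on a search form.
PAGE = """
<html>
  <body>
    <form action="/search">
      <input class="search-box" name="q" type="text" value="" />
      <button type="submit" value="go">Search</button>
      <a href="/help">Help</a>
    </form>
  </body>
</html>
"""

root = ET.fromstring(PAGE)

# Text: the link whose complete text is "Help" (needs Python 3.7+).
by_text = root.find(".//a[.='Help']")

# Value: the element whose value attribute equals "go".
by_value = root.find(".//*[@value='go']")

# CSS-style attribute: the element with class="search-box".
by_class = root.find(".//*[@class='search-box']")

# Tag name: every <input> element anywhere in the tree.
by_tag = root.findall(".//input")

# XPath: a full path from the root element down to the search field.
by_xpath = root.find("./body/form/input[@name='q']")

print(by_text.get("href"), by_value.tag, by_class.get("name"), len(by_tag))
```

Note that the attribute-based lookups do not care where the element sits in the page, while the full path in `by_xpath` spells out every ancestor.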

There is a reason why I placed XPath at the bottom of this list:

  • XPath is very fragile: any change to a web element on the page can break your locating strategy

  • XPath can be very evil too (i.e., hard to wrap your mind around):

    //a[contains(@href, 'compute_resources') and
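
To illustrate the fragility point, here is a small sketch that compares a path-based locator with an attribute-based one before and after a layout change. The markup, the `locate` helper, and both path strings are hypothetical; only the `compute_resources` link target is borrowed from the expression above. Python's `xml.etree.ElementTree` does not support `contains()`, so the sketch matches the whole `href` value instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup before a redesign.
BEFORE = """
<body>
  <div>
    <a href="/compute_resources">Compute Resources</a>
  </div>
</body>
"""

# Same link after a redesign wrapped it in an extra <nav> element.
AFTER = """
<body>
  <div>
    <nav>
      <a href="/compute_resources">Compute Resources</a>
    </nav>
  </div>
</body>
"""

POSITIONAL = "./div/a"  # depends on the exact nesting
BY_ATTRIBUTE = ".//a[@href='/compute_resources']"  # depends only on the href


def locate(markup, path):
    """Return the first element matching path, or None if nothing matches."""
    return ET.fromstring(markup).find(path)


found_before = locate(BEFORE, POSITIONAL) is not None
found_after = locate(AFTER, POSITIONAL) is not None
found_by_attribute = locate(AFTER, BY_ATTRIBUTE) is not None
print(found_before, found_after, found_by_attribute)
```

The positional path stops matching as soon as one wrapper element is added, while the attribute-based locator still finds the link.
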
[1] DOM: https://en.wikipedia.org/wiki/Document_Object_Model
[2] XPath (XML Path Language) is a query language for selecting nodes from an XML document. https://en.wikipedia.org/wiki/XPath