Transcripted Summary

HTML, CSS, and JavaScript are all source documents that a Web browser combines and renders into a visual Web page. Once the page is rendered, any code that runs afterward needs an interface for working with it. Enter the “DOM.”

No, it’s not someone’s nickname. The Document Object Model, or “DOM” for short, is a programming interface for HTML and XML documents. Programming with the DOM is a big deal. It enables programmers to manipulate the page in various ways, such as:

  • Searching for elements
  • Changing element content
  • Changing the HTML structure of the page
  • Changing the CSS styling of the page

The DOM is called an “object model” because it presents the page as an object. That “document object” contains an object representing each element. Element objects are nested from a root element to mirror the HTML structure of the page.
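To make the nesting concrete, here is a minimal sketch in Python using the standard library's xml.dom.minidom module. The page markup and the outline() helper are made up for illustration, and the markup is deliberately well-formed because minidom is an XML parser, not a browser.

```python
from xml.dom import minidom

# Hypothetical page used only for illustration. It is well-formed so that
# an XML parser can read it; real-world HTML is often messier.
PAGE = (
    "<html><head><title>Demo</title></head>"
    '<body><h1 id="heading">Hello</h1><p class="note">First</p></body></html>'
)


def outline(node, depth=0):
    """Return one indented line per element, mirroring the markup's nesting."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            # Text nodes are skipped by the nodeType check above.
            lines.extend(outline(child, depth + 1))
    return lines


document = minidom.parseString(PAGE)   # the "document object"
root = document.documentElement        # the root element object, here <html>
tree_outline = outline(root)
print("\n".join(tree_outline))
```

Each indented line is an element object nested under its parent, which is the same shape a browser would expose through its own document object.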

What’s really nice about the DOM is that it is not dependent upon any one programming language. It is most commonly used by JavaScript to manipulate Web pages in a browser, but it could be used by any other language, too. A good example of this would be using a scripting language like Python to scrape Web page contents. Another good example would be using test automation to poke and prod pages. The DOM also works for XML, but for this course, we will focus on HTML.

The first step with DOM programming is getting the elements themselves. Programming with the DOM makes one thing very clear: there is a difference between an element, a locator, and a selector.

  • An element is an object representing a live, rendered HTML element on the page.
  • A locator is an object that points to an element on a page.
  • A selector is a query string that denotes how to locate the element in the DOM.

To sum them up in one line: A locator uses a selector to find an element on a web page.
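The three terms can be sketched in a few lines of Python. This is only an illustration: the Locator class below is hypothetical (it is not the API of any particular tool), the page markup is invented, and the selector uses the limited XPath syntax that Python's xml.etree.ElementTree supports.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass

# Hypothetical, well-formed page used only for illustration.
PAGE = """<html><body>
<h1 id="heading">Hello</h1>
<button id="submit" class="primary">Send</button>
</body></html>"""


@dataclass(frozen=True)
class Locator:
    """Hypothetical locator: holds a selector and knows how to search with it."""

    selector: str  # the query string, here in ElementTree's XPath subset

    def find(self, root):
        # Returns the first matching element object, or None if nothing matches.
        return root.find(self.selector)


root = ET.fromstring(PAGE)

submit_locator = Locator(".//button[@id='submit']")  # locator wrapping a selector
submit_element = submit_locator.find(root)           # the element it points to

print(submit_locator.selector, "->", submit_element.tag, submit_element.text)
```

The selector is just a string, the locator is a reusable object built around that string, and the element only exists once the locator has actually searched a page.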

Why is this distinction important? Two main reasons:

  1. Direct paths from root-to-child would be very long and complicated. It’s not uncommon for child elements to be nested under dozens of layers. Imagine programming the object references from parent to child for the whole chain - that would be crazy long! It makes more sense to write smaller, more meaningful locator queries to find desired elements.
  2. There is no guarantee that specific elements will actually appear on the page. Dynamic content means ever-changing content, and elements can be added, removed, or changed at any time. Developers could change the HTML structure, too, so it makes more sense to try to “discover” desired elements. Errors in the HTML, CSS, or JavaScript may also cause web elements to not appear on the page at all.

For these reasons, we must separate the concerns of the element objects themselves and the locators used to find them.
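Both reasons can be seen in a small sketch. The two page strings below are invented for illustration, and the queries again use the XPath subset from Python's xml.etree.ElementTree, so treat this as a rough model of the idea rather than how any browser tool works internally.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: the target <li> sits several layers below the root.
PAGE = '<html><body><div><div><ul><li id="target">Item</li></ul></div></div></body></html>'
# The same page after an imagined redesign that wraps the content in <main>.
REDESIGNED = (
    "<html><body><main><div><div><ul>"
    '<li id="target">Item</li>'
    "</ul></div></div></main></body></html>"
)

SELECTOR = ".//*[@id='target']"

root = ET.fromstring(PAGE)

# Direct path: one index per layer (body, div, div, ul, li). Long and brittle.
by_path = root[0][0][0][0][0]
# Discovery: a short query that does not care how deep the element is.
by_query = root.find(SELECTOR)

# After the redesign, the same index chain silently lands on the wrong element.
new_root = ET.fromstring(REDESIGNED)
stale_path = new_root[0][0][0][0][0]   # now the <ul>, not the <li>
still_found = new_root.find(SELECTOR)  # the query still finds the <li>

# An element that is not on the page yields None instead of an object.
missing = root.find(".//*[@id='not-on-page']")

print(by_path.tag, by_query.tag, stale_path.tag, still_found.tag, missing)
```

Note that a missing element comes back as None, so code that "discovers" elements should check for that case before interacting with the result.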

There are many types of locators, such as:

  • IDs
  • Class names
  • CSS selectors
  • XPaths

We will cover different locator types in great detail in future chapters, as well as when to use which one. For now, just know that locators are the standard way of finding elements in a Web page, and that every element can have a unique locator. Also, know that a locator can match multiple elements, not just one - it will return all elements found that match its query.
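As a quick illustration of one query matching several elements, the sketch below searches by class name. The find_by_class() helper and the page are hypothetical. The helper matches individual class tokens the way a CSS class selector would, which is an assumption built into this example rather than something ElementTree does on its own.

```python
import xml.etree.ElementTree as ET

# Hypothetical page with several elements sharing a class name.
PAGE = """<html><body><ul>
<li class="item">Apples</li>
<li class="item">Pears</li>
<li class="item sold-out">Plums</li>
</ul></body></html>"""

root = ET.fromstring(PAGE)


def find_by_class(root, class_name):
    """Return every element whose class attribute contains class_name as a token."""
    return [el for el in root.iter() if class_name in el.get("class", "").split()]


items = find_by_class(root, "item")         # three matches
sold_out = find_by_class(root, "sold-out")  # one match
nothing = find_by_class(root, "missing")    # no matches: an empty list

print([el.text for el in items], [el.text for el in sold_out], nothing)
```

A query that matches nothing returns an empty list here, so the caller decides whether zero, one, or many matches is the expected outcome.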

Once element objects are obtained, there are many ways to interact with them. JavaScript specifically provides methods to not only change the state of the elements but also to send user-like actions to them. For example, the “click()” method will programmatically click an element as if a user had clicked it. The “textContent” property will get the text contained in the element, whether or not it is visible (the “innerText” property is closer to what is actually displayed). The “getAttribute()” method will get a particular element attribute by name, and the “setAttribute()” method will add or change an element attribute. Almost anything a user can do visually in a browser can also be done programmatically with JavaScript actions. In fact, Cypress relies upon direct JavaScript calls within the browser.
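The attribute calls are not specific to JavaScript. As a rough sketch, Python's xml.dom.minidom exposes getAttribute() and setAttribute() with the same names. The page below is invented, there is no rendered page here so nothing like click() applies, and minidom has no textContent property, so this example reads the text node directly instead.

```python
from xml.dom import minidom

# Hypothetical page with a single link, used only for illustration.
PAGE = '<html><body><a id="docs" href="/old">Read the docs</a></body></html>'

document = minidom.parseString(PAGE)
link = document.getElementsByTagName("a")[0]  # first <a> element object

# Read state.
old_href = link.getAttribute("href")
text = link.firstChild.data  # the text node inside the link

# Change state: update one attribute, add another, and replace the text.
link.setAttribute("href", "/new")
link.setAttribute("target", "_blank")
link.firstChild.data = "Read the new docs"

updated_markup = link.toxml()
print(old_href, "->", link.getAttribute("href"))
print(updated_markup)
```

In a browser, the equivalent JavaScript calls would act on the live page, so the change would show up on screen immediately instead of only in the serialized markup.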

Locators are also crucial for black-box testing outside of the browser. For example, Selenium WebDriver relies upon locators to find elements and interact with them. The main difference for WebDriver calls is that they cannot directly change the state of elements - they can only read the state and send interactions. Furthermore, WebDriver calls don’t call JavaScript directly - they operate using the WebDriver protocol.

Another browser automation tool that uses locators is Playwright. Unlike Selenium WebDriver, Playwright manipulates the browser using debug protocols. However, just like Selenium and Cypress, Playwright uses locators to find elements.

Regardless of the tool, you need to understand the DOM and know how to write good locators to develop automation.


© 2023 Applitools. All rights reserved.