Chapter 2 - Programming with the DOM



Transcripted Summary

HTML, CSS and JavaScript are all coded documents that a web browser renders together into a visual web page. Both while rendering the page and while executing code against it afterward, the browser needs an interface for handling the web page.

Enter the DOM.

No, that's not someone's nickname.




The Document Object Model, or DOM for short, is a programming interface for HTML and XML documents.

It enables programmers to manipulate the page in various ways such as:

  • Searching for elements

  • Changing element content

  • Changing the HTML structure of the page

  • Changing the CSS styling of the page

The DOM is called an "object model" because it presents the page as an object. That document object contains an object representing each element within it. Element objects are nested from a root element to mirror the HTML structure of that page.
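The nesting described above can be sketched with Python's standard-library DOM implementation, xml.dom.minidom. This is only an illustration, not how a browser builds its DOM: the PAGE markup and the outline() helper are assumptions made up for this example, and the markup is kept well-formed so that an XML parser can read it.

```python
from xml.dom.minidom import parseString

# Hypothetical page markup (an assumption for this sketch).
PAGE = "<html><body><div id='main'><p class='intro'>Hello</p></div></body></html>"

document = parseString(PAGE)          # the document object
root = document.documentElement       # the root element object, <html>


def outline(node, depth=0):
    """Return one indented line per element, mirroring the nesting."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:   # skip text nodes such as "Hello"
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(root)))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the markup: html, then body, then div, then p.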

What's really nice about the DOM is that it is not dependent upon any one programming language. It's most commonly used by JavaScript to manipulate web pages in a browser, but it could be used by any other language. A good example of this would be using Python to scrape web page contents.

Another good example would be using test automation to "poke and prod" at pages under test. The DOM also works for XML, but for this course we'll focus on HTML.

The first step with DOM programming is getting the elements themselves.




Programming with the DOM makes one thing very clear — there is a difference between an element and its locator.

A web element is an object representing a live, rendered HTML element on the page. A web element locator, on the other hand (also sometimes called a "selector"), is a query that finds and returns specific elements from the DOM. In short, locators find elements.

Why is this distinction important? Two main reasons:

  • First, direct paths from root to child would be very long and complicated. It's not uncommon for child elements to be nested under dozens of layers. Imagine programming object references from parent to child for the whole chain. That would be crazy long. It makes much more sense to write smaller, more meaningful locator queries to find the desired elements.

  • Secondly, there's no guarantee the specific elements will actually appear on the page. Dynamic content means ever-changing content: elements can be added, removed or changed on a whim. Developers could also change the HTML structure, so it makes more sense to try to discover the desired elements when they are needed. Furthermore, errors in the HTML, CSS or JavaScript could cause web elements to not appear on the page at all.

For these reasons, we must separate the concerns of the element objects themselves and the locators used to find them.
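Both reasons can be seen in a small sketch, again using Python's standard-library xml.dom.minidom as a stand-in for a browser DOM. The PAGE markup is hypothetical, and a tag-name query is used as the simplest possible kind of locator.

```python
from xml.dom.minidom import parseString

# Hypothetical markup with a few layers of nesting (an assumption).
PAGE = "<html><body><div><section><p>Hello</p></section></div></body></html>"
document = parseString(PAGE)

# Direct path: every level of nesting is hard-coded (body, div, section, p).
# Adding or removing one wrapper element would break this chain.
by_path = document.documentElement.firstChild.firstChild.firstChild.firstChild

# Locator-style query: ask the document for matching elements instead.
by_query = document.getElementsByTagName("p")

# No guarantee an element exists: a query that matches nothing comes back empty.
missing = document.getElementsByTagName("table")

print(by_path.tagName, len(by_query), len(missing))
```

The query keeps working if the page gains or loses wrapper elements, and the empty result gives the caller a clean way to check whether the element is present before using it.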




There are many types of locators such as IDs, names, class names, CSS selectors, and XPaths.

We'll cover the different locator types in great detail in future chapters, as well as when to use which one. For now, just know that locators are the standard way of finding elements in a web page, and that every element can have a unique locator.

Also, know that a locator can return multiple elements, not just one. It will return all elements found that match its query.
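As a rough sketch of a locator matching several elements at once, the snippet below searches by class name with Python's standard-library xml.dom.minidom. The PAGE markup and the find_by_class() helper are hypothetical; minidom has no built-in class-name query, so the helper filters every element by its class attribute.

```python
from xml.dom.minidom import parseString

# Hypothetical list markup (an assumption for this sketch).
PAGE = """<ul>
  <li class="item">Apples</li>
  <li class="item sale">Pears</li>
  <li class="item">Plums</li>
</ul>"""
document = parseString(PAGE)


def find_by_class(doc, class_name):
    """Hypothetical helper: all elements whose class list contains class_name."""
    return [
        element
        for element in doc.getElementsByTagName("*")   # "*" matches every element
        if class_name in element.getAttribute("class").split()
    ]


items = find_by_class(document, "item")     # matches all three list items
on_sale = find_by_class(document, "sale")   # matches only one of them

print([element.firstChild.data for element in items])
print([element.firstChild.data for element in on_sale])
```

The same query shape returns three elements for "item" and one for "sale", so code that expects a single element has to decide what to do when it gets several.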

Once element objects are obtained, there are many ways to interact with them.

JavaScript specifically provides methods not only to change the state of the elements, but also to send user-like actions to them.




  • For example, the click() method will programmatically "click" an element as if a user had clicked it visually.

  • The textContent property will get the text displayed by the element.

  • The getAttribute() method will "get" a particular element attribute by name (for example, the class attribute),

  • And likewise, the setAttribute() method will add or change an element attribute.
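The attribute calls in the list above can be tried outside a browser, because getAttribute() and setAttribute() are part of the DOM itself. The sketch below uses Python's standard-library xml.dom.minidom with a hypothetical link element. Note the limits of the stand-in: minidom has no click() method and no textContent property, so the text is read from the element's child text node instead.

```python
from xml.dom.minidom import parseString

# Hypothetical element markup (an assumption for this sketch).
document = parseString('<a href="/home" class="nav">Home</a>')
link = document.documentElement

# Read the displayed text from the child text node
# (a browser would expose this as link.textContent).
text = link.firstChild.data

# "Get" an attribute by name.
class_before = link.getAttribute("class")

# Change an existing attribute, then add a new one.
link.setAttribute("class", "nav active")
link.setAttribute("title", "Go home")

print(text, "|", class_before, "->", link.getAttribute("class"))
```

Reading an attribute that does not exist returns an empty string in minidom, so hasAttribute() is the safer check when presence matters.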

Anything a user can do visually in the browser can also be done programmatically with JavaScript actions.

In fact, test frameworks like Jasmine, Mocha, Jest, and Cypress all rely upon direct JavaScript calls within the browser.

Locators are also crucial for black-box testing of the browser.

Selenium WebDriver relies upon locators to find elements (findElement) and interact with them.

  • The main difference for WebDriver calls is that they cannot change the state of elements. They can only access the state and send interactions.

  • Furthermore, WebDriver calls don't necessarily call JavaScript directly. They operate using the WebDriver protocol as implemented by each browser type.

Regardless, the DOM and the need for locators are still present with Selenium WebDriver automation. As a caveat, WebDriver can also execute JavaScript code directly.



Resources