 
One of the most common problems in test automation is comparing pdf files.
Consider a simple test flow: we perform some actions on a web page, generate a pdf, and then verify that the pdf matches our expectation, either by asserting on its content or by comparing it to an existing baseline.
We could implement this with functional assertions, but that would be tedious at best.
However, Applitools provides a much simpler and more flexible approach in the form of a Java command line utility called **ImageTester**, which reports its results on the Applitools site, as we have been seeing throughout this tutorial.

In this tutorial, we will dive into this with an example site where we generate a pdf invoice and then perform visual assertions on it.
But first, let’s see how to set up this utility and look at an example:
Step 1: Navigate to the Applitools website
Step 2: Click on the Download button and then select Files under the Bintray URL

This gives us access to the Image Tester jar.

You can also check out additional documentation in the GitHub repo.
Step 3: I have already downloaded the jar and moved it to a libs directory.
To run this utility, you must have Java installed on your machine.

You can check which version of Java you have installed; if it is missing, go to the Oracle website and download it. I currently have Java 8 installed and set up on my machine.
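If you prefer to run this check from the test code rather than by hand, a minimal sketch could look like the following. The helper name `java_version_banner` is an assumption made for illustration and is not part of the project; it only relies on `java -version` writing its banner to stderr.

```python
import shutil
import subprocess


def java_version_banner(executable="java"):
    """Return the text printed by `<executable> -version`, or None if it is not on PATH."""
    if shutil.which(executable) is None:
        return None
    # `java -version` writes its banner to stderr, so merge stderr into stdout.
    completed = subprocess.run(
        [executable, "-version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return completed.stdout.strip()
```

Calling `java_version_banner()` before triggering ImageTester lets a test fail early with a clear message when Java is missing.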
As you can see on the Applitools ImageTester website (first image at the top of this page), it lists the parameters that this utility accepts, along with a list of optional parameters.
For instance, you can tell it:
the name of the application under test
the Applitools server URL
the match level for the comparison
Let’s see this utility in action.
In our project structure, I have a pdf file already created.

Let’s say, for instance, that we want to check if this pdf file is as per our expectation.
We can do this with the ImageTester utility.
Let’s do that now.
I have already changed directories into the folder where this pdf file is located.

And now I want to trigger the ImageTester utility.
Step 4: To run this utility, we need to execute java with the path where the utility is present.
java -jar ~/libs/ImageTester_0_4_8.jar -k $APPLITOOLS_API_KEY -f .
The -k command line parameter accepts the Applitools API Key. We can either provide the API Key directly, or we can put it into an environment variable named APPLITOOLS_API_KEY and use it here.
The -f argument specifies the folder where our files are located (by default, it uses the current path).
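The same command can be assembled in Python, which is where we are heading later in this tutorial. The sketch below is an illustration only: `build_image_tester_command` is a hypothetical helper, and it uses nothing beyond the -k and -f flags shown above.

```python
import os


def build_image_tester_command(jar_path, folder=".", api_key=None):
    """Build the argument list for the ImageTester call shown above."""
    # Fall back to the APPLITOOLS_API_KEY environment variable when no key is passed.
    key = api_key or os.environ.get("APPLITOOLS_API_KEY")
    if not key:
        raise ValueError("No API key given and APPLITOOLS_API_KEY is not set")
    # Only the -k and -f flags used in this tutorial appear here.
    return ["java", "-jar", os.path.expanduser(jar_path), "-k", key, "-f", folder]
```

Returning a list rather than a single string means that paths containing spaces do not need extra quoting when the list is handed to `subprocess`.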
Since we are already in the directory that contains the pdf file, let’s execute the utility now.
The utility basically executes a comparison between this pdf and an earlier captured baseline.

In my case, I have an example set up where I already captured the earlier baseline.
And as you can notice, the utility says it has detected a mismatch.
Let’s take a look into Applitools to see what has gone wrong.

As you can see, there is a new test created, which is Unresolved in status.
ImageTester has compared this checkpoint to an earlier existing baseline where the comments and the dates were different, and hence the mismatch that we received.

Since this change looks valid to me, I have accepted this as the new baseline.
Let’s run the utility again; this time the test passes.
Now that we understand how the utility works, let's explore how we can integrate this in our automated tests.
For the next part of this tutorial, we are going to use a simple web app that takes a bunch of invoice related information and generates a pdf file.

Our test flow looks something like this:
Open the application under test
Enter the business and client name
Enter an item that we want the invoice to be generated for, and its rate
Click on Get Link to generate the pdf file
Let’s do that now.
This flow generates an invoice which we want to compare against a preexisting baseline.

We will download the invoice.
Then we will move the downloaded file to the resources directory inside our project structure.
Next, we will visually assert the pdf files by executing the ImageTester jar from our Python code.
Finally, we assert that the response from the ImageTester utility does not contain the word “Mismatch”, which tells us that the visual assertion passed.
I have already gone ahead and added a test that does the flow that we described earlier.
import pytest
import assertpy
from automation.page_objects.invoice_simple.invoice_page import InvoicePage
APP_UNDER_TEST = 'https://app.invoicesimple.com/'
EXPECTED_FILE_NAME = 'INV0001'
@pytest.fixture(autouse=True)
def setup(manager):
    driver = manager.driver
    driver.get(APP_UNDER_TEST)
    driver.maximize_window()
    yield manager
def test_pdf(setup):
    invoice_page = InvoicePage(setup.driver)
    invoice_page \
        .enter_from_and_to_details('Gaurav', 'Rob') \
        .enter_item_with_rate('Comics', '10') \
        .click_on_get_link() \
        .download_pdf(EXPECTED_FILE_NAME) \
        .move_to_resources()
    result = setup.validate_pdf()
    assertpy.assert_that(result).does_not_contain('Mismatch')
Let’s take a look at the test file.
Here we have a simple test, test_pdf, that does the same actions that we described.
After generating the pdf, we download the file (download_pdf) and move it to the resources directory (move_to_resources).
This is followed by triggering the validate_pdf function.
Let’s take a look at that function in our eyes_manager.py file.
    @staticmethod
    def validate_pdf():
        cmd = """java -jar {} -k {} -f {}""".format(IMAGE_TESTER_PATH,
                                                    APPLITOOLS_API_KEY,
                                                    get_resources_dir_path())
        output, _ = execute_cmd(cmd)
        str_output = output.decode('utf-8')
        print('Command execution completed... \n' + str_output)
        return str_output
The validate_pdf function builds the same command that we ran earlier from the path to the ImageTester jar, the Applitools API key, and the path to the resources directory.
We’ll execute this command (execute_cmd) using the Popen class from Python’s subprocess module.
import subprocess
from subprocess import Popen
def execute_cmd(cmd):
    handle = Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, shell=True)
    stdout, stderr = handle.communicate()
    return stdout, stderr
A couple of important things to notice:
We pass shell=True (its default is False) so that the command string is handed to the system shell, which parses it into the program and its arguments.
Also, we call the communicate function on the handle, which waits for the process to finish and gives us the standard output (stdout) returned by the command line execution. Because stderr is redirected into stdout, the second value it returns is None.
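As an alternative to shell=True, the command can be split into an argument list and run directly. The following is a sketch under that assumption, not the code used in this project; `execute_cmd_no_shell` is a hypothetical name.

```python
import shlex
import subprocess
import sys


def execute_cmd_no_shell(cmd):
    """Run a command without a shell and return (combined output, exit code)."""
    # Accept either a ready-made argument list or a command string;
    # shlex.split turns a string into a list using POSIX shell quoting rules.
    args = shlex.split(cmd) if isinstance(cmd, str) else list(cmd)
    completed = subprocess.run(
        args,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout, as execute_cmd does
        text=True,                 # decode the output to str
    )
    return completed.stdout, completed.returncode


if __name__ == "__main__":
    # Demo with the current Python interpreter so the sketch runs anywhere.
    output, code = execute_cmd_no_shell([sys.executable, "--version"])
    print(code, output.strip())
```

Skipping the shell avoids quoting surprises when a path or key contains special characters, and the exit code is returned alongside the output in case it is useful to the caller.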
I have also added a bunch of file related utils which help us to ascertain whether the file has been downloaded.
import os
def is_file_downloaded(path):
    # Return an explicit boolean instead of falling through to None.
    return os.path.exists(path)
def remove_file_if_exists(path):
    if os.path.exists(path):
        os.remove(path)
def move_file_to_path(src, dst):
    os.rename(src, dst)
These helpers also remove a file if it already exists and move a file to our path in the resources directory.
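One thing to keep in mind is that os.rename can raise OSError when the source and destination are on different filesystems, which can happen if the browser's download folder sits on another drive. A hedged sketch of a combined helper follows; `move_download_to_resources` is a hypothetical name and not part of the project.

```python
import os
import shutil
import tempfile


def move_download_to_resources(src, resources_dir):
    """Move a downloaded file into resources_dir, replacing any older copy."""
    os.makedirs(resources_dir, exist_ok=True)
    dst = os.path.join(resources_dir, os.path.basename(src))
    if os.path.exists(dst):
        os.remove(dst)
    # shutil.move falls back to copy-and-delete when src and dst are on
    # different filesystems, where os.rename would raise OSError.
    shutil.move(src, dst)
    return dst


if __name__ == "__main__":
    # Demo in a throwaway directory; the file name is a placeholder.
    work_dir = tempfile.mkdtemp()
    downloaded = os.path.join(work_dir, "invoice.pdf")
    with open(downloaded, "w") as handle:
        handle.write("placeholder")
    print(move_download_to_resources(downloaded, os.path.join(work_dir, "resources")))
```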
We have also set up a custom wait function that repeatedly calls any method with its argument (args) until it returns a truthy value or a specified timeout expires.
import time
POLL_FREQUENCY = 0.5  # seconds to sleep between attempts
def until(method, args=None, timeout=30, message=''):
    # Call method(args) until it returns a truthy value or the timeout expires.
    end_time = time.time() + timeout
    while True:
        try:
            value = method(args)
            if value:
                return value
        except Exception as exc:
            print(exc)
        # Sleep a fixed interval so the loop does not shrink into a busy wait.
        time.sleep(POLL_FREQUENCY)
        if time.time() > end_time:
            break
    # A timeout is reported with the built-in TimeoutError.
    raise TimeoutError(message)
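To show how such a wait is typically combined with the file check, here is a self-contained sketch. It uses a compact stand-in called `wait_until` with the same call shape as until, plus a `simulate_download` helper that fakes a browser download; both names and the file name are assumptions made for this example.

```python
import os
import tempfile
import threading
import time


def wait_until(method, args=None, timeout=30, poll=0.5, message=""):
    """Compact stand-in for until(): poll method(args) until truthy or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        value = method(args)
        if value:
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(message)
        time.sleep(poll)


def simulate_download(path, delay):
    """Create the file after a short delay, the way a browser download would."""
    time.sleep(delay)
    with open(path, "w") as handle:
        handle.write("placeholder")


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "invoice.pdf")
    threading.Thread(target=simulate_download, args=(target, 0.3)).start()
    # Block until the file shows up, or fail after five seconds.
    wait_until(os.path.exists, target, timeout=5, poll=0.1,
               message="download did not finish in time")
    print("downloaded:", target)
```

In the real test, the polled method would be the file check from the utilities above, and the path would point at the browser's download location.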
A couple of interesting things to note in this framework:
We have broken down the application into Page Objects, with one page object holding the code for the invoice page.

And another one with the methods around the download operation.

One interesting thing to note here in the invoice_download_page is that each method returns either the current page object instance (self) or an instance of the next page object (InvoiceDownloadPage) when we perform a certain operation on the app.
This allows us to chain the methods in a logical sequence and also makes our test code more readable.
This pattern is also known as the Fluent Pattern.
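The idea can be sketched without a browser. The classes below are hypothetical stand-ins that only record which steps were called; the method names mirror the ones used in test_pdf, but the bodies are not the project's actual page objects.

```python
class InvoicePageSketch:
    """Stand-in for the invoice page object; records steps instead of driving a browser."""

    def __init__(self, log=None):
        self.log = log if log is not None else []

    def enter_from_and_to_details(self, sender, recipient):
        self.log.append(("details", sender, recipient))
        return self  # still on the same page, so return self to keep chaining

    def enter_item_with_rate(self, item, rate):
        self.log.append(("item", item, rate))
        return self

    def click_on_get_link(self):
        self.log.append(("get_link",))
        # The app moves on to the download view, so hand back the next page object.
        return InvoiceDownloadPageSketch(self.log)


class InvoiceDownloadPageSketch:
    """Stand-in for the download page object."""

    def __init__(self, log):
        self.log = log

    def download_pdf(self, file_name):
        self.log.append(("download", file_name))
        return self

    def move_to_resources(self):
        self.log.append(("move",))
        return self


if __name__ == "__main__":
    page = (InvoicePageSketch()
            .enter_from_and_to_details("Gaurav", "Rob")
            .enter_item_with_rate("Comics", "10")
            .click_on_get_link()
            .download_pdf("INV0001")
            .move_to_resources())
    print(type(page).__name__, len(page.log))
```

Because every method returns a page object, the test reads as one sequence of steps, and the type of the returned object tells us which page the flow is currently on.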
Let’s run the test now.

As we can see, the test failed.
Our assertion expected the ImageTester output not to contain the “Mismatch” keyword, but it did, so the assertion failed.
Let’s take a look at the Applitools website.

We can see that our automated test has created a new test.
We can see that the date on which this invoice was generated is different, so our visual assertion failed. This is a very common scenario when you compare your checkpoint and your baseline images.

However, we can instruct Applitools to be smart about it and ignore this region.
Let’s do that now.
We can go into ANNOTATIONS and select Ignore Region from the dropdown. Then we select the area on the screen that we want Applitools to ignore.

Let’s mark this as the new baseline and run the test again.
The test passed, and if we take a look at the Applitools website, we can see that it passed there as well.
And that’s it.
Using the ImageTester utility, you can verify your pdf files against any baselines that you have captured.