Data-driven testing


Core concepts

Data-driven testing means running the same test repeatedly, using different data values on each iteration. This improves test coverage with less development effort: adding a new iteration is as simple as adding another record to the data set used by the test. The data records must be stored in a table-like format where each row (record) contains one or more fields that the test references.

Below is an example of a simple data-driven test whose data source (specified in the dataSet property) consists of three objects representing people we want to greet. The test executes three times, once for each record in the data set, and logs a greeting using the name stored in each record’s name field.

test-repo-path/tests/Data-driven.yaml
description: Simple data-driven test
dataSet: [ {name: Fred}, {name: Wilma}, {name: Pebbles} ]
actors:
  - actor: ACTOR1
    segments:
      - segment: 1
        actions:
          - description: Say hello
            script: |
              $log("Hello, " + $dataRecord.name + "!")

There are two important things to note about this test:

  • The records to iterate over are passed into the test using the dataSet top-level property. This property is typically an array of objects. This array can be specified using the JSON syntax or any variation of the YAML syntax. For complex data sets, the data is usually read from an external data file using the $data syntax.

  • $dataRecord is a variable that points to the current record in the data set. Its value is updated at every iteration, so each time the test starts executing, it points to the next object (record) in the data set (see more on that below).
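The iteration behavior described in the two points above can be pictured with a small standalone JavaScript sketch. This is an illustrative model only, not OpenTest's implementation: runDataDriven and sayHello are hypothetical names, the dataRecord parameter plays the role of $dataRecord, and the log callback stands in for $log.

```javascript
// Illustrative model only: NOT OpenTest's actual code.
// runDataDriven and sayHello are hypothetical names used for this sketch.
function runDataDriven(dataSet, actions, log) {
  // One full pass over all actions for every record in the data set.
  for (const dataRecord of dataSet) {
    for (const action of actions) {
      action(dataRecord, log);
    }
  }
}

// Same records as the inline dataSet in the test above.
const dataSet = [{ name: "Fred" }, { name: "Wilma" }, { name: "Pebbles" }];

// Stand-in for the "Say hello" script action.
const sayHello = (dataRecord, log) => log("Hello, " + dataRecord.name + "!");

// Collect the log messages so they can be inspected afterwards.
const messages = [];
runDataDriven(dataSet, [sayHello], (msg) => messages.push(msg));

console.log(messages.join("\n"));
```

Adding a fourth object to the array would produce a fourth greeting without touching the action itself, which is the main appeal of the approach.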

Using external data files

Real-life data used for data-driven testing can get quite complex. Data sets can easily extend to hundreds of records, and each data record can have tens of fields or more. Inlining the data set as shown in the previous example is handy for quick experiments or troubleshooting, but is not practical most of the time. Instead, we store the data in external files and reference those files using the $data syntax. Let’s see how that’s done.

First, we must create a YAML data file in the data directory of the test repo:

test-repo-path/data/people.yaml
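# "yes" and "no" are quoted so that they are always read as strings;
# some YAML parsers treat unquoted yes/no as boolean values.
- name: Fred
  age: 36
  employed: "yes"
  occupation: crane operator

- name: Wilma
  age: 28
  employed: "no"
  occupation: housewife

- name: Pebbles
  age: 2
  employed: "no"
  occupation: infant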

Now we can reference the people.yaml data file in our test. Just for fun, let’s also add a script action that performs a validation by checking the age and employed fields of the data record.

test-repo-path/tests/Data-driven.yaml
description: Simple data-driven test
dataSet: $data("people")
actors:
  - actor: ACTOR1
    segments:
      - segment: 1
        actions:
          - description: Say hello
            script: |
              $log("Hello, " + $dataRecord.name + "!")

          - description: Check minimum age of employment
            script: |
              if ($dataRecord.age < 14 && $dataRecord.employed == "yes") {
                $fail("Persons under 14 are not allowed to work!");
              }
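To see what the validation action does for each record, the same condition can be run outside OpenTest as plain JavaScript. This is a rough sketch under stated assumptions: checkMinimumAge is a hypothetical helper, it returns a message instead of calling $fail, and it assumes the employed field reaches the script as the string "yes" or "no". The last record is made up for illustration and is not part of people.yaml.

```javascript
// Hypothetical helper mirroring the "Check minimum age of employment" action.
// Returns an error message for an invalid record, or null when the record passes.
function checkMinimumAge(record) {
  if (record.age < 14 && record.employed === "yes") {
    return "Persons under 14 are not allowed to work!";
  }
  return null;
}

// The three records from people.yaml, with employed assumed to be a string.
const people = [
  { name: "Fred", age: 36, employed: "yes", occupation: "crane operator" },
  { name: "Wilma", age: 28, employed: "no", occupation: "housewife" },
  { name: "Pebbles", age: 2, employed: "no", occupation: "infant" },
];

// One result per record, just as the test performs one iteration per record.
const results = people.map((p) => ({ name: p.name, error: checkMinimumAge(p) }));

// Made-up record (not in the data file) that violates the rule.
const invalidError = checkMinimumAge({ name: "Example child", age: 3, employed: "yes" });

console.log(results, invalidError);
```

All three sample records pass the check, so the test as written would only fail if a record combining an age below 14 with employed set to "yes" were added to the data file.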

Working with CSV data

There’s an all too common practice in the test automation community of using Excel files to store test data. Excel is a very powerful tool when used properly. However, Excel’s native file format presents a fundamental issue: it’s a binary format, not a text file. If your test data is under source control, which should always be the case, then whenever two people change an Excel file at the same time, one of them is bound to lose their changes, since source control tools cannot merge binary files automatically. If you insist on using Excel as a data editing tool, you should store your data in the CSV format instead. We don’t encourage this approach, but we do agree that there are times when a tabular format is more appropriate for the task at hand.

CSV (comma-separated values) is a plain text format that stores data records as lines of text and separates the field values of each record with commas. Here’s what our people.yaml data file looks like in the CSV format:

test-repo-path/data/people.csv
name,age,employed,occupation
Fred,36,yes,crane operator
Wilma,28,no,housewife
Pebbles,2,no,infant

The first row in the CSV is the "header row" and contains the names of the data fields, separated by commas. Each row after that represents a new data record that will be iterated over.

To use the CSV file instead of the YAML one, it’s sufficient to include the .csv extension when specifying the file name:

test-repo-path/tests/Data-driven.yaml (fragment)
dataSet: $data("people.csv")

OpenTest will convert the CSV data into the exact same data structure we had before: an array of objects, with each object containing all the fields (properties) defined in the CSV header row.
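The header-row-to-object mapping can be sketched in a few lines of JavaScript. This is not OpenTest's parser: csvToRecords is a hypothetical function that only handles simple CSV (no quoted fields or embedded commas) and leaves every value as a string. Whether OpenTest converts values such as age into numbers is not stated here, so the sketch makes no attempt to do so.

```javascript
// Minimal CSV-to-records sketch. NOT OpenTest's parser: it handles only simple
// CSV with no quoted fields or embedded commas, and keeps every value as a string.
function csvToRecords(csvText) {
  // Split into lines and drop blank ones (for example a trailing newline).
  const lines = csvText.split(/\r?\n/).filter((line) => line.trim() !== "");
  const [headerLine, ...rows] = lines;

  // The header row supplies the field (property) names.
  const fields = headerLine.split(",");

  // Every following row becomes one object keyed by those field names.
  return rows.map((row) => {
    const values = row.split(",");
    const record = {};
    fields.forEach((field, i) => {
      record[field] = values[i];
    });
    return record;
  });
}

// Contents of people.csv as shown above.
const csv = [
  "name,age,employed,occupation",
  "Fred,36,yes,crane operator",
  "Wilma,28,no,housewife",
  "Pebbles,2,no,infant",
].join("\n");

const records = csvToRecords(csv);
console.log(records);
```

The result is an array of three objects, each with the name, age, employed and occupation properties taken from the header row.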


What next?

  • Create and run a simple data-driven test:

    • Go into the C:\opentest\test-repo\tests directory and create a file named Data-driven.yaml. Copy and paste the contents of the first code block on this page.

    • Go to the OpenTest web UI, create and start a new test session to run the Data-driven test you just created.

    • Examine the test results and the log file.

  • Modify the data-driven test you created in the previous step so that it loads its data set from an external YAML file:

    • Go into the C:\opentest\test-repo\data directory and create a file named people.yaml. Copy & paste the data shown in the first code block in the "Using external data files" section.

    • Change the dataSet property in your Data-driven.yaml test file to point to the data file you just created. The syntax is dataSet: $data("people").