Which problem do we want to solve?
We will try to solve one of the worst practices used in tests at any level: fixed/hard-coded data.
I want to avoid, as much as possible, any manual setup before I can run my tests, so I also try to avoid static files (CSV, TXT, XLS, JSON).
Here we will see a common choice among Java developers, the RandomStringUtils class, and why it might not be the best choice for automatic data generation.
By the way, I recommend automatic data generation in the tests using the Test Data Factory approach, and you can find an example here in my blog: Test Data Factory: Why and How to Use.
The examples described here are simple, without the usage of the Test Data Factory, and will show you why the RandomStringUtils
might not be the best approach.
Example
We will automatically generate data for a Customer object with the following criteria:
Attribute | Type | Constraints |
---|---|---|
id | int | Not null |
name | String | Not null and size between 2 and 50 characters |
profession | String | Not null and size between 2 and 30 characters |
accountNumber | String | Not null and size of exactly 18 characters |
address | String | Not null and size between 2 and 50 characters |
phoneNumber | String | Not null and size between 11 and 14 characters |
birthday | Date | Not null |
To keep the number of tests small, we will focus on generating valid data for these constraints. In a professional environment, we would also implement tests for the edge cases.
Think about this Customer data as an object used in any test level (unit, integration, service, UI).
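The constraints in the table above can be expressed as a small checker, which is handy for asserting that generated data is valid. This is only a sketch: the `CustomerConstraints` class and its `violations` method are hypothetical names that are not part of the original project, and a real project would more likely rely on its own validation layer.

```java
import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Hypothetical helper (not from the original project): checks the table's constraints.
final class CustomerConstraints {

    private CustomerConstraints() {
    }

    // Returns the names of the attributes that violate a constraint; empty means valid.
    static List<String> violations(Integer id, String name, String profession,
            String accountNumber, String address, String phoneNumber, Date birthday) {
        List<String> invalid = new ArrayList<>();
        if (id == null) invalid.add("id");
        if (!sizeBetween(name, 2, 50)) invalid.add("name");
        if (!sizeBetween(profession, 2, 30)) invalid.add("profession");
        if (!sizeBetween(accountNumber, 18, 18)) invalid.add("accountNumber");
        if (!sizeBetween(address, 2, 50)) invalid.add("address");
        if (!sizeBetween(phoneNumber, 11, 14)) invalid.add("phoneNumber");
        if (birthday == null) invalid.add("birthday");
        return invalid;
    }

    // Null is never valid because every attribute is "Not null".
    private static boolean sizeBetween(String value, int min, int max) {
        return value != null && value.length() >= min && value.length() <= max;
    }

    public static void main(String[] args) {
        List<String> result = violations(1, "Tena Pagac", "photographer",
                "NL07HUUN1518167413", "12672 Romaguera Tunnel", "(561) 638-5813", new Date());
        System.out.println(result); // an empty list means every constraint is satisfied
    }
}
```

A check like this can be pointed at any generated Customer, whichever library produced the values.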
What does the RandomStringUtils class do?
RandomStringUtils is a class from the Apache Commons Lang library that generates random Strings based on different conditions like:
- length
- letters
- numbers
- alphanumeric
- ASCII
- numeric
It’s a utility class with static methods, so you can generate a String directly without creating an instance, which is super handy!
See the example below, which generates different kinds of random data.
import org.apache.commons.lang3.RandomStringUtils;

public class RandomStringUtilsExample {
    public static void main(String[] args) {
        // returns a String with 5 digits, for example 82114
        String numeric = RandomStringUtils.randomNumeric(5);

        // returns an alphanumeric String of length 30, mixing upper and lower case,
        // for example gQ6RB8MiwKOg9O3qnHFo7I3jilHoIy
        String alphanumeric = RandomStringUtils.randomAlphanumeric(30);

        System.out.println(numeric + " " + alphanumeric);
    }
}
What is the result of using RandomStringUtils class?
Let’s first take a look at the code example that uses RandomStringUtils:
- the id uses the RandomStringUtils.randomNumeric() method to generate a numeric String, which we parse into an int using Integer.valueOf()
- the name, profession, accountNumber, address, and phoneNumber use RandomStringUtils.randomAlphanumeric() to generate alphanumeric data
- the birthday has a fixed date of now (today) because RandomStringUtils generates only Strings
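import java.util.Date;
import org.apache.commons.lang3.RandomStringUtils;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

class BasicExampleTest {

    @Test
    @DisplayName("Data validations using RandomStringUtils")
    void randomStringUtils() {
        // CustomerData is the project's own class, which exposes a builder
        CustomerData customerData = CustomerData.builder().
            // note: a 10-digit value can exceed Integer.MAX_VALUE and make Integer.valueOf() throw
            id(Integer.valueOf(RandomStringUtils.randomNumeric(10))).
            name(RandomStringUtils.randomAlphanumeric(50)).
            profession(RandomStringUtils.randomAlphanumeric(30)).
            accountNumber(RandomStringUtils.randomAlphanumeric(18)).
            address(RandomStringUtils.randomAlphanumeric(50)).
            phoneNumber(RandomStringUtils.randomAlphanumeric(14)).
            // RandomStringUtils cannot generate dates, so the current date is used
            birthday(new Date()).
            build();
    }
}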
The output of the test execution, if we print or inspect the customerData
object, is:
{
"id": 1335130963,
"name": "GGXS19kN6kSuzHwW6T0YjJCxUaIyKKmAaUdQH51gdUAtt1TwqY",
"profession": "0kk8HSiFgCUVfLzbD3PyR6cn8j0LH3",
"accountNumber": "PqvekXb9ekRAJi3ypy",
"address": "90lqP2LHnQMWtmMP8vasO3BR5dsICIL85u5sJ0yjGKWXxCkFsj",
"phoneNumber": "OpoJ3tOE53woy9",
"birthday": "Sep 26, 2021, 10:01:10 PM"
}
We could successfully generate the necessary data! Yay!
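One caveat in the id generation is worth a quick sketch: a 10-digit numeric String does not always fit in an int, whose maximum is 2147483647, so Integer.valueOf() can throw a NumberFormatException on some runs. The example below uses fixed Strings instead of RandomStringUtils so that it is reproducible; the class and method names are made up for illustration.

```java
// Illustrative only: fixed Strings stand in for RandomStringUtils.randomNumeric(10).
final class NumericIdParsing {

    private NumericIdParsing() {
    }

    // Returns true when the digits can be parsed into an int.
    static boolean fitsInInt(String digits) {
        try {
            Integer.valueOf(digits);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(fitsInInt("1335130963")); // the id from the sample output
        System.out.println(fitsInInt("9999999999")); // 10 digits, above Integer.MAX_VALUE
        System.out.println(fitsInInt("999999999"));  // 9 digits always fit in an int
    }
}
```

Limiting the random String to 9 digits is one way to avoid the intermittent failure.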
What does DataFaker do?
DataFaker is an open-source library based on (actually an improvement of) Java Faker to generate fake data.
I invite you to take a look at the GitHub repo and see the different objects to generate data.
What is the result of using DataFaker?
The code implementation to generate data using the CustomerData class is shown below:
- for the id, the number() method generates a random number
- for the name, the name() method generates a full name
- for the profession, the company() method generates a profession
- for the accountNumber, the finance() method generates a valid IBAN for the Netherlands
- for the address, the address() method generates a full street address
- for the phoneNumber, the phoneNumber() method generates a cell phone number
- for the birthday, the date() method generates a birthday for an age between 18 and 90
import net.datafaker.Faker;
import org.junit.jupiter.api.DisplayName;
import org.junit.jupiter.api.Test;

class BasicExampleTest {

    @Test
    @DisplayName("Data validations using faker library")
    void faker() {
        Faker faker = new Faker();

        // CustomerData is the project's own class, which exposes a builder
        CustomerData customerData = CustomerData.builder().
            id((int) faker.number().randomNumber()).
            name(faker.name().name()).
            profession(faker.company().profession()).
            accountNumber(faker.finance().iban("NL")).
            address(faker.address().streetAddress()).
            phoneNumber(faker.phoneNumber().cellPhone()).
            birthday(faker.date().birthday(18, 90)).
            build();
    }
}
The output of the test execution, if we print or inspect the customerData
object, is:
{
"id": 520543,
"name": "Tena Pagac",
"profession": "photographer",
"accountNumber": "NL07HUUN1518167413",
"address": "12672 Romaguera Tunnel",
"phoneNumber": "(561) 638-5813",
"birthday": "Mar 5, 1982, 10:29:18 AM"
}
We could successfully generate the necessary data! Now let’s focus on the differences.
Comparing both approaches
There are two aspects I would like to consider when choosing between one approach and the other:
- legibility for future troubleshooting (log analysis)
- easy data creation with different criteria
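We can see the main differences by comparing the data results side by side:
Attribute | RandomStringUtils | DataFaker |
---|---|---|
id | 1335130963 | 520543 |
name | GGXS19kN6kSuzHwW6T0YjJCxUaIyKKmAaUdQH51gdUAtt1TwqY | Tena Pagac |
profession | 0kk8HSiFgCUVfLzbD3PyR6cn8j0LH3 | photographer |
accountNumber | PqvekXb9ekRAJi3ypy | NL07HUUN1518167413 |
address | 90lqP2LHnQMWtmMP8vasO3BR5dsICIL85u5sJ0yjGKWXxCkFsj | 12672 Romaguera Tunnel |
phoneNumber | OpoJ3tOE53woy9 | (561) 638-5813 |
birthday | Sep 26, 2021, 10:01:10 PM | Mar 5, 1982, 10:29:18 AM |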
Legibility for future troubleshooting (log analysis)
Troubleshooting is a regular activity for an engineer who writes code: we constantly read logs and debug the application to understand current and future problems in the code.
Now imagine yourself looking at a CustomerData object whose data was filled in with the RandomStringUtils approach: it’s hard to correlate the data you have with a list of objects you might get back, or even to spot that data inside a log file. Values like "Tena Pagac" or "photographer" are much easier to recognize.
Easy data creation with different criteria
For most of the attributes present in the CustomerData class, you can use RandomStringUtils to generate data for different criteria. For example, you can easily set a 51-character value for the name attribute using RandomStringUtils.randomAlphanumeric(51) and expect a failing constraint validation.
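The boundary case for the name attribute could look like the sketch below. To stay free of external dependencies, `randomAlphanumeric` here is a hand-written stand-in for `RandomStringUtils.randomAlphanumeric()`, and `isValidName` is a hypothetical check for the 2 to 50 character constraint; neither comes from the original project.

```java
import java.util.Random;

// Sketch only: stand-ins for RandomStringUtils and for the real constraint validation.
final class NameBoundaryExample {

    private static final String ALPHANUMERIC =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789";
    private static final Random RANDOM = new Random();

    private NameBoundaryExample() {
    }

    // Stand-in for RandomStringUtils.randomAlphanumeric(length).
    static String randomAlphanumeric(int length) {
        StringBuilder builder = new StringBuilder(length);
        for (int i = 0; i < length; i++) {
            builder.append(ALPHANUMERIC.charAt(RANDOM.nextInt(ALPHANUMERIC.length())));
        }
        return builder.toString();
    }

    // Hypothetical check for "Not null and size between 2 and 50 characters".
    static boolean isValidName(String name) {
        return name != null && name.length() >= 2 && name.length() <= 50;
    }

    public static void main(String[] args) {
        System.out.println(isValidName(randomAlphanumeric(50))); // upper bound, valid
        System.out.println(isValidName(randomAlphanumeric(51))); // one over, invalid
    }
}
```

Only the length matters for this check, which is why a purely random String is good enough for this kind of constraint test.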
For more specialized data, like a phone number or a date, you need a proper library, and DataFaker can generate both.
In this way, adopting a single library makes the process easier.
Considerations
Of course, I’d put more emphasis on the DataFaker library because it has almost everything we need to generate data, but that does not rule out the need to use the RandomStringUtils class or any other class from the Apache Commons libraries.
The main consideration here is the ability to generate all the possible data you need using a single source of truth without reinventing the wheel, as well as the indirect benefits it will show during the troubleshooting process.
Examples
The avoid-random-string-utils project shows a basic example comparing RandomStringUtils vs DataFaker.
The restassured-complete-basic-example project has a factory data class to generate all the necessary data in different conditions. It’s a good real-world example.