In performance testing, it's often not a huge issue if the data you submit as part of your tests varies only slightly. In some cases, however, you might find yourself in a position where you'd like to keep not only the user interactions but also the data as realistic as possible. How do we accomplish this without having to maintain long data tables? In this article, we'll explore how we can use fakerjs and k6 to perform load tests using realistic generated data.
What is k6?
k6 is an open-source performance testing tool written and maintained by the team at k6. One of the main goals of the project is to provide users with a developer-centered, code-first approach to performance testing.
🤓 Completely new to k6?
Then it might be a good idea to start out with this beginner's guide to k6, written by Mostafa Moradian.
What is Faker?
Faker is a tool used for generating realistic data. It's available for a lot of different languages: Python, Ruby, PHP, and Java, to name a few.
In this particular case, we'll use the JavaScript implementation, fakerjs, as it allows us to use it from within our test script, rather than generating the data before execution.
Goals
Historically, performance testing has largely been a matter of running your test and then manually analyzing the results to spot performance degradation or deviations. k6 uses a different approach, utilizing goal-oriented performance thresholds to create pass/fail tollgates. Let's formulate a scenario (or use case, if you prefer) for this test and what it tries to measure.
The Acme Corp Scenario
Acme Corp is about to release a submission form, allowing users to sign up for their newsletter. As they plan to release this form during Black Friday, they want to make sure that it can withstand the pressure of a lot of simultaneous registrations. After all, they are a company in the business of making everything, so they expect a surge of traffic Friday morning.
Our test goals
While we could very well set up complex custom thresholds, it’s usually more than enough to stick with the basics. In this case, we’ll measure the number of requests where we don’t receive an HTTP OK (200) status code in the response, as well as the total duration of each request.
We'll also run the test with 300 virtual users, all of which will make these requests simultaneously.
Configuration
In k6, we express this as:
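A minimal sketch of what this configuration could look like, assuming two custom Rate metrics for the failed fetches and submits (the metric names are illustrative; they get defined in the test code further down):

```javascript
export const options = {
  vus: 300,
  thresholds: {
    // Custom Rate metrics we'll feed data points from the test code
    "failed form fetches": ["rate<0.1"],
    "failed form submits": ["rate<0.1"],
    // Built-in metric: 95th percentile of request duration must stay below 400ms
    http_req_duration: ["p(95)<400"],
  },
};
```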
What does this mean?
So, let’s go through what we’ve done here. With 300 virtual users trying to fetch and submit the subscription form every second, we’ve set up the following performance goals:
- Less than 10% are allowed to fail in retrieving the form
- Less than 10% are allowed to fail in submitting the form data
- At most 5% of requests are permitted to have a duration longer than 400ms
The actual test
Now, let’s get on to the actual test code. The test code, which is executed by each VU once for each iteration, is put inside an anonymous function. We then expose this function as a default export.
The sleep test 😴
To make sure our environment is working, I usually start by setting up a test that does nothing except sleep for a second, and execute it once.
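A minimal sketch of such a test:

```javascript
import { sleep } from "k6";

// Each VU executes this default function once per iteration.
export default function () {
  sleep(1);
}
```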
Which, when run, produces output similar to this:
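(Abbreviated and illustrative — the exact summary format varies between k6 versions:)

```
    iteration_duration.....: avg=1s min=1s med=1s max=1s
    iterations.............: 1
```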
Adding our thresholds
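Building on the sleep test, a sketch of how this might look — the metric names and the duration value are illustrative:

```javascript
import { sleep } from "k6";
import { Rate } from "k6/metrics";

const failedFormFetches = new Rate("failed form fetches");
const failedFormSubmits = new Rate("failed form submits");

export const options = {
  duration: "1m", // new: run for a while instead of a single iteration
  vus: 300,
  thresholds: {
    "failed form fetches": ["rate<0.1"],
    "failed form submits": ["rate<0.1"],
    http_req_duration: ["p(95)<400"],
  },
};

export default function () {
  failedFormFetches.add(false); // new: report that the (not yet real) fetch didn't fail
  failedFormSubmits.add(false); // new: report that the (not yet real) submit didn't fail
  sleep(1);
}
```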
Notice the two new lines in the default function? For each iteration, we’re now adding data points to our threshold metrics, telling it that our requests did not fail. We’ll hook these up to do something meaningful as we proceed. We also added a duration to make the script run for more than one iteration.
For now, running the script should give you the following output:
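(Abbreviated and illustrative — the counts depend on how long the test runs:)

```
    ✓ failed form fetches......: 0.00% ✓ 0 ✗ …
    ✓ failed form submits......: 0.00% ✓ 0 ✗ …
      iterations...............: …
      vus......................: 300
```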
Yay, it passes! Two green checks!
Adding requests
To be able to measure anything useful, we also need to add some actual requests. In this example, we'll use https://httpbin.test.loadimpact.com/ as our API, which is our mirror of the popular tool HTTPBin. Feel free to use whatever HTTP request sink you prefer!
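A sketch of the test with requests added — httpbin serves a sample form at /forms/post that posts to /post, and the custname/custemail field names come from that form; adjust them to your own endpoint:

```javascript
import { sleep } from "k6";
import http from "k6/http";
import { Rate } from "k6/metrics";

const failedFormFetches = new Rate("failed form fetches");
const failedFormSubmits = new Rate("failed form submits");

export const options = {
  duration: "1m",
  vus: 300,
  thresholds: {
    "failed form fetches": ["rate<0.1"],
    "failed form submits": ["rate<0.1"],
    http_req_duration: ["p(95)<400"],
  },
};

export default function () {
  // Fetch the subscription form and record whether it failed
  const formRes = http.get("https://httpbin.test.loadimpact.com/forms/post");
  failedFormFetches.add(formRes.status !== 200);

  // Submit static placeholder data for now; faker comes later
  const submitRes = http.post("https://httpbin.test.loadimpact.com/post", {
    custname: "Jane Doe",
    custemail: "jane.doe@example.com",
  });
  failedFormSubmits.add(submitRes.status !== 200);

  sleep(1);
}
```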
And once again:
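(Abbreviated and illustrative — the timing values will of course vary:)

```
    ✓ failed form fetches......: 0.00% ✓ 0 ✗ …
    ✓ failed form submits......: 0.00% ✓ 0 ✗ …
    ✓ http_req_duration........: avg=… p(95)=…
      http_reqs................: …
      vus......................: 300
```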
The output now also includes metrics around our HTTP requests, as well as a little green check next to the duration.
Adding Bundling and Transpiling
Now that we’ve got our script to work, it’s almost time to add faker. Before we do that, we need to make sure that k6 can use the faker library.
As k6 does not run in a NodeJS environment, but rather in a goja VM, it needs a little help. Thankfully, it’s not that complex. We’ll use webpack and babel to achieve this, but any bundler compatible with babel would likely work.
Let’s start by initializing an npm package and add all the dependencies we’ll need:
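Something along these lines should do, assuming yarn — the exact dependency list is an assumption on my part:

```bash
yarn init -y
yarn add --dev webpack webpack-cli babel-loader @babel/core @babel/preset-env core-js
```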
We’ll then create our webpack config. The details of webpack and babel are outside the scope of this article, but there are plenty of great resources out there on how it works.
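A sketch of a webpack.config.js that works for this setup — the entry and output names are assumptions, and the externals line keeps the k6 built-in modules out of the bundle:

```javascript
const path = require("path");

module.exports = {
  mode: "production",
  entry: { test: "./src/test.js" },
  output: {
    path: path.resolve(__dirname, "dist"),
    filename: "[name].bundle.js",
    // k6 consumes CommonJS modules, so bundle to that format
    libraryTarget: "commonjs",
  },
  module: {
    rules: [{ test: /\.js$/, exclude: /node_modules/, use: "babel-loader" }],
  },
  target: "web",
  // Don't try to bundle the k6 built-ins (k6, k6/http, k6/metrics, ...)
  externals: /^k6(\/.*)?/,
};
```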
and the .babelrc file:
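A minimal sketch (the useBuiltIns/corejs options assume the core-js dependency added above):

```json
{
  "presets": [["@babel/preset-env", { "useBuiltIns": "usage", "corejs": 3 }]]
}
```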
We’ll also modify our package.json so that we can launch our tests using yarn:
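For instance — the script and bundle names are assumptions:

```json
{
  "scripts": {
    "pretest": "webpack",
    "test": "k6 run ./dist/test.bundle.js"
  }
}
```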
🧠 Did you know?
Using `pre` or `post` at the beginning of a script name results in that script running before/after the script you're invoking. In this case, the `pretest` script ensures that every time we run our test, webpack first creates a fresh bundle from the source code. Sweet, huh? 👍🏻
Enter Faker!
Let’s get right into it then! The first step is to add faker to our dependencies:
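Assuming yarn, as before:

```bash
yarn add --dev faker
```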
Faker has a quite extensive library of data that it’s able to generate, ranging from company details to catchphrases and profile pictures. While these are all handy to have, we’ll only use a tiny subset of what faker has to offer. Our object follows this structure:
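Something along these lines — the exact fields are my assumption for a newsletter signup:

```javascript
{
  name: "SUBSCRIPTION_TEST Firstname Lastname", // prefixed for easy cleanup
  email: "firstname.lastname@example.com",
}
```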
We’ll now go ahead and create a service that we may use to generate said persons:
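A minimal sketch of such a service — the file and function names are my own; note the locale-scoped import, which matters for the memory warning below:

```javascript
// src/person.js
// Import only the en_US locale rather than the full faker bundle
import faker from "faker/locale/en_US";

export function generateSubscriber() {
  return {
    // The prefix makes it easy to find and prune the dummy data later
    name: `SUBSCRIPTION_TEST ${faker.name.firstName()} ${faker.name.lastName()}`,
    email: faker.internet.email(),
  };
}
```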
👿 Possible performance issues ahead!
Every dependency you add tends to balloon memory consumption to some extent, especially when scaled up to 300 concurrent VU instances. Because of this, it's crucial that we only import the locale(s) we actually use in our test case.
While putting together the example repository for this article, I noticed that using faker adds about 2.3MB of memory per VU, which for 300 VUs resulted in a total memory footprint of around 1.5GB.
You can read more about JavaScript performance in k6 and how to tune it here.
You might have noticed that we prepend the name of the generated user with `SUBSCRIPTION_TEST`. Adding a unique identifier to your test data makes it easy to quickly filter out all the dummy data you've created as part of a test. While optional, this is usually a good idea, especially if you test against an environment that you can't easily prune.
Final assembly
Now, let’s put it all together!
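A sketch of the fully assembled test — the httpbin paths and form field names are assumptions carried over from earlier, and the generator service is the one sketched above:

```javascript
// src/test.js
import http from "k6/http";
import { sleep } from "k6";
import { Rate } from "k6/metrics";
import { generateSubscriber } from "./person";

const failedFormFetches = new Rate("failed form fetches");
const failedFormSubmits = new Rate("failed form submits");

export const options = {
  duration: "1m",
  vus: 300,
  thresholds: {
    "failed form fetches": ["rate<0.1"],
    "failed form submits": ["rate<0.1"],
    http_req_duration: ["p(95)<400"],
  },
};

export default function () {
  // Generate a fresh, realistic subscriber for every iteration
  const person = generateSubscriber();

  // Fetch the subscription form
  const formRes = http.get("https://httpbin.test.loadimpact.com/forms/post");
  failedFormFetches.add(formRes.status !== 200);

  // Submit the generated subscriber as form data
  const submitRes = http.post("https://httpbin.test.loadimpact.com/post", {
    custname: person.name,
    custemail: person.email,
  });
  failedFormSubmits.add(submitRes.status !== 200);

  sleep(1);
}
```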
And with that, we’re ready to go:
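Assuming the package.json scripts from earlier, this bundles the script and hands it to k6 in one go:

```bash
yarn test
```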
Closing thoughts
While the flexibility you get by combining the JavaScript engine used in k6 with webpack and babel is nearly endless, it's essential to keep track of the memory consumption and performance of the actual test. After all, getting false positives because our load generator ran out of resources is not particularly helpful.
All the code from this article is available as an example repository on GitHub, which I try to keep up to date with new versions of k6 and faker.
I’d love to hear your thoughts, so please hit me up with questions and comments in the field below. 👇🏼