Code smell | Hardcoded fake data in tests

May 21, 2023

codequality
refactorit
code-smell
testing

Hello, today I am writing again, and in this post I am going to introduce a very common code smell called Hardcoded fake data in tests. It occurs when the fake data a test needs is written out by hand inside the test file itself.


Cause

Fake data is hardcoded in the test files, which makes the tests difficult to read and maintain.

Example

Suppose we have the following type to represent a user:

export type User = {
    id: string
    firstName: string
    lastName: string
    email: string
    phone: string
    active: boolean
}

And some functions inside a utils file to apply different filters to an array of users:

export const filterByActivated = (users: User[]): User[] =>
    users.filter((user) => user.active)
 
export const filterByValidEmail = (users: User[]): User[] => {
    const VALID_EMAIL_REGEX =
        /^(([^<>()[\]\\.,;:\s@"]+(\.[^<>()[\]\\.,;:\s@"]+)*)|(".+"))@((\[[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\])|(([a-zA-Z\-0-9]+\.)+[a-zA-Z]{2,}))$/
 
    return users.filter((user) => VALID_EMAIL_REGEX.test(user.email))
}

Now let's see what the tests for our utils functions would look like:

// utils.test.ts
import { User } from './types/User'
import { filterByActivated, filterByValidEmail } from './utils'
 
describe('utils', () => {
    it('filterByActivated', () => {
        const mockActivatedUser: User = {
            id: '1',
            firstName: 'John',
            lastName: 'deville',
            email: 'test@test.com',
            phone: '000000000',
            active: true,
        }
 
        const mockDeactivatedUser: User = {
            id: '2',
            firstName: 'Marta',
            lastName: 'deville',
            email: 'test@test.com',
            phone: '100000000',
            active: false,
        }
 
        const activatedUsers = filterByActivated([
            mockActivatedUser,
            mockDeactivatedUser,
        ])
 
        expect(activatedUsers).toEqual([mockActivatedUser])
    })
 
    it('filterByValidEmail', () => {
        const mockUserWithValidEmail: User = {
            id: '1',
            firstName: 'John',
            lastName: 'deville',
            email: 'test@test.com',
            phone: '000000000',
            active: true,
        }
 
        const mockUserWithInvalidEmail: User = {
            id: '2',
            firstName: 'Marta',
            lastName: 'deville',
            email: 'test.com',
            phone: '100000000',
            active: true,
        }
 
        const usersWithValidEmails = filterByValidEmail([
            mockUserWithValidEmail,
            mockUserWithInvalidEmail,
        ])
 
        expect(usersWithValidEmails).toEqual([mockUserWithValidEmail])
    })
})

At first glance this looks like a fairly innocuous test, but it hides several problems that surface as soon as the User type changes. For example, if we were asked to add a new field to it, we would be forced to modify every test that depends on the type and has its fake data hardcoded.

This should not happen, and the reason is simple. Why should I modify the test suite if I add a field to store, say, the user's date of birth? What relationship does this field have with my tests? Correct, none, so it makes no sense to have to touch them. And this is not the worst part: if we spread this practice through our code, each change to the User type will cause cascading changes in all the tests related to it.


Solution

To solve this smell we will rely on two libraries. Since we are using TypeScript, we will use:

  • Fishery helps us build factories based on types. In our case we will use User, which centralizes every change to it in a single point.
  • Faker-js provides a very complete API for generating random fake data of many kinds.

If these libraries are not available for your language, you will need to look for alternatives or build your own small factory utility.
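For reference, a hand-rolled factory utility can be quite small. The following is only a minimal sketch under stated assumptions, not how Fishery works internally: defineFactory is a hypothetical helper name, and the default values are deterministic placeholders rather than random data.

```typescript
type User = {
    id: string
    firstName: string
    lastName: string
    email: string
    phone: string
    active: boolean
}

// Hypothetical helper (not part of any library): builds objects from
// generated defaults and lets each test override only what it cares about.
const defineFactory = <T>(defaults: (sequence: number) => T) => {
    let sequence = 0

    return {
        build: (overrides: Partial<T> = {}): T => {
            sequence += 1
            // Overrides are spread last so they win over the defaults.
            return { ...defaults(sequence), ...overrides }
        },
    }
}

// Placeholder defaults; a real factory could generate random values here.
const userFactory = defineFactory<User>((sequence) => ({
    id: `${sequence}`,
    firstName: `First${sequence}`,
    lastName: `Last${sequence}`,
    email: `user${sequence}@example.com`,
    phone: '+34 600 000 000',
    active: true,
}))

const activeUser = userFactory.build()
const inactiveUser = userFactory.build({ active: false })
```

Each call to build increments the sequence, so the generated ids and emails stay unique, while the test only spells out the field it actually depends on.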

Putting Fishery and Faker-js together, we can create users based on a data model with random fake data.

Sounds good, right? Let's see what this factory would look like:

// factories/userFactory.ts
import { Factory } from 'fishery'
import { faker } from '@faker-js/faker'
 
import { User } from '../types/User'
 
export default Factory.define<User>(({ sequence }) => ({
    id: `${sequence}`,
    firstName: faker.name.firstName(),
    lastName: faker.name.lastName(),
    email: faker.internet.email(),
    phone: faker.phone.number('+34 ### ### ###'),
    active: faker.datatype.boolean(),
}))

Now we will use the factory in our tests as follows:

// utils.test.ts
import userFactory from './factories/userFactory'
 
import { filterByActivated, filterByValidEmail } from './utils'
 
describe('userFilter', () => {
    it('filterByActivated', () => {
        const mockActivatedUser = userFactory.build({
            active: true,
        })
 
        const mockDeactivatedUser = userFactory.build({
            active: false,
        })
 
        const activatedUsers = filterByActivated([
            mockActivatedUser,
            mockDeactivatedUser,
        ])
 
        expect(activatedUsers).toEqual([mockActivatedUser])
    })
 
    it('filterByValidEmail', () => {
        const mockUserWithValidEmail = userFactory.build()
 
        const mockUserWithInvalidEmail = userFactory.build({
            email: 'test.test',
        })
 
        const usersWithValidEmails = filterByValidEmail([
            mockUserWithValidEmail,
            mockUserWithInvalidEmail,
        ])
 
        expect(usersWithValidEmails).toEqual([mockUserWithValidEmail])
    })
})

As can be seen, in addition to making our tests much cleaner, we gain a lot in flexibility and maintainability. If we are now asked to add a field to our User type to store the date of birth, there is only one place to modify: the factory. The tests themselves do not have to worry about the change.
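To make the date of birth scenario concrete, here is a rough sketch of what changes and what does not. It uses a hypothetical dependency-free buildUser function as a stand-in for the Fishery factory so that it runs on its own; the birthDate field name and its fixed default are assumptions for illustration.

```typescript
type User = {
    id: string
    firstName: string
    lastName: string
    email: string
    phone: string
    active: boolean
    birthDate: Date // the newly added field
}

// Hypothetical stand-in for the factory: the single modification point.
let nextId = 0
const buildUser = (overrides: Partial<User> = {}): User => {
    nextId += 1

    return {
        id: `${nextId}`,
        firstName: 'Jane',
        lastName: 'Doe',
        email: `user${nextId}@example.com`,
        phone: '+34 600 000 000',
        active: true,
        birthDate: new Date(1990, 0, 1), // the only line added for the new field
        ...overrides,
    }
}

const filterByActivated = (users: User[]): User[] =>
    users.filter((user) => user.active)

// The test setup never mentions birthDate, so it stays exactly as it was.
const mockActivatedUser = buildUser({ active: true })
const mockDeactivatedUser = buildUser({ active: false })

const activatedUsers = filterByActivated([
    mockActivatedUser,
    mockDeactivatedUser,
])
```

With hardcoded object literals, the same change would have required adding birthDate to every literal in every test file.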


Benefits

  • Flexibility
  • Maintainability
  • Readability
  • Greater robustness, since the code under test has to cope with randomly generated data
  • Type inference if we use TypeScript

Thanks for reading 😊