Model Validation

Oleksii Koshkin
12 min readJan 2, 2024


Here I would like to present one approach to front-end data validation. I don’t think it’s anything new, but it’s pretty handy.

TL;DR: backend-style validation that works on both the frontend and the backend. You have data (the “model”), a set of rules (the “validation model”), and an engine that runs the validation model against the data and returns the result.

The results are divided into three states: “error”, “warning” and “notice”, which are declared for each “validator” (elementary checking function) or “rule” (set of validators).

This is a simplified version of the validation engine, supporting only synchronous validation and using simplified syntax, but it is open-source, customizable and free.

Live demo application: use this link. Other links are at the bottom of this article.

What is “validation”?

Frameworks like React or Angular are built from components, so validation usually ends up with “component-level” granularity.

That is, suppose we have a data model like this:

const UserData = {
personalData: {
name: 'John',
surname: 'Doe'
},
contacts: [
{type: 'email', value: 'johndoe@mail.com', default: true},
{type: 'cellular', value: '+5500123456678'},
{type: 'work-phone', value: '+55006543210'}
],
preferences: {
colorTheme: 'dark',
volume: 60,
}
}

We store this data somewhere: in a service, in a store, in some context. Then we display its values in different components:

// changeData is assumed to be passed in by the parent (store, context, etc.)
export const PersonalData: React.FC<{
  data: typeof UserData;
  changeData: (field: string, value: string) => void;
}> = ({data, changeData}) => {
  // Some code to determine if there is any error
  return <fieldset>
    <div className={'some_class_to_reflect_the_error_state'}>
      <label htmlFor="user_name">Name</label>
      <input type="text"
             id="user_name"
             name="user_name"
             value={data.personalData.name}
             onChange={(e) => changeData('name', e.target.value)}
      />
      {/* Some error message(s) for the field */}
    </div>

    <div className={'some_class_to_reflect_the_error_state'}>
      <label htmlFor="user_surname">Surname</label>
      <input type="text"
             id="user_surname"
             name="user_surname"
             value={data.personalData.surname}
             onChange={(e) => changeData('surname', e.target.value)}
      />
      {/* Some error message(s) for the field */}
    </div>
  </fieldset>;
}

And for sure, we need to validate data before submitting the form.

With component granularity, we have to add such validations for every single field at the component level. We have to take care of user.personalData.name, user.contacts and every field we need to validate.

Then we need to check the overall state before sending. And then — check the transmitted data again, on the server side:

Typical data flow

It’s always a pain, and you know where.

There are plenty of libraries we can use to simplify the process: e.g., Formik. But anyway, there –

  • We will have validation at the field level, and we need to aggregate it somehow. In other words, we have 3 “dumb” components that map to a common model, and each component has its own validation and error state, and we need some way to check if the “Submit” button — which is outside of the data component’s tree — should be disabled.
  • If we do have an all-in-one solution, it is usually UI-oriented and therefore not very friendly to the developer who uses it: we need to lay out the data, decompose the fields, decompose the states, lay out the overall state, lay out the data again… It is clumsy and awkward. Pieces of data and state are constantly flying back and forth:
Smart component with validation

This is a dilemma: we want to encapsulate all data-related logic in a single component like <GenericTextInput>, and it seems logical to keep the validation logic inside it too. Or should it live outside? After all, the state of one component can often be determined only from other fields…

  • Or we can have a “smart” container component (or store or service) and more “dumb” view components, and then share the validation state as props. This is closer to the model level:
Container component with validation and data, Less Smart presentation component

Now we can take one more step and create a generic validation engine that works at the model level. Instead of scattering validation logic across components at the UI level, we can keep everything related to validation in a dedicated service and pass it the data and the rules.

With model-level validation we have one single data model, one single validation model and one single validation result.

That was a very long introduction… well, glad it is over.

A bit of a boring history

I used this approach for the first time in ~2014 in an AngularJS-based project. It was a very large fintech product with very complex validation rules and huge forms, and regressions were always painful. For example, part of the logic was asynchronous, and another part was asynchronous and ran on the server side. Yes, the user selects one value and we should display “Sorry, not enough items in stock to reserve” or something like that.

After that, I used model-level validation for a decade in various enterprise-grade projects based on AngularJS, Angular (up to 16), React, Vue and Svelte. It turned out that this framework-independent method is very convenient: it lets you fully control the validation process and run literally the same code on the server and the client. I’ve tried a lot of third-party validators and form-based components, but they’re not perfect. And they were not invented here, of course.

A bit of disclaimer

In this article, I promote my open source lx-model-validator package, which is not as powerful as enterprise-level solutions (which are unfortunately under NDA). But it is quite flexible and powerful.

What is included:

  • Rules and validators enough for everyday usage,
  • Custom validators to extend,
  • TypeScript,
  • Unit tests,
  • React bindings with lx-model-validator-react,
  • Live demo/docs project,
  • Basic set of validators,
  • Simplified i18n integration,
  • Simplified syntax and flat structure,
  • Simplified post-validation,
  • Simplified iterators,
  • Simplified rule conditions.

What is not included:

  • Asynchronous validation models,
  • Complex nested and interdependent rules,
  • Automatic contexts,
  • Deep i18n integration,
  • Complex and conditional iterators.

For instance, with a full-scale validation engine you can use addressing like user.contacts[>1].titles[2..3].value, whereas the simplified syntax only allows user.contacts[*].titles[2].value: explicit indexes ([2]) and full scans ([*]).
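
To make the simplified addressing more concrete, here is a minimal sketch of how a path with explicit indexes and [*] scans could be expanded into concrete values. The resolvePath function and its return shape are hypothetical and written only for illustration; this is not the package's actual implementation.

```typescript
// Hypothetical helper: expands a path like 'contacts[*].value' into concrete
// (path, value) pairs. Supports only dot access, [N] and [*].
type Resolved = { path: string; value: unknown };

function resolvePath(data: unknown, path: string): Resolved[] {
  // 'contacts[*].value' -> ['contacts', '[*]', 'value']
  const tokens = path.match(/[^.[\]]+|\[(?:\d+|\*)\]/g) ?? [];
  let current: Resolved[] = [{ path: '', value: data }];

  for (const token of tokens) {
    const next: Resolved[] = [];
    for (const { path: p, value } of current) {
      if (token === '[*]') {
        // Full scan: one branch per array element.
        if (Array.isArray(value)) {
          value.forEach((item, i) => next.push({ path: `${p}[${i}]`, value: item }));
        }
      } else if (token.startsWith('[')) {
        // Explicit index: keep the branch only if the element exists.
        const index = Number(token.slice(1, -1));
        if (Array.isArray(value) && index < value.length) {
          next.push({ path: `${p}${token}`, value: value[index] });
        }
      } else if (typeof value === 'object' && value !== null && token in value) {
        // Plain property access.
        const key = p ? `${p}.${token}` : token;
        next.push({ path: key, value: (value as Record<string, unknown>)[token] });
      }
    }
    current = next;
  }
  return current;
}

const contactsSample = {
  contacts: [{ type: 'email', value: 'johndoe@mail.com' }, { type: 'cellular', value: '+5500123456678' }],
};
const everyValue = resolvePath(contactsSample, 'contacts[*].value'); // two entries
const missing = resolvePath(contactsSample, 'contacts[5].value');    // empty: no such element
```

An empty result for a non-existent path is what would let an engine report the rule as skipped rather than failed.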

Details

Model-level validation means that we have a special entity, the validation model, that describes how to validate a particular model, and a validation engine that performs that validation process.

Data flow diagram: Model-level Validation
const UserValidation = {
'personalData.name': {
validators: [
{
validator: ValidatorStringRequired,
message: 'Name is required'
}, {
validator: ValidatorStringLength,
params: {min: 2, skipIfEmpty: true},
message: 'At least 2 characters, please'
}
]
},
'personalData.surname': {
level: 'warning',
validators: [
{
validator: ValidatorStringRequired,
message: 'Please provide a value'
},
]
},
'contacts_aggregate': {
validators: [
{
validator: ValidatorArrayLength,
params: {min: 1},
message: 'There must be at least one entry'
}
]
},
'user_aggregate': {
message: 'User data is invalid',
postvalidator: (_, result) => {
return (countErrorsLike('personalData', result) +
countErrorsLike('contacts', result)) === 0;
}
}
}

Here we can reach the consistency between model and state, and have a single source of truth:

const result = ValidationEngine.validate(UserData, UserValidation);
// result:
{
state: 'completed',
level: 'none',
stats: {
started_at: '...',
finished_at: '...',
time: 0.32,
processed_rules: 4,
processed_validators: 4,
total_errors: 0,
total_warnings: 0,
total_notices: 0,
total_skipped: 0
},
errors: {},
warnings: {},
notices: {},
skipped: []
}

Or, with empty name field:

{
state: 'completed',
level: 'error',
stats: {
started_at: '...',
finished_at: '...',
time: 0.37766699492931366,
processed_rules: 4,
processed_validators: 4,
total_errors: 2,
total_warnings: 0,
total_notices: 0,
total_skipped: 1
},
errors: {
'personalData.name': ['Name is required'],
'user_aggregate': ['User data is invalid']
},
warnings: {},
notices: {},
skipped: []
}

All we need to do now is display the messages in the appropriate places and disable the “Submit” button based on result.stats.total_errors.
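
As a rough illustration of consuming the result, the sketch below derives the “Submit” state and the messages for one field from a result object. The ResultSlice interface only mirrors the fields shown above, and canSubmit and errorsFor are hypothetical helper names, not part of the package.

```typescript
// Minimal slice of the result shape shown above; only the fields used here.
interface ResultSlice {
  stats: { total_errors: number };
  errors: Record<string, string[]>;
}

// Hypothetical helper: the form can be submitted only when there are no errors.
const canSubmit = (result: ResultSlice): boolean => result.stats.total_errors === 0;

// Hypothetical helper: error messages to show next to a single field.
const errorsFor = (result: ResultSlice, path: string): string[] => result.errors[path] ?? [];

// The failed result from the example above, reduced to the relevant fields.
const failedResult: ResultSlice = {
  stats: { total_errors: 2 },
  errors: {
    'personalData.name': ['Name is required'],
    'user_aggregate': ['User data is invalid'],
  },
};

const submitDisabled = !canSubmit(failedResult);                  // true
const nameErrors = errorsFor(failedResult, 'personalData.name');  // ['Name is required']
```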

It’s a bit like a state management architecture: instead of getting data, then decomposing it into pieces and binding it to a component by going through a tree of nested components, we have a single context that is available everywhere.

This is a centralized solution where the developer has full control over performing validation, retrieving results and displaying state. The state is completely separate from the view, but is highly optimized for use in the UI.

Because the solution is framework-independent, we can run the same validation model on both the client and the server. Moreover, we can keep the validation models on the server and load them to the client on demand, or construct the validation models on the fly.

And, of course, with this approach we can get very convenient unit testing and even create our own test automation framework (e.g., a set of data that should pass, a set of data that should fail, and an automated run).
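
A table-driven check of this kind could look roughly like the sketch below. Everything here is an assumption made for illustration: runCases is a hypothetical harness, and fakeValidate stands in for a real call such as ValidationEngine.validate(data, model).

```typescript
type Level = 'none' | 'notice' | 'warning' | 'error';
type Case<T> = { name: string; data: T; expected: Level };

// Runs every case through the supplied validate function and returns
// the names of the cases whose level differs from the expectation.
function runCases<T>(validate: (data: T) => { level: Level }, cases: Case<T>[]): string[] {
  return cases.filter((c) => validate(c.data).level !== c.expected).map((c) => c.name);
}

// Stand-in for a real engine call; it only checks that name is not empty.
const fakeValidate = (data: { name: string }): { level: Level } => ({
  level: data.name ? 'none' : 'error',
});

const cases: Case<{ name: string }>[] = [
  { name: 'filled name passes', data: { name: 'John' }, expected: 'none' },
  { name: 'empty name fails', data: { name: '' }, expected: 'error' },
];

const unexpected = runCases(fakeValidate, cases); // [] when every case behaves as expected
```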

Usage

Well, first we need to declare a validation model. It is pretty simple, just a JS (TS) structure of rules:

const ValidationModel = {
'path.in.model': {
message: 'Optional message', // otherwise result will be a message from validator
level: 'error', // optional, 'error', 'warning', 'notice',
active: true, // true|false or (data, value) => boolean
validators: [/* array of validators... */],
postvalidator: (data, result) => true // ...or special post-validation function
},
'other.path.in.model': {}
}

Result of validations is a special structure which includes errors, warnings and notices data according to the levels of violation for each validator:

...
errors: {
'personalData.name': ['Required', 'The length is insufficient'],
'user': ['User data is invalid']
}

Here we have:

  1. path.in.model is, unsurprisingly, a path in the model, like personalData.name.
    It uses the common dot-separated JS notation. Valid paths: user.data.personal.name, user.contacts[], user.contacts[*], user.contacts[0].type, user.contacts[*].value.
    As you can see, in addition to direct addressing, there is array addressing: contacts[] means the entire array (for statements like "must have at least one element"), contacts[0] means the first element in the array, contacts[*] means "for every element in the array".
  2. message. The validation engine looks for the message in three places, in order: the message of the validator, then the message of the rule, then an automatic 'Empty message, %path%'.
  3. level means 'violation level' and could be unknown -> none -> notice -> warning -> error. unknown means validation wasn't completed, none means there are no errors at all.
  4. active controls the rule execution. It can be a static boolean value or a function that returns a boolean. When it is false, the rule is ignored.
  5. validators is the most interesting field. A validator is a function that receives a value (found by path.in.model) and optional params, and returns true, false or undefined.
    Here true means 'validation passed, no errors', false means 'validation failed' and undefined means 'validation skipped for some reason'. The last one, undefined, is very important for cases like 'Password is required' + 'Min length is 8 characters': it makes no sense to output both messages for an empty field, so most validators include the skipIfEmpty option, which allows you to bypass further validation as long as there is no value.
  6. postvalidator is a special thing for aggregates (see below).
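
The rule fields described above can be sketched as a small loop over one rule. This is only an illustration under stated assumptions: runRule, Outcome and the two stand-in validators are hypothetical names, and the real engine may work differently.

```typescript
type Level = 'notice' | 'warning' | 'error';
type ValidatorFn = (value: unknown, params?: Record<string, unknown>) => boolean | undefined;

interface Validator { validator: ValidatorFn; level?: Level; message?: string; params?: Record<string, unknown> }
interface Rule {
  level?: Level;
  message?: string;
  active?: boolean | ((data: unknown, value: unknown) => boolean);
  validators?: Validator[];
}
interface Outcome { violations: { level: Level; message: string }[]; skipped: number }

// Hypothetical: runs one rule against an already-resolved value.
function runRule(path: string, rule: Rule, data: unknown, value: unknown): Outcome {
  const outcome: Outcome = { violations: [], skipped: 0 };
  const active = typeof rule.active === 'function' ? rule.active(data, value) : (rule.active ?? true);
  if (!active) return outcome; // inactive rules are ignored

  for (const v of rule.validators ?? []) {
    const verdict = v.validator(value, v.params);
    if (verdict === undefined) { outcome.skipped++; continue; } // not applicable, skip
    if (verdict === false) {
      outcome.violations.push({
        level: v.level ?? rule.level ?? 'error',                        // validator, then rule, then default
        message: v.message ?? rule.message ?? `Empty message, ${path}`, // same fallback order
      });
    }
  }
  return outcome;
}

// Stand-ins for illustration only, not the package's predefined validators.
const required: ValidatorFn = (value) => typeof value === 'string' && value.length > 0;
const minLength: ValidatorFn = (value, params) => {
  if (value === '' || value === undefined) return params?.skipIfEmpty ? undefined : false;
  return typeof value === 'string' && value.length >= Number(params?.min ?? 0);
};

const nameRule: Rule = {
  validators: [
    { validator: required, message: 'Name is required' },
    { validator: minLength, params: { min: 2, skipIfEmpty: true }, message: 'At least 2 characters, please' },
  ],
};

const emptyName = runRule('personalData.name', nameRule, {}, ''); // 1 violation, 1 skipped
const shortName = runRule('personalData.name', nameRule, {}, 'J'); // 1 violation, 0 skipped
```

With an empty value only 'Name is required' is reported, because the length check returns undefined and is counted as skipped.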

Validators

A validator is a function that receives a value and some options, and returns the result of validation:

  • true for passed, no error,
  • false for violation,
  • undefined if the state is unknown (e.g., invalid datatype), or if validation was skipped for some reason ('undefined' value with skipIfEmpty===true)

You can think of it as a filter that must be passed, so true means “passed”.

The signature:

type TValidatorFnResult = boolean | undefined 
// true - ok, false - violated, undefined - not applicable, skip

type TValidatorFn = (value: any, params?: Record<string, any>, data?: any)
=> TValidatorFnResult

type TValidatorMessage = string | ((data: any, value: any) => string)

and

interface IValidator {
validator: TValidatorFn
level?: TValidationViolationLevel // default error
message?: TValidatorMessage
params?: Record<string, any>
}

And all together:

const UserValidation = {
'personalData.name': {
validators: [
{
validator: ValidatorStringRequired,
message: 'Name is required'
}, {
validator: ValidatorStringLength,
params: {min: 2, skipIfEmpty: true},
message: 'At least 2 characters, please'
}
]
},
}

Here we have two validators, ValidatorStringRequired and ValidatorStringLength (both predefined), with static messages and the default level, error, if violated.

As for the message field: the validation engine takes it from the validator or from the parent rule to build the text for the associated violation. It can be a static string or a function:

type TValidatorMessage = string | ((data: any, value: any) => string)

This means that you can use the “functional” syntax if you need, for example, i18n for messages at this level. In other words, for React apps you can provide –

{
validator: ValidatorStringRequired,
message: i18n('UserProfile__Name_is_required')
}

and for Angular –

{
validator: ValidatorStringRequired,
message: 'UserProfile__Name_is_required'
}
...
// somewhere in the markup:
{{ message | i18n }}
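
The function form of message can be sketched as follows. The TValidatorMessage type is the one quoted above; resolveMessage, the dictionary and t are hypothetical stand-ins for the engine's internal handling and for a real i18n library.

```typescript
type TValidatorMessage = string | ((data: any, value: any) => string);

// Hypothetical helper: turns either form of message into a plain string.
function resolveMessage(message: TValidatorMessage, data: unknown, value: unknown): string {
  return typeof message === 'function' ? message(data, value) : message;
}

// Stand-in translation table; a real app would call its i18n library here.
const dictionary: Record<string, string> = {
  UserProfile__Name_is_required: 'Name is required',
};
const t = (key: string): string => dictionary[key] ?? key;

// The function form defers the lookup until the message is actually needed.
const lazyMessage: TValidatorMessage = () => t('UserProfile__Name_is_required');
// The function form can also use the value that failed validation.
const valueMessage: TValidatorMessage = (_data, value) => `"${value}" is too short`;

const translated = resolveMessage(lazyMessage, {}, '');   // 'Name is required'
const withValue = resolveMessage(valueMessage, {}, 'J');  // '"J" is too short'
```

Assuming the function is called when the violation is reported, a language switch would be picked up on the next validation run without rebuilding the validation model.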

Predefined validators

The project includes 7 predefined validators. More complex or specific validators can be implemented manually using examples.

  • ValidatorStringRequired. Checks for the presence of a string.
  • ValidatorStringContains. Checks for the presence of a substring in a string.
  • ValidatorStringLength. Checks the minimum, or maximum, or both lengths for the given string.
  • ValidatorStringPattern. Checks whether the string matches the given pattern (regexp).
  • ValidatorEmail. Pretty obvious: checks whether the string contains a valid email address.
  • ValidatorNumberRange. Checks whether the given number is within the range.
  • ValidatorArrayLength. Simple validator like ValidatorStringLength but for arrays.
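
As an illustration of a manually implemented validator, here is a minimal sketch that follows the TValidatorFn signature shown earlier. ValidatorStringStartsWith and its prefix parameter are hypothetical and not part of the package; the skipIfEmpty handling is modelled on the behaviour described above.

```typescript
type TValidatorFnResult = boolean | undefined;
// true - ok, false - violated, undefined - not applicable, skip
type TValidatorFn = (value: any, params?: Record<string, any>, data?: any) => TValidatorFnResult;

// Hypothetical custom validator: the string must start with params.prefix.
const ValidatorStringStartsWith: TValidatorFn = (value, params) => {
  const isEmpty = value === undefined || value === null || value === '';
  // An empty value is either skipped (skipIfEmpty) or treated as a violation.
  if (isEmpty) return params?.skipIfEmpty ? undefined : false;
  // Unknown state: the value is not a string at all.
  if (typeof value !== 'string') return undefined;
  return value.startsWith(String(params?.prefix ?? ''));
};

// How it might be declared inside a rule:
const phoneRule = {
  validators: [
    {
      validator: ValidatorStringStartsWith,
      params: { prefix: '+', skipIfEmpty: true },
      message: 'Phone number must start with "+"',
    },
  ],
};
```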

Validation result

The result of a validation session is a data structure:

result = {
state: 'completed',
level: 'none',
stats: {
started_at: '...',
finished_at: '...',
time: 0.3203750103712082,
processed_rules: 4,
processed_validators: 4,
total_errors: 0,
total_warnings: 0,
total_notices: 0,
total_skipped: 0
},
errors: {},
warnings: {},
notices: {},
skipped: []
}

The most interesting fields here are:

  • level - overall level of violation,
  • errors, warnings and notices structures.

The first one is quite obvious: level means 'violation level' and could be unknown -> none -> notice -> warning -> error. unknown means validation wasn't completed, none means there are no violations (or nothing was checked because of conditions).
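
One way to read this ordering is that the overall level is the most severe non-empty bucket. The sketch below shows that reading only; overallLevel is a hypothetical function, and the engine's own calculation (including the unknown state for an unfinished run) may differ.

```typescript
type TViolation = Record<string, Array<string>>;
type OverallLevel = 'none' | 'notice' | 'warning' | 'error';

interface Buckets { errors: TViolation; warnings: TViolation; notices: TViolation }

// Hypothetical: derives the overall level as the most severe non-empty bucket.
function overallLevel(result: Buckets): OverallLevel {
  if (Object.keys(result.errors).length > 0) return 'error';
  if (Object.keys(result.warnings).length > 0) return 'warning';
  if (Object.keys(result.notices).length > 0) return 'notice';
  return 'none';
}

const onlyWarnings: Buckets = {
  errors: {},
  warnings: { 'personalData.surname': ['Please provide a value'] },
  notices: {},
};
const warningLevel = overallLevel(onlyWarnings); // 'warning'
```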

The errors, warnings and notices fields are of type TViolation:

type TViolation = Record<string, Array<string>>

Or, in data terms,

...
level: 'error',
errors: {
'user.personal.data.name': ['Name is required'],
'user.personal.data.surname': [
'Minimal length is 4 characters',
'Should include "Addams"'
]
},
warnings: {
'user.avatar': ['Please upload a picture'],
'contacts[0].zip': ['First address should include ZIP code']
}
...

As you can see, the same path.in.the.model addressing is present here, where each address can have one or more associated messages of three different levels.

The validation engine runs the validator, and if there is a violation (the validator returns false), it pushes the associated message into the array in the corresponding structure.

The corresponding structure is determined by the level of the validator, then of the rule, and falls back to 'error' if neither is set.

The message is taken from the validator, then from the rule, and falls back to an automatic ‘Empty message, %path%’ if neither is set.

skipped is a technical field (array) of rules that were skipped during validation because they are invalid (non-existent path, invalid rule declaration). E.g., if the model has a form.user.data field but the validation model declares a rule for form.users.data, this rule will be placed into the skipped array as form.users.data, users, because form does not have a users field.

It is recommended to check stats.total_skipped and the contents of the skipped field while debugging. Rules ending up there is not necessarily a bug, it depends on the model and the rules, but it is better to double-check.

An example of a valid rule that may be skipped without this being an error:

'user.contacts[1]': {
  // each entry follows the IValidator shape shown above
  validators: [{validator: ValidatorEmail}]
}

It will legitimately be put into skipped while contacts has only one element (with index 0). This is OK if the first item holds the phone number and the email in the second item is optional.

Aggregates

All of the above validators are granular, at the field level. This is usually sufficient because the user can check the overall result and create derived validators.

For example, we might want a user check: if any of the personalData fields has a violation, then the entire user subset/form section should have an error associated with it.

Or, if no address in contacts is marked as default, contactData should be marked with 'error'.

To make life a bit easier, there are so-called aggregates, or postvalidators: a special kind of validation rule that is executed after all the ‘normal’ validators and has access to the data and the current state of the validation result:

...
'user_info_aggregate': {
level: 'warning',
message: 'User info data is incorrect',
postvalidator: (_, result) => {
return countErrorsLike('user.data', result) === 0;
}
},
...
'contacts_aggregate': {
message: 'There must be "default" address',
postvalidator: (data, result) => {
return !!data.contacts?.some(record => record.default === true);
}
}

There are some helper functions available:

  • hasError(key, result),
  • hasWarning(key, result),
  • hasNotice(key, result),
  • hasErrors,
  • hasWarnings,
  • hasNotices,
  • countErrorsLike(key, result),
  • countWarningsLike(key, result),
  • countNoticesLike(key, result),
  • getValidationClass.

They usually receive the key (string) and the result of the check, and return the number of results (errors, warnings, notices) that correspond to the passed key. The key itself can be pretty complicated (exact or partial matching, a regex pattern), and because of that I have a special documentation page for helpers.
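
To give a feel for what such a helper might do, here is a rough sketch of key matching over one of the result structures. countLike is a hypothetical stand-in, not the package's countErrorsLike; it assumes that partial matching means "the path contains the key" and that messages, not paths, are counted.

```typescript
type TViolation = Record<string, Array<string>>;

// Hypothetical stand-in: counts messages whose path contains `key` (string)
// or matches it (RegExp). Avoid the global flag on the RegExp, since
// RegExp.prototype.test is stateful for global patterns.
function countLike(bucket: TViolation, key: string | RegExp): number {
  return Object.entries(bucket)
    .filter(([path]) => (typeof key === 'string' ? path.includes(key) : key.test(path)))
    .reduce((sum, [, messages]) => sum + messages.length, 0);
}

const errors: TViolation = {
  'personalData.name': ['Name is required', 'At least 2 characters, please'],
  'contacts[0].value': ['Invalid email'],
};

const personal = countLike(errors, 'personalData');       // 2
const anyContact = countLike(errors, /^contacts\[\d+\]/); // 1
```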

React bindings

Even though using the validator is fairly straightforward, it is still a low-level subsystem. So I created a small package, lx-model-validator-react, which includes several useful components that encapsulate common tasks.

Just bindings: headless components that use Model Validator helpers.

Headless means that they do not contain any CSS styles.

  • <ValidationMessageComponent>. This is the root component to display the violations.
  • <ValidationAnyMessageComponent>. This is the dispatching component. It just wraps up the calls of ValidationMessageComponent with corresponding settings.
  • <ValidationErrorMessageComponent>. Wrapper for <ValidationMessageComponent> with predefined type={'error'}
  • <ValidationWarningMessageComponent>. Wrapper for <ValidationMessageComponent> with predefined type={'warning'}
  • <ValidationNoticeMessageComponent>. Wrapper for <ValidationMessageComponent> with predefined type={'notice'}
  • <ValidationTooltipComponent>. This component is designed to work together with React tooltip component.

More details: page.
