Data validation means ensuring that data has the desired structure and content.
With TypeScript, validation becomes relevant when we receive external data such as:
In these cases, we expect the data to fit static types we have, but we can’t be sure. Contrast that with data we create ourselves, where TypeScript continuously checks that everything is correct.
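To make the contrast concrete, here is a minimal sketch. The type `Point` and the type guard `isPoint()` are hypothetical names that I made up for illustration; they are not part of any library.

```typescript
// Data we create ourselves: the compiler checks the shape statically.
type Point = { x: number; y: number };
const ours: Point = { x: 1, y: 2 };

// External data: JSON.parse() returns `any`, so nothing is checked for us.
// Annotating the result as `unknown` forces us to validate before use.
const external: unknown = JSON.parse('{"x": 1, "y": "two"}');

// Hypothetical hand-written type guard for Point
function isPoint(value: unknown): value is Point {
  if (typeof value !== 'object' || value === null) return false;
  const record = value as Record<string, unknown>;
  return typeof record.x === 'number' && typeof record.y === 'number';
}

if (isPoint(external)) {
  // Only inside this branch does TypeScript treat `external` as a Point
  console.log(external.x + external.y);
}
```

Writing such checks by hand quickly becomes tedious, which is why the libraries discussed in this post exist.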
This blog post explains how to validate external data in TypeScript.
Before we can explore approaches for data validation in TypeScript, we need to take a look at JSON schema because several of the approaches are based on it.
The idea behind JSON schema is to express the schema (structure and content, think static type) of JSON data in JSON. That is, metadata is expressed in the same format as data.
The use cases for JSON schema are:
Validating JSON data: If we have a schema definition for data, we can use tools to check that the data is correct. One issue with data can also be fixed automatically: We can specify default values that can be used to add properties that are missing.
Documenting JSON data formats: On one hand, the core schema definitions can be considered documentation. But JSON schema additionally supports descriptions, deprecation notes, comments, examples, and more. These mechanisms are called annotations. They are not used for validation, but for documentation.
package.json files is completely based on a JSON schema.
This example is taken from the json-schema.org website:
{
  "$id": "https://example.com/geographical-location.schema.json",
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Longitude and Latitude Values",
  "description": "A geographical coordinate.",
  "required": [ "latitude", "longitude" ],
  "type": "object",
  "properties": {
    "latitude": {
      "type": "number",
      "minimum": -90,
      "maximum": 90
    },
    "longitude": {
      "type": "number",
      "minimum": -180,
      "maximum": 180
    }
  }
}
The following JSON data is valid w.r.t. this schema:
{
  "latitude": 48.858093,
  "longitude": 2.294694
}
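To illustrate what the schema above demands, the following sketch spells out the same constraints as a hand-written TypeScript check. It is not a JSON schema validator and does not read the schema; `GeoLocation`, `inRange()` and `isGeoLocation()` are hypothetical names used only for this example.

```typescript
// Static counterpart of the schema (the ranges cannot be expressed as a type)
type GeoLocation = { latitude: number; longitude: number };

// Mirrors "type": "number" plus "minimum" and "maximum"
function inRange(value: unknown, min: number, max: number): value is number {
  return typeof value === 'number' && value >= min && value <= max;
}

// Mirrors "type": "object", "required" and the two property schemas
function isGeoLocation(data: unknown): data is GeoLocation {
  if (typeof data !== 'object' || data === null || Array.isArray(data)) {
    return false;
  }
  const record = data as Record<string, unknown>;
  // A missing property is `undefined` and therefore fails inRange()
  return (
    inRange(record.latitude, -90, 90) &&
    inRange(record.longitude, -180, 180)
  );
}

const valid = isGeoLocation(
  JSON.parse('{"latitude": 48.858093, "longitude": 2.294694}'));
const outOfRange = isGeoLocation(
  JSON.parse('{"latitude": 91, "longitude": 2.294694}'));
const incomplete = isGeoLocation(JSON.parse('{"latitude": 48.858093}'));
```

With a JSON schema library, we would not write this function ourselves: the library would interpret the schema at runtime and perform the equivalent checks.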
This section provides a brief overview of various approaches for validating data in TypeScript. For each approach, I list one or more libraries that support the approach. W.r.t. libraries, I don’t intend to be comprehensive because things change quickly in this space.
Approach: Invoking builder methods and functions to produce both validation functions (at runtime) and static types (at compile time). Libraries include:
github.com/vriad/zod (demonstrated later in this blog post)
github.com/pelotom/runtypes
github.com/gcanti/io-ts
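The following sketch shows, in a much simplified form, how a builder API can produce a validation function and a static type at the same time. It is not the API of any of the libraries listed above; `Schema`, `Infer`, `str` and `array()` are hypothetical names invented for this illustration.

```typescript
// A schema bundles a runtime check; the type parameter T carries the static type.
interface Schema<T> {
  parse(value: unknown): T; // throws if the value does not match
}

// Extracts the static type from a schema (exists only at compile time)
type Infer<S> = S extends Schema<infer T> ? T : never;

// Builder for strings
const str: Schema<string> = {
  parse(value) {
    if (typeof value !== 'string') throw new TypeError('Expected a string');
    return value;
  },
};

// Builder for Arrays whose elements match the schema `item`
function array<T>(item: Schema<T>): Schema<T[]> {
  return {
    parse(value) {
      if (!Array.isArray(value)) throw new TypeError('Expected an Array');
      return value.map((elem) => item.parse(elem));
    },
  };
}

// One definition, two results:
const TagsSchema = array(str); // runtime validator
type Tags = Infer<typeof TagsSchema>; // static type string[]

const tags: Tags = TagsSchema.parse(['vacation', 'family']);
```

The key idea is that each builder call returns a value whose generic type parameter records what a successful validation produces, so the static type never has to be written separately.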
Approach: Invoking builder methods and only producing validation functions. Libraries taking this approach often focus on making validation as versatile as possible:
Approach: Compiling TypeScript types to validation code at compile time. Libraries:
github.com/ts-type-makeup/superstruct-ts-transformer
github.com/vedantroy/typecheck.macro
Approach: Converting TypeScript types to JSON schema. Libraries:
Approach: Converting a JSON schema to TypeScript types. Libraries:
Approach: Validating JSON data via JSON schemas. Both of the previous approaches tend to also do this. npm packages:
Which approach, and therefore which library, to use depends on what we need:
If we are starting with TypeScript types and want to ensure that data (coming from configuration files, etc.) fits those types, then builder APIs that support static types are a good choice.
If our starting point is a JSON schema, then we should consider one of the libraries that support JSON schema.
If we are handling data that is more messy (e.g. submitted via forms), we may need a more flexible approach where static types play less of a role.
Zod has a builder API that produces both types and validation functions. That API is used as follows:
import * as z from 'zod';
const FileEntryInputSchema = z.union([
  z.string(),
  z.tuple([z.string(), z.string(), z.array(z.string())]),
  z.object({
    file: z.string(),
    author: z.string().optional(),
    tags: z.array(z.string()).optional(),
  }),
]);
For larger schemas, it can make sense to break things up into multiple const declarations.
Zod can produce a static type from FileEntryInputSchema, but I decided to (redundantly!) manually maintain the static type FileEntryInput:
type FileEntryInput =
  | string
  | [string, string, string[]]
  | {file: string, author?: string, tags?: string[]}
  ;
Why the redundancy?
Zod’s generated type is still helpful because we can check if it’s assignable to FileEntryInput. That will warn us about most problems related to the two getting out of sync.
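One way to make such an assignability check explicit is a small compile-time helper. In the sketch below, `AssertAssignable` is a hypothetical helper type, and `InferredFromSchema` is a hand-written stand-in for the type that Zod infers, so that the snippet does not depend on Zod. In real code, it would be `z.infer<typeof FileEntryInputSchema>`.

```typescript
// Stand-in for the type that Zod infers from FileEntryInputSchema
type InferredFromSchema =
  | string
  | [string, string, string[]]
  | { author?: string | undefined; tags?: string[] | undefined; file: string };

// The manually maintained type
type FileEntryInput =
  | string
  | [string, string, string[]]
  | { file: string; author?: string; tags?: string[] };

// Hypothetical helper: only compiles if Source is assignable to Target
type AssertAssignable<Target, Source extends Target> = Source;

// If the two types drift apart, this line produces a compile-time error
type Checked = AssertAssignable<FileEntryInput, InferredFromSchema>;

const example: Checked = ['iceland.txt', 'me', ['vacation', 'family']];
```

The check only goes in one direction: it reports inferred values that FileEntryInput does not accept, but not the reverse. That is one reason why it catches most, but not all, ways of getting out of sync.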
The following function checks if the parameter data conforms to FileEntryInputSchema:
import * as assert from 'assert';
function validateData(data: unknown): FileEntryInput {
  return FileEntryInputSchema.parse(data); // may throw an exception
}
validateData(['iceland.txt', 'me', ['vacation', 'family']]); // OK
// Throws: the tuple variant requires three elements
assert.throws(
  () => validateData(['iceland.txt', 'me']));
The static type of the result of FileEntryInputSchema.parse() is what Zod derived from FileEntryInputSchema. By making FileEntryInput the return type of validateData(), we ensure that the former type is assignable to the latter.
FileEntryInputSchema.check() is a type guard:
function func(data: unknown) {
  if (FileEntryInputSchema.check(data)) {
    // %inferred-type: string
    //   | [string, string, string[]]
    //   | { author?: string | undefined; tags?: string[] | undefined; file: string; }
    data;
  }
}
It can make sense to define a custom type guard that supports FileEntryInput instead of what Zod infers.
function isValidData(data: unknown): data is FileEntryInput {
  return FileEntryInputSchema.check(data);
}
The parameterized type z.infer<Schema> can be used to derive a type from a schema:
// %inferred-type: string
//   | [string, string, string[]]
//   | { author?: string | undefined; tags?: string[] | undefined; file: string; }
type FileEntryInputStatic = z.infer<typeof FileEntryInputSchema>;
When working with external data, it’s often useful to distinguish two types.
On one hand, there is the type that describes the input data. Its structure is optimized for being easy to author:
type FileEntryInput =
  | string
  | [string, string, string[]]
  | {file: string, author?: string, tags?: string[]}
  ;
On the other hand, there is the type that is used in the program. Its structure is optimized for being easy to use in code:
type FileEntry = {
  file: string,
  author: null|string,
  tags: string[],
};
After we have used Zod to ensure that the input data conforms to FileEntryInput, we use a conversion function that converts the data to a value of type FileEntry.
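Such a conversion function could look roughly as follows. This is a sketch under assumptions: the name `toFileEntry()` is hypothetical, I assume that the tuple elements are file, author and tags (in that order), and I assume that a missing author becomes null and missing tags become an empty Array.

```typescript
type FileEntryInput =
  | string
  | [string, string, string[]]
  | { file: string; author?: string; tags?: string[] };

type FileEntry = {
  file: string;
  author: null | string;
  tags: string[];
};

// Hypothetical conversion function: normalizes every input variant
function toFileEntry(input: FileEntryInput): FileEntry {
  if (typeof input === 'string') {
    // Variant 1: only a file name
    return { file: input, author: null, tags: [] };
  }
  if (Array.isArray(input)) {
    // Variant 2: tuple, assumed order is [file, author, tags]
    const [file, author, tags] = input;
    return { file, author, tags };
  }
  // Variant 3: object with optional properties (defaults are assumptions)
  return {
    file: input.file,
    author: input.author ?? null,
    tags: input.tags ?? [],
  };
}

const fromString = toFileEntry('iceland.txt');
const fromTuple = toFileEntry(['iceland.txt', 'me', ['vacation', 'family']]);
const fromObject = toFileEntry({ file: 'iceland.txt', tags: ['vacation'] });
```

After the conversion, the rest of the program only has to deal with a single shape and never has to check whether author or tags exist.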
My use case for a data validation library was making sure that data matched a given TypeScript type. Therefore, I would have preferred to directly compile the type to a validation function. So far, only the Babel macro typecheck.macro does that, and its dependence on Babel ruled it out for me. I think I would also be OK with a tool that compiles a TypeScript type to a separate module with a validation function. But that also has downsides, usability-wise.
Therefore, Zod currently is a good solution for me and I haven’t had any regrets.
For libraries that have a builder API, I’d like to have tools that compile TypeScript types to builder API invocations (online and via a command line). This would help in two ways: