Working with file system paths and file URLs on Node.js

[2022-07-15] dev, javascript, nodejs

Warning: This blog post is outdated. Instead, read chapter “Working with file system paths and file URLs on Node.js” in “Shell scripting with Node.js”.

In this blog post, we learn how to work with file system paths and file URLs on Node.js.

Path-related functionality on Node.js is spread across the following places:

Most path-related functionality is in module 'node:path'.
The global variable process has methods for getting and changing the current working directory (which is explained soon).
Module 'node:os' has functions that return the paths of important directories.

The three ways of accessing the `'node:path'` API

Module 'node:path' is often imported as follows:

import * as path from 'node:path';

In this blog post, this import statement is occasionally omitted. We also omit the following import:

import * as assert from 'node:assert/strict';

We can access Node’s path API in three ways:

We can access platform-specific versions of the API:
- path.posix supports Unixes including macOS.
- path.win32 supports Windows.
path itself always supports the current platform. For example, this is a REPL interaction on macOS:
```
> path.parse === path.posix.parse
true
```

Let’s see how function path.parse(), which parses file system paths, differs for the two platforms:

> path.win32.parse(String.raw`C:\Users\jane\file.txt`)
{
  dir: 'C:\\Users\\jane',
  root: 'C:\\',
  base: 'file.txt',
  name: 'file',
  ext: '.txt',
}
> path.posix.parse(String.raw`C:\Users\jane\file.txt`)
{
  dir: '',
  root: '',
  base: 'C:\\Users\\jane\\file.txt',
  name: 'C:\\Users\\jane\\file',
  ext: '.txt',
}

We parse a Windows path – first correctly via the path.win32 API, then via the path.posix API. We can see that in the latter case, the path isn’t correctly split into its parts – for example, the basename of the file should be file.txt (more on what the other properties mean later).

Foundational path concepts and their API support

Path segments, path separators, path delimiters

Terminology:

A non-empty path consists of one or more path segments – most often names of directories or files.
A path separator is used to separate two adjacent path segments in a path:
```
> path.posix.sep
'/'
> path.win32.sep
'\\'
```

A path delimiter separates elements in a list of paths:

> path.posix.delimiter
':'
> path.win32.delimiter
';'

We can see path separators and path delimiters if we examine the PATH shell variable – which contains the paths where the operating system looks for executables when a command is entered in a shell.

This is an example of a macOS PATH (shell variable $PATH):

> process.env.PATH.split(/(?<=:)/)
[
  '/opt/homebrew/bin:',
  '/opt/homebrew/sbin:',
  '/usr/local/bin:',
  '/usr/bin:',
  '/bin:',
  '/usr/sbin:',
  '/sbin',
]

The split separator has a length of zero because the lookbehind assertion (?<=:) matches if a given location is preceded by a colon but it does not capture anything. Therefore, the path delimiter ':' is included in the preceding path.

This is an example of a Windows PATH (shell variable %Path%):

> process.env.Path.split(/(?<=;)/)
[
  'C:\\Windows\\system32;',
  'C:\\Windows;',
  'C:\\Windows\\System32\\Wbem;',
  'C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\;',
  'C:\\Windows\\System32\\OpenSSH\\;',
  'C:\\ProgramData\\chocolatey\\bin;',
  'C:\\Program Files\\nodejs\\',
]

The current working directory

Many shells have the concept of the current working directory (CWD) – “the directory I’m currently in”:

If we use a command with a partially qualified path, that path is resolved against the CWD.
If we omit a path when a command expects a path, the CWD is used.
On both Unixes and Windows, the command to change the CWD is cd.

process is a global Node.js variable. It provides us with methods for getting and setting the CWD:

process.cwd() returns the CWD.
process.chdir(dirPath) changes the CWD to dirPath.
- There must be a directory at dirPath.
- That change does not affect the shell, only the currently running Node.js process.

Node.js uses the CWD to fill in missing pieces whenever a path isn’t fully qualified (complete). That enables us to use partially qualified paths with various functions – e.g. fs.readFileSync().

The current working directory on Unix

The following code demonstrates process.chdir() and process.cwd() on Unix:

process.chdir('/home/jane');
assert.equal(
  process.cwd(), '/home/jane'
);

The current working directory on Windows

So far, we have used the current working directory on Unix. Windows works differently:

Each drive has a current directory.
There is a current drive.

We can use process.chdir() to set both at the same time:

process.chdir('C:\\Windows');
process.chdir('Z:\\tmp');

When we revisit a drive, Node.js remembers the previous current directory of that drive:

assert.equal(
  process.cwd(), 'Z:\\tmp'
);
process.chdir('C:');
assert.equal(
  process.cwd(), 'C:\\Windows'
);

Fully vs. partially qualified paths, resolving paths

A fully qualified path does not rely on any other information and can be used as is.
A partially qualified path is missing information: We need to turn it into a fully qualified path before we can use it. That is done by resolving it against a fully qualified path.

Fully and partially qualified paths on Unix

Unix only knows two kinds of paths:

Absolute paths are fully qualified and start with a slash:
```
/home/john/proj
```

Relative paths are partially qualified and start with a filename or a dot:

.   (current directory)
..  (parent directory)
dir
./dir
../dir
../../dir/subdir

Let’s use path.resolve() (which is explained in more detail later) to resolve relative paths against absolute paths. The results are absolute paths:

> const abs = '/home/john/proj';

> path.resolve(abs, '.')
'/home/john/proj'
> path.resolve(abs, '..')
'/home/john'
> path.resolve(abs, 'dir')
'/home/john/proj/dir'
> path.resolve(abs, './dir')
'/home/john/proj/dir'
> path.resolve(abs, '../dir')
'/home/john/dir'
> path.resolve(abs, '../../dir/subdir')
'/home/dir/subdir'

Fully and partially qualified paths on Windows

Windows distinguishes four kinds of paths (for more information, see Microsoft’s documentation):

There are absolute paths and relative paths.
Each of those two kinds of paths can have a drive letter (“volume designator”) or not.

Absolute paths with drive letters are fully qualified. All other paths are partially qualified.

Resolving an absolute path without a drive letter against a fully qualified path full picks up the drive letter of full:

> const full = 'C:\\Users\\jane\\proj';

> path.resolve(full, '\\Windows')
'C:\\Windows'

Resolving a relative path without a drive letter against a fully qualified path can be viewed as updating the latter:

> const full = 'C:\\Users\\jane\\proj';

> path.resolve(full, '.')
'C:\\Users\\jane\\proj'
> path.resolve(full, '..')
'C:\\Users\\jane'
> path.resolve(full, 'dir')
'C:\\Users\\jane\\proj\\dir'
> path.resolve(full, '.\\dir')
'C:\\Users\\jane\\proj\\dir'
> path.resolve(full, '..\\dir')
'C:\\Users\\jane\\dir'
> path.resolve(full, '..\\..\\dir')
'C:\\Users\\dir'

Resolving a relative path rel with a drive letter against a fully qualified path full depends on the drive letter of rel:

Same drive letter as full? Resolve rel against full.
Different drive letter than full? Resolve rel against the current directory of rel’s drive.

That looks as follows:

// Configure current directories for C: and Z:
process.chdir('C:\\Windows\\System');
process.chdir('Z:\\tmp');

const full = 'C:\\Users\\jane\\proj';

// Same drive letter
assert.equal(
  path.resolve(full, 'C:dir'),
  'C:\\Users\\jane\\proj\\dir'
);
assert.equal(
  path.resolve(full, 'C:'),
  'C:\\Users\\jane\\proj'
);

// Different drive letter
assert.equal(
  path.resolve(full, 'Z:dir'),
  'Z:\\tmp\\dir'
);
assert.equal(
  path.resolve(full, 'Z:'),
  'Z:\\tmp'
);

Getting the paths of important directories via module `'node:os'`

The module 'node:os' provides us with the paths of two important directories:

os.homedir() returns the path to the home directory of the current user – for example:

> os.homedir() // macOS
'/Users/rauschma'
> os.homedir() // Windows
'C:\\Users\\axel'

os.tmpdir() returns the path of the operating system’s directory for temporary files – for example:

> os.tmpdir() // macOS
'/var/folders/ph/sz0384m11vxf5byk12fzjms40000gn/T'
> os.tmpdir() // Windows
'C:\\Users\\axel\\AppData\\Local\\Temp'

Concatenating paths

There are two functions for concatenating paths:

path.resolve() always returns fully qualified paths
path.join() preserves relative paths

`path.resolve()`: concatenating paths to create fully qualified paths

path.resolve(...paths: Array<string>): string

Concatenates the paths and returns a fully qualified path. It uses the following algorithm:

Start with the current working directory.
Resolve paths[0] against the previous result.
Resolve paths[1] against the previous result.
Do the same for all remaining paths.
Return the final result.

Without arguments, path.resolve() returns the path of the current working directory:

> process.cwd()
'/usr/local'
> path.resolve()
'/usr/local'

One or more relative paths are used for resolution, starting with the current working directory:

> path.resolve('.')
'/usr/local'
> path.resolve('..')
'/usr'
> path.resolve('bin')
'/usr/local/bin'
> path.resolve('./bin', 'sub')
'/usr/local/bin/sub'
> path.resolve('../lib', 'log')
'/usr/lib/log'

Any fully qualified path replaces the previous result:

> path.resolve('bin', '/home')
'/home'

That enables us to resolve partially qualified paths against fully qualified paths:

> path.resolve('/home/john', 'proj', 'src')
'/home/john/proj/src'

`path.join()`: concatenating paths while preserving relative paths

path.join(...paths: Array<string>): string

Starts with paths[0] and interprets the remaining paths as instructions for ascending or descending. In contrast to path.resolve(), this function preserves partially qualified paths: If paths[0] is partially qualified, the result is partially qualified. If it is fully qualified, the result is fully qualified.

Examples of descending:

> path.posix.join('/usr/local', 'sub', 'subsub')
'/usr/local/sub/subsub'
> path.posix.join('relative/dir', 'sub', 'subsub')
'relative/dir/sub/subsub'

Double dots ascend:

> path.posix.join('/usr/local', '..')
'/usr'
> path.posix.join('relative/dir', '..')
'relative'

Single dots do nothing:

> path.posix.join('/usr/local', '.')
'/usr/local'
> path.posix.join('relative/dir', '.')
'relative/dir'

If arguments after the first one are fully qualified paths, they are interpreted as relative paths:

> path.posix.join('dir', '/tmp')
'dir/tmp'
> path.win32.join('dir', 'C:\\Users')
'dir\\C:\\Users'

Using more than two arguments:

> path.posix.join('/usr/local', '../lib', '.', 'log')
'/usr/lib/log'

Ensuring paths are normalized, fully qualified, or relative

`path.normalize()`: ensuring paths are normalized

path.normalize(path: string): string

On Unix, path.normalize():

Removes path segments that are single dots (.).
Resolves path segments that are double dots (..).
Turns multiple path separators into a single path separator.

For example:

// Fully qualified path
assert.equal(
  path.posix.normalize('/home/./john/lib/../photos///pet'),
  '/home/john/photos/pet'
);

// Partially qualified path
assert.equal(
  path.posix.normalize('./john/lib/../photos///pet'),
  'john/photos/pet'
);

On Windows, path.normalize():

Removes path segments that are single dots (.).
Resolves path segments that are double dots (..).
Converts each path separator slash (/) – which is legal – into the preferred path separator (\).
Converts sequences of more than one path separator to single backslashes.

For example:

// Fully qualified path
assert.equal(
  path.win32.normalize('C:\\Users/jane\\doc\\..\\proj\\\\src'),
  'C:\\Users\\jane\\proj\\src'
);

// Partially qualified path
assert.equal(
  path.win32.normalize('.\\jane\\doc\\..\\proj\\\\src'),
  'jane\\proj\\src'
);

Note that path.join() with a single argument also normalizes and works the same as path.normalize():

> path.posix.normalize('/home/./john/lib/../photos///pet')
'/home/john/photos/pet'
> path.posix.join('/home/./john/lib/../photos///pet')
'/home/john/photos/pet'

> path.posix.normalize('./john/lib/../photos///pet')
'john/photos/pet'
> path.posix.join('./john/lib/../photos///pet')
'john/photos/pet'

`path.resolve()` (one argument): ensuring paths are normalized and fully qualified

We have already encountered path.resolve(). Called with a single argument, it both normalizes paths and ensures that they are fully qualified.

Using path.resolve() on Unix:

> process.cwd()
'/usr/local'

> path.resolve('/home/./john/lib/../photos///pet')
'/home/john/photos/pet'
> path.resolve('./john/lib/../photos///pet')
'/usr/local/john/photos/pet'

Using path.resolve() on Windows:

> process.cwd()
'C:\\Windows\\System'

> path.resolve('C:\\Users/jane\\doc\\..\\proj\\\\src')
'C:\\Users\\jane\\proj\\src'
> path.resolve('.\\jane\\doc\\..\\proj\\\\src')
'C:\\Windows\\System\\jane\\proj\\src'

`path.relative()`: creating relative paths

path.relative(sourcePath: string, destinationPath: string): string

Returns a relative path that gets us from sourcePath to destinationPath:

> path.posix.relative('/home/john/', '/home/john/proj/my-lib/README.md')
'proj/my-lib/README.md'
> path.posix.relative('/tmp/proj/my-lib/', '/tmp/doc/zsh.txt')
'../../doc/zsh.txt'

On Windows, we get a fully qualified path if sourcePath and destinationPath are on different drives:

> path.win32.relative('Z:\\tmp\\', 'C:\\Users\\Jane\\')
'C:\\Users\\Jane'

This function also works with relative paths:

> path.posix.relative('proj/my-lib/', 'doc/zsh.txt')
'../../doc/zsh.txt'

Parsing paths: extracting various parts of a path (filename extension etc.)

`path.parse()`: creating an object with path parts

type PathObject = {
  dir: string,
    root: string,
  base: string,
    name: string,
    ext: string,
};
path.parse(path: string): PathObject

Extracts various parts of path and returns them in an object with the following properties:

.base: last segment of a path
- .ext: the filename extension of the base
- .name: the base without the extension. This part is also called the stem of a path.
.root: the beginning of a path (before the first segment)
.dir: the directory in which the base is located – the path without the base

Later, we’ll see function path.format() which is the inverse of path.parse(): It converts an object with path parts into a path.

Example: `path.parse()` on Unix

This is what using path.parse() on Unix looks like:

> path.posix.parse('/home/jane/file.txt')
{
  dir: '/home/jane',
  root: '/',
  base: 'file.txt',
  name: 'file',
  ext: '.txt',
}

The following diagram visualizes the extent of the parts:

  /      home/jane / file   .txt
| root |           | name | ext  |
| dir              | base        |

For example, we can see that .dir is the path without the base. And that .base is .name plus .ext.

Example: `path.parse()` on Windows

This is how path.parse() works on Windows:

> path.win32.parse(String.raw`C:\Users\john\file.txt`)
{
  dir: 'C:\\Users\\john',
  root: 'C:\\',
  base: 'file.txt',
  name: 'file',
  ext: '.txt',
}

This is a diagram for the result:

  C:\    Users\john \ file   .txt
| root |            | name | ext  |
| dir               | base        |

`path.basename()`: extracting the base of a path

path.basename(path, ext?)

Returns the base of path:

> path.basename('/home/jane/file.txt')
'file.txt'

Optionally, this function can also remove a suffix:

> path.basename('/home/jane/file.txt', '.txt')
'file'
> path.basename('/home/jane/file.txt', 'txt')
'file.'
> path.basename('/home/jane/file.txt', 'xt')
'file.t'

Removing the extension is case sensitive – even on Windows!

> path.win32.basename(String.raw`C:\Users\john\file.txt`, '.txt')
'file'
> path.win32.basename(String.raw`C:\Users\john\file.txt`, '.TXT')
'file.txt'

`path.dirname()`: extracting the parent directory of a path

path.dirname(path)

Returns the parent directory of the file or directory at path:

> path.win32.dirname(String.raw`C:\Users\john\file.txt`)
'C:\\Users\\john'
> path.win32.dirname('C:\\Users\\john\\dir\\')
'C:\\Users\\john'

> path.posix.dirname('/home/jane/file.txt')
'/home/jane'
> path.posix.dirname('/home/jane/dir/')
'/home/jane'

`path.extname()`: extracting the extension of a path

path.extname(path)

Returns the extension of path:

> path.extname('/home/jane/file.txt')
'.txt'
> path.extname('/home/jane/file.')
'.'
> path.extname('/home/jane/file')
''
> path.extname('/home/jane/')
''
> path.extname('/home/jane')
''

Categorizing paths

`path.isAbsolute()`: Is a given path absolute?

path.isAbsolute(path: string): boolean

Returns true if path is absolute and false otherwise.

The results on Unix are straightforward:

> path.posix.isAbsolute('/home/john')
true
> path.posix.isAbsolute('john')
false

On Windows, “absolute” does not necessarily mean “fully qualified” (only the first path is fully qualified):

> path.win32.isAbsolute('C:\\Users\\jane')
true
> path.win32.isAbsolute('\\Users\\jane')
true
> path.win32.isAbsolute('C:jane')
false
> path.win32.isAbsolute('jane')
false

`path.format()`: creating paths out of parts

type PathObject = {
  dir: string,
    root: string,
  base: string,
    name: string,
    ext: string,
};
path.format(pathObject: PathObject): string

Creates a path out of a path object:

> path.format({dir: '/home/jane', base: 'file.txt'})
'/home/jane/file.txt'

Example: changing the filename extension

We can use path.format() to change the extension of a path:

function changeFilenameExtension(pathStr, newExtension) {
  if (!newExtension.startsWith('.')) {
    throw new Error(
      'Extension must start with a dot: '
      + JSON.stringify(newExtension)
    );
  }
  const parts = path.parse(pathStr);
  return path.format({
    ...parts,
    base: undefined, // prevent .base from overriding .name and .ext
    ext: newExtension,
  });
}

assert.equal(
  changeFilenameExtension('/tmp/file.md', '.html'),
  '/tmp/file.html'
);
assert.equal(
  changeFilenameExtension('/tmp/file', '.html'),
  '/tmp/file.html'
);
assert.equal(
  changeFilenameExtension('/tmp/file/', '.html'),
  '/tmp/file.html'
);

If we know the original filename extension, we can also use a regular expression to change the filename extension:

> '/tmp/file.md'.replace(/\.md$/i, '.html')
'/tmp/file.html'
> '/tmp/file.MD'.replace(/\.md$/i, '.html')
'/tmp/file.html'

Using the same paths on different platforms

Sometimes we’d like to use the same paths on different platforms. Then there are two issues that we are facing:

The path separator may be different.
The file structure may be different: home directories and directories for temporary files may be in different locations, etc.

As an example, consider a Node.js app that operates on a directory with data. Let’s assume that the app can be configured with two kinds of paths:

Fully qualified paths anywhere on the system
Paths inside the data directory

Due to the aforementioned issues:

We can’t reuse fully qualified paths between platforms.
- Sometimes we need absolute paths. These have to be configured per “instance” of the data directory and stored externally (or inside it and ignored by version control). These paths stay put and are not moved with the data directory.
We can reuse paths that point into the data directory. Such paths may be stored in configuration files (inside the data directory or not) and in constants in the app’s code. To do that:
- We have to store them as relative paths.
- We have to ensure that the path separator is correct on each platform.
The next subsection explains how both can be achieved.

Relative platform-independent paths

Relative platform-independent paths can be stored as Arrays of path segments and turned into fully qualified platform-specific paths as follows:

const universalRelativePath = ['static', 'img', 'logo.jpg'];

const dataDirUnix = '/home/john/data-dir';
assert.equal(
  path.posix.resolve(dataDirUnix, ...universalRelativePath),
  '/home/john/data-dir/static/img/logo.jpg'
);

const dataDirWindows = 'C:\\Users\\jane\\data-dir';
assert.equal(
  path.win32.resolve(dataDirWindows, ...universalRelativePath),
  'C:\\Users\\jane\\data-dir\\static\\img\\logo.jpg'
);

To create relative platform-specific paths, we can use:

const dataDir = '/home/john/data-dir';
const pathInDataDir = '/home/john/data-dir/static/img/logo.jpg';
assert.equal(
  path.relative(dataDir, pathInDataDir),
  'static/img/logo.jpg'
);

The following function converts relative platform-specific paths into platform-independent paths:

import * as path from 'node:path';

function splitRelativePathIntoSegments(relPath) {
  if (path.isAbsolute(relPath)) {
    throw new Error('Path isn’t relative: ' + relPath);
  }
  relPath = path.normalize(relPath);
  const result = [];
  while (true) {
    const base = path.basename(relPath);
    if (base.length === 0) break;
    result.unshift(base);
    const dir = path.dirname(relPath);
    if (dir === '.') break;
    relPath = dir;
  }
  return result;
}

Using splitRelativePathIntoSegments() on Unix:

> splitRelativePathIntoSegments('static/img/logo.jpg')
[ 'static', 'img', 'logo.jpg' ]
> splitRelativePathIntoSegments('file.txt')
[ 'file.txt' ]

Using splitRelativePathIntoSegments() on Windows:

> splitRelativePathIntoSegments('static/img/logo.jpg')
[ 'static', 'img', 'logo.jpg' ]
> splitRelativePathIntoSegments('C:static/img/logo.jpg')
[ 'static', 'img', 'logo.jpg' ]

> splitRelativePathIntoSegments('file.txt')
[ 'file.txt' ]
> splitRelativePathIntoSegments('C:file.txt')
[ 'file.txt' ]

Using a library to match paths via globs

The npm module 'minimatch' lets us match paths against patterns that are called glob expressions, glob patterns, or globs:

import minimatch from 'minimatch';
assert.equal(
  minimatch('/dir/sub/file.txt', '/dir/sub/*.txt'), true
);
assert.equal(
  minimatch('/dir/sub/file.txt', '/**/file.txt'), true
);

Use cases for globs:

Specifying which files in a directory should be processed by a script.
Specifying which files to ignore.

More glob libraries:

multimatch extends minimatch with support for multiple patterns.
micromatch is an alternative to minimatch and multimatch that has a similar API.
globby is a library based on fast-glob that adds convenience features.

The minimatch API

The whole API of minimatch is documented in the project’s readme file. In this subsection, we look at the most important functionality.

Minimatch compiles globs to JavaScript RegExp objects and uses those to match.

`minimatch()`: compiling and matching once

minimatch(path: string, glob: string, options?: MinimatchOptions): boolean

Returns true if glob matches path and false otherwise.

Two interesting options:

.dot: boolean (default: false)
If true, wildcard symbols such as * and ** match “invisible” path segments (whose names begin with dots):

> minimatch('/usr/local/.tmp/data.json', '/usr/**/data.json')
false
> minimatch('/usr/local/.tmp/data.json', '/usr/**/data.json', {dot: true})
true

> minimatch('/tmp/.log/events.txt', '/tmp/*/events.txt')
false
> minimatch('/tmp/.log/events.txt', '/tmp/*/events.txt', {dot: true})
true

.matchBase: boolean (default: false)
If true, a pattern without slashes is matched against the basename of a path:

> minimatch('/dir/file.txt', 'file.txt')
false
> minimatch('/dir/file.txt', 'file.txt', {matchBase: true})
true

`new minimatch.Minimatch()`: compiling once, matching multiple times

Class minimatch.Minimatch enables us to only compile the glob to a regular expression once and match multiple times:

new Minimatch(pattern: string, options?: MinimatchOptions)

This is how this class is used:

import minimatch from 'minimatch';
const {Minimatch} = minimatch;
const glob = new Minimatch('/dir/sub/*.txt');
assert.equal(
  glob.match('/dir/sub/file.txt'), true
);
assert.equal(
  glob.match('/dir/sub/notes.txt'), true
);

Syntax of glob expressions

This subsection covers the essentials of the syntax. But there are more features. These are documented here:

Minimatch’s unit tests have many examples of globs.
The Bash Reference manual has a section on filename expansion.

Matching Windows paths

Even on Windows, glob segments are separated by slashes – but they match both backslashes and slashes (which are legal path separators on Windows):

> minimatch('dir\\sub/file.txt', 'dir/sub/file.txt')
true

Minimatch does not normalize paths

Minimatch does not normalize paths for us:

> minimatch('./file.txt', './file.txt')
true
> minimatch('./file.txt', 'file.txt')
false
> minimatch('file.txt', './file.txt')
false

Therefore, we have to normalize paths if we don’t create them ourselves:

> path.normalize('./file.txt')
'file.txt'

Patterns without wildcard symbols: path separators must line up

Patterns without wildcard symbols (which match more flexibly) must match exactly. In particular, the path separators must line up:

> minimatch('/dir/file.txt', '/dir/file.txt')
true
> minimatch('dir/file.txt', 'dir/file.txt')
true
> minimatch('/dir/file.txt', 'dir/file.txt')
false

> minimatch('/dir/file.txt', 'file.txt')
false

That is, we must decide on either absolute or relative paths.

With option .matchBase, we can match patterns without slashes against the basenames of paths:

> minimatch('/dir/file.txt', 'file.txt', {matchBase: true})
true

The asterisk (`*`) matches any (part of a) single segment

The wildcard symbol asterisk (*) matches any path segment or any part of a segment:

> minimatch('/dir/file.txt', '/*/file.txt')
true
> minimatch('/tmp/file.txt', '/*/file.txt')
true

> minimatch('/dir/file.txt', '/dir/*.txt')
true
> minimatch('/dir/data.txt', '/dir/*.txt')
true

The asterisk does not match “invisible files” whose names start with dots. If we want to match those, we have to prefix the asterisk with a dot:

> minimatch('file.txt', '*')
true
> minimatch('.gitignore', '*')
false
> minimatch('.gitignore', '.*')
true
> minimatch('/tmp/.log/events.txt', '/tmp/*/events.txt')
false

Option .dot lets us switch off this behavior:

> minimatch('.gitignore', '*', {dot: true})
true
> minimatch('/tmp/.log/events.txt', '/tmp/*/events.txt', {dot: true})
true

The double asterisk (`**`) matches zero or more segments

**/ matches zero or more segments:

> minimatch('/file.txt', '/**/file.txt')
true
> minimatch('/dir/file.txt', '/**/file.txt')
true
> minimatch('/dir/sub/file.txt', '/**/file.txt')
true

If we want to match relative paths, the pattern still must not start with a path separator:

> minimatch('file.txt', '/**/file.txt')
false

The double asterisk does not match “invisible” path segments whose names start with dots:

> minimatch('/usr/local/.tmp/data.json', '/usr/**/data.json')
false

We can switch off that behavior via option .dot:

> minimatch('/usr/local/.tmp/data.json', '/usr/**/data.json', {dot: true})
true

Negating globs

If we start a glob with an exclamation mark, it matches if the pattern after the exclamation mark does not match:

> minimatch('file.txt', '!**/*.txt')
false
> minimatch('file.js', '!**/*.txt')
true

Alternative patterns

Comma-separated patterns inside braces match if one of the patterns matches:

> minimatch('file.txt', 'file.{txt,js}')
true
> minimatch('file.js', 'file.{txt,js}')
true

Ranges of integers

A pair of integers separated by double dots defines a range of integers and matches if any of its elements matches:

> minimatch('file1.txt', 'file{1..3}.txt')
true
> minimatch('file2.txt', 'file{1..3}.txt')
true
> minimatch('file3.txt', 'file{1..3}.txt')
true
> minimatch('file4.txt', 'file{1..3}.txt')
false

Padding with zeros is supported, too:

> minimatch('file1.txt', 'file{01..12}.txt')
false
> minimatch('file01.txt', 'file{01..12}.txt')
true
> minimatch('file02.txt', 'file{01..12}.txt')
true
> minimatch('file12.txt', 'file{01..12}.txt')
true

Using `file:` URLs to refer to files

There are two common ways to refer to files in Node.js:

Paths in strings
Instances of URL with the protocol file:

For example:

assert.equal(
  fs.readFileSync(
    '/tmp/data.txt', {encoding: 'utf-8'}),
  'Content'
);
assert.equal(
  fs.readFileSync(
    new URL('file:///tmp/data.txt'), {encoding: 'utf-8'}),
  'Content'
);

Class `URL`

In this section, we take a closer look at class URL. More information on this class:

Node.js documentation: section “The WHATWG URL API”
Section “API” of the WHATWG URL standard

In this blog post, we access class URL via a global variable because that’s how it’s used on other web platforms. But it can also be imported:

import {URL} from 'node:url';

URIs vs. relative references

URLs are a subset of URIs. RFC 3986, the standard for URIs, distinguishes two kinds of URI-references:

A URI starts with a scheme followed by a colon separator.
All other URI references are relative references.

Constructor of `URL`

Class URL can be instantiated in two ways:

new URL(uri: string)

uri must be a URI. It specifies the URI of the new instance.
new URL(uriRef: string, baseUri: string)

baseUri must be a URI. If uriRef is a relative reference, it is resolved against baseUri and the result becomes the URI of the new instance.

If uriRef is a URI, it completely replaces baseUri as the data on which the instance is based.

Here we can see the class in action:

// If there is only one argument, it must be a proper URI
assert.equal(
  new URL('https://example.com/public/page.html').toString(),
  'https://example.com/public/page.html'
);
assert.throws(
  () => new URL('../book/toc.html'),
  /^TypeError \[ERR_INVALID_URL\]: Invalid URL$/
);

// Resolve a relative reference against a base URI 
assert.equal(
  new URL(
    '../book/toc.html',
    'https://example.com/public/page.html'
  ).toString(),
  'https://example.com/book/toc.html'
);

Resolving relative references against instances of `URL`

Let’s revisit this variant of the URL constructor:

new URL(uriRef: string, baseUri: string)

The argument baseUri is coerced to string. Therefore, any object can be used – as long as it becomes a valid URL when coerced to string:

const obj = { toString() {return 'https://example.com'} };
assert.equal(
  new URL('index.html', obj).href,
  'https://example.com/index.html'
);

That enables us to resolve relative references against URL instances:

const url = new URL('https://example.com/dir/file1.html');
assert.equal(
  new URL('../file2.html', url).href,
  'https://example.com/file2.html'
);

Used this way, the constructor is loosely similar to path.resolve().

Properties of `URL` instances

Instances of URL have the following properties:

type URL = {
  protocol: string,
  username: string,
  password: string,
  hostname: string,
  port: string,
  host: string,
  readonly origin: string,
  
  pathname: string,
  
  search: string,
  readonly searchParams: URLSearchParams,
  hash: string,

  href: string,
  toString(): string,
  toJSON(): string,
}

Converting URLs to strings

There are three common ways in which we can convert URLs to strings:

const url = new URL('https://example.com/about.html');

assert.equal(
  url.toString(),
  'https://example.com/about.html'
);
assert.equal(
  url.href,
  'https://example.com/about.html'
);
assert.equal(
  url.toJSON(),
  'https://example.com/about.html'
);

Method .toJSON() enables us to use URLs in JSON data:

const jsonStr = JSON.stringify({
  pageUrl: new URL('https://2ality.com/p/subscribe.html')
});
assert.equal(
  jsonStr, '{"pageUrl":"https://2ality.com/p/subscribe.html"}'
);

Getting `URL` properties

The properties of URL instances are not own data properties, they are implemented via getters and setters. In the next example, we use the utility function pickProps() (whose code is shown at the end), to copy the values returned by those getters into a plain object:

const props = pickProps(
  new URL('https://jane:pw@example.com:80/news.html?date=today#misc'),
  'protocol', 'username', 'password', 'hostname', 'port', 'host',
  'origin', 'pathname', 'search', 'hash', 'href'
);
assert.deepEqual(
  props,
  {
    protocol: 'https:',
    username: 'jane',
    password: 'pw',
    hostname: 'example.com',
    port: '80',
    host: 'example.com:80',
    origin: 'https://example.com:80',
    pathname: '/news.html',
    search: '?date=today',
    hash: '#misc',
    href: 'https://jane:pw@example.com:80/news.html?date=today#misc'
  }
);

// Copies the values of the given properties into a plain object
function pickProps(input, ...keys) {
  const output = {};
  for (const key of keys) {
    output[key] = input[key];
  }
  return output;
}

Alas, the pathname is a single atomic unit. That is, we can’t use class URL to access its parts (base, extension, etc.).

Setting parts of a URL

We can also change parts of a URL by setting properties such as .hostname:

const url = new URL('https://example.com');
url.hostname = '2ality.com';
assert.equal(
  url.href, 'https://2ality.com/'
);

We can use the setters to create URLs from parts (idea by Haroen Viaene):

// Object.assign() invokes setters when transferring property values
const urlFromParts = (parts) => Object.assign(
  new URL('https://example.com'), // minimal dummy URL
  parts // assigned to the dummy
);

const url = urlFromParts({
  protocol: 'https:',
  hostname: '2ality.com',
  pathname: '/p/about.html',
});
assert.equal(
  url.href, 'https://2ality.com/p/about.html'
);

Managing search parameters via `.searchParams`

We can use property .searchParams to manage the search parameters of URLs. Its value is an instance of URLSearchParams.

We can use it to read search parameters:

const url = new URL('https://example.com/?topic=js');
assert.equal(
  url.searchParams.get('topic'), 'js'
);
assert.equal(
  url.searchParams.has('topic'), true
);

We can also change search parameters via it:

url.searchParams.append('page', '5');
assert.equal(
  url.href, 'https://example.com/?topic=js&page=5'
);

url.searchParams.set('topic', 'css');
assert.equal(
  url.href, 'https://example.com/?topic=css&page=5'
);

Converting between URLs and file paths

It’s tempting to convert between file paths and URLs manually. For example, we can try to convert a URL instance myUrl to a file path via myUrl.pathname. However, that doesn’t always work – it’s better to use this function:

url.fileURLToPath(url: URL | string): string

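The following code compares the results of that function with the values of .pathname. Note that url.fileURLToPath() produces paths for the platform that the code runs on: the Unix assertions hold on Unix and the Windows assertions hold on Windows.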

import * as assert from 'node:assert/strict';
import * as url from 'node:url';

//::::: Unix :::::

const url1 = new URL('file:///tmp/with%20space.txt');
assert.equal(
  url1.pathname, '/tmp/with%20space.txt');
assert.equal(
  url.fileURLToPath(url1), '/tmp/with space.txt');

const url2 = new URL('file:///home/thor/Mj%C3%B6lnir.txt');
assert.equal(
  url2.pathname, '/home/thor/Mj%C3%B6lnir.txt');
assert.equal(
  url.fileURLToPath(url2), '/home/thor/Mjölnir.txt');

//::::: Windows :::::

const url3 = new URL('file:///C:/dir/');
assert.equal(
  url3.pathname, '/C:/dir/');
assert.equal(
  url.fileURLToPath(url3), 'C:\\dir\\');

This function is the inverse of url.fileURLToPath():

url.pathToFileURL(path: string): URL

It converts path to a file URL:

> url.pathToFileURL('/home/john/Work Files').href
'file:///home/john/Work%20Files'

Use case for URLs: accessing files relative to the current module

One important use case for URLs is accessing a file that is a sibling of the current module:

import * as fs from 'node:fs';

function readData() {
  const url = new URL('data.txt', import.meta.url);
  return fs.readFileSync(url, {encoding: 'UTF-8'});
}

This function uses import.meta.url which contains the URL of the current module (which is usually a file: URL on Node.js).

Using fetch() would have made the previous code even more cross-platform. However, as of Node.js 18.5, fetch() doesn’t work for file: URLs yet:

> await fetch('file:///tmp/file.txt')
TypeError: fetch failed
  cause: Error: not implemented... yet...

Use case for URLs: detecting if the current module is running as a script

See the blog post “Node.js: checking if an ESM module is ‘main’”.

Paths vs. `file:` URLs

When shell scripts receive references to files or export references to files (e.g. by logging them on screen), they are virtually always paths. However, there are two cases where we need URLs (as discussed in previous subsections):

To access files relative to the current module
To detect if the current module is running as a script
