In this blog post, we learn how to work with file system paths and file URLs on Node.js.
In this blog post, we explore path-related functionality on Node.js:
Module 'node:path' contains most of the path-related functionality.
The global variable process has methods for changing the current working directory (what that is, is explained soon).
Module 'node:os' has functions that return the paths of important directories.
The 'node:path' API
Module 'node:path' is often imported as follows:
import * as path from 'node:path';
In this blog post, this import statement is occasionally omitted. We also omit the following import:
import * as assert from 'node:assert/strict';
We can access Node’s path API in three ways:
path.posix supports Unixes including macOS.
path.win32 supports Windows.
path itself always supports the current platform. For example, this is a REPL interaction on macOS:
> path.parse === path.posix.parse
true
Let’s see how function path.parse(), which parses file system paths, differs for the two platforms:
> path.win32.parse(String.raw`C:\Users\jane\file.txt`)
{
dir: 'C:\\Users\\jane',
root: 'C:\\',
base: 'file.txt',
name: 'file',
ext: '.txt',
}
> path.posix.parse(String.raw`C:\Users\jane\file.txt`)
{
dir: '',
root: '',
base: 'C:\\Users\\jane\\file.txt',
name: 'C:\\Users\\jane\\file',
ext: '.txt',
}
We parse a Windows path – first correctly via the path.win32 API, then via the path.posix API. We can see that in the latter case, the path isn’t correctly split into its parts – for example, the basename of the file should be file.txt (more on what the other properties mean later).
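Terminology: a path separator separates two adjacent path segments within a path, while a path delimiter separates the elements of a list of paths: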
> path.posix.sep
'/'
> path.win32.sep
'\\'
> path.posix.delimiter
':'
> path.win32.delimiter
';'
We can see path separators and path delimiters if we examine the PATH shell variable – which contains the paths where the operating system looks for executables when a command is entered in a shell.
This is an example of a macOS PATH (shell variable $PATH):
> process.env.PATH.split(/(?<=:)/)
[
'/opt/homebrew/bin:',
'/opt/homebrew/sbin:',
'/usr/local/bin:',
'/usr/bin:',
'/bin:',
'/usr/sbin:',
'/sbin',
]
The split separator has a length of zero because the lookbehind assertion (?<=:) matches if a given location is preceded by a colon but it does not capture anything. Therefore, the path delimiter ':' is included in the preceding path.
This is an example of a Windows PATH (shell variable %Path%):
> process.env.Path.split(/(?<=;)/)
[
'C:\\Windows\\system32;',
'C:\\Windows;',
'C:\\Windows\\System32\\Wbem;',
'C:\\Windows\\System32\\WindowsPowerShell\\v1.0\\;',
'C:\\Windows\\System32\\OpenSSH\\;',
'C:\\ProgramData\\chocolatey\\bin;',
'C:\\Program Files\\nodejs\\',
]
Many shells have the concept of the current working directory (CWD) – “the directory I’m currently in”. Partially qualified paths are resolved against it and, on both Unix and Windows, the shell command for changing it is cd.
process is a global Node.js variable. It provides us with methods for getting and setting the CWD:
process.cwd() returns the CWD.
process.chdir(dirPath) changes the CWD to dirPath. There must be a directory at dirPath.
Node.js uses the CWD to fill in missing pieces whenever a path isn’t fully qualified (complete). That enables us to use partially qualified paths with various functions – e.g. fs.readFileSync().
The following code demonstrates process.chdir() and process.cwd() on Unix:
process.chdir('/home/jane');
assert.equal(
process.cwd(), '/home/jane'
);
So far, we have used the current working directory on Unix. Windows works differently: each drive has its own current directory, and there is a current drive.
We can use process.chdir() to set both at the same time:
process.chdir('C:\\Windows');
process.chdir('Z:\\tmp');
When we revisit a drive, Node.js remembers the previous current directory of that drive:
assert.equal(
process.cwd(), 'Z:\\tmp'
);
process.chdir('C:');
assert.equal(
process.cwd(), 'C:\\Windows'
);
Unix only knows two kinds of paths:
Absolute paths are fully qualified and start with a slash:
/home/john/proj
Relative paths are partially qualified and start with a filename or a dot:
. (current directory)
.. (parent directory)
dir
./dir
../dir
../../dir/subdir
Let’s use path.resolve() (which is explained in more detail later) to resolve relative paths against absolute paths. The results are absolute paths:
> const abs = '/home/john/proj';
> path.resolve(abs, '.')
'/home/john/proj'
> path.resolve(abs, '..')
'/home/john'
> path.resolve(abs, 'dir')
'/home/john/proj/dir'
> path.resolve(abs, './dir')
'/home/john/proj/dir'
> path.resolve(abs, '../dir')
'/home/john/dir'
> path.resolve(abs, '../../dir/subdir')
'/home/dir/subdir'
Windows distinguishes four kinds of paths (for more information, see Microsoft’s documentation): a path is either absolute or relative, and each of those two kinds either has a drive letter or doesn’t.
Absolute paths with drive letters are fully qualified. All other paths are partially qualified.
Resolving an absolute path without a drive letter against a fully qualified path full picks up the drive letter of full:
> const full = 'C:\\Users\\jane\\proj';
> path.resolve(full, '\\Windows')
'C:\\Windows'
Resolving a relative path without a drive letter against a fully qualified path can be viewed as updating the latter:
> const full = 'C:\\Users\\jane\\proj';
> path.resolve(full, '.')
'C:\\Users\\jane\\proj'
> path.resolve(full, '..')
'C:\\Users\\jane'
> path.resolve(full, 'dir')
'C:\\Users\\jane\\proj\\dir'
> path.resolve(full, '.\\dir')
'C:\\Users\\jane\\proj\\dir'
> path.resolve(full, '..\\dir')
'C:\\Users\\jane\\dir'
> path.resolve(full, '..\\..\\dir')
'C:\\Users\\dir'
Resolving a relative path rel with a drive letter against a fully qualified path full depends on the drive letter of rel:
Same drive letter as full? Resolve rel against full.
Different drive letter than full? Resolve rel against the current directory of rel’s drive.
That looks as follows:
// Configure current directories for C: and Z:
process.chdir('C:\\Windows\\System');
process.chdir('Z:\\tmp');
const full = 'C:\\Users\\jane\\proj';
// Same drive letter
assert.equal(
path.resolve(full, 'C:dir'),
'C:\\Users\\jane\\proj\\dir'
);
assert.equal(
path.resolve(full, 'C:'),
'C:\\Users\\jane\\proj'
);
// Different drive letter
assert.equal(
path.resolve(full, 'Z:dir'),
'Z:\\tmp\\dir'
);
assert.equal(
path.resolve(full, 'Z:'),
'Z:\\tmp'
);
'node:os'
The module 'node:os'
provides us with the paths of two important directories:
os.homedir() returns the path to the home directory of the current user – for example:
> os.homedir() // macOS
'/Users/rauschma'
> os.homedir() // Windows
'C:\\Users\\axel'
os.tmpdir() returns the path of the operating system’s directory for temporary files – for example:
> os.tmpdir() // macOS
'/var/folders/ph/sz0384m11vxf5byk12fzjms40000gn/T'
> os.tmpdir() // Windows
'C:\\Users\\axel\\AppData\\Local\\Temp'
There are two functions for concatenating paths:
path.resolve() always returns fully qualified paths.
path.join() preserves relative paths.
path.resolve(): concatenating paths to create fully qualified paths
path.resolve(...paths: Array<string>): string
Concatenates the paths and returns a fully qualified path. It uses the following algorithm:
Start with the current working directory.
Resolve paths[0] against the previous result.
Resolve paths[1] against the previous result.
Do the same for all remaining paths and return the final result.
Without arguments, path.resolve() returns the path of the current working directory:
> process.cwd()
'/usr/local'
> path.resolve()
'/usr/local'
One or more relative paths are used for resolution, starting with the current working directory:
> path.resolve('.')
'/usr/local'
> path.resolve('..')
'/usr'
> path.resolve('bin')
'/usr/local/bin'
> path.resolve('./bin', 'sub')
'/usr/local/bin/sub'
> path.resolve('../lib', 'log')
'/usr/lib/log'
Any fully qualified path replaces the previous result:
> path.resolve('bin', '/home')
'/home'
That enables us to resolve partially qualified paths against fully qualified paths:
> path.resolve('/home/john', 'proj', 'src')
'/home/john/proj/src'
path.join(): concatenating paths while preserving relative paths
path.join(...paths: Array<string>): string
Starts with paths[0] and interprets the remaining paths as instructions for ascending or descending. In contrast to path.resolve(), this function preserves partially qualified paths: If paths[0] is partially qualified, the result is partially qualified. If it is fully qualified, the result is fully qualified.
Examples of descending:
> path.posix.join('/usr/local', 'sub', 'subsub')
'/usr/local/sub/subsub'
> path.posix.join('relative/dir', 'sub', 'subsub')
'relative/dir/sub/subsub'
Double dots ascend:
> path.posix.join('/usr/local', '..')
'/usr'
> path.posix.join('relative/dir', '..')
'relative'
Single dots do nothing:
> path.posix.join('/usr/local', '.')
'/usr/local'
> path.posix.join('relative/dir', '.')
'relative/dir'
If arguments after the first one are fully qualified paths, they are interpreted as relative paths:
> path.posix.join('dir', '/tmp')
'dir/tmp'
> path.win32.join('dir', 'C:\\Users')
'dir\\C:\\Users'
Using more than two arguments:
> path.posix.join('/usr/local', '../lib', '.', 'log')
'/usr/lib/log'
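path.normalize(): ensuring paths are normalized
path.normalize(path: string): string
On Unix, path.normalize():
Removes path segments that are single dots (.).
Resolves path segments that are double dots (..).
Turns multiple path separators into a single path separator.
For example: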
// Fully qualified path
assert.equal(
path.posix.normalize('/home/./john/lib/../photos///pet'),
'/home/john/photos/pet'
);
// Partially qualified path
assert.equal(
path.posix.normalize('./john/lib/../photos///pet'),
'john/photos/pet'
);
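On Windows, path.normalize():
Removes path segments that are single dots (.).
Resolves path segments that are double dots (..).
Converts each path separator slash (/) – which is legal – into the preferred path separator (\).
Turns sequences of more than one path separator into single backslashes.
For example: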
// Fully qualified path
assert.equal(
path.win32.normalize('C:\\Users/jane\\doc\\..\\proj\\\\src'),
'C:\\Users\\jane\\proj\\src'
);
// Partially qualified path
assert.equal(
path.win32.normalize('.\\jane\\doc\\..\\proj\\\\src'),
'jane\\proj\\src'
);
Note that path.join() with a single argument also normalizes and works the same as path.normalize():
> path.posix.normalize('/home/./john/lib/../photos///pet')
'/home/john/photos/pet'
> path.posix.join('/home/./john/lib/../photos///pet')
'/home/john/photos/pet'
> path.posix.normalize('./john/lib/../photos///pet')
'john/photos/pet'
> path.posix.join('./john/lib/../photos///pet')
'john/photos/pet'
path.resolve() (one argument): ensuring paths are normalized and fully qualified
We have already encountered path.resolve(). Called with a single argument, it both normalizes paths and ensures that they are fully qualified.
Using path.resolve() on Unix:
> process.cwd()
'/usr/local'
> path.resolve('/home/./john/lib/../photos///pet')
'/home/john/photos/pet'
> path.resolve('./john/lib/../photos///pet')
'/usr/local/john/photos/pet'
Using path.resolve() on Windows:
> process.cwd()
'C:\\Windows\\System'
> path.resolve('C:\\Users/jane\\doc\\..\\proj\\\\src')
'C:\\Users\\jane\\proj\\src'
> path.resolve('.\\jane\\doc\\..\\proj\\\\src')
'C:\\Windows\\System\\jane\\proj\\src'
path.relative(): creating relative paths
path.relative(sourcePath: string, destinationPath: string): string
Returns a relative path that gets us from sourcePath to destinationPath:
> path.posix.relative('/home/john/', '/home/john/proj/my-lib/README.md')
'proj/my-lib/README.md'
> path.posix.relative('/tmp/proj/my-lib/', '/tmp/doc/zsh.txt')
'../../doc/zsh.txt'
On Windows, we get a fully qualified path if sourcePath and destinationPath are on different drives:
> path.win32.relative('Z:\\tmp\\', 'C:\\Users\\Jane\\')
'C:\\Users\\Jane'
This function also works with relative paths:
> path.posix.relative('proj/my-lib/', 'doc/zsh.txt')
'../../doc/zsh.txt'
path.parse(): creating an object with path parts
type PathObject = {
  dir: string,
  root: string,
  base: string,
  name: string,
  ext: string,
};
path.parse(path: string): PathObject
Extracts various parts of path and returns them in an object with the following properties:
.base: last segment of a path
.ext: the filename extension of the base
.name: the base without the extension. This part is also called the stem of a path.
.root: the beginning of a path (before the first segment)
.dir: the directory in which the base is located – the path without the base
Later, we’ll see function path.format() which is the inverse of path.parse(): It converts an object with path parts into a path.
path.parse() on Unix
This is what using path.parse() on Unix looks like:
> path.posix.parse('/home/jane/file.txt')
{
dir: '/home/jane',
root: '/',
base: 'file.txt',
name: 'file',
ext: '.txt',
}
The following diagram visualizes the extent of the parts:
  /      home/jane / file   .txt
| root |           | name | ext  |
| dir              | base        |
For example, we can see that .dir is the path without the base. And that .base is .name plus .ext.
path.parse() on Windows
This is how path.parse() works on Windows:
> path.win32.parse(String.raw`C:\Users\john\file.txt`)
{
dir: 'C:\\Users\\john',
root: 'C:\\',
base: 'file.txt',
name: 'file',
ext: '.txt',
}
This is a diagram for the result:
  C:\    Users\john \ file   .txt
| root |            | name | ext  |
| dir               | base        |
path.basename(): extracting the base of a path
path.basename(path, ext?)
Returns the base of path:
> path.basename('/home/jane/file.txt')
'file.txt'
Optionally, this function can also remove a suffix:
> path.basename('/home/jane/file.txt', '.txt')
'file'
> path.basename('/home/jane/file.txt', 'txt')
'file.'
> path.basename('/home/jane/file.txt', 'xt')
'file.t'
Removing the extension is case sensitive – even on Windows!
> path.win32.basename(String.raw`C:\Users\john\file.txt`, '.txt')
'file'
> path.win32.basename(String.raw`C:\Users\john\file.txt`, '.TXT')
'file.txt'
path.dirname(): extracting the parent directory of a path
path.dirname(path)
Returns the parent directory of the file or directory at path:
> path.win32.dirname(String.raw`C:\Users\john\file.txt`)
'C:\\Users\\john'
> path.win32.dirname('C:\\Users\\john\\dir\\')
'C:\\Users\\john'
> path.posix.dirname('/home/jane/file.txt')
'/home/jane'
> path.posix.dirname('/home/jane/dir/')
'/home/jane'
path.extname(): extracting the extension of a path
path.extname(path)
Returns the extension of path:
> path.extname('/home/jane/file.txt')
'.txt'
> path.extname('/home/jane/file.')
'.'
> path.extname('/home/jane/file')
''
> path.extname('/home/jane/')
''
> path.extname('/home/jane')
''
path.isAbsolute(): Is a given path absolute?
path.isAbsolute(path: string): boolean
Returns true if path is absolute and false otherwise.
The results on Unix are straightforward:
> path.posix.isAbsolute('/home/john')
true
> path.posix.isAbsolute('john')
false
On Windows, “absolute” does not necessarily mean “fully qualified” (only the first path is fully qualified):
> path.win32.isAbsolute('C:\\Users\\jane')
true
> path.win32.isAbsolute('\\Users\\jane')
true
> path.win32.isAbsolute('C:jane')
false
> path.win32.isAbsolute('jane')
false
path.format(): creating paths out of parts
type PathObject = {
  dir: string,
  root: string,
  base: string,
  name: string,
  ext: string,
};
path.format(pathObject: PathObject): string
Creates a path out of a path object:
> path.format({dir: '/home/jane', base: 'file.txt'})
'/home/jane/file.txt'
We can use path.format() to change the extension of a path:
function changeFilenameExtension(pathStr, newExtension) {
  if (!newExtension.startsWith('.')) {
    throw new Error(
      'Extension must start with a dot: '
      + JSON.stringify(newExtension)
    );
  }
  const parts = path.parse(pathStr);
  return path.format({
    ...parts,
    base: undefined, // prevent .base from overriding .name and .ext
    ext: newExtension,
  });
}
assert.equal(
changeFilenameExtension('/tmp/file.md', '.html'),
'/tmp/file.html'
);
assert.equal(
changeFilenameExtension('/tmp/file', '.html'),
'/tmp/file.html'
);
assert.equal(
changeFilenameExtension('/tmp/file/', '.html'),
'/tmp/file.html'
);
If we know the original filename extension, we can also use a regular expression to change the filename extension:
> '/tmp/file.md'.replace(/\.md$/i, '.html')
'/tmp/file.html'
> '/tmp/file.MD'.replace(/\.md$/i, '.html')
'/tmp/file.html'
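Sometimes we’d like to use the same paths on different platforms. Then there are two issues that we are facing: the path separator may be different, and the file structure may be different (home directories, directories for temporary files, etc. may be in different locations).
As an example, consider a Node.js app that operates on a directory with data. Let’s assume that the app can be configured with two kinds of paths: fully qualified paths anywhere on the system and paths inside the data directory.
Due to the aforementioned issues:
We can’t reuse fully qualified paths between platforms.
We can reuse paths that point into the data directory. Such paths may be stored in configuration files (inside the data directory or not) and in constants in the app’s code. To do that, we have to store them as relative paths and ensure that the path separator is correct on each platform.
The next subsection explains how both can be achieved.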
Relative platform-independent paths can be stored as Arrays of path segments and turned into fully qualified platform-specific paths as follows:
const universalRelativePath = ['static', 'img', 'logo.jpg'];
const dataDirUnix = '/home/john/data-dir';
assert.equal(
path.posix.resolve(dataDirUnix, ...universalRelativePath),
'/home/john/data-dir/static/img/logo.jpg'
);
const dataDirWindows = 'C:\\Users\\jane\\data-dir';
assert.equal(
path.win32.resolve(dataDirWindows, ...universalRelativePath),
'C:\\Users\\jane\\data-dir\\static\\img\\logo.jpg'
);
To create relative platform-specific paths, we can use:
const dataDir = '/home/john/data-dir';
const pathInDataDir = '/home/john/data-dir/static/img/logo.jpg';
assert.equal(
path.relative(dataDir, pathInDataDir),
'static/img/logo.jpg'
);
The following function converts relative platform-specific paths into platform-independent paths:
import * as path from 'node:path';
function splitRelativePathIntoSegments(relPath) {
  if (path.isAbsolute(relPath)) {
    throw new Error('Path isn’t relative: ' + relPath);
  }
  relPath = path.normalize(relPath);
  const result = [];
  while (true) {
    const base = path.basename(relPath);
    if (base.length === 0) break;
    result.unshift(base);
    const dir = path.dirname(relPath);
    if (dir === '.') break;
    relPath = dir;
  }
  return result;
}
Using splitRelativePathIntoSegments() on Unix:
> splitRelativePathIntoSegments('static/img/logo.jpg')
[ 'static', 'img', 'logo.jpg' ]
> splitRelativePathIntoSegments('file.txt')
[ 'file.txt' ]
Using splitRelativePathIntoSegments() on Windows:
> splitRelativePathIntoSegments('static/img/logo.jpg')
[ 'static', 'img', 'logo.jpg' ]
> splitRelativePathIntoSegments('C:static/img/logo.jpg')
[ 'static', 'img', 'logo.jpg' ]
> splitRelativePathIntoSegments('file.txt')
[ 'file.txt' ]
> splitRelativePathIntoSegments('C:file.txt')
[ 'file.txt' ]
The npm module 'minimatch' lets us match paths against patterns that are called glob expressions, glob patterns, or globs:
import minimatch from 'minimatch';
assert.equal(
minimatch('/dir/sub/file.txt', '/dir/sub/*.txt'), true
);
assert.equal(
minimatch('/dir/sub/file.txt', '/**/file.txt'), true
);
Use cases for globs:
More glob libraries:
The whole API of minimatch is documented in the project’s readme file. In this subsection, we look at the most important functionality.
Minimatch compiles globs to JavaScript RegExp objects and uses those to match.
minimatch(): compiling and matching once
minimatch(path: string, glob: string, options?: MinimatchOptions): boolean
Returns true if glob matches path and false otherwise.
Two interesting options:
.dot: boolean (default: false)
If true, wildcard symbols such as * and ** match “invisible” path segments (whose names begin with dots):
> minimatch('/usr/local/.tmp/data.json', '/usr/**/data.json')
false
> minimatch('/usr/local/.tmp/data.json', '/usr/**/data.json', {dot: true})
true
> minimatch('/tmp/.log/events.txt', '/tmp/*/events.txt')
false
> minimatch('/tmp/.log/events.txt', '/tmp/*/events.txt', {dot: true})
true
.matchBase: boolean (default: false)
If true, a pattern without slashes is matched against the basename of a path:
> minimatch('/dir/file.txt', 'file.txt')
false
> minimatch('/dir/file.txt', 'file.txt', {matchBase: true})
true
new minimatch.Minimatch(): compiling once, matching multiple times
Class minimatch.Minimatch enables us to only compile the glob to a regular expression once and match multiple times:
new Minimatch(pattern: string, options?: MinimatchOptions)
This is how this class is used:
import minimatch from 'minimatch';
const {Minimatch} = minimatch;
const glob = new Minimatch('/dir/sub/*.txt');
assert.equal(
glob.match('/dir/sub/file.txt'), true
);
assert.equal(
glob.match('/dir/sub/notes.txt'), true
);
This subsection covers the essentials of the syntax. But there are more features. These are documented here:
Even on Windows, glob segments are separated by slashes – but they match both backslashes and slashes (which are legal path separators on Windows):
> minimatch('dir\\sub/file.txt', 'dir/sub/file.txt')
true
Minimatch does not normalize paths for us:
> minimatch('./file.txt', './file.txt')
true
> minimatch('./file.txt', 'file.txt')
false
> minimatch('file.txt', './file.txt')
false
Therefore, we have to normalize paths if we don’t create them ourselves:
> path.normalize('./file.txt')
'file.txt'
Patterns without wildcard symbols (that match more flexibly) must match exactly. Especially the path separators must line up:
> minimatch('/dir/file.txt', '/dir/file.txt')
true
> minimatch('dir/file.txt', 'dir/file.txt')
true
> minimatch('/dir/file.txt', 'dir/file.txt')
false
> minimatch('/dir/file.txt', 'file.txt')
false
That is, we must decide on either absolute or relative paths.
With option .matchBase, we can match patterns without slashes against the basenames of paths:
> minimatch('/dir/file.txt', 'file.txt', {matchBase: true})
true
A single asterisk (*) matches any (part of a) single segment
The wildcard symbol asterisk (*) matches any path segment or any part of a segment:
> minimatch('/dir/file.txt', '/*/file.txt')
true
> minimatch('/tmp/file.txt', '/*/file.txt')
true
> minimatch('/dir/file.txt', '/dir/*.txt')
true
> minimatch('/dir/data.txt', '/dir/*.txt')
true
The asterisk does not match “invisible files” whose names start with dots. If we want to match those, we have to prefix the asterisk with a dot:
> minimatch('file.txt', '*')
true
> minimatch('.gitignore', '*')
false
> minimatch('.gitignore', '.*')
true
> minimatch('/tmp/.log/events.txt', '/tmp/*/events.txt')
false
Option .dot lets us switch off this behavior:
> minimatch('.gitignore', '*', {dot: true})
true
> minimatch('/tmp/.log/events.txt', '/tmp/*/events.txt', {dot: true})
true
A double asterisk (**) matches zero or more segments
**/ matches zero or more segments:
> minimatch('/file.txt', '/**/file.txt')
true
> minimatch('/dir/file.txt', '/**/file.txt')
true
> minimatch('/dir/sub/file.txt', '/**/file.txt')
true
If we want to match relative paths, the pattern still must not start with a path separator:
> minimatch('file.txt', '/**/file.txt')
false
The double asterisk does not match “invisible” path segments whose names start with dots:
> minimatch('/usr/local/.tmp/data.json', '/usr/**/data.json')
false
We can switch off that behavior via option .dot:
> minimatch('/usr/local/.tmp/data.json', '/usr/**/data.json', {dot: true})
true
If we start a glob with an exclamation mark, it matches if the pattern after the exclamation mark does not match:
> minimatch('file.txt', '!**/*.txt')
false
> minimatch('file.js', '!**/*.txt')
true
Comma-separated patterns inside braces match if one of the patterns matches:
> minimatch('file.txt', 'file.{txt,js}')
true
> minimatch('file.js', 'file.{txt,js}')
true
A pair of integers separated by double dots defines a range of integers and matches if any of its elements matches:
> minimatch('file1.txt', 'file{1..3}.txt')
true
> minimatch('file2.txt', 'file{1..3}.txt')
true
> minimatch('file3.txt', 'file{1..3}.txt')
true
> minimatch('file4.txt', 'file{1..3}.txt')
false
Padding with zeros is supported, too:
> minimatch('file1.txt', 'file{01..12}.txt')
false
> minimatch('file01.txt', 'file{01..12}.txt')
true
> minimatch('file02.txt', 'file{01..12}.txt')
true
> minimatch('file12.txt', 'file{01..15}.txt')
true
Using file: URLs to refer to files
There are two common ways to refer to files in Node.js:
Paths in strings
Instances of URL with the protocol file:
For example:
assert.equal(
fs.readFileSync(
'/tmp/data.txt', {encoding: 'utf-8'}),
'Content'
);
assert.equal(
fs.readFileSync(
new URL('file:///tmp/data.txt'), {encoding: 'utf-8'}),
'Content'
);
Class URL
In this section, we take a closer look at class URL. More information on this class:
In this blog post, we access class URL via a global variable because that’s how it’s used on other web platforms. But it can also be imported:
import {URL} from 'node:url';
URLs are a subset of URIs. RFC 3986, the standard for URIs, distinguishes two kinds of URI-references: a URI starts with a scheme followed by a colon, and every other URI-reference is a relative reference.
The constructor of URL
Class URL can be instantiated in two ways:
new URL(uri: string)
uri must be a URI. It specifies the URI of the new instance.
new URL(uriRef: string, baseUri: string)
baseUri must be a URI. If uriRef is a relative reference, it is resolved against baseUri and the result becomes the URI of the new instance.
If uriRef is a URI, it completely replaces baseUri as the data on which the instance is based.
Here we can see the class in action:
// If there is only one argument, it must be a proper URI
assert.equal(
new URL('https://example.com/public/page.html').toString(),
'https://example.com/public/page.html'
);
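assert.throws(
  () => new URL('../book/toc.html'),
  // Checking properties is more robust than matching the error text,
  // which differs between Node.js versions
  {name: 'TypeError', code: 'ERR_INVALID_URL'}
);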
// Resolve a relative reference against a base URI
assert.equal(
new URL(
'../book/toc.html',
'https://example.com/public/page.html'
).toString(),
'https://example.com/book/toc.html'
);
Resolving relative references against instances of URL
Let’s revisit this variant of the URL constructor:
new URL(uriRef: string, baseUri: string)
The argument baseUri is coerced to string. Therefore, any object can be used – as long as it becomes a valid URL when coerced to string:
const obj = { toString() {return 'https://example.com'} };
assert.equal(
new URL('index.html', obj).href,
'https://example.com/index.html'
);
That enables us to resolve relative references against URL instances:
const url = new URL('https://example.com/dir/file1.html');
assert.equal(
new URL('../file2.html', url).href,
'https://example.com/file2.html'
);
Used this way, the constructor is loosely similar to path.resolve().
Properties of URL instances
Instances of URL have the following properties:
type URL = {
  protocol: string,
  username: string,
  password: string,
  hostname: string,
  port: string,
  host: string,
  readonly origin: string,
  pathname: string,
  search: string,
  readonly searchParams: URLSearchParams,
  hash: string,
  href: string,
  toString(): string,
  toJSON(): string,
}
There are three common ways in which we can convert URLs to strings:
const url = new URL('https://example.com/about.html');
assert.equal(
url.toString(),
'https://example.com/about.html'
);
assert.equal(
url.href,
'https://example.com/about.html'
);
assert.equal(
url.toJSON(),
'https://example.com/about.html'
);
Method .toJSON() enables us to use URLs in JSON data:
const jsonStr = JSON.stringify({
pageUrl: new URL('https://2ality.com/p/subscribe.html')
});
assert.equal(
jsonStr, '{"pageUrl":"https://2ality.com/p/subscribe.html"}'
);
URL properties
The properties of URL instances are not own data properties; they are implemented via getters and setters. In the next example, we use the utility function pickProps() (whose code is shown at the end) to copy the values returned by those getters into a plain object:
const props = pickProps(
new URL('https://jane:pw@example.com:80/news.html?date=today#misc'),
'protocol', 'username', 'password', 'hostname', 'port', 'host',
'origin', 'pathname', 'search', 'hash', 'href'
);
assert.deepEqual(
props,
{
protocol: 'https:',
username: 'jane',
password: 'pw',
hostname: 'example.com',
port: '80',
host: 'example.com:80',
origin: 'https://example.com:80',
pathname: '/news.html',
search: '?date=today',
hash: '#misc',
href: 'https://jane:pw@example.com:80/news.html?date=today#misc'
}
);
function pickProps(input, ...keys) {
  const output = {};
  for (const key of keys) {
    output[key] = input[key];
  }
  return output;
}
Alas, the pathname is a single atomic unit. That is, we can’t use class URL to access its parts (base, extension, etc.).
We can also change parts of a URL by setting properties such as .hostname:
const url = new URL('https://example.com');
url.hostname = '2ality.com';
assert.equal(
url.href, 'https://2ality.com/'
);
We can use the setters to create URLs from parts (idea by Haroen Viaene):
// Object.assign() invokes setters when transferring property values
const urlFromParts = (parts) => Object.assign(
  new URL('https://example.com'), // minimal dummy URL
  parts // assigned to the dummy
);
const url = urlFromParts({
protocol: 'https:',
hostname: '2ality.com',
pathname: '/p/about.html',
});
assert.equal(
url.href, 'https://2ality.com/p/about.html'
);
.searchParams
We can use property .searchParams to manage the search parameters of URLs. Its value is an instance of URLSearchParams.
We can use it to read search parameters:
const url = new URL('https://example.com/?topic=js');
assert.equal(
url.searchParams.get('topic'), 'js'
);
assert.equal(
url.searchParams.has('topic'), true
);
We can also change search parameters via it:
url.searchParams.append('page', '5');
assert.equal(
url.href, 'https://example.com/?topic=js&page=5'
);
url.searchParams.set('topic', 'css');
assert.equal(
url.href, 'https://example.com/?topic=css&page=5'
);
It’s tempting to convert between file paths and URLs manually. For example, we can try to convert a URL instance myUrl to a file path via myUrl.pathname. However, that doesn’t always work – it’s better to use this function:
url.fileURLToPath(url: URL | string): string
The following code compares the results of that function with the values of .pathname:
import * as assert from 'node:assert/strict';
import * as url from 'node:url';
//::::: Unix :::::
const url1 = new URL('file:///tmp/with%20space.txt');
assert.equal(
url1.pathname, '/tmp/with%20space.txt');
assert.equal(
url.fileURLToPath(url1), '/tmp/with space.txt');
const url2 = new URL('file:///home/thor/Mj%C3%B6lnir.txt');
assert.equal(
url2.pathname, '/home/thor/Mj%C3%B6lnir.txt');
assert.equal(
url.fileURLToPath(url2), '/home/thor/Mjölnir.txt');
//::::: Windows :::::
const url3 = new URL('file:///C:/dir/');
assert.equal(
url3.pathname, '/C:/dir/');
assert.equal(
url.fileURLToPath(url3), 'C:\\dir\\');
This function is the inverse of url.fileURLToPath():
url.pathToFileURL(path: string): URL
It converts path to a file URL:
> url.pathToFileURL('/home/john/Work Files').href
'file:///home/john/Work%20Files'
One important use case for URLs is accessing a file that is a sibling of the current module:
import * as fs from 'node:fs';
function readData() {
  // Resolve data.txt relative to the URL of the current module
  const url = new URL('data.txt', import.meta.url);
  return fs.readFileSync(url, {encoding: 'utf-8'});
}
This function uses import.meta.url which contains the URL of the current module (which is usually a file: URL on Node.js).
Using fetch() would have made the previous code even more cross-platform. However, as of Node.js 18.5, fetch() doesn’t work for file: URLs yet:
> await fetch('file:///tmp/file.txt')
TypeError: fetch failed
cause: Error: not implemented... yet...
See the blog post “Node.js: checking if an ESM module is ‘main’”.
file: URLs
When shell scripts receive references to files or export references to files (e.g. by logging them on screen), they are virtually always paths. However, there are two cases where we need URLs (as discussed in previous subsections):