In this blog post, we take a look at the ECMAScript 2025 feature “Duplicate named capturing groups” which was proposed by Kevin Gibbons.
It’s a feature for regular expressions that enables us to use the same capturing group name more than once.
It’s clear why duplicate capturing group names are normally not allowed: A capture can only have a single value and therefore would have to ignore the other groups – e.g.:
> /^(?<x>a)(?<x>b)$/
SyntaxError: Invalid regular expression: /^(?<x>a)(?<x>b)$/:
Duplicate capture group name
In a match, group x can capture either 'a' or 'b', not both.
However, if the duplicate names exist in different alternatives, then there is no such conflict. That was previously not allowed either but is allowed now – e.g.:
> /^((?<x>a)|(?<x>b))$/.exec('a').groups
{ x: 'a' }
> /^((?<x>a)|(?<x>b))$/.exec('b').groups
{ x: 'b' }
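Because this syntax was an error before ECMAScript 2025, code that also has to run on older engines may want to probe for support at runtime. The following is only a minimal sketch of one way to do that; the function name supportsDuplicateNamedGroups is hypothetical and not part of any API.

```javascript
// Hypothetical helper: returns true if the engine accepts
// duplicate group names in different alternatives.
function supportsDuplicateNamedGroups() {
  try {
    // We must use the RegExp constructor here: a regular expression
    // literal with unsupported syntax would already fail when the
    // file is parsed, before any try-catch can run.
    new RegExp('(?<x>a)|(?<x>b)');
    return true;
  } catch {
    // Older engines throw a SyntaxError for the duplicate name
    return false;
  }
}

console.log(supportsDuplicateNamedGroups());
```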
Why is that useful? It lets us reuse regular expression fragments and match-processing code between alternatives.
Let’s look at examples next.
Function parseMonth() uses a regular expression to parse a string with one of two month formats:
const {raw} = String;
const {stringify} = JSON;
const RE_MONTH = new RegExp(
  // The non-capturing group makes ^ and $ apply to both alternatives
  raw`^(?:` +
  raw`(?<year>[0-9]{4})-(?<month>[0-9]{2})` +
  raw`|` +
  raw`(?<month>[0-9]{2})\/(?<year>[0-9]{4})` +
  raw`)$`
);
function parseMonth(monthStr) {
  const match = RE_MONTH.exec(monthStr);
  if (match === null) {
    throw new Error(
      'Not a valid month string: ' + stringify(monthStr)
    );
  }
  // Same code for both alternatives
  return {
    year: match.groups.year,
    month: match.groups.month,
  };
}
assert.deepEqual(
  parseMonth('2024-05'),
  { year: '2024', month: '05' }
);
assert.deepEqual(
  parseMonth('05/2024'),
  { year: '2024', month: '05' }
);
assert.throws(
  () => parseMonth('2024/05')
);
The code below demonstrates that duplicate named capturing groups enable us to reuse regular expression fragments – in this case: KEY and VALUE.
const {raw} = String;
const KEY = raw`(?<key>[a-z]+)`;
const VALUE = raw`(?<value>[a-z]+)`;
const RE_KEY_VALUE_PAIRS = new RegExp(
  raw`\(${KEY}=${VALUE}\)` +
  raw`|` +
  raw`\[${KEY}:${VALUE}\]`,
  'g'
);
const str = '[one:a] (two=b)';
const objects = Array.from(
  str.matchAll(RE_KEY_VALUE_PAIRS),
  // Same code for both alternatives
  (match) => ({key: match.groups.key, value: match.groups.value})
);
assert.deepEqual(
  objects,
  [
    { key: 'one', value: 'a' },
    { key: 'two', value: 'b' },
  ]
);
Comments:
string.matchAll() returns an iterable. We use Array.from() to convert that iterable to an Array.
The second argument of Array.from() is a callback that is applied to elements before they are put into the returned Array. Think array.map().
Backreferences to duplicate named groups work as expected. The following example is contrived (because we could simply use a single named group) but it illustrates what’s possible:
const RE_DELIMITED = /^((?<delim>_)|(?<delim>\*))[a-z]+\k<delim>$/;
assert.equal(
  RE_DELIMITED.test('_abc_'), true
);
assert.equal(
  RE_DELIMITED.test('*abc*'), true
);
assert.equal(
  RE_DELIMITED.test('_abc*'), false
);
assert.equal(
  RE_DELIMITED.test('*abc_'), false
);
In practice, duplicate named capturing groups are probably most useful to people who write parsers and tokenizers based on regular expressions. For those, it is a highly welcome addition.
Further reading:
re
for composing regular expressions interesting.