ECMAScript 2025 feature: duplicate named capturing groups for regular expressions

[2024-05-16] dev, javascript, es2025
In this blog post, we take a look at the ECMAScript 2025 feature “Duplicate named capturing groups”, which was proposed by Kevin Gibbons.

It’s a feature for regular expressions that enables us to use the same capturing group name more than once.

Duplicate named capturing groups  

It’s clear why duplicate capturing group names are normally not allowed: A named capture can only hold a single value. If several groups with the same name participated in a match, all but one of their values would have to be ignored – e.g.:

> /^(?<x>a)(?<x>b)$/
SyntaxError: Invalid regular expression: /^(?<x>a)(?<x>b)$/:
Duplicate capture group name

In a match, group x can capture either 'a' or 'b', not both.

However, if the duplicate names exist in different alternatives, then there is no such conflict. That was previously not allowed either but is allowed now – e.g.:

> /^((?<x>a)|(?<x>b))$/.exec('a').groups
{ x: 'a' }
> /^((?<x>a)|(?<x>b))$/.exec('b').groups
{ x: 'b' }

Why is that useful? It lets us reuse regular expression fragments and match-processing code between alternatives.

Let’s look at examples next.

Use case: alternative formats with similar parts  

Function parseMonth() uses a regular expression to parse a string with one of two month formats:

const {raw} = String;
const {stringify} = JSON;

const RE_MONTH = new RegExp(
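  // The non-capturing group (?:...) makes ^ and $ apply to both
  // alternatives. Without it, | would split the pattern into
  // “^ plus first format” and “second format plus $”.
  raw`^(?:` +
  raw`(?<year>[0-9]{4})-(?<month>[0-9]{2})` +
  raw`|` +
  raw`(?<month>[0-9]{2})\/(?<year>[0-9]{4})` +
  raw`)$`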
);

function parseMonth(monthStr) {
  const match = RE_MONTH.exec(monthStr);
  if (match === null) {
    throw new Error(
      'Not a valid month string: ' + stringify(monthStr)
    );
  }
  // Same code for both alternatives
  return {
    year: match.groups.year,
    month: match.groups.month,
  };
}

assert.deepEqual(
  parseMonth('2024-05'),
  { year: '2024', month: '05'}
);
assert.deepEqual(
  parseMonth('05/2024'),
  { year: '2024', month: '05'}
);
assert.throws(
  () => parseMonth('2024/05')
);
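For comparison, here is a minimal sketch of how the same function might look without duplicate group names. The names RE_MONTH_UNIQUE and parseMonthUnique and the numbered groups year1, month1, month2 and year2 are made up for this illustration; they are not part of the original code.

```javascript
// Hypothetical variant for engines without duplicate named groups:
// every group name must be unique across the whole pattern.
const RE_MONTH_UNIQUE = new RegExp(
  String.raw`^(?:` +
  String.raw`(?<year1>[0-9]{4})-(?<month1>[0-9]{2})` +
  String.raw`|` +
  String.raw`(?<month2>[0-9]{2})\/(?<year2>[0-9]{4})` +
  String.raw`)$`
);

function parseMonthUnique(monthStr) {
  const match = RE_MONTH_UNIQUE.exec(monthStr);
  if (match === null) {
    throw new Error(
      'Not a valid month string: ' + JSON.stringify(monthStr)
    );
  }
  // Groups of the alternative that did not match are undefined,
  // so we have to pick the defined value for each property.
  const {year1, month1, year2, month2} = match.groups;
  return {
    year: year1 ?? year2,
    month: month1 ?? month2,
  };
}

console.log(parseMonthUnique('2024-05')); // { year: '2024', month: '05' }
console.log(parseMonthUnique('05/2024')); // { year: '2024', month: '05' }
```

The extra merging step with ?? is what duplicate named capturing groups make unnecessary.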

Use case: reusing regular expression fragments  

The code below demonstrates that duplicate named capturing groups enable us to reuse regular expression fragments – in this case: KEY and VALUE.

const {raw} = String;
const KEY = raw`(?<key>[a-z]+)`;
const VALUE = raw`(?<value>[a-z]+)`;
const RE_KEY_VALUE_PAIRS = new RegExp(
  raw`\(${KEY}=${VALUE}\)` +
  raw`|` +
  raw`\[${KEY}:${VALUE}\]`,
  'g'
);

const str = '[one:a] (two=b)';
const objects = Array.from(
  str.matchAll(RE_KEY_VALUE_PAIRS),
  // Same code for both alternatives
  (match) => ({key: match.groups.key, value: match.groups.value})
);
assert.deepEqual(
  objects,
  [
    { key: 'one', value: 'a' },
    { key: 'two', value: 'b' },
  ]
);

Comments:

  • string.matchAll() returns an iterable.
  • We use Array.from() to convert that iterable to an Array.
  • The optional second parameter of Array.from() is a callback that is applied to elements before they are put into the returned Array. Think array.map().
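The following sketch illustrates the last point in isolation. It does not use duplicate group names, so it should run in any engine that supports string.matchAll(). RE_WORD and the two variable names are invented for this example.

```javascript
const RE_WORD = /(?<word>[a-z]+)/g;
const str = 'one two three';

// Variant 1: Array.from() converts the iterable and maps in one step
const viaArrayFrom = Array.from(
  str.matchAll(RE_WORD),
  (match) => match.groups.word
);

// Variant 2: spread the iterable into an Array first, then map
const viaMap = [...str.matchAll(RE_WORD)].map(
  (match) => match.groups.word
);

console.log(viaArrayFrom); // [ 'one', 'two', 'three' ]
console.log(viaMap); // [ 'one', 'two', 'three' ]
```

Both variants produce the same result; the first one avoids creating an intermediate Array.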

Backreferences  

Backreferences to duplicate named groups work as expected. The following example is contrived (because we could simply use a single named group) but it illustrates what’s possible:

const RE_DELIMITED = /^((?<delim>_)|(?<delim>\*))[a-z]+\k<delim>$/;
assert.equal(
  RE_DELIMITED.test('_abc_'), true
);
assert.equal(
  RE_DELIMITED.test('*abc*'), true
);
assert.equal(
  RE_DELIMITED.test('_abc*'), false
);
assert.equal(
  RE_DELIMITED.test('*abc_'), false
);
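For reference, the simpler single-group version that the text alludes to could look as follows. This is only a sketch; the name RE_DELIMITED_SINGLE is made up. It behaves the same for these four inputs and does not need duplicate group names.

```javascript
// One named group with a character class instead of two alternatives
const RE_DELIMITED_SINGLE = /^(?<delim>[_*])[a-z]+\k<delim>$/;

console.log(RE_DELIMITED_SINGLE.test('_abc_')); // true
console.log(RE_DELIMITED_SINGLE.test('*abc*')); // true
console.log(RE_DELIMITED_SINGLE.test('_abc*')); // false
console.log(RE_DELIMITED_SINGLE.test('*abc_')); // false
```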

Support in JavaScript engines  

  • The proposal maintains a list of engines that already support duplicate named capturing groups.
  • I tested the code in this blog post with Markcheck and used a Babel plugin (because Node’s V8 doesn’t support the feature yet).
    • Caveat: The Babel plugin only transparently supports regular expression literals. For the regular expressions that are created via new RegExp(), I used a workaround.
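Independently of the setup described above (this is not the workaround mentioned in the caveat), one possible way to check at runtime whether an engine supports the feature is to compile a test pattern and catch the SyntaxError. The helper isValidPattern() and the constant supportsDuplicateNamedGroups are hypothetical names used only for this sketch.

```javascript
// Returns true if the engine accepts `source` as a regular expression.
// Using new RegExp() (not a literal) keeps the check at runtime, so an
// unsupported pattern does not prevent the whole file from parsing.
function isValidPattern(source) {
  try {
    new RegExp(source);
    return true;
  } catch (err) {
    if (err instanceof SyntaxError) {
      return false;
    }
    throw err;
  }
}

// Same name in different alternatives: accepted only by engines
// that implement duplicate named capturing groups
const supportsDuplicateNamedGroups = isValidPattern('(?<x>a)|(?<x>b)');
console.log(supportsDuplicateNamedGroups); // depends on the engine

// Same name within one alternative: still rejected
console.log(isValidPattern('(?<x>a)(?<x>b)')); // false
```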

Conclusion and further reading  

In practice, duplicate named capturing groups are probably most useful to people who write parsers and tokenizers based on regular expressions. For those, it is a highly welcome addition.

Further reading: