Eliminating duplicate objects: three approaches

[2020-07-17] dev, javascript
(Ad, please don’t block)

In this blog post, we look at three approaches for eliminating duplicate objects from Arrays.

Eliminating duplicates from Arrays  

Eliminating duplicate values from Arrays is simple – as long as the values are primitive (and therefore compared by value):

const values = ['jane', 'lars', 'jane'];
const uniqueValues = [...new Set(values)];
assert.deepEqual(
  uniqueValues,
  ['jane', 'lars']);

However, that approach doesn’t work with the following data because a Set would consider each object to be unique:

const members = [
  {
    first: 'Jane',
    last: 'Bond',
    id: '10yejma',
  },
  {
    first: 'Lars',
    last: 'Croft',
    id: '1hhs0k2',
  },
  {
    first: 'Jane',
    last: 'Bond',
    id: '1y15hhu',
  },
];

Approach 1: building a new Array without duplicates  

const uniqueMembers1 = [];

for (const m of members) {
  if (!containsMember(uniqueMembers1, m)) {
    uniqueMembers1.push(m);
  }
}

function containsMember(memberArray, member) {
  return memberArray.find(
    (m) => m.first === member.first && m.last === member.last);
}

assert.deepEqual(
  uniqueMembers1,
  [
    {
      first: 'Jane',
      id: '10yejma',
      last: 'Bond',
    },
    {
      first: 'Lars',
      id: '1hhs0k2',
      last: 'Croft',
    },
  ]);

We only add an object from members to uniqueMembers1 if it doesn’t already exist in that Array. That’s why the earlier Jane with ID '10yejma' “wins”.

Approach 2: using .filter()  

// Only keep members m if they appear for the first time
const uniqueMembers2 = members.filter(
  (m, index, ms) => getIndexOfMember(ms, m) === index);

function getIndexOfMember(memberArray, member) {
  return memberArray.findIndex(
    (m) => m.first === member.first && m.last === member.last);
}

assert.deepEqual(
  uniqueMembers2,
  [
    {
      first: 'Jane',
      id: '10yejma',
      last: 'Bond',
    },
    {
      first: 'Lars',
      id: '1hhs0k2',
      last: 'Croft',
    },
  ]);

getIndexOfMember() returns the first index where a member appears. We tell .filter() to only keep the members at those indices. That’s why the first Jane “wins”.

Approach 3: a Map from unique keys to members  

// Create a Map from [key,value] pairs:
const uniqueKeyToMember = new Map(
  members.map(m => [m.first+'\t'+m.last, m])); // [key, value]
const uniqueMembers3 = [...uniqueKeyToMember.values()];

assert.deepEqual(
  uniqueMembers3,
  [
    {
      first: 'Jane',
      id: '1y15hhu',
      last: 'Bond',
    },
    {
      first: 'Lars',
      id: '1hhs0k2',
      last: 'Croft',
    },
  ]);

If an object that is added to the Map has the same unique key as a previous object, it replaces that object. That’s why the later Jane with ID '1y15hhu' “wins”.

Variants of approach 1 and approach 2  

  • We use a Set to record unique keys of members that we have visited.
  • If the key of the current member is already in that Set, then we don’t push it (approach 1) or filter it out (approach 2).

This may help with large Arrays because we don’t have to search them. On the other hand, creating keys and maintaining a Set affects performance negatively.

Further reading