Blog

Filtering a nested array with another nested array

todayFebruary 26, 2021

I often write code that operates on Google Sheets, which means that I have to deal with nested arrays a lot. One of the things I have to do is to find rows in one array that don't have corresponding rows in another nested array.

For instance, I may have a Google Sheet with some user information, and another sheet with file information. Files are created for unique combinations of user departments and user roles. As more users are added to the users sheet, I need to find department-role combinations that don't already have a corresponding file.

Let's assume the following situation:

const users = [
    ["First Name", "Last Name", "Department ID", "Role ID"],
    ["John", "Smith", "D1020", "R17"],
    ["Dan", "Rhodes", "D1020", "R17"],
    ["Mary", "Jones", "D1010", "R20"],
    ["Steve", "Arden", "D1020", "R20"],
];

const files = [
  ["Department ID", "Role ID", "File ID", "File URL"],
  ["D1020", "R17", "ABC123", "https://..."],
];

I already have a file for the D1020-R17 combination, but not for D1010-R20 and not for D1020-R20.

Instead of using nested loops to iterate through the users array and compare each row to each row in the files array, I use higher-order functions that make the code cleaner.

To get the users whose departments-roles don't have a file, I use the .find() method inside the .filter() method:

const usersWithNoFiles = users.filter(
  ([_, __, deptId, roleId]) =>
    !files.find((row) => row[2] === deptId && row[3] === roleId)
  );

The find method is will be truthy if none of the rows contain the data I'm looking for. The result I get:

[
  [ 'Mary', 'Jones', 'D1010', 'R20' ],
  [ 'Steve', 'Arden', 'D1020', 'R20' ]
]

This works well, but in my world it's not very resilient. The data in my Google Sheets is populated by humans, who are not very consistent. I often encounter different casing as well as leading and trailing spaces. To normalize against this, I use a quick utility function that can be easily expanded to handle more variations:

function __c(value) {
  return value.toString().toLowerCase().trim();
}

Another issue, is that column order changes from time to time, as spreadsheet users shift columns around. So, yeah, I have another utility for that:


  function getColIdx(headersParam, headerParam) {
    if (!Array.isArray(headersParam)) {
      throw Error(`Expected an array and received ${typeof headersParam}`);
    }
    if (!headerParam) {
      throw Error(`Didn't receive a header to look up`);
    }
  
    const headers = headersParam.map((header) => __c(header));
    const header = __c(headerParam);
  
    const colIdx = headers.indexOf(header);
    if (colIdx === -1) {
      throw Error(`Can't find ${headerParam} in ${headersParam.join(', ')}`);
    }
    return colIdx;
  }

The getColIdx function uses the cleanup utility as well. What I normally do is slice (or splice, depending on the use case) the header row of the Google Sheet and pass it together with the header I'm after to the function.

To take advantage of these utilities, I revise the above code as follows:

  const usersHeaders = usersInSomeProject.slice(0, 1).flat();
  const filesHeaders = filesForSomeProject.slice(0, 1).flat();

  const uDeptColIdx = getColIdx(usersHeaders, 'Department ID');
  const uRoleColIdx = getColIdx(usersHeaders, 'Role ID');
  const fDeptColIdx = getColIdx(filesHeaders, 'Department ID');
  const fRoleColIdx = getColIdx(filesHeaders, 'Role ID');

  const usersWithNoFiles = usersInSomeProject.filter(
    (user) =>
      !filesForSomeProject.find(
        (file) =>
          __c(file[fDeptColIdx]) === __c(user[uDeptColIdx]) &&
          __c(file[fRoleColIdx]) === __c(user[uRoleColIdx])
      )
  );

  console.log(usersWithNoFiles);
}

I hope this helps you out.

– Ben