# Filtering Data using Multiple Criteria with O(n) Complexity

Can we filter data using multiple criteria with O(n) time complexity? Yes, it is easy.

A few days ago, I stumbled upon a question of how to filter an array of object data using multiple criteria with O(n) complexity. I think it's an excellent example of a question to answer with fun.

We have a list of objects representing items with several attributes. We also have a list of criteria. Items having one of those attributes should be excluded from the result. The original list is as below:

There is some discussion about whether the `excludes`

list is a constant or not. In my opinion, it is evident that excludes list is a dynamic list. Otherwise, there would be no question at all. So to make it obvious, I will add a touch to the question. Write the function as in the following code:

```
interface Item {
type: string;
name: string;
color: string;
}
interface Exclude {
key: string;
value: string;
}
const someItems: Item[] = [
{ type: "a", name: "a1", color: "yellow" },
{ type: "b", name: "b1", color: "orange" },
{ type: "c", name: "orange", color: "green" },
];
const someExcludes: Exclude[] = [
{ key: "color", value: "yellow" },
{ key: "name", value: "x" },
];
// Create the following function with O(n) complexity.
function excludeItems(items: Item[], excludes: Exclude[]): Item[] {}
excludeItems(items, excludes);
```

## Big *O* Notation

Wikipedia defines Big O notation as follows:

Bigis a mathematical notation that describes the limiting behavior of a function when the argument tends towards a particular value or infinity. Big O is a member of a family of notations invented by Paul Bachmann, Edmund Landau, and others, collectively calledOnotationBachmann–Landau notationorasymptotic notation. The letter O was chosen by Bachmann to stand forOrdnung, meaning the order of approximation.

In computer science, big O notation is used to classify algorithms according to how their run time or space requirements grow as the input size grows. In analytic number theory, big O notation is often used to express a bound on the difference between an arithmetical function and a better understood approximation.

In our example, you should solve the problem in a linear order. Roughly, the time we are allowed to spend is the total number of `items`

and `excludes`

.

## O(n^{2}) Solution

The first solution that comes to mind is as follows:

```
items.filter((item) =>
!excludes.some((exclude) => item[exclude.key] === exclude.value)
);
```

However, this has a complexity of O(n^{2}). The time we need is roughly the multiplication of the number of items and the number of criteria.

The following table is a rough comparison of the complexities:

Items | Criteria | O(n2) Time | O(n) Time |
---|---|---|---|

10 | 10 | 100 | 20 |

100 | 100 | 10.000 | 200 |

1.000 | 1.000 | 1.000.000 | 2000 |

1.000.000.000 | 10.000 | 10.000.000.000.000 | 1.000.010.000 |

Solving this problem with O(n) complexity will be much better.

## O(n) Solution

Reading the list `items`

already takes time **O(n)**. Now we need to reduce access time to the `excludes`

list to **O(1)**. Here we can inspire from database engines. I will create an index by iterating over `excludes`

list. Since the total steps in the loop only depend on the number of excluded criteria, this is a linear operation completed in **O(m)** time.

Usually, accessing a value using Hash or Map lookup takes **O(1)** time. We will iterate all items in **O(n)** time and check whether its three attributes exist in the excluded index in **O(3)** time.

```
interface ExcludeIndex {
type: Set<string>;
name: Set<string>;
color: Set<string>;
}
function excludeItems(items: Item[], excludes: Exclude[]): Item[] {
const excludeIndex: ExcludeIndex = { type: new Set(), name: new Set(), color: new Set() };
// Populate Exclude Index: O(m)
excludes.forEach((exclude) => excludeIndex[exclude.key].add(exclude.value));
// Filter Items: O(n)
return items.filter(
(item) => !excludeIndex.type.has(item.type)
&& !excludeIndex.name.has(item.name)
&& !excludeIndex.color.has(item.color),
);
}
```

Our total time complexity is **O(m) + (O(n) * O(3)) = O(m + 3n)**

## O(n) === O(m + 3n)

Couldn't we solve the problem in **O(n)** time complexity? **O(n)** means a linear solution. In this context, all below terms are equal in complexity:

**O(n) = O(3n)O(n) = O(m + n)O(n) = O(m + 3n)**

## Conclusion

Like the example above, most of our daily problems can be solved using a little data transformation.

What do you think about the above solution? Can you find an alternative? Please share with readers in the comments.

## Comments ()