STL algorithms
for programming interviews
C++ is packed with data structures and algorithms that can help experienced programmers solve problems during technical interviews. These data structures and algorithms are part of the Standard Template Library (STL). This post will focus on STL algorithms you should be familiar with if you are preparing for technical interviews and how they can be used to solve interesting problems.
There are cases for which solving a complete problem using a library algorithm isn’t acceptable as the interviewer may want you to implement the algorithm yourself, but most of the time, an algorithm is an implementation detail. The best way to determine this is by asking your interviewer.
Non-modifying algorithms
- std::count, std::min_element, std::max_element
- std::accumulate
- std::equal, std::mismatch
- std::find, std::find_if_not
- std::lower_bound, std::upper_bound
Modifying algorithms
std::count
std::min_element
std::max_element
I’m bundling these algorithms together because they serve the same purpose: iterating through a range and reducing it into a single value. This task comes up often in programming interviews and these three operations are especially common: counting the number of times an element appears in a range and finding out the position of the min/max element in a range.
std::count
has a single overload that takes a pair of iterators and the value we want to count. It’s using operator==
for equality by default. If you want to pass your own predicate, look no further than std::count_if
.
std::min_element
and std::max_element
in their basic form take two iterators. These algorithms use operator<
for comparison by default. Unlike std::count
, these algorithms return an iterator instead of a value which is useful if you need to perform additional operations on the range as is often the case. Also unlike std::count
, they provide an overload that takes a comparison function.
Implementing these algorithms is trivial but using built-in functions instead will simplify your code. For example, if you need to get the index of the smallest element in an array and you find yourself writing something like this:
int MinElementIdx(const vector<int>& vec) {
auto min_value = numeric_limits<int>::max();
auto min_value_idx = 0;
for (int i = 0; i < size(vec); ++i) {
if (vec[i] < min_value) {
min_value = vec[i];
min_value_idx = i;
}
}
return min_value_idx;
}
You can replace the entire block of code with a single call to std::min_element
:
int MinElementIdx(const vector<int>& vec) {
return min_element(begin(vec), end(vec)) - begin(vec);
}
The result isn’t just shorter, it’s also less susceptible to bugs, and it’s easier to understand.
std::accumulate
std::accumulate
is related to the algorithms presented above – it too summarizes a range into a single value – but it provides greater flexibility. This algorithm comes with warnings so we will give it the attention it deserves.
All the algorithms we’re going to cover live in <algorithm>
except for std::accumulate
that lives under <numeric>
. Despite its location, it can be used for non-numeric types.
In its basic form, it takes a pair of iterators and it returns the sum of the elements in the range using operator+
. Since operator+
requires two values, it takes a starting value as well. In another form, std::accumulate
also takes a function that lets you decide how to compute the sum.
It might seem straightforward, but there are a couple of things to keep in mind:
- The starting value type has to match the container type. If you provide a range of
double
and a starting value of typeint
, the return value will be reduced to anint
. - The second overload of
std::accumulate
takes a function that makes it possible to do all kinds of things with it but to maintain the expressivity of your code the return value should always represent a useful summary of the elements in the range.
If you’re asking yourself what kinds of things can be done with a single algorithm, you should watch “std::accumulate: Exploring an Algorithmic Empire” by Ben Deane, but we will keep things simple. Let’s write a function that takes an array of positive integers and returns the sum of all the even numbers in the array:
int sum_even_numbers(const vector<int>& arr) {
return accumulate(begin(arr), end(arr), 0, [](int acc, int x){
return x % 2 == 0 ? acc + x : acc;
});
}
std::equal
std::mismatch
These two algorithms are used for comparing ranges and you can figure out what they do by their names: testing for equality. While both are useful, std::mismatch
is a little more interesting because it provides details of the differences between the two ranges.
The obvious use-case for both algorithms is comparing ranges and ranges within different container types (when comparing entire containers of the same type, the equality operators are usually preferred) but both algorithms can also be used to solve a certain group of problems that come up often in interviews.
std::equal
takes three iterators: begin/end iterators of the first range, and a begin iterator of the comparison range. It uses operator==
to determine if two elements are equal and it returns a boolean value. There’s another form that takes a binary predicate for custom comparisons.
std::mismatch
takes the same arguments (including an overload that takes a binary predicate), but instead of returning a boolean value, it returns a pair of iterators. If a mismatch is found, it returns the first mismatching pair of elements from both ranges, otherwise, it returns end and the corresponding element from the comparison range.
Keep in mind that the caller must ensure the comparison range contains enough elements.
It may not seem obvious at first glance, but these algorithms are useful when it comes to solving one-away equality problems. For example, suppose you are given a non-empty string from which you may delete at most one character, and you need to determine if you can make it a palindrome:
bool valid_palindrome(const string& str) {
auto end = str.begin() + str.size() / 2;
auto diff = mismatch(str.begin(), end, str.rbegin());
return diff.first == end ||
equal(diff.first + 1, end + 1, diff.second) ||
equal(diff.first, end, diff.second + 1);
}
After using std::mismatch
there can be three outcomes that will determine if the string is a valid palindrome: 1. there’s no mismatch; 2. after skipping the first character in the first range, both ranges are equal; 3. after skipping the first character in the second range, both ranges are equal.
std::find
std::find_if
std:find_if_not
Looking for values in a container is one of the most common tasks you need to perform during interviews, and the STL gives you several options to choose from. The most important thing to keep in mind is that “find algorithms” should only be used for unsorted ranges. The next group of algorithms will deal with sorted ranges and cover the difference.
std::find
takes a pair of iterators and the value you are looking for. It’s using operator==
for equality and returns an iterator to the first match. If the value you are looking for can’t be found in the range, std::find
returns an iterator to the last element.
std::find_if
and std::find_if_not
are similar to std::find
except they take a unary predicate as well as a pair of iterators to determine equality. The difference between the two is find if looks for the first match while find if not looks for the first mismatch.
I already mentioned that algorithms that return iterators can be combined with other algorithms, but they can also be used to construct new containers. For example, suppose we need to write a function that trims leading spaces from a string:
string trim_leading_spaces(const string& str) {
return {
find_if_not(begin(str), end(str), [](char c){
return c == ' ';
}),
end(str)
};
}
std::lower_bound
std::upper_bound
If you are preparing for interviews you should know all about binary search. Many interview questions require you to implement specialized variations of binary search, but in some cases, a built-in algorithm will do.
The STL provides an algorithm called std::binary_search
but I never found it useful because it returns true/false if a given value exists in the range instead of an iterator that points to a value. Since a given value may appear more than once, we need two variations:
std::lower_bound
returns an iterator that points to the first element equal to or greater than the value.std::upper_bound
returns an iterator that points to the first element that’s greater than the value.
Both of these algorithms take a pair of iterators and a value to search for.
There’s a subtle but important difference between the two. Suppose you are given a sorted array of integers [1, 3, 3, 5, 6, 7]
and you want to find the position of the value 3, which index should be returned? Sometimes we want to find the lower-bound and sometimes the upper-bound:
int find_first_of_element(const vector<int>& vec, int val) {
return lower_bound(begin(vec), end(vec), val) - begin(vec);
}
int find_last_of_element(const vector<int>& vec, int val) {
return upper_bound(begin(vec), end(vec), val) - begin(vec) - 1;
}
int main() {
vector<int> vec = {1, 3, 3, 5, 6, 7};
cout << find_first_of_element(vec, 3) << '\n'; // prints 1
cout << find_last_of_element(vec, 3) << '\n'; // prints 2
}
It’s important to remember that to maintain logarithmic time complexity these algorithms require random-access iterators (array, vector, string, and deque). They would work otherwise but the time complexity will most likely be linear.
According to the standard, passing unsorted ranges to std::lower_bound and std::upper_bound is undefined behavior so if you are seeing unexpected results, make sure this precondition was met.
std::nth_element
The most ubiquitous STL algorithms are std::sort
and std::partition
, so I won’t be covering them, but there is a sorting algorithm that is usually glossed over. The reasons for this could be its non-descriptive name or the subtle differences that separate it from similar algorithms. Either way, it can be used to solve an entire class of interview problems with higher efficiency.
std::nth_element
expects three random access iterators: the beginning of the range, the nth position, and the end of the range. It sorts the range such that the nth position points to the element that would have been there if the range was sorted, and all the elements before the nth position are less than or equal to all the elements after it.
std::nth_element
differs from std::partition
since we specify the number of elements we want in the first part of the partition, instead of the sorting criteria that will be used to determine the location of the partition.
The best way to understand this algorithm is through an example. Suppose you are given a list of unsorted numbers and you want to find the kth largest element in the sorted order (a common interview question with real-world application).
Input: nums = [3,2,1,5,6,4], k = 2
Output: 5
The brute-force approach would be sorting the entire list and returning the element in the kth position:
int kth_largest(vector<int>& nums, int k) {
sort(begin(nums), end(nums));
return nums[size(nums) - k];
}
The time complexity of this approach is O(n log n) which is suboptimal in this case.
A common solution that improves on this approach uses a priority queue to sort a subset of the list. This approach will drive the time complexity down to O(n log k), which is acceptable in most cases, but we can do better.
Quickselect is a selection algorithm related to Quicksort that lets you select an element based on sorting criteria at a given position in a range. Unlike most solutions to this problem, the time complexity of Quickselect is linear on average. Interviewers won’t expect you to implement it yourself, but you don’t have to because that is what std::nth_element
is for (and how it is implemented by most library vendors).
int kth_largest(vector<int>& nums, int k) {
auto offset = size(nums) - k;
nth_element(begin(nums), begin(nums) + offset, end(nums));
return nums[offset];
}
Unlike the brute-force approach, this function has an average complexity of O(n).
std::shuffle
std::shuffle
comes in handy when you need to generate random permutations for a range such that each permutation has an equal probability. It has a single overload that takes a pair of iterators and a uniform random number generator.
std::shuffle
is usually implemented using the Fisher-Yates shuffle algorithm which is relatively trivial, however, if shuffling is an implementation detail, using a built-in function is preferrable.
Suppose we need to create a data structure initialized with an array of integers, that provides provide two methods:
Shuffle()
: which returns a random permutation of the original array.Reset()
: which resets the array to its original state and returns it.
We can use std::shuffle
to solve this problem:
class Array {
public:
explicit Array(const vector<int>& arr)
: arr(arr), src(arr), seed(random_device{}()){}
vector<int> Shuffle() {
shuffle(begin(arr), end(arr), seed);
return arr;
}
vector<int> Reset() { return src; }
private:
vector<int> arr, src;
default_random_engine seed;
};
If you look up shuffling in C++ you may come across std::random_shuffle
, the deprecated predecessor of std::shuffle
. It is simpler to use because you don’t need to give it a random number generator, but you should avoid it because it’s using rand()
to produce random numbers. If you want to learn about the problems of using rand()
, I strongly recommend watching rand() Considered Harmful by Stephan T. Lavavej.
std::next_permutation
Generating permutations come up often in technical interviews. Most permutations problems can be solved with an algorithm that returns the next lexicographically-ordered permutation of a range.
In its simplest form, std::next_permutation
takes a pair of iterators and it uses operator<
to determine the next lexicographically-ordered permutation based on the current ordering of the range. It modifies the range in place and returns true
if the next permutation is greater.
Since this algorithm can be applied to many problems I want to share two examples. The first one (and perhaps most common) suppose we need to write a function that takes an array of integers and returns all possible permutations. Most solutions involve backtracking, but you can use std::next_permutation
instead without compromising time complexity:
vector<vector<int>> Permutations(const vector<int>& nums) {
vector<vector<int>> output;
auto permutation = nums;
do {
next_permutation(begin(permutation), end(permutation));
output.emplace_back(permutation);
} while (permutation != nums);
return output;
}
In the second problem, suppose we are given a positive integer n, and we need to write a function that returns the smallest integer that contains the same digits as n and is greater in value than n. We can conclude from this problem statement that we need the next permutation of n if such permutation exists.
int NextGreaterElement(int n) {
auto permutation = to_string(n);
if (!next_permutation(begin(permutation), end(permutation))) {
return -1; // the next permutation isn't greater than n
}
return stoi(permutation);
}
std::rotate
Rotating elements in a container is a common building block to many algorithms. In this context, rotation means shifting n elements in a container to the left or the right in a circular manner. For example, suppose we are asked to shift three elements in the following array to the right [1, 2, 3, 4, 5, 6]
. The expected output of this operation is: [4, 5, 6, 1, 2, 3]
.
It may not seem obvious at first, but rotation can be used to solve many problems. Consider the following example of an insertion sort algorithm which I have taken directly from cppreference.com:
vector<int> v = {2, 4, 2, 0, 5, 10, 7};
for (auto i = v.begin(); i != v.end(); ++i) {
rotate(upper_bound(v.begin(), i, *i), i, i + 1);
}
std::rotate
takes three forward iterators: the beginning of the range, the element that should appear at the beginning of the “rotated” range, and the end of the range.
For a common interview question, suppose you are given a sorted array that was rotated k times, and you are asked to rotate it back into place. This problem can be solved with a single line of code:
void RotateBack(vector<int>& arr, int k) {
rotate(begin(arr), begin(arr) + size(arr) - k + 1, end(arr));
}
We can use a sorting algorithm to solve this problem but the time complexity of std::rotate
is linear in the distance of the original range, better than most standard sorting algorithms.
std::remove
std::remove
is one of the most confusing and commonly used algorithms in the library. It’s confusing because, despite its name, it doesn’t remove anything.
Every container has a different way of erasing elements, which depends on its underlying data-structure, but std::remove
doesn’t have a way to reflect on the container it is operating on.
So what does it do? std::remove
moves all the elements that should not be removed to the beginning of the range (while maintaining their relative order) and returns an iterator to the new logical end of the range.
When we use std::remove
, we usually want to remove elements from a range, so it’s often called in conjunction with an erase member function. Together, these two function calls make up the erase-remove idiom.
std::remove
takes a pair of iterators and the value that should be removed. If you want to provide a predicate, you can use std::remove_if
. Most containers provide an erase member function that takes a range, so if you are given an array of integers [1, 7, 0, 3, 5, 9, 0, 3, 4]
, and you are asked to remove all the 0’s from it, you can use erase-remove:
void RemoveZeroesFromArray(vector<int>& arr) {
arr.erase(remove(begin(arr), end(arr), 0), end(arr));
}
Combining STL algorithms through the use of iterators is what makes it so powerful. Once you get the hang of it you can come up with creative combinations to solve interesting problems, and I want to share an example of that. Suppose you are given an array of integers, and you need to write a function that moves all the 0s to the end of it while maintaining the relative order of the non-zero elements. As an added constraint, you need to do it in linear time and constant space.
One way to solve this problem using STL algorithms is:
void RemoveZeroesFromArray(vector<int>& arr) {
arr.erase(remove(begin(arr), end(arr), 0), end(arr));
}
Since std::remove
doesn’t delete any elements to maintain a linear time complexity, we can use it to move all the non-zero elements to the front of the array and combine it with std::fill
to reset the remainder with 0s.