Ah, the famous “Google-style” algorithmic coding interview.

If you’ve never had one of these interviews before, the idea is to see if you can write code that’s not only *correct*, but *efficient,* too. You can expect to spend lots of time diagramming data structures and talking about big O notation. Popular hits include “reverse a linked list in place,” “balance a binary search tree,” and “find the missing number in an array.”

This kind of interview is the standard at all the big tech companies—Facebook, Amazon, Microsoft, Twitter, and Google. And many smaller companies use this kind of interview, as well.

But the algorithmic coding interview is contentious these days. Some people say it’s too academic and tests things that have little to do with real-life software engineering. Not long ago, Max Howell, the author of Homebrew (software that basically *every* engineer with a Mac uses), famously quipped about being rejected from Google after being unable to invert a binary tree.

Like it or not, a “Google-style” coding interview may stand between you and your dream job. So it’s in your best interest to learn how to beat it.

And you can. The algorithmic coding interview is a game you can learn how to win.

### Patterns

One of the biggest fears candidates have is that coding interview questions rely on “blind insights”—fancy tricks you couldn’t *possibly* know unless you’d seen the answer already.

But these insights *aren’t* blind. They come from knowledge of a small set of *patterns* that come up again and again. If you can learn the patterns, you can unlock the insights.

Today, I’m going to teach you the *most common* pattern:** hashing**.

First, a couple disclaimers:

- To
*really*understand what we’re doing here, you’ll want a working understanding of big O notation. If you’re a bit rusty, you might want to read up on big O.

- We’ll be writing Python code, but we’ll avoid some fancy “Pythonisms” in favor of making our procedure as transparent as possible. Think of it as pseudocode that happens to be runnable Python.

Okay, on with the show! We’ll learn about hashing by solving a few problems with it.

### Problem 1: Mode

*Given an array of numbers, return the *mode*—the number that appears the most times.*

This is a common programming interview question. And as with many great interview questions, there are at least three ways to solve it.

### Solution 1: Brute Force

By “brute force” we mean “looking at *each possible answer.*“ In this case, that’s looking at each *number* and getting its count by looping through the array *again*. From all these counts, we return the number with the max count.

def get_mode_brute_force(nums):

mode = None

max_count = 0

for potential_mode in nums:

#get its count

count = 0

for num in nums:

if num == potential_mode:

count += 1

#is it a new mode?

if count > max_count:

max_count = count

mode = potential_mode

return mode

This’ll work, and it’s an okay place to start. It’ll take O(n²) time, since we’re nesting two loops that each go through all *n* items in our array.

How could we bring down that time cost? We’d need to avoid looping through the array twice…

But how can we know *how many times* each number appears, without looping through the whole array once for each number?

### Solution 2: Sorting

If we sort the array, all occurrences of a number will be *next to each other*. So in just one pass, we can count the occurrences for each number as we encounter it.

def get_mode_sorting(nums):

#edge cases

if len(nums) == 0:

return None

elif len(nums) == 1:

return nums[0]

sorted_nums = sorted(nums)

mode, previous_num = None, None

max_count, current_count = 0, 0

for current_num in sorted_nums:

#update count

if current_num == previous_num:

current_count += 1

#new mode?

if current_count > max_count:

max_count = current_count

mode = current_num

# reset for next iteration

if current_num != previous_num:

current_count = 1

previous_num = current_num

return mode

This’ll cost O(nlogn) time in total. That’s faster than O(n²), so it’s an improvement. But the driver of our time cost is that first sorting step, which costs O(nlogn) time.

If we avoid sorting, we can bring the time cost all the way down to O(n).

### Solution 3: Hashing

Let’s take *one* walk through the array, collecting the counts for each number in a dictionary (called a “hash map,” “hash table,” or “hash” in many non-Python languages).

*That*’s the key—using a dictionary. They have O(1)-time insertions and lookups (on average). So we can collect the counts for *all* of our numbers in just one pass, by updating each number’s count in our dictionary as we go.

def get_mode_hashing(nums):

# keys: numbers in our input array

# values: the number of times that key number appears

# e.g.: if 5 appears 3 times, then counts[5] == 3

counts = {}

# step 1: get the counts

for num in nums:

if num in counts:

counts[num] += 1

else:

counts[num] = 1

# step 2: get the number with the max count

max_count = 0

mode = None

for num, count in counts.iteritems():

if count > max_count:

max_count = count

mode = num

return mode

This takes O(n) time, which is the optimal runtime for this problem! And we unlocked it by exploiting the O(1)-time insertions and lookups that dictionaries give us.

Using a dictionary or similar data structure to save time is *so often* the “trick” or “blind insight” needed to get the optimal answer that it should just *always be our first thought* in a coding interview.

To help us build an intuition for how to apply hashing to other problems, we’ll look at a couple more examples.

### Problem 2: Repeat Number

*Given an array of numbers where one number appears twice, find the repeat number.*

If we make hashing our first thought, we can derive the optimal solution pretty quickly! We could even use the exact same “counts” dictionary again, returning a number as soon as its count crept above 1.

But we don’t really need the *count* for each number—we just need to know *whether it has appeared at all*. So it’s a bit cleaner to use a set in this case. The idea is similar—under the hood, sets use “hashing” to get O(1)-time insertions and lookups, just like hash maps. In fact, the common set data structure in Java is called “HashSet.”

def get_repeat_number(nums):

nums_seen = set()

for num in nums:

if num in nums_seen:

return num

nums_seen.add(num)

This takes O(n) time, which is the best we can do for this problem!

And just as with our first problem, we *could* have used a brute force approach (nesting two loops) to get O(n²) time. Or we could have started off by sorting, to get O(nlgn) time. See? There are repeating patterns here!

If the brute force and/or sorting solutions don’t come to you right away, try to derive them. Part of being great at algorithm design is being able to see the different ways to solve the same problem—including the *less efficient* ways!

### Problem 3: Sorting

*Given an array of numbers in the range 1..1000, return a new array with those same numbers, in sorted order. There may be repeats in the input array. If there are, you should include those repeats in your sorted answer.*

First thought: hash maps!

Let’s use that same “counts” dictionary we used in problem 1. Once we have the count for each number (which we can get in O(n) time) we just need to walk through the range 1..1000 once, adding each number to our “sorted” array *x* times, where x is its count from the counts dictionary.

def sort_nums(nums):

counts = {}

# step 1: get the counts

for num in nums:

if num in counts:

counts[num] += 1

else:

counts[num] = 1

# step 2: generate the sorted array

sorted_nums = []

for num in range(1, 1001):

if num in counts:

for _ in range(counts[num]):

sorted_nums.append(num)

return sorted_nums

This is called a counting sort. Importantly, it only works well if we can “bound” our numbers into a certain small-ish range (in this case, 1..1000). Otherwise step 2 above could be quite inefficient.

As with the previous two problems, we could use a “brute force” approach, by nesting two loops—we’d end up with a selection sort or an insertion sort, taking O(n²) time. See? Brute force is a pattern, too!

### Practice, And Look for More Patterns

Hashing isn’t *always* the answer. But it often is: It’s the *most common* pattern. It should always be your first thought.

If you really want to ace your coding interviews, you’ll need to invest time in running practice problems. There’s no way around it. But each practice problem will take you much farther if you pay careful attention to *patterns*. Don’t just read the answer and understand why it works—identify the pattern or “blind insight” at play, and think about how it could be applied to other problems.

Once you’ve learned some patterns, those “insights” won’t seem so “blind” anymore. *Then* you’ll really feel confident solving problems you’ve never seen before!

*Parker Phinney founded **Interview Cake**, a website that teaches software engineering job-seekers how to beat the coding interview. He’s sick of seeing engineers miss out on the jobs they deserve because they’re underprepared for coding interviews. He runs a **weekly email list** that sends engineers a new practice problem every Tuesday. He lives in San Francisco and loves riding his bicycle.*

Ok prerequisite for reading the article is understanding of O notation. Well should not this be also a prerequsite to writing such articles?

insertion of an integer into a dict could be done in O(1), but look up is definitely not an O(1) operation.

For your first problem, if you measure it, your ‘Sorting’ solution is actually *faster* than your ‘Hashing’ solution.

Details here: http://missingbytes.blogspot.com/2015/11/optimization-measure-performance-versus.html

Cite from the HashMap java doc: This implementation provides constant-time performance for the basic operations (get and put)

Cite from java doc of HashMap: This implementation provides constant-time performance for the basic operations (get and put)

LOL… this isn’t going to prevent anyone from getting onto a development team, except maybe the project manager.

With the hashing solution, you’d save a loop if the max_count and mode were set in the first loop instead. You would compare directly against the hash table using count[num]

@Slava:

Before you criticise someone, you should do some research. Given appropriate size and hash function, the average lookup time in a hash map is indeed O(1). We’re not talking worst-case complexity here.

If I cycle through a list of n elements twice, then this is O(2n) not O(n^2). An O(n^2) problem would be where I have an inner loop of n for each n elements I visit in the outer loop.

The second solution for mode above runs a sort( ) first. The expense of this needs to be included.

Otherwise with a language which has hashes you can do it in one pass and pop off the result.

foreach num in list

//bidirectional hash the frequency

back{hash{num}++}=num;

//sort frequency, pop the top, lookup the number it counts

return back{pop sort values(hash)}.

Nice if you could code and measure these for CPU, memory etc.

For the first problem, you only need 1 loop instead of two.

@Crion

We don’t? Well in that case I can say I can sort an array in O(N) time if it comes already or almost sorted to begin with.

I think that at this point in my career, if someone asked me in an interview to write sorting code, I would ask why we would want to do that when the language library (Python, STL, Java…) already contains heavily optimized sorting methods and classes. Or at worse, if we have to write our own, the provably fastest sort is quicksort and five seconds of googling got me http://interactivepython.org/runestone/static/pythonds/SortSearch/TheQuickSort.html .

My return question would be, “Why does the company value employees reinventing the wheel.” If they don’t like that question, that’s not a place I want to work.

Hey, when in Rome, do as the Romans do. Sometimes they want the wheel reinvented to prove you know how one is made. Sometimes they just want you to know how you can get the wheel that you need. It’s always good to ask. If they won’t say, then assume the normal panoply of any reasonable programming environment, use qsort() etc.

Sorry Hitech Redneck, I’ve got to agree with Glenn Glazer here. The problem is that employers are wasting time.

The only thing that should matter – and ITIL will tell you this – is what the business wants/needs. That’s it. How I get there should not matter, as long as those needs are met. That goes to resourcefulness and out-of-the-box thinking. You could argue that thinking about “Big O” is OOTB thinking; it’s not. You’re targeting one way of doing a thing. That’s a Steve Jobs way of thinking that, as we’ve seen recently, is not sustainable.

All that should matter to an employer is, “can you do X in Y amount of time?” That’s it. Trying to get me to program YOUR way should only matter AFTER I’ve started and gotten acclimated to your tools.

During an interview?

Should just be, “Write a program/website that does X. You’ve got Y minutes to finish it, then Z minutes to explain and demonstrate it/show your code.”

That tells you many things about me: That I can do, that I will do, that I follow instructions, that I can explain what I did (which means I didn’t cheat) and I can show my code (which validates essentially I know how to use an IDE). Also, it gives you a chance to tell these guys “hey look, here’s how I did this, did you know you could do it this way? Here’s how it works…” and really WOW them (if they’re open-minded).

Why is this important? Because that’s how the real world is.

You’re given a task and a deadline. Nobody really cares how you get there, as long as you get there; but they hopefully appreciate when you can build a better mousetrap, and save costs at the same time.

So why’s the interview so hell bent to focus on minutiae? Because they don’t really want to hire you if you’re a disruptor. Google it.

So I agree: if they’re not open to a simpler, faster, easier, leaner, less clunky way of achieving a defined objective…I wouldn’t want to work there, either. You’ll just get burnt out with their caveman mentality and resistance to new technologies after some time.

Tricks? Patterns? This is not what a good programming test looks like. As such, not many companies will use them. At least not any worth working for. Quality coding tests evaluate your problem solving abilities and efficiency. Companies either create their own coding questions or use a service that does that for them, for example: https://www.testdome.com/

The point is – there is no trick, only skill and experience.