Software Engineering Interview Prep

Dec, 10 2020
6 Minutes

I hate the Algorithm Interview, but it's here to stay.

This post is a set of useful resources and my notes to help you prepare for algorithm interviews.

Useful Resources

I've been impressed by the amount of excellent resources that have come out in the last three years to help prep. Gayle might have cracked the coding interview, but Abdul Bari and NeetCode, are busting them wide open.

Resources for Other types of Interviews

Not everyone is forcing people to do algorithms. An increasing number of interviews are looking like the one Firebase used to use - a legitimate take home problem. Still, lots of them resemble algorithm problems. And often, companies will sprinkle in an algorithm problem or two at the onsite.

Here are other resources to help you prepare for non-algorithm interviews:

Leetcode Notes

As I do more Leetcode problems, I'll add some notes here. This section exists to catalogue my struggles as well as record key "takeaways" such that I remember the high level strategies that help.

3Sum - this problem was painful for me. The optimal solution is so simple once you understand it, but I had to try a couple attempts at this (3 days in a row, after looking at a very well commented solution)

Takeaways: sorting, two pointers (converging from the left and right onto something in the middle)

Search in Rotated Sorted Array - binary search! My solution didn't match the solutions of others, but was fairly elegant.  A friend of mine said that "you need to be able to write binary search on a whim" for interviews. I think that's absolutely true.

Takeaways: practice implementing binary search every now and then. There are a few gotchas that can trip up an algorithm (i.e. lots of off by one errors).

Combination Sum - backtracking! I had never implemented a backtracking algorithm before, but after seeing a few examples, I got the hang of it. The Leetcode discussions have a very good generalized way of attacking backtracking problems. It makes sense that there is a collection and you "add", try an operation, then "remove"  (which aids in the name of backtracking).

Takeaways: when you need to find ALL solutions to a problem, there's a decent chance backtracking can help. As Abdul Bari says, if you need to find an optimal solution, there's a chance dynamic programming is your friend. Backtracking is also useful for "word search" type problems.

Valid Binary Search Tree - I feel like I've encountered this problem in interviews before. For some reason, I always reach for a recursive solution (which works just fine), but it's useful to remember the in-order, pre-order, and post-order traversals. Here's a great discussion link about in-order traversal and how it can also be used to find the "k-th smallest" item in a binary search tree (also a problem I feel like I've encountered).

Takeaways: Practice in-order traversal

Number of Islands - Ah, hello my old friend. I remember when I first attempted this, I spent over 2 hours and I couldn't solve it. Now, I can whip up a pretty tasty solution in 20 minutes. This go around, I forgot a couple of small insights. Reminder: iterate through the map (double for loop) in search for an island, do BFS once you hit an island mutating the island with breadcrumbs as you go, and make sure you count properly.

Takeaways: remember that BFS is for search, not necessarily for performance.

Word Search - I hadn't done this problem before and I found this to be significantly harder than Number of Islands or Combination Sum! This problem is a combination of backtracking plus depth first search. Like Number of Islands, you need to scan through the entire matrix to find a "starting location", and then perform a backtracked DFS. Backtracking is still quite unnatural for me, so I'm going to need to do some practice with this technique.

I'm realizing that the Leetcode "acceptance" rate is often a better proxy for difficulty than the difficulty rating itself.

Takeaways: practice backtracking! Add to path, check the path, remove from path if it didn't work out.

Binary Tree Maximum Subpath - I didn't complete this problem, and found the edge cases quite difficult. I looked at the solutions in the discussions and was horrified to find that the "elegant" solution involved recursion with mutation. That's the type of "clever" performant solution that gets you in trouble in production - maybe if you're writing low level libraries that won't get touched for a while, but this kind of code is not maintenance friendly. Performant yes, but too much of this stuff and you'll kill team performance. In my opinion, this question shows how algorithm interviews are actually harmful.

Takeaways: avoid companies that ask this question unless they pay extremely well or are working on highly performant systems software and the tradeoff between developer productivity/maintenance and performance is worth it.

Longest Consecutive Sequence - I used a variation of UnionFind to solve this, but once I saw the NeetCode solution, I felt like an absolute idiot.

Takeaways: Always draw out solutions. Sometimes drawing things out will help you see rather obvious invariants that you can utilize in your solution. This is true not just in algorithm interviews, but mathematics, proofs, and other "insight driven" disciplines.

Abdul Bari YouTube Notes

These are notes from watching Abdul Bari's video lectures.

Prim's and Kruskal's Algorithms (Minimum Spanning Trees)

While reviewing The Algorithm Design Manual, I realized I had never really attempted a solution at the Traveling Salesman Problem. This took me down a rabbit hole of reading about Minimum Spanning Trees, and later, UnionFind (aka the DisjointSets data structure).

Prim's and Kruskal's algorithms are ways to generate Minimum Spanning Trees.

The key thing to remember here are both of these algos are greedy. That is, they start with the smallest edge, and go from there.

  • Prim: start w/smallest edge, find next smallest connected edge that doesn't form a cycle, and continue
  • Kruskal: start w/smallest edge, find next smallest (can be unconnected from the first) that doesn't form a cycle, and continue

Prim is O(n^2) and Kruskal is O(m logm) where n is the number of vertices in the graph, and m is the number of edges.

UnionFind (aka DisjointSets)

This data structure is extremely useful for Prim's and Kruskal's algorithms becuase both rely on cycle detection.

The UnionFind data structure is excellent at helping us detect cycles.

UnionFind can be most efficiently implemented as an array (which in my opinion is difficult to understand). However, at it's core, it is a collection of Sets of vertices: Collection<Set<Vertex>>

It's hard to explain in words, but here's my best shot:

  • Build up UnionFind/DisjointSets by adding edges to the collection of disjoint sets
  • When an edge is added, one of 4 possibilities may occur
  • 1) The edge is disjoint from all existing sets. Add a new disjoint set to the collection
  • 2) The edge is connected to one of the sets. "Enlarge" the set.
  • 3) The edge joins two disjoint sets together. Perform a union of those two sets (make sure to delete the previously disjoint sets from the collection)
  • 4) The edge is already "spanned" by one of the sets. In this case, we do nothing. However, in the case of Prim's and Kruskal's - this condition triggers our cycle detected mechanism.

NP Problems

I've never understood all the stuff about P and NP. After watching Abdul, I'm still a little confused, but I think it's a little more clear. These don't really come up in interviews, but it's useful to recognize if an interviewer presents a problem that is NP.

Polynomial time: n, logn, n2, nlogn, n3

Not polynomial time: 2n, n!

Even n10 is much smaller for 2n for somewhat sizeable values of n.

NP stands for non-deterministic polynomial. What does that mean?

A non-deterministic algorithm is one that contains steps that are "non-deterministic". I don't fully understand these "non-deterministic" steps, but you can write algorithms using them and treat them essentially as magic.

If at some point in the future, we figure out how to implement those magical steps, then our non-deterministic algorithms can be solved in polynomial time.

Thus, a problem is NP if there exists an algorithm which uses a "non-deterministic" step to arrive at a polynomial time solution.

There's more about NP-Complete and NP-Hard, but I'll need to do another watch through to properly understand them.