This document contains an overview of the minimum skillset required of a general software engineering candidate. Many of the important points here are derived from Cracking the Coding Interview, 6th Edition.
We assume the use of the python3 programming language.
Use the STAR method for responding to non-technical competency questions.
Situation
Task (Objective)
Action
Result
Amazon Leadership Principles
Although these principles are tailored to Amazon, they form a solid foundation for competency interviews for all companies.
Customer Obsession
Leaders start with the customer and work backwards. They work vigorously to earn and keep customer trust. Although leaders pay attention to competitors, they obsess over customers.
Ownership
Leaders are owners. They think long term and don’t sacrifice long-term value for short-term results. They act on behalf of the entire company, beyond just their own team. They never say “that’s not my job.”
Invent and Simplify
Leaders expect and require innovation and invention from their teams and always find ways to simplify. They are externally aware, look for new ideas from everywhere, and are not limited by “not invented here.” As we do new things, we accept that we may be misunderstood for long periods of time.
Are Right, A Lot
Leaders are right a lot. They have strong business judgment and good instincts. They seek diverse perspectives and work to disconfirm their beliefs.
Learn and Be Curious
Leaders are never done learning and always seek to improve themselves. They are curious about new possibilities and act to explore them.
Hire and Develop the Best
Leaders raise the performance bar with every hire and promotion. They recognize exceptional talent, and willingly move them throughout the organization. Leaders develop leaders and take seriously their role in coaching others. We work on behalf of our people to invent mechanisms for development like Career Choice.
Insist on the Highest Standards
Leaders have relentlessly high standards—many people may think these standards are unreasonably high. Leaders are continually raising the bar and driving their teams to deliver high-quality products, services, and processes. Leaders ensure that defects do not get sent down the line and that problems are fixed so they stay fixed.
Think Big
Thinking small is a self-fulfilling prophecy. Leaders create and communicate a bold direction that inspires results. They think differently and look around corners for ways to serve customers.
Bias for Action
Speed matters in business. Many decisions and actions are reversible and do not need extensive study. We value calculated risk taking.
Frugality
Accomplish more with less. Constraints breed resourcefulness, self-sufficiency and invention. There are no extra points for growing headcount, budget size, or fixed expense.
Earn Trust
Leaders listen attentively, speak candidly, and treat others respectfully. They are vocally self-critical, even when doing so is awkward or embarrassing. Leaders do not believe their or their team’s body odor smells of perfume. They benchmark themselves and their teams against the best.
Dive Deep
Leaders operate at all levels, stay connected to the details, audit frequently, and are skeptical when metrics and anecdote differ. No task is beneath them.
Have Backbone; Disagree and Commit
Leaders are obligated to respectfully challenge decisions when they disagree, even when doing so is uncomfortable or exhausting. Leaders have conviction and are tenacious. They do not compromise for the sake of social cohesion. Once a decision is determined, they commit wholly.
Deliver Results
Leaders focus on the key inputs for their business and deliver them with the right quality and in a timely fashion. Despite setbacks, they rise to the occasion and never settle.
The runtime of a typical recursive algorithm with multiple branches is O(branches^depth).
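As a quick illustration, a minimal sketch: the naive Fibonacci function below branches twice per call and recurses to a depth of roughly n, giving O(2^n) runtime.

def fib(n):
    """Naive Fibonacci: two branches per call, recursion depth of roughly `n`,
    so the runtime is O(2**n) and the call stack uses O(n) space.
    """
    if n <= 1:
        return n
    return fib(n - 1) + fib(n - 2)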
Reference Table
The notations are ranked from most efficient to least efficient.
Notation     | Name        | Notes
O(1)         | constant    | Reference on a look-up table
O(\log{n})   | logarithmic | Binary search on a sorted array
O(n)         | linear      | Search on an unsorted array
O(n\log{n})  | loglinear   | Merge sort runtime complexity
O(n^2)       | quadratic   | Bubble/Selection/Insertion sort
O(n^c)       | polynomial  | Tree-adjoining grammar parsing
O(c^n)       | exponential | Exact Travelling Salesman (Dynamic Programming)
O(n!)        | factorial   | Brute-force search Travelling Salesman
Examples
ins_sort
def ins_sort(arr):
    for s_idx in range(1, len(arr)):                         # 1.
        s_val = arr[s_idx]
        for i_idx, c_val in enumerate(arr[:s_idx]):          # 2.
            if s_val < c_val:
                arr[i_idx + 1:s_idx + 1] = arr[i_idx:s_idx]  # 3.
                arr[i_idx] = s_val
                break
    return arr
We iterate through the array to 1. find the element to insert, then another time to 2. find the position to insert it. Once the position is found, we 3. perform an array slice assignment to shift the intervening elements. All inserts occur in place. This ins_sort has O(n^2) runtime complexity and O(1) space complexity (the slice on the right-hand side of the assignment does create a short-lived temporary list).
The timings below show that the cost of the slice assignment grows with the size of the slice being replaced:

$ python3 -m timeit -n 100 '([0] * 2000000)[1:2] = range(1)'
# 100 loops, best of 5: 9.13 msec per loop
$ python3 -m timeit -n 100 '([0] * 2000000)[1:20000] = range(19999, -1, -1)'
# 100 loops, best of 5: 11 msec per loop
$ python3 -m timeit -n 100 '([0] * 2000000)[1:200000] = range(199999, -1, -1)'
# 100 loops, best of 5: 53.7 msec per loop
$ python3 -m timeit -n 100 '([0] * 2000000)[1:2000000] = range(1999999, -1, -1)'
# 100 loops, best of 5: 191 msec per loop
range_sum
def range_sum(n):
    """Return `n` + `n-1` + ... + `1`."""
    if n > 0:
        return n + range_sum(n - 1)
    return 0
Each recursive call adds an additional frame to the stack. This range_sum implementation has O(n) runtime and O(n) space complexity.
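For contrast, a sketch of an iterative equivalent (or the closed form n * (n + 1) // 2) avoids the O(n) call stack entirely:

def range_sum_iter(n):
    """Iterative equivalent of `range_sum`: O(n) time, O(1) space."""
    total = 0
    for i in range(1, n + 1):
        total += i
    return total

assert range_sum_iter(5) == 15 == 5 * (5 + 1) // 2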
memo_two_exp
def memo_two_exp(o_n):
    """Memoized recursive calculation of `2**n` for `n >= 0`."""
    cache = {0: 1}

    def _two_exp(n):
        if n in cache:
            return cache[n]
        if n > 0:
            cache[n] = _two_exp(n - 1) + _two_exp(n - 1)
        return cache.get(n, 1)

    return _two_exp(o_n)
Each recursion splits into two separate branches, but the second branch is guaranteed to hit the cache, so only n distinct values are ever computed. This memo_two_exp implementation has O(n) runtime and O(n) space complexity.
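The standard library's functools.lru_cache can provide the same memoization without a hand-rolled cache dictionary; a minimal sketch:

from functools import lru_cache

@lru_cache(maxsize=None)
def two_exp(n):
    """Memoized `2**n` for `n >= 0`: O(n) time and space, like `memo_two_exp`."""
    if n == 0:
        return 1
    return two_exp(n - 1) + two_exp(n - 1)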
Linked Lists
The standard library implementation, collections.deque, is a doubly linked list (of fixed-size blocks) supporting efficient O(1) appends and pops from either end of the deque.
class LinkedList(object):
    """Custom linked list implementation.

    You probably want `collections.deque` instead.
    """
    class Node(object):
        def __init__(self, data, prev_node=None, next_node=None):
            self.data = data
            self.prev = prev_node
            self.next = next_node
    def __init__(self, iterable=()):
        """
        Initialize a linked list from a sized iterable.

        O(n) initialization time.
        """
        self.len = len(iterable)
        iterator = iter(iterable)

        # Create the doubly linked list; this requires a forward and a backward pass.
        # Head-to-tail iteration: creates the tail -> head `prev` links.
        while True:
            try:
                cur_val = next(iterator)
                try:
                    t_cur_node = LinkedList.Node(cur_val, prev_node=t_cur_node)
                except NameError:
                    # On the first element, t_cur_node is not yet defined.
                    t_cur_node = LinkedList.Node(cur_val)
            except StopIteration:
                break

        # The last node created is the tail (None for an empty iterable).
        try:
            self.tail = t_cur_node
        except NameError:
            self.tail = None

        # Tail-to-head iteration: creates the head -> tail `next` links.
        head_cur_node = self.tail
        while True:
            try:
                head_cur_node.next = h_cur_node
            except NameError:
                # On the last element, h_cur_node is not yet defined.
                pass
            finally:
                h_cur_node = head_cur_node

            if h_cur_node is None or h_cur_node.prev is None:
                # We've reached the head (or the list is empty).
                head_cur_node = h_cur_node
                break
            head_cur_node = h_cur_node.prev

        self.head = head_cur_node
    def append(self, val):
        """
        Append a new value to the right of the linked list.

        O(1) time.
        """
        cur_tail = self.tail
        if cur_tail is None:
            new_tail = LinkedList.Node(val)
            self.head = new_tail
        else:
            new_tail = LinkedList.Node(val, prev_node=cur_tail)
            cur_tail.next = new_tail
        self.tail = new_tail
        self.len += 1
    def pop(self):
        """
        Remove and return an element from the right side of the linked list.

        If no elements are present, raises an IndexError.

        O(1) time.
        """
        if self.tail is None:
            raise IndexError("pop from empty linked list")
        to_pop = self.tail
        self.tail = to_pop.prev
        if self.tail is not None:
            self.tail.next = None
        if to_pop is self.head:
            self.head = None
        self.len -= 1
        return to_pop.data
    def appendleft(self, val):
        """
        Append a new value to the left of the linked list.

        O(1) time.
        """
        cur_head = self.head
        if cur_head is None:
            new_head = LinkedList.Node(val)
            self.tail = new_head
        else:
            new_head = LinkedList.Node(val, next_node=cur_head)
            cur_head.prev = new_head
        self.head = new_head
        self.len += 1
    def popleft(self):
        """
        Remove and return an element from the left side of the linked list.

        If no elements are present, raises an IndexError.

        O(1) time.
        """
        if self.head is None:
            raise IndexError("popleft from empty linked list")
        to_pop = self.head
        self.head = to_pop.next
        if self.head is not None:
            self.head.prev = None
        if to_pop is self.tail:
            self.tail = None
        self.len -= 1
        return to_pop.data
    def index(self, x):
        """
        Return the position of x in the linked list in O(n) time.

        Returns the first match or raises ValueError if not found.
        """
        cur_node = self.head
        idx = 0
        while cur_node and cur_node.data != x:
            cur_node = cur_node.next
            idx += 1
        if cur_node:
            return idx
        raise ValueError(f"{x} is not in linked list")
    def __iter__(self):
        # Return a head-to-tail travelling iterator.
        cur_node = self.head
        while cur_node is not None:
            yield cur_node.data
            cur_node = cur_node.next
    def __len__(self):
        return self.len
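A short usage sketch of the class above (the values are illustrative):

ll = LinkedList([1, 2, 3])
ll.append(4)        # 1 <-> 2 <-> 3 <-> 4
ll.appendleft(0)    # 0 <-> 1 <-> 2 <-> 3 <-> 4
assert list(ll) == [0, 1, 2, 3, 4]
assert ll.pop() == 4
assert ll.popleft() == 0
assert ll.index(3) == 2
assert len(ll) == 3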
Stacks and Queues
The stack (last-in, first-out) and queue (first-in, first-out) data structures are trivially implemented using the list builtin or the linked list data structure above. In the implementations below, appends and pops have O(1) time complexity; note that popping from the front of a plain list would be O(n), which is why the queue uses a deque.
Implementation
# Simple stack (LIFO), using a list
stack = list()
stack.append("A")  # add to the top of the stack
stack.pop()        # remove from the top of the stack
# Simple queue (FIFO), using a deque
from collections import deque

fifo_queue = deque()        # LinkedList() will also work
fifo_queue.appendleft("A")  # add to the queue
fifo_queue.pop()            # remove from the queue
Graphs, Trees, Heaps, Tries
Python does not have a standard library graph implementation. A simple graph can be constructed using a dictionary that maps each node to a list of its neighbors, and this serves as the base data structure for the algorithms below.
A tree is a connected graph without cycles. A node with no children is called a leaf node. A binary tree is a tree where each node has at most two children. A binary search tree is a binary tree where every node n satisfies the condition: all_left_descendants <= n < all_right_descendants. A complete binary tree is one in which every level is fully filled, except possibly the last, which must be filled from left to right. A full binary tree is one where every node has either zero or two children. A perfect binary tree is one where all interior nodes have two children and all leaf nodes are on the same level.
A min-heap is a complete binary tree where each node is smaller than its children. A max-heap is a complete binary tree where each node is greater than its children.
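Python's standard library provides heapq, which maintains a min-heap inside a plain list; a max-heap is usually simulated by pushing negated values. A small sketch with illustrative values:

import heapq

heap = []
for value in [5, 1, 4, 2]:
    heapq.heappush(heap, value)  # O(log n) insert, keeps the min-heap property

assert heap[0] == 1              # O(1) peek at the minimum
assert heapq.heappop(heap) == 1  # O(log n) removal of the smallest element
assert heap[0] == 2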
A trie, or prefix tree, is an n-ary tree in which each path from the root spells out a prefix and word-terminating nodes are explicitly marked. One common use case is storing an English dictionary for fast prefix lookups.
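A minimal sketch of that idea, using nested dictionaries with a sentinel key to mark word-terminating nodes (the helper names are illustrative):

_END = "$"  # sentinel key marking a terminating node

def trie_insert(trie, word):
    """Insert `word` into the nested-dict trie rooted at `trie`."""
    node = trie
    for char in word:
        node = node.setdefault(char, {})
    node[_END] = True

def trie_has_prefix(trie, prefix):
    """Return True if any stored word starts with `prefix`."""
    node = trie
    for char in prefix:
        if char not in node:
            return False
        node = node[char]
    return True

trie = {}
trie_insert(trie, "car")
trie_insert(trie, "card")
assert trie_has_prefix(trie, "ca")
assert not trie_has_prefix(trie, "dog")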
Implementation
class Graph(object):
    def __init__(self, edges=[], nodes=[]):
        """Construct a simple directional graph.

        Nodes can be inferred from edges. All values must be hashable.
        """
        self.data = {}
        for node in nodes:
            self.add_node(node)
        for (from_node, to_node) in edges:
            self.add_edge(from_node, to_node)

    # Minimal helper methods (assumed): the traversals below only need `neighbors`.
    def add_node(self, node):
        self.data.setdefault(node, [])

    def add_edge(self, from_node, to_node):
        self.add_node(from_node)
        self.add_node(to_node)
        self.data[from_node].append(to_node)

    def neighbors(self, node):
        return self.data[node]
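A brief usage sketch under the assumptions above (the edge list is illustrative):

g = Graph(edges=[(0, 1), (0, 4), (0, 5), (1, 3), (3, 2), (3, 4)])
assert g.neighbors(0) == [1, 4, 5]
assert g.neighbors(3) == [2, 4]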
The _rebalance helper below appears to belong to a hand-rolled hash table (the lookup table referenced by self.lut and self.lut_len): when the table grows, it doubles the number of buckets and re-inserts every (key, value) pair into the new buckets.

    def _rebalance(self):
        """Double the lookup table's size."""
        new_lut_len = self.lut_len * 2
        new_lut = [[] for _ in range(new_lut_len)]
        for sub_lut in self.lut:
            for key, value in sub_lut:
                new_lut_idx = hash(key) % new_lut_len
                for new_lut_iidx, (lut_key, _) in enumerate(new_lut[new_lut_idx]):
                    if lut_key == key:
                        new_lut[new_lut_idx][new_lut_iidx] = (key, value)
                        break
                else:
                    # `break` was not called; the key is new in this bucket.
                    new_lut[new_lut_idx].append((key, value))
        self.lut_len = new_lut_len
        self.lut = new_lut
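For context, a minimal sketch of the kind of separate-chaining class such a method could live in, assuming self.lut is a list of buckets of (key, value) tuples and self.lut_len is the bucket count; the class name and the load-factor check are hypothetical, and in practice the builtin dict is what you would use:

class LookupTable(object):
    """Toy separate-chaining hash table; the builtin `dict` is the real tool."""

    def __init__(self, lut_len=8):
        self.lut_len = lut_len
        self.lut = [[] for _ in range(lut_len)]
        self.size = 0

    def __setitem__(self, key, value):
        bucket = self.lut[hash(key) % self.lut_len]
        for idx, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[idx] = (key, value)  # overwrite an existing key
                return
        bucket.append((key, value))
        self.size += 1
        if self.size > self.lut_len:  # crude load-factor check (hypothetical)
            self._rebalance()

    def __getitem__(self, key):
        bucket = self.lut[hash(key) % self.lut_len]
        for existing_key, existing_value in bucket:
            if existing_key == key:
                return existing_value
        raise KeyError(key)

    # `_rebalance` (shown above) would be the remaining method of this class.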
Algorithms
BFS/DFS
Breadth First Search (BFS) and Depth First Search (DFS) are two different ways to traverse a graph.
Step | Breadth-First Search | Depth-First Search
0    | Node 0               | Node 0
1    | Node 1               | -Node 1
2    | Node 4               | --Node 3
3    | Node 5               | ---Node 2
4    | Node 3               | ---Node 4
5    | Node 2               | -Node 5
The dashes indicate the depth of each node in the DFS traversal; a graph reproducing these orders is sketched after the implementations below.
Implementation
Our BFS/DFS algorithm relies on the Graph implementation above.
from collections import deque

class BFS(object):
    """Breadth-first traversal over a `Graph`, yielding nodes starting from `start`."""

    def __init__(self, graph, start):
        # Assumed minimal constructor: a queue seeded with the start node.
        self.graph = graph
        self.visited = set()
        self.to_visit = deque([start])

    def __iter__(self):
        while self.to_visit:
            cur = self.to_visit.pop()  # first in, first out (queue behaviour)
            if cur in self.visited:
                continue
            for neighbor in self.graph.neighbors(cur):
                if neighbor not in self.visited:
                    self.to_visit.appendleft(neighbor)
            self.visited.add(cur)
            yield cur
class DFS(object):
    """Depth-first traversal over a `Graph`, yielding nodes starting from `start`."""

    def __init__(self, graph, start):
        # Assumed minimal constructor, mirroring the BFS iterator above.
        self.graph = graph
        self.visited = set()
        self.to_visit = deque([start])

    def __iter__(self):
        while self.to_visit:
            cur = self.to_visit.popleft()  # last in, first out (stack behaviour)
            if cur in self.visited:
                continue
            # `reversed` preserves the order of the edges, but is not strictly necessary.
            for neighbor in reversed(self.graph.neighbors(cur)):
                if neighbor not in self.visited:
                    self.to_visit.appendleft(neighbor)
            self.visited.add(cur)
            yield cur
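Putting the pieces together, with an edge list chosen to reproduce the traversal orders in the table above:

g = Graph(edges=[(0, 1), (0, 4), (0, 5), (1, 3), (3, 2), (3, 4)])

assert list(BFS(g, 0)) == [0, 1, 4, 5, 3, 2]
assert list(DFS(g, 0)) == [0, 1, 3, 2, 4, 5]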
quicksort
The quicksort algorithm sorts in place by partitioning around a pivot and recursing on the two halves. The runtime of this implementation is O(n\log{n}) on average and O(n^2) in the worst case, with space complexity proportional to the recursion depth: O(\log{n}) on average and O(n) in the worst case, since the stack grows with the length of the array. All operations are performed in place.
Implementation
def quicksort(m_arr):
    """Recursive implementation of Quicksort."""

    def _partition(arr, lo, hi):
        pos = lo
        for i in range(lo, hi):
            if arr[i] < arr[hi]:
                arr[i], arr[pos] = arr[pos], arr[i]
                pos += 1
        # Put the pivot back into its final position.
        arr[pos], arr[hi] = arr[hi], arr[pos]
        return pos

    def _quicksort(arr, lo=0, hi=None):
        if hi is None:
            hi = len(arr) - 1
        if lo < hi:
            p = _partition(arr, lo, hi)
            _quicksort(arr, lo=lo, hi=p - 1)
            _quicksort(arr, lo=p + 1, hi=hi)

    _quicksort(m_arr)
    return m_arr
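A quick usage sketch with illustrative values:

assert quicksort([3, 6, 1, 5, 2, 4]) == [1, 2, 3, 4, 5, 6]
assert quicksort([]) == []

Because the pivot is always the last element, an already sorted input triggers the worst case; swapping a randomly chosen element into position hi before partitioning is a common mitigation.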
Dynamic Programming
Dynamic programming is an approach for computing recursive problems more efficiently. The key takeaway is that these problems are built from the solutions to overlapping smaller subproblems, so each subproblem only needs to be solved once (either memoized top-down or tabulated bottom-up).
knapsack
Given a list of weighted values and a capacity, return the maximum value that can be carried without exceeding the capacity. For example, with [(2, 3), (4, 11), (3, 12)] (weight, value pairs) and a capacity of 5, you can carry the items of weight 2 and 3 for a total value of 3 + 12 = 15. Each item can be taken at most once.
Recursion Only
def knapsack(constraint, w_vals):
    """
    `w_vals` is a list of 2-tuples (weight, value),
    e.g. [..., (4, 4), (3, 12), (2, 100)].
    `constraint` is an int representing the maximum weight.
    You can only choose an item once.
    """
    total_value = 0
    for idx, (weight, value) in enumerate(w_vals):
        remaining = constraint - weight
        if remaining >= 0:
            total_value = max(
                total_value,
                knapsack(remaining, w_vals[idx + 1:]) + value
            )
    return total_value
Memoized Recursion (Top-Down)
def knapsack_td(td_constraint, td_w_vals):
    """Memoized version of `knapsack`, top-down Dynamic Programming."""
    cache = {}

    def _knapsack(constraint, w_vals):
        if (constraint, w_vals) in cache:
            return cache[(constraint, w_vals)]
        total_value = 0
        for idx, (weight, value) in enumerate(w_vals):
            remaining = constraint - weight
            if remaining >= 0:
                total_value = max(
                    total_value,
                    _knapsack(remaining, w_vals[idx + 1:]) + value
                )
        cache[(constraint, w_vals)] = total_value
        return total_value

    # Convert to a tuple so the remaining items can be used as a dict key.
    return _knapsack(td_constraint, tuple(td_w_vals))
Tabular (Bottom-Up)
def knapsack_bu(constraint, w_vals):
    """Tabularized version of `knapsack`, bottom-up Dynamic Programming."""
    # dp[w_idx][c] holds the best value using items 0..w_idx with capacity c.
    dp = [
        [0 for _ in range(constraint + 1)]
        for _ in range(len(w_vals))
    ]

    # Column 0 (capacity 0) stays 0: nothing can be carried.
    for c in range(1, constraint + 1):
        for w_idx in range(len(w_vals)):
            weight, value = w_vals[w_idx]
            if weight <= c:
                if w_idx == 0:
                    # Base row: only the first item is available, so take it.
                    dp[0][c] = value
                else:
                    dp[w_idx][c] = max(
                        dp[w_idx - 1][c],                   # skip this item
                        dp[w_idx - 1][c - weight] + value,  # take this item
                    )
            else:
                # The item does not fit; carry forward the best without it.
                dp[w_idx][c] = dp[w_idx - 1][c] if w_idx > 0 else 0

    return dp[len(w_vals) - 1][constraint]
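A quick check that all three versions agree on the example from the problem statement:

items = [(2, 3), (4, 11), (3, 12)]
assert knapsack(5, items) == 15
assert knapsack_td(5, items) == 15
assert knapsack_bu(5, items) == 15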