Why Trees Matter
Trees show up in 30%+ of Google interview questions. Google’s email specifically mentions binary trees, N-ary trees, tries, balanced trees, and all traversal algorithms.
But here’s what most tutorials get WRONG: they skip how data is represented and jump straight to algorithms. This chapter fixes that.
The Fundamental Question: How Do You Represent a Tree?
A tree is an abstract concept — nodes connected in a hierarchy. But in code, you must choose a concrete representation. This choice affects:
- What operations are fast (O(1) vs O(n))
- How much memory you use
- How naturally the algorithm maps to the data
There are two primary representations for binary trees:
Representation 1: Node Objects with Pointers
```python
class TreeNode:
    """Each node is an object with references to its children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left    # Pointer to left child (or None)
        self.right = right  # Pointer to right child (or None)

# Building a tree:
#       1
#      / \
#     2   3
#    / \
#   4   5
root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)
```
When to use:
- Most interview problems use this representation
- When the tree is sparse (not every node has both children)
- When you need to traverse from parent to child (natural with `.left`/`.right`)
- When the tree changes shape (inserts/deletes)
Tradeoffs:

| Pros | Cons |
|---|---|
| Natural for recursive algorithms (just follow pointers) | No O(1) access by position/index |
Representation 2: Array (Level-Order Layout)
```python
# A binary tree can be stored in a flat array using level-order (BFS order).
# The index arithmetic lets you find parent and children in O(1):
#
#   Parent of node at index i:      (i - 1) // 2
#   Left child of node at index i:  2 * i + 1
#   Right child of node at index i: 2 * i + 2
#
# Tree:     1
#          / \
#         2   3
#        / \
#       4   5
#
# Array: [1, 2, 3, 4, 5]
# Index:  0  1  2  3  4
#
# For incomplete trees, we use None for missing nodes:
# Tree:     1
#          / \
#         2   3
#          \
#           5
# Array: [1, 2, 3, None, 5]

tree_array = [1, 2, 3, 4, 5]

def get_left_child_index(i):
    return 2 * i + 1

def get_right_child_index(i):
    return 2 * i + 2

def get_parent_index(i):
    return (i - 1) // 2

# Access children of node at index 0 (value 1):
left_idx = get_left_child_index(0)    # 1 → value 2
right_idx = get_right_child_index(0)  # 2 → value 3
```
When to use:
- Heaps are always arrays (Chapter 11)
- When the tree is complete or nearly complete (no wasted space)
- When you need O(1) access to parent (useful in heaps)
- LeetCode often gives input in this format
Tradeoffs:

| Pros | Cons |
|---|---|
| O(1) parent/child access via index math | Wastes space for sparse/unbalanced trees (lots of `None` slots) |
Converting Between Representations
```python
from collections import deque

def array_to_tree(arr: list) -> 'TreeNode | None':
    """
    Convert level-order array [1, 2, 3, None, 5] to node objects.

    This is how LeetCode inputs are typically given — you'll need this
    to test your solutions locally.

    Time:  O(n) — process each element once
    Space: O(n) — create n nodes
    """
    if not arr or arr[0] is None:
        return None
    root = TreeNode(arr[0])
    queue = deque([root])  # BFS queue to assign children level by level
    i = 1                  # Index into the array
    while queue and i < len(arr):
        node = queue.popleft()  # O(1)
        # Assign left child
        if i < len(arr) and arr[i] is not None:
            node.left = TreeNode(arr[i])
            queue.append(node.left)
        i += 1
        # Assign right child
        if i < len(arr) and arr[i] is not None:
            node.right = TreeNode(arr[i])
            queue.append(node.right)
        i += 1
    return root
```
```python
def tree_to_array(root: 'TreeNode | None') -> list:
    """
    Convert node objects back to level-order array.

    Time:  O(n) — visit each node once (BFS)
    Space: O(n) — the output array + BFS queue
    """
    if root is None:
        return []
    result = []
    queue = deque([root])
    while queue:
        node = queue.popleft()  # O(1)
        if node:
            result.append(node.value)
            queue.append(node.left)   # May be None
            queue.append(node.right)  # May be None
        else:
            result.append(None)
    # Remove trailing Nones (they're just padding)
    while result and result[-1] is None:
        result.pop()
    return result
```
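As a sanity check, the two converters should invert each other. Here is a self-contained round trip; the converters are re-declared in compressed form (no docstrings) so the snippet runs on its own:

```python
from collections import deque

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def array_to_tree(arr):
    # Level-order array -> node objects (compressed version of the above).
    if not arr or arr[0] is None:
        return None
    root = TreeNode(arr[0])
    queue, i = deque([root]), 1
    while queue and i < len(arr):
        node = queue.popleft()
        if i < len(arr) and arr[i] is not None:
            node.left = TreeNode(arr[i])
            queue.append(node.left)
        i += 1
        if i < len(arr) and arr[i] is not None:
            node.right = TreeNode(arr[i])
            queue.append(node.right)
        i += 1
    return root

def tree_to_array(root):
    # Node objects -> level-order array, trailing Nones stripped.
    if root is None:
        return []
    result, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node:
            result.append(node.value)
            queue.append(node.left)
            queue.append(node.right)
        else:
            result.append(None)
    while result and result[-1] is None:
        result.pop()
    return result

# The incomplete tree from earlier survives both conversions unchanged:
assert tree_to_array(array_to_tree([1, 2, 3, None, 5])) == [1, 2, 3, None, 5]
assert array_to_tree([]) is None
```

Round-tripping your own test inputs this way is a cheap way to catch off-by-one bugs in the queue logic before running a solution against them.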
Tree Traversals: The Core Algorithms
There are four fundamental ways to visit every node in a binary tree. The first three are DFS (depth-first); the last is BFS (breadth-first).
```
        1
       / \
      2   3
     / \   \
    4   5   6
```
| Traversal | Order | Result for example tree |
|---|---|---|
| Inorder (LNR) | Left → Node → Right | [4, 2, 5, 1, 3, 6] |
| Preorder (NLR) | Node → Left → Right | [1, 2, 4, 5, 3, 6] |
| Postorder (LRN) | Left → Right → Node | [4, 5, 2, 6, 3, 1] |
| Level-order (BFS) | Level by level, left to right | [1, 2, 3, 4, 5, 6] |
Memory trick: the name tells you where the root/current node is processed: preorder visits the root first, inorder visits it in the middle, postorder visits it last.

For BSTs specifically: Inorder traversal gives sorted order. This is a frequently tested property!
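That sorted-order property is easy to check directly. A quick self-contained demonstration on a small hand-built BST (`inorder` here is an illustrative recursive helper, not the chapter's exercise code):

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node):
    """Left -> Node -> Right: on a BST, this emits values in ascending order."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

# A small BST:      8
#                  / \
#                 3   10
#                / \
#               1   6
bst = TreeNode(8, TreeNode(3, TreeNode(1), TreeNode(6)), TreeNode(10))
assert inorder(bst) == [1, 3, 6, 8, 10]
```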
Why Each Traversal Matters
- Inorder: BST validation, "give me sorted elements"
- Preorder: Tree serialization/copying (root first, then structure)
- Postorder: "Process children before parent" — tree deletion, expression evaluation
- Level-order: BFS, shortest path in unweighted tree, finding tree width
See binary_tree_traversals for both recursive and iterative implementations.
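As a quick preview, here is one iterative DFS traversal and the BFS level-order, checked against the table above (`preorder_iterative` and `level_order` are illustrative names; a sketch, not the exercise solutions):

```python
from collections import deque

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

# The example tree:   1
#                    / \
#                   2   3
#                  / \   \
#                 4   5   6
root = TreeNode(1,
                TreeNode(2, TreeNode(4), TreeNode(5)),
                TreeNode(3, None, TreeNode(6)))

def preorder_iterative(node):
    """Node -> Left -> Right with an explicit stack (push right child first)."""
    result, stack = [], [node] if node else []
    while stack:
        n = stack.pop()
        result.append(n.value)
        if n.right:
            stack.append(n.right)  # Pushed first, so popped after left
        if n.left:
            stack.append(n.left)
    return result

def level_order(node):
    """BFS with a queue: level by level, left to right."""
    result, queue = [], deque([node] if node else [])
    while queue:
        n = queue.popleft()
        result.append(n.value)
        if n.left:
            queue.append(n.left)
        if n.right:
            queue.append(n.right)
    return result

assert preorder_iterative(root) == [1, 2, 4, 5, 3, 6]  # matches the table
assert level_order(root) == [1, 2, 3, 4, 5, 6]
```

The only structural difference between the two is the container: a stack (LIFO) gives depth-first order, a queue (FIFO) gives breadth-first order.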
Binary Search Trees (BST)
A BST maintains an invariant: for every node, all left descendants < node < all right descendants.
```
        8
       / \
      3   10
     / \    \
    1   6    14
       / \   /
      4   7 13
```
This invariant gives us O(h) search/insert/delete where h = tree height.
- Balanced tree (e.g., AVL, Red-Black): h = O(log n) → operations are O(log n)
- Unbalanced tree (worst case: linked list): h = O(n) → operations are O(n)
See validate_bst for the classic Google interview problem.
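To make the O(h) bound concrete: each comparison discards one entire subtree, so search does work proportional to the height, not the node count. A minimal iterative sketch (`bst_search` is an illustrative name):

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def bst_search(node, target):
    """Walk down one branch per step: at most h comparisons."""
    while node is not None:
        if target == node.value:
            return node
        # The BST invariant tells us which half to discard:
        node = node.left if target < node.value else node.right
    return None

# The BST from the diagram above:
root = TreeNode(8,
                TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7))),
                TreeNode(10, None, TreeNode(14, TreeNode(13))))
assert bst_search(root, 7).value == 7   # path: 8 -> 3 -> 6 -> 7
assert bst_search(root, 5) is None      # falls off at 4's right child
```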
Balanced Trees: Red-Black, AVL, Splay
Google’s email mentions these. Here’s what to know:
AVL Trees
- Invariant: For every node, the heights of left and right subtrees differ by at most 1
- Rebalancing: After insert/delete, walk up and rotate as needed
- Lookup: O(log n) — strictly balanced, slightly faster lookup than Red-Black
- Insert/Delete: O(log n) but with more rotations than Red-Black
- Used in: Databases (when reads >> writes)
Red-Black Trees
- Invariant: Nodes are colored red/black following rules that guarantee approximate balance
- Rebalancing: Fewer rotations than AVL (at most 2 per insert, 3 per delete)
- All operations: O(log n)
- Used in: C++ `std::map`, Java `TreeMap`, Linux kernel scheduler
- Python: `sortedcontainers.SortedList` uses a B-tree variant internally
Splay Trees
- Key idea: Recently accessed elements move to the root ("splaying")
- Amortized O(log n) for all operations (individual ops can be O(n))
- Used when: Some elements are accessed much more often than others (caching behavior)
What Google wants to hear: You don’t need to implement these from scratch (extremely unlikely in a 45-min interview). But you should know each tree’s invariant, its complexity guarantees, and when you would pick one over another.
N-ary Trees
N-ary trees have nodes with any number of children (not just two). They’re represented using a list of children:
```python
class NaryTreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []  # List of child nodes

# Example: file system
root = NaryTreeNode("root", [
    NaryTreeNode("Documents", [
        NaryTreeNode("resume.pdf"),
        NaryTreeNode("cover_letter.pdf"),
    ]),
    NaryTreeNode("Pictures", [
        NaryTreeNode("vacation.jpg"),
    ]),
])
```
Traversals are similar to binary trees but loop over children instead of separate .left/.right.
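For example, N-ary preorder and level-order might be sketched like this (the traversal names are illustrative; `NaryTreeNode` is re-declared so the snippet is self-contained):

```python
from collections import deque

class NaryTreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def preorder(node):
    """Visit the node, then each child subtree left to right."""
    result = [node.value]
    for child in node.children:
        result.extend(preorder(child))
    return result

def level_order(node):
    """BFS: extend the queue with ALL children, not just left/right."""
    result, queue = [], deque([node])
    while queue:
        n = queue.popleft()
        result.append(n.value)
        queue.extend(n.children)
    return result

# The file-system example from above:
root = NaryTreeNode("root", [
    NaryTreeNode("Documents", [NaryTreeNode("resume.pdf"),
                               NaryTreeNode("cover_letter.pdf")]),
    NaryTreeNode("Pictures", [NaryTreeNode("vacation.jpg")]),
])
assert preorder(root) == ["root", "Documents", "resume.pdf",
                          "cover_letter.pdf", "Pictures", "vacation.jpg"]
assert level_order(root) == ["root", "Documents", "Pictures", "resume.pdf",
                             "cover_letter.pdf", "vacation.jpg"]
```

Note that "inorder" has no single natural meaning for N-ary trees: there is no one spot "between" an arbitrary number of children.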
Tries (Prefix Trees)
A trie is a tree where each edge represents a character, and paths from root to nodes represent prefixes of words. Used for:
- Autocomplete (Google Search!)
- Spell checking
- IP routing tables
See trie for full implementation.
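As a preview, a trie can be sketched with nested dictionaries, one of several reasonable layouts (the exercise builds a fuller node-class version; `Trie`, `starts_with`, and the `"$"` end-of-word marker here are illustrative choices):

```python
class Trie:
    def __init__(self):
        self.root = {}   # Nested dicts: each key is one character
        self.END = "$"   # Sentinel key marking "a word ends here"

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.setdefault(ch, {})  # Create the edge if missing
        node[self.END] = True

    def search(self, word):
        """True only if `word` was inserted as a complete word."""
        node = self._walk(word)
        return node is not None and self.END in node

    def starts_with(self, prefix):
        """True if any inserted word begins with `prefix`."""
        return self._walk(prefix) is not None

    def _walk(self, s):
        node = self.root
        for ch in s:
            if ch not in node:
                return None
            node = node[ch]
        return node

t = Trie()
t.insert("goo")
t.insert("google")
assert t.search("goo") and not t.search("goog")  # "goog" is only a prefix
assert t.starts_with("goog")
```

The end-of-word marker is the key subtlety: without it, a trie cannot distinguish a stored word from a mere prefix of one.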
Exercises
- Binary Tree Traversals — Array↔Object conversion + recursive/iterative inorder/preorder/postorder/level-order (8 implementations)
- Validate BST — Classic Google problem (recursive range-tracking + inorder traversal)
- Lowest Common Ancestor — Recursive bottom-up + iterative parent map
- Trie Implementation — Prefix tree with insert/search/startsWith + autocomplete