Why Trees Matter

Trees show up in 30%+ of Google interview questions. Google’s email specifically mentions binary trees, N-ary trees, tries, balanced trees, and all traversal algorithms.

But here’s what most tutorials get WRONG: they skip how data is represented and jump straight to algorithms. This chapter fixes that.

The Fundamental Question: How Do You Represent a Tree?

A tree is an abstract concept — nodes connected in a hierarchy. But in code, you must choose a concrete representation. This choice affects:

  1. What operations are fast (O(1) vs O(n))

  2. How much memory you use

  3. How naturally the algorithm maps to the data

There are two primary representations for binary trees:

Representation 1: Node Objects with Pointers

class TreeNode:
    """Each node is an object with references to its children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left    # Pointer to left child (or None)
        self.right = right  # Pointer to right child (or None)

# Building a tree:
#        1
#       / \
#      2   3
#     / \
#    4   5

root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)

When to use:

  • Most interview problems use this representation

  • When the tree is sparse (not every node has both children)

  • When you need to traverse from parent to child (natural with .left/.right)

  • When the tree changes shape (inserts/deletes)

Tradeoffs:

Pros

• Natural for recursive algorithms (just follow pointers)
• Memory-efficient for sparse trees (only allocate nodes that exist)
• Easy to add/remove nodes

Cons

• No O(1) access by position/index
• Can’t find parent without extra pointer or recursive backtracking
• Each node has object overhead (class instance)

Representation 2: Array (Level-Order Layout)

# A binary tree can be stored in a flat array using level-order (BFS order).
# The index arithmetic lets you find parent and children in O(1):
#
#   Parent of node at index i:      (i - 1) // 2
#   Left child of node at index i:  2 * i + 1
#   Right child of node at index i: 2 * i + 2
#
# Tree:       1
#            / \
#           2   3
#          / \
#         4   5
#
# Array: [1, 2, 3, 4, 5]
# Index:  0  1  2  3  4
#
# For incomplete trees, we use None for missing nodes:
# Tree:       1
#            / \
#           2   3
#            \
#             5
# Array: [1, 2, 3, None, 5]

tree_array = [1, 2, 3, 4, 5]

def get_left_child_index(i):
    return 2 * i + 1

def get_right_child_index(i):
    return 2 * i + 2

def get_parent_index(i):
    return (i - 1) // 2

# Access children of node at index 0 (value 1):
left_idx = get_left_child_index(0)   # 1 → value 2
right_idx = get_right_child_index(0)  # 2 → value 3

When to use:

  • Binary heaps are conventionally stored as arrays (Chapter 11)

  • When the tree is complete or nearly complete (no wasted space)

  • When you need O(1) access to parent (useful in heaps)

  • LeetCode often gives input in this format

Tradeoffs:

Pros

• O(1) parent/child access via index math
• Cache-friendly (contiguous memory)
• Compact for complete/balanced trees

Cons

• Wastes space for sparse/unbalanced trees (lots of None entries)
• Insertions/deletions require shifting
• A degenerate tree (every node has one child, effectively a linked list) of n nodes needs O(2ⁿ) array slots
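To make that last point concrete, here is a small illustrative sketch (not part of the chapter's code) that computes the array slot required by the deepest node of a right-leaning chain, using the right-child index formula from above:

```python
def rightmost_index(depth):
    """Array index of the node reached by taking the right child `depth` times."""
    i = 0
    for _ in range(depth):
        i = 2 * i + 2  # right-child index formula
    return i

# A right-leaning chain of just 11 nodes (depth 10) already needs
# an array of 2**11 - 1 = 2047 slots, almost all of them None.
print(rightmost_index(10))  # 2046 (the last occupied index)
```

Each extra level roughly doubles the required array size, which is why the array layout is reserved for complete or nearly complete trees.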

Converting Between Representations

from collections import deque

def array_to_tree(arr: list) -> 'TreeNode | None':
    """
    Convert level-order array [1, 2, 3, None, 5] to node objects.

    This is how LeetCode inputs are typically given — you'll need this
    to test your solutions locally.

    Time: O(n) — process each element once
    Space: O(n) — create n nodes
    """
    if not arr or arr[0] is None:
        return None

    root = TreeNode(arr[0])
    queue = deque([root])  # BFS queue to assign children level by level
    i = 1  # Index into the array

    while queue and i < len(arr):
        node = queue.popleft()  # O(1)

        # Assign left child
        if i < len(arr) and arr[i] is not None:
            node.left = TreeNode(arr[i])
            queue.append(node.left)
        i += 1

        # Assign right child
        if i < len(arr) and arr[i] is not None:
            node.right = TreeNode(arr[i])
            queue.append(node.right)
        i += 1

    return root


def tree_to_array(root: 'TreeNode | None') -> list:
    """
    Convert node objects back to level-order array.

    Time: O(n) — visit each node once (BFS)
    Space: O(n) — the output array + BFS queue
    """
    if root is None:
        return []

    result = []
    queue = deque([root])

    while queue:
        node = queue.popleft()  # O(1)
        if node:
            result.append(node.value)
            queue.append(node.left)   # May be None
            queue.append(node.right)  # May be None
        else:
            result.append(None)

    # Remove trailing Nones (they're just padding)
    while result and result[-1] is None:
        result.pop()

    return result
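A quick round-trip check is the easiest way to convince yourself both directions work. The definitions are repeated below in compact form so the snippet runs standalone; the logic mirrors the functions above:

```python
from collections import deque

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def array_to_tree(arr):
    if not arr or arr[0] is None:
        return None
    root = TreeNode(arr[0])
    queue, i = deque([root]), 1
    while queue and i < len(arr):
        node = queue.popleft()
        if i < len(arr) and arr[i] is not None:      # left child
            node.left = TreeNode(arr[i])
            queue.append(node.left)
        i += 1
        if i < len(arr) and arr[i] is not None:      # right child
            node.right = TreeNode(arr[i])
            queue.append(node.right)
        i += 1
    return root

def tree_to_array(root):
    if root is None:
        return []
    result, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node:
            result.append(node.value)
            queue.append(node.left)
            queue.append(node.right)
        else:
            result.append(None)
    while result and result[-1] is None:  # strip padding
        result.pop()
    return result

# Round trip: LeetCode-style array -> node objects -> array.
arr = [1, 2, 3, None, 5]
assert tree_to_array(array_to_tree(arr)) == arr
```

If the round trip ever fails on one of your own test cases, the bug is almost always in how `None` placeholders are consumed or emitted.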

Tree Traversals: The Core Algorithms

There are four fundamental ways to visit every node in a binary tree. The first three are DFS (depth-first); the last is BFS (breadth-first).

        1
       / \
      2   3
     / \   \
    4   5   6
Traversal           Order                           Result for example tree
-----------------   -----------------------------   -----------------------
Inorder (LNR)       Left → Node → Right             [4, 2, 5, 1, 3, 6]
Preorder (NLR)      Node → Left → Right             [1, 2, 4, 5, 3, 6]
Postorder (LRN)     Left → Right → Node             [4, 5, 2, 6, 3, 1]
Level-order (BFS)   Level by level, left to right   [1, 2, 3, 4, 5, 6]
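The three DFS orders fall out of one recursive template; only the position of the "visit" line moves. A minimal sketch, reusing a compact version of the TreeNode class from earlier so it runs standalone:

```python
from collections import deque

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node, out):
    if node:
        inorder(node.left, out)
        out.append(node.value)       # visit BETWEEN the children
        inorder(node.right, out)
    return out

def preorder(node, out):
    if node:
        out.append(node.value)       # visit BEFORE the children
        preorder(node.left, out)
        preorder(node.right, out)
    return out

def postorder(node, out):
    if node:
        postorder(node.left, out)
        postorder(node.right, out)
        out.append(node.value)       # visit AFTER the children
    return out

def level_order(root):
    out, queue = [], deque([root] if root else [])
    while queue:
        node = queue.popleft()
        out.append(node.value)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return out

# The example tree from the diagram above.
root = TreeNode(1,
                TreeNode(2, TreeNode(4), TreeNode(5)),
                TreeNode(3, None, TreeNode(6)))
assert inorder(root, [])   == [4, 2, 5, 1, 3, 6]
assert preorder(root, [])  == [1, 2, 4, 5, 3, 6]
assert postorder(root, []) == [4, 5, 2, 6, 3, 1]
assert level_order(root)   == [1, 2, 3, 4, 5, 6]
```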

Memory trick: The name tells you where the root/current node is processed:

  • Inorder = root is in the middle (left, ROOT, right)

  • Preorder = root comes FIRST ("pre" = before children)

  • Postorder = root comes LAST ("post" = after children)

For BSTs specifically: Inorder traversal gives sorted order. This is a frequently tested property!

Why Each Traversal Matters

  • Inorder: BST validation, "give me sorted elements"

  • Preorder: Tree serialization/copying (root first, then structure)

  • Postorder: "Process children before parent" — tree deletion, expression evaluation

  • Level-order: BFS, shortest path in unweighted tree, finding tree width

See binary_tree_traversals for both recursive and iterative implementations.

Binary Search Trees (BST)

A BST maintains an invariant: for every node, all left descendants < node < all right descendants.

        8
       / \
      3   10
     / \    \
    1   6    14
       / \   /
      4   7 13

This invariant gives us O(h) search/insert/delete where h = tree height.

  • Balanced tree (e.g., AVL, Red-Black): h = O(log n) → operations are O(log n)

  • Unbalanced tree (worst case: linked list): h = O(n) → operations are O(n)
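The invariant is exactly what makes search O(h): at every node you can discard one entire subtree. A minimal iterative sketch (illustrative helper, not the chapter's reference code), built on a compact TreeNode:

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def bst_search(root, target):
    """Return the node holding target, or None. O(h) time, O(1) space."""
    node = root
    while node:
        if target == node.value:
            return node
        # The BST invariant tells us which subtree can possibly hold target.
        node = node.left if target < node.value else node.right
    return None

# The example BST from the diagram above.
root = TreeNode(8,
    TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7))),
    TreeNode(10, None, TreeNode(14, TreeNode(13))))
assert bst_search(root, 7).value == 7
assert bst_search(root, 12) is None
```

Searching for 7 follows 8 → 3 → 6 → 7: four nodes touched out of nine, and the gap widens as the tree grows (if it stays balanced).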

See validate_bst for the classic Google interview problem.

Balanced Trees: Red-Black, AVL, Splay

Google’s email mentions these. Here’s what to know:

AVL Trees

  • Invariant: For every node, the heights of left and right subtrees differ by at most 1

  • Rebalancing: After insert/delete, walk up and rotate as needed

  • Lookup: O(log n) — strictly balanced, slightly faster lookup than Red-Black

  • Insert/Delete: O(log n) but with more rotations than Red-Black

  • Used in: Databases (when reads >> writes)

Red-Black Trees

  • Invariant: Nodes are colored red/black following rules that guarantee approximate balance

  • Rebalancing: Fewer rotations than AVL (at most 2 per insert, 3 per delete)

  • All operations: O(log n)

  • Used in: C++ std::map, Java TreeMap, Linux kernel scheduler

  • Python: the standard library has no balanced BST; sortedcontainers.SortedList provides comparable ordered operations using a list-of-sublists structure

Splay Trees

  • Key idea: Recently accessed elements move to the root ("splaying")

  • Amortized O(log n) for all operations (individual ops can be O(n))

  • Used when: Some elements are accessed much more often than others (caching behavior)

What Google wants to hear: You don’t need to implement these from scratch (extremely unlikely in a 45-min interview). But you should:

  1. Know they exist and what problem they solve (keeping the tree balanced)

  2. Understand that balanced = O(log n) height guarantee

  3. Know that Python’s dict uses a hash table, NOT a tree (unlike C++ std::map which uses Red-Black)

  4. Be able to say: "We could use a balanced BST for O(log n) ordered operations, or a hash map for O(1) unordered operations"
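Python has no built-in balanced BST, but the standard-library bisect module over a sorted list answers the same ordered queries in O(log n), which makes the tradeoff in point 4 concrete:

```python
import bisect

sorted_keys = [1, 3, 4, 6, 7, 8, 10, 13, 14]

# An ordered query a hash map can't answer directly:
# the smallest key >= 5 (a successor / ceiling query).
i = bisect.bisect_left(sorted_keys, 5)
assert sorted_keys[i] == 6

# Membership is O(log n) here, vs O(1) average for a hash set.
j = bisect.bisect_left(sorted_keys, 7)
assert j < len(sorted_keys) and sorted_keys[j] == 7
```

The catch: inserting into a Python list is O(n) because elements must shift, which is precisely the cost a balanced BST (or sortedcontainers) avoids when writes are frequent.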

N-ary Trees

N-ary trees allow each node any number of children (not just two). The usual representation gives each node a list of children:

class NaryTreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []  # List of child nodes

# Example: file system
root = NaryTreeNode("root", [
    NaryTreeNode("Documents", [
        NaryTreeNode("resume.pdf"),
        NaryTreeNode("cover_letter.pdf"),
    ]),
    NaryTreeNode("Pictures", [
        NaryTreeNode("vacation.jpg"),
    ]),
])

Traversals are similar to binary trees but loop over children instead of separate .left/.right.
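For instance, a preorder traversal replaces the two pointer recursions with a single loop. A minimal sketch, reusing the NaryTreeNode class from above:

```python
class NaryTreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def preorder(node, out=None):
    """Visit the node, then recurse into each child left to right."""
    if out is None:
        out = []
    out.append(node.value)
    for child in node.children:   # one loop instead of .left / .right
        preorder(child, out)
    return out

root = NaryTreeNode("root", [
    NaryTreeNode("Documents", [NaryTreeNode("resume.pdf")]),
    NaryTreeNode("Pictures", [NaryTreeNode("vacation.jpg")]),
])
assert preorder(root) == ["root", "Documents", "resume.pdf",
                          "Pictures", "vacation.jpg"]
```

For a file system, this visit order is exactly a top-down directory listing.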

Tries (Prefix Trees)

A trie is a tree where each edge represents a character, and paths from root to nodes represent prefixes of words. Used for:

  • Autocomplete (Google Search!)

  • Spell checking

  • IP routing tables
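The core idea fits in a short sketch: insert plus a prefix query, with each node keeping a dict from character to child (a minimal illustration, not a full implementation):

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # char -> TrieNode
        self.is_word = False  # marks the end of a complete word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def starts_with(self, prefix):
        """True if any inserted word begins with prefix (the autocomplete query)."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True

trie = Trie()
for word in ["google", "good", "go"]:
    trie.insert(word)
assert trie.starts_with("goo")
assert not trie.starts_with("gu")
```

Both operations cost O(length of the string), independent of how many words are stored, which is what makes tries attractive for autocomplete.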

See trie for full implementation.

Exercises