Why Trees Matter

Trees show up in 30%+ of Google interview questions. Google’s email specifically mentions binary trees, N-ary trees, tries, balanced trees, and all traversal algorithms.

But here’s what most tutorials get WRONG: they skip how data is represented and jump straight to algorithms. This chapter fixes that.

The Fundamental Question: How Do You Represent a Tree?

A tree is an abstract concept — nodes connected in a hierarchy. But in code, you must choose a concrete representation. This choice affects:

  1. What operations are fast (O(1) vs O(n))

  2. How much memory you use

  3. How naturally the algorithm maps to the data

There are two primary representations for binary trees:

Representation 1: Node Objects with Pointers

class TreeNode:
    """Each node is an object with references to its children."""
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left    # Pointer to left child (or None)
        self.right = right  # Pointer to right child (or None)

# Building a tree:
#        1
#       / \
#      2   3
#     / \
#    4   5

root = TreeNode(1)
root.left = TreeNode(2)
root.right = TreeNode(3)
root.left.left = TreeNode(4)
root.left.right = TreeNode(5)

When to use:

  • Most interview problems use this representation

  • When the tree is sparse (not every node has both children)

  • When you need to traverse from parent to child (natural with .left/.right)

  • When the tree changes shape (inserts/deletes)

Tradeoffs:

Pros

• Natural for recursive algorithms (just follow pointers)
• Memory-efficient for sparse trees (only allocate nodes that exist)
• Easy to add/remove nodes

Cons

• No O(1) access by position/index
• Can’t find parent without extra pointer or recursive backtracking
• Each node has object overhead (class instance)

Representation 2: Array (Level-Order Layout)

# A binary tree can be stored in a flat array using level-order (BFS order).
# The index arithmetic lets you find parent and children in O(1):
#
#   Parent of node at index i:      (i - 1) // 2
#   Left child of node at index i:  2 * i + 1
#   Right child of node at index i: 2 * i + 2
#
# Tree:       1
#            / \
#           2   3
#          / \
#         4   5
#
# Array: [1, 2, 3, 4, 5]
# Index:  0  1  2  3  4
#
# For incomplete trees, we use None for missing nodes:
# Tree:       1
#            / \
#           2   3
#            \
#             5
# Array: [1, 2, 3, None, 5]

tree_array = [1, 2, 3, 4, 5]

def get_left_child_index(i):
    return 2 * i + 1

def get_right_child_index(i):
    return 2 * i + 2

def get_parent_index(i):
    return (i - 1) // 2

# Access children of node at index 0 (value 1):
left_idx = get_left_child_index(0)   # 1 → value 2
right_idx = get_right_child_index(0)  # 2 → value 3

When to use:

  • Binary heaps are conventionally stored as arrays (Chapter 11)

  • When the tree is complete or nearly complete (no wasted space)

  • When you need O(1) access to parent (useful in heaps)

  • LeetCode often gives input in this format

Tradeoffs:

Pros

• O(1) parent/child access via index math
• Cache-friendly (contiguous memory)
• Compact for complete/balanced trees

Cons

• Wastes space for sparse/unbalanced trees (lots of None entries)
• Insertions/deletions require shifting
• A degenerate tree (every node has one child, effectively a linked list) of n nodes needs O(2ⁿ) array slots
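To make that last point concrete, here is a small illustrative sketch (not part of the chapter's code) that computes the array slot required by the deepest node of a right-leaning chain, using the right-child index formula from above:

```python
def rightmost_index(depth):
    """Array index of the node reached by taking the right child `depth` times."""
    i = 0
    for _ in range(depth):
        i = 2 * i + 2  # right-child index formula
    return i

# A right-leaning chain of just 11 nodes (depth 10) already needs
# an array of 2**11 - 1 = 2047 slots, almost all of them None.
print(rightmost_index(10))  # 2046 (the last occupied index)
```

Each extra level roughly doubles the required array size, which is why the array layout is reserved for complete or nearly complete trees.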

Converting Between Representations

from collections import deque

def array_to_tree(arr: list) -> 'TreeNode | None':
    """
    Convert level-order array [1, 2, 3, None, 5] to node objects.

    This is how LeetCode inputs are typically given — you'll need this
    to test your solutions locally.

    Time: O(n) — process each element once
    Space: O(n) — create n nodes
    """
    if not arr or arr[0] is None:
        return None

    root = TreeNode(arr[0])
    queue = deque([root])  # BFS queue to assign children level by level
    i = 1  # Index into the array

    while queue and i < len(arr):
        node = queue.popleft()  # O(1)

        # Assign left child
        if i < len(arr) and arr[i] is not None:
            node.left = TreeNode(arr[i])
            queue.append(node.left)
        i += 1

        # Assign right child
        if i < len(arr) and arr[i] is not None:
            node.right = TreeNode(arr[i])
            queue.append(node.right)
        i += 1

    return root


def tree_to_array(root: 'TreeNode | None') -> list:
    """
    Convert node objects back to level-order array.

    Time: O(n) — visit each node once (BFS)
    Space: O(n) — the output array + BFS queue
    """
    if root is None:
        return []

    result = []
    queue = deque([root])

    while queue:
        node = queue.popleft()  # O(1)
        if node:
            result.append(node.value)
            queue.append(node.left)   # May be None
            queue.append(node.right)  # May be None
        else:
            result.append(None)

    # Remove trailing Nones (they're just padding)
    while result and result[-1] is None:
        result.pop()

    return result
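A quick round-trip check is the easiest way to convince yourself both directions work. The definitions are repeated below in compact form so the snippet runs standalone; the logic mirrors the functions above:

```python
from collections import deque

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def array_to_tree(arr):
    if not arr or arr[0] is None:
        return None
    root = TreeNode(arr[0])
    queue, i = deque([root]), 1
    while queue and i < len(arr):
        node = queue.popleft()
        if i < len(arr) and arr[i] is not None:      # left child
            node.left = TreeNode(arr[i])
            queue.append(node.left)
        i += 1
        if i < len(arr) and arr[i] is not None:      # right child
            node.right = TreeNode(arr[i])
            queue.append(node.right)
        i += 1
    return root

def tree_to_array(root):
    if root is None:
        return []
    result, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node:
            result.append(node.value)
            queue.append(node.left)
            queue.append(node.right)
        else:
            result.append(None)
    while result and result[-1] is None:  # strip padding
        result.pop()
    return result

# Round trip: LeetCode-style array -> node objects -> array.
arr = [1, 2, 3, None, 5]
assert tree_to_array(array_to_tree(arr)) == arr
```

If the round trip ever fails on one of your own test cases, the bug is almost always in how `None` placeholders are consumed or emitted.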

Tree Traversals: The Core Algorithms

There are four fundamental ways to visit every node in a binary tree. The first three are DFS (depth-first); the last is BFS (breadth-first).

        1
       / \
      2   3
     / \   \
    4   5   6
Traversal           Order                           Result for example tree
-----------------   -----------------------------   -----------------------
Inorder (LNR)       Left → Node → Right             [4, 2, 5, 1, 3, 6]
Preorder (NLR)      Node → Left → Right             [1, 2, 4, 5, 3, 6]
Postorder (LRN)     Left → Right → Node             [4, 5, 2, 6, 3, 1]
Level-order (BFS)   Level by level, left to right   [1, 2, 3, 4, 5, 6]
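The three DFS orders fall out of one recursive template; only the position of the "visit" line moves. A minimal sketch, reusing a compact version of the TreeNode class from earlier so it runs standalone:

```python
from collections import deque

class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node, out):
    if node:
        inorder(node.left, out)
        out.append(node.value)       # visit BETWEEN the children
        inorder(node.right, out)
    return out

def preorder(node, out):
    if node:
        out.append(node.value)       # visit BEFORE the children
        preorder(node.left, out)
        preorder(node.right, out)
    return out

def postorder(node, out):
    if node:
        postorder(node.left, out)
        postorder(node.right, out)
        out.append(node.value)       # visit AFTER the children
    return out

def level_order(root):
    out, queue = [], deque([root] if root else [])
    while queue:
        node = queue.popleft()
        out.append(node.value)
        if node.left:
            queue.append(node.left)
        if node.right:
            queue.append(node.right)
    return out

# The example tree from the diagram above.
root = TreeNode(1,
                TreeNode(2, TreeNode(4), TreeNode(5)),
                TreeNode(3, None, TreeNode(6)))
assert inorder(root, [])   == [4, 2, 5, 1, 3, 6]
assert preorder(root, [])  == [1, 2, 4, 5, 3, 6]
assert postorder(root, []) == [4, 5, 2, 6, 3, 1]
assert level_order(root)   == [1, 2, 3, 4, 5, 6]
```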

Memory trick: The name tells you where the root/current node is processed:

  • Inorder = root is in the middle (left, ROOT, right)

  • Preorder = root comes FIRST ("pre" = before children)

  • Postorder = root comes LAST ("post" = after children)

For BSTs specifically: Inorder traversal gives sorted order. This is a frequently tested property!

Why Each Traversal Matters

  • Inorder: BST validation, "give me sorted elements"

  • Preorder: Tree serialization/copying (root first, then structure)

  • Postorder: "Process children before parent" — tree deletion, expression evaluation

  • Level-order: BFS, shortest path in unweighted tree, finding tree width

See binary_tree_traversals for both recursive and iterative implementations.

Binary Search Trees (BST)

A BST maintains an invariant: for every node, all left descendants < node < all right descendants.

        8
       / \
      3   10
     / \    \
    1   6    14
       / \   /
      4   7 13

This invariant gives us O(h) search/insert/delete where h = tree height.

  • Balanced tree (e.g., AVL, Red-Black): h = O(log n) → operations are O(log n)

  • Unbalanced tree (worst case: linked list): h = O(n) → operations are O(n)
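The invariant is exactly what makes search O(h): at every node you can discard one entire subtree. A minimal iterative sketch (illustrative helper, not the chapter's reference code), built on a compact TreeNode:

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def bst_search(root, target):
    """Return the node holding target, or None. O(h) time, O(1) space."""
    node = root
    while node:
        if target == node.value:
            return node
        # The BST invariant tells us which subtree can possibly hold target.
        node = node.left if target < node.value else node.right
    return None

# The example BST from the diagram above.
root = TreeNode(8,
    TreeNode(3, TreeNode(1), TreeNode(6, TreeNode(4), TreeNode(7))),
    TreeNode(10, None, TreeNode(14, TreeNode(13))))
assert bst_search(root, 7).value == 7
assert bst_search(root, 12) is None
```

Searching for 7 follows 8 → 3 → 6 → 7: four nodes touched out of nine, and the gap widens as the tree grows (if it stays balanced).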

See validate_bst for the classic Google interview problem.

Balanced Trees: Red-Black, AVL, Splay

Google’s email mentions these. Here’s what to know:

AVL Trees

  • Invariant: For every node, the heights of left and right subtrees differ by at most 1

  • Rebalancing: After insert/delete, walk up and rotate as needed

  • Lookup: O(log n) — strictly balanced, slightly faster lookup than Red-Black

  • Insert/Delete: O(log n) but with more rotations than Red-Black

  • Used in: Databases (when reads >> writes)

Red-Black Trees

  • Invariant: Nodes are colored red/black following rules that guarantee approximate balance

  • Rebalancing: Fewer rotations than AVL (at most 2 per insert, 3 per delete)

  • All operations: O(log n)

  • Used in: C++ std::map, Java TreeMap, Linux kernel scheduler

  • Python: the standard library has no balanced BST; sortedcontainers.SortedList provides comparable ordered operations using a list-of-sublists structure

Splay Trees

  • Key idea: Recently accessed elements move to the root ("splaying")

  • Amortized O(log n) for all operations (individual ops can be O(n))

  • Used when: Some elements are accessed much more often than others (caching behavior)

What Google wants to hear: You don’t need to implement these from scratch (extremely unlikely in a 45-min interview). But you should:

  1. Know they exist and what problem they solve (keeping the tree balanced)

  2. Understand that balanced = O(log n) height guarantee

  3. Know that Python’s dict uses a hash table, NOT a tree (unlike C++ std::map which uses Red-Black)

  4. Be able to say: "We could use a balanced BST for O(log n) ordered operations, or a hash map for O(1) unordered operations"
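Python has no built-in balanced BST, but the standard-library bisect module over a sorted list answers the same ordered queries in O(log n), which makes the tradeoff in point 4 concrete:

```python
import bisect

sorted_keys = [1, 3, 4, 6, 7, 8, 10, 13, 14]

# An ordered query a hash map can't answer directly:
# the smallest key >= 5 (a successor / ceiling query).
i = bisect.bisect_left(sorted_keys, 5)
assert sorted_keys[i] == 6

# Membership is O(log n) here, vs O(1) average for a hash set.
j = bisect.bisect_left(sorted_keys, 7)
assert j < len(sorted_keys) and sorted_keys[j] == 7
```

The catch: inserting into a Python list is O(n) because elements must shift, which is precisely the cost a balanced BST (or sortedcontainers) avoids when writes are frequent.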

N-ary Trees

N-ary trees allow each node any number of children (not just two). The usual representation gives each node a list of children:

class NaryTreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []  # List of child nodes

# Example: file system
root = NaryTreeNode("root", [
    NaryTreeNode("Documents", [
        NaryTreeNode("resume.pdf"),
        NaryTreeNode("cover_letter.pdf"),
    ]),
    NaryTreeNode("Pictures", [
        NaryTreeNode("vacation.jpg"),
    ]),
])

Traversals are similar to binary trees but loop over children instead of separate .left/.right.
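For instance, a preorder traversal replaces the two pointer recursions with a single loop. A minimal sketch, reusing the NaryTreeNode class from above:

```python
class NaryTreeNode:
    def __init__(self, value, children=None):
        self.value = value
        self.children = children or []

def preorder(node, out=None):
    """Visit the node, then recurse into each child left to right."""
    if out is None:
        out = []
    out.append(node.value)
    for child in node.children:   # one loop instead of .left / .right
        preorder(child, out)
    return out

root = NaryTreeNode("root", [
    NaryTreeNode("Documents", [NaryTreeNode("resume.pdf")]),
    NaryTreeNode("Pictures", [NaryTreeNode("vacation.jpg")]),
])
assert preorder(root) == ["root", "Documents", "resume.pdf",
                          "Pictures", "vacation.jpg"]
```

For a file system, this visit order is exactly a top-down directory listing.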

Tries (Prefix Trees)

A trie is a tree where each edge represents a character, and paths from root to nodes represent prefixes of words. Used for:

  • Autocomplete (Google Search!)

  • Spell checking

  • IP routing tables
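The core idea fits in a short sketch: insert plus a prefix query, with each node keeping a dict from character to child (a minimal illustration, not a full implementation):

```python
class TrieNode:
    def __init__(self):
        self.children = {}    # char -> TrieNode
        self.is_word = False  # marks the end of a complete word

class Trie:
    def __init__(self):
        self.root = TrieNode()

    def insert(self, word):
        node = self.root
        for ch in word:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True

    def starts_with(self, prefix):
        """True if any inserted word begins with prefix (the autocomplete query)."""
        node = self.root
        for ch in prefix:
            if ch not in node.children:
                return False
            node = node.children[ch]
        return True

trie = Trie()
for word in ["google", "good", "go"]:
    trie.insert(word)
assert trie.starts_with("goo")
assert not trie.starts_with("gu")
```

Both operations cost O(length of the string), independent of how many words are stored, which is what makes tries attractive for autocomplete.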

See trie for full implementation.

Exercises