The Night Before My Amazon Interview, I Finally Understood BFS vs DFS
11 PM: The Panic Sets In
It was 11 PM on a Wednesday in October. My Amazon coding interview was at 9 AM the next morning. I had been preparing for weeks - arrays, strings, hash maps, sliding windows - and I felt reasonably confident about most topics. But there was one area I had been avoiding like a student dodging a final exam: graphs.
Every time I opened a graph problem on LeetCode, the same thing happened. I would stare at the adjacency list. I would read the problem statement three times. I would try to write some kind of traversal. And then I would get lost in a tangle of visited sets, queues, stacks, and recursive calls that seemed to spiral in every direction. The code never worked on the first try. Sometimes it never worked at all.
Graphs felt like hieroglyphics. I could recognize them, vaguely sense they meant something important, but I could not actually read them.
The problem was not intelligence. The problem was that every resource I had read explained BFS and DFS as abstract algorithms on abstract data structures. "Visit all neighbors." "Use a queue." "Use a stack." The words were technically correct but completely unhelpful for building intuition. I could memorize the pseudocode, but the moment a problem deviated even slightly from the textbook example, I was lost.
At 11:15 PM, I made myself a cup of coffee, opened a blank document, and decided I was not going to sleep until I actually understood this. Not memorized it. Understood it.
What happened over the next few hours changed the way I think about graph algorithms permanently. And it started with a building.
BFS - The Methodical Floor-by-Floor Search
Imagine you are a firefighter. You arrive at a ten-story office building and someone tells you there is a person trapped somewhere inside. You have no idea which floor, which room. How do you search?
One strategy: you walk into the lobby and check every single room on the ground floor first. Every closet, every office, every bathroom. Only when you have cleared the entire first floor do you take the stairs to the second floor and repeat the process. Floor by floor. Systematic. Exhaustive at each level before moving deeper.
That is BFS. Breadth-First Search.
You explore everything at the current "distance" from your starting point before moving to the next distance. In the building, distance is measured in floors. In a graph, distance is measured in edges.
The Mechanics
BFS uses a queue - a first-in, first-out data structure. The queue is your to-do list. When you visit a room (node), you add all of its connected rooms (neighbors) to the back of the queue. Then you process rooms from the front of the queue. Because earlier-discovered rooms are processed first, you naturally explore level by level.
Here is the core BFS template:
```python
from collections import deque

def bfs(graph, start):
    visited = set()
    queue = deque([start])
    visited.add(start)
    while queue:
        node = queue.popleft()
        # Process node here
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
```
Let me trace through a concrete example. Suppose you have this graph:
```
    A
   / \
  B   C
 / \   \
D   E   F
```
Represented as an adjacency list:
```python
graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B'],
    'F': ['C']
}
```
Starting BFS from A:
- Queue: [A] - Visit A. Add B and C to the queue. Queue becomes [B, C].
- Queue: [B, C] - Visit B (front of queue). Add D and E (A is already visited). Queue becomes [C, D, E].
- Queue: [C, D, E] - Visit C. Add F. Queue becomes [D, E, F].
- Queue: [D, E, F] - Visit D. No new neighbors. Queue becomes [E, F].
- Queue: [E, F] - Visit E. No new neighbors. Queue becomes [F].
- Queue: [F] - Visit F. No new neighbors. Queue is empty. Done.
Order of visitation: A → B → C → D → E → F. Notice how we fully explored distance-1 nodes (B, C) before distance-2 nodes (D, E, F). That is the BFS guarantee.
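To verify the trace end-to-end, here is a self-contained version of the template that records the visitation order (the `order` list and the `bfs_order` name are my additions for illustration):

```python
from collections import deque

def bfs_order(graph, start):
    """Run BFS from start and return the order in which nodes are visited."""
    visited = {start}
    queue = deque([start])
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append(neighbor)
    return order

graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B'],
    'F': ['C'],
}
print(bfs_order(graph, 'A'))  # → ['A', 'B', 'C', 'D', 'E', 'F']
```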
Why BFS Gives You Shortest Paths
Here is the critical insight that clicked for me at 11:30 PM: because BFS explores nodes in order of their distance from the source, the first time BFS reaches a node is guaranteed to be via the shortest path (in an unweighted graph).
Think about the firefighter again. If the firefighter finds the trapped person on the 4th floor, they know there was no one on floors 1, 2, or 3 - because those were already fully searched. The 4th floor is genuinely the closest floor where the person could be.
This property makes BFS the go-to algorithm for:
- Shortest path in unweighted graphs - "What is the minimum number of moves to reach this state?"
- Level-order traversal of trees - Process all nodes at depth d before depth d+1.
- Finding all nodes within a certain distance - "Which cities are within 3 flights of New York?"
- Multi-source BFS - Problems like "rotting oranges" where you start from multiple sources simultaneously.
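The "within a certain distance" use case falls out of the same template: record each node's distance when it is first discovered. A minimal sketch (the `bfs_distances` helper name is my own):

```python
from collections import deque

def bfs_distances(graph, start):
    """Map every reachable node to its edge-distance from start."""
    dist = {start: 0}  # Doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist

graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B'],
    'F': ['C'],
}
dist = bfs_distances(graph, 'A')
within_1 = sorted(n for n, d in dist.items() if d <= 1)
print(within_1)  # → ['A', 'B', 'C']
```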
The Classic Interview Problem: Shortest Path in a Grid
One of the most common BFS interview questions is finding the shortest path in a grid. You are given a 2D grid where 0 represents open cells and 1 represents walls. Find the shortest path from the top-left corner to the bottom-right corner.
```python
from collections import deque

def shortest_path(grid):
    if not grid or grid[0][0] == 1:
        return -1
    rows, cols = len(grid), len(grid[0])
    queue = deque([(0, 0, 1)])  # row, col, distance
    visited = {(0, 0)}
    directions = [(0, 1), (0, -1), (1, 0), (-1, 0)]
    while queue:
        row, col, dist = queue.popleft()
        if row == rows - 1 and col == cols - 1:
            return dist
        for dr, dc in directions:
            new_row, new_col = row + dr, col + dc
            if (0 <= new_row < rows and
                    0 <= new_col < cols and
                    grid[new_row][new_col] == 0 and
                    (new_row, new_col) not in visited):
                visited.add((new_row, new_col))
                queue.append((new_row, new_col, dist + 1))
    return -1  # No path found
```
The key insight: we track distance as part of the queue entry. Because BFS processes nodes level by level, the first time we reach the destination is guaranteed to be the shortest path. No need to explore further.
DFS - The Curious Room-by-Room Deep Dive
Now imagine a different kind of search. You are not a firefighter. You are a curious explorer in a mysterious mansion. You walk into the first room and see three doors. Instead of checking all three systematically, you pick the leftmost door and walk through it. That room has two more doors. You pick the leftmost again and walk through. You keep going deeper and deeper until you hit a dead end - a room with no unexplored doors. Only then do you backtrack to the last room where you had an unexplored option, and try the next door.
That is DFS. Depth-First Search.
You go as deep as possible along one path before backtracking. Where BFS is the methodical firefighter, DFS is the curious explorer who follows each thread to its end.
The Mechanics
DFS can be implemented two ways: recursively (using the call stack) or iteratively (using an explicit stack). The recursive version is more common in interviews because it is cleaner to write.
Recursive DFS:
```python
def dfs(graph, node, visited=None):
    if visited is None:
        visited = set()
    visited.add(node)
    # Process node here
    for neighbor in graph[node]:
        if neighbor not in visited:
            dfs(graph, neighbor, visited)
```
Iterative DFS (using an explicit stack):
```python
def dfs_iterative(graph, start):
    visited = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in visited:
            continue
        visited.add(node)
        # Process node here
        for neighbor in graph[node]:
            if neighbor not in visited:
                stack.append(neighbor)
```
Using the same graph from before, starting DFS from A:
- Visit A. Recurse into B (first neighbor).
- Visit B. Recurse into D (first unvisited neighbor; A is visited).
- Visit D. No unvisited neighbors. Backtrack to B.
- At B, recurse into E (next unvisited neighbor).
- Visit E. No unvisited neighbors. Backtrack to B, then to A.
- At A, recurse into C (next unvisited neighbor).
- Visit C. Recurse into F.
- Visit F. No unvisited neighbors. Done.
Order of visitation: A → B → D → E → C → F. Notice how we went deep (A → B → D) before backtracking and exploring other branches. That is the DFS guarantee.
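The same sanity check works for DFS. A self-contained version of the recursive template that records the visitation order (the `order` list and `dfs_order` name are my additions):

```python
def dfs_order(graph, start):
    """Recursive DFS from start, returning the visitation order."""
    visited = set()
    order = []

    def dfs(node):
        visited.add(node)
        order.append(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                dfs(neighbor)

    dfs(start)
    return order

graph = {
    'A': ['B', 'C'],
    'B': ['A', 'D', 'E'],
    'C': ['A', 'F'],
    'D': ['B'],
    'E': ['B'],
    'F': ['C'],
}
print(dfs_order(graph, 'A'))  # → ['A', 'B', 'D', 'E', 'C', 'F']
```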
Why DFS Is Your Swiss Army Knife
DFS is less specialized than BFS - it does not guarantee shortest paths - but it is extraordinarily versatile. Here is what it excels at:
Cycle detection. In a directed graph, if during DFS you encounter a node that is currently on the recursion stack (not just visited, but actively being explored), you have found a cycle. This distinction between "visited" and "currently on the stack" is subtle but critical.
```python
def has_cycle(graph, n):
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * n

    def dfs(node):
        color[node] = GRAY  # Currently exploring
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:  # Back edge = cycle
                return True
            if color[neighbor] == WHITE and dfs(neighbor):
                return True
        color[node] = BLACK  # Fully explored
        return False

    for i in range(n):
        if color[i] == WHITE:
            if dfs(i):
                return True
    return False
```
The three-color approach (WHITE = unvisited, GRAY = in progress, BLACK = complete) is a pattern you will use over and over.
Connected components. Run DFS from each unvisited node; each DFS call discovers one connected component.
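That one-DFS-per-component idea fits in a few lines. A minimal sketch for an undirected graph (the `count_components` name is my own):

```python
def count_components(graph):
    """Count connected components in an undirected graph."""
    visited = set()

    def dfs(node):
        visited.add(node)
        for neighbor in graph[node]:
            if neighbor not in visited:
                dfs(neighbor)

    count = 0
    for node in graph:
        if node not in visited:
            count += 1  # Each new DFS launch discovers one whole component
            dfs(node)
    return count

# Two components: {0, 1, 2} and {3, 4}
graph = {0: [1], 1: [0, 2], 2: [1], 3: [4], 4: [3]}
print(count_components(graph))  # → 2
```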
Path finding. DFS naturally explores paths. Combined with backtracking, it can enumerate all paths between two nodes.
Topological sort. We will cover this in depth below, but DFS is the foundation.
Island counting. The classic "number of islands" problem is just running DFS from each unvisited land cell and counting how many times you trigger a new DFS.
```python
def num_islands(grid):
    if not grid:
        return 0
    rows, cols = len(grid), len(grid[0])
    count = 0

    def dfs(r, c):
        if r < 0 or r >= rows or c < 0 or c >= cols or grid[r][c] == '0':
            return
        grid[r][c] = '0'  # Mark as visited
        dfs(r + 1, c)
        dfs(r - 1, c)
        dfs(r, c + 1)
        dfs(r, c - 1)

    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == '1':
                count += 1
                dfs(r, c)
    return count
```
The Cheat Sheet: Which One When?
By 12:30 AM, I had written out a decision framework on a sticky note. I still have that sticky note on my monitor. Here it is.
The Decision Framework
Use BFS when:
- You need the shortest path in an unweighted graph
- You need to process nodes level by level
- You are working with multiple starting points simultaneously
- The problem mentions "minimum number of steps/moves/transformations"
Use DFS when:
- You need to detect cycles
- You need to explore all possible paths (backtracking)
- You need topological ordering
- You need to find connected components
- You are working with tree traversals (preorder, inorder, postorder)
- The problem asks "is there any path" (not shortest)
Comparison Table
| Property | BFS | DFS |
|---|---|---|
| Data structure | Queue (FIFO) | Stack/Recursion (LIFO) |
| Exploration order | Level by level | Branch by branch |
| Shortest path guarantee | Yes (unweighted) | No |
| Space complexity | O(width of graph) | O(depth of graph) |
| Cycle detection (directed) | Harder | Natural (3-color) |
| Topological sort | Yes (Kahn's) | Yes (reverse postorder) |
| Best for grids | Shortest path problems | Flood fill / connectivity |
Common Problem Types Mapped
| Problem Pattern | Algorithm | Why |
|---|---|---|
| Shortest path in unweighted graph | BFS | Distance guarantee |
| Shortest path in weighted graph | Dijkstra (modified BFS) | Needs priority queue |
| Number of islands | DFS | Flood fill / connectivity |
| Word ladder | BFS | Minimum transformations |
| Course schedule (cycle detection) | DFS | Three-color cycle detection |
| Course schedule (ordering) | Topological sort (DFS or BFS) | Dependency ordering |
| Clone graph | BFS or DFS | Either works |
| Surrounded regions | DFS from borders | Marking what to keep |
| Rotting oranges | Multi-source BFS | Simultaneous spread |
| All paths from source to target | DFS + backtracking | Path enumeration |
This table alone saved me during the interview. When you see a graph problem, the first question is not "how do I write BFS?" It is "which exploration pattern does this problem need?"
Trees Are Just Polite Graphs
At about 1 AM, I had another breakthrough that I wish someone had told me months earlier: tree traversals are just special cases of graph traversal.
A tree is a connected, acyclic graph with a designated root. That means:
- You never need a visited set (no cycles, so you cannot revisit nodes)
- There is exactly one path between any two nodes
- Every node except the root has exactly one parent
Because of these constraints, DFS on a tree simplifies beautifully into the three classic traversals: preorder, inorder, and postorder. These are not different algorithms. They are DFS with different timing for when you process the current node.
Preorder (Root → Left → Right)
Process the node before recursing into children. This is standard DFS - you see the node the moment you arrive.
```python
def preorder(node):
    if not node:
        return
    print(node.val)  # Process BEFORE children
    preorder(node.left)
    preorder(node.right)
```
Use case: serializing a tree (you need the structure top-down), copying a tree, building prefix expressions.
Inorder (Left → Root → Right)
Process the node between the left and right children. This only makes sense for binary trees.
```python
def inorder(node):
    if not node:
        return
    inorder(node.left)
    print(node.val)  # Process BETWEEN children
    inorder(node.right)
```
Use case: Binary Search Tree traversal. Inorder traversal of a BST visits nodes in sorted order. This is the single most important fact about BSTs in interviews. If someone asks you to validate a BST, find the kth smallest element, or convert a BST to a sorted list, inorder traversal is your tool.
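To make that concrete, here is one way the "kth smallest in a BST" problem falls out of inorder traversal, using an iterative version so we can stop early (the `Node` class and `kth_smallest` name are my own for illustration):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def kth_smallest(root, k):
    """Inorder traversal of a BST yields sorted order; return the kth value."""
    stack, node = [], root
    while stack or node:
        while node:            # Go as far left as possible
            stack.append(node)
            node = node.left
        node = stack.pop()     # Smallest unvisited node
        k -= 1
        if k == 0:
            return node.val
        node = node.right

#       5
#      / \
#     3   7
#    / \
#   2   4
root = Node(5, Node(3, Node(2), Node(4)), Node(7))
print(kth_smallest(root, 2))  # → 3
```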
Postorder (Left → Right → Root)
Process the node after all children have been processed.
```python
def postorder(node):
    if not node:
        return
    postorder(node.left)
    postorder(node.right)
    print(node.val)  # Process AFTER children
```
Use case: deleting a tree (you must delete children before their parent), evaluating expression trees, calculating subtree properties (like subtree sums or heights).
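Calculating a subtree property like height is the cleanest example of postorder thinking: you cannot know a node's height until both children report theirs. A short sketch (the `Node` class is my own minimal definition):

```python
class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def height(node):
    """Height of a binary tree: children first, then combine - pure postorder."""
    if not node:
        return 0
    left = height(node.left)
    right = height(node.right)
    return 1 + max(left, right)  # Process the node AFTER its children

#     1
#    / \
#   2   3
#  /
# 4
root = Node(1, Node(2, Node(4)), Node(3))
print(height(root))  # → 3
```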
BFS on Trees = Level-Order Traversal
BFS on a tree gives you level-order traversal - process all nodes at depth 0, then depth 1, then depth 2, and so on. This is extremely common in interviews.
```python
from collections import deque

def level_order(root):
    if not root:
        return []
    result = []
    queue = deque([root])
    while queue:
        level_size = len(queue)
        current_level = []
        for _ in range(level_size):
            node = queue.popleft()
            current_level.append(node.val)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
        result.append(current_level)
    return result
```
The trick here is the for _ in range(level_size) loop. By capturing the queue length at the start of each iteration, you process exactly one level per outer while-loop iteration. This lets you group nodes by level, which is required for problems like "zigzag level order traversal," "right side view of a binary tree," and "maximum width of a binary tree."
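As one illustration of how the level-size trick generalizes, here is a sketch of "right side view": the last node processed in each level is the rightmost one (the `Node` class and `right_side_view` name are my own):

```python
from collections import deque

class Node:
    def __init__(self, val, left=None, right=None):
        self.val = val
        self.left = left
        self.right = right

def right_side_view(root):
    """Return the value of the rightmost node at each level."""
    if not root:
        return []
    view = []
    queue = deque([root])
    while queue:
        level_size = len(queue)
        for i in range(level_size):
            node = queue.popleft()
            if i == level_size - 1:  # Rightmost node of this level
                view.append(node.val)
            if node.left:
                queue.append(node.left)
            if node.right:
                queue.append(node.right)
    return view

#     1
#    / \
#   2   3
#    \
#     5
root = Node(1, Node(2, None, Node(5)), Node(3))
print(right_side_view(root))  # → [1, 3, 5]
```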
Once I saw trees as a special case of graphs, a dozen problem types that had felt disconnected suddenly clicked together. Tree problems and graph problems are the same family. The only difference is that trees are polite enough not to have cycles.
Topological Sort: The One That Sounds Scary But Isn't
At 2 AM, I opened the "Course Schedule" problem on LeetCode. I had avoided it for weeks because "topological sort" sounded like something from a graduate-level mathematics course. It is not. It is one of the most intuitive algorithms once you see what it actually does.
The Problem
You have n courses numbered from 0 to n-1. Some courses have prerequisites. For example, to take course 1, you must first take course 0. Given a list of prerequisite pairs, determine if you can finish all courses. If you can, return a valid ordering.
This is a real-world problem: think of it as a dependency graph. Course A depends on Course B. Module X depends on Module Y. Task 3 cannot start until Task 1 and Task 2 are finished. Topological sort gives you a linear ordering of nodes in a directed acyclic graph (DAG) such that for every directed edge (u, v), u comes before v in the ordering.
If the graph has a cycle (Course A requires Course B, Course B requires Course A), no valid ordering exists.
Approach 1: Kahn's Algorithm (BFS-Based)
Kahn's algorithm is the BFS approach to topological sort. The idea is simple and elegant:
1. Calculate the in-degree of every node (how many edges point to it).
2. Add all nodes with in-degree 0 to a queue (these have no prerequisites).
3. Process nodes from the queue: for each processed node, reduce the in-degree of its neighbors by 1. If any neighbor's in-degree drops to 0, add it to the queue.
4. If you process all nodes, you have a valid topological order. If not, there is a cycle.
```python
from collections import deque

def topological_sort_kahn(num_courses, prerequisites):
    # Build adjacency list and in-degree array
    graph = [[] for _ in range(num_courses)]
    in_degree = [0] * num_courses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    # Start with all nodes that have no prerequisites
    queue = deque()
    for i in range(num_courses):
        if in_degree[i] == 0:
            queue.append(i)
    order = []
    while queue:
        node = queue.popleft()
        order.append(node)
        for neighbor in graph[node]:
            in_degree[neighbor] -= 1
            if in_degree[neighbor] == 0:
                queue.append(neighbor)
    # If we processed all courses, we have a valid order
    if len(order) == num_courses:
        return order
    else:
        return []  # Cycle detected
```
Why does this work? Think about it concretely. If a course has no prerequisites (in-degree 0), you can take it right now. Once you take it, any course that depended only on that one prerequisite now also has no prerequisites. You are peeling off layers of the dependency graph, starting from the courses with no dependencies and working inward.
The cycle detection falls out naturally: if there is a cycle, the nodes in the cycle will never reach in-degree 0 (they are always waiting on each other), so they will never be added to the queue, and you will process fewer than num_courses nodes.
Approach 2: DFS-Based Topological Sort
The DFS approach is more subtle but equally powerful. The key insight: in a DFS, a node's recursive call finishes only after all of its descendants have been fully explored. So if you add each node to the result after its DFS call completes (postorder), and then reverse the result, you get a valid topological ordering.
```python
def topological_sort_dfs(num_courses, prerequisites):
    graph = [[] for _ in range(num_courses)]
    for course, prereq in prerequisites:
        graph[prereq].append(course)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = [WHITE] * num_courses
    order = []
    has_cycle = False

    def dfs(node):
        nonlocal has_cycle
        if has_cycle:
            return
        color[node] = GRAY
        for neighbor in graph[node]:
            if color[neighbor] == GRAY:
                has_cycle = True
                return
            if color[neighbor] == WHITE:
                dfs(neighbor)
        color[node] = BLACK
        order.append(node)  # Add in postorder

    for i in range(num_courses):
        if color[i] == WHITE:
            dfs(i)
    if has_cycle:
        return []
    return order[::-1]  # Reverse postorder = topological order
```
Notice the three-color technique again. GRAY nodes are currently on the recursion stack. If we encounter a GRAY node during DFS, we have found a back edge, which means a cycle. This is the standard cycle detection technique for directed graphs.
When to Use Which Approach
Both approaches have the same time and space complexity: O(V + E) where V is the number of vertices and E is the number of edges.
- Kahn's (BFS): Easier to understand. Naturally gives you the order front-to-back. Better when you need to process nodes level by level (e.g., "minimum semesters to complete all courses" - each BFS level is one semester).
- DFS: More naturally combines with cycle detection. Better when you are already doing DFS for other reasons. The reverse-postorder idea generalizes to other algorithms (like finding strongly connected components).
For interviews, I recommend starting with Kahn's algorithm. It is more intuitive to explain to an interviewer, and the in-degree approach is easy to reason about verbally. Switch to DFS-based topological sort when the problem also requires cycle detection or when it is a subproblem within a larger DFS.
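The "minimum semesters" variant mentioned above is worth sketching, because it shows why Kahn's level-by-level structure matters. This is my own illustration of that problem shape, combining Kahn's algorithm with the level-size trick from the tree section:

```python
from collections import deque

def min_semesters(num_courses, prerequisites):
    """Level-by-level Kahn's: each BFS level is one semester.
    Returns -1 if a cycle makes the schedule impossible."""
    graph = [[] for _ in range(num_courses)]
    in_degree = [0] * num_courses
    for course, prereq in prerequisites:
        graph[prereq].append(course)
        in_degree[course] += 1
    queue = deque(i for i in range(num_courses) if in_degree[i] == 0)
    taken, semesters = 0, 0
    while queue:
        semesters += 1
        for _ in range(len(queue)):  # One whole level = one semester
            node = queue.popleft()
            taken += 1
            for neighbor in graph[node]:
                in_degree[neighbor] -= 1
                if in_degree[neighbor] == 0:
                    queue.append(neighbor)
    return semesters if taken == num_courses else -1

# 0 before 1 and 2; 1 before 3. Courses 1 and 2 share semester 2.
print(min_semesters(4, [[1, 0], [2, 0], [3, 1]]))  # → 3
```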
Real Applications
Topological sort is not an academic exercise. It is used everywhere:
- Build systems (Make, Gradle, Webpack): compile dependencies in the right order
- Package managers (npm, pip): install dependencies before the packages that need them
- Task scheduling: determine valid execution orders for tasks with dependencies
- Spreadsheet calculation: evaluate cells in dependency order
- Course planning: determine a valid semester-by-semester schedule
When an interviewer asks about topological sort, they are testing whether you can model a real-world dependency problem as a directed graph and produce a valid ordering. The algorithm itself is straightforward once you see it as a dependency resolution problem.
8 AM: The Interview (How It Actually Went)
I slept for about four hours. Not great, not terrible. I woke up at 7 AM, splashed water on my face, made another coffee, and reviewed the sticky note one more time. BFS for shortest paths. DFS for cycles, components, topological sort. Trees are just polite graphs. Queue vs stack. I could feel the concepts sitting more solidly than they had the night before.
The Amazon interview started at 9 AM. After the behavioral questions, the interviewer shared a coding problem:
"You are given a grid representing a maze. 0 is an open cell, 1 is a wall. Find the shortest path from the top-left to the bottom-right corner, moving only in four cardinal directions."
I almost laughed. Shortest path. Unweighted graph (each step costs 1). Grid.
BFS. Immediately.
I told the interviewer exactly that: "This is a shortest-path problem on an unweighted graph. BFS guarantees the first time we reach the destination is via the shortest path, because it explores nodes in order of their distance from the source."
I wrote the BFS grid traversal almost exactly as I had practiced it at 12:30 AM. Queue with (row, col, distance). Visited set. Four-directional movement. Check bounds and walls. Return distance when we reach the target. Return -1 if the queue empties.
The interviewer asked a follow-up: "What if some cells have different weights - like mud cells that cost 2 to cross?"
I paused for a moment, then explained: "BFS's shortest-path guarantee only holds for unweighted graphs. With variable edge weights, we would need Dijkstra's algorithm, which is essentially BFS with a priority queue (min-heap) instead of a regular queue. We always process the node with the smallest cumulative distance next."
I did not code a full Dijkstra implementation (the interviewer did not ask for one), but explaining the transition from BFS to Dijkstra clearly showed that I understood the underlying principle rather than just memorizing templates.
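For completeness, here is roughly what that BFS-to-Dijkstra transition looks like on a grid. This is my own sketch, under the assumption that grid[r][c] is the cost to enter a cell (e.g. 1 for open floor, 2 for mud); the interviewer's exact cost model may differ:

```python
import heapq

def cheapest_path(grid):
    """Dijkstra on a grid: BFS where the queue becomes a min-heap
    keyed by cumulative cost."""
    rows, cols = len(grid), len(grid[0])
    best = {(0, 0): grid[0][0]}           # Cheapest known cost per cell
    heap = [(grid[0][0], 0, 0)]           # (cost so far, row, col)
    while heap:
        cost, r, c = heapq.heappop(heap)
        if (r, c) == (rows - 1, cols - 1):
            return cost                   # First pop of target = cheapest
        if cost > best.get((r, c), float('inf')):
            continue                      # Stale heap entry; skip
        for dr, dc in ((0, 1), (0, -1), (1, 0), (-1, 0)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                new_cost = cost + grid[nr][nc]
                if new_cost < best.get((nr, nc), float('inf')):
                    best[(nr, nc)] = new_cost
                    heapq.heappush(heap, (new_cost, nr, nc))
    return -1  # Target unreachable

grid = [
    [1, 2, 1],
    [1, 2, 1],
    [1, 1, 1],
]
print(cheapest_path(grid))  # → 5 (down the left edge, then across)
```

Note the structural similarity to the BFS grid code: same bounds checks, same four directions. The only real changes are the heap and the relaxation step.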
The second problem involved detecting if a directed graph had a cycle - essentially the "Course Schedule" problem. I used the three-color DFS approach. WHITE, GRAY, BLACK. A back edge to a GRAY node means a cycle. I explained the reasoning behind each color as I coded.
I got the offer three weeks later.
Here is what I want you to take away from this story: I did not succeed because I am some kind of algorithm genius. I succeeded because at 11 PM, instead of trying to memorize more code, I stopped and built a mental model. The building analogy for BFS. The explorer analogy for DFS. Trees as polite graphs. Topological sort as dependency resolution. These mental models turned abstract algorithms into tools I could reason about under pressure.
You do not need to cram every LeetCode problem. You need to understand the underlying patterns deeply enough that you can adapt them on the fly. If you want structured practice that builds this kind of deep understanding - where you're not just solving problems but learning to recognize which pattern a problem needs - that is exactly what Levelop's coding practice module is designed for.
Graph problems will keep showing up in your interviews. BFS vs DFS will keep being the fundamental decision point. But once you have the mental models, that decision stops being terrifying and starts being automatic.
It is 3 AM. The coffee is cold. But for the first time in months, graphs do not feel like hieroglyphics anymore. They feel like a building waiting to be explored.
If you found this guide helpful, check out our breakdown of the two-pointer technique - another pattern that shows up in nearly every coding interview.
Frequently Asked Questions
When should I use BFS instead of DFS?
Use BFS whenever you need the shortest path in an unweighted graph or when you need to process nodes level by level. BFS explores all nodes at distance d before any node at distance d+1, which guarantees that the first time you reach a target node is via the shortest path. Classic examples include shortest path in a grid, word ladder (minimum transformations from one word to another), and rotting oranges (simultaneous spread from multiple sources). If the problem says "minimum number of steps," "fewest moves," or "shortest path" and the graph is unweighted, BFS is almost certainly the right choice. DFS, by contrast, has no such guarantee - it may find a path, but not necessarily the shortest one, because it explores one branch fully before backtracking.
What data structure does BFS use?
BFS uses a queue, which is a first-in, first-out (FIFO) data structure. In Python, the most efficient implementation is collections.deque, which provides O(1) append and popleft operations. In Java, you would use LinkedList or ArrayDeque. The queue ensures that nodes discovered earlier (closer to the source) are processed before nodes discovered later (farther from the source). This is what gives BFS its level-by-level exploration property. By contrast, DFS uses a stack (LIFO) - either an explicit stack data structure or the implicit call stack when implemented recursively. The choice between queue and stack is the fundamental mechanical difference between BFS and DFS, and it directly determines the exploration order.
How do I detect a cycle in a directed graph?
The most reliable method for cycle detection in a directed graph is DFS with three-color marking. Assign each node one of three states: WHITE (not yet visited), GRAY (currently being explored - on the recursion stack), and BLACK (fully explored - all descendants processed). During DFS, if you encounter a GRAY node, you have found a back edge, which means a cycle exists. This distinction between GRAY and BLACK is critical: a BLACK node has been fully explored and is not part of the current path, so reaching it is not a cycle. An alternative approach is Kahn's algorithm (BFS-based topological sort): compute in-degrees, repeatedly remove nodes with in-degree 0, and if you cannot remove all nodes, a cycle exists. Both methods run in O(V + E) time. For undirected graphs, cycle detection is simpler - any edge that leads to an already-visited node (that is not the direct parent) indicates a cycle.
What is topological sorting used for in real applications?
Topological sorting is used whenever you need to determine a valid execution order for tasks with dependencies. The most common real-world applications include build systems (Make, Gradle, Bazel) that must compile source files in dependency order, package managers (npm, pip, apt) that install library dependencies before the packages that depend on them, task schedulers that sequence jobs with prerequisite relationships, spreadsheet engines that evaluate cells in the correct order when formulas reference other cells, and course planning systems that help students sequence courses with prerequisites. In interviews, topological sort most commonly appears as the "Course Schedule" problem (can you finish all courses given prerequisites?) and its variant (return a valid ordering). It also appears in problems involving compilation order, recipe dependencies, and workflow scheduling. Any problem where you need to order items respecting "X must come before Y" constraints is a topological sort problem.