Algorithms from zero
Toolbox: capstone problem-solving challenge
A real problem does not announce which technique it wants, and the hard ones need more than one. This capstone is a single multi-stage system where each stage points at a different tool from the track — and the whole only works if you choose each tool from the constraints and wire them together. Solving it is the proof that the toolbox is yours.
Turn the track into a working decision process: read a problem, derive the complexity budget, decompose it into stages, pick the right technique and data structure for each stage, combine them, and justify every choice — including why you rejected the tempting alternative.
Build a command-line 'package build planner'. Given a set of packages with dependencies, sizes, and an integer 'value', it must (a) detect dependency cycles and refuse them, (b) compute a valid build order, (c) given a disk budget B, select the maximum-value buildable subset that respects both the budget and the dependency closure, and (d) answer fast repeated 'is package X selected?' queries. Each stage forces a deliberate technique choice; you must name and justify each one.
- Cycles are detected and reported with the offending packages; acyclic inputs produce a valid topological build order, verified against the dependency edges.
- The selected subset is dependency-closed, within the disk budget B, and provably maximum-value for the test cases — including the case constructed to defeat greedy-by-ratio.
- Membership queries run in O(1) average and the code demonstrates the difference against a linear-scan baseline on a large selected set.
- The technique-selection log names a distinct, correct technique per stage with its complexity, and the overall pipeline complexity is stated and matches the derived budget.
- A one-paragraph reflection: which single clue (constraint size, problem verb, or structure) was decisive for each stage's choice.
- Replace the n ≤ 30 bitmask selection with a true dependency-aware knapsack DP and compare both on correctness and runtime, documenting where each becomes the better choice as n grows.
- Add shortest-build-time scheduling: given per-package build times and unlimited parallel workers, compute the minimum makespan using the longest path through the DAG (critical path) — another toposort-driven technique.
- Add a second query type — 'which packages depend, transitively, on X?' — and choose the right traversal (reverse-graph BFS or DFS), justifying it against the constraints.
- Profile the pipeline on a 10^5-node synthetic graph and confirm each stage's measured scaling matches its claimed complexity; note any stage that dominates and why.
This is the toolbox in one build: you read constraints into a budget, decomposed a system into stages, and chose a different technique per stage — toposort for ordering and cycle detection, constraint-justified subset search or DP for closure-aware selection, a hash set for O(1) membership — then combined them and defended each choice against its tempting alternative. The deliverable that matters most is the selection log: in real engineering, naming why this technique and not that one, backed by the constraint that decided it, is the skill the whole track was building toward.