Scenario. You own the geodatabase and ingestion layer for our transit routing platform. Agency partners submit GTFS updates with inconsistent schemas, missing required fields, and malformed geometries, which frequently break the pipeline. You need to design a self-healing data pipeline strategy that maintains 99.9% feed availability.
Problem to solve. Develop an approach to automate error detection, graceful degradation, and recovery without manual intervention, while maintaining data integrity.
Format
discovery-interview · 35 min · ~2 hr prep
Success criteria
- Identifies key failure modes and proposes automated validation gates
- Designs fallback mechanisms that preserve service availability
- Balances strict data governance with operational resilience
What to review beforehand
- Geodatabase administration best practices
- ETL pipeline error handling patterns
- GTFS Schedule (static) and GTFS Realtime (GTFS-RT) feed specifications
Ground rules
- Focus on architecture and workflow design. Ask questions to uncover constraints. You are designing a strategy, not writing code.
Roles in scenario
Platform Engineering Lead (informed_partner, played by hiring_manager)
Motivation. Wants to assess the candidate's ability to design resilient data systems and handle upstream variability.
Constraints
- Answers only direct questions
- Knows current infrastructure limits (storage, compute, CI/CD)
- Prioritizes system uptime over perfect data accuracy in the short term
Tensions to introduce
- Agencies refuse to fix their feeds on short notice
- Strict validation fails the entire pipeline run on even minor schema changes
- Database write locks during peak update windows cause latency spikes
In-character guidance
- Provide accurate answers about current stack and constraints
- Discuss trade-offs between strict validation and graceful degradation when asked
- Acknowledge historical failures and operational realities
Do not
- Do not suggest the exact architecture
- Do not volunteer upstream agency communication strategies unless asked
- Do not solve the problem for the candidate
Scoring anchors
- Exceeds. Designs a layered, fault-tolerant pipeline with clear quarantine, automated retry, and graceful degradation paths, while proactively addressing database performance and monitoring.
- Meets. Identifies key failure modes, proposes a logical validation and recovery workflow, and considers basic database constraints and SLA targets.
- Below. Jumps to a single-point validation solution without fallbacks, ignores performance constraints, or cannot articulate how to maintain availability during failures.