Transit Tech · Expert-built kit

GTFS-Realtime Engineer

Parses raw transit telemetry into real-time data feeds, maps agency fields to standard schemas, and optimizes stream distribution.

Calibrated for the level you’re hiring

What’s inside this kit

18 Competency interview questions
11 Attitude interview questions
8 Resume screening criteria
3 Video screening prompts
1 Hands-on work simulation
1 Presentation prompt
Progression framework, Junior–Principal
Ready-to-use job description

Why this role is hard · Ryan Mahoney

Finding the right person for this role is tough because the work sits exactly where fragile code meets messy real-world transit operations. We need someone who can actually listen when an agency engineer describes a broken vehicle tracker, and then turn that chaotic situation into a steady data stream. Plenty of candidates can write fast parsers, but they usually stumble when they ignore the early warning signs in a failing feed. We judge the right candidates by watching them connect a real transit delay to a practical streaming design instead of hiding behind heavy abstractions.

Core Evaluation

Critical questions for this role

The competency and attitude questions below are where the hiring decision is made. They run in the live interview rounds and are calibrated to the level selected above.

18 Competency Questions

1 of 18

Discipline
Data Ingestion & Feed Architecture
Job requirement
Feed Aggregation & Multi-Source Fusion
Develops conflict-resolution logic to prioritize authoritative data sources when aggregating multi-operator feeds.
Expected at Mid
3 / 5
Highly relevant for regional integrations, but at mid-level, conflict resolution is often scoped to specific agency hand-offs rather than a full-scale multi-source fusion architecture.

Interview round: Hiring Manager Technical Deep Dive

Describe a situation where you had to reconcile conflicting real-time updates from multiple transit operators covering the same routes.

Positive indicators

Defines authoritative sources
Uses timestamp logic
Implements deduplication

Negative indicators

Arbitrary merging approaches
Ignores timestamps
Duplicates records
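
For interviewers who want a concrete picture of what the positive indicators can look like together, here is a minimal sketch. It is not a reference answer. The source names, the priority ranking, the `TripUpdate` shape, and the 60-second freshness window are all hypothetical. The rule it illustrates: keep one update per trip, prefer the most authoritative source, and let a much fresher update from a lower-ranked source win.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical ranking: lower number = more authoritative source.
SOURCE_PRIORITY = {"agency_avl": 0, "regional_aggregator": 1, "third_party": 2}


@dataclass(frozen=True)
class TripUpdate:
    trip_id: str
    source: str
    timestamp: int  # POSIX seconds, as reported by the source
    delay_s: int


def _rank(source: str) -> int:
    # Unknown sources sort after every known one.
    return SOURCE_PRIORITY.get(source, len(SOURCE_PRIORITY))


def resolve(updates, freshness_window_s: int = 60) -> dict:
    """Return one winning update per trip_id.

    Candidates older than freshness_window_s relative to the newest update
    for the same trip are dropped. Among the rest, the most authoritative
    source wins, with the newer timestamp as the tie-breaker. Exact
    duplicates collapse because only one winner is kept per trip.
    """
    by_trip = defaultdict(list)
    for update in updates:
        by_trip[update.trip_id].append(update)

    resolved = {}
    for trip_id, candidates in by_trip.items():
        newest = max(c.timestamp for c in candidates)
        fresh = [c for c in candidates if newest - c.timestamp <= freshness_window_s]
        resolved[trip_id] = min(fresh, key=lambda c: (_rank(c.source), -c.timestamp))
    return resolved
```

Grouping by trip first keeps the result independent of arrival order, which is one of the trade-offs worth probing when a candidate describes their own approach.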

11 Attitude Questions

1 of 11

Active Listening

The disciplined practice of fully concentrating on, comprehending, and accurately interpreting spoken information from technical peers, transit operators, and stakeholders to extract precise operational requirements, resolve conflicting constraints, and translate nuanced verbal feedback into resilient real-time data architectures without premature judgment or interruption.

Interview round: Recruiter Screen

How would you approach a kickoff call where agency engineers describe inconsistent GPS polling rates and legacy dispatch software quirks?

Positive indicators

Structures notes into actionable ingestion specs
Confirms understanding with operators in real time
Identifies fallback logic needs early

Negative indicators

Pushes standard polling expectations rigidly
Interrupts with premature technical solutions
Overlooks documented hardware constraints

Supporting Evaluation

How candidates earn the selection conversation

The goal is to reduce effort for everyone by collecting more useful signal before adding more interviews. Lightweight application prompts and structured screens help the panel focus live time on the candidates most likely to succeed.

Stage 1 · Application

Filter at the door

Runs the moment a candidate hits Submit. Disqualifying answers end the application; everything else is captured for review.

Knock-out Questions

1 of 2

Application Screen: Knock-out

Do you have at least 2 years of production experience building, validating, or maintaining GTFS and GTFS-Realtime data feeds?

Yes: Qualifies

No: Auto-decline

Video-Response Questions

1 of 3

Application Screen: Video Response

Describe how you would handle a situation where a major transit agency insists on bypassing our GTFS-RT validation gates to avoid a service alert during a known upstream vendor outage. What specific steps would you take to address their operational pressure while protecting feed integrity?

Candidate experience

1. Record
2. Review
3. Submit

Response time

2 min

Format

Recorded video

Stage 2 · Resume Screening

Read the resume against fixed criteria

Reviewers score every application that clears the door against the same criteria. Higher-scoring applications advance to live interviews; lower-scoring ones are archived without further screening.

Resume Review Criteria

8 criteria

Evidence of owning ingestion and delivery pipelines, implementing retry/backoff logic, and ensuring reliable real-time feed distribution.

Evidence of aligning dynamic trip updates with static GTFS schedules, handling operational edge cases, and building matching logic.

Evidence of configuring monitoring stacks, authoring operational runbooks, and facilitating post-incident reviews to protect feed SLAs.

Evidence of integrating automated validation gates into continuous integration pipelines to maintain spec compliance and reduce deployment friction.

Does the cover letter or personal statement convey clear relevance and familiarity with the job?

Does the resume indicate required academic credentials, relevant certifications, or necessary training?

Is the resume complete, well-organized, and free from formatting, spelling, and grammar mistakes?

Does the resume show relevant prior work experience?

Stage 3 · During Interviews

Where the hire is decided

Interview rounds use the competency and attitude questions outlined above, then add tests, work simulations, and presentations that reveal deeper evidence about how the candidate thinks and works.

Coding Test

Live Interview · Coding Test

Without AI

Write the solution independently. Focus on validation gates, quarantine routing, and clear diagnostic reporting. You will discuss trade-offs with the interviewer.
Extend the provided parser to validate incremental GTFS-RT updates against a mock schema. Implement a quarantine mechanism for non-compliant payloads that captures diagnostic metadata without blocking the main ingestion stream.

With AI

You may use an AI coding assistant. Critically evaluate its validation logic and quarantine design against GTFS-RT incremental update behaviors. Refine and justify your modifications.
Use an AI assistant to draft the validation and quarantine logic. Audit the output for incremental feed compatibility, diagnostic completeness, and pipeline resilience. Explain your refinements.

Response time

20 min

Positive indicators

Clear separation of validation logic from parsing
Structured quarantine with actionable metadata
Non-blocking ingestion flow for malformed payloads
Explicit handling of incremental update edge cases
Critical assessment of AI-generated validation rules for incremental feeds
Enhanced quarantine diagnostics with spec-aligned metadata
Clear pipeline isolation and non-blocking design
Explicit reasoning for AI modifications and trade-offs

Negative indicators

Halts ingestion on first validation failure
Quarantine lacks diagnostic context for debugging
Validation logic tightly coupled to parsing
No consideration for incremental vs full feed differences
Accepts AI validation logic without testing incremental edge cases
Quarantine design blocks main ingestion or lacks metadata
No justification for AI-driven architectural choices
Fails to address spec versioning or drift scenarios
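
To help calibrate scoring, the sketch below shows one possible shape for a passing answer. It is not the expected solution. Entities are plain dicts standing in for decoded protobuf messages, the validation rules are mock rules, and `QuarantineRecord`, `validate_entity`, and `ingest` are hypothetical names. It assumes, per the GTFS-Realtime reference, that `is_deleted` is only meaningful when the feed's incrementality is DIFFERENTIAL.

```python
import time
from dataclasses import dataclass, field


@dataclass
class QuarantineRecord:
    entity_id: str
    errors: list
    payload: dict  # original entity, kept for debugging
    quarantined_at: float = field(default_factory=time.time)


def validate_entity(entity: dict, is_incremental: bool) -> list:
    """Return a list of error strings; an empty list means compliant (mock rules)."""
    errors = []
    if not entity.get("id"):
        errors.append("missing entity id")
    if entity.get("is_deleted"):
        # Deletions carry no payload, so the trip_update checks are skipped.
        if not is_incremental:
            errors.append("is_deleted set on a full-dataset feed")
        return errors
    trip_update = entity.get("trip_update")
    if trip_update is None:
        errors.append("missing trip_update")
    elif not trip_update.get("trip", {}).get("trip_id"):
        errors.append("trip_update.trip.trip_id is empty")
    return errors


def ingest(entities, is_incremental: bool = False):
    """Validate each entity separately so one bad payload never blocks the rest."""
    accepted, quarantine = [], []
    for entity in entities:
        try:
            errors = validate_entity(entity, is_incremental)
        except Exception as exc:  # a validator crash must not stall ingestion
            errors = [f"validator crashed: {exc!r}"]
        if errors:
            quarantine.append(QuarantineRecord(str(entity.get("id", "")), errors, entity))
        else:
            accepted.append(entity)
    return accepted, quarantine
```

Validation lives in its own function, separate from parsing, and each quarantine record keeps the error list, the original payload, and a capture time. Those map directly to the first three positive indicators above.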

Presentation Prompt

Prepare a short deck walking us through your approach to designing a resilient stream processing pipeline that transforms raw transit telemetry into real-time GTFS-RT updates under strict SLA requirements. Discuss how you would handle upstream vendor outages, implement retry/backoff strategies, and maintain downstream feed consistency. Walk the panel through your architecture, operational safeguards, and monitoring approach.

Format

Deck and walkthrough · 20 min · ~2 hr prep

Audience

Engineering hiring panel and senior reliability engineer

What to prepare

3-5 slides outlining your pipeline architecture, resilience patterns, and monitoring strategy.
Notes on how you would align pipeline design with specific SLA targets and downstream consumer needs.

Deliverables

A short conceptual deck and structured walkthrough of your pipeline design and operational safeguards.

Ground rules

Focus on your reasoning and past experience patterns; conceptual architecture is sufficient.
Do not build a production-ready diagram or write deployment scripts.
Use only work you are permitted to share or discuss hypothetically.

Scoring anchors

Exceeds: Presents a highly resilient, well-architected pipeline with explicit failure-mode handling, clear SLA alignment, and practical monitoring/rollback strategies that protect downstream consumers.
Meets: Provides a coherent pipeline design with standard retry logic and basic monitoring, adequately addressing SLA requirements and data consistency.
Below: Proposes a fragile or overly simplistic pipeline, overlooks upstream outage handling, or fails to connect architecture to SLA constraints.

Response time

20 min

Positive indicators

Clearly maps out data flow and explicit failure points.
Proposes concrete retry/backoff and fallback mechanisms tied to SLA targets.
Anticipates downstream consumer impact and designs monitoring/alerting accordingly.
Balances throughput optimization with data consistency guarantees.

Negative indicators

Overcomplicates the architecture without clear justification or operational need.
Ignores upstream failure scenarios or lacks concrete recovery strategies.
Fails to connect design choices to specific SLA or latency constraints.
Presents disjointed steps without a cohesive pipeline narrative.

Work Simulation Scenario

Scenario. You own the end-to-end data pipeline for a multi-agency GTFS-RT feed. Downstream transit apps are reporting prediction drift and intermittent stale updates during peak commute hours. You must diagnose the stream processing bottlenecks and design an optimization strategy to restore reliability.

Problem to solve. Identify the root cause of latency and staleness in the real-time transformation pipeline, then design a resilient stream processing architecture with appropriate retry, backoff, and monitoring mechanisms.

Format

Discovery interview · 35 min · ~2 hr prep

Success criteria

Ask diagnostic questions to isolate latency vs. staleness vs. backpressure
Propose a structured optimization strategy covering stream processing, error handling, and observability
Define measurable success criteria and rollback safeguards

What to review beforehand

Kafka/Kinesis stream processing fundamentals
Exponential backoff and retry pattern design
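
As a refresher on the retry pattern named above, here is a minimal sketch of exponential backoff with full jitter. The `UpstreamUnavailable` exception, the `fetch` callable, and the default base, cap, and attempt values are hypothetical and would need tuning against real SLA targets.

```python
import random
import time


class UpstreamUnavailable(Exception):
    """Raised by the caller's fetch function on a retryable error such as HTTP 503."""


def backoff_delay(attempt: int, base_s: float = 0.5, cap_s: float = 30.0,
                  rng=random.random) -> float:
    """Full-jitter delay: a random fraction of min(cap_s, base_s * 2**attempt)."""
    return rng() * min(cap_s, base_s * (2 ** attempt))


def fetch_with_backoff(fetch, max_attempts: int = 5, sleep=time.sleep,
                       **backoff_kwargs):
    """Call fetch() until it succeeds or max_attempts is exhausted."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except UpstreamUnavailable:
            if attempt == max_attempts - 1:
                raise  # give up and surface the failure to the caller
            sleep(backoff_delay(attempt, **backoff_kwargs))
```

The random factor spreads retries from many workers across the delay window, so they do not all hit the recovering vendor API at the same instant. The cap keeps the wait bounded during long outages.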

Ground rules

Drive a structured discovery conversation to map the problem space before jumping to solutions.
The interviewer will provide honest answers to your questions but will not volunteer system details unprompted.

Roles in scenario

SRE / Data Platform Lead (informed partner, played by a peer)

Motivation. Restore feed reliability without introducing cascading failures or increasing infrastructure costs unnecessarily.

Constraints

Peak traffic increases payload volume by 3x within 30 minutes
Downstream consumers enforce strict 5-second update intervals
Current pipeline uses Python-based stream processors with limited horizontal scaling

Tensions to introduce

Upstream vendor APIs occasionally return 503s, causing backpressure
Existing retry logic lacks exponential backoff, leading to thundering herds
Monitoring metrics only track throughput, not end-to-end latency or staleness
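
For the interviewer playing this role, the sketch below illustrates the gap in the last tension: throughput says nothing about how old the data is when it reaches consumers. It assumes three timestamps are available per update, and treats the 5-second consumer interval from the constraints as a freshness budget. The `FreshnessSample` and `diagnose` names are hypothetical, and the budget reading is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FreshnessSample:
    source_ts: float     # when the vehicle or vendor generated the data
    received_at: float   # when the pipeline ingested it
    published_at: float  # when it appeared in the output feed

    @property
    def source_age_s(self) -> float:
        # Staleness that already existed when the data arrived.
        return self.received_at - self.source_ts

    @property
    def pipeline_latency_s(self) -> float:
        # Time spent inside the pipeline itself.
        return self.published_at - self.received_at

    @property
    def end_to_end_s(self) -> float:
        return self.published_at - self.source_ts


def diagnose(sample: FreshnessSample, budget_s: float = 5.0) -> str:
    """Label a sample so upstream staleness and pipeline latency are not conflated."""
    if sample.end_to_end_s <= budget_s:
        return "ok"
    if sample.source_age_s >= sample.pipeline_latency_s:
        return "stale_upstream"
    return "slow_pipeline"
```

A candidate who asks for something like the split between `source_age_s` and `pipeline_latency_s` is separating latency from staleness, which is the first success criterion.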

In-character guidance

Provide precise technical details only when asked
Confirm or clarify candidate assumptions about system behavior
Highlight operational constraints when discussing scaling or retry strategies

Do not

Do not reveal the exact backpressure configuration or missing metrics unless asked
Do not suggest specific tuning parameters or architectural patterns
Do not validate or invalidate the candidate's approach prematurely

Scoring anchors

Exceeds: Methodically isolates bottlenecks through targeted questions, designs a resilient, observable stream architecture, and explicitly addresses backpressure, retry safety, and rollback paths.
Meets: Identifies likely latency and staleness causes, proposes reasonable retry and monitoring improvements, and acknowledges infrastructure constraints.
Below: Guesses at root causes, suggests untested or risky scaling/retry patterns, or neglects observability and rollback planning.

Response time

35 min

Positive indicators

Asks high-information diagnostic questions to isolate latency sources, backpressure triggers, and metric gaps
Designs a resilient stream processing strategy with explicit retry/backoff logic and observability hooks
Balances performance optimization with infrastructure constraints and rollback safety
Articulates clear success metrics and validation steps before implementation

Negative indicators

Jumps to scaling solutions without diagnosing root causes or metric gaps
Proposes aggressive retry logic without considering thundering herd or backpressure risks
Overlooks end-to-end latency tracking and staleness detection
Fails to define rollback safeguards or measurable success criteria

Progression Framework

This table shows how competencies evolve across experience levels. Each cell describes what the competency looks like at that level.

Data Ingestion & Feed Architecture

4 competencies

Competency	Junior	Mid	Senior	Principal
Feed Aggregation & Multi-Source Fusion	Merges duplicate trip updates from overlapping feeds using simple timestamp and ID matching rules to produce clean, unified datasets.	Develops conflict-resolution logic to prioritize authoritative data sources when aggregating multi-operator feeds.	Builds scalable multi-source fusion systems that normalize disparate data formats into a unified, consistent GTFS-Realtime output.	Architects regional data exchange frameworks that enable seamless, real-time federation of GTFS-Realtime feeds across dozens of transit agencies and mobility providers.
Feed Parsing & Schema Validation	Parses GTFS-Realtime protobuf payloads and runs basic schema validators against specification rules to ensure data contract compliance.	Implements custom validation pipelines to handle malformed entities and aligns realtime updates with static GTFS references.	Designs robust ingestion architectures that auto-heal schema drifts and enforce strict data contracts across multiple agency feeds.	Defines enterprise-wide schema governance strategies and leads specification evolution initiatives to standardize cross-platform feed interoperability across diverse transit data sources.
Predictive ETA & Disruption Modeling	Applies baseline statistical formulas to calculate estimated arrival times from historical and current feed data, supporting basic trip update generation.	Integrates machine learning models into the pipeline to adjust ETAs based on traffic, weather, and historical delay patterns.	Engineers predictive disruption models that simulate cascading delays and proactively recommend schedule adjustments.	Leads advanced research on mobility simulation algorithms and integrates multi-modal predictive capabilities into core routing engines to enhance network resilience.
Stream Processing & Real-Time Transformation	Consumes real-time telemetry streams and applies basic transformations using established processing frameworks to normalize vehicle and trip data.	Configures stream processing jobs to enrich vehicle positions and trip updates with low-latency requirements.	Architects fault-tolerant streaming topologies that dynamically scale to handle peak transit events and backpressure scenarios.	Pioneers next-generation stream processing paradigms and optimizes global data mesh architectures to achieve sub-second real-time transit analytics at continental scale.

System Operations & Ecosystem Integration

4 competencies

Competency	Junior	Mid	Senior	Principal
Alert Lifecycle & Incident Management	Configures alert generation rules and routes basic service disruption notifications to downstream systems and on-call personnel.	Implements alert lifecycle workflows, managing severity escalation, resolution tracking, and consumer subscription filtering.	Engineers automated incident response pipelines that correlate feed anomalies with real-world operational alerts and dispatch remediation actions.	Establishes enterprise-wide incident governance frameworks and integrates AI-driven anomaly detection to enable proactive transit network resilience.
Developer Ecosystem & API Gateway Integration	Maintains API documentation and assists developers with basic authentication and query syntax for transit data endpoints.	Develops SDKs, rate-limiting policies, and developer portals to streamline third-party application integration with real-time feeds.	Architects comprehensive API gateways that support GraphQL/REST, webhook subscriptions, and secure developer ecosystem management.	Drives strategic platform partnerships and designs open mobility data ecosystems that foster third-party innovation and cross-vendor transit tech integration.
Enterprise Governance & Compliance Monitoring	Applies standard access controls and monitors basic audit logs for GTFS-Realtime feed distribution systems to ensure security hygiene.	Implements PII redaction pipelines, role-based access controls, and compliance checks aligned with transit data privacy regulations.	Designs enterprise governance frameworks that automate regulatory reporting, data lineage tracking, and cross-agency compliance enforcement.	Champions industry-wide data sovereignty standards, advises regulatory bodies, and establishes secure, auditable mobility data exchange protocols.
High-Availability Feed Distribution & Caching	Deploys and monitors standard HTTP endpoints and caches to serve validated GTFS-Realtime feeds to consumers with low-latency requirements.	Optimizes CDN configurations and implements tiered caching strategies to reduce latency and handle traffic spikes.	Designs highly available, globally distributed feed distribution networks with automated failover and rigorous SLA enforcement.	Defines strategic distribution architectures for edge-computing transit networks, ensuring zero-downtime, sub-second data delivery at continental scale.
