UX Researcher

Ryan Mahoney

Why this role is hard · Ryan Mahoney

The hard part isn't finding a researcher or an engineer. It is finding someone who can handle both responsibilities without dropping one when pressure mounts. You need a candidate who builds evaluation systems that work with real user data while sticking to strict privacy rules. Most people fail because they treat human feedback like just another dataset instead of considering the person behind it. The right candidate can keep product teams on the same page without losing accuracy and will tell you when the data is too messy to trust.

Core Evaluation

Critical questions for this role

The competency and attitude questions below are where the hiring decision is made. They run in the live interview rounds and are calibrated to the level selected above.

16 Competency Questions

1 of 16
  1. Discipline

    UX Research & Operations

  2. Job requirement

    Analysis & Synthesis

    Synthesizes findings into coherent narratives and identifies root causes, bridging qualitative insight with scalable data pipelines for actionable product recommendations.

  3. Expected at Mid

    Independent synthesis of mixed-methods data into coherent narratives and root-cause identification is core to driving cost-per-insight efficiency and actionable product recommendations. Failing to bridge qualitative insights with scalable data pipelines risks superficial findings, missed cross-source patterns, and delayed engineering decisions.

Interview round: Hiring Manager Portfolio Review

You have a large dataset of interview transcripts but limited time. What is your process?

Positive indicators

  • Mentions affinity mapping or tagging
  • Describes sampling strategy
  • Sets clear timeboxes for analysis

Negative indicators

  • Attempts to read every word linearly
  • No clear system for organizing data
  • Misses deadline due to perfectionism

13 Attitude Questions

1 of 13

Active Listening

The intentional cognitive and behavioral process of fully concentrating on, understanding, and responding to a speaker, characterized by withholding immediate judgment, validating spoken and unspoken cues, and accurately reflecting content to ensure alignment between user realities, stakeholder constraints, and technical possibilities.

Interview round: Recruiter Screen

Walk me through a research session where a participant said something unexpected. How did you handle it?

Positive indicators

  • Describes adapting questions in real-time
  • Mentions noticing participant discomfort or enthusiasm
  • Connects unexpected input to insights

Negative indicators

  • Sticks rigidly to the script despite participant cues
  • Dismisses unexpected input as off-topic
  • Cannot recall specific unexpected moments

Supporting Evaluation

How candidates earn the selection conversation

The goal is to reduce effort for everyone by collecting more useful signal before adding more interviews. Lightweight application prompts and structured screens help the panel focus live time on the candidates most likely to succeed.

Stage 1 · Application

Filter at the door

Runs the moment a candidate hits Submit. Disqualifying answers end the application; everything else is captured for review.

Knock-out Questions

1 of 3

Application Screen: Knock-out

Do you have formal training and documented experience designing research protocols that comply with IRB standards and PHI/HIPAA regulations?

  • Yes: Qualifies
  • No: Auto-decline

Video-Response Questions

1 of 3

Application Screen: Video Response

A product manager pressures you to bypass ethical review steps to meet an imminent launch deadline and requests access to identifiable user session recordings outside your approved consent protocol. Walk us through how you would handle this conversation, including how you set boundaries while maintaining the partnership.

Candidate experience

  1. Record
  2. Review
  3. Submit

Response time

2 min

Format

Recorded video

Stage 2 · Resume Screening

Read the resume against fixed criteria

Reviewers score every application that clears the door against the same criteria. Stronger reviews advance to live interviews; weaker ones are archived without further screening.

Resume Review Criteria

8 criteria

  • Evidence of selecting and implementing appropriate research methods, A/B tests, or evaluation harnesses to answer specific product or model performance questions.
  • Evidence of building, refining, and quality-controlling datasets used for model training, evaluation, or RLHF pipelines, including annotation workflow design.
  • Evidence of combining telemetry, survey data, and interview insights into structured decision briefs, usability scorecards, or sprint-ready recommendations.
  • Evidence of partnering with engineering and design to instrument features, hosting research office hours, or coaching product squads on lightweight research methods.
  • Does the resume show relevant prior work experience?
  • Does the cover letter or personal statement convey clear relevance and familiarity with the job?
  • Does the resume indicate required academic credentials, relevant certifications, or necessary training?
  • Is the resume complete, well-organized, and free from formatting, spelling, and grammar mistakes?

Stage 3 · During Interviews

Where the hire is decided

Interview rounds use the competency and attitude questions outlined above, then add tests, work simulations, and presentations that reveal deeper evidence about how the candidate thinks and works.

Presentation Prompt

Prepare a short deck walking us through an evaluation stream you owned. Discuss how you translated ambiguous user sentiment into weighted metrics for engineering, how you aligned timelines across squads, and what trade-offs you made between qualitative nuance and quantitative rigor.

Format

Deck and walkthrough · 20 min · ~2 hr prep

Audience

Research Engineering Leads, Product Managers, and Data Scientists

What to prepare

  • A 3-5 slide deck summarizing your chosen evaluation stream
  • Focus on your methodology, stakeholder communication cadence, and final impact

Deliverables

  • A concise presentation of your past evaluation stream
  • A discussion defending your metric translation and stakeholder alignment choices

Ground rules

  • Use only work you are permitted to share; anonymize sensitive data if necessary.
  • Keep slides focused on your decision-making process, not just final outputs.

Scoring anchors

  • Exceeds: Seamlessly bridges qualitative nuance and quantitative engineering metrics, demonstrating exceptional stakeholder alignment and clear, defensible trade-off rationale.
  • Meets: Walks through a coherent evaluation stream with clear methodology, reasonable metric translation, and documented stakeholder communication steps.
  • Below: Struggles to explain how insights informed engineering metrics, lacks clear stakeholder communication strategy, or presents disconnected data without synthesis.

Response time

20 min

Positive indicators

  • Clearly articulates how qualitative insights were operationalized into engineering metrics
  • Demonstrates structured stakeholder communication throughout the project lifecycle
  • Surfaces and defends trade-offs between speed, rigor, and qualitative nuance
  • Shows how feedback loops were maintained and adapted post-delivery

Negative indicators

  • Presents data without explaining the translation from sentiment to metrics
  • Ignores stakeholder alignment challenges or timeline constraints
  • Fails to check for understanding or adapt communication to technical audiences
  • Uses jargon without clarifying how it maps to product decisions

Work Simulation Scenario

Scenario. Your team is launching an evaluation stream for a new AI-driven UX feature. Engineering wants rapid A/B testing with telemetry, Product wants deep qualitative interviews to understand user intent, and Data Science insists on a controlled statistical evaluation before rollout. You must facilitate a decision on the primary evaluation methodology and resource allocation for the next quarter.

Problem to solve. Drive a cross-functional tradeoff discussion to align on a unified evaluation strategy, secure resource commitments, and establish governance for the evaluation stream.

Format

Cross-functional decision · 40 min · ~2 hr prep

Success criteria

  • Consensus on a primary methodology with clear tradeoffs documented
  • Agreed-upon resource allocation and timeline
  • Established guardrails for data quality and ethical review
  • Clear next steps for cross-functional handoff

What to review beforehand

  • Overview of mixed-methods evaluation frameworks
  • Company's standard cross-functional alignment process

Ground rules

  • Facilitate the discussion; do not dominate it
  • Focus on aligning incentives and documenting decisions
  • You may propose compromises but must earn stakeholder buy-in

Roles in scenario

Engineering Lead (cross-functional partner, played by a cross-functional interviewer)

Motivation. Wants to ship telemetry quickly to unblock the release pipeline and minimize manual research overhead.

Constraints

  • Limited engineering bandwidth for custom instrumentation
  • Must maintain system performance under load
  • Cannot delay launch by more than 2 weeks

Tensions to introduce

  • Resists qualitative methods as 'too slow'
  • Questions the ROI of deep user studies
  • Pushes for fully automated evaluation

In-character guidance

  • Be pragmatic about technical feasibility
  • Push back on requests that impact system stability
  • Relent if a clear, automated guardrail is proposed

Do not

  • Do not agree to everything immediately
  • Do not derail the conversation with unrelated technical details
  • Do not solve the alignment problem for the candidate
  • Do not withhold technical constraints when asked directly

Product Manager (skeptical stakeholder, played by a member of leadership)

Motivation. Needs to understand user intent and emotional response to justify the feature's roadmap placement.

Constraints

  • Must deliver user insights within 4 weeks
  • Cannot compromise on understanding edge-case user frustration
  • Limited budget for external research vendors

Tensions to introduce

  • Skeptical of purely quantitative metrics capturing nuance
  • Pushes for broader participant scope than engineering allows
  • Worries about misinterpreting telemetry as intent

In-character guidance

  • Advocate strongly for qualitative depth
  • Challenge assumptions that data alone tells the full story
  • Accept a mixed-method approach if qualitative phase is prioritized

Do not

  • Do not concede without clear justification
  • Do not introduce new constraints late in the discussion
  • Do not coach the candidate on stakeholder management
  • Do not dominate the conversation without allowing facilitation

Scoring anchors

  • Exceeds: Skillfully navigates competing priorities, synthesizes a hybrid methodology that satisfies all parties, and establishes clear operational governance.
  • Meets: Facilitates a productive discussion, reaches a workable compromise, and documents agreed-upon tradeoffs and timelines.
  • Below: Struggles to manage conflicting incentives, defaults to a single methodology without consensus, or leaves action items ambiguous.

Response time

40 min

Positive indicators

  • Facilitates structured tradeoff discussion, surfacing competing incentives
  • Translates qualitative and quantitative needs into a cohesive evaluation framework
  • Secures explicit commitments on resource allocation and timeline
  • Documents decisions and clarifies operational governance next steps

Negative indicators

  • Allows one stakeholder to dominate the conversation without balancing others
  • Fails to document decisions or clarify next steps
  • Imposes a methodology without addressing core constraints
  • Avoids addressing conflicting incentives or defers decision to leadership

Progression Framework

The progression below shows how competencies evolve across experience levels (Junior, Mid, Senior, Principal). Each entry describes the competency at that level.

UX Research & Operations

7 competencies

Analysis & Synthesis

  • Junior: Tags data and identifies basic themes using provided frameworks, processing raw data to surface initial patterns.
  • Mid: Synthesizes findings into coherent narratives and identifies root causes, bridging qualitative insight with scalable data pipelines for actionable product recommendations.
  • Senior: Connects insights across multiple studies to inform product strategy.
  • Principal: Develops new synthesis frameworks and drives industry-level insight generation, creating novel evaluation paradigms for AGI systems.

Data Collection & Method Execution

  • Junior: Moderates simple sessions and collects data using established scripts, ensuring consistency and quality in data gathering.
  • Mid: Moderates complex sessions and adapts scripts in real-time based on participant responses, ensuring data quality across qualitative and quantitative collection methods.
  • Senior: Oversees data collection quality across multiple studies and trains junior researchers.
  • Principal: Establishes best practices for data collection integrity across the organization, ensuring research quality standards for high-stakes AI safety studies.

Domain Adaptation & Emerging Technology

  • Junior: Learns domain-specific terminology and assists in specialized studies, building contextual knowledge for domain-specific research execution.
  • Mid: Executes research in specialized domains using adapted methodologies, applying research methods to AI/ML contexts and evaluating emerging technology impact on UX.
  • Senior: Leads research for emerging technology products and defines domain-specific best practices.
  • Principal: Anticipates industry shifts and defines research strategy for new technology categories, particularly human-AI interaction paradigms.

Ethics, Privacy & Compliance

  • Junior: Follows consent protocols and anonymizes data as instructed, ensuring research activities meet legal and ethical standards.
  • Mid: Reviews study plans for compliance risks and manages consent forms, ensuring research activities meet legal, ethical, and privacy standards for human evaluation frameworks.
  • Senior: Advises product teams on ethical implications of research and data usage.
  • Principal: Sets organizational ethics standards and liaises with legal/compliance teams, providing existential safety and ethics leadership for human-AI research.

Operational Governance

  • Junior: Organizes files and maintains research repositories according to team standards, supporting workflow efficiency and auditability.
  • Mid: Optimizes workflows and manages tool subscriptions for the team, driving pipeline latency reduction and research velocity improvement through automation and repository management.
  • Senior: Designs operational systems to scale research output and reduce redundancy.
  • Principal: Defines organizational research ops strategy and budget allocation, enabling scalable research infrastructure for AGI safety programs.

Reporting & Stakeholder Communication

  • Junior: Creates summary reports and presents findings to immediate team, communicating research outcomes through documented artifacts.
  • Mid: Tailors communication styles for different stakeholders and facilitates workshops, enabling metric adoption by engineering teams and research velocity improvement.
  • Senior: Influences product roadmap through persuasive storytelling and executive presentations.
  • Principal: Establishes communication standards and advocates for research impact at the board level, influencing industry-wide adoption of alignment metrics.

Research Planning & Fundamentals

  • Junior: Assists in drafting research plans and screening participants under guidance, supporting study logistics aligned with product goals.
  • Mid: Independently creates research plans and selects methods for standard product questions, aligning evaluation protocols with product goals and ML team requirements.
  • Senior: Designs complex longitudinal studies and aligns research strategy with business objectives.
  • Principal: Defines organizational research frameworks and mentors others on strategic planning, establishing long-term research direction aligned with AGI alignment goals.