Web app · Datadog · Software Engineer Intern · 2022

Internal on-call handoff dashboard

Shared by M. Chen · ex-Datadog SWE Intern

The sharer presented this exact project in their Datadog return-offer conversation and went on to receive a return offer.

4 real follow-ups from the actual loop · 1 hard · ~12 min

How they told it

An internal tool that let on-call engineers see the state of a service handoff at a glance instead of digging through three separate systems.

My team's on-call handoffs were a mess. The outgoing engineer would paste a Slack summary, but active incidents, recent deploys, and muted alerts all lived in different places, so people missed context. My intern project was a small internal dashboard that pulled those three things into one page.

I built the frontend in React and a thin Node/Express backend that called our internal incident API and the deploy service, then cached the responses in Redis with a 60-second TTL so we didn't hammer those APIs on every refresh. The data model was basically one 'handoff' view assembled from three sources at request time; I deliberately didn't store it because the source-of-truth systems already did.

The hardest part wasn't the code; it was figuring out what on-call people actually looked at first, so I shadowed two handoffs and cut the page down to the four things they named. It shipped to my team of about nine engineers by the end of the internship, and two other teams asked to try it. I kept it read-only on purpose so it would not become a system people depended on for writes.
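The cache-then-assemble approach in this telling can be sketched roughly as follows. This is a minimal illustration in TypeScript, not the sharer's code. Every name in it (`TtlCache`, `HandoffView`, `Sources`, `buildHandoffView`) is hypothetical. An in-memory map stands in for Redis, and the three fetchers are synchronous placeholders for what would really be async calls to the upstream systems. The clock is injectable so that expiry can be exercised without waiting 60 seconds.

```typescript
// Sketch only: an in-memory stand-in for the Redis cache with a TTL.
// All names are hypothetical; the original service is not shown in the source.

type Clock = () => number; // current time in milliseconds

class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private now: Clock;

  constructor(ttlMs: number, now: Clock = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Return the cached value while it is fresh; otherwise call load(),
  // store the result with a new expiry time, and return it.
  getOrLoad(key: string, load: () => V): V {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.value;
    }
    const value = load();
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
    return value;
  }
}

// The three things the dashboard shows on one page.
interface HandoffView {
  activeIncidents: string[];
  recentDeploys: string[];
  mutedAlerts: string[];
}

// Placeholder fetchers for the three upstream systems (assumed shapes).
interface Sources {
  fetchIncidents: () => string[];
  fetchDeploys: () => string[];
  fetchMutedAlerts: () => string[];
}

// Assemble the view at request time. Nothing is stored beyond the cache TTL,
// so the upstream systems remain the source of truth.
function buildHandoffView(
  team: string,
  cache: TtlCache<string[]>,
  src: Sources
): HandoffView {
  return {
    activeIncidents: cache.getOrLoad(`incidents:${team}`, src.fetchIncidents),
    recentDeploys: cache.getOrLoad(`deploys:${team}`, src.fetchDeploys),
    mutedAlerts: cache.getOrLoad(`muted:${team}`, src.fetchMutedAlerts),
  };
}
```

With a 60-second TTL (`new TtlCache<string[]>(60000)`), repeated page refreshes inside the window reuse the cached lists, and the first request after expiry goes back to the upstream systems. Caching each source under its own key is a choice made for this sketch; it keeps one slow or failing source from invalidating the other two.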

What they actually got asked

Why cache in Redis with a TTL instead of just calling the APIs live?

medium

You assemble the handoff view from three systems at request time. What breaks if one of them is down?

hard

Why React for an internal tool nine people use?

easy

At 10x the teams, what would you change?

medium