Normalizing Flight Statuses Across 7 Languages: What I Learned Building a Global Airport API

Source: DEV Community
When you scrape flight data from airport websites across 85+ countries, you quickly discover that nobody agrees on anything. Not the data format, not the field names, not the status strings. "Departed" is SAL, FLY, CER, AIR, DEP, DEPARTED, Departed, Отправлен, Väljunud, DESPEGADO, gestartet, or partito, depending on which airport website you're looking at.

Building MyAirports meant writing a status normalizer that handles all of these and maps them to a single, predictable enum. Here's how it works and what surprised me along the way.

The target schema

Every flight the API returns has a status field from this set:

scheduled | boarding | departed | arrived | delayed | cancelled | unknown

Seven values. Simple, predictable, usable in any frontend without additional mapping. The hard part is getting there from the real world.

Layer 1: Exact string matches

The first layer is the most straightforward, a lookup table of known status strings to their canonical values:

const EXACT = { // Eng
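A minimal sketch of such an exact-match layer, assuming nothing about the author's actual `EXACT` table beyond the "departed" strings quoted earlier. The names `STATUSES`, `DEPARTED_STRINGS`, `EXACT_SKETCH`, and `normalizeExact` are hypothetical. Trimming whitespace and applying Unicode NFC normalization before the lookup are also assumptions, not something the article describes.

```javascript
// Hypothetical sketch, not the MyAirports implementation.

// The seven canonical statuses from the target schema.
const STATUSES = Object.freeze([
  "scheduled", "boarding", "departed", "arrived",
  "delayed", "cancelled", "unknown",
]);

// Raw strings the article lists as meaning "departed".
const DEPARTED_STRINGS = [
  "SAL", "FLY", "CER", "AIR", "DEP", "DEPARTED", "Departed",
  "Отправлен", "Väljunud", "DESPEGADO", "gestartet", "partito",
];

// Assumption: keys are stored NFC-normalized so that composed and
// decomposed forms of characters like "ä" compare equal.
const EXACT_SKETCH = new Map(
  DEPARTED_STRINGS.map((s) => [s.normalize("NFC"), "departed"])
);

// Returns a canonical status on an exact hit, or null on a miss so
// that a later layer (or a final "unknown" fallback) can take over.
function normalizeExact(raw) {
  if (typeof raw !== "string") return null;
  const key = raw.trim().normalize("NFC");
  return EXACT_SKETCH.get(key) ?? null;
}

// Usage: fall back to "unknown" when no layer recognizes the string.
const status = normalizeExact(" Väljunud ") ?? "unknown"; // "departed"
```

Returning `null` rather than `"unknown"` on a miss keeps this layer composable: the caller decides when to give up. The lookup stays case-sensitive on purpose, which is why `DEPARTED` and `Departed` are separate keys here.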