data engineering · 2026 · ● live

KitchenSync Food Forecasting System

End-to-end retail kitchen forecasting — FastAPI, Postgres, Snowflake, dbt, LightGBM

architecture

Retail kitchens waste food when production outpaces demand and miss revenue when they run short. Forecasting the right quantity per item per store, refreshed continuously, requires a real pipeline, not a spreadsheet. The project is modeled after the Kitchen Production System at Kwik Trip.

An async FastAPI ingest layer receives simulated POS events for 12 stores, each isolated in its own Neon Postgres schema. A Python extract script syncs all stores into Snowflake, where a three-layer dbt Core pipeline (staging → intermediate → marts) builds rolling feature tables. A LightGBM model trained on 2.7M synthetic events predicts the units to produce per item per store over the next hour. A Streamlit dashboard refreshes every 60 seconds and shows production queues and urgency flags.
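For a rough picture of how the simulated POS events could be generated, here is a minimal sketch of Poisson arrivals scaled by a time-of-day rush curve, plus a FIFO batch inventory with expiration. The StoreState name comes from the project. Everything else is an assumption made for illustration: the curve shape, the method names, the per-minute granularity, and a constant shelf life per item. The real simulator is async and is not reproduced here.

```python
import math
import random
from collections import deque


def rush_multiplier(hour: float) -> float:
    """Hypothetical rush curve: baseline 1.0 plus lunch and dinner bumps."""
    lunch = 2.0 * math.exp(-((hour - 12.0) ** 2) / 2.0)
    dinner = 1.5 * math.exp(-((hour - 18.0) ** 2) / 2.0)
    return 1.0 + lunch + dinner


def sample_poisson(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) count with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_hour(rng: random.Random, base_rate_per_min: float, hour: int) -> list[int]:
    """Units sold in each minute of the hour, with the rate scaled by the curve."""
    return [
        sample_poisson(rng, base_rate_per_min * rush_multiplier(hour + m / 60))
        for m in range(60)
    ]


class StoreState:
    """FIFO inventory for one item: batches of [expires_at, units], oldest first."""

    def __init__(self) -> None:
        self.batches: deque[list[int]] = deque()
        self.wasted = 0      # units that expired unsold
        self.stockouts = 0   # units of demand that could not be served

    def produce(self, now: int, units: int, shelf_life: int) -> None:
        self.batches.append([now + shelf_life, units])

    def expire(self, now: int) -> None:
        # Assumes a constant shelf life, so the oldest batch expires first.
        while self.batches and self.batches[0][0] <= now:
            self.wasted += self.batches.popleft()[1]

    def sell(self, now: int, units: int) -> None:
        self.expire(now)
        while units > 0 and self.batches:
            batch = self.batches[0]
            take = min(units, batch[1])
            batch[1] -= take
            units -= take
            if batch[1] == 0:
                self.batches.popleft()
        self.stockouts += units  # whatever demand is left went unserved
```

Selling from the oldest batch first means waste only accrues on batches that demand never reached, which is what ties the waste % and stockout marts back to production timing.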

  • 12 store schemas with per-store isolation; 2.7M historical events across 90 days
  • Three-layer dbt pipeline: 6 mart models covering production targets, item velocity, waste %, cold-start profiles, stockout summaries, store-level aggregates
  • Async POS simulator using Poisson arrivals and a time-of-day rush curve; StoreState class tracks FIFO batch inventory and expiration
  • LightGBM vs. scikit-learn RandomForest baseline — 190,773 predictions generated on held-out test data
  • Cold-start fallback to category-level mart averages for items with fewer than 4 data points
  • Single run_pipeline.py orchestrates the full extract → dbt → train → predict cycle
  • Urgency flag fires when sell-through exceeds 2x historical average (configurable threshold)
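The cold-start fallback and the urgency flag in the list above are both simple decision rules, and can be sketched as below. The thresholds (fewer than 4 data points, 2x historical average) come from the list. The function names, the arguments, and the handling of a zero historical average are assumptions for illustration, not the project's actual code.

```python
MIN_DATA_POINTS = 4        # below this, an item counts as cold-start
URGENCY_MULTIPLIER = 2.0   # configurable threshold for the urgency flag


def choose_forecast(n_data_points: int, model_prediction: float,
                    category_average: float) -> float:
    """Use the category-level average when an item has too little history."""
    if n_data_points < MIN_DATA_POINTS:
        return category_average
    return model_prediction


def is_urgent(sell_through: float, historical_average: float,
              multiplier: float = URGENCY_MULTIPLIER) -> bool:
    """Flag when current sell-through exceeds multiplier x historical average."""
    if historical_average <= 0:
        # Assumption: with no usable history there is nothing to compare against.
        return False
    return sell_through > multiplier * historical_average


# A new item with 3 observations falls back to its category average.
new_item_forecast = choose_forecast(3, model_prediction=14.0, category_average=9.0)
# An established item keeps the model's prediction.
known_item_forecast = choose_forecast(40, model_prediction=14.0, category_average=9.0)
```

Keeping the multiplier as a parameter mirrors the configurable threshold: a store could tighten it to 1.5 without touching the rule itself.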
Python · FastAPI · PostgreSQL (Neon) · Snowflake · dbt Core · LightGBM · scikit-learn · Streamlit · asyncio / httpx · PyArrow