KitchenSync Food Forecasting System
End-to-end retail kitchen forecasting — FastAPI, Postgres, Snowflake, dbt, LightGBM
Retail kitchens waste food when production outpaces demand and miss revenue when they run short. Forecasting the right quantity per item per store, refreshed continuously, requires a real pipeline, not a spreadsheet. The design is modeled after the Kitchen Production System at Kwik Trip.
An async FastAPI ingest layer receives simulated POS events for 12 stores, each isolated in its own Neon Postgres schema. A Python extract script syncs all stores into Snowflake, where a three-layer dbt Core pipeline (staging → intermediate → marts) builds rolling feature tables. A LightGBM model trained on 2.7M synthetic events predicts the units to produce per item per store over the next hour. A Streamlit dashboard refreshes every 60 seconds and shows production queues and urgency flags.
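The simulated POS events come from Poisson arrivals shaped by a time-of-day rush curve, with batch inventory tracked first-in, first-out. The sketch below is one way that could look, not the project's actual simulator: `rush_multiplier`, `poisson_sample`, `SketchStoreState`, `simulate_day`, the peak hours, the 120-minute shelf life, and the fixed hourly batch of 8 units are all illustrative assumptions.

```python
import math
import random
from collections import deque
from dataclasses import dataclass, field


def rush_multiplier(hour: float) -> float:
    """Hypothetical rush curve: baseline 1.0 plus bumps near 07:00, 12:00, 18:00."""
    peaks = (7.0, 12.0, 18.0)  # assumed peak hours
    return 1.0 + sum(1.5 * math.exp(-((hour - p) ** 2) / 2.0) for p in peaks)


def poisson_sample(rng: random.Random, lam: float) -> int:
    """Draw from Poisson(lam) using Knuth's method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


@dataclass
class SketchStoreState:
    """Illustrative FIFO batch inventory for one item in one store."""
    shelf_life_min: int = 120                      # assumed shelf life
    batches: deque = field(default_factory=deque)  # [expires_at_min, units], oldest first
    wasted: int = 0
    stockouts: int = 0

    def produce(self, now_min: int, units: int) -> None:
        self.batches.append([now_min + self.shelf_life_min, units])

    def expire(self, now_min: int) -> None:
        # Oldest batches expire first; their leftover units count as waste.
        while self.batches and self.batches[0][0] <= now_min:
            self.wasted += self.batches.popleft()[1]

    def sell(self, now_min: int, units: int) -> int:
        """Serve demand from the oldest batch first; return units actually sold."""
        self.expire(now_min)
        sold = 0
        while units > 0 and self.batches:
            batch = self.batches[0]
            take = min(units, batch[1])
            batch[1] -= take
            units -= take
            sold += take
            if batch[1] == 0:
                self.batches.popleft()
        self.stockouts += units  # demand that could not be served
        return sold


def simulate_day(seed: int = 0, base_rate_per_min: float = 0.05):
    """Run one simulated day at minute resolution with a naive hourly batch."""
    rng = random.Random(seed)
    state = SketchStoreState()
    sold_total = 0
    for minute in range(24 * 60):
        if minute % 60 == 0:
            state.produce(minute, 8)  # assumed fixed batch size
        lam = base_rate_per_min * rush_multiplier(minute / 60)
        sold_total += state.sell(minute, poisson_sample(rng, lam))
    return state, sold_total
```

Calling `simulate_day(seed=1)` returns the final inventory state and the total units sold. Every unit produced ends up sold, wasted, or still on the shelf, which makes the sketch easy to sanity-check.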
- → 12 store schemas with per-store isolation; 2.7M historical events across 90 days
- → Three-layer dbt pipeline: 6 mart models covering production targets, item velocity, waste %, cold-start profiles, stockout summaries, store-level aggregates
- → Async POS simulator using Poisson arrivals and a time-of-day rush curve; StoreState class tracks FIFO batch inventory and expiration
- → LightGBM vs. scikit-learn RandomForest baseline — 190,773 predictions generated on held-out test data
- → Cold-start fallback to category-level mart averages for items with fewer than 4 data points
- → Single run_pipeline.py orchestrates the full extract → dbt → train → predict cycle
- → Urgency flag fires when sell-through exceeds 2x historical average (configurable threshold)
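The cold-start fallback and the urgency flag are both simple decision rules, sketched below. This is a minimal illustration rather than the project's code: the `ItemSnapshot` fields, the function names, the item and category names, and the treatment of a zero historical average are assumptions. Only the "fewer than 4 data points" cutoff and the 2x multiplier come from the list above.

```python
from dataclasses import dataclass
from typing import Dict, Optional

MIN_HISTORY_POINTS = 4    # from the text: fewer than 4 data points triggers the fallback
URGENCY_MULTIPLIER = 2.0  # from the text: 2x historical average, configurable


@dataclass
class ItemSnapshot:
    """Hypothetical per-item, per-store row as a dashboard might read it."""
    item_id: str
    category: str
    n_history_points: int
    model_prediction: Optional[float]   # model output; None if unavailable
    sell_through_last_hour: float
    historical_avg_sell_through: float


def units_to_produce(item: ItemSnapshot, category_avg: Dict[str, float]) -> float:
    """Use the model prediction unless the item lacks enough history."""
    if item.n_history_points < MIN_HISTORY_POINTS or item.model_prediction is None:
        return category_avg[item.category]  # category-level mart average
    return item.model_prediction


def is_urgent(item: ItemSnapshot, multiplier: float = URGENCY_MULTIPLIER) -> bool:
    """Flag items whose recent sell-through exceeds multiplier x the historical average."""
    if item.historical_avg_sell_through <= 0:
        return False  # assumption: no baseline means no flag
    return item.sell_through_last_hour > multiplier * item.historical_avg_sell_through


# Hypothetical example data
category_avg = {"hot_food": 6.0, "bakery": 10.0}
new_item = ItemSnapshot("breakfast_burrito", "hot_food", 2, 9.0, 1.0, 1.0)
known_item = ItemSnapshot("glazed_donut", "bakery", 40, 14.0, 9.0, 4.0)
```

Here `new_item` has only 2 data points, so `units_to_produce` falls back to the `hot_food` average. `known_item` keeps its model prediction and is flagged urgent because 9.0 exceeds 2 x 4.0.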