Business Metric-Aware Forecasting: A Case Study in Inventory
Authors: Helen Zhou, Sercan Arik, Jingtao Wang
Paper link: TBD
Abstract
Time-series forecasts play a critical role in business planning, and have proliferated throughout several industries. However, forecasts are often optimized for objectives that could be misaligned with downstream business objectives. Focusing on the inventory management setting, we demonstrate that optimization of conventional forecasting metrics does not necessarily lead to better downstream business performance. Although downstream business metrics of interest may not in general be differentiable, we derive an efficient procedure for computing and optimizing common inventory performance metrics in an end-to-end differentiable fashion. We further explore various trade-offs in cost which our method is able to adapt to, and show that in several cases, end-to-end optimization outperforms optimization of standard business-agnostic forecasting metrics.
Code Organization
This repository contains the source code for end-to-end optimization of both conventional forecasting and novel inventory-metric aware objectives.
The code is organized into three modules:
- data_formatting: dataset-specific pre-processing code, functions which create dataset objects for each dataset, and embedding code for the multivariate Favorita dataset.
- lib: core functionality, including the LSTM Encoder-Decoder architecture, the Naive Seasonal Scaler, differentiable computation of both forecasting and inventory metrics, and dataset classes which facilitate roll-forward training.
- utils: utility functions for evaluation, logging, pre-processing, memory management, and visualizing results.
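The roll-forward training mentioned for lib can be pictured as sliding a forecast origin through each series and pairing the history before the origin with the values after it. The sketch below is only an illustration under assumed names: `roll_forward_windows`, `lookback`, `horizon`, and `stride` are hypothetical and not taken from the repository, whose dataset classes may window the data differently.

```python
def roll_forward_windows(series, lookback, horizon, stride=1):
    """Yield (history, future) pairs, moving the forecast origin forward.

    Hypothetical helper for illustration; not the repository's dataset class.
    """
    last_origin = len(series) - horizon  # latest origin with a full target window
    for origin in range(lookback, last_origin + 1, stride):
        history = series[origin - lookback:origin]   # model input
        future = series[origin:origin + horizon]     # values to be forecast
        yield history, future


windows = list(roll_forward_windows([0, 1, 2, 3, 4, 5], lookback=3, horizon=2))
for history, future in windows:
    print(history, "->", future)
```

Each window reuses most of the previous window's history, which is why a dataset class that builds these pairs lazily can save memory on long series.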
The Python scripts in the outermost level of the repository are as follows:
- main.py: main script for running experiments. Highly customizable training of models under various optimization objectives, cost tradeoffs, etc. Allows saving of model checkpoints and predictions. Except for sktime_baselines.py, all other scripts in the outermost level take in saved predictions or model checkpoints from running main.py.
- eval_preds.py: efficient evaluation of predictions (allows for parallelization) under any number of settings of cost tradeoffs, among other configuration parameters.
- eval_simulation.py: evaluates predictions using both the simulation approach and the differentiable computation.
- test_later.py: given saved model checkpoints, loads these checkpoints, makes predictions, and runs evaluation.
- sktime_baselines.py: runs sktime baselines on the M3 and Favorita datasets.
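To make the idea of evaluating forecasts by inventory performance concrete, here is a minimal simulation sketch. It is not the repository's implementation: the ordering policy (order up to the forecast, with orders arriving immediately), the function names, and the exact cost definitions are all assumptions made for illustration. The three cost weights are named after the `--unit_holding_costs`, `--unit_stockout_costs`, and `--unit_var_o_costs` flags used in the example commands, and reading the last one as a weight on the variance of orders is likewise an assumption.

```python
from statistics import pvariance


def simulate_inventory(forecasts, demands, initial_inventory=0.0):
    """Simulate inventory levels under an assumed order-up-to-forecast policy."""
    inventory = initial_inventory
    orders, levels = [], []
    for forecast, demand in zip(forecasts, demands):
        order = max(forecast - inventory, 0.0)   # never place a negative order
        inventory = inventory + order - demand    # negative level = unmet demand
        orders.append(order)
        levels.append(inventory)
    return orders, levels


def total_cost(orders, levels, unit_holding_cost=1.0,
               unit_stockout_cost=1.0, unit_var_o_cost=0.0):
    """Weighted sum of holding, stockout, and order-variance terms (assumed form)."""
    holding = sum(max(level, 0.0) for level in levels)    # units left in stock
    stockout = sum(max(-level, 0.0) for level in levels)  # units of unmet demand
    return (unit_holding_cost * holding
            + unit_stockout_cost * stockout
            + unit_var_o_cost * pvariance(orders))


orders, levels = simulate_inventory([10, 12, 8], [9, 14, 8])
print(total_cost(orders, levels))                          # symmetric costs
print(total_cost(orders, levels, unit_stockout_cost=4.0))  # stockouts cost more
```

Changing the cost weights changes which forecast errors are expensive: with a higher stockout weight, the under-forecast in the second period dominates the total, which is the kind of tradeoff the cost flags control.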
Additionally, there are two bash scripts:
- run.sh: simple command that runs a quick example on a subset of the M3 data.
- run_all.sh: list of commands that train the naive seasonal model and LSTM on M3 and Favorita, under different cost objectives.
Preliminaries
First, install all packages in requirements.txt.
Running Experiments
Our experiments are conducted on the M3 monthly industry dataset (univariate) and the Favorita daily grocery dataset (multivariate). To run these experiments, download the data, run the relevant pre-processing script, and finally run the relevant training routine.
M3 Pipeline
- Create a folder named data/ in the parent folder of this repository, if it does not already exist. Create a subfolder data/m3/.
mkdir -p ../data/m3/
- Download the M3C.xls dataset from the M3 competition website, and move it into the data/m3/ folder.
- Run the M3 data preprocessing script:
python -m data_formatting.m3_preprocess
- You are now ready to run an experiment on M3 data. Below is an example command which trains a naive seasonal model on the entire dataset:
python -m main \
--dataset_name m3 \
--model_name naive_seasonal \
--optimization_objs total_cost \
--unit_holding_costs 1 \
--unit_stockout_costs 1 \
--unit_var_o_costs 0.000001 \
--nosweep \
--hidden_size 20 \
--max_steps 5 \
--learning_rate 0.0001 \
--num_workers 10 \
--no_safety_stock \
--project_name m3_example \
--return_test_preds
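The example above fixes a single cost tradeoff. One way to compare several tradeoffs is to generate one command per setting. The sketch below reuses only the flags and values from the M3 example and varies `--unit_stockout_costs`; the helper name `build_command`, the chosen stockout values, and the project-name suffix are illustrative assumptions. It prints the commands rather than running them.

```python
import shlex

# Flags and values copied from the M3 example command.
BASE_OPTIONS = [
    ("--dataset_name", "m3"),
    ("--model_name", "naive_seasonal"),
    ("--optimization_objs", "total_cost"),
    ("--unit_holding_costs", "1"),
    ("--unit_var_o_costs", "0.000001"),
    ("--hidden_size", "20"),
    ("--max_steps", "5"),
    ("--learning_rate", "0.0001"),
    ("--num_workers", "10"),
]
SWITCHES = ["--nosweep", "--no_safety_stock", "--return_test_preds"]


def build_command(unit_stockout_cost):
    """Return the argument list for one run (hypothetical helper)."""
    args = ["python", "-m", "main"]
    for flag, value in BASE_OPTIONS:
        args += [flag, value]
    args += ["--unit_stockout_costs", str(unit_stockout_cost)]
    # Illustrative naming scheme so each run gets its own project name.
    args += ["--project_name", f"m3_example_stockout_{unit_stockout_cost}"]
    return args + SWITCHES


commands = [build_command(cost) for cost in (1, 2, 4)]
for args in commands:
    print(shlex.join(args))  # paste into a shell, or pass args to subprocess.run
```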
Favorita Pipeline
- Create a folder named data/ in the parent folder of this repository, if it does not already exist. Create a subfolder data/favorita/.
mkdir -p ../data/favorita/
- Download favorita-grocery-sales-forecasting.zip from the Favorita competition website, and move it into ../data/favorita/.
- Run the Favorita preprocessing script (takes approximately 1.5 hrs):
python -m data_formatting.favorita_preprocess
- You are now ready to run an experiment on Favorita data. Below is an example command which trains a naive seasonal model on a subset of Favorita (1,000 series):
python -m main \
--dataset_name favorita \
--model_name naive_seasonal \
--optimization_objs mse \
--unit_holding_costs 1 \
--unit_stockout_costs 1 \
--unit_var_o_costs 0.01 \
--nosweep \
--hidden_size 64 \
--max_steps 1 \
--learning_rate 0.0001 \
--batch_size 100 \
--num_batches 20 \
--num_workers 5 \
--project_name favorita_example \
--N 1000 \
--single_rollout \
--no_safety_stock \
--save latest