Imitation Learning via Off-Policy Distribution Matching

Ilya Kostrikov, Ofir Nachum, Jonathan Tompson

Source code to accompany Imitation Learning via Off-Policy Distribution Matching.

If you use this code for your research, please consider citing the paper:

@inproceedings{
  Kostrikov2020Imitation,
  title={Imitation Learning via Off-Policy Distribution Matching},
  author={Ilya Kostrikov and Ofir Nachum and Jonathan Tompson},
  booktitle={International Conference on Learning Representations},
  year={2020},
  url={https://openreview.net/forum?id=Hyg-JC4FDr}
}

Install Dependencies

pip install -r requirements.txt

You will also need to install MuJoCo and have a valid license. Follow the MuJoCo install instructions.

Datasets are stored in Google Cloud Storage. Download them by running the following from the root google_research directory:

wget -P value_dice/datasets/ https://storage.googleapis.com/gresearch/value_dice/datasets/Ant-v2.npz
wget -P value_dice/datasets/ https://storage.googleapis.com/gresearch/value_dice/datasets/HalfCheetah-v2.npz
wget -P value_dice/datasets/ https://storage.googleapis.com/gresearch/value_dice/datasets/Hopper-v2.npz
wget -P value_dice/datasets/ https://storage.googleapis.com/gresearch/value_dice/datasets/Walker2d-v2.npz
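Once a file is downloaded, it can help to check what it contains before training. The sketch below lists the arrays stored in an .npz archive together with their shapes and dtypes, using only numpy.load. It makes no assumption about the array names inside the datasets; summarize_npz is a hypothetical helper, not part of this repository.

```python
import numpy as np


def summarize_npz(path_or_file):
    """Return {array_name: (shape, dtype_str)} for every array in an .npz archive."""
    summary = {}
    # np.load on an .npz returns a lazy NpzFile; the context manager closes
    # the underlying zip file when done. allow_pickle stays at its default
    # (False), so archives holding object arrays would raise on access.
    with np.load(path_or_file) as data:
        for name in data.files:
            array = data[name]
            summary[name] = (array.shape, str(array.dtype))
    return summary


# Example (the path assumes the wget commands were run from the repository root):
# for name, (shape, dtype) in summarize_npz("value_dice/datasets/HalfCheetah-v2.npz").items():
#     print(name, shape, dtype)
```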

Expert Trajectories:

Expert trajectories are generated using the GAIL code.

Running Training

From the root google_research directory, run:

python -m value_dice.train_eval \
--expert_dir ./value_dice/datasets/ \
--save_dir ./save/ \
--algo value_dice \
--env_name HalfCheetah-v2 \
--seed 42 \
--num_trajectories 1 \
--alsologtostderr
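
To repeat the command above for several seeds, the flags can be assembled programmatically. The sketch below builds the argument list from the flags shown above and, optionally, runs each command with subprocess. build_train_command and run_seeds are hypothetical helpers, not part of this repository, and expert_dir must point to wherever the .npz files were downloaded.

```python
import subprocess
import sys


def build_train_command(env_name, seed, expert_dir, save_dir="./save/",
                        num_trajectories=1):
    """Assemble the train_eval invocation using only the flags shown in this README."""
    return [
        sys.executable, "-m", "value_dice.train_eval",
        "--expert_dir", expert_dir,
        "--save_dir", save_dir,
        "--algo", "value_dice",
        "--env_name", env_name,
        "--seed", str(seed),
        "--num_trajectories", str(num_trajectories),
        "--alsologtostderr",
    ]


def run_seeds(env_name, seeds, expert_dir, dry_run=True):
    """Build one command per seed; run them sequentially unless dry_run is True."""
    commands = [build_train_command(env_name, seed, expert_dir) for seed in seeds]
    if not dry_run:
        for command in commands:
            # check=True raises CalledProcessError if a run exits with a non-zero status.
            subprocess.run(command, check=True)
    return commands
```

With dry_run=True (the default in this sketch), run_seeds("HalfCheetah-v2", [0, 1, 2], "value_dice/datasets/") only returns the commands, so they can be inspected before launching. Run it from the root google_research directory so that the value_dice module resolves.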

To reproduce the results, run:

sh value_dice/run_experiments.sh
