# The State of Sparsity in Deep Neural Networks

This directory contains the code accompanying the paper "The State of Sparsity in Deep Neural Networks". All authors contributed to this code.

The `layers` subdirectory contains implementations of variational dropout and l0 regularization in TensorFlow. The `sparse_transformer` and `sparse_rn50` subdirectories contain code for the Transformer and ResNet-50 experiments from the aforementioned paper. The `results` subdirectory contains CSV files with the results of all hyperparameter configurations that we explored for each model, sparsity technique, and sparsity level.
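
As background on what those implementations do: at evaluation time, variational dropout prunes a weight when its learned noise dominates the weight itself, i.e. when log α = log(σ²/θ²) exceeds a threshold. A minimal TensorFlow sketch of that thresholding (the names here are illustrative, not the `layers` API):

```python
import tensorflow as tf

def sparsify_variational_dropout(theta, log_sigma2, log_alpha_threshold):
  """Zero out weights whose learned dropout rate is too high.

  log_alpha = log(sigma^2 / theta^2); a large value means the multiplicative
  noise dominates the weight, so the weight can be pruned at eval time.
  """
  log_alpha = log_sigma2 - tf.log(tf.square(theta) + 1e-8)
  keep_mask = tf.cast(log_alpha < log_alpha_threshold, theta.dtype)
  return theta * keep_mask
```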

## Build Docker Image

To build a Docker image with all required dependencies, run `sudo docker build -t <image_name> .`. The base setup installs TensorFlow with GPU support and is based on Nvidia's CUDA 9.0 image, which includes all the libraries required to run TensorFlow. To launch the container, run `sudo docker run --runtime=nvidia -v ~/:/mount/ -it <image_name>:latest`. This command additionally makes your home directory accessible at `/mount` inside the container.

To run with GPU support, swap `tensorflow` for `tensorflow-gpu` in `requirements.txt`.
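
To confirm the container can actually reach a GPU, a quick check from a Python shell (this is the standard TensorFlow 1.x test utility, not anything specific to this repo):

```python
import tensorflow as tf

# Prints True when the GPU build of TensorFlow can see a CUDA device.
print(tf.test.is_gpu_available())
```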

## Sparse Transformer

Once inside the container, you have all of the code and data needed to decode the WMT English-German 2014 test set and calculate the BLEU score for each of the checkpoints we provide.

Small scripts to decode from Transformer checkpoints trained with each technique are provided in `sparse_transformer/decode/`. For magnitude and random pruning checkpoints, use the `decode_mp.sh` script. For variational dropout, you'll need to pass in the same log alpha threshold that was used to achieve the BLEU score in the checkpoint directory, which is provided as the last number in the checkpoint directory name.
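
For illustration, pulling that threshold off the end of a directory name might look like the following (the naming pattern here is assumed from the description above, not verified against the released checkpoints):

```python
import os

def threshold_from_ckpt_dir(ckpt_dir):
  """Parse the log alpha threshold from the trailing token of the directory
  name, e.g. '.../transformer_vd_90_3.0' -> 3.0 (naming assumed)."""
  return float(os.path.basename(os.path.normpath(ckpt_dir)).split('_')[-1])
```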

The results of decoding from a model checkpoint are saved in the `sparse_transformer/decode/` directory with a name like `newstest2014.end.sparse_transformer...`. To calculate the BLEU score for these decodes, run `sh get_ende_bleu.sh <decode_output>`. This script relies on the [mosesdecoder](https://github.com/moses-smt/mosesdecoder) project and assumes it is installed at `/mount/mosesdecoder` inside the container. The output of the script should match the BLEU score reported in the checkpoint directory.

## Sparse ResNet-50

Scripts to evaluate ResNet-50 checkpoints on the ImageNet test set are provided in `sparse_rn50/evaluate/`. For magnitude and random pruning checkpoints, use the `decode_mp.sh` script. You'll similarly need to pass in the log alpha threshold to evaluate variational dropout checkpoints, which was 0.5 for all of our models. This repository does not include the ImageNet dataset, so you'll also need to point these scripts at a local copy of the ImageNet test set stored as TFRecords. The output of the script should match the top-1 accuracy reported in the checkpoint directory.
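
The scripts expect ImageNet in the usual TFRecord layout; a minimal sketch of reading such records, assuming the conventional `image/encoded` and `image/class/label` feature keys (the path glob is illustrative):

```python
import tensorflow as tf

def parse_imagenet_record(serialized):
  """Parse one ImageNet example using the conventional feature keys."""
  features = tf.parse_single_example(
      serialized,
      features={
          'image/encoded': tf.FixedLenFeature([], tf.string),
          'image/class/label': tf.FixedLenFeature([], tf.int64),
      })
  image = tf.image.decode_jpeg(features['image/encoded'], channels=3)
  return image, features['image/class/label']

# Path is illustrative; point this at your local TFRecord shards.
dataset = tf.data.TFRecordDataset(tf.gfile.Glob('/path/to/imagenet/*'))
dataset = dataset.map(parse_imagenet_record)
```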

## Calculate Weight Sparsity

To calculate the weight sparsity for a checkpoint, use the `checkpoint_sparsity.py` script and pass the checkpoint file, sparsity technique, and model (`"transformer"` or `"rn50"`). For variational dropout, also pass the same log alpha threshold.
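
For magnitude and random pruning checkpoints, this computation reduces to counting zero-valued entries; a minimal sketch of that case (which tensors count as weights is a simplification here, and `checkpoint_sparsity.py` itself also handles the variational dropout and l0 cases):

```python
import numpy as np
import tensorflow as tf

def weight_sparsity(checkpoint_path):
  """Fraction of zero-valued entries across all 2-D+ tensors in a checkpoint."""
  reader = tf.train.load_checkpoint(checkpoint_path)
  zeros, total = 0, 0
  for name, shape in reader.get_variable_to_shape_map().items():
    if len(shape) < 2:  # skip biases, scalars, and step counters
      continue
    tensor = reader.get_tensor(name)
    zeros += np.sum(tensor == 0)
    total += tensor.size
  return zeros / float(total)
```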

## Trained Checkpoints

The top-performing checkpoints for each model and sparsity technique can be downloaded with the following links.

| Model | Technique | Sparsity | BLEU | Link |
| --- | --- | --- | --- | --- |
| Transformer | Magnitude Pruning | 50% | 26.33 | link |
| Transformer | Magnitude Pruning | 60% | 25.94 | link |
| Transformer | Magnitude Pruning | 70% | 25.21 | link |
| Transformer | Magnitude Pruning | 80% | 24.65 | link |
| Transformer | Magnitude Pruning | 90% | 23.26 | link |
| Transformer | Magnitude Pruning | 95% | 20.75 | link |
| Transformer | Magnitude Pruning | 98% | 16.37 | link |
| Transformer | Variational Dropout | 50% | 26.26 | link |
| Transformer | Variational Dropout | 60% | 25.37 | link |
| Transformer | Variational Dropout | 70% | 25.08 | link |
| Transformer | Variational Dropout | 80% | 24.33 | link |
| Transformer | Variational Dropout | 90% | 21.43 | link |
| Transformer | Variational Dropout | 95% | 19.13 | link |
| Transformer | Variational Dropout | 98% | 14.45 | link |
| Transformer | L0 Regularization | 50% | 26.72 | link |
| Transformer | L0 Regularization | 60% | 26.16 | link |
| Transformer | L0 Regularization | 70% | 25.29 | link |
| Transformer | L0 Regularization | 80% | 24.15 | link |
| Transformer | L0 Regularization | 90% | 20.05 | link |
| Transformer | L0 Regularization | 95% | 19.78 | link |
| Transformer | L0 Regularization | 98% | 16.83 | link |
| Transformer | Random Pruning | 50% | 24.56 | link |
| Transformer | Random Pruning | 60% | 24.45 | link |
| Transformer | Random Pruning | 70% | 24.01 | link |
| Transformer | Random Pruning | 80% | 23.15 | link |
| Transformer | Random Pruning | 90% | 20.67 | link |
| Transformer | Random Pruning | 95% | 17.42 | link |
| Transformer | Random Pruning | 98% | 10.94 | link |

| Model | Technique | Sparsity | Top-1 Accuracy | Link |
| --- | --- | --- | --- | --- |
| ResNet-50 | Magnitude Pruning | 50% | 76.53 | link |
| ResNet-50 | Magnitude Pruning | 70% | 76.38 | link |
| ResNet-50 | Magnitude Pruning | 80% | 75.58 | link |
| ResNet-50 | Magnitude Pruning | 90% | 73.91 | link |
| ResNet-50 | Magnitude Pruning | 95% | 70.59 | link |
| ResNet-50 | Magnitude Pruning | 98% | 57.9 | link |
| ResNet-50 | Magnitude Pruning (extended/non-uniform) | 80% | 76.52 | link |
| ResNet-50 | Magnitude Pruning (extended/non-uniform) | 90% | 75.16 | link |
| ResNet-50 | Magnitude Pruning (extended/non-uniform) | 95% | 72.71 | link |
| ResNet-50 | Magnitude Pruning (extended/non-uniform) | 96.5% | 69.26 | link |
| ResNet-50 | Random Pruning | 50% | 74.59 | link |
| ResNet-50 | Random Pruning | 70% | 72.2 | link |
| ResNet-50 | Random Pruning | 80% | 70.21 | link |
| ResNet-50 | Random Pruning | 90% | 65 | link |
| ResNet-50 | Random Pruning | 95% | 58.04 | link |
| ResNet-50 | Random Pruning | 98% | 43.99 | link |
| ResNet-50 | Variational Dropout | 50% | 76.55 | link |
| ResNet-50 | Variational Dropout | 80% | 75.28 | link |
| ResNet-50 | Variational Dropout | 90% | 73.84 | link |
| ResNet-50 | Variational Dropout | 95% | 71.91 | link |
| ResNet-50 | Variational Dropout | 98% | 67.36 | link |
