google-research

The layers subdirectory contains implementations of variational dropout and l0 regularization in TensorFlow. The sparse_transformer and sparse_rn50 subdirectories contain code for the Transformer and ResNet-50 experiments from the aforementioned paper. The results subdirectory contains CSV files of the results of all hyperparameter configurations that we explored for each model, sparsity technique, and sparsity level.

Build Docker Image

To build a Docker image with all required dependencies, run sudo docker build -t <image_name> .. The base setup installs TensorFlow with GPU support and is based off Nvidia's CUDA-9.0 image with all the required libraries to run TensorFlow. To launch the container, run sudo docker run --runtime=nvidia -v ~/:/mount/ -it <image_name>:latest. This command additionaly makes your home directory accessbile at /mount inside the container.

To run with GPU support, swap tensorflow for tensorflow-gpu in requirements.txt.

Sparse Transformer

Once inside the container, this repo contains all of the code and data needed to decode the WMT English-German 2014 test set and calculate the BLEU score for each of the checkpoints we provided.

Small scripts to decode from Transformer checkpoints trained with each technique are provided in sparse_transformer/decode/. For random pruning checkpoints, use the decode_mp.sh script. For variational dropout, you'll need to pass in the same log alpha threshold that was used to achieve the BLEU score in checkpoint directory, which is provided as the last number in the checkpoint directory name.

The results of decoding from the model checkpoint will be saved in the sparse_transformer/decode/ directory with a name like newstest2014.end.sparse_transformer.... To calculate the BLEU score for these decodes, run sh get_ende_bleu.sh <decode_output>. This script relies on the mosesdecoder project (https://github.com/moses-smt/mosesdecoder), and assumes this is installed at /mount/mosesdecoder inside the container. The output of the script should match the BLEU score reported in the checkpoint directory.

Sparse ResNet-50

Scripts to evaluate ResNet-50 checkpoints on the ImageNet test set are provided in sparse_rn50/evaluate/. For random pruning checkpoints, use the decode_mp.sh script. You'll similarly need to pass in the log alpha threshold to evaluate va¯riaitonal dropout checkpoints, which was 0.5 for all our models. This repository does not include the ImageNet dataset, so you'll also need to point these scripts at a local version of the ImageNet test set stored as TFRecords. The output of the script should match the top-1 accuracy reported in the checkpoint directory.

Calculate Weight Sparsity

To calculate the weight sparsity for a checkpoint, use the checkpoint_sparsity.py script and pass the checkpoint file, sparsity technique, and model ("transformer" or "rn50"). For variational dropout, also pass the same log alpha threshold.

Trained Checkpoints

The top performing checkpoints for each model and sparsity technique can be downloaded with the following links.

Model	Technique	Sparsity	BLEU	Link
Transformer	Magnitude Pruning	50%	26.33	link
Transformer	Magnitude Pruning	60%	25.94	link
Transformer	Magnitude Pruning	70%	25.21	link
Transformer	Magnitude Pruning	80%	24.65	link
Transformer	Magnitude Pruning	90%	23.26	link
Transformer	Magnitude Pruning	95%	20.75	link
Transformer	Magnitude Pruning	98%	16.37	link
Transformer	Variational Dropout	50%	26.26	link
Transformer	Variational Dropout	60%	25.37	link
Transformer	Variational Dropout	70%	25.08	link
Transformer	Variational Dropout	80%	24.33	link
Transformer	Variational Dropout	90%	21.43	link
Transformer	Variational Dropout	95%	19.13	link
Transformer	Variational Dropout	98%	14.45	link
Transformer	L0 Regularization	50%	26.72	link
Transformer	L0 Regularization	60%	26.16	link
Transformer	L0 Regularization	70%	25.29	link
Transformer	L0 Regularization	80%	24.15	link
Transformer	L0 Regularization	90%	20.05	link
Transformer	L0 Regularization	95%	19.78	link
Transformer	L0 Regularization	98%	16.83	link
Transformer	Random Pruning	50%	24.56	link
Transformer	Random Pruning	60%	24.45	link
Transformer	Random Pruning	70%	24.01	link
Transformer	Random Pruning	80%	23.15	link
Transformer	Random Pruning	90%	20.67	link
Transformer	Random Pruning	95%	17.42	link
Transformer	Random Pruning	98%	10.94	link

Model	Technique	Sparsity	Top-1 Accuracy	Link
ResNet-50	Magnitude Pruning	50%	76.53	link
ResNet-50	Magnitude Pruning	70%	76.38	link
ResNet-50	Magnitude Pruning	80%	75.58	link
ResNet-50	Magnitude Pruning	90%	73.91	link
ResNet-50	Magnitude Pruning	95%	70.59	link
ResNet-50	Magnitude Pruning	98%	57.9	link
ResNet-50	Magnitude Pruning (extended/non-uniform)	80%	76.52	link
ResNet-50	Magnitude Pruning (extended/non-uniform)	90%	75.16	link
ResNet-50	Magnitude Pruning (extended/non-uniform)	95%	72.71	link
ResNet-50	Magnitude Pruning (extended/non-uniform)	96.5%	69.26	link
ResNet-50	Random Pruning	50%	74.59	link
ResNet-50	Random Pruning	70%	72.2	link
ResNet-50	Random Pruning	80%	70.21	link
ResNet-50	Random Pruning	90%	65	link
ResNet-50	Random Pruning	95%	58.04	link
ResNet-50	Random Pruning	98%	43.99	link
ResNet-50	Variational Dropout	50%	76.55	link
ResNet-50	Variational Dropout	80%	75.28	link
ResNet-50	Variational Dropout	90%	73.84	link
ResNet-50	Variational Dropout	95%	71.91	link
ResNet-50	Variational Dropout	98%	67.36	link

google-research

DDDaniel DuckworthAdd demo notebook for SMERF6 месяцев назадf9150d

The State of Sparsity in Deep Neural Networks

Build Docker Image

Sparse Transformer

Sparse ResNet-50

Calculate Weight Sparsity

Trained Checkpoints

Использование cookies