scaling_transformer_inference_efficiency

attention_test.py
checkpoint_test.py
collectives_test.py
incremental_test.py
incremental_test_equivalency.py
inference_test.py
inference_test_equivalency.py
loading_utils.py
partitioning.py
partitioning_test.py
requirements.txt

This CL provides the code and scripts required to reproduce the inference benchmark numbers released in the paper Efficiently Scaling Transformer Inference. The code lets users run model configurations initialized with random weight matrices and benchmark inference latency against the optimized configurations in the paper. The goal of the open-source release is to let users fork this code and get the best inference performance on Cloud TPU v4 slices as part of JAX engagements.

special2_test.py
xmap_transformer_australis_test.cc
xmap_transformer_exporter.py

README.md

Scaling Transformer Inference Efficiency

This repo includes:

Benchmarks to replicate the results in the paper Efficiently Scaling Transformer Inference
A complete implementation of text generation with a transformer using the techniques in the paper

To replicate the head-to-head benchmarks from the paper at 540B scale

Ensure you are running on 64 TPU v4 chips; smaller slices are better suited to smaller models.

python3 run_benchmark.py

This generates the latency and MFU (model FLOPs utilization) numbers for the PaLM and MT-NLG implementations shown in the plot from the paper. The FasterTransformer baseline numbers are drawn from NVIDIA's repo.
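As a rough illustration of how an MFU figure relates to a measured latency, here is a minimal sketch. It is not the code in `run_benchmark.py`: the function name and its parameters are hypothetical, it assumes roughly 2 FLOPs per parameter per token (a matmul-only approximation that ignores attention), and it assumes a TPU v4 peak of 275 TFLOP/s per chip.

```python
# Assumed peak dense bf16 throughput of one TPU v4 chip, in FLOP/s.
TPU_V4_PEAK_FLOPS = 275e12


def model_flops_utilization(num_params, tokens, seconds, num_chips,
                            peak_flops_per_chip=TPU_V4_PEAK_FLOPS):
    """Estimate MFU for a forward pass over `tokens` tokens.

    Hypothetical helper: uses the ~2 FLOPs per parameter per token
    approximation, so attention FLOPs are not counted.
    """
    if seconds <= 0 or num_chips <= 0:
        raise ValueError("seconds and num_chips must be positive")
    achieved_flops_per_s = 2.0 * num_params * tokens / seconds
    peak_flops_per_s = num_chips * peak_flops_per_chip
    return achieved_flops_per_s / peak_flops_per_s


if __name__ == "__main__":
    # Made-up measurement for illustration only; not a result from the paper.
    mfu = model_flops_utilization(
        num_params=540e9, tokens=64 * 2048, seconds=10.0, num_chips=64)
    print(f"MFU: {mfu:.1%}")
```

Because the estimate divides by the hardware peak, it is only comparable across runs that use the same FLOPs-per-token convention and the same peak figure.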

To generate text

python3 run_generation.py --model 540b --quantized False

The current weight paths only load internal PaLM weights, which are unavailable externally. Using this code externally will require modifying the checkpoint paths and the transformer layer definition to suit your own models. Text generation currently uses the pjit-based code paths; updating to the faster xmap-based code paths is in progress and should be done by next week.
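When adapting the checkpoint loading to your own model, it can help to estimate how much weight memory each chip must hold. The sketch below is a back-of-the-envelope calculation, not part of this repo. It assumes the `--quantized` flag switches between bfloat16 and int8 weights (the README does not confirm this), that weights are sharded evenly across chips, and that a TPU v4 chip has 32 GiB of HBM. The KV cache and activations are ignored, and all names are hypothetical.

```python
# Assumed storage cost per parameter for each weight format.
BYTES_PER_PARAM = {"bfloat16": 2, "int8": 1}

# Assumed HBM capacity of one TPU v4 chip, in GiB.
TPU_V4_HBM_GIB = 32


def weight_gib_per_chip(num_params, dtype, num_chips):
    """Return GiB of weights per chip, assuming an even shard across chips."""
    total_bytes = num_params * BYTES_PER_PARAM[dtype]
    return total_bytes / num_chips / 2**30


if __name__ == "__main__":
    for dtype in ("bfloat16", "int8"):
        gib = weight_gib_per_chip(540e9, dtype, num_chips=64)
        print(f"{dtype}: {gib:.1f} GiB of weights per chip "
              f"(of {TPU_V4_HBM_GIB} GiB HBM)")
```

Whatever HBM remains after the weights is what is available for the KV cache and activations, which is one reason smaller slices suit smaller models.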

TODO:

Insert table from benchmarks run
Include benchmark at larger setpoints
Update text generation to xmap code path
Include helper scripts for running TPU pod slices
Update this documentation
