#!/usr/bin/env python
# Copyright 2020 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Usage:
# ./gen-card-allenai-wmt19.py

import os
from pathlib import Path


def write_model_card(model_card_dir, src_lang, tgt_lang, model_name):

    texts = {
        "en": "Machine learning is great, isn't it?",
        "ru": "Машинное обучение - это здорово, не так ли?",
        "de": "Maschinelles Lernen ist großartig, nicht wahr?",
    }

    # BLEU scores as follows:
    # "pair": [fairseq, transformers]
    scores = {
        "wmt19-de-en-6-6-base": [0, 38.37],
        "wmt19-de-en-6-6-big": [0, 39.90],
    }
    pair = f"{src_lang}-{tgt_lang}"

    readme = f"""
---

language:
- {src_lang}
- {tgt_lang}
thumbnail:
tags:
- translation
- wmt19
- allenai
license: apache-2.0
datasets:
- wmt19
metrics:
- bleu
---

# FSMT

## Model description

This is a ported version of the fairseq-based [wmt19 transformer](https://github.com/jungokasai/deep-shallow/) for {src_lang}-{tgt_lang}.

For more details, please see [Deep Encoder, Shallow Decoder: Reevaluating the Speed-Quality Tradeoff in Machine Translation](https://arxiv.org/abs/2006.10369).

Two models are available:

* [wmt19-de-en-6-6-big](https://huggingface.co/allenai/wmt19-de-en-6-6-big)
* [wmt19-de-en-6-6-base](https://huggingface.co/allenai/wmt19-de-en-6-6-base)


## Intended uses & limitations

#### How to use

```python
from transformers import FSMTForConditionalGeneration, FSMTTokenizer

mname = "allenai/{model_name}"
tokenizer = FSMTTokenizer.from_pretrained(mname)
model = FSMTForConditionalGeneration.from_pretrained(mname)

input_text = "{texts[src_lang]}"
input_ids = tokenizer.encode(input_text, return_tensors="pt")
outputs = model.generate(input_ids)
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(decoded)  # {texts[tgt_lang]}
```

#### Limitations and bias


## Training data

Pretrained weights were left identical to the original model released by allenai. For more details, please see the [paper](https://arxiv.org/abs/2006.10369).

## Eval results

Here are the BLEU scores:

model | transformers
------|-------------
{model_name} | {scores[model_name][1]}

The score was calculated using this code:

```bash
git clone https://github.com/huggingface/transformers
cd transformers
export PAIR={pair}
export DATA_DIR=data/$PAIR
export SAVE_DIR=data/$PAIR
export BS=8
export NUM_BEAMS=5
mkdir -p $DATA_DIR
sacrebleu -t wmt19 -l $PAIR --echo src > $DATA_DIR/val.source
sacrebleu -t wmt19 -l $PAIR --echo ref > $DATA_DIR/val.target
echo $PAIR
PYTHONPATH="src:examples/seq2seq" python examples/seq2seq/run_eval.py allenai/{model_name} $DATA_DIR/val.source $SAVE_DIR/test_translations.txt --reference_path $DATA_DIR/val.target --score_path $SAVE_DIR/test_bleu.json --bs $BS --task translation --num_beams $NUM_BEAMS
```

## Data Sources

- [training, etc.](http://www.statmt.org/wmt19/)
- [test set](http://matrix.statmt.org/test_sets/newstest2019.tgz?1556572561)


### BibTeX entry and citation info

```
@misc{{kasai2020deep,
    title={{Deep Encoder, Shallow Decoder: Reevaluating the Speed-Quality Tradeoff in Machine Translation}},
    author={{Jungo Kasai and Nikolaos Pappas and Hao Peng and James Cross and Noah A. Smith}},
    year={{2020}},
    eprint={{2006.10369}},
    archivePrefix={{arXiv}},
    primaryClass={{cs.CL}}
}}
```

"""
    model_card_dir.mkdir(parents=True, exist_ok=True)
    path = os.path.join(model_card_dir, "README.md")
    print(f"Generating {path}")
    with open(path, "w", encoding="utf-8") as f:
        f.write(readme)

# make sure we are under the root of the project
repo_dir = Path(__file__).resolve().parent.parent.parent
model_cards_dir = repo_dir / "model_cards"

for model_name in ["wmt19-de-en-6-6-base", "wmt19-de-en-6-6-big"]:
    model_card_dir = model_cards_dir / "allenai" / model_name
    write_model_card(model_card_dir, src_lang="de", tgt_lang="en", model_name=model_name)
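The script's core pattern is: fill a Python f-string template once per model, then write the result under a per-model directory as `README.md`. A minimal self-contained sketch of that same pattern, with hypothetical names (`render_card` is not part of the script) and writing to a throwaway temp directory instead of `model_cards/`:

```python
import tempfile
from pathlib import Path


def render_card(model_name: str, score: float) -> str:
    # same technique as the script above: a plain f-string as the templating engine,
    # producing YAML front matter followed by markdown
    return f"""---
tags:
- translation
---
# {model_name}
BLEU: {score}
"""


with tempfile.TemporaryDirectory() as tmp:
    # mirror the script's directory layout: <root>/<org>/<model>/README.md
    card_dir = Path(tmp) / "allenai" / "wmt19-de-en-6-6-base"
    card_dir.mkdir(parents=True, exist_ok=True)  # create nested dirs, tolerate reruns
    path = card_dir / "README.md"
    path.write_text(render_card("wmt19-de-en-6-6-base", 38.37), encoding="utf-8")
    assert path.read_text(encoding="utf-8").startswith("---")  # front matter in place
```

The f-string approach keeps the template next to the substitution logic; the cost is that literal braces in the card (e.g. the BibTeX entry) must be escaped as `{{`/`}}`.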