{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "e6685771",
   "metadata": {},
   "source": [
    "# Lesson 3: Translation and Summarization"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4c7109da",
   "metadata": {},
   "source": [
    "First, install the following if you are running locally:\n",
    "\n",
    "```\n",
    "    !pip install transformers\n",
    "    !pip install torch\n",
    "```\n",
    "\n",
    "_Note:_ You do not need to install any extra libraries if you are running this lab on this platform."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "97e3e7c9-1437-4784-8a21-cd200bc609a5",
   "metadata": {},
   "source": [
    "- The following code suppresses warning messages from the Transformers library."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "782af222-1bea-449a-8dd4-655ad7a7b8ea",
   "metadata": {
    "height": 47
   },
   "outputs": [],
   "source": [
    "from transformers.utils import logging\n",
    "logging.set_verbosity_error()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bea43ec1",
   "metadata": {},
   "source": [
    "### Build the `translation` pipeline using the 🤗 Transformers library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "d1d46ac9-d665-4690-99a4-43b625e02114",
   "metadata": {
    "height": 47
   },
   "outputs": [],
   "source": [
    "from transformers import pipeline\n",
    "import torch"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "014e1c26-df35-406c-8ac2-9789b011c86b",
   "metadata": {
    "height": 94
   },
   "outputs": [],
   "source": [
    "# Create the translator pipeline using a model from Meta\n",
    "translator = pipeline(task=\"translation\",\n",
    "                      model=\"./models/facebook/nllb-200-distilled-600M\",\n",
    "                      torch_dtype=torch.bfloat16)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "69d8f7a5",
   "metadata": {},
   "source": [
    "Note: You can find more information about the model 'nllb-200-distilled-600M' [here](https://huggingface.co/facebook/nllb-200-distilled-600M).\n",
    "\n",
    "NLLB stands for No Language Left Behind."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "095bd1c5-a96f-4b20-8e9c-601b0b158fd8",
   "metadata": {
    "height": 132
   },
   "outputs": [],
   "source": [
    "# Set the text to be translated\n",
    "text = \"\"\"\\\n",
    "My puppy is adorable, \\\n",
    "Your kitten is cute.\n",
    "Her panda is friendly.\n",
    "His llama is thoughtful. \\\n",
    "We all have nice pets!\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "03d9ebdf-86d8-493b-8757-74b3d1010442",
   "metadata": {
    "height": 64
   },
   "outputs": [],
   "source": [
    "text_translated = translator(text,\n",
    "                             src_lang=\"eng_Latn\",\n",
    "                             tgt_lang=\"fra_Latn\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "711052f5",
   "metadata": {},
   "source": [
    "Note: You can find more language codes [in this repository](https://github.com/facebookresearch/flores/blob/main/flores200/README.md#languages-in-flores-200)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "f4ba07e3-4a5e-4bf2-86a9-498781828eca",
   "metadata": {
    "height": 47
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'translation_text': 'Mon chiot est adorable, ton chaton est mignon, son panda est ami, sa lamme est attentive, nous avons tous de beaux animaux de compagnie.'}]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Print the translated text\n",
    "text_translated"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c7517649",
   "metadata": {},
   "source": [
    "* Before moving on to more examples, let's free some memory."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "c16e5dad-dac0-42e4-9a87-8128a1d49b44",
   "metadata": {
    "height": 47
   },
   "outputs": [],
   "source": [
    "# Import the garbage collector\n",
    "import gc"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "43cafb3a-51b8-4aae-929c-31d524dec530",
   "metadata": {
    "height": 47
   },
   "outputs": [],
   "source": [
    "# Delete the translator pipeline so its memory can be reclaimed\n",
    "del translator"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "61d698a7-8ae2-475e-ac46-d768c282b17c",
   "metadata": {
    "height": 30
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "25"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "gc.collect()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b2fac55f",
   "metadata": {},
   "source": [
    "### Build the `summarization` pipeline using the 🤗 Transformers library"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "b132c646-0c6a-4c57-939a-b3015ea4b76f",
   "metadata": {
    "height": 81
   },
   "outputs": [],
   "source": [
    "# Create the summarizer pipeline using a model from Meta\n",
    "summarizer = pipeline(task=\"summarization\",\n",
    "                      model=\"./models/facebook/bart-large-cnn\",\n",
    "                      torch_dtype=torch.bfloat16)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1ca6f847",
   "metadata": {},
   "source": [
    "Note: You can find more information about the model 'bart-large-cnn' [here](https://huggingface.co/facebook/bart-large-cnn)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "98276d66-4274-4a2f-b6a7-b4fb839b94f7",
   "metadata": {
    "height": 179
   },
   "outputs": [],
   "source": [
    "# Text to be summarized\n",
    "text = \"\"\"Paris is the capital and most populous city of France, with\n",
    "          an estimated population of 2,175,601 residents as of 2018,\n",
    "          in an area of more than 105 square kilometres (41 square\n",
    "          miles). The City of Paris is the centre and seat of\n",
    "          government of the region and province of Île-de-France, or\n",
    "          Paris Region, which has an estimated population of\n",
    "          12,174,880, or about 18 percent of the population of France\n",
    "          as of 2017.\"\"\""
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "d856f193-cbf7-450b-8ae3-42287096e56f",
   "metadata": {
    "height": 64
   },
   "outputs": [],
   "source": [
    "summary = summarizer(text,\n",
    "                     min_length=10,\n",
    "                     max_length=100)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "a2c79f81-6baf-4f6b-95ee-b1a2072ec073",
   "metadata": {
    "height": 47
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'summary_text': 'Paris is the capital and most populous city of France, with an estimated population of 2,175,601 residents as of 2018. The City of Paris is the centre and seat of the government of the region and province of Île-de-France.'}]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Print the result of the summarization\n",
    "summary"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ca56abc0",
   "metadata": {},
   "source": [
    "### Try it yourself!\n",
    "Now it is your turn! Try this model with your own text."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.18"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}