{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "Dne2XSNzB3SK"
   },
   "source": [
    "# Tutorial: Make Your QA Pipelines Talk!\n",
    "\n",
    "- **Level**: Intermediate\n",
    "- **Time to complete**: 15 minutes\n",
    "- **Nodes Used**: `InMemoryDocumentStore`, `BM25Retriever`, `FARMReader`, `AnswerToSpeech`\n",
    "- **Goal**: After completing this tutorial, you'll have created an extractive question answering system that can read out the answer.\n",
    "\n",
    ">**Update:** AnswerToSpeech lives in the [text2speech](https://github.com/deepset-ai/haystack-extras/tree/main/nodes/text2speech) package. The main [Haystack](https://github.com/deepset-ai/haystack) repository doesn't include it anymore."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "Question answering works primarily on text, but Haystack provides some features for audio files that contain speech as well.\n",
    "\n",
    "In this tutorial, we're going to see how to use `AnswerToSpeech` to convert answers into audio files."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false,
    "id": "4UBjfz4LB3SS"
   },
   "source": [
    "## Preparing the Colab Environment\n",
    "\n",
    "- [Enable GPU Runtime in Colab](https://docs.haystack.deepset.ai/docs/enabling-gpu-acceleration#enabling-the-gpu-in-colab)\n",
    "- [Set logging level to INFO](https://docs.haystack.deepset.ai/docs/log-level) (a minimal setup example follows below)"
   ]
  },
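  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next cell is a minimal sketch of the logging setup described in the link above: it keeps most libraries at the WARNING level while raising Haystack's own loggers to INFO. The exact format string is only an example and can be adjusted."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import logging\n",
    "\n",
    "# Show INFO-level messages from Haystack, keep everything else at WARNING\n",
    "logging.basicConfig(format=\"%(levelname)s - %(name)s -  %(message)s\", level=logging.WARNING)\n",
    "logging.getLogger(\"haystack\").setLevel(logging.INFO)"
   ]
  },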
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "nBvGUPVKN2oJ"
   },
   "source": [
    "## Installing Haystack\n",
    "\n",
    "To start, let's install the latest release of Haystack with `pip`. In this tutorial, we'll use components from [text2speech](https://github.com/deepset-ai/haystack-extras/tree/main/nodes/text2speech), which contains some extra Haystack components, so we'll install `farm-haystack-text2speech`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "QsY0HC8JB3Sc"
   },
   "outputs": [],
   "source": [
    "%%bash\n",
    "\n",
    "pip install --upgrade pip\n",
    "pip install farm-haystack[colab,preprocessing,inference]\n",
    "pip install farm-haystack-text2speech"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Enabling Telemetry\n",
    "\n",
    "Knowing you're using this tutorial helps us decide where to invest our efforts to build a better product, but you can always opt out by commenting out the following line. See [Telemetry](https://docs.haystack.deepset.ai/docs/telemetry) for more details."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from haystack.telemetry import tutorial_running\n",
    "\n",
    "tutorial_running(17)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "pbGu92rAB3Sl"
   },
   "source": [
    "## Indexing Documents\n",
    "\n",
    "We will populate the document store with a simple indexing pipeline. See [Tutorial: Build Your First Question Answering System](https://haystack.deepset.ai/tutorials/01_basic_qa_pipeline) for more details about these steps."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "eWYnP3nWB3So",
    "pycharm": {
     "name": "#%%\n"
    }
   },
   "outputs": [],
   "source": [
    "from pathlib import Path\n",
    "from haystack.document_stores import InMemoryDocumentStore\n",
    "from haystack.utils import fetch_archive_from_http\n",
    "from haystack.pipelines import Pipeline\n",
    "from haystack.nodes import FileTypeClassifier, TextConverter, PreProcessor\n",
    "\n",
    "# Initialize the DocumentStore\n",
    "document_store = InMemoryDocumentStore(use_bm25=True)\n",
    "\n",
    "# Get the documents\n",
    "documents_path = \"data/tutorial17\"\n",
    "s3_url = \"https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/wiki_gameofthrones_txt17.zip\"\n",
    "fetch_archive_from_http(url=s3_url, output_dir=documents_path)\n",
    "\n",
    "# List all the paths\n",
    "file_paths = [p for p in Path(documents_path).glob(\"**/*\")]\n",
    "\n",
    "# NOTE: In this example we're going to use only one text file from the wiki\n",
    "file_paths = [p for p in file_paths if \"Stormborn\" in p.name]\n",
    "\n",
    "# Prepare some basic metadata for the files\n",
    "files_metadata = [{\"name\": path.name} for path in file_paths]\n",
    "\n",
    "# Makes sure the file is a TXT file (FileTypeClassifier node)\n",
    "classifier = FileTypeClassifier()\n",
    "\n",
    "# Converts a file into text and performs basic cleaning (TextConverter node)\n",
    "text_converter = TextConverter(remove_numeric_tables=True)\n",
    "\n",
    "# Pre-processes the text by performing splits and adding metadata to the text (PreProcessor node)\n",
    "preprocessor = PreProcessor(clean_header_footer=True, split_length=200, split_overlap=20)\n",
    "\n",
    "# Here we create a basic indexing pipeline\n",
    "indexing_pipeline = Pipeline()\n",
    "indexing_pipeline.add_node(classifier, name=\"classifier\", inputs=[\"File\"])\n",
    "indexing_pipeline.add_node(text_converter, name=\"text_converter\", inputs=[\"classifier.output_1\"])\n",
    "indexing_pipeline.add_node(preprocessor, name=\"preprocessor\", inputs=[\"text_converter\"])\n",
    "indexing_pipeline.add_node(document_store, name=\"document_store\", inputs=[\"preprocessor\"])\n",
    "\n",
    "# Then we run it with the documents and their metadata as input\n",
    "indexing_pipeline.run(file_paths=file_paths, meta=files_metadata)"
   ]
  },
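  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick check, you can ask the document store how many documents the indexing pipeline wrote; the exact count depends on the `PreProcessor` split settings."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Number of documents produced by the indexing pipeline\n",
    "document_store.get_document_count()"
   ]
  },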
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "zW5qaqn1B3St"
   },
   "source": [
    "## Creating a QA Pipeline with AnswerToSpeech\n",
    "\n",
    "Now we will create a pipeline very similar to the basic `ExtractiveQAPipeline` of [Tutorial: Build Your First Question Answering System](https://haystack.deepset.ai/tutorials/01_basic_qa_pipeline), with the addition of a node that converts our answers into audio files: `AnswerToSpeech`. Once the answer is retrieved, we can also listen to the audio version of the document where the answer came from."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "m_oecui1B3Sw"
   },
   "outputs": [],
   "source": [
    "from haystack.nodes import BM25Retriever, FARMReader\n",
    "from text2speech import AnswerToSpeech\n",
    "\n",
    "retriever = BM25Retriever(document_store=document_store)\n",
    "reader = FARMReader(model_name_or_path=\"deepset/roberta-base-squad2\", use_gpu=True)\n",
    "answer2speech = AnswerToSpeech(\n",
    "    model_name_or_path=\"espnet/kan-bayashi_ljspeech_vits\", generated_audio_dir=Path(\"./audio_answers\")\n",
    ")\n",
    "\n",
    "audio_pipeline = Pipeline()\n",
    "audio_pipeline.add_node(retriever, name=\"Retriever\", inputs=[\"Query\"])\n",
    "audio_pipeline.add_node(reader, name=\"Reader\", inputs=[\"Retriever\"])\n",
    "audio_pipeline.add_node(answer2speech, name=\"AnswerToSpeech\", inputs=[\"Reader\"])"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "oV1KHzXGB3Sy"
   },
   "source": [
    "## Asking a Question!\n",
    "\n",
    "Use the pipeline's `run()` method to ask a question. The `query` argument is where you type your question. Additionally, you can control how many documents the Retriever fetches and how many answers the Reader returns with their `top_k` parameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "S-ZMUBzpB3Sz",
    "pycharm": {
     "is_executing": false
    }
   },
   "outputs": [],
   "source": [
    "prediction = audio_pipeline.run(\n",
    "    query=\"Who is the father of Arya Stark?\", params={\"Retriever\": {\"top_k\": 10}, \"Reader\": {\"top_k\": 5}}\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "vpFSxtNNB3S1"
   },
   "outputs": [],
   "source": [
    "# Now you can print the prediction\n",
    "from pprint import pprint\n",
    "\n",
    "pprint(prediction)"
   ]
  },
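  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "With `AnswerToSpeech` in the pipeline, the `answer` and `context` fields of each answer hold the paths of the generated audio files, while the original strings are kept in the answer's `meta` (as `answer_text` and `context_text`). The next cell shows this for the first answer."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Inspect the first answer: the audio file path vs. the original answer text\n",
    "first_answer = prediction[\"answers\"][0]\n",
    "print(\"Audio file: \", first_answer.answer)\n",
    "print(\"Answer text: \", first_answer.meta[\"answer_text\"])"
   ]
  },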
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "Xg6BN4v8N2oM"
   },
   "outputs": [],
   "source": [
    "# The document the first answer was extracted from\n",
    "original_document = [doc for doc in prediction[\"documents\"] if doc.id == prediction[\"answers\"][0].document_ids[0]][0]\n",
    "pprint(original_document)"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "FXf-kTn4B3S6"
   },
   "source": [
    "## Hear the Answers Out!\n",
    "\n",
    "Let's hear the answers and the contexts they were extracted from."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "id": "cJJVpT7dB3S7"
   },
   "outputs": [],
   "source": [
    "from IPython.display import display, Audio\n",
    "import soundfile as sf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "usGVf1N6B3S8"
   },
   "outputs": [],
   "source": [
    "# The first answer in isolation\n",
    "\n",
    "print(\"Answer: \", prediction[\"answers\"][0].meta[\"answer_text\"])\n",
    "\n",
    "speech, _ = sf.read(prediction[\"answers\"][0].answer)\n",
    "display(Audio(speech, rate=24000))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "yTFwNJqtB3S9"
   },
   "outputs": [],
   "source": [
    "# The context of the first answer\n",
    "\n",
    "print(\"Context: \", prediction[\"answers\"][0].meta[\"context_text\"])\n",
    "\n",
    "speech, _ = sf.read(prediction[\"answers\"][0].context)\n",
    "display(Audio(speech, rate=24000))"
   ]
  },
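  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The generated audio files are written to the directory passed to `AnswerToSpeech` as `generated_audio_dir` (`./audio_answers` in this tutorial), so you can also list or reuse them directly. The cell below assumes the default `.wav` format."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# List the audio files generated by AnswerToSpeech (assuming .wav output)\n",
    "sorted(Path(\"./audio_answers\").glob(\"**/*.wav\"))"
   ]
  },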
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "🎉 Congratulations! You've learned how to create an extractive QA system that can read out the answer."
   ]
  }
 ],
 "metadata": {
  "accelerator": "GPU",
  "colab": {
   "provenance": []
  },
  "gpuClass": "standard",
  "kernelspec": {
   "display_name": "Python 3.9.6 64-bit",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.6"
  },
  "vscode": {
   "interpreter": {
    "hash": "31f2aee4e71d21fbe5cf8b01ff0e069b9275f58929596ceb00d14d90e3e16cd6"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}