{
2
  "cells": [
3
    {
4
      "attachments": {},
5
      "cell_type": "markdown",
6
      "metadata": {
7
        "id": "TXC2wBpCU9f7"
8
      },
9
      "source": [
10
        "[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/08-langchain-retrieval-agent.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/08-langchain-retrieval-agent.ipynb)"
11
      ]
12
    },
13
    {
14
      "attachments": {},
15
      "cell_type": "markdown",
16
      "metadata": {
17
        "id": "bhWwrfbbVGOA"
18
      },
19
      "source": [
        "#### [LangChain Handbook](https://pinecone.io/learn/langchain)\n",
        "\n",
        "# Retrieval Agents\n",
        "\n",
        "We've seen in previous chapters how powerful [retrieval augmentation](https://www.pinecone.io/learn/langchain-retrieval-augmentation/) and [conversational agents](https://www.pinecone.io/learn/langchain-agents/) can be. They become even more impressive when we begin using them together.\n",
        "\n",
        "Conversational agents can struggle with data freshness, knowledge about specific domains, or accessing internal documentation. Coupling agents with retrieval augmentation tools addresses these problems.\n",
        "\n",
        "On the other hand, using \"naive\" retrieval augmentation without an agent means we retrieve contexts with *every* query. This isn't always ideal, as not every query requires access to external knowledge.\n",
        "\n",
        "Merging these methods gives us the best of both worlds. In this notebook we'll learn how to do this.\n",
        "\n",
        "To begin, we must install the prerequisite libraries that we will be using in this notebook."
      ]
34
    },
35
    {
36
      "cell_type": "code",
37
      "execution_count": null,
38
      "metadata": {
39
        "colab": {
40
          "base_uri": "https://localhost:8080/"
41
        },
42
        "id": "pva9ehKXUpU2",
43
        "outputId": "3bbcf2dd-1889-412f-d45a-f56945ac4f2f"
44
      },
45
      "outputs": [],
46
      "source": [
        "!pip install -qU \\\n",
        "    openai==1.6.1 \\\n",
        "    pinecone-client==3.1.0 \\\n",
        "    langchain==0.1.1 \\\n",
        "    langchain-community==0.0.13 \\\n",
        "    tiktoken==0.5.2 \\\n",
        "    datasets==2.12.0"
      ]
55
    },
56
    {
57
      "attachments": {},
58
      "cell_type": "markdown",
59
      "metadata": {
60
        "id": "ZTgrOQziXUto"
61
      },
62
      "source": [
63
        "## Building the Knowledge Base"
64
      ]
65
    },
66
    {
67
      "attachments": {},
68
      "cell_type": "markdown",
69
      "metadata": {
70
        "id": "qNyRsz0ZXXaq"
71
      },
72
      "source": [
        "We start by constructing our knowledge base. We'll use a mostly prepared dataset, the **S**tanford **Qu**estion-**A**nswering **D**ataset (SQuAD), hosted on Hugging Face *Datasets*. We download it like so:"
      ]
75
    },
76
    {
77
      "cell_type": "code",
78
      "execution_count": 1,
79
      "metadata": {
80
        "colab": {
81
          "base_uri": "https://localhost:8080/"
82
        },
83
        "id": "laSDMjqQXuj-",
84
        "outputId": "5272df99-eb4b-4ec2-c513-504e067be2b6"
85
      },
86
      "outputs": [
87
        {
88
          "data": {
89
            "application/vnd.jupyter.widget-view+json": {
90
              "model_id": "79d95a944b44423d9a2dd0082cdd80b0",
91
              "version_major": 2,
92
              "version_minor": 0
93
            },
94
            "text/plain": [
95
              "Downloading readme:   0%|          | 0.00/7.83k [00:00<?, ?B/s]"
96
            ]
97
          },
98
          "metadata": {},
99
          "output_type": "display_data"
100
        },
101
        {
102
          "data": {
103
            "application/vnd.jupyter.widget-view+json": {
104
              "model_id": "488d023c133946f18514d9eab88d18a6",
105
              "version_major": 2,
106
              "version_minor": 0
107
            },
108
            "text/plain": [
109
              "Downloading data files:   0%|          | 0/2 [00:00<?, ?it/s]"
110
            ]
111
          },
112
          "metadata": {},
113
          "output_type": "display_data"
114
        },
115
        {
116
          "data": {
117
            "application/vnd.jupyter.widget-view+json": {
118
              "model_id": "b516de7e929a48ebaed4eda6bc8e0697",
119
              "version_major": 2,
120
              "version_minor": 0
121
            },
122
            "text/plain": [
123
              "Downloading data:   0%|          | 0.00/14.5M [00:00<?, ?B/s]"
124
            ]
125
          },
126
          "metadata": {},
127
          "output_type": "display_data"
128
        },
129
        {
130
          "data": {
131
            "application/vnd.jupyter.widget-view+json": {
132
              "model_id": "98435a29bc384a218fd34eed7cc593cf",
133
              "version_major": 2,
134
              "version_minor": 0
135
            },
136
            "text/plain": [
137
              "Downloading data:   0%|          | 0.00/1.82M [00:00<?, ?B/s]"
138
            ]
139
          },
140
          "metadata": {},
141
          "output_type": "display_data"
142
        },
143
        {
144
          "data": {
145
            "application/vnd.jupyter.widget-view+json": {
146
              "model_id": "9733b641eea7468c95fb4d013a42c08e",
147
              "version_major": 2,
148
              "version_minor": 0
149
            },
150
            "text/plain": [
151
              "Extracting data files:   0%|          | 0/2 [00:00<?, ?it/s]"
152
            ]
153
          },
154
          "metadata": {},
155
          "output_type": "display_data"
156
        },
157
        {
158
          "data": {
159
            "application/vnd.jupyter.widget-view+json": {
160
              "model_id": "ee1fb919fdb44f258cdee51bd469e1b6",
161
              "version_major": 2,
162
              "version_minor": 0
163
            },
164
            "text/plain": [
165
              "Generating train split:   0%|          | 0/87599 [00:00<?, ? examples/s]"
166
            ]
167
          },
168
          "metadata": {},
169
          "output_type": "display_data"
170
        },
171
        {
172
          "data": {
173
            "application/vnd.jupyter.widget-view+json": {
174
              "model_id": "72b2cf694b7942358c4d8ebb974b6c95",
175
              "version_major": 2,
176
              "version_minor": 0
177
            },
178
            "text/plain": [
179
              "Generating validation split:   0%|          | 0/10570 [00:00<?, ? examples/s]"
180
            ]
181
          },
182
          "metadata": {},
183
          "output_type": "display_data"
184
        },
185
        {
186
          "data": {
187
            "text/plain": [
188
              "Dataset({\n",
189
              "    features: ['id', 'title', 'context', 'question', 'answers'],\n",
190
              "    num_rows: 87599\n",
191
              "})"
192
            ]
193
          },
194
          "execution_count": 1,
195
          "metadata": {},
196
          "output_type": "execute_result"
197
        }
198
      ],
199
      "source": [
        "from datasets import load_dataset\n",
        "\n",
        "data = load_dataset('squad', split='train')\n",
        "data"
      ]
205
    },
206
    {
207
      "attachments": {},
208
      "cell_type": "markdown",
209
      "metadata": {
210
        "id": "8casaLEpX18U"
211
      },
212
      "source": [
        "The dataset contains duplicate contexts, since each context is paired with several questions. We convert it to a pandas DataFrame and then remove the duplicates like so:"
      ]
215
    },
216
    {
217
      "cell_type": "code",
218
      "execution_count": 2,
219
      "metadata": {
220
        "colab": {
221
          "base_uri": "https://localhost:8080/",
222
          "height": 511
223
        },
224
        "id": "JnWZTcJiXzor",
225
        "outputId": "42659cec-23c3-4349-b007-676da99d834d"
226
      },
227
      "outputs": [
228
        {
229
          "data": {
230
            "text/html": [
231
              "<div>\n",
232
              "<style scoped>\n",
233
              "    .dataframe tbody tr th:only-of-type {\n",
234
              "        vertical-align: middle;\n",
235
              "    }\n",
236
              "\n",
237
              "    .dataframe tbody tr th {\n",
238
              "        vertical-align: top;\n",
239
              "    }\n",
240
              "\n",
241
              "    .dataframe thead th {\n",
242
              "        text-align: right;\n",
243
              "    }\n",
244
              "</style>\n",
245
              "<table border=\"1\" class=\"dataframe\">\n",
246
              "  <thead>\n",
247
              "    <tr style=\"text-align: right;\">\n",
248
              "      <th></th>\n",
249
              "      <th>id</th>\n",
250
              "      <th>title</th>\n",
251
              "      <th>context</th>\n",
252
              "      <th>question</th>\n",
253
              "      <th>answers</th>\n",
254
              "    </tr>\n",
255
              "  </thead>\n",
256
              "  <tbody>\n",
257
              "    <tr>\n",
258
              "      <th>0</th>\n",
259
              "      <td>5733be284776f41900661182</td>\n",
260
              "      <td>University_of_Notre_Dame</td>\n",
261
              "      <td>Architecturally, the school has a Catholic cha...</td>\n",
262
              "      <td>To whom did the Virgin Mary allegedly appear i...</td>\n",
263
              "      <td>{'text': ['Saint Bernadette Soubirous'], 'answ...</td>\n",
264
              "    </tr>\n",
265
              "    <tr>\n",
266
              "      <th>1</th>\n",
267
              "      <td>5733be284776f4190066117f</td>\n",
268
              "      <td>University_of_Notre_Dame</td>\n",
269
              "      <td>Architecturally, the school has a Catholic cha...</td>\n",
270
              "      <td>What is in front of the Notre Dame Main Building?</td>\n",
271
              "      <td>{'text': ['a copper statue of Christ'], 'answe...</td>\n",
272
              "    </tr>\n",
273
              "    <tr>\n",
274
              "      <th>2</th>\n",
275
              "      <td>5733be284776f41900661180</td>\n",
276
              "      <td>University_of_Notre_Dame</td>\n",
277
              "      <td>Architecturally, the school has a Catholic cha...</td>\n",
278
              "      <td>The Basilica of the Sacred heart at Notre Dame...</td>\n",
279
              "      <td>{'text': ['the Main Building'], 'answer_start'...</td>\n",
280
              "    </tr>\n",
281
              "    <tr>\n",
282
              "      <th>3</th>\n",
283
              "      <td>5733be284776f41900661181</td>\n",
284
              "      <td>University_of_Notre_Dame</td>\n",
285
              "      <td>Architecturally, the school has a Catholic cha...</td>\n",
286
              "      <td>What is the Grotto at Notre Dame?</td>\n",
287
              "      <td>{'text': ['a Marian place of prayer and reflec...</td>\n",
288
              "    </tr>\n",
289
              "    <tr>\n",
290
              "      <th>4</th>\n",
291
              "      <td>5733be284776f4190066117e</td>\n",
292
              "      <td>University_of_Notre_Dame</td>\n",
293
              "      <td>Architecturally, the school has a Catholic cha...</td>\n",
294
              "      <td>What sits on top of the Main Building at Notre...</td>\n",
295
              "      <td>{'text': ['a golden statue of the Virgin Mary'...</td>\n",
296
              "    </tr>\n",
297
              "  </tbody>\n",
298
              "</table>\n",
299
              "</div>"
300
            ],
301
            "text/plain": [
302
              "                         id                     title  \\\n",
303
              "0  5733be284776f41900661182  University_of_Notre_Dame   \n",
304
              "1  5733be284776f4190066117f  University_of_Notre_Dame   \n",
305
              "2  5733be284776f41900661180  University_of_Notre_Dame   \n",
306
              "3  5733be284776f41900661181  University_of_Notre_Dame   \n",
307
              "4  5733be284776f4190066117e  University_of_Notre_Dame   \n",
308
              "\n",
309
              "                                             context  \\\n",
310
              "0  Architecturally, the school has a Catholic cha...   \n",
311
              "1  Architecturally, the school has a Catholic cha...   \n",
312
              "2  Architecturally, the school has a Catholic cha...   \n",
313
              "3  Architecturally, the school has a Catholic cha...   \n",
314
              "4  Architecturally, the school has a Catholic cha...   \n",
315
              "\n",
316
              "                                            question  \\\n",
317
              "0  To whom did the Virgin Mary allegedly appear i...   \n",
318
              "1  What is in front of the Notre Dame Main Building?   \n",
319
              "2  The Basilica of the Sacred heart at Notre Dame...   \n",
320
              "3                  What is the Grotto at Notre Dame?   \n",
321
              "4  What sits on top of the Main Building at Notre...   \n",
322
              "\n",
323
              "                                             answers  \n",
324
              "0  {'text': ['Saint Bernadette Soubirous'], 'answ...  \n",
325
              "1  {'text': ['a copper statue of Christ'], 'answe...  \n",
326
              "2  {'text': ['the Main Building'], 'answer_start'...  \n",
327
              "3  {'text': ['a Marian place of prayer and reflec...  \n",
328
              "4  {'text': ['a golden statue of the Virgin Mary'...  "
329
            ]
330
          },
331
          "execution_count": 2,
332
          "metadata": {},
333
          "output_type": "execute_result"
334
        }
335
      ],
336
      "source": [
        "data = data.to_pandas()\n",
        "data.head()"
      ]
340
    },
341
    {
342
      "cell_type": "code",
343
      "execution_count": 3,
344
      "metadata": {
345
        "colab": {
346
          "base_uri": "https://localhost:8080/",
347
          "height": 528
348
        },
349
        "id": "er2nM4vsYD4R",
350
        "outputId": "d58b5640-be25-495a-b90c-585940c851bf"
351
      },
352
      "outputs": [
353
        {
354
          "data": {
355
            "text/html": [
356
              "<div>\n",
357
              "<style scoped>\n",
358
              "    .dataframe tbody tr th:only-of-type {\n",
359
              "        vertical-align: middle;\n",
360
              "    }\n",
361
              "\n",
362
              "    .dataframe tbody tr th {\n",
363
              "        vertical-align: top;\n",
364
              "    }\n",
365
              "\n",
366
              "    .dataframe thead th {\n",
367
              "        text-align: right;\n",
368
              "    }\n",
369
              "</style>\n",
370
              "<table border=\"1\" class=\"dataframe\">\n",
371
              "  <thead>\n",
372
              "    <tr style=\"text-align: right;\">\n",
373
              "      <th></th>\n",
374
              "      <th>id</th>\n",
375
              "      <th>title</th>\n",
376
              "      <th>context</th>\n",
377
              "      <th>question</th>\n",
378
              "      <th>answers</th>\n",
379
              "    </tr>\n",
380
              "  </thead>\n",
381
              "  <tbody>\n",
382
              "    <tr>\n",
383
              "      <th>0</th>\n",
384
              "      <td>5733be284776f41900661182</td>\n",
385
              "      <td>University_of_Notre_Dame</td>\n",
386
              "      <td>Architecturally, the school has a Catholic cha...</td>\n",
387
              "      <td>To whom did the Virgin Mary allegedly appear i...</td>\n",
388
              "      <td>{'text': ['Saint Bernadette Soubirous'], 'answ...</td>\n",
389
              "    </tr>\n",
390
              "    <tr>\n",
391
              "      <th>5</th>\n",
392
              "      <td>5733bf84d058e614000b61be</td>\n",
393
              "      <td>University_of_Notre_Dame</td>\n",
394
              "      <td>As at most other universities, Notre Dame's st...</td>\n",
395
              "      <td>When did the Scholastic Magazine of Notre dame...</td>\n",
396
              "      <td>{'text': ['September 1876'], 'answer_start': [...</td>\n",
397
              "    </tr>\n",
398
              "    <tr>\n",
399
              "      <th>10</th>\n",
400
              "      <td>5733bed24776f41900661188</td>\n",
401
              "      <td>University_of_Notre_Dame</td>\n",
402
              "      <td>The university is the major seat of the Congre...</td>\n",
403
              "      <td>Where is the headquarters of the Congregation ...</td>\n",
404
              "      <td>{'text': ['Rome'], 'answer_start': [119]}</td>\n",
405
              "    </tr>\n",
406
              "    <tr>\n",
407
              "      <th>15</th>\n",
408
              "      <td>5733a6424776f41900660f51</td>\n",
409
              "      <td>University_of_Notre_Dame</td>\n",
410
              "      <td>The College of Engineering was established in ...</td>\n",
411
              "      <td>How many BS level degrees are offered in the C...</td>\n",
412
              "      <td>{'text': ['eight'], 'answer_start': [487]}</td>\n",
413
              "    </tr>\n",
414
              "    <tr>\n",
415
              "      <th>20</th>\n",
416
              "      <td>5733a70c4776f41900660f64</td>\n",
417
              "      <td>University_of_Notre_Dame</td>\n",
418
              "      <td>All of Notre Dame's undergraduate students are...</td>\n",
419
              "      <td>What entity provides help with the management ...</td>\n",
420
              "      <td>{'text': ['Learning Resource Center'], 'answer...</td>\n",
421
              "    </tr>\n",
422
              "  </tbody>\n",
423
              "</table>\n",
424
              "</div>"
425
            ],
426
            "text/plain": [
427
              "                          id                     title  \\\n",
428
              "0   5733be284776f41900661182  University_of_Notre_Dame   \n",
429
              "5   5733bf84d058e614000b61be  University_of_Notre_Dame   \n",
430
              "10  5733bed24776f41900661188  University_of_Notre_Dame   \n",
431
              "15  5733a6424776f41900660f51  University_of_Notre_Dame   \n",
432
              "20  5733a70c4776f41900660f64  University_of_Notre_Dame   \n",
433
              "\n",
434
              "                                              context  \\\n",
435
              "0   Architecturally, the school has a Catholic cha...   \n",
436
              "5   As at most other universities, Notre Dame's st...   \n",
437
              "10  The university is the major seat of the Congre...   \n",
438
              "15  The College of Engineering was established in ...   \n",
439
              "20  All of Notre Dame's undergraduate students are...   \n",
440
              "\n",
441
              "                                             question  \\\n",
442
              "0   To whom did the Virgin Mary allegedly appear i...   \n",
443
              "5   When did the Scholastic Magazine of Notre dame...   \n",
444
              "10  Where is the headquarters of the Congregation ...   \n",
445
              "15  How many BS level degrees are offered in the C...   \n",
446
              "20  What entity provides help with the management ...   \n",
447
              "\n",
448
              "                                              answers  \n",
449
              "0   {'text': ['Saint Bernadette Soubirous'], 'answ...  \n",
450
              "5   {'text': ['September 1876'], 'answer_start': [...  \n",
451
              "10          {'text': ['Rome'], 'answer_start': [119]}  \n",
452
              "15         {'text': ['eight'], 'answer_start': [487]}  \n",
453
              "20  {'text': ['Learning Resource Center'], 'answer...  "
454
            ]
455
          },
456
          "execution_count": 3,
457
          "metadata": {},
458
          "output_type": "execute_result"
459
        }
460
      ],
461
      "source": [
        "data.drop_duplicates(subset='context', keep='first', inplace=True)\n",
        "data.head()"
      ]
465
    },
466
    {
467
      "attachments": {},
468
      "cell_type": "markdown",
469
      "metadata": {
470
        "id": "B2_Pt7N6Zg2X"
471
      },
472
      "source": [
473
        "### Initialize the Embedding Model and Vector DB"
474
      ]
475
    },
476
    {
477
      "attachments": {},
478
      "cell_type": "markdown",
479
      "metadata": {
480
        "id": "bGoS84KYZnSK"
481
      },
482
      "source": [
        "We'll be using OpenAI's `text-embedding-ada-002` model, initialized via LangChain, together with the Pinecone vector DB. We start by initializing the embedding model; for this we need an [OpenAI API key](https://platform.openai.com/).\n",
        "\n",
        "*(Note that OpenAI is a paid service, so running the remainder of this notebook may incur a small cost.)*"
      ]
487
    },
488
    {
489
      "cell_type": "code",
490
      "execution_count": 5,
491
      "metadata": {
492
        "colab": {
493
          "base_uri": "https://localhost:8080/"
494
        },
495
        "id": "U57x2_87YSpb",
496
        "outputId": "cf4c99af-24b7-4bf5-97ea-71091c8fc2ce"
497
      },
498
      "outputs": [],
499
      "source": [
        "import os\n",
        "from getpass import getpass\n",
        "from langchain.embeddings.openai import OpenAIEmbeddings\n",
        "\n",
        "# get API key from top-right dropdown on OpenAI website\n",
        "OPENAI_API_KEY = os.getenv(\"OPENAI_API_KEY\") or getpass(\"Enter your OpenAI API key: \")\n",
        "model_name = 'text-embedding-ada-002'\n",
        "\n",
        "embed = OpenAIEmbeddings(\n",
        "    model=model_name,\n",
        "    openai_api_key=OPENAI_API_KEY\n",
        ")"
      ]
513
    },
514
    {
515
      "attachments": {},
516
      "cell_type": "markdown",
517
      "metadata": {
518
        "id": "JQTfOTR6aBRS"
519
      },
520
      "source": [
        "Now we create our vector DB to store our vectors. For this we need a [free Pinecone API key](https://app.pinecone.io). The API key can be found under \"API Keys\" in the left navbar of the Pinecone dashboard."
      ]
523
    },
524
    {
525
      "cell_type": "code",
526
      "execution_count": 6,
527
      "metadata": {},
528
      "outputs": [],
529
      "source": [
        "from pinecone import Pinecone\n",
        "\n",
        "# initialize connection to pinecone (get API key at app.pinecone.io)\n",
        "api_key = os.getenv(\"PINECONE_API_KEY\") or getpass(\"Enter your Pinecone API key: \")\n",
        "\n",
        "# configure client\n",
        "pc = Pinecone(api_key=api_key)"
      ]
538
    },
539
    {
540
      "cell_type": "markdown",
541
      "metadata": {},
542
      "source": [
        "Now we set up our index specification, which defines the cloud provider and region where we want to deploy our index. You can find a list of all [available providers and regions here](https://docs.pinecone.io/docs/projects)."
      ]
545
    },
546
    {
547
      "cell_type": "code",
548
      "execution_count": 7,
549
      "metadata": {},
550
      "outputs": [],
551
      "source": [
        "from pinecone import ServerlessSpec\n",
        "\n",
        "spec = ServerlessSpec(\n",
        "    cloud=\"aws\", region=\"us-west-2\"\n",
        ")"
      ]
558
    },
559
    {
560
      "cell_type": "markdown",
561
      "metadata": {},
562
      "source": [
        "When creating the index, we set `dimension` equal to the dimensionality of Ada-002 (`1536`) and use a `metric` that is also compatible with Ada-002 (this can be either `cosine` or `dotproduct`). We also pass our `spec` to index initialization."
      ]
565
    },
566
    {
567
      "cell_type": "code",
568
      "execution_count": 8,
569
      "metadata": {
570
        "colab": {
571
          "base_uri": "https://localhost:8080/"
572
        },
573
        "id": "C3wrG-9yaJel",
574
        "outputId": "842bf46f-fd0f-4322-8d90-a7fd769b7687"
575
      },
576
      "outputs": [
577
        {
578
          "data": {
579
            "text/plain": [
580
              "{'dimension': 1536,\n",
581
              " 'index_fullness': 0.0,\n",
582
              " 'namespaces': {},\n",
583
              " 'total_vector_count': 0}"
584
            ]
585
          },
586
          "execution_count": 8,
587
          "metadata": {},
588
          "output_type": "execute_result"
589
        }
590
      ],
591
      "source": [
        "import time\n",
        "\n",
        "index_name = \"langchain-retrieval-agent\"\n",
        "existing_indexes = [\n",
        "    index_info[\"name\"] for index_info in pc.list_indexes()\n",
        "]\n",
        "\n",
        "# check if index already exists (it shouldn't if this is first time)\n",
        "if index_name not in existing_indexes:\n",
        "    # if does not exist, create index\n",
        "    pc.create_index(\n",
        "        index_name,\n",
        "        dimension=1536,  # dimensionality of ada 002\n",
        "        metric='dotproduct',\n",
        "        spec=spec\n",
        "    )\n",
        "    # wait for index to be initialized\n",
        "    while not pc.describe_index(index_name).status['ready']:\n",
        "        time.sleep(1)\n",
        "\n",
        "# connect to index\n",
        "index = pc.Index(index_name)\n",
        "time.sleep(1)\n",
        "# view index stats\n",
        "index.describe_index_stats()"
      ]
618
    },
619
    {
620
      "attachments": {},
621
      "cell_type": "markdown",
622
      "metadata": {
623
        "id": "AD5IGOoLaVx7"
624
      },
625
      "source": [
        "We should see that the new Pinecone index has a `total_vector_count` of `0`, as we haven't added any vectors yet.\n",
        "\n",
        "## Indexing\n",
        "\n",
        "We can perform the indexing task using the LangChain vector store object, but for now it is much faster to do it via the Pinecone Python client directly. We will do this in batches of `100`."
      ]
632
    },
633
    {
634
      "cell_type": "code",
635
      "execution_count": 9,
636
      "metadata": {
637
        "colab": {
638
          "base_uri": "https://localhost:8080/",
639
          "height": 49,
640
          "referenced_widgets": [
641
            "321ebcd192fc41e0a7a2ba9c0e7b8556",
642
            "7c898f43b39545879013df760aa4f6f1",
643
            "1b5fe655986842a6afd4a2eb2c0f47fc",
644
            "76872ca737b64616aeacc90f2f7fc655",
645
            "0679733cce4f4b5eb25f8f56dc7ebe22",
646
            "71551fc297b54fb4a21d787d117e01a1",
647
            "b90283522431447b925b226431bb4fa6",
648
            "a03d372e8b4f456e8663e54c818e0fe9",
649
            "6fc773ccbc594fddbc1f2f0adaa96186",
650
            "07b47869f727455fb5ba67e99767c04c",
651
            "4b2e12a33584489690aaf69f51cd4a3b"
652
          ]
653
        },
654
        "id": "AhDcbRGTaWPi",
655
        "outputId": "dca5a4d0-5d19-4542-c26c-3073bfa13f2a"
656
      },
657
      "outputs": [
658
        {
659
          "data": {
660
            "application/vnd.jupyter.widget-view+json": {
661
              "model_id": "862314956bfc40af91a2f2e8d68a0fdd",
662
              "version_major": 2,
663
              "version_minor": 0
664
            },
665
            "text/plain": [
666
              "  0%|          | 0/189 [00:00<?, ?it/s]"
667
            ]
668
          },
669
          "metadata": {},
670
          "output_type": "display_data"
671
        }
672
      ],
673
      "source": [
        "from tqdm.auto import tqdm\n",
        "\n",
        "batch_size = 100\n",
        "\n",
        "for i in tqdm(range(0, len(data), batch_size)):\n",
        "    # get end of batch\n",
        "    i_end = min(len(data), i+batch_size)\n",
        "    batch = data.iloc[i:i_end]\n",
        "    # build the metadata (title and text) for each record in this batch\n",
        "    metadatas = [{\n",
        "        'title': record['title'],\n",
        "        'text': record['context']\n",
        "    } for _, record in batch.iterrows()]\n",
        "    # get the list of contexts / documents\n",
        "    documents = batch['context'].tolist()\n",
        "    # create document embeddings\n",
        "    embeds = embed.embed_documents(documents)\n",
        "    # get IDs\n",
        "    ids = batch['id'].tolist()\n",
        "    # add everything to pinecone as (id, vector, metadata) tuples\n",
        "    index.upsert(vectors=list(zip(ids, embeds, metadatas)))"
      ]
699
    },
700
    {
701
      "attachments": {},
702
      "cell_type": "markdown",
703
      "metadata": {
704
        "id": "jDUnLdy1b7G1"
705
      },
706
      "source": [
        "Now that everything is indexed, we can check the number of vectors in our index like so:"
      ]
709
    },
710
    {
711
      "cell_type": "code",
712
      "execution_count": 10,
713
      "metadata": {
714
        "colab": {
715
          "base_uri": "https://localhost:8080/"
716
        },
717
        "id": "SiccGZKAb_Qo",
718
        "outputId": "5cafd8c8-771f-46d0-e261-12c9b5b1b058"
719
      },
720
      "outputs": [
721
        {
722
          "data": {
723
            "text/plain": [
724
              "{'dimension': 1536,\n",
725
              " 'index_fullness': 0.0,\n",
726
              " 'namespaces': {'': {'vector_count': 18891}},\n",
727
              " 'total_vector_count': 18891}"
728
            ]
729
          },
730
          "execution_count": 10,
731
          "metadata": {},
732
          "output_type": "execute_result"
733
        }
734
      ],
735
      "source": [
        "index.describe_index_stats()"
      ]
738
    },
739
    {
740
      "attachments": {},
741
      "cell_type": "markdown",
742
      "metadata": {
743
        "id": "b-3oolT5cCR8"
744
      },
745
      "source": [
746
        "## Creating a Vector Store and Querying"
747
      ]
748
    },
749
    {
750
      "attachments": {},
751
      "cell_type": "markdown",
752
      "metadata": {
753
        "id": "DcZ12U06cCH5"
754
      },
755
      "source": [
        "Now that we've built our index we can switch back over to LangChain. We start by initializing a vector store using the same index we just built. We do that like so:"
      ]
758
    },
759
    {
760
      "cell_type": "code",
761
      "execution_count": 11,
762
      "metadata": {
763
        "id": "0MBJ477-cFNw"
764
      },
765
      "outputs": [
766
        {
767
          "name": "stderr",
768
          "output_type": "stream",
769
          "text": [
770
            "/Users/jamesbriggs/opt/anaconda3/envs/ml/lib/python3.9/site-packages/langchain_community/vectorstores/pinecone.py:74: UserWarning: Passing in `embedding` as a Callable is deprecated. Please pass in an Embeddings object instead.\n",
771
            "  warnings.warn(\n"
772
          ]
773
        }
774
      ],
775
      "source": [
776
        "from langchain.vectorstores import Pinecone\n",
777
        "\n",
778
        "text_field = \"text\"  # the metadata field that contains our text\n",
779
        "\n",
780
        "# initialize the vector store object\n",
781
        "vectorstore = Pinecone(\n",
782
        "    index, embed.embed_query, text_field\n",
783
        ")"
784
      ]
785
    },
786
    {
787
      "attachments": {},
788
      "cell_type": "markdown",
789
      "metadata": {
790
        "id": "3K3xRthWcXzW"
791
      },
792
      "source": [
793
        "As in previous examples, we can use the `similarity_search` method to do a pure semantic search (without the generation component)."
794
      ]
795
    },
796
    {
797
      "cell_type": "code",
798
      "execution_count": 12,
799
      "metadata": {
800
        "colab": {
801
          "base_uri": "https://localhost:8080/"
802
        },
803
        "id": "uITMZtzschJF",
804
        "outputId": "88fae02f-6f82-47d7-a153-21b3e0615709"
805
      },
806
      "outputs": [
807
        {
808
          "data": {
809
            "text/plain": [
810
              "[Document(page_content=\"In 1919 Father James Burns became president of Notre Dame, and in three years he produced an academic revolution that brought the school up to national standards by adopting the elective system and moving away from the university's traditional scholastic and classical emphasis. By contrast, the Jesuit colleges, bastions of academic conservatism, were reluctant to move to a system of electives. Their graduates were shut out of Harvard Law School for that reason. Notre Dame continued to grow over the years, adding more colleges, programs, and sports teams. By 1921, with the addition of the College of Commerce, Notre Dame had grown from a small college to a university with five colleges and a professional law school. The university continued to expand and add new residence halls and buildings with each subsequent president.\", metadata={'title': 'University_of_Notre_Dame'}),\n",
811
              " Document(page_content='The College of Engineering was established in 1920, however, early courses in civil and mechanical engineering were a part of the College of Science since the 1870s. Today the college, housed in the Fitzpatrick, Cushing, and Stinson-Remick Halls of Engineering, includes five departments of study – aerospace and mechanical engineering, chemical and biomolecular engineering, civil engineering and geological sciences, computer science and engineering, and electrical engineering – with eight B.S. degrees offered. Additionally, the college offers five-year dual degree programs with the Colleges of Arts and Letters and of Business awarding additional B.A. and Master of Business Administration (MBA) degrees, respectively.', metadata={'title': 'University_of_Notre_Dame'}),\n",
812
              " Document(page_content='Since 2005, Notre Dame has been led by John I. Jenkins, C.S.C., the 17th president of the university. Jenkins took over the position from Malloy on July 1, 2005. In his inaugural address, Jenkins described his goals of making the university a leader in research that recognizes ethics and building the connection between faith and studies. During his tenure, Notre Dame has increased its endowment, enlarged its student body, and undergone many construction projects on campus, including Compton Family Ice Arena, a new architecture hall, additional residence halls, and the Campus Crossroads, a $400m enhancement and expansion of Notre Dame Stadium.', metadata={'title': 'University_of_Notre_Dame'})]"
813
            ]
814
          },
815
          "execution_count": 12,
816
          "metadata": {},
817
          "output_type": "execute_result"
818
        }
819
      ],
820
      "source": [
821
        "query = \"when was the college of engineering in the University of Notre Dame established?\"\n",
822
        "\n",
823
        "vectorstore.similarity_search(\n",
824
        "    query,  # our search query\n",
825
        "    k=3  # return 3 most relevant docs\n",
826
        ")"
827
      ]
828
    },
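    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To give a feel for what a pure semantic search does, the toy sketch below ranks a few made-up 2-d vectors by cosine similarity to a query vector and keeps the top `k`. It is only an illustration that assumes a cosine metric: the vectors, ids, and the `cosine` and `top_k` helpers are hypothetical and are not part of Pinecone or LangChain."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import math\n",
        "\n",
        "def cosine(a, b):\n",
        "    # cosine similarity: dot product divided by the product of vector lengths\n",
        "    dot = sum(x * y for x, y in zip(a, b))\n",
        "    return dot / (math.hypot(*a) * math.hypot(*b))\n",
        "\n",
        "def top_k(query_vec, vectors, k=3):\n",
        "    # score every stored vector, then keep the ids of the k most similar\n",
        "    ranked = sorted(\n",
        "        vectors, key=lambda doc_id: cosine(query_vec, vectors[doc_id]), reverse=True\n",
        "    )\n",
        "    return ranked[:k]\n",
        "\n",
        "# hypothetical toy data, not real embeddings\n",
        "toy_vectors = {\"doc-a\": [1.0, 0.0], \"doc-b\": [0.7, 0.7], \"doc-c\": [0.0, 1.0]}\n",
        "top_k([0.9, 0.1], toy_vectors, k=2)"
      ]
    },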
829
    {
830
      "attachments": {},
831
      "cell_type": "markdown",
832
      "metadata": {
833
        "id": "-zGF6YsgczqT"
834
      },
835
      "source": [
836
        "The results from `similarity_search` look good. Let's take a look at how we can begin integrating this into a conversational agent."
837
      ]
838
    },
839
    {
840
      "attachments": {},
841
      "cell_type": "markdown",
842
      "metadata": {
843
        "id": "tFsIOm73dcOI"
844
      },
845
      "source": [
846
        "## Initializing the Conversational Agent"
847
      ]
848
    },
849
    {
850
      "attachments": {},
851
      "cell_type": "markdown",
852
      "metadata": {
853
        "id": "XMv6TXWkdfNR"
854
      },
855
      "source": [
856
        "To initialize our conversational agent we need a chat LLM, conversational memory, and a `RetrievalQA` chain. We create these as follows:"
857
      ]
858
    },
859
    {
860
      "cell_type": "code",
861
      "execution_count": 13,
862
      "metadata": {
863
        "id": "zMRs9Klic5-Y"
864
      },
865
      "outputs": [],
866
      "source": [
867
        "from langchain.chat_models import ChatOpenAI\n",
868
        "from langchain.chains.conversation.memory import ConversationBufferWindowMemory\n",
869
        "from langchain.chains import RetrievalQA\n",
870
        "\n",
871
        "# chat completion llm\n",
872
        "llm = ChatOpenAI(\n",
873
        "    openai_api_key=OPENAI_API_KEY,\n",
874
        "    model_name='gpt-3.5-turbo',\n",
875
        "    temperature=0.0\n",
876
        ")\n",
877
        "# conversational memory\n",
878
        "conversational_memory = ConversationBufferWindowMemory(\n",
879
        "    memory_key='chat_history',\n",
880
        "    k=5,\n",
881
        "    return_messages=True\n",
882
        ")\n",
883
        "# retrieval qa chain\n",
884
        "qa = RetrievalQA.from_chain_type(\n",
885
        "    llm=llm,\n",
886
        "    chain_type=\"stuff\",\n",
887
        "    retriever=vectorstore.as_retriever()\n",
888
        ")"
889
      ]
890
    },
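    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "To make the `k=5` window in the memory above more concrete, here is a minimal pure-Python sketch of a sliding-window buffer that keeps only the most recent `k` exchanges. It is an illustration only, not LangChain's implementation; the `WindowBuffer` class and its `save` and `load` methods are hypothetical names."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "from collections import deque\n",
        "\n",
        "class WindowBuffer:\n",
        "    \"\"\"Hypothetical stand-in for a windowed chat memory (not LangChain code).\"\"\"\n",
        "\n",
        "    def __init__(self, k):\n",
        "        # each entry is one (human, ai) exchange; only the last k are kept\n",
        "        self.exchanges = deque(maxlen=k)\n",
        "\n",
        "    def save(self, human, ai):\n",
        "        self.exchanges.append((human, ai))\n",
        "\n",
        "    def load(self):\n",
        "        return list(self.exchanges)\n",
        "\n",
        "buf = WindowBuffer(k=2)\n",
        "buf.save(\"q1\", \"a1\")\n",
        "buf.save(\"q2\", \"a2\")\n",
        "buf.save(\"q3\", \"a3\")  # the oldest exchange (q1, a1) is dropped here\n",
        "buf.load()"
      ]
    },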
891
    {
892
      "attachments": {},
893
      "cell_type": "markdown",
894
      "metadata": {
895
        "id": "-ySfWyZLdboX"
896
      },
897
      "source": [
898
        "With the `qa` chain in place, we can generate an answer using its `run` method:"
899
      ]
900
    },
901
    {
902
      "cell_type": "code",
903
      "execution_count": 14,
904
      "metadata": {
905
        "colab": {
906
          "base_uri": "https://localhost:8080/",
907
          "height": 35
908
        },
909
        "id": "LaYSq0V-dxHw",
910
        "outputId": "a85730a3-ab15-49b6-8aba-6da774fd3e3f"
911
      },
912
      "outputs": [
913
        {
914
          "data": {
915
            "text/plain": [
916
              "'The College of Engineering at the University of Notre Dame was established in 1920.'"
917
            ]
918
          },
919
          "execution_count": 14,
920
          "metadata": {},
921
          "output_type": "execute_result"
922
        }
923
      ],
924
      "source": [
925
        "qa.run(query)"
926
      ]
927
    },
928
    {
929
      "attachments": {},
930
      "cell_type": "markdown",
931
      "metadata": {
932
        "id": "DtSXR5RXdyU0"
933
      },
934
      "source": [
935
        "But this isn't yet ready for our conversational agent. For that we need to convert this retrieval chain into a tool. We do that like so:"
936
      ]
937
    },
938
    {
939
      "cell_type": "code",
940
      "execution_count": 15,
941
      "metadata": {
942
        "id": "FwCYrS4duqBW"
943
      },
944
      "outputs": [],
945
      "source": [
946
        "from langchain.agents import Tool\n",
947
        "\n",
948
        "tools = [\n",
949
        "    Tool(\n",
950
        "        name='Knowledge Base',\n",
951
        "        func=qa.run,\n",
952
        "        description=(\n",
953
        "            'use this tool when answering general knowledge queries to get '\n",
954
        "            'more information about the topic'\n",
955
        "        )\n",
956
        "    )\n",
957
        "]"
958
      ]
959
    },
960
    {
961
      "attachments": {},
962
      "cell_type": "markdown",
963
      "metadata": {
964
        "id": "wXi_0ipTvM_l"
965
      },
966
      "source": [
967
        "Now we can initialize the agent like so:"
968
      ]
969
    },
970
    {
971
      "cell_type": "code",
972
      "execution_count": 16,
973
      "metadata": {
974
        "id": "JaKTzPUEvOoy"
975
      },
976
      "outputs": [],
977
      "source": [
978
        "from langchain.agents import initialize_agent\n",
979
        "\n",
980
        "agent = initialize_agent(\n",
981
        "    agent='chat-conversational-react-description',\n",
982
        "    tools=tools,\n",
983
        "    llm=llm,\n",
984
        "    verbose=True,\n",
985
        "    max_iterations=3,\n",
986
        "    early_stopping_method='generate',\n",
987
        "    memory=conversational_memory\n",
988
        ")"
989
      ]
990
    },
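    {
      "attachments": {},
      "cell_type": "markdown",
      "metadata": {},
      "source": [
        "In the verbose traces in this notebook, the model replies with a small JSON blob that names an `action` (a tool name or `Final Answer`) and an `action_input`. As a rough sketch of how such a blob could be routed, the cell below parses it and either calls the named tool or returns the final answer. The `route_action` function and the stub tool are hypothetical; this is not LangChain's actual executor."
      ]
    },
    {
      "cell_type": "code",
      "execution_count": null,
      "metadata": {},
      "outputs": [],
      "source": [
        "import json\n",
        "\n",
        "def route_action(llm_output, tool_funcs):\n",
        "    \"\"\"Hypothetical router: run the named tool, or return the final answer.\"\"\"\n",
        "    step = json.loads(llm_output)\n",
        "    if step[\"action\"] == \"Final Answer\":\n",
        "        return \"final\", step[\"action_input\"]\n",
        "    # look up the tool by the name the model chose and run it on the input\n",
        "    observation = tool_funcs[step[\"action\"]](step[\"action_input\"])\n",
        "    return \"observation\", observation\n",
        "\n",
        "# stub tool standing in for the real retrieval chain\n",
        "stub_tools = {\"Knowledge Base\": lambda q: f\"(stub lookup for: {q})\"}\n",
        "route_action('{\"action\": \"Knowledge Base\", \"action_input\": \"Notre Dame\"}', stub_tools)"
      ]
    },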
991
    {
992
      "attachments": {},
993
      "cell_type": "markdown",
994
      "metadata": {
995
        "id": "WbXl-AzVvszB"
996
      },
997
      "source": [
998
        "With that, our retrieval augmented conversational agent is ready and we can begin using it."
999
      ]
1000
    },
1001
    {
1002
      "attachments": {},
1003
      "cell_type": "markdown",
1004
      "metadata": {
1005
        "id": "IlxUBWKcvzeP"
1006
      },
1007
      "source": [
1008
        "### Using the Conversational Agent"
1009
      ]
1010
    },
1011
    {
1012
      "attachments": {},
1013
      "cell_type": "markdown",
1014
      "metadata": {
1015
        "id": "ZZapCP4Pv2kz"
1016
      },
1017
      "source": [
1018
        "To make queries we simply call the `agent` directly."
1019
      ]
1020
    },
1021
    {
1022
      "cell_type": "code",
1023
      "execution_count": 17,
1024
      "metadata": {
1025
        "colab": {
1026
          "base_uri": "https://localhost:8080/"
1027
        },
1028
        "id": "RJoAhy76vzAB",
1029
        "outputId": "31ae7589-e66a-4835-d1d8-555bb59c962c"
1030
      },
1031
      "outputs": [
1032
        {
1033
          "name": "stdout",
1034
          "output_type": "stream",
1035
          "text": [
1036
            "\n",
1037
            "\n",
1038
            "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
1039
            "\u001b[32;1m\u001b[1;3m{\n",
1040
            "    \"action\": \"Knowledge Base\",\n",
1041
            "    \"action_input\": \"When was the College of Engineering in the University of Notre Dame established?\"\n",
1042
            "}\u001b[0m\n",
1043
            "Observation: \u001b[36;1m\u001b[1;3mThe College of Engineering at the University of Notre Dame was established in 1920.\u001b[0m\n",
1044
            "Thought:\u001b[32;1m\u001b[1;3m{\n",
1045
            "    \"action\": \"Final Answer\",\n",
1046
            "    \"action_input\": \"The College of Engineering at the University of Notre Dame was established in 1920.\"\n",
1047
            "}\u001b[0m\n",
1048
            "\n",
1049
            "\u001b[1m> Finished chain.\u001b[0m\n"
1050
          ]
1051
        },
1052
        {
1053
          "data": {
1054
            "text/plain": [
1055
              "{'input': 'when was the college of engineering in the University of Notre Dame established?',\n",
1056
              " 'chat_history': [],\n",
1057
              " 'output': 'The College of Engineering at the University of Notre Dame was established in 1920.'}"
1058
            ]
1059
          },
1060
          "execution_count": 17,
1061
          "metadata": {},
1062
          "output_type": "execute_result"
1063
        }
1064
      ],
1065
      "source": [
1066
        "agent(query)"
1067
      ]
1068
    },
1069
    {
1070
      "attachments": {},
1071
      "cell_type": "markdown",
1072
      "metadata": {
1073
        "id": "YcMqa9Va2hU6"
1074
      },
1075
      "source": [
1076
        "Looks great. Now, what if we ask it a question that isn't a general knowledge query?"
1077
      ]
1078
    },
1079
    {
1080
      "cell_type": "code",
1081
      "execution_count": 18,
1082
      "metadata": {
1083
        "colab": {
1084
          "base_uri": "https://localhost:8080/"
1085
        },
1086
        "id": "85vipqC02deV",
1087
        "outputId": "345c724e-eaea-4a20-9445-99794bc743fe"
1088
      },
1089
      "outputs": [
1090
        {
1091
          "name": "stdout",
1092
          "output_type": "stream",
1093
          "text": [
1094
            "\n",
1095
            "\n",
1096
            "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
1097
            "\u001b[32;1m\u001b[1;3m{\n",
1098
            "    \"action\": \"Final Answer\",\n",
1099
            "    \"action_input\": \"The product of 2 multiplied by 7 is 14.\"\n",
1100
            "}\u001b[0m\n",
1101
            "\n",
1102
            "\u001b[1m> Finished chain.\u001b[0m\n"
1103
          ]
1104
        },
1105
        {
1106
          "data": {
1107
            "text/plain": [
1108
              "{'input': 'what is 2 * 7?',\n",
1109
              " 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?'),\n",
1110
              "  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.')],\n",
1111
              " 'output': 'The product of 2 multiplied by 7 is 14.'}"
1112
            ]
1113
          },
1114
          "execution_count": 18,
1115
          "metadata": {},
1116
          "output_type": "execute_result"
1117
        }
1118
      ],
1119
      "source": [
1120
        "agent(\"what is 2 * 7?\")"
1121
      ]
1122
    },
1123
    {
1124
      "attachments": {},
1125
      "cell_type": "markdown",
1126
      "metadata": {
1127
        "id": "gR_b0IN32rQ9"
1128
      },
1129
      "source": [
1130
        "Perfect, the agent is able to recognize that it doesn't need to refer to its general knowledge tool for that question. Let's try some more questions."
1131
      ]
1132
    },
1133
    {
1134
      "cell_type": "code",
1135
      "execution_count": 19,
1136
      "metadata": {
1137
        "colab": {
1138
          "base_uri": "https://localhost:8080/"
1139
        },
1140
        "id": "mQeicHTj2pmY",
1141
        "outputId": "9810647d-0846-4ca8-ae2e-1b249ca4ad3d"
1142
      },
1143
      "outputs": [
1144
        {
1145
          "name": "stdout",
1146
          "output_type": "stream",
1147
          "text": [
1148
            "\n",
1149
            "\n",
1150
            "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
1151
            "\u001b[32;1m\u001b[1;3m{\n",
1152
            "    \"action\": \"Knowledge Base\",\n",
1153
            "    \"action_input\": \"University of Notre Dame\"\n",
1154
            "}\u001b[0m\n",
1155
            "Observation: \u001b[36;1m\u001b[1;3mThe University of Notre Dame is a Catholic research university located in South Bend, Indiana, in the United States. It is known for its strong academic programs, including undergraduate colleges in Arts and Letters, Science, Engineering, Business, and the Architecture School. The university also has a graduate program with over 50 master's, doctoral, and professional degree programs. Notre Dame is recognized as one of the top universities in the United States and has a strong alumni network. It is also known for its iconic landmarks, such as the Golden Dome and the Basilica. The university is committed to research and has various institutes dedicated to different fields of study. Notre Dame is also home to the Notre Dame Global Adaptation Index, which ranks countries based on their vulnerability to climate change.\u001b[0m\n",
1156
            "Thought:\u001b[32;1m\u001b[1;3m{\n",
1157
            "    \"action\": \"Final Answer\",\n",
1158
            "    \"action_input\": \"The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs in various fields, including Arts and Letters, Science, Engineering, Business, and Architecture. Notre Dame is known for its academic excellence, iconic landmarks like the Golden Dome and the Basilica, and its commitment to research. It is also home to the Notre Dame Global Adaptation Index, which ranks countries based on their vulnerability to climate change.\"\n",
1159
            "}\u001b[0m\n",
1160
            "\n",
1161
            "\u001b[1m> Finished chain.\u001b[0m\n"
1162
          ]
1163
        },
1164
        {
1165
          "data": {
1166
            "text/plain": [
1167
              "{'input': 'can you tell me some facts about the University of Notre Dame?',\n",
1168
              " 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?'),\n",
1169
              "  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.'),\n",
1170
              "  HumanMessage(content='what is 2 * 7?'),\n",
1171
              "  AIMessage(content='The product of 2 multiplied by 7 is 14.')],\n",
1172
              " 'output': 'The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs in various fields, including Arts and Letters, Science, Engineering, Business, and Architecture. Notre Dame is known for its academic excellence, iconic landmarks like the Golden Dome and the Basilica, and its commitment to research. It is also home to the Notre Dame Global Adaptation Index, which ranks countries based on their vulnerability to climate change.'}"
1173
            ]
1174
          },
1175
          "execution_count": 19,
1176
          "metadata": {},
1177
          "output_type": "execute_result"
1178
        }
1179
      ],
1180
      "source": [
1181
        "agent(\"can you tell me some facts about the University of Notre Dame?\")"
1182
      ]
1183
    },
1184
    {
1185
      "cell_type": "code",
1186
      "execution_count": 20,
1187
      "metadata": {
1188
        "colab": {
1189
          "base_uri": "https://localhost:8080/"
1190
        },
1191
        "id": "G93vLXso3B5Z",
1192
        "outputId": "cb345147-5630-4a5a-bb9f-242ad91d9968"
1193
      },
1194
      "outputs": [
1195
        {
1196
          "name": "stdout",
1197
          "output_type": "stream",
1198
          "text": [
1199
            "\n",
1200
            "\n",
1201
            "\u001b[1m> Entering new AgentExecutor chain...\u001b[0m\n",
1202
            "\u001b[32;1m\u001b[1;3m{\n",
1203
            "    \"action\": \"Final Answer\",\n",
1204
            "    \"action_input\": \"The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs and is known for its iconic landmarks and commitment to research.\"\n",
1205
            "}\u001b[0m\n",
1206
            "\n",
1207
            "\u001b[1m> Finished chain.\u001b[0m\n"
1208
          ]
1209
        },
1210
        {
1211
          "data": {
1212
            "text/plain": [
1213
              "{'input': 'can you summarize these facts in two short sentences',\n",
1214
              " 'chat_history': [HumanMessage(content='when was the college of engineering in the University of Notre Dame established?'),\n",
1215
              "  AIMessage(content='The College of Engineering at the University of Notre Dame was established in 1920.'),\n",
1216
              "  HumanMessage(content='what is 2 * 7?'),\n",
1217
              "  AIMessage(content='The product of 2 multiplied by 7 is 14.'),\n",
1218
              "  HumanMessage(content='can you tell me some facts about the University of Notre Dame?'),\n",
1219
              "  AIMessage(content='The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs in various fields, including Arts and Letters, Science, Engineering, Business, and Architecture. Notre Dame is known for its academic excellence, iconic landmarks like the Golden Dome and the Basilica, and its commitment to research. It is also home to the Notre Dame Global Adaptation Index, which ranks countries based on their vulnerability to climate change.')],\n",
1220
              " 'output': 'The University of Notre Dame is a Catholic research university located in South Bend, Indiana. It offers strong academic programs and is known for its iconic landmarks and commitment to research.'}"
1221
            ]
1222
          },
1223
          "execution_count": 20,
1224
          "metadata": {},
1225
          "output_type": "execute_result"
1226
        }
1227
      ],
1228
      "source": [
1229
        "agent(\"can you summarize these facts in two short sentences\")"
1230
      ]
1231
    },
1232
    {
1233
      "attachments": {},
1234
      "cell_type": "markdown",
1235
      "metadata": {
1236
        "id": "PWivmw9F3bCw"
1237
      },
1238
      "source": [
1239
        "Looks great! We're also able to ask questions that refer to previous interactions in the conversation, and the agent is able to use the conversation history as a source of information.\n",
1240
        "\n",
1241
        "That's all for this example of building a retrieval augmented conversational agent with OpenAI and Pinecone (the OP stack) and LangChain.\n",
1242
        "\n",
1243
        "Once finished, we delete the Pinecone index to save resources:"
1244
      ]
1245
    },
1246
    {
1247
      "cell_type": "code",
1248
      "execution_count": 21,
1249
      "metadata": {
1250
        "id": "Pa1whr8V3Wfm"
1251
      },
1252
      "outputs": [],
1253
      "source": [
1254
        "pc.delete_index(index_name)"
1255
      ]
1256
    },
1257
    {
1258
      "attachments": {},
1259
      "cell_type": "markdown",
1260
      "metadata": {
1261
        "id": "Ykg5TYA033yR"
1262
      },
1263
      "source": [
1264
        "---"
1265
      ]
1266
    }
1267
  ],
1268
  "metadata": {
1269
    "colab": {
1270
      "provenance": []
1271
    },
1272
    "kernelspec": {
1273
      "display_name": "Python 3",
1274
      "name": "python3"
1275
    },
1276
    "language_info": {
1277
      "codemirror_mode": {
1278
        "name": "ipython",
1279
        "version": 3
1280
      },
1281
      "file_extension": ".py",
1282
      "mimetype": "text/x-python",
1283
      "name": "python",
1284
      "nbconvert_exporter": "python",
1285
      "pygments_lexer": "ipython3",
1286
      "version": "3.9.12"
1287
    },
1288
    "widgets": {
1289
      "application/vnd.jupyter.widget-state+json": {
1290
        "0679733cce4f4b5eb25f8f56dc7ebe22": {
1291
          "model_module": "@jupyter-widgets/base",
1292
          "model_module_version": "1.2.0",
1293
          "model_name": "LayoutModel",
1294
          "state": {
1295
            "_model_module": "@jupyter-widgets/base",
1296
            "_model_module_version": "1.2.0",
1297
            "_model_name": "LayoutModel",
1298
            "_view_count": null,
1299
            "_view_module": "@jupyter-widgets/base",
1300
            "_view_module_version": "1.2.0",
1301
            "_view_name": "LayoutView",
1302
            "align_content": null,
1303
            "align_items": null,
1304
            "align_self": null,
1305
            "border": null,
1306
            "bottom": null,
1307
            "display": null,
1308
            "flex": null,
1309
            "flex_flow": null,
1310
            "grid_area": null,
1311
            "grid_auto_columns": null,
1312
            "grid_auto_flow": null,
1313
            "grid_auto_rows": null,
1314
            "grid_column": null,
1315
            "grid_gap": null,
1316
            "grid_row": null,
1317
            "grid_template_areas": null,
1318
            "grid_template_columns": null,
1319
            "grid_template_rows": null,
1320
            "height": null,
1321
            "justify_content": null,
1322
            "justify_items": null,
1323
            "left": null,
1324
            "margin": null,
1325
            "max_height": null,
1326
            "max_width": null,
1327
            "min_height": null,
1328
            "min_width": null,
1329
            "object_fit": null,
1330
            "object_position": null,
1331
            "order": null,
1332
            "overflow": null,
1333
            "overflow_x": null,
1334
            "overflow_y": null,
1335
            "padding": null,
1336
            "right": null,
1337
            "top": null,
1338
            "visibility": null,
1339
            "width": null
1340
          }
1341
        },
1342
        "07b47869f727455fb5ba67e99767c04c": {
1343
          "model_module": "@jupyter-widgets/base",
1344
          "model_module_version": "1.2.0",
1345
          "model_name": "LayoutModel",
1346
          "state": {
1347
            "_model_module": "@jupyter-widgets/base",
1348
            "_model_module_version": "1.2.0",
1349
            "_model_name": "LayoutModel",
1350
            "_view_count": null,
1351
            "_view_module": "@jupyter-widgets/base",
1352
            "_view_module_version": "1.2.0",
1353
            "_view_name": "LayoutView",
1354
            "align_content": null,
1355
            "align_items": null,
1356
            "align_self": null,
1357
            "border": null,
1358
            "bottom": null,
1359
            "display": null,
1360
            "flex": null,
1361
            "flex_flow": null,
1362
            "grid_area": null,
1363
            "grid_auto_columns": null,
1364
            "grid_auto_flow": null,
1365
            "grid_auto_rows": null,
1366
            "grid_column": null,
1367
            "grid_gap": null,
1368
            "grid_row": null,
1369
            "grid_template_areas": null,
1370
            "grid_template_columns": null,
1371
            "grid_template_rows": null,
1372
            "height": null,
1373
            "justify_content": null,
1374
            "justify_items": null,
1375
            "left": null,
1376
            "margin": null,
1377
            "max_height": null,
1378
            "max_width": null,
1379
            "min_height": null,
1380
            "min_width": null,
1381
            "object_fit": null,
1382
            "object_position": null,
1383
            "order": null,
1384
            "overflow": null,
1385
            "overflow_x": null,
1386
            "overflow_y": null,
1387
            "padding": null,
1388
            "right": null,
1389
            "top": null,
1390
            "visibility": null,
1391
            "width": null
1392
          }
1393
        },
1394
        "1b5fe655986842a6afd4a2eb2c0f47fc": {
1395
          "model_module": "@jupyter-widgets/controls",
1396
          "model_module_version": "1.5.0",
1397
          "model_name": "FloatProgressModel",
1398
          "state": {
1399
            "_dom_classes": [],
1400
            "_model_module": "@jupyter-widgets/controls",
1401
            "_model_module_version": "1.5.0",
1402
            "_model_name": "FloatProgressModel",
1403
            "_view_count": null,
1404
            "_view_module": "@jupyter-widgets/controls",
1405
            "_view_module_version": "1.5.0",
1406
            "_view_name": "ProgressView",
1407
            "bar_style": "success",
1408
            "description": "",
1409
            "description_tooltip": null,
1410
            "layout": "IPY_MODEL_a03d372e8b4f456e8663e54c818e0fe9",
1411
            "max": 189,
1412
            "min": 0,
1413
            "orientation": "horizontal",
1414
            "style": "IPY_MODEL_6fc773ccbc594fddbc1f2f0adaa96186",
1415
            "value": 189
1416
          }
1417
        },
1418
        "321ebcd192fc41e0a7a2ba9c0e7b8556": {
1419
          "model_module": "@jupyter-widgets/controls",
1420
          "model_module_version": "1.5.0",
1421
          "model_name": "HBoxModel",
1422
          "state": {
1423
            "_dom_classes": [],
1424
            "_model_module": "@jupyter-widgets/controls",
1425
            "_model_module_version": "1.5.0",
1426
            "_model_name": "HBoxModel",
1427
            "_view_count": null,
1428
            "_view_module": "@jupyter-widgets/controls",
1429
            "_view_module_version": "1.5.0",
1430
            "_view_name": "HBoxView",
1431
            "box_style": "",
1432
            "children": [
1433
              "IPY_MODEL_7c898f43b39545879013df760aa4f6f1",
1434
              "IPY_MODEL_1b5fe655986842a6afd4a2eb2c0f47fc",
1435
              "IPY_MODEL_76872ca737b64616aeacc90f2f7fc655"
1436
            ],
1437
            "layout": "IPY_MODEL_0679733cce4f4b5eb25f8f56dc7ebe22"
1438
          }
1439
        },
1440
        "4b2e12a33584489690aaf69f51cd4a3b": {
1441
          "model_module": "@jupyter-widgets/controls",
1442
          "model_module_version": "1.5.0",
1443
          "model_name": "DescriptionStyleModel",
1444
          "state": {
1445
            "_model_module": "@jupyter-widgets/controls",
1446
            "_model_module_version": "1.5.0",
1447
            "_model_name": "DescriptionStyleModel",
1448
            "_view_count": null,
1449
            "_view_module": "@jupyter-widgets/base",
1450
            "_view_module_version": "1.2.0",
1451
            "_view_name": "StyleView",
1452
            "description_width": ""
1453
          }
1454
        },
1455
        "6fc773ccbc594fddbc1f2f0adaa96186": {
1456
          "model_module": "@jupyter-widgets/controls",
1457
          "model_module_version": "1.5.0",
1458
          "model_name": "ProgressStyleModel",
1459
          "state": {
1460
            "_model_module": "@jupyter-widgets/controls",
1461
            "_model_module_version": "1.5.0",
1462
            "_model_name": "ProgressStyleModel",
1463
            "_view_count": null,
1464
            "_view_module": "@jupyter-widgets/base",
1465
            "_view_module_version": "1.2.0",
1466
            "_view_name": "StyleView",
1467
            "bar_color": null,
1468
            "description_width": ""
1469
          }
1470
        },
        "71551fc297b54fb4a21d787d117e01a1": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "76872ca737b64616aeacc90f2f7fc655": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_07b47869f727455fb5ba67e99767c04c",
            "placeholder": "​",
            "style": "IPY_MODEL_4b2e12a33584489690aaf69f51cd4a3b",
            "value": " 189/189 [03:23&lt;00:00,  1.12s/it]"
          }
        },
        "7c898f43b39545879013df760aa4f6f1": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "HTMLModel",
          "state": {
            "_dom_classes": [],
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "HTMLModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/controls",
            "_view_module_version": "1.5.0",
            "_view_name": "HTMLView",
            "description": "",
            "description_tooltip": null,
            "layout": "IPY_MODEL_71551fc297b54fb4a21d787d117e01a1",
            "placeholder": "​",
            "style": "IPY_MODEL_b90283522431447b925b226431bb4fa6",
            "value": "100%"
          }
        },
        "a03d372e8b4f456e8663e54c818e0fe9": {
          "model_module": "@jupyter-widgets/base",
          "model_module_version": "1.2.0",
          "model_name": "LayoutModel",
          "state": {
            "_model_module": "@jupyter-widgets/base",
            "_model_module_version": "1.2.0",
            "_model_name": "LayoutModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "LayoutView",
            "align_content": null,
            "align_items": null,
            "align_self": null,
            "border": null,
            "bottom": null,
            "display": null,
            "flex": null,
            "flex_flow": null,
            "grid_area": null,
            "grid_auto_columns": null,
            "grid_auto_flow": null,
            "grid_auto_rows": null,
            "grid_column": null,
            "grid_gap": null,
            "grid_row": null,
            "grid_template_areas": null,
            "grid_template_columns": null,
            "grid_template_rows": null,
            "height": null,
            "justify_content": null,
            "justify_items": null,
            "left": null,
            "margin": null,
            "max_height": null,
            "max_width": null,
            "min_height": null,
            "min_width": null,
            "object_fit": null,
            "object_position": null,
            "order": null,
            "overflow": null,
            "overflow_x": null,
            "overflow_y": null,
            "padding": null,
            "right": null,
            "top": null,
            "visibility": null,
            "width": null
          }
        },
        "b90283522431447b925b226431bb4fa6": {
          "model_module": "@jupyter-widgets/controls",
          "model_module_version": "1.5.0",
          "model_name": "DescriptionStyleModel",
          "state": {
            "_model_module": "@jupyter-widgets/controls",
            "_model_module_version": "1.5.0",
            "_model_name": "DescriptionStyleModel",
            "_view_count": null,
            "_view_module": "@jupyter-widgets/base",
            "_view_module_version": "1.2.0",
            "_view_name": "StyleView",
            "description_width": ""
          }
        }
      }
    }
  },
  "nbformat": 4,
  "nbformat_minor": 0
}