{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {
"id": "AWGzucuFfbBn"
},
"source": [
"[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/00-langchain-intro.ipynb) [![Open nbviewer](https://raw.githubusercontent.com/pinecone-io/examples/master/assets/nbviewer-shield.svg)](https://nbviewer.org/github/pinecone-io/examples/blob/master/learn/generation/langchain/handbook/00-langchain-intro.ipynb)\n",
"\n",
"#### [LangChain Handbook](https://github.com/pinecone-io/examples/tree/master/generation/langchain/handbook)\n",
"\n",
"# Intro to LangChain\n",
"\n",
"LangChain is a popular framework that allows users to quickly build apps and pipelines around **L**arge **L**anguage **M**odels. It can be used for chatbots, **G**enerative **Q**uestion-**A**nswering (GQA), summarization, and much more.\n",
"\n",
"The core idea of the library is that we can _\"chain\"_ together different components to create more advanced use-cases around LLMs. Chains may consist of multiple components from several modules:\n",
"\n",
"* **Prompt templates**: Prompt templates are, well, templates for different types of prompts, like \"chatbot\"-style templates, ELI5 question-answering, etc.\n",
"\n",
"* **LLMs**: Large language models like GPT-3, BLOOM, etc.\n",
"\n",
"* **Agents**: Agents use LLMs to decide which actions to take; tools like web search or calculators can be used, all packaged into a logical loop of operations.\n",
"\n",
"* **Memory**: Short-term memory, long-term memory."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {
"id": "r-ryCeG_f_GC"
},
"outputs": [],
"source": [
"!pip install -qU langchain"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "mNaXrEPOhbuL"
},
"source": [
"# Using LLMs in LangChain\n",
"\n",
"LangChain supports several LLM providers, like Hugging Face and OpenAI.\n",
"\n",
"Let's start our exploration of LangChain by learning how to use a few of these different LLM integrations.\n",
"\n",
"## Hugging Face\n",
"\n",
"We first need to install additional prerequisite libraries:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "LWA15ZkVjg80",
"outputId": "b38a4c6a-9d98-44b4-eb71-6e96398c647a"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
"Collecting huggingface_hub\n",
"  Downloading huggingface_hub-0.11.1-py3-none-any.whl (182 kB)\n",
"\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/182.4 KB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m182.4/182.4 KB\u001b[0m \u001b[31m5.3 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25hRequirement already satisfied: tqdm in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (4.64.1)\n",
"Requirement already satisfied: packaging>=20.9 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (21.3)\n",
"Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (6.0)\n",
"Requirement already satisfied: requests in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (2.25.1)\n",
"Requirement already satisfied: filelock in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (3.9.0)\n",
"Requirement already satisfied: typing-extensions>=3.7.4.3 in /usr/local/lib/python3.8/dist-packages (from huggingface_hub) (4.4.0)\n",
"Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.8/dist-packages (from packaging>=20.9->huggingface_hub) (3.0.9)\n",
"Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub) (2.10)\n",
"Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub) (1.24.3)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub) (2022.12.7)\n",
"Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests->huggingface_hub) (4.0.0)\n",
"Installing collected packages: huggingface_hub\n",
"Successfully installed huggingface_hub-0.11.1\n"
]
}
],
"source": [
"!pip install -qU huggingface_hub"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "m-whfR5Tjf1O"
},
"source": [
"For Hugging Face models we need a Hugging Face Hub API token. We can find this by first creating an account at [HuggingFace.co](https://huggingface.co/), then clicking on our profile in the top-right corner > *Settings* > *Access Tokens* > *New Token* > set *Role* to *write* > *Generate* > copy and paste the token below:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "sRGTytxCjKaW"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ['HUGGINGFACEHUB_API_TOKEN'] = 'HF_API_KEY'"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "exAl3iQgnAra"
},
"source": [
"We can then generate text using a HF Hub model (we'll use `google/flan-t5-xl`) using the Inference API built into Hugging Face Hub.\n",
"\n",
"_(The default Inference API doesn't use specialized hardware, so it can be slow and cannot run larger models like `bigscience/bloom` or `google/flan-t5-xxl`)_"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "l7yubiSJhIfs",
"outputId": "39f9bb8b-c116-46a3-e9be-c3b3549789a9"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"green bay packers\n"
]
}
],
"source": [
"from langchain import PromptTemplate, HuggingFaceHub, LLMChain\n",
"\n",
"# initialize HF LLM\n",
"flan_t5 = HuggingFaceHub(\n",
"    repo_id=\"google/flan-t5-xl\",\n",
"    model_kwargs={\"temperature\":1e-10}\n",
")\n",
"\n",
"# build prompt template for simple question-answering\n",
"template = \"\"\"Question: {question}\n",
"\n",
"Answer: \"\"\"\n",
"prompt = PromptTemplate(template=template, input_variables=[\"question\"])\n",
"\n",
"llm_chain = LLMChain(\n",
"    prompt=prompt,\n",
"    llm=flan_t5\n",
")\n",
"\n",
"question = \"Which NFL team won the Super Bowl in the 2010 season?\"\n",
"\n",
"print(llm_chain.run(question))"
]
},
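{
"cell_type": "markdown",
"metadata": {},
"source": [
"Under the hood, `LLMChain` first renders the prompt template with our input variables and then sends the result to the LLM. As a quick sanity check (it reuses the `prompt` and `question` objects defined above and makes no API calls), we can preview the rendered string ourselves via `PromptTemplate.format`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# render the prompt template without calling the LLM\n",
"print(prompt.format(question=question))"
]
},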
{
"cell_type": "markdown",
"metadata": {
"id": "DXroZSjCKxa2"
},
"source": [
"If we'd like to ask multiple questions, we can do so by passing a list of dictionary objects, where each dictionary must contain the input variable set in our prompt template (`\"question\"`) mapped to the question we'd like to ask."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "4jNZgxSIJsXj",
"outputId": "f5711d89-de5a-48d0-e815-9f186a84e807"
},
"outputs": [
{
"data": {
"text/plain": [
"LLMResult(generations=[[Generation(text='green bay packers', generation_info=None)], [Generation(text='184', generation_info=None)], [Generation(text='john glenn', generation_info=None)], [Generation(text='one', generation_info=None)]], llm_output=None)"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qs = [\n",
"    {'question': \"Which NFL team won the Super Bowl in the 2010 season?\"},\n",
"    {'question': \"If I am 6 ft 4 inches, how tall am I in centimeters?\"},\n",
"    {'question': \"Who was the 12th person on the moon?\"},\n",
"    {'question': \"How many eyes does a blade of grass have?\"}\n",
"]\n",
"res = llm_chain.generate(qs)\n",
"res"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "7zoxlXHYLQix"
},
"source": [
"It is an LLM, so we can try feeding in all the questions at once:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "b96WIvouLQ-7",
"outputId": "c9ff1c1f-2991-4832-d57c-b46cc346ca64"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"six\n"
]
}
],
"source": [
"multi_template = \"\"\"Answer the following questions one at a time.\n",
"\n",
"Questions:\n",
"{questions}\n",
"\n",
"Answers:\n",
"\"\"\"\n",
"long_prompt = PromptTemplate(\n",
"    template=multi_template,\n",
"    input_variables=[\"questions\"]\n",
")\n",
"\n",
"llm_chain = LLMChain(\n",
"    prompt=long_prompt,\n",
"    llm=flan_t5\n",
")\n",
"\n",
"qs_str = (\n",
"    \"Which NFL team won the Super Bowl in the 2010 season?\\n\" +\n",
"    \"If I am 6 ft 4 inches, how tall am I in centimeters?\\n\" +\n",
"    \"Who was the 12th person on the moon?\\n\" +\n",
"    \"How many eyes does a blade of grass have?\"\n",
")\n",
"\n",
"print(llm_chain.run(qs_str))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "y99CMKSbOqBy"
},
"source": [
"But with this model it doesn't work too well; we'll soon see that this approach works better with other models."
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "YpdXG9YtzrLJ"
},
"source": [
"## OpenAI\n",
"\n",
"Start by installing additional prerequisites:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "FHo2YRHPDgHH",
"outputId": "c8fd417b-d6b4-4f8e-a8a0-a95e3e1e676f"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/\n",
"Collecting openai\n",
"  Downloading openai-0.26.1.tar.gz (55 kB)\n",
"\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m55.3/55.3 KB\u001b[0m \u001b[31m3.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n",
"\u001b[?25h Installing build dependencies ... \u001b[?25l\u001b[?25hdone\n",
" Getting requirements to build wheel ... \u001b[?25l\u001b[?25hdone\n",
" Preparing metadata (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
"Requirement already satisfied: aiohttp in /usr/local/lib/python3.8/dist-packages (from openai) (3.8.3)\n",
"Requirement already satisfied: tqdm in /usr/local/lib/python3.8/dist-packages (from openai) (4.64.1)\n",
"Requirement already satisfied: requests>=2.20 in /usr/local/lib/python3.8/dist-packages (from openai) (2.25.1)\n",
"Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/local/lib/python3.8/dist-packages (from requests>=2.20->openai) (1.24.3)\n",
"Requirement already satisfied: chardet<5,>=3.0.2 in /usr/local/lib/python3.8/dist-packages (from requests>=2.20->openai) (4.0.0)\n",
"Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.8/dist-packages (from requests>=2.20->openai) (2.10)\n",
"Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.8/dist-packages (from requests>=2.20->openai) (2022.12.7)\n",
"Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.8/dist-packages (from aiohttp->openai) (6.0.4)\n",
"Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.8/dist-packages (from aiohttp->openai) (22.2.0)\n",
"Requirement already satisfied: charset-normalizer<3.0,>=2.0 in /usr/local/lib/python3.8/dist-packages (from aiohttp->openai) (2.1.1)\n",
"Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.8/dist-packages (from aiohttp->openai) (1.3.1)\n",
"Requirement already satisfied: async-timeout<5.0,>=4.0.0a3 in /usr/local/lib/python3.8/dist-packages (from aiohttp->openai) (4.0.2)\n",
"Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.8/dist-packages (from aiohttp->openai) (1.3.3)\n",
"Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.8/dist-packages (from aiohttp->openai) (1.8.2)\n",
"Building wheels for collected packages: openai\n",
"  Building wheel for openai (pyproject.toml) ... \u001b[?25l\u001b[?25hdone\n",
"  Created wheel for openai: filename=openai-0.26.1-py3-none-any.whl size=67316 sha256=3f271293eb83a4726bba3b95ac96b32dabbf776235bd73bdf63d9fef623ce735\n",
"  Stored in directory: /root/.cache/pip/wheels/2f/9c/55/95d3609ccfc463eeffb96d50c756f1f1899453b85e92021a0a\n",
"Successfully built openai\n",
"Installing collected packages: openai\n",
"Successfully installed openai-0.26.1\n"
]
}
],
"source": [
"!pip install -qU openai"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "0fOo9qQvDgkz"
},
"source": [
"We can also use OpenAI's generative models. The process is similar; we need to\n",
"provide our API key, which can be retrieved by signing up for an account on the\n",
"[OpenAI website](https://openai.com/api/) (see the top-right of the page). We then pass the API key below:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {
"id": "deWmOJecfbBr"
},
"outputs": [],
"source": [
"import os\n",
"\n",
"os.environ['OPENAI_API_KEY'] = 'OPENAI_API_KEY'"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "CU4xirWX-Ds4"
},
"source": [
"If using OpenAI via Azure you should also set:\n",
"\n",
"```python\n",
"os.environ['OPENAI_API_TYPE'] = 'azure'\n",
"# API version to use (Azure has several)\n",
"os.environ['OPENAI_API_VERSION'] = '2022-12-01'\n",
"# base URL for your Azure OpenAI resource\n",
"os.environ['OPENAI_API_BASE'] = 'https://your-resource-name.openai.azure.com'\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "2AWnaTCP0Ryg"
},
"source": [
"Then we decide which model we'd like to use; there are several options, but we will go with `text-davinci-003`:"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {
"id": "ZhQSDoYe0ly4"
},
"outputs": [],
"source": [
"from langchain.llms import OpenAI\n",
"\n",
"davinci = OpenAI(model_name='text-davinci-003')"
]
},
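{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `OpenAI` wrapper also accepts common generation parameters. As a sketch (exact defaults vary across `langchain` and `openai` versions), we could configure a more creative variant of the same model:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# same model, but with sampling parameters set explicitly\n",
"creative_davinci = OpenAI(\n",
"    model_name='text-davinci-003',\n",
"    temperature=0.9,  # higher temperature gives more varied completions\n",
"    max_tokens=256  # upper bound on completion length\n",
")"
]
},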
{
"cell_type": "markdown",
"metadata": {
"id": "_NvK4o6SDrs0"
},
"source": [
"Alternatively, if using Azure OpenAI, we do:\n",
"\n",
"```python\n",
"from langchain.llms import AzureOpenAI\n",
"\n",
"llm = AzureOpenAI(\n",
"    deployment_name=\"your-azure-deployment\",\n",
"    model_name=\"text-davinci-003\"\n",
")\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "SGL2zs3uEVj6"
},
"source": [
"We'll use the same simple question-answering prompt template as before in the Hugging Face example. The only change is that we now pass our OpenAI LLM `davinci`:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "gVSsC3iGEPAp",
"outputId": "1d562f8d-2fbf-4cc4-84cd-cce998720eb3"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" The Green Bay Packers won the Super Bowl in the 2010 season.\n"
]
}
],
"source": [
"llm_chain = LLMChain(\n",
"    prompt=prompt,\n",
"    llm=davinci\n",
")\n",
"\n",
"print(llm_chain.run(question))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "DL-buasOKpKs"
},
"source": [
"The same approach works for multiple questions using `generate`:"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vMua1MWcKtSx",
"outputId": "55efeae1-8c30-4069-a2a8-fb78212f8523"
},
"outputs": [
{
"data": {
"text/plain": [
"LLMResult(generations=[[Generation(text=' The Green Bay Packers won the Super Bowl in the 2010 season.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' 193.04 centimeters', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' Eugene A. Cernan was the 12th person to walk on the moon. He was part of the Apollo 17 mission in December 1972.', generation_info={'finish_reason': 'stop', 'logprobs': None})], [Generation(text=' A blade of grass does not have any eyes.', generation_info={'finish_reason': 'stop', 'logprobs': None})]], llm_output={'token_usage': {'total_tokens': 131, 'prompt_tokens': 75, 'completion_tokens': 56}})"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"qs = [\n",
"    {'question': \"Which NFL team won the Super Bowl in the 2010 season?\"},\n",
"    {'question': \"If I am 6 ft 4 inches, how tall am I in centimeters?\"},\n",
"    {'question': \"Who was the 12th person on the moon?\"},\n",
"    {'question': \"How many eyes does a blade of grass have?\"}\n",
"]\n",
"llm_chain.generate(qs)"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that the format below doesn't feed in the questions iteratively but instead all in one chunk."
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "w2-es7SgFddS",
"outputId": "d16b7afc-f0d1-4f6a-9d89-f6811839bb02"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"\n",
"1. The New Orleans Saints \n",
"2. 193 centimeters \n",
"3. Harrison Schmitt \n",
"4. Zero.\n"
]
}
],
"source": [
"qs = [\n",
"    \"Which NFL team won the Super Bowl in the 2010 season?\",\n",
"    \"If I am 6 ft 4 inches, how tall am I in centimeters?\",\n",
"    \"Who was the 12th person on the moon?\",\n",
"    \"How many eyes does a blade of grass have?\"\n",
"]\n",
"print(llm_chain.run(qs))"
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we can try to answer all questions in one go; as mentioned, more powerful LLMs like `text-davinci-003` are more likely to handle these more complex queries."
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "xbjxnnVzA47s",
"outputId": "cf5397ca-9e06-4221-eb45-17b1346f6f06"
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The New Orleans Saints won the Super Bowl in the 2010 season.\n",
"If you are 6 ft 4 inches, you are 193.04 centimeters tall.\n",
"The 12th person on the moon was Harrison Schmitt.\n",
"A blade of grass does not have any eyes."
]
}
],
"source": [
"multi_template = \"\"\"Answer the following questions one at a time.\n",
"\n",
"Questions:\n",
"{questions}\n",
"\n",
"Answers:\n",
"\"\"\"\n",
"long_prompt = PromptTemplate(\n",
"    template=multi_template,\n",
"    input_variables=[\"questions\"]\n",
")\n",
"\n",
"llm_chain = LLMChain(\n",
"    prompt=long_prompt,\n",
"    llm=davinci\n",
")\n",
"\n",
"qs_str = (\n",
"    \"Which NFL team won the Super Bowl in the 2010 season?\\n\" +\n",
"    \"If I am 6 ft 4 inches, how tall am I in centimeters?\\n\" +\n",
"    \"Who was the 12th person on the moon?\\n\" +\n",
"    \"How many eyes does a blade of grass have?\"\n",
")\n",
"\n",
"print(llm_chain.run(qs_str))"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "ybMkI18xfbBr"
},
"source": [
"---"
]
}
],
"metadata": {
"colab": {
"provenance": []
},
"gpuClass": "standard",
"kernelspec": {
"display_name": "ml",
"language": "python",
"name": "python3"
},
"language_info": {
"name": "python",
"version": "3.9.12 (main, Apr 5 2022, 01:52:34) \n[Clang 12.0.0 ]"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "b8e7999f96e1b425e2d542f21b571f5a4be3e97158b0b46ea1b2500df63956ce"
}
}
},
"nbformat": 4,
"nbformat_minor": 0
}