{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "a6231285",
   "metadata": {},
   "source": [
    "# Press Release Chat Bot\n",
    "\n",
    "As part of this generative AI workflow, we create an NVIDIA PR chatbot that answers questions about NVIDIA news and blog posts from 2022 and 2023. For this, we have created a REST FastAPI server that wraps llama-index. The API server exposes two methods, ```upload_document``` and ```generate```. The ```upload_document``` method takes a document from the user's computer and uploads it to a Milvus vector database after splitting, chunking, and embedding it. The ```generate``` method generates an answer to the provided prompt, optionally sourcing information from the vector database."
   ]
  },
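  {
   "cell_type": "markdown",
   "id": "b7c01a2f",
   "metadata": {},
   "source": [
    "Before loading any data, you can confirm that the chain server is reachable. Since the server is built on FastAPI, its interactive API documentation is typically served at ```/docs``` by default; the check below is a minimal sketch and assumes that default has not been disabled."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "3d9e8f10",
   "metadata": {},
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "# FastAPI serves interactive API docs at /docs by default;\n",
    "# a 200 response indicates the chain server is up and reachable.\n",
    "print(requests.get(\"http://chain-server:8081/docs\").status_code)"
   ]
  },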
  {
   "cell_type": "markdown",
   "id": "4c74eaf2",
   "metadata": {},
   "source": [
    "#### Step-1: Load the PDF files from the dataset folder.\n",
    "\n",
    "You can upload the PDF files containing the NVIDIA blogs to the ```chain-server:8081/documents``` API endpoint."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "263a7a8b",
   "metadata": {},
   "outputs": [],
   "source": [
    "%%capture\n",
    "!unzip dataset.zip"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c2244b8c",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import requests\n",
    "import mimetypes\n",
    "\n",
    "def upload_document(file_path, url):\n",
    "    headers = {\n",
    "        'accept': 'application/json'\n",
    "    }\n",
    "    mime_type, _ = mimetypes.guess_type(file_path)\n",
    "    # Close the file handle once the request completes\n",
    "    with open(file_path, 'rb') as f:\n",
    "        files = {\n",
    "            'file': (file_path, f, mime_type)\n",
    "        }\n",
    "        response = requests.post(url, headers=headers, files=files)\n",
    "\n",
    "    return response.text\n",
    "\n",
    "def upload_pdf_files(folder_path, upload_url, num_files):\n",
    "    i = 0\n",
    "    for file_name in os.listdir(folder_path):\n",
    "        _, ext = os.path.splitext(file_name)\n",
    "        # Ingest only pdf files\n",
    "        if ext.lower() == \".pdf\":\n",
    "            file_path = os.path.join(folder_path, file_name)\n",
    "            print(upload_document(file_path, upload_url))\n",
    "            i += 1\n",
    "            # Stop after num_files documents have been uploaded\n",
    "            if i >= num_files:\n",
    "                break"
   ]
  },
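  {
   "cell_type": "markdown",
   "id": "7a5b2c91",
   "metadata": {},
   "source": [
    "Before kicking off the upload, it can help to confirm that unzipping produced the expected files. The quick check below is a minimal sketch; it assumes the archive extracted into a local ```dataset``` folder, as used in the next cell."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c4d5e6f7",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "# List the PDFs that upload_pdf_files will ingest\n",
    "pdfs = [f for f in os.listdir(\"dataset\") if f.lower().endswith(\".pdf\")]\n",
    "print(f\"Found {len(pdfs)} PDF files\")\n",
    "print(pdfs[:5])"
   ]
  },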
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4f5c99ac",
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "\n",
    "start_time = time.time()\n",
    "NUM_DOCS_TO_UPLOAD = 100\n",
    "upload_pdf_files(\"dataset\", \"http://chain-server:8081/documents\", NUM_DOCS_TO_UPLOAD)\n",
    "print(f\"--- {time.time() - start_time} seconds ---\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "830882ef",
   "metadata": {},
   "source": [
    "#### Step-2: Ask a question without referring to the knowledge base\n",
    "Ask the TensorRT-LLM Llama 2 13B model a question about the NVIDIA Grace superchip without seeking help from the vector database/knowledge base by setting ```use_knowledge_base``` to ```False```."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "4eb862fd",
   "metadata": {},
   "outputs": [],
   "source": [
    "import time\n",
    "import json\n",
    "\n",
    "data = {\n",
    "  \"messages\": [\n",
    "    {\n",
    "      \"role\": \"user\",\n",
    "      \"content\": \"how many cores are on the nvidia grace superchip?\"\n",
    "    }\n",
    "  ],\n",
    "  \"use_knowledge_base\": False,\n",
    "  \"max_tokens\": 256\n",
    "}\n",
    "\n",
    "url = \"http://chain-server:8081/generate\"\n",
    "\n",
    "start_time = time.time()\n",
    "with requests.post(url, stream=True, json=data) as req:\n",
    "    for chunk in req.iter_lines():\n",
    "        raw_resp = chunk.decode(\"UTF-8\")\n",
    "        if not raw_resp:\n",
    "            continue\n",
    "        # Each line is a server-sent event; strip the leading \"data: \" prefix\n",
    "        resp_dict = json.loads(raw_resp[6:])\n",
    "        resp_choices = resp_dict.get(\"choices\", [])\n",
    "        if len(resp_choices):\n",
    "            resp_str = resp_choices[0].get(\"message\", {}).get(\"content\", \"\")\n",
    "            print(resp_str, end=\"\")\n",
    "\n",
    "print(f\"--- {time.time() - start_time} seconds ---\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "fcf37ee9",
   "metadata": {},
   "source": [
    "Now ask it the same question by setting ```use_knowledge_base``` to ```True```."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "e904a658",
   "metadata": {},
   "outputs": [],
   "source": [
    "data = {\n",
    "  \"messages\": [\n",
    "    {\n",
    "      \"role\": \"user\",\n",
    "      \"content\": \"how many cores are on the nvidia grace superchip?\"\n",
    "    }\n",
    "  ],\n",
    "  \"use_knowledge_base\": True,\n",
    "  \"max_tokens\": 50\n",
    "}\n",
    "\n",
    "url = \"http://chain-server:8081/generate\"\n",
    "\n",
    "start_time = time.time()\n",
    "tokens_generated = 0\n",
    "with requests.post(url, stream=True, json=data) as req:\n",
    "    for chunk in req.iter_lines():\n",
    "        raw_resp = chunk.decode(\"UTF-8\")\n",
    "        if not raw_resp:\n",
    "            continue\n",
    "        # Each line is a server-sent event; strip the leading \"data: \" prefix\n",
    "        resp_dict = json.loads(raw_resp[6:])\n",
    "        resp_choices = resp_dict.get(\"choices\", [])\n",
    "        if len(resp_choices):\n",
    "            resp_str = resp_choices[0].get(\"message\", {}).get(\"content\", \"\")\n",
    "            # Count each streamed chunk as one token (an approximation)\n",
    "            tokens_generated += 1\n",
    "            print(resp_str, end=\"\")\n",
    "\n",
    "total_time = time.time() - start_time\n",
    "print(f\"\\n--- Generated {tokens_generated} tokens in {total_time} seconds ---\")\n",
    "print(f\"--- {tokens_generated/total_time} tokens/sec\")"
   ]
  },
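  {
   "cell_type": "markdown",
   "id": "9f2e4b63",
   "metadata": {},
   "source": [
    "The streaming loop above appears twice, so as a convenience you can wrap it in a reusable helper. The sketch below consolidates the same parsing logic; the ```stream_generate``` name and its parameters are our own additions, not part of the chain server API."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "5e6f7a8b",
   "metadata": {},
   "outputs": [],
   "source": [
    "def stream_generate(question, use_knowledge_base=True, max_tokens=256,\n",
    "                    url=\"http://chain-server:8081/generate\"):\n",
    "    \"\"\"Send a question to the chain server and print the streamed answer.\"\"\"\n",
    "    data = {\n",
    "        \"messages\": [{\"role\": \"user\", \"content\": question}],\n",
    "        \"use_knowledge_base\": use_knowledge_base,\n",
    "        \"max_tokens\": max_tokens\n",
    "    }\n",
    "    with requests.post(url, stream=True, json=data) as req:\n",
    "        for chunk in req.iter_lines():\n",
    "            raw_resp = chunk.decode(\"UTF-8\")\n",
    "            if not raw_resp:\n",
    "                continue\n",
    "            # Strip the \"data: \" server-sent-event prefix before parsing\n",
    "            resp_dict = json.loads(raw_resp[6:])\n",
    "            for choice in resp_dict.get(\"choices\", []):\n",
    "                print(choice.get(\"message\", {}).get(\"content\", \"\"), end=\"\")\n",
    "\n",
    "stream_generate(\"how many cores are on the nvidia grace superchip?\")"
   ]
  },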
  {
   "cell_type": "markdown",
   "id": "58954d15",
   "metadata": {},
   "source": [
    "#### Next steps\n",
    "\n",
    "We have set up a playground UI where you can upload files and get answers. The UI is available on the same IP address as the notebooks: `host_ip:8090/converse`"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}