LLM-FineTuning-Large-Language-Models

togetherai-api-with_Mixtral.ipynb 
{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Together API with Mixtral\n",
    "\n",
    "### Check out my [Twitter (@rohanpaul_ai)](https://twitter.com/rohanpaul_ai) for daily LLM bits"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# !pip install together python-dotenv sseclient-py"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import together\n",
    "import dotenv\n",
    "import os\n",
    "\n",
    "dotenv.load_dotenv()\n",
    "together.api_key = os.getenv(\"together_key\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model_list = together.Models.list()\n",
    "print(f\"{len(model_list)} models available\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "https://pypi.org/project/together/"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "This will print something like the following:\n",
    "\n",
    "```\n",
    "120 models available\n",
    "\n",
    "['EleutherAI/gpt-j-6b',\n",
    " 'EleutherAI/gpt-neox-20b',\n",
    " 'EleutherAI/pythia-12b-v0',\n",
    " 'EleutherAI/pythia-1b-v0',\n",
    " 'EleutherAI/pythia-2.8b-v0',\n",
    " 'EleutherAI/pythia-6.9b',\n",
    " 'HuggingFaceH4/starchat-alpha',\n",
    " 'NousResearch/Nous-Hermes-13b',\n",
    " 'NousResearch/Nous-Hermes-Llama2-13b',\n",
    " 'NumbersStation/nsql-6B']\n",
    "```\n",
    "\n",
    "The `Complete` class of the Together Python library lets you integrate the Together API's completion functionality into your application and generate text with a single call.\n",
    "\n",
    "https://docs.together.ai/docs/python-complete\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "model = \"mistralai/Mixtral-8x7B-v0.1\"\n",
    "\n",
    "prompt = \"\"\"To install PSU in your desktop machine first you will\"\"\"\n",
    "\n",
    "output = together.Complete.create(\n",
    "  prompt = prompt,\n",
    "  model = model,\n",
    "  max_tokens = 64,\n",
    "  temperature = 0.7,\n",
    "  top_k = 50,\n",
    "  top_p = 0.7,\n",
    "  repetition_penalty = 1,\n",
    "  # stop = []  # add any sequence you want to stop generating at\n",
    ")\n",
    "\n",
    "# print the generated text\n",
    "print(output['output']['choices'][0]['text'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`max_tokens (integer, optional)` -- Maximum number of tokens the model should generate. Default: 128\n",
    "\n",
    "`stop (List[str], optional)` -- List of stop sequences at which the model should stop generating. Default: [\"<human>\"]\n",
    "\n",
    "`temperature (float, optional)` -- A decimal number that determines the degree of randomness in the response. Default: 0.7\n",
    "\n",
    "`repetition_penalty (float, optional)` -- A number that controls the diversity of generated text by reducing the likelihood of repeated sequences. Higher values decrease repetition. Default: 1\n",
    "\n",
    "-----------------"
   ]
  },
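  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sketch of the `stop` parameter described above (assuming the same `prompt` and `model` as in the earlier cell), generation can be cut off at a blank line or at the `<human>` marker:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Stop generating as soon as the model emits a blank line or \"<human>\"\n",
    "output = together.Complete.create(\n",
    "  prompt = prompt,\n",
    "  model = model,\n",
    "  max_tokens = 64,\n",
    "  stop = [\"\\n\\n\", \"<human>\"],\n",
    ")\n",
    "print(output['output']['choices'][0]['text'])"
   ]
  },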
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Run Mixtral-8x7B - with @togethercompute API 🚀\n",
    "\n",
    "Streaming tokens instead of waiting for the entire response\n",
    "\n",
    "Use the `stream_tokens` parameter to enable streaming responses.\n",
    "\n",
    "When `stream_tokens` is set to true in the request payload, the API returns events as it generates the response, instead of waiting until the entire response is ready."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import json\n",
    "import requests\n",
    "import sseclient  # pip install sseclient-py\n",
    "\n",
    "model_name = \"mistralai/Mixtral-8x7B-v0.1\"\n",
    "\n",
    "def stream_tokens_from_api(prompt, api_key, model=model_name, max_tokens=512):\n",
    "    url = \"https://api.together.xyz/inference\"\n",
    "    headers = {\n",
    "        \"accept\": \"application/json\",\n",
    "        \"content-type\": \"application/json\",\n",
    "        \"Authorization\": f\"Bearer {api_key}\",\n",
    "    }\n",
    "    payload = {\n",
    "        \"model\": model,\n",
    "        \"prompt\": prompt,\n",
    "        \"max_tokens\": max_tokens,\n",
    "        \"temperature\": 0.7,\n",
    "        \"top_k\": 50,\n",
    "        \"top_p\": 0.7,\n",
    "        \"repetition_penalty\": 2,\n",
    "        \"stream_tokens\": True,\n",
    "    }\n",
    "\n",
    "    try:\n",
    "        response = requests.post(url, json=payload, headers=headers, stream=True)\n",
    "        response.raise_for_status()\n",
    "    except requests.RequestException as e:\n",
    "        raise RuntimeError(f\"Request to API failed: {e}\")\n",
    "\n",
    "    try:\n",
    "        client = sseclient.SSEClient(response)\n",
    "        for event in client.events():\n",
    "            if event.data == \"[DONE]\":\n",
    "                break\n",
    "            yield json.loads(event.data)[\"choices\"][0][\"text\"]\n",
    "    except Exception as e:\n",
    "        raise RuntimeError(f\"Error while streaming tokens: {e}\")"
   ]
  },
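  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Instead of printing tokens one by one, the streamed chunks from the generator above can also be joined into a single string (a minimal sketch; replace `YOUR_API_KEY` with your key):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "api_key = \"YOUR_API_KEY\"  # Replace with your API key\n",
    "prompt = \"To install PSU in your desktop machine first you will\"\n",
    "\n",
    "# Consume the generator and collect all streamed chunks\n",
    "full_text = \"\".join(stream_tokens_from_api(prompt, api_key))\n",
    "print(full_text)"
   ]
  },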
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "📌 Usage Example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "api_key = \"YOUR_API_KEY\"\n",
    "prompt = \"To install PSU in your desktop machine first you will\"\n",
    "\n",
    "for token in stream_tokens_from_api(prompt, api_key):\n",
    "    print(token, end=\"\", flush=True)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}