{
"cells": [
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"# Azure OpenAI Whisper (preview) example\n",
"\n",
"This example shows how to use the Azure OpenAI Whisper model to transcribe audio files.\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Setup\n",
"\n",
"First, we install the necessary dependencies and import the libraries we will be using."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install \"openai>=1.0.0,<2.0.0\"\n",
"! pip install python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import openai\n",
"import dotenv\n",
"\n",
"dotenv.load_dotenv()"
]
},
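{
"cell_type": "markdown",
"metadata": {},
"source": [
"`dotenv.load_dotenv()` reads environment variables from a `.env` file in the working directory, if one exists. As a minimal sketch (the values are placeholders), a `.env` file matching the variable names used later in this notebook would look like:\n",
"\n",
"```\n",
"AZURE_OPENAI_ENDPOINT=https://<your-resource-name>.openai.azure.com/\n",
"AZURE_OPENAI_API_KEY=<your-api-key>\n",
"```"
]
},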
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Authentication\n",
"\n",
"The Azure OpenAI service supports multiple authentication mechanisms that include API keys and Azure Active Directory token credentials."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"use_azure_active_directory = False  # Set this flag to True if you are using Azure Active Directory"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Authentication using API key\n",
"\n",
"To set up the OpenAI SDK to use an *Azure API Key*, we need to set `api_key` to a key associated with your endpoint (you can find this key in *\"Keys and Endpoints\"* under *\"Resource Management\"* in the [Azure Portal](https://portal.azure.com)). You'll also find the endpoint for your resource here."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"if not use_azure_active_directory:\n",
"    endpoint = os.environ[\"AZURE_OPENAI_ENDPOINT\"]\n",
"    api_key = os.environ[\"AZURE_OPENAI_API_KEY\"]\n",
"\n",
"    client = openai.AzureOpenAI(\n",
"        azure_endpoint=endpoint,\n",
"        api_key=api_key,\n",
"        api_version=\"2023-09-01-preview\"\n",
"    )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Authentication using Azure Active Directory\n",
"Let's now see how we can authenticate via Azure Active Directory. We'll start by installing the `azure-identity` library. This library provides the token credentials we need to authenticate and helps us build a token credential provider through the `get_bearer_token_provider` helper function. It's recommended to use `get_bearer_token_provider` over providing a static token to `AzureOpenAI` because this API will automatically cache and refresh tokens for you.\n",
"\n",
"For more information on how to set up Azure Active Directory authentication with Azure OpenAI, see the [documentation](https://learn.microsoft.com/azure/ai-services/openai/how-to/managed-identity)."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install \"azure-identity>=1.15.0\""
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from azure.identity import DefaultAzureCredential, get_bearer_token_provider\n",
"\n",
"if use_azure_active_directory:\n",
"    endpoint = os.environ[\"AZURE_OPENAI_ENDPOINT\"]\n",
"\n",
"    client = openai.AzureOpenAI(\n",
"        azure_endpoint=endpoint,\n",
"        azure_ad_token_provider=get_bearer_token_provider(DefaultAzureCredential(), \"https://cognitiveservices.azure.com/.default\"),\n",
"        api_version=\"2023-09-01-preview\"\n",
"    )"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"> Note: the `AzureOpenAI` client infers the following arguments from their corresponding environment variables if they are not provided:\n",
"\n",
"- `api_key` from `AZURE_OPENAI_API_KEY`\n",
"- `azure_ad_token` from `AZURE_OPENAI_AD_TOKEN`\n",
"- `api_version` from `OPENAI_API_VERSION`\n",
"- `azure_endpoint` from `AZURE_OPENAI_ENDPOINT`\n"
]
},
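{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a short illustration of that fallback behavior (not needed for the rest of this notebook), a client constructed with no explicit arguments relies entirely on the environment variables listed above:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: uncomment to construct a client purely from AZURE_OPENAI_API_KEY,\n",
"# OPENAI_API_VERSION, and AZURE_OPENAI_ENDPOINT in the environment.\n",
"# client = openai.AzureOpenAI()"
]
},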
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Deployments\n",
"\n",
"In this section we are going to create a deployment of the `whisper-1` model that we can use to transcribe audio files."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Deployments: Create in the Azure OpenAI Studio\n",
"Let's deploy a model to use with Whisper. Go to https://portal.azure.com, find your Azure OpenAI resource, and then navigate to the Azure OpenAI Studio. Click on the \"Deployments\" tab and then create a deployment for the model you want to use for transcription. The deployment name that you give the model will be used in the code below."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"deployment = \"whisper-deployment\"  # Fill in the deployment name from the portal here"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Audio transcription\n",
"\n",
"Audio transcription, or speech-to-text, is the process of converting spoken words into text. Use the `client.audio.transcriptions.create` method to transcribe an audio file stream to text.\n",
"\n",
"You can get sample audio files from the [Azure AI Speech SDK repository at GitHub](https://github.com/Azure-Samples/cognitive-services-speech-sdk/tree/master/sampledata/audiofiles)."
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# Download a sample audio file and write it to disk\n",
"import requests\n",
"\n",
"sample_audio_url = \"https://github.com/Azure-Samples/cognitive-services-speech-sdk/raw/master/sampledata/audiofiles/wikipediaOcelot.wav\"\n",
"response = requests.get(sample_audio_url)\n",
"response.raise_for_status()\n",
"with open(\"wikipediaOcelot.wav\", \"wb\") as f:\n",
"    f.write(response.content)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"with open(\"wikipediaOcelot.wav\", \"rb\") as audio_file:\n",
"    transcription = client.audio.transcriptions.create(\n",
"        file=audio_file,\n",
"        model=deployment,\n",
"    )\n",
"print(transcription.text)"
]
},
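{
"cell_type": "markdown",
"metadata": {},
"source": [
"The transcriptions endpoint also accepts optional parameters. As a sketch (the values below are illustrative, not required): `language` hints at the spoken language as an ISO-639-1 code, `prompt` can guide the spelling of uncommon words or the output style, `response_format` selects the output shape (e.g. `\"json\"`, `\"text\"`, `\"srt\"`, `\"vtt\"`), and `temperature` controls sampling."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: the same call with optional parameters. With response_format=\"text\"\n",
"# the SDK returns a plain string rather than an object with a .text attribute.\n",
"with open(\"wikipediaOcelot.wav\", \"rb\") as audio_file:\n",
"    transcription_text = client.audio.transcriptions.create(\n",
"        file=audio_file,\n",
"        model=deployment,\n",
"        language=\"en\",\n",
"        prompt=\"An encyclopedia entry about the ocelot.\",\n",
"        response_format=\"text\",\n",
"    )\n",
"print(transcription_text)"
]
}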
],
"metadata": {
"kernelspec": {
"display_name": "venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.0"
},
"orig_nbformat": 4
},
"nbformat": 4,
"nbformat_minor": 2
}