openai-cookbook

Visualizing_embeddings_with_Atlas.ipynb 
{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Visualizing OpenAI Embeddings in Atlas\n",
    "\n",
    "In this example, we will upload food review embeddings to [Atlas](https://atlas.nomic.ai) and visualize them."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## What is Atlas?\n",
    "\n",
    "[Atlas](https://atlas.nomic.ai) is a machine learning tool for visualizing massive embedding datasets in your web browser. You can upload millions of embeddings to Atlas and interact with them in the browser or in a Jupyter notebook."
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1. Log in to Atlas\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": []
    }
   ],
   "source": [
    "!pip install nomic"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np\n",
    "from ast import literal_eval\n",
    "\n",
    "# Load the embeddings\n",
    "datafile_path = \"data/fine_food_reviews_with_embeddings_1k.csv\"\n",
    "df = pd.read_csv(datafile_path)\n",
    "\n",
    "# Parse each stringified embedding and stack them into a 2-D NumPy array\n",
    "embeddings = np.array(df.embedding.apply(literal_eval).to_list())\n",
    "# Keep only the metadata columns and use the unnamed CSV index column as the id\n",
    "df = df.drop('embedding', axis=1)\n",
    "df = df.rename(columns={'Unnamed: 0': 'id'})\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": []
    }
   ],
   "source": [
    "import nomic\n",
    "from nomic import atlas\n",
    "nomic.login('7xDPkYXSYDc1_ErdTPIcoAR9RNd8YDlkS3nVNXcVoIMZ6')  # API token for a demo account\n",
    "\n",
    "# One metadata record (dict) per embedding, in the same row order\n",
    "data = df.to_dict('records')\n",
    "project = atlas.map_embeddings(embeddings=embeddings, data=data,\n",
    "                               id_field='id',\n",
    "                               colorable_fields=['Score'])\n",
    "# Take the first map of the project\n",
    "map = project.maps[0]"
   ]
  },
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2. Interact with your embeddings in Jupyter"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "\n",
       "            <h3>Project: meek-laborer</h3>\n",
       "            <script>\n",
       "            destroy = function() {\n",
       "                document.getElementById(\"iframe463f4614-7689-47e4-b55b-1da0cc679559\").remove()\n",
       "            }\n",
       "        </script>\n",
       "\n",
       "        <h4>Projection ID: 463f4614-7689-47e4-b55b-1da0cc679559</h4>\n",
       "        <div class=\"actions\">\n",
       "            <div id=\"hide\" class=\"action\" onclick=\"destroy()\">Hide embedded project</div>\n",
       "            <div class=\"action\" id=\"out\">\n",
       "                <a href=\"https://atlas.nomic.ai/map/fddc0e07-97c5-477c-827c-96bca44519aa/463f4614-7689-47e4-b55b-1da0cc679559\" target=\"_blank\">Explore on atlas.nomic.ai</a>\n",
       "            </div>\n",
       "        </div>\n",
       "        \n",
       "        <iframe class=\"iframe\" id=\"iframe463f4614-7689-47e4-b55b-1da0cc679559\" allow=\"clipboard-read; clipboard-write\" src=\"https://atlas.nomic.ai/map/fddc0e07-97c5-477c-827c-96bca44519aa/463f4614-7689-47e4-b55b-1da0cc679559\">\n",
       "        </iframe>\n",
       "\n",
       "        <style>\n",
       "            .iframe {\n",
       "                /* vh can be **very** large in vscode ipynb. */\n",
       "                height: min(75vh, 66vw);\n",
       "                width: 100%;\n",
       "            }\n",
       "        </style>\n",
       "        \n",
       "        <style>\n",
       "            .actions {\n",
       "              display: block;\n",
       "            }\n",
       "            .action {\n",
       "              min-height: 18px;\n",
       "              margin: 5px;\n",
       "              transition: all 500ms ease-in-out;\n",
       "            }\n",
       "            .action:hover {\n",
       "              cursor: pointer;\n",
       "            }\n",
       "            #hide:hover::after {\n",
       "                content: \" X\";\n",
       "            }\n",
       "            #out:hover::after {\n",
       "                content: \"\";\n",
       "            }\n",
       "        </style>\n",
       "        \n",
       "            "
      ],
      "text/plain": [
       "meek-laborer: https://atlas.nomic.ai/map/fddc0e07-97c5-477c-827c-96bca44519aa/463f4614-7689-47e4-b55b-1da0cc679559"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "map"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.15"
  },
  "vscode": {
   "interpreter": {
    "hash": "365536dcbde60510dc9073d6b991cd35db2d9bac356a11f5b64279a5e6708b97"
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
