9
"display_name": "Python 3",
15
"cell_type": "markdown",
22
"The txtai API is a web-based service backed by [FastAPI](https://fastapi.tiangolo.com/). All txtai functionality including similarity search, extractive QA and zero-shot labeling is available via the API.\n",
24
"This notebook installs the txtai API and shows an example using each of the supported language bindings for txtai."
28
"cell_type": "markdown",
33
"# Install dependencies\n",
35
"Install `txtai` and all dependencies. Since this notebook uses the API, we need to install the api extras package."
45
"!pip install git+https://github.com/neuml/txtai#egg=txtai[api]"
47
"execution_count": null,
51
"cell_type": "markdown",
58
"The first method we'll try is direct access via Python. We'll use zero-shot labeling for all the examples here. See [this notebook](https://colab.research.google.com/github/neuml/txtai/blob/master/examples/07_Apply_labels_with_zero_shot_classification.ipynb) for more details on zero-shot classification. "
62
"cell_type": "markdown",
67
"## Configure Labels instance"
78
"from IPython.core.display import display, HTML\n",
79
"from txtai.pipeline import Labels\n",
83
" <style type='text/css'>\n",
84
" @import url('https://fonts.googleapis.com/css?family=Oswald&display=swap');\n",
86
" border-collapse: collapse;\n",
90
" border: 1px solid #9e9e9e;\n",
92
" font: 20px Oswald;\n",
97
" html += \"<table><thead><tr><th>Text</th><th>Label</th></tr></thead>\"\n",
98
" for text, label in rows:\n",
99
" html += \"<tr><td>%s</td><td>%s</td></tr>\" % (text, label)\n",
100
" html += \"</table>\"\n",
102
" display(HTML(html))\n",
104
"# Create labels model\n",
107
"execution_count": null,
111
"cell_type": "markdown",
116
"## Apply labels to text"
123
"base_uri": "https://localhost:8080/",
126
"id": "-K2YJJzsVtfq",
127
"outputId": "65782fd8-51fb-4531-8e8b-f28bca678fa0"
130
"data = [\"Wears a red suit and says ho ho\",\n",
131
" \"Pulls a flying sleigh\",\n",
132
" \"This is cut down and decorated\",\n",
133
" \"Santa puts these under the tree\",\n",
134
" \"Best way to spend the holidays\"]\n",
136
"# List of labels\n",
137
"tags = [\"🎅 Santa Clause\", \"🦌 Reindeer\", \"🍪 Cookies\", \"🎄 Christmas Tree\", \"🎁 Gifts\", \"👪 Family\"]\n",
139
"# Render output to table\n",
140
"table([(text, tags[labels(text, tags)[0][0]]) for text in data])"
142
"execution_count": null,
145
"output_type": "display_data",
149
" <style type='text/css'>\n",
150
" @import url('https://fonts.googleapis.com/css?family=Oswald&display=swap');\n",
152
" border-collapse: collapse;\n",
156
" border: 1px solid #9e9e9e;\n",
158
" font: 20px Oswald;\n",
161
" <table><thead><tr><th>Text</th><th>Label</th></tr></thead><tr><td>Wears a red suit and says ho ho</td><td>🎅 Santa Clause</td></tr><tr><td>Pulls a flying sleigh</td><td>🦌 Reindeer</td></tr><tr><td>This is cut down and decorated</td><td>🎄 Christmas Tree</td></tr><tr><td>Santa puts these under the tree</td><td>🎁 Gifts</td></tr><tr><td>Best way to spend the holidays</td><td>👪 Family</td></tr></table>"
164
"<IPython.core.display.HTML object>"
172
"cell_type": "markdown",
177
"Once again we see the power of zero-shot labeling. The model wasn't trained on any data specific to this example. Still amazed with how much knowledge is stored in large NLP models."
181
"cell_type": "markdown",
186
"# Start an API instance\n",
188
"Now we'll start an API instance to run the remaining examples. The API needs a configuration file to run. The example below is simplified to only include labeling. See [this link](https://github.com/neuml/txtai#api) for a more detailed configuration example.\n",
190
"The API instance is started in the background.\n"
197
"base_uri": "https://localhost:8080/"
199
"id": "nTDwXOUeTH2-",
200
"outputId": "2220a3c9-1cff-4c2f-b21e-13dd2d7cb816"
203
"%%writefile index.yml\n",
205
"# Labels settings\n",
208
"execution_count": null,
211
"output_type": "stream",
214
"Writing index.yml\n"
225
"!CONFIG=index.yml nohup uvicorn \"txtai.api:app\" &> api.log &\n",
228
"execution_count": null,
232
"cell_type": "markdown",
239
"txtai.js is available via NPM and can be installed as follows.\n",
242
"npm install txtai\n",
245
"For this example, we'll clone the txtai.js project to import the example build configuration."
255
"!git clone https://github.com/neuml/txtai.js"
257
"execution_count": null,
261
"cell_type": "markdown",
266
"## Create labels.js\n",
268
"The following file is a JavaScript version of the labels example."
275
"base_uri": "https://localhost:8080/"
277
"id": "zJbKRTSJV-kd",
278
"outputId": "6c111b5d-6e55-4dac-c6c2-0988c2a834da"
281
"%%writefile txtai.js/examples/node/src/labels.js\n",
282
"import {Labels} from \"txtai\";\n",
283
"import {sprintf} from \"sprintf-js\";\n",
285
"const run = async () => {\n",
287
" let labels = new Labels(\"http://localhost:8000\");\n",
289
" let data = [\"Wears a red suit and says ho ho\",\n",
290
" \"Pulls a flying sleigh\",\n",
291
" \"This is cut down and decorated\",\n",
292
" \"Santa puts these under the tree\",\n",
293
" \"Best way to spend the holidays\"];\n",
295
" // List of labels\n",
296
" let tags = [\"🎅 Santa Clause\", \"🦌 Reindeer\", \"🍪 Cookies\", \"🎄 Christmas Tree\", \"🎁 Gifts\", \"👪 Family\"];\n",
298
" console.log(sprintf(\"%-40s %s\", \"Text\", \"Label\"));\n",
299
" console.log(\"-\".repeat(75))\n",
301
" for (let text of data) {\n",
302
" let label = await labels.label(text, tags);\n",
303
" label = tags[label[0].id];\n",
305
" console.log(sprintf(\"%-40s %s\", text, label));\n",
309
" console.trace(e);\n",
315
"execution_count": null,
318
"output_type": "stream",
321
"Overwriting txtai.js/examples/node/src/labels.js\n"
327
"cell_type": "markdown",
332
"## Build and run labels example\n",
345
"os.chdir(\"txtai.js/examples/node\")\n",
349
"execution_count": null,
356
"base_uri": "https://localhost:8080/"
358
"id": "ckOHNqyaeL-B",
359
"outputId": "6d8e745c-52d1-4456-fc46-2ff8fda2e675"
362
"!node dist/labels.js"
364
"execution_count": null,
367
"output_type": "stream",
371
"---------------------------------------------------------------------------\n",
372
"Wears a red suit and says ho ho 🎅 Santa Clause\n",
373
"Pulls a flying sleigh 🦌 Reindeer\n",
374
"This is cut down and decorated 🎄 Christmas Tree\n",
375
"Santa puts these under the tree 🎁 Gifts\n",
376
"Best way to spend the holidays 👪 Family\n"
382
"cell_type": "markdown",
387
"The JavaScript program is showing the same results as when natively running through Python!"
391
"cell_type": "markdown",
398
"txtai.java integrates with standard Java build tools (Gradle, Maven, SBT). The following shows how to add txtai as a dependency to Gradle.\n",
401
"implementation 'com.github.neuml:txtai.java:v4.0.0'\n",
404
"For this example, we'll clone the txtai.java project to import the example build configuration."
414
"os.chdir(\"/content\")\n",
415
"!git clone https://github.com/neuml/txtai.java"
417
"execution_count": null,
421
"cell_type": "markdown",
426
"## Create LabelsDemo.java\n",
428
"The following file is a Java version of the labels example."
435
"base_uri": "https://localhost:8080/"
437
"id": "v73L8Gw0p6fh",
438
"outputId": "a7f797f2-a91f-4033-89c7-4baf76204d93"
441
"%%writefile txtai.java/examples/src/main/java/LabelsDemo.java\n",
442
"import java.util.Arrays;\n",
443
"import java.util.ArrayList;\n",
444
"import java.util.List;\n",
446
"import txtai.API.IndexResult;\n",
447
"import txtai.Labels;\n",
449
"public class LabelsDemo {\n",
450
" public static void main(String[] args) {\n",
452
" Labels labels = new Labels(\"http://localhost:8000\");\n",
454
" List <String> data = \n",
455
" Arrays.asList(\"Wears a red suit and says ho ho\",\n",
456
" \"Pulls a flying sleigh\",\n",
457
" \"This is cut down and decorated\",\n",
458
" \"Santa puts these under the tree\",\n",
459
" \"Best way to spend the holidays\");\n",
461
" // List of labels\n",
462
" List<String> tags = Arrays.asList(\"🎅 Santa Clause\", \"🦌 Reindeer\", \"🍪 Cookies\", \"🎄 Christmas Tree\", \"🎁 Gifts\", \"👪 Family\");\n",
464
" System.out.printf(\"%-40s %s%n\", \"Text\", \"Label\");\n",
465
" System.out.println(new String(new char[75]).replace(\"\\0\", \"-\"));\n",
467
" for (String text: data) {\n",
468
" List<IndexResult> label = labels.label(text, tags);\n",
469
" System.out.printf(\"%-40s %s%n\", text, tags.get(label.get(0).id));\n",
472
" catch (Exception ex) {\n",
473
" ex.printStackTrace();\n",
478
"execution_count": null,
481
"output_type": "stream",
484
"Overwriting txtai.java/examples/src/main/java/LabelsDemo.java\n"
490
"cell_type": "markdown",
495
"## Build and run labels example"
502
"base_uri": "https://localhost:8080/"
504
"id": "N2Mm3Gl5sH1z",
505
"outputId": "b5249daf-e5a1-4b71-b64c-2b3c6748e846"
508
"os.chdir(\"txtai.java/examples\")\n",
509
"!../gradlew -q --console=plain labels 2> /dev/null"
511
"execution_count": null,
514
"output_type": "stream",
518
"---------------------------------------------------------------------------\n",
519
"Wears a red suit and says ho ho 🎅 Santa Clause\n",
520
"Pulls a flying sleigh 🦌 Reindeer\n",
521
"This is cut down and decorated 🎄 Christmas Tree\n",
522
"Santa puts these under the tree 🎁 Gifts\n",
523
"Best way to spend the holidays 👪 Family\n",
530
"cell_type": "markdown",
535
"The Java program is showing the same results as when natively running through Python!"
539
"cell_type": "markdown",
546
"txtai.rs is available via crates.io and can be installed by adding the following to your cargo.toml file.\n",
550
"txtai = { version = \"4.0\" }\n",
551
"tokio = { version = \"0.2\", features = [\"full\"] }\n",
554
"For this example, we'll clone the txtai.rs project to import the example build configuration. First we need to install Rust."
564
"os.chdir(\"/content\")\n",
565
"!apt-get install rustc\n",
566
"!git clone https://github.com/neuml/txtai.rs"
568
"execution_count": null,
572
"cell_type": "markdown",
577
"## Create labels.rs\n",
579
"The following file is a Rust version of the labels example."
586
"base_uri": "https://localhost:8080/"
588
"id": "jjggKnKQ7jQO",
589
"outputId": "76a2b1d9-2889-47b0-a3af-5d71a763bb0b"
592
"%%writefile txtai.rs/examples/demo/src/labels.rs\n",
593
"use std::error::Error;\n",
595
"use txtai::labels::Labels;\n",
597
"pub async fn labels() -> Result<(), Box<dyn Error>> {\n",
598
" let labels = Labels::new(\"http://localhost:8000\");\n",
600
" let data = [\"Wears a red suit and says ho ho\",\n",
601
" \"Pulls a flying sleigh\",\n",
602
" \"This is cut down and decorated\",\n",
603
" \"Santa puts these under the tree\",\n",
604
" \"Best way to spend the holidays\"];\n",
606
" println!(\"{:<40} {}\", \"Text\", \"Label\");\n",
607
" println!(\"{}\", \"-\".repeat(75));\n",
609
" for text in data.iter() {\n",
610
" let tags = vec![\"🎅 Santa Clause\", \"🦌 Reindeer\", \"🍪 Cookies\", \"🎄 Christmas Tree\", \"🎁 Gifts\", \"👪 Family\"];\n",
611
" let label = labels.label(text, &tags).await?[0].id;\n",
613
" println!(\"{:<40} {}\", text, tags[label]);\n",
619
"execution_count": null,
622
"output_type": "stream",
625
"Overwriting txtai.rs/examples/demo/src/labels.rs\n"
631
"cell_type": "markdown",
636
"## Build and run labels example\n",
649
"os.chdir(\"txtai.rs/examples/demo\")\n",
652
"execution_count": null,
659
"base_uri": "https://localhost:8080/"
661
"id": "-_v_FbL0-yPk",
662
"outputId": "821333f5-5f90-4f89-c2eb-673c2e14e4fe"
667
"execution_count": null,
670
"output_type": "stream",
673
"\u001b[0m\u001b[0m\u001b[1m\u001b[32m Finished\u001b[0m dev [unoptimized + debuginfo] target(s) in 0.07s\n",
674
"\u001b[0m\u001b[0m\u001b[1m\u001b[32m Running\u001b[0m `target/debug/demo labels`\n",
676
"---------------------------------------------------------------------------\n",
677
"Wears a red suit and says ho ho 🎅 Santa Clause\n",
678
"Pulls a flying sleigh 🦌 Reindeer\n",
679
"This is cut down and decorated 🎄 Christmas Tree\n",
680
"Santa puts these under the tree 🎁 Gifts\n",
681
"Best way to spend the holidays 👪 Family\n"
687
"cell_type": "markdown",
692
"The Rust program is showing the same results as when natively running through Python!"
696
"cell_type": "markdown",
703
"txtai.go can be installed by adding the following import statement. When using modules, txtai.go will automatically be installed. Otherwise use `go get`.\n",
706
"import \"github.com/neuml/txtai.go\"\n",
709
"For this example, we'll create a standalone process for labeling. First we need to install Go."
719
"os.chdir(\"/content\")\n",
720
"!apt install golang-go\n",
721
"!go get \"github.com/neuml/txtai.go\""
723
"execution_count": null,
727
"cell_type": "markdown",
732
"## Create labels.go\n",
734
"The following file is a Go version of the labels example."
740
"id": "bLBJwkN4ANpi",
742
"base_uri": "https://localhost:8080/"
744
"outputId": "883ea7b2-2fbc-471c-e0bb-59ef5172a6a4"
747
"%%writefile labels.go\n",
753
"\t\"github.com/neuml/txtai.go\"\n",
757
"\tlabels := txtai.Labels(\"http://localhost:8000\")\n",
759
"\tdata := []string{\"Wears a red suit and says ho ho\",\n",
760
" \"Pulls a flying sleigh\",\n",
761
" \"This is cut down and decorated\",\n",
762
" \"Santa puts these under the tree\",\n",
763
" \"Best way to spend the holidays\"}\n",
765
"\t// List of labels\n",
766
"\ttags := []string{\"🎅 Santa Clause\", \"🦌 Reindeer\", \"🍪 Cookies\", \"🎄 Christmas Tree\", \"🎁 Gifts\", \"👪 Family\"}\n",
768
"\tfmt.Printf(\"%-40s %s\\n\", \"Text\", \"Label\")\n",
769
"\tfmt.Println(strings.Repeat(\"-\", 75))\n",
771
"\tfor _, text := range data {\n",
772
"\t\tlabel := labels.Label(text, tags)\n",
773
"\t\tfmt.Printf(\"%-40s %s\\n\", text, tags[label[0].Id])\n",
777
"execution_count": null,
780
"output_type": "stream",
783
"Writing labels.go\n"
789
"cell_type": "markdown",
794
"## Build and run labels example\n"
800
"id": "l1xnUbtdAy0p",
802
"base_uri": "https://localhost:8080/"
804
"outputId": "5bc6015c-5c9c-4d8a-daf7-6897ec6cbd80"
809
"execution_count": null,
812
"output_type": "stream",
816
"---------------------------------------------------------------------------\n",
817
"Wears a red suit and says ho ho 🎅 Santa Clause\n",
818
"Pulls a flying sleigh 🦌 Reindeer\n",
819
"This is cut down and decorated 🎄 Christmas Tree\n",
820
"Santa puts these under the tree 🎁 Gifts\n",
821
"Best way to spend the holidays 👪 Family\n"
827
"cell_type": "markdown",
832
"The Go program is showing the same results as when natively running through Python!"