tokenizers
42 строки · 1.5 Кб
1🤗 Tokenizers is tested on Python 3.5+.
2
3You should install 🤗 Tokenizers in a
4`virtual environment <https://docs.python.org/3/library/venv.html>`_. If you're unfamiliar with
5Python virtual environments, check out the
6`user guide <https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/>`__.
7Create a virtual environment with the version of Python you're going to use and activate it.
8
9Installation with pip
10----------------------------------------------------------------------------------------------------
11
12🤗 Tokenizers can be installed using pip as follows::
13
14pip install tokenizers
15
16
17Installation from sources
18----------------------------------------------------------------------------------------------------
19
20To use this method, you need to have the Rust language installed. You can follow
21`the official guide <https://www.rust-lang.org/learn/get-started>`__ for more information.
22
23If you are using a unix based OS, the installation should be as simple as running::
24
25curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
26
27Or you can easiy update it with the following command::
28
29rustup update
30
31Once rust is installed, we can start retrieving the sources for 🤗 Tokenizers::
32
33git clone https://github.com/huggingface/tokenizers
34
35Then we go into the python bindings folder::
36
37cd tokenizers/bindings/python
38
39At this point you should have your `virtual environment`_ already activated. In order to
40compile 🤗 Tokenizers, you need to::
41
42pip install -e .
43