Project P annotations
Create dataset locally
Assume the data are in a directory with the following layout:
.
├── 2020-05-24
├── 2021-04-13
├── 2022-05-04
├── 2023-04-09
├── 2023-06-30
├── 2023-07-26
├── 2023-07-26-mini-3-pro
├── 2023-07-27-panorama
├── 2023-07-27-phantom-4-pro
Deploy CVAT in a cloud
Any cloud provider is ok, but Project P starts with VK Cloud.
Create a virtual machine
Go to
->
->
, perform step-by-step VM creation, save SSH key (
), then log into the machine:
ssh -i /path/to/key ubuntu@new-ip-address
Install Docker:
sudo apt install docker-ce
Set up DNS
It is preferable to set up a DNS record for the machine, in order to connect by domain name and use HTTPS (otherwise one needs to tunnel via SSH).
Set up CVAT
First, get it:
git clone https://github.com/opencv/cvat.git
cd cvat
TAG=$(git tag -l | sort -V | tail -n 1)
git checkout "$TAG"
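Choosing the newest tag depends on how the tags are ordered: a plain lexicographic sort places v2.9.0 after v2.10.0. Below is a minimal sketch of a version-aware pick, assuming GNU sort (for its -V flag); the helper name latest_tag is hypothetical.

```shell
# Hypothetical helper: read tag names on stdin, print the highest version.
# Assumes GNU sort, whose -V flag compares version numbers numerically.
latest_tag() {
  sort -V | tail -n 1
}

# Usage inside the cloned cvat directory:
#   TAG=$(git tag -l | latest_tag)
#   git checkout "$TAG"
```

If the repository also carries pre-release tags, it may be safer to inspect the tag list by hand before checking out.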
Then run:
CVAT_HOST=domain-name docker compose -f docker-compose.yml -f components/serverless/docker-compose.serverless.yml up -d
Configure HTTPS later
Generate CVAT manifest
In order to attach data from cloud storage, CVAT requires so-called manifest files, which can be generated (from the dataset directory) with:
docker run -ti --rm -u "$(id -u)":"django" -v "$PWD":"/local" --entrypoint python3 cvat/server:"$TAG" utils/dataset_manifest/create.py --output-dir /local /local
Here TAG is the tag checked out in the cvat directory on the server where CVAT is being deployed (or simply the version of the CVAT Docker image that is being deployed).
Upload dataset to the cloud
Project P uses S3-like object storage, so some steps for GCP or Azure may differ.
Create a bucket
Go to
->
->
, provide the new bucket name, select storage class (
- cold data storage, because the data are not going to be overwritten or accessed frequently), select default ACL (
- recommended; permissions may be changed manually later).
Create access credentials
For S3-like object storage one needs only two credentials: an access key ID and a secret access key.
Go to the newly created bucket ->
->
, provide a name (no need to limit access for now), then copy and save somewhere
and
.
Upload dataset to the bucket
This can be done via:
- aws command-line utility (CLI)
- SDK like boto3 Python package
- File managers that support S3
- Web interface (yeah, here we go)
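For the CLI route, here is a minimal sketch using aws s3 sync against an S3-compatible endpoint. The function name upload_dataset, the AWS_CMD override and the bucket name in the usage line are hypothetical. The endpoint is the VK Cloud one given in this guide, and credentials are assumed to come from the standard AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.

```shell
# Hypothetical helper: sync a local dataset directory into a bucket.
# AWS_CMD can be overridden (e.g. with "echo") to preview the command.
upload_dataset() {
  src="$1"; bucket="$2"; endpoint="$3"
  "${AWS_CMD:-aws}" s3 sync "$src" "s3://${bucket}/" --endpoint-url "$endpoint"
}

# Usage (bucket name is a placeholder; export the two access keys first):
#   upload_dataset . my-bucket https://hb.ru-msk.vkcs.cloud
```

When run from the dataset directory, this also uploads a manifest file generated there, so it ends up in the bucket root.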
Connect the cloud storage in CVAT
Following the official CVAT guide, go to
->
(add), provide a name for the connection, provider (
in this case), bucket name, authorization type (
and fill in the saved access keys), endpoint URL (e.g. https://hb.ru-msk.vkcs.cloud, or blank for official AWS), region (
, may be added), and manifest (name of the manifest file created earlier
- just put it into the bucket root).
Create a GitHub repository for annotations
Just like this repository.
Create a new task in CVAT
In the CVAT UI go to
->
->
(or
-> select a project ->
->
) - task creation form.
First of all, provide the new task name, project name or set of labels (if the task is not within a project), subset (optionally), select files...
Add files from the cloud storage
In
tab ->
->
(name of the connected cloud storage) ->
(no option if it is the only one upon creation) ->
(unfold and select files).
Attach Git for annotations
In
section set
to the SSH form of the Git URL, plus the annotations file path with a parent prefix, e.g.:
git@github.com:Shining-Future/project-p-annotations.git [task-name/annotations.xml]
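A minimal sketch of composing this value from an HTTPS repository URL and a task name. The helper names to_ssh_url and cvat_repo_field are hypothetical, and the annotations.xml file name simply follows the example above.

```shell
# Hypothetical helper: turn https://host/org/repo[.git] into git@host:org/repo.git
to_ssh_url() {
  printf '%s\n' "$1" |
    sed -E 's#^https://([^/]+)/(.+)$#git@\1:\2#; s#/$##; s#(\.git)?$#.git#'
}

# Hypothetical helper: SSH URL followed by the annotations path in square brackets.
cvat_repo_field() {
  printf '%s [%s/annotations.xml]\n' "$(to_ssh_url "$1")" "$2"
}

cvat_repo_field https://github.com/Shining-Future/project-p-annotations task-name
```

A URL that is already in SSH form passes through unchanged, apart from gaining a .git suffix if it lacks one.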
Annotations may be saved as either *.zip or *.xml files.
For
select
.
Field
may be useful to set how many images may be in a job.
Add SSH key for annotations repository
Finally hitting
or
will fail and show a pop-up window with an SSH public key, which must be added account-wide, e.g. in GitHub. After the public key is added to the GitHub account, task submission will proceed (click one of the submission buttons again).
At this point the newly created task should be up, with a cloud data source and Git-versioned annotations.
Set up HTTPS for CVAT
CVAT comes with optional HTTPS support (docker-compose.https.yml), which requires Let's Encrypt to be set up.
Let's Encrypt
Just follow the official CVAT installation guide.
Self-signed HTTPS
This is the most interesting option. Following the Ubuntu security reference and some experience with Traefik and Docker Compose one needs the following steps:
- Generate a CSR (certificate signing request) with server.key and server.csr as output files
- Generate a self-signed certificate with server.crt as output (X.509 format public key)
- Install server.key and server.crt to /path/to/private and /path/to/public respectively
- Create a drop-in Traefik config, say, self-signed.yml with contents:
tls:
  certificates:
    - certFile: /path/to/public/server.crt
      keyFile: /path/to/private/server.key
- Create a Docker Compose file, say, docker-compose.self-signed.yml for CVAT:
services:
  cvat_server:
    labels:
      - traefik.http.routers.cvat.entrypoints=websecure
      - traefik.http.routers.cvat.tls=true

  cvat_ui:
    labels:
      - traefik.http.routers.cvat-ui.entrypoints=websecure
      - traefik.http.routers.cvat-ui.tls=true

  traefik:
    image: traefik:v2.4
    container_name: traefik
    command:
      - '--providers.docker.exposedByDefault=false'
      - '--providers.docker.network=cvat'
      - '--entryPoints.web.address=:80'
      - '--entryPoints.web.http.redirections.entryPoint.to=websecure'
      - '--entryPoints.web.http.redirections.entryPoint.scheme=https'
      - '--entryPoints.websecure.address=:443'
      - '--providers.file.directory=/etc/traefik/rules'
      # Uncomment to get Traefik dashboard
      # - "--entryPoints.dashboard.address=:8090"
      # - "--api.dashboard=true"
    ports:
      - 80:80
      - 443:443
    volumes:
      - /path/to/public/server.crt:/path/to/public/server.crt:ro
      - /path/to/private/server.key:/path/to/private/server.key:ro
      - /path/to/self-signed.yml:/etc/traefik/rules/self-signed.yml:ro
- Start CVAT with:
docker compose -f docker-compose.yml -f docker-compose.self-signed.yml up -d
After these steps CVAT should be reachable on port 80, which redirects to port 443 and serves the self-signed certificate.
Yeah, that's it. Browsers will complain about an invalid certificate and an insecure connection (but we know it's ok).
Nevertheless, the protocol will be HTTPS, and the browser's certificate viewer shows the details of the created self-signed certificate.
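The first two steps of the list above (key, CSR, self-signed certificate) can be sketched with openssl as follows. The function name make_self_signed, the 2048-bit key size, the 365-day default validity and the cvat.example.com host name are assumptions for illustration. The generated files still have to be installed to /path/to/private and /path/to/public as described.

```shell
# Hypothetical helper: create server.key, server.csr and server.crt in a directory.
make_self_signed() {
  dir="$1"; host="$2"; days="${3:-365}"
  # Private key (unencrypted, so Traefik can read it without a passphrase).
  openssl genrsa -out "$dir/server.key" 2048
  # Certificate signing request with the host name as the common name.
  openssl req -new -key "$dir/server.key" -subj "/CN=$host" -out "$dir/server.csr"
  # Sign the CSR with the same key to get a self-signed X.509 certificate.
  openssl x509 -req -days "$days" -in "$dir/server.csr" \
    -signkey "$dir/server.key" -out "$dir/server.crt"
}

# Usage (host name is a placeholder):
#   make_self_signed . cvat.example.com
```

The result can be inspected with openssl x509 -in server.crt -noout -text before mounting it into the Traefik container.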