wandb
Use W&B to build better models faster. Track and visualize all the pieces of your machine learning pipeline, from datasets to production machine learning models. Get started with W&B today: sign up for a free account!
🎓 W&B is free for students, educators, and academic researchers. For more information, visit https://wandb.ai/site/research.
Want to use Weights & Biases for seamless collaboration between your ML or Data Science team? Looking for production-grade MLOps at scale? Sign up for one of our plans or contact the Sales Team.
Documentation
See the W&B Developer Guide and API Reference Guide for a full technical description of the W&B platform.
Quickstart
Get started with W&B in four steps:
- First, sign up for a free W&B account.
- Second, install the W&B SDK with pip. Navigate to your terminal and type the following command:
pip install wandb
- Third, log into W&B:
wandb.login()
- Fourth, use the example code snippet below as a template to integrate W&B into your Python script:
import wandb

# Start a W&B Run with wandb.init
run = wandb.init(project="my_first_project")

# Save model inputs and hyperparameters in a wandb.config object
config = run.config
config.learning_rate = 0.01

# Model training code here ...

# Log metrics over time to visualize performance with wandb.log
for i in range(10):
    run.log({"loss": loss})

That's it! Navigate to the W&B App to view a dashboard of your first W&B Experiment. Use the W&B App to compare multiple experiments in a unified place, dive into the results of a single run, and much more!
Example W&B Dashboard that shows Runs from an Experiment.
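The quickstart template leaves `loss` undefined until real training code is filled in. As a minimal sketch, assuming a made-up exponentially decaying loss and a stand-in run object, the logging loop can be tried out without an account. The names `synthetic_loss`, `log_losses`, and `RecordingRun` are hypothetical and not part of the wandb API; the only thing the loop relies on is a `log(dict)` method, which a run returned by `wandb.init` also provides.

```python
import math


def synthetic_loss(step, learning_rate):
    """Hypothetical stand-in for a training loss: decays exponentially with step."""
    return math.exp(-learning_rate * step)


def log_losses(run, learning_rate, steps=10):
    """Log one loss value per step through any object that exposes .log(dict)."""
    losses = []
    for step in range(steps):
        loss = synthetic_loss(step, learning_rate)
        run.log({"loss": loss})
        losses.append(loss)
    return losses


class RecordingRun:
    """Stand-in for a W&B run (assumption): stores logged metrics in a list."""

    def __init__(self):
        self.history = []

    def log(self, metrics):
        self.history.append(metrics)


# To send the values to W&B instead, replace RecordingRun() with
# wandb.init(project="my_first_project").
demo_run = RecordingRun()
log_losses(demo_run, learning_rate=0.01)
print(f"logged {len(demo_run.history)} loss values")
```

Keeping the loop in a function that accepts the run as an argument makes it straightforward to swap the stand-in for a real run later.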
Integrations
Use your favorite framework with W&B. W&B integrations make it fast and easy to set up experiment tracking and data versioning inside existing projects. For more information on how to integrate W&B with the framework of your choice, see the Integrations chapter in the W&B Developer Guide.
🔥 PyTorch
Call .watch and pass in your PyTorch model to automatically log gradients and store the network topology. Next, use .log to track other metrics. The following example shows how to do this:
import wandb

# 1. Start a new run
run = wandb.init(project="gpt4")

# 2. Save model inputs and hyperparameters
config = run.config
config.dropout = 0.01

# 3. Log gradients and model parameters
run.watch(model)
for batch_idx, (data, target) in enumerate(train_loader):
    ...
    if batch_idx % args.log_interval == 0:
        # 4. Log metrics to visualize performance
        run.log({"loss": loss})

- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate PyTorch with W&B.
- Explore W&B Reports.
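The PyTorch snippet logs only on batches where `batch_idx % args.log_interval == 0`. The sketch below isolates that interval check in plain Python so it can be read without a model or data loader. `metrics_to_log` is a hypothetical helper name, and the loss values are invented for illustration.

```python
def metrics_to_log(losses, log_interval):
    """Yield (batch_idx, metrics) only for the batches that should be logged."""
    for batch_idx, loss in enumerate(losses):
        if batch_idx % log_interval == 0:
            yield batch_idx, {"loss": loss}


# Invented per-batch losses; in real training these come from the model.
example_losses = [0.9, 0.8, 0.7, 0.6, 0.5]
for step, metrics in metrics_to_log(example_losses, log_interval=2):
    # With an active run this line would be: run.log(metrics, step=step)
    print(step, metrics)
```

A larger `log_interval` means fewer logging calls per epoch and coarser loss curves.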
🌊 TensorFlow/Keras
Use W&B Callbacks to automatically save metrics to W&B when you call `model.fit` during training. The following code example demonstrates what your script might look like when you integrate W&B with Keras:

# This script needs these libraries to be installed:
# tensorflow, numpy

import wandb
from wandb.keras import WandbMetricsLogger, WandbModelCheckpoint

import random
import numpy as np
import tensorflow as tf

# Start a run, tracking hyperparameters
run = wandb.init(
    # set the wandb project where this run will be logged
    project="my-awesome-project",
    # track hyperparameters and run metadata with wandb.config
    config={
        "layer_1": 512,
        "activation_1": "relu",
        "dropout": random.uniform(0.01, 0.80),
        "layer_2": 10,
        "activation_2": "softmax",
        "optimizer": "sgd",
        "loss": "sparse_categorical_crossentropy",
        "metric": "accuracy",
        "epoch": 8,
        "batch_size": 256,
    },
)

# [optional] use wandb.config as your config
config = run.config

# get the data
mnist = tf.keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train, x_test = x_train / 255.0, x_test / 255.0
x_train, y_train = x_train[::5], y_train[::5]
x_test, y_test = x_test[::20], y_test[::20]
labels = [str(digit) for digit in range(np.max(y_train) + 1)]

# build a model
model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(config.layer_1, activation=config.activation_1),
        tf.keras.layers.Dropout(config.dropout),
        tf.keras.layers.Dense(config.layer_2, activation=config.activation_2),
    ]
)

# compile the model
model.compile(optimizer=config.optimizer, loss=config.loss, metrics=[config.metric])

# WandbMetricsLogger will log train and validation metrics to wandb
# WandbModelCheckpoint will upload model checkpoints to wandb
history = model.fit(
    x=x_train,
    y=y_train,
    epochs=config.epoch,
    batch_size=config.batch_size,
    validation_data=(x_test, y_test),
    callbacks=[
        WandbMetricsLogger(log_freq=5),
        WandbModelCheckpoint("models"),
    ],
)

# [optional] finish the wandb run, necessary in notebooks
run.finish()

Get started integrating your Keras model with W&B today:
- Run an example Google Colab Notebook
- Read the Developer Guide for technical details on how to integrate Keras with W&B.
- Explore W&B Reports.
🤗 Hugging Face Transformers
Pass wandb to the report_to argument when you run a script using a Hugging Face Trainer. W&B will automatically log losses, evaluation metrics, model topology, and gradients.
Note: The environment you run your script in must have wandb installed.
The following example demonstrates how to integrate W&B with Hugging Face:

# This script needs these libraries to be installed:
# numpy, transformers, datasets

import wandb

import os
import numpy as np
from datasets import load_dataset
from transformers import TrainingArguments, Trainer
from transformers import AutoTokenizer, AutoModelForSequenceClassification


def tokenize_function(examples):
    return tokenizer(examples["text"], padding="max_length", truncation=True)


def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)
    return {"accuracy": np.mean(predictions == labels)}


# download and prepare the data
dataset = load_dataset("yelp_review_full")
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

small_train_dataset = dataset["train"].shuffle(seed=42).select(range(1000))
small_eval_dataset = dataset["test"].shuffle(seed=42).select(range(300))

# tokenize each split separately so evaluation uses held-out test examples
small_train_dataset = small_train_dataset.map(tokenize_function, batched=True)
small_eval_dataset = small_eval_dataset.map(tokenize_function, batched=True)

# download the model
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=5
)

# set the wandb project where this run will be logged
os.environ["WANDB_PROJECT"] = "my-awesome-project"

# save your trained model checkpoint to wandb
os.environ["WANDB_LOG_MODEL"] = "true"

# turn off watch to log faster
os.environ["WANDB_WATCH"] = "false"

# pass "wandb" to the `report_to` parameter to turn on wandb logging
training_args = TrainingArguments(
    output_dir="models",
    report_to="wandb",
    logging_steps=5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    evaluation_strategy="steps",
    eval_steps=20,
    max_steps=100,
    save_steps=100,
)

# define the trainer and start training
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=small_train_dataset,
    eval_dataset=small_eval_dataset,
    compute_metrics=compute_metrics,
)
trainer.train()

# [optional] finish the wandb run, necessary in notebooks
wandb.finish()

- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate Hugging Face with W&B.
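The `compute_metrics` function in the Hugging Face example takes the argmax of each row of logits and reports the fraction of predictions that match the labels. A dependency-free sketch of the same calculation may make that step easier to follow. `accuracy_from_logits` is a hypothetical name, and the logits and labels are invented for illustration.

```python
def accuracy_from_logits(logits, labels):
    """Pick the highest-scoring class per row and return the share that is correct."""
    predictions = [
        max(range(len(row)), key=row.__getitem__)  # index of the largest logit
        for row in logits
    ]
    correct = sum(pred == label for pred, label in zip(predictions, labels))
    return {"accuracy": correct / len(labels)}


# Invented two-class logits for three examples.
example_logits = [[0.1, 0.9], [2.0, -1.0], [0.3, 0.4]]
example_labels = [1, 0, 0]
print(accuracy_from_logits(example_logits, example_labels))
```

The returned dictionary has the same shape as the one `compute_metrics` returns, with a single `accuracy` key, which is the value the Trainer reports at each evaluation step.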
⚡️ PyTorch Lightning
Build scalable, structured, high-performance PyTorch models with Lightning and log them with W&B.

# This script needs these libraries to be installed:
# torch, torchvision, pytorch_lightning

import wandb

import os
from torch import optim, nn, utils
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor

import pytorch_lightning as pl
from pytorch_lightning.loggers import WandbLogger


class LitAutoEncoder(pl.LightningModule):
    def __init__(self, lr=1e-3, inp_size=28, optimizer="Adam"):
        super().__init__()

        self.encoder = nn.Sequential(
            nn.Linear(inp_size * inp_size, 64), nn.ReLU(), nn.Linear(64, 3)
        )
        self.decoder = nn.Sequential(
            nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, inp_size * inp_size)
        )
        self.lr = lr

        # save hyperparameters to self.hparams, auto-logged by wandb
        self.save_hyperparameters()

    def training_step(self, batch, batch_idx):
        x, y = batch
        x = x.view(x.size(0), -1)
        z = self.encoder(x)
        x_hat = self.decoder(z)
        loss = nn.functional.mse_loss(x_hat, x)

        # log metrics to wandb
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        optimizer = optim.Adam(self.parameters(), lr=self.lr)
        return optimizer


# init the autoencoder
autoencoder = LitAutoEncoder(lr=1e-3, inp_size=28)

# setup data; pass batch_size to the loader so it matches the logged config
batch_size = 32
dataset = MNIST(os.getcwd(), download=True, transform=ToTensor())
train_loader = utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True)

# initialise the wandb logger and name your wandb project
wandb_logger = WandbLogger(project="my-awesome-project")

# add your batch size to the wandb config
wandb_logger.experiment.config["batch_size"] = batch_size

# pass wandb_logger to the Trainer
trainer = pl.Trainer(limit_train_batches=750, max_epochs=5, logger=wandb_logger)

# train the model
trainer.fit(model=autoencoder, train_dataloaders=train_loader)

# [optional] finish the wandb run, necessary in notebooks
wandb.finish()

- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate PyTorch Lightning with W&B.
💨 XGBoost
Use the W&B callback to automatically save metrics to W&B when you train a booster with `xgb.train`. The following code example demonstrates what your script might look like when you integrate W&B with XGBoost:

# This script needs these libraries to be installed:
# numpy, xgboost

import wandb
from wandb.xgboost import WandbCallback

import numpy as np
import xgboost as xgb

# setup parameters for xgboost
param = {
    "objective": "multi:softmax",
    "eta": 0.1,
    "max_depth": 6,
    "nthread": 4,
    "num_class": 6,
}

# start a new wandb run to track this script
run = wandb.init(
    # set the wandb project where this run will be logged
    project="my-awesome-project",
    # track hyperparameters and run metadata
    config=param,
)

# download data from wandb Artifacts and prep data
run.use_artifact("wandb/intro/dermatology_data:v0", type="dataset").download(".")
data = np.loadtxt(
    "./dermatology.data",
    delimiter=",",
    converters={33: lambda x: int(x == "?"), 34: lambda x: int(x) - 1},
)
sz = data.shape

train = data[: int(sz[0] * 0.7), :]
test = data[int(sz[0] * 0.7) :, :]

train_X = train[:, :33]
train_Y = train[:, 34]

test_X = test[:, :33]
test_Y = test[:, 34]

xg_train = xgb.DMatrix(train_X, label=train_Y)
xg_test = xgb.DMatrix(test_X, label=test_Y)
watchlist = [(xg_train, "train"), (xg_test, "test")]

# add another config to the wandb run
num_round = 5
run.config["num_round"] = 5
run.config["data_shape"] = sz

# pass WandbCallback to the booster to log its configs and metrics
bst = xgb.train(
    param, xg_train, num_round, evals=watchlist, callbacks=[WandbCallback()]
)

# get prediction
pred = bst.predict(xg_test)
error_rate = np.sum(pred != test_Y) / test_Y.shape[0]

# log your test metric to wandb
run.summary["Error Rate"] = error_rate

# [optional] finish the wandb run, necessary in notebooks
run.finish()

- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate XGBoost with W&B.
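The XGBoost example splits the rows 70/30 by position and reports the share of wrong predictions as "Error Rate". The sketch below reproduces those two steps with plain Python lists, assuming invented data. `split_rows` and `error_rate` are hypothetical helper names, not part of xgboost or wandb.

```python
def split_rows(rows, train_fraction=0.7):
    """Split rows by position: the first train_fraction for training, the rest for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def error_rate(predictions, labels):
    """Return the fraction of predictions that differ from the true labels."""
    wrong = sum(pred != label for pred, label in zip(predictions, labels))
    return wrong / len(labels)


# Invented rows and labels for illustration.
train_rows, test_rows = split_rows(list(range(10)))
print(len(train_rows), len(test_rows))
print(error_rate([0, 1, 2, 1], [0, 1, 1, 1]))
```

Because the split is positional rather than random, the result depends on the order of rows in the file; shuffling first is a common alternative when the data is sorted by class.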
🧮 Scikit-learn
Use wandb to visualize and compare your scikit-learn models' performance:

# This script needs these libraries to be installed:
# numpy, sklearn

import wandb
from wandb.sklearn import plot_precision_recall, plot_feature_importances
from wandb.sklearn import plot_class_proportions, plot_learning_curve, plot_roc

import numpy as np
from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# load and process data
wbcd = datasets.load_breast_cancer()
feature_names = wbcd.feature_names
labels = wbcd.target_names

test_size = 0.2
X_train, X_test, y_train, y_test = train_test_split(
    wbcd.data, wbcd.target, test_size=test_size
)

# train model
model = RandomForestClassifier()
model.fit(X_train, y_train)
model_params = model.get_params()

# get predictions
y_pred = model.predict(X_test)
y_probas = model.predict_proba(X_test)
importances = model.feature_importances_
indices = np.argsort(importances)[::-1]

# start a new wandb run and add your model hyperparameters
run = wandb.init(project="my-awesome-project", config=model_params)

# Add additional configs to wandb
run.config.update(
    {
        "test_size": test_size,
        "train_len": len(X_train),
        "test_len": len(X_test),
    }
)

# log additional visualisations to wandb
plot_class_proportions(y_train, y_test, labels)
plot_learning_curve(model, X_train, y_train)
plot_roc(y_test, y_probas, labels)
plot_precision_recall(y_test, y_probas, labels)
plot_feature_importances(model)

# [optional] finish the wandb run, necessary in notebooks
run.finish()

- Run an example Google Colab Notebook.
- Read the Developer Guide for technical details on how to integrate Scikit-Learn with W&B.
W&B Hosting Options
Weights & Biases is available in the cloud or installed on your private infrastructure. Set up a W&B Server in a production environment in one of three ways:
- Production Cloud: Set up a production deployment on a private cloud in just a few steps using terraform scripts provided by W&B.
- Dedicated Cloud: A managed, dedicated deployment on W&B's single-tenant infrastructure in your choice of cloud region.
- On-Prem/Bare Metal: W&B supports setting up a production server on most bare metal servers in your on-premises data centers. To get started quickly, run wandb server to host W&B on your local infrastructure.
See the Hosting documentation in the W&B Developer Guide for more information.
Contribution guidelines
Weights & Biases ❤️ open source, and we welcome contributions from the community! See the Contribution guide for more information on the development workflow and the internals of the wandb library. For wandb bugs and feature requests, visit GitHub Issues or contact support@wandb.com.
W&B Community
Be a part of the growing W&B Community and interact with the W&B team in our Discord. Stay connected with the latest ML updates and tutorials with W&B Fully Connected.
License
Description
🔥 A tool for visualizing and tracking your machine learning experiments. This repo contains the CLI and Python API.