Introduction

In this notebook, I'll show you how to save and restore models with Weights & Biases (W&B).

W&B lets you save everything you need to reproduce your models (weights, architecture, predictions, and code) to a safe place in the cloud.

This is useful because you don't have to retrain your models: you can view their performance days, weeks, or even months later. Before you deploy, you can compare the performance of all the models you trained in the previous months and restore the best-performing one.

Try saving and restoring a model in this Colab.

Save a model

There are two ways to save a file and associate it with a run:

  1. Use wandb.save(filename).
  2. Put a file in the wandb run directory, and it will get uploaded at the end of the run.
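As a rough sketch of the second option, a file that was written somewhere else can be copied into the run directory so that it is picked up with the rest of the run's files. The helper name copy_into_run_dir is hypothetical and not part of wandb; it only performs the local copy, and it assumes you pass it the run directory (wandb.run.dir in a live run).

```python
import shutil
from pathlib import Path


def copy_into_run_dir(src, run_dir):
    """Copy `src` into `run_dir` and return the path of the copy.

    Hypothetical helper: `run_dir` is assumed to be the W&B run directory
    (wandb.run.dir in a live run). Only the local copy happens here.
    """
    run_dir = Path(run_dir)
    # Create the target directory if it does not exist yet.
    run_dir.mkdir(parents=True, exist_ok=True)
    # Keep the original file name inside the run directory.
    return Path(shutil.copy2(src, run_dir / Path(src).name))
```

In a script this would be called as copy_into_run_dir("model.h5", wandb.run.dir) after wandb.init().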

If you want to sync files as they're being written, you can specify a filename or glob in wandb.save.

Here's how you can do this in just a few lines of code. See this Colab for a complete example.

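import os
import wandb

# Assumes wandb.init() has already been called and that model is a Keras model.
# "model.h5" is saved in wandb.run.dir and will be uploaded at the end of training
model.save(os.path.join(wandb.run.dir, "model.h5"))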

# Save a model file manually from the current directory:
wandb.save('model.h5')

# Save all files that currently exist containing the substring "ckpt":
wandb.save('../logs/*ckpt*')

# Save any files starting with "checkpoint" as they're written to:
wandb.save(os.path.join(wandb.run.dir, "checkpoint*"))
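Before handing a glob to wandb.save, it can help to check which files the pattern matches right now. The sketch below uses only Python's standard glob module; preview_matches is a hypothetical helper, and it assumes the pattern is expanded the way Python's glob expands it, which may not match wandb's own handling in every case.

```python
import glob
import os


def preview_matches(pattern):
    """Return the sorted list of existing files that `pattern` matches."""
    # glob.glob can also return directories, so keep regular files only.
    return sorted(p for p in glob.glob(pattern) if os.path.isfile(p))


# List what "../logs/*ckpt*" would match at this moment (possibly nothing).
for path in preview_matches(os.path.join("..", "logs", "*ckpt*")):
    print(path)
```

Files created after this check are not listed, so it only previews the current state of the directory.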

You can view your saved models by navigating to a run page, clicking on the Files tab, then clicking on your model file. See an example here.

See the docs for frequently asked questions about saving and restoring.

Restore a model

Restore a file, such as a model checkpoint, into your local run folder so that you can access it in your script.

Common use cases:

Here's how you can do this in just a few lines of code. See this Colab for a complete example.

# Restore the model file "model.h5" from run "10pr4joa"
# in project "save_and_restore", owned by user "lavanyashukla"
best_model = wandb.restore('model.h5', run_path="lavanyashukla/save_and_restore/10pr4joa")

# Use the "name" attribute of the returned object if your framework expects a filename, as Keras does
model.load_weights(best_model.name)
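The run_path string in the example above has three parts joined by slashes: user, project, and run ID. A minimal sketch of assembling it from separate values follows; build_run_path is a hypothetical helper, not a wandb function, and the validation it performs is an assumption made only to catch obvious mistakes.

```python
def build_run_path(user, project, run_id):
    """Join the three parts into the "user/project/run_id" form."""
    parts = [user, project, run_id]
    for part in parts:
        # An empty part or an embedded slash would produce a malformed path.
        if not part or "/" in part:
            raise ValueError(f"invalid run path component: {part!r}")
    return "/".join(parts)


run_path = build_run_path("lavanyashukla", "save_and_restore", "10pr4joa")
print(run_path)  # lavanyashukla/save_and_restore/10pr4joa
```

The resulting string can then be passed as the run_path argument to wandb.restore.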

See the restore docs for more details.