
Introduction

In this tutorial we will complete an entire nbdev-based project from start to finish using only Google Colaboratory, a free* alternative Jupyter environment with GPU capabilities. The goal is to set up your project so that building the library, generating notebooks, testing, and setup are as seamless as possible inside Colab.

This tutorial is also available as a video:

[Embedded video]

Why Use Google Colaboratory?

Google Colaboratory offers a few unique advantages over other platforms:

  1. You can have synchronous programming within the same notebook

    When working in a team or pair programming, you can both have the exact same Jupyter notebook open and write comments or code in it simultaneously, as long as the project lives in your Google Drive. Other platforms have only recently begun testing this capability.

  2. Smaller projects that don't require git are still supported

    There are certainly situations where you (or the company you work for) don't yet see the benefit of storing your code in a repository. Simply keeping it in your Google Drive offers you this flexibility, while losing almost none of the benefits of GitHub.

  3. It's free*

    Whether you're a student or someone who worries about your GPU credits, Google Colaboratory has a free option that gives you both Jupyter and GPUs without requiring a credit card to get started. Recently Google has released a Colab Pro option for $10/month (only for the US and Canada right now), which allows for better access to GPUs and an in-house terminal, but neither of these is truly needed to start using Colab for nbdev and your software engineering projects!

Using the Template Repository

Now let's build a library!

The first step in beginning your new nbdev project should be to use the nbdev_template that fastai provides.

To start your project, open the nbdev_template repository on GitHub and select "Use this template".

From there you will be asked for:

  1. A repository name - for this tutorial we'll call it "nbdev_colab_tutorial"
  2. Whether the repository will be public or private - for this tutorial use Public

After filling out these fields, select "Create repository from template".

After a few seconds your repository should now be live.

The last step in initializing our repository is filling out settings.ini. (You may have received an email notice stating that a run failed; the unconfigured settings.ini is why.) Open the file in your repository.

We need to edit this file with everything that nbdev needs to be properly configured. See the nbdev documentation for more on settings.ini.

You'll see these commented-out lines in settings.ini. Uncomment them and set each value as needed:

# lib_name = your_project_name
# user = your_github_username
# description = A description of your project
# keywords = some keywords
# author = Your Name
# author_email = you@example.com
# copyright = Your Name or Company Name
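
As an optional sanity check, the filled-in values can be verified with Python's standard configparser module. This is a minimal sketch, not part of nbdev: the helper name missing_settings and the REQUIRED_KEYS list are hypothetical, and it assumes the values live under a [DEFAULT] section. The sample text stands in for the real file, with one line deliberately left commented out.

```python
from configparser import ConfigParser

# Hypothetical list, taken from the commented-out lines shown above.
REQUIRED_KEYS = ["lib_name", "user", "description", "keywords",
                 "author", "author_email", "copyright"]

def missing_settings(ini_text, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in [DEFAULT]."""
    parser = ConfigParser()
    parser.read_string(ini_text)
    defaults = parser.defaults()  # raw key/value pairs of the [DEFAULT] section
    return [key for key in required if not defaults.get(key, "").strip()]

# Stand-in for the contents of settings.ini (assumed layout).
sample = """
[DEFAULT]
lib_name = nbdev_colab_tutorial
user = your_github_username
# description = A description of your project
keywords = some keywords
author = Your Name
author_email = you@example.com
copyright = Your Name or Company Name
"""

print(missing_settings(sample))  # → ['description']
```

Once the repository has been cloned, the same helper could be pointed at the real file with missing_settings(open("settings.ini").read()).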

Afterwards hit the "Commit changes" button at the bottom of the editor.

Now we can get started over in Colab!

Setting up your Google Drive and Git Configuration

For this next step we are going to take advantage of Colab's Scratchpad notebook, as we're going to do a few housekeeping steps that don't need to be logged in a notebook somewhere.

First you should mount your Google Drive in your instance by running the following cell and following the prompt:

from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive

Your drive is now located at /content/drive/MyDrive.

Next we need to change our working directory into our Google Drive, as this is where our repository will be saved and where we can set up git:

%cd drive/MyDrive
/content/drive/MyDrive

Now we can clone our repository into our Drive:

!git clone https://github.com/{insert_username_here}/nbdev_colab_tutorial
Cloning into 'nbdev_colab_tutorial'...
remote: Enumerating objects: 25, done.
remote: Counting objects: 100% (25/25), done.
remote: Compressing objects: 100% (22/22), done.
remote: Total 25 (delta 3), reused 13 (delta 0), pack-reused 0
Unpacking objects: 100% (25/25), done.

We need to change directories again into our project:

%cd nbdev_colab_tutorial
/content/drive/MyDrive/nbdev_colab_tutorial
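
The two %cd steps can also be done from plain Python, which makes it possible to fail with a clear message when Drive is not mounted or the clone is missing. The following is a rough sketch under that assumption; enter_project is a hypothetical helper, and the demo uses a temporary directory in place of /content/drive/MyDrive so that it runs anywhere.

```python
import os
import tempfile
from pathlib import Path

def enter_project(drive_root, project_name):
    """Change into drive_root/project_name, failing loudly if it is missing."""
    project_dir = Path(drive_root) / project_name
    if not project_dir.is_dir():
        raise FileNotFoundError(
            f"{project_dir} not found; is Drive mounted and the repo cloned?"
        )
    os.chdir(project_dir)
    return project_dir

# Demo: a temporary directory stands in for /content/drive/MyDrive.
start_dir = Path.cwd()
with tempfile.TemporaryDirectory() as fake_drive:
    (Path(fake_drive) / "nbdev_colab_tutorial").mkdir()
    entered = enter_project(fake_drive, "nbdev_colab_tutorial")
    print(entered.name)
    os.chdir(start_dir)  # step back out before the temporary directory is removed
```

In Colab the call would be enter_project("/content/drive/MyDrive", "nbdev_colab_tutorial").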

To let git know who we are, we need to set two git configuration values:

  • git config user.name
  • git config user.email

Let's do so:

!git config user.name "Your name"
!git config user.email "you@example.com"

Lastly we need to export the template 00_core notebook so that we have a module. The nbdev command-line tools must be installed in the instance first:

!pip3 install nbdev -q
!nbdev_build_lib
Converted 00_core.ipynb.
Converted index.ipynb.

The project has now been completely primed and is ready to be worked on.

Writing a New Module and the Setup Cell for Each Notebook

To add a new module or notebook to your library, create a new Colab notebook through your Google Drive in whichever folder your notebooks are stored.

From there, all you need is a small cell at the top of each notebook that sets up your Colab environment by performing the following steps:

  1. Mounts Google Drive
  2. Changes to the library directory
  3. Installs the library
  4. Installs nbdev

from google.colab import drive
drive.mount('/content/drive')
%cd 'drive/MyDrive/{path_to_repository}/{library_name}'
!pip3 install -e . -q
!pip3 install nbdev -q
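
If the same notebook is sometimes opened outside Colab, the Drive-specific part of the setup cell can be guarded. The sketch below is one possible approach rather than anything nbdev provides: running_in_colab and setup_environment are hypothetical names, and the check simply tests whether the google.colab package can be imported. The two pip3 install lines would still follow as ordinary cell commands.

```python
import os

def running_in_colab():
    """Return True when the google.colab package can be imported."""
    try:
        import google.colab  # noqa: F401  (only present on Colab instances)
    except ImportError:
        return False
    return True

def setup_environment(project_dir):
    """Mount Drive and change into project_dir on Colab; do nothing elsewhere."""
    if not running_in_colab():
        return False
    from google.colab import drive
    drive.mount("/content/drive")
    os.chdir(project_dir)
    return True

# Reports whether the Colab-specific setup would run in this environment.
print(running_in_colab())
```

On Colab, project_dir would be the same path passed to %cd in the setup cell above.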

From there you can build your library following the tutorial. Any changes you save in your notebook are reflected in your Google Drive, so you can always call nbdev_build_lib, nbdev_clean_nbs, etc. and have them function as expected.

Pushing to Git

The last step is pushing our changes to GitHub. We can check our files in to our GitHub repo with the following commands (replace master with your repository's default branch if it is named differently):

!git add *
!git commit -m "Test commit message"
!git push origin master

And that's it! You now know how to run nbdev entirely out of your Google Drive and Google Colaboratory.