TensorFlow Cloud

The TensorFlow Cloud repository provides APIs that let you move easily from debugging, training, and tuning your Keras and TensorFlow code in a local environment to distributed training and tuning on Cloud.

Introduction

TensorFlow Cloud `run` API for GCP training/tuning

Installation

Requirements

Python >= 3.6
A Google Cloud project
An authenticated GCP account
Google AI Platform APIs enabled for your GCP account. We use AI Platform for deploying docker images on GCP.
Either a functioning version of docker, if you want to use a local docker process for your build, or a Cloud Storage bucket to use with Google Cloud Build for docker image build and publishing.
Authenticated access to your Docker Container Registry
(optional) nbconvert if you are using a notebook file as entry_point as shown in usage guide #4.

For detailed end to end setup instructions, please see Setup instructions.
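
The requirements listed above can be checked from Python before you go further. The following is a minimal sketch, not part of tensorflow-cloud: the function name `check_prerequisites` and the labels it returns are hypothetical, and it only verifies that the tools are discoverable, not that your GCP account is authenticated or that the APIs are enabled.

```python
import importlib.util
import shutil
import sys


def check_prerequisites():
    """Return a dict mapping each locally checkable requirement to True/False."""
    return {
        # The requirements list asks for Python >= 3.6.
        "python>=3.6": sys.version_info >= (3, 6),
        # gcloud is used later to configure the project and authenticate.
        "gcloud on PATH": shutil.which("gcloud") is not None,
        # docker is only needed for local builds; Cloud Build is the alternative.
        "docker on PATH": shutil.which("docker") is not None,
        # nbconvert is only needed when entry_point is a notebook file.
        "nbconvert installed": importlib.util.find_spec("nbconvert") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print("{}  {}".format("OK     " if ok else "MISSING", name))
```

A missing docker or nbconvert entry is not necessarily a problem, since both are optional depending on how you build images and which kind of entry_point you use.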

Install latest release

```
pip install -U tensorflow-cloud
```

Install from source

```
git clone https://github.com/tensorflow/cloud.git
cd cloud
pip install src/python/.
```

High level overview

The TensorFlow Cloud package provides the `run` API for training your models on GCP. To start, let's walk through a simple workflow using this API.

Let's begin with Keras model training code such as the following, saved as mnist_example.py.

```
import tensorflow as tf

(x_train, y_train), (_, _) = tf.keras.datasets.mnist.load_data()

x_train = x_train.reshape((60000, 28 * 28))
x_train = x_train.astype('float32') / 255

model = tf.keras.Sequential([
  tf.keras.layers.Dense(512, activation='relu', input_shape=(28 * 28,)),
  tf.keras.layers.Dropout(0.2),
  tf.keras.layers.Dense(10, activation='softmax')
])

model.compile(loss='sparse_categorical_crossentropy',
              optimizer=tf.keras.optimizers.Adam(),
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10, batch_size=128)
```

After you have tested this model in your local environment for a few epochs, probably with a small dataset, you can train it on Google Cloud by writing the following simple script, scale_mnist.py.
```
import tensorflow_cloud as tfc
tfc.run(entry_point='mnist_example.py')
```
Running scale_mnist.py automatically applies the TensorFlow one device strategy (tf.distribute.OneDeviceStrategy) and trains your model at scale on Google Cloud Platform. Please see the usage guide section for detailed instructions and additional API parameters.

You will see an output similar to the following on your console. This information can be used to track the training job status.

```
user@desktop$ python scale_mnist.py
Job submitted successfully.
Your job ID is:  tf_cloud_train_519ec89c_a876_49a9_b578_4fe300f8865e
Please access your job logs at the following URL:
https://console.cloud.google.com/mlengine/jobs/tf_cloud_train_519ec89c_a876_49a9_b578_4fe300f8865e?project=prod-123
```
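
If you launch jobs from a wrapper script, it can be convenient to capture this output programmatically. The sketch below pulls the job ID, log URL, and project out of text shaped like the sample above. The function name `parse_submission_output` and both regular expressions are assumptions based only on that sample; the exact wording of the output may differ between releases.

```python
import re

# Patterns derived from the sample console output; treat them as assumptions.
JOB_ID_PATTERN = re.compile(r"Your job ID is:\s*(\S+)")
LOG_URL_PATTERN = re.compile(
    r"https://console\.cloud\.google\.com/mlengine/jobs/(\S+?)\?project=(\S+)"
)


def parse_submission_output(text):
    """Extract the job ID, log URL and project from captured console output."""
    job_match = JOB_ID_PATTERN.search(text)
    url_match = LOG_URL_PATTERN.search(text)
    return {
        "job_id": job_match.group(1) if job_match else None,
        "log_url": url_match.group(0) if url_match else None,
        "project": url_match.group(2) if url_match else None,
    }


sample = """Job submitted successfully.
Your job ID is:  tf_cloud_train_519ec89c_a876_49a9_b578_4fe300f8865e
Please access your job logs at the following URL:
https://console.cloud.google.com/mlengine/jobs/tf_cloud_train_519ec89c_a876_49a9_b578_4fe300f8865e?project=prod-123
"""

info = parse_submission_output(sample)
print(info["job_id"])
print(info["project"])
```

Fields that cannot be found are returned as None, so a caller can tell a failed submission apart from a changed output format.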

Setup instructions

End to end instructions to help set up your environment for TensorFlow Cloud. You can use one of the following notebooks to set up your project, or follow the instructions below.

<table align="left"> <td> <a href="https://colab.research.google.com/github/tensorflow/cloud/blob/master/examples/google_cloud_project_setup_instructions.ipynb"> <img width="50" src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo">Run in Colab </a> </td> <td> <a href="https://github.com/tensorflow/cloud/blob/master/examples/google_cloud_project_setup_instructions.ipynb"> <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">View on GitHub </a> </td> <td> <a href="https://www.kaggle.com/nitric/google-cloud-project-setup-instructions"> <img width="90" src="https://www.kaggle.com/static/images/site-logo.png" alt="Kaggle logo">Run in Kaggle </a> </td> </table>

Create a new local directory

```
mkdir tensorflow_cloud
cd tensorflow_cloud
```

Make sure you have python >= 3.6
```
python -V
```

Set up virtual environment

```
virtualenv tfcloud --python=python3
source tfcloud/bin/activate
```
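
To confirm that the activated interpreter is the one from the virtual environment, you can ask Python directly. This is a small sketch with a hypothetical helper name, `in_virtual_environment`; it relies on `sys.base_prefix` (set by venv and recent virtualenv releases) and on `sys.real_prefix` (set by older virtualenv releases).

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a virtual environment sys.prefix points at the environment,
    # while sys.base_prefix still points at the base installation.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    # Older virtualenv releases expose sys.real_prefix instead.
    return hasattr(sys, "real_prefix") or base_prefix != sys.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtual_environment())
```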

Set up your Google Cloud project

Verify that gcloud sdk is installed.
```
which gcloud
```
Set default gcloud project
```
export PROJECT_ID=<your-project-id>
gcloud config set project $PROJECT_ID
```

Authenticate your GCP account

Create a service account.

```
export SA_NAME=<your-sa-name>
gcloud iam service-accounts create $SA_NAME
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member serviceAccount:$SA_NAME@$PROJECT_ID.iam.gserviceaccount.com \
    --role 'roles/editor'
```
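
The member string in the command above is assembled from the service account name and the project ID. If you drive this step from Python, the same assembly might look like the sketch below. The helper names are hypothetical, and the command list simply mirrors the gcloud invocation shown above; nothing is executed unless you pass the list to `subprocess.run` yourself.

```python
def service_account_email(sa_name, project_id):
    """Build the service account email in the form used by the commands above."""
    return "{}@{}.iam.gserviceaccount.com".format(sa_name, project_id)


def add_editor_binding_command(sa_name, project_id):
    """Return the add-iam-policy-binding call above as an argument list."""
    member = "serviceAccount:" + service_account_email(sa_name, project_id)
    return [
        "gcloud", "projects", "add-iam-policy-binding", project_id,
        "--member", member,
        "--role", "roles/editor",
    ]


if __name__ == "__main__":
    # Placeholder values; substitute your own service account name and project ID.
    print(" ".join(add_editor_binding_command("my-sa-name", "my-project-id")))
```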

Create a key for your service account.

```
gcloud iam service-accounts keys create ~/key.json --iam-account $SA_NAME@$PROJECT_ID.iam.gserviceaccount.com
```

Create the GOOGLE_APPLICATION_CREDENTIALS environment variable.

```
export GOOGLE_APPLICATION_CREDENTIALS=~/key.json
```
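
A quick way to confirm that the variable points at a usable key is to read the file back. The sketch below is an illustration rather than part of tensorflow-cloud: the function name `check_application_credentials` is hypothetical, and it assumes the key file is the usual service account JSON with `type` and `client_email` fields. It does not contact GCP, so it cannot tell whether the key is still valid.

```python
import json
import os


def check_application_credentials(environ=None):
    """Return (ok, message) describing the state of GOOGLE_APPLICATION_CREDENTIALS."""
    environ = os.environ if environ is None else environ
    path = environ.get("GOOGLE_APPLICATION_CREDENTIALS")
    if not path:
        return False, "GOOGLE_APPLICATION_CREDENTIALS is not set"
    # Expand a leading ~ in case the value was stored without shell expansion.
    path = os.path.expanduser(path)
    if not os.path.isfile(path):
        return False, "key file not found: {}".format(path)
    try:
        with open(path) as handle:
            key = json.load(handle)
    except ValueError:
        return False, "key file is not valid JSON"
    # Assumption: service account keys carry type == "service_account".
    if not isinstance(key, dict) or key.get("type") != "service_account":
        return False, "key file is not a service account key"
    return True, "service account key for {}".format(key.get("client_email"))


if __name__ == "__main__":
    print(check_application_credentials())
```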

Create a Cloud Storage bucket. Using Google Cloud Build is the recommended method for building and publishing docker images, although a local docker daemon process is also supported, depending on your specific needs.
```
BUCKET_NAME="your-bucket-name"
REGION="us-central1"
gcloud auth login
gsutil mb -l $REGION gs://$BUCKET_NAME
```
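
Bucket creation fails if the name does not follow Cloud Storage naming rules, so it can help to check the name before calling gsutil. The following sketch checks a simplified subset of those rules and rebuilds the `gsutil mb` call from above as an argument list. The helper names are hypothetical, and the pattern is an assumption that does not cover every rule (for example, names containing dots have extra constraints).

```python
import re

# Assumption: simplified subset of the bucket naming rules. 3-63 characters,
# lowercase letters, digits, dashes, underscores and dots, starting and
# ending with a letter or digit.
_BUCKET_NAME = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def bucket_uri(bucket_name):
    """Return the gs:// URI for bucket_name, or raise ValueError if it looks invalid."""
    if not _BUCKET_NAME.fullmatch(bucket_name):
        raise ValueError("unexpected bucket name: {!r}".format(bucket_name))
    return "gs://" + bucket_name


def make_bucket_command(bucket_name, region="us-central1"):
    """Build the `gsutil mb` invocation shown above as an argument list."""
    return ["gsutil", "mb", "-l", region, bucket_uri(bucket_name)]


if __name__ == "__main__":
    print(" ".join(make_bucket_command("your-bucket-name")))
```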
(optional for local docker setup)
```
sudo dockerd
```
Authenticate access to Google Cloud registry.
```
gcloud auth configure-docker
```
Install nbconvert if you plan to use a notebook file entry_point as shown in usage guide #4.
```
pip install nbconvert
```
Install latest release of tensorflow-cloud
```
pip install tensorflow-cloud
```

Usage guide

As described in the high level overview, the run API allows you to train your models at scale on GCP. It can be used in four different ways, determined by where you call it (terminal vs. IPython notebook) and by your entry_point parameter. entry_point is the optional path to a Python script or notebook file that contains your TensorFlow Keras training code. This is the most important parameter in the API.

```
run(entry_point=None,
    requirements_txt=None,
    distribution_strategy='auto',
    docker_config='auto',
    chief_config='auto',
    worker_config='auto',
    worker_count=0,
    entry_point_args=None,
    stream_logs=False,
    job_labels=None,
    **kwargs)
```
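
As an illustration of how a few of these parameters fit together, the sketch below validates some arguments locally before handing them to `tfc.run`. The helpers `build_run_kwargs` and `submit` are hypothetical and not part of the package. The check that entry_point ends in `.py` or `.ipynb` is an assumption drawn from the description above, and only parameter names that appear in the signature are used.

```python
import os


def build_run_kwargs(entry_point=None, requirements_txt=None, worker_count=0,
                     stream_logs=False, job_labels=None):
    """Check a few arguments and return keyword arguments for tfc.run."""
    if entry_point is not None:
        extension = os.path.splitext(entry_point)[1]
        # Assumption: entry_point is either a Python script or a notebook file.
        if extension not in (".py", ".ipynb"):
            raise ValueError(
                "entry_point must be a .py or .ipynb file, got {!r}".format(entry_point))
    if worker_count < 0:
        raise ValueError("worker_count must be >= 0")
    return {
        "entry_point": entry_point,
        "requirements_txt": requirements_txt,
        "worker_count": worker_count,
        "stream_logs": stream_logs,
        "job_labels": job_labels,
    }


def submit(**kwargs):
    """Submit a job; requires tensorflow-cloud and an authenticated GCP setup."""
    # Imported lazily so build_run_kwargs works without the package installed.
    import tensorflow_cloud as tfc
    return tfc.run(**build_run_kwargs(**kwargs))
```

For example, `submit(entry_point='mnist_example.py', stream_logs=True)` would forward those two values and leave the remaining parameters at the defaults shown in the signature.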

Using a python file as entry_point.

If you have your tf.keras model in a python file (mnist_example.py), then you can write the following simple script (scale_mnist.py) to scale your model on GCP.
```
import tensorflow_cloud as tfc
tfc.run(entry_point='mnist_example.py')
```
Please note that all the files in the same directory tree as entry_point will be packaged in the docker image created, along with the entry_point file. It's recommended to create a new directory to house each cloud project which includes necessary files and nothing else, to optimize image build times.
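
Since everything under the entry_point directory ends up in the image, it can be useful to see what is there before submitting. The sketch below walks that directory and lists files from largest to smallest. The function name `list_packaged_candidates` is hypothetical, and the listing only approximates what gets packaged; the actual build may include or exclude files differently.

```python
import os


def list_packaged_candidates(entry_point):
    """Return (relative_path, size_in_bytes) for files under entry_point's directory."""
    root = os.path.dirname(os.path.abspath(entry_point))
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full_path = os.path.join(dirpath, name)
            # Skip entries that are not regular files, such as broken symlinks.
            if os.path.isfile(full_path):
                found.append((os.path.relpath(full_path, root),
                              os.path.getsize(full_path)))
    # Largest files first, since they dominate image build time.
    return sorted(found, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for path, size in list_packaged_candidates("mnist_example.py"):
        print("{:>12}  {}".format(size, path))
```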

Using a notebook file as entry_point.

If you have your tf.keras model in a notebook file (mnist_example.ipynb), then you can write the following simple script (scale_mnist.py) to scale your model on GCP.
```
import tensorflow_cloud as tfc
tfc.run(entry_point='mnist_example.ipynb')
```
Please note that all the files in the same directory tree as entry_point will be packaged in the docker image created, along with the entry_point file. As with the python script entry_point above, we recommend creating a new directory to house each cloud project which includes necessary files and nothing else, to optimize image build times.

Using run within a python script that contains the tf.keras model.

You can use the run API from within your python file that contains the tf.keras model (`mnist_scale.
