Custom model with Estimators in TensorFlow

In the last tutorial we saw how easy it is to do a simple linear regression using TensorFlow. We used a LinearRegressor, which did most of the work for us. This approach is great for getting started and for simple tasks. However, if you want more control over how your model is built and how it is trained, it is not sufficient. Fortunately, TensorFlow also provides a more flexible way of doing things.
Using the example of linear regression, we want to show you how to create a custom model and train it. The same techniques apply later on when you train more complex models like neural networks and tackle more complex tasks like image classification.

Preparation

As in our previous example, we want to train the model y = m * x + b to find the best fitting values for m and b for a given set of x and y values. We start by generating the values for x and y.

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
# if you are using Jupyter and want to see the plot inline
%matplotlib inline 

x = np.linspace(0, 200, 100, dtype=np.float32) + np.random.uniform(-100, 100, size=100).astype(np.float32)
y = np.linspace(0, 200, 100, dtype=np.float32) + np.random.uniform(-100, 100, size=100).astype(np.float32)

input_fn = tf.estimator.inputs.numpy_input_fn({"x": x},  y,  shuffle=True)

This code should be familiar to you if you have read the previous tutorial on simple linear regression with TensorFlow.
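As a point of reference for the training later on, the best fitting line can also be computed directly with NumPy. The following is a minimal sketch and not part of the TensorFlow model: the fixed seed, the names m_ref and b_ref, and the helper mse are assumptions made for illustration.

```python
import numpy as np

# Fixed seed so the sketch is repeatable (an assumption; the tutorial does not seed its data).
rng = np.random.RandomState(0)

# Same recipe as in the tutorial: a ramp from 0 to 200 plus uniform noise.
x = np.linspace(0, 200, 100, dtype=np.float32) + rng.uniform(-100, 100, size=100).astype(np.float32)
y = np.linspace(0, 200, 100, dtype=np.float32) + rng.uniform(-100, 100, size=100).astype(np.float32)

# A degree-1 polynomial fit returns the slope first, then the intercept.
m_ref, b_ref = np.polyfit(x.astype(np.float64), y.astype(np.float64), 1)

def mse(m, b):
    """Mean squared error of the line y = m * x + b on the generated data."""
    return float(np.mean((m * x + b - y) ** 2))

print("reference slope:", m_ref, "reference intercept:", b_ref)
print("reference error:", mse(m_ref, b_ref))
```

Because x carries noise as well, the fitted slope will generally not be 1. The trained model can be compared against these reference values to judge how close the optimizer got.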

Creating a model function

With the newly created values for x and y we can start creating our custom model function. Instead of using the specific LinearRegressor we will use the more general Estimator class. The Estimator requires a model function that describes the model as well as the training procedure.
To create such a function, we can start with

def model_fn(features, labels, mode, params):

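The function needs to be defined with four parameters:

features: the input data as provided by the input function, here a dictionary with the key "x"
labels: the expected values, here the y values passed to the input function
mode: one of tf.estimator.ModeKeys (TRAIN, EVAL or PREDICT), telling us why the function is being called
params: an optional dictionary of additional settings passed to the Estimator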

The first thing we want to do in the model function is to get x from the given features. Remember we are providing a dictionary with the key "x" in the input function.

x = features["x"]

As a next step we create the variables for m and b needed in our model. We simply create them using the TensorFlow API.

m = tf.Variable(tf.random_uniform([1]))
b = tf.Variable(tf.zeros([1]))

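We initialize m with a random value and b with 0. Random initialization is the usual choice for weights that are multiplied with the input, and it becomes essential in neural networks, where weights that all start from the same value would all receive the same updates. For this single-weight linear model the starting value matters little: the gradient of the error with respect to m depends on x and on the prediction error, so the optimizer can move away from any starting point. For b, which is simply added, starting from 0 is the common default.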

Now that we have x, m and b, we can create our model by simply writing down the line equation.

y_pred = m * x + b
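Note that m and b each hold a single value while x holds a whole batch of inputs. The expression still works because the single values are broadcast across the batch. The sketch below illustrates this with NumPy, which follows the same broadcasting rules for this case; the concrete numbers are hypothetical stand-ins for the TensorFlow variables.

```python
import numpy as np

# Hypothetical values standing in for the TensorFlow variables and the feature batch.
m = np.array([2.0], dtype=np.float32)  # shape (1,), like the variable created for m
b = np.array([1.0], dtype=np.float32)  # shape (1,), like the variable created for b
x = np.array([0.0, 1.0, 2.0, 3.0], dtype=np.float32)  # a batch of four inputs

# The single slope and intercept are applied to every element of the batch.
y_pred = m * x + b

print(y_pred.shape)  # (4,)
print(y_pred)        # [1. 3. 5. 7.]
```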

In order to do the actual training, we have to compare the result of our model, y_pred, with the known correct values of y and tell the model to adjust its values. This step should only be performed during training. We can use the parameter mode to check whether the function is being called for training. If so, we calculate the error our model produces on the given data.

train_op = None
error = None

if mode == tf.estimator.ModeKeys.TRAIN:
    error = tf.reduce_mean((y_pred - labels) ** 2)

Given the labels parameter, which is equal to y, we subtract it from the calculated result of our model, y_pred. We then square the result so that the error is never negative and positive and negative errors don't cancel each other out. The last step in calculating the error is to take the mean. y_pred and labels are both vectors (arrays), but we want a single value representing the overall error of our model. This is why we reduce the squared differences to their mean here.
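The calculation can be followed by hand on a tiny example. This is a sketch in plain NumPy with made-up numbers; np.mean plays the role of tf.reduce_mean.

```python
import numpy as np

# Made-up predictions and labels for three data points.
y_pred = np.array([1.0, 2.0, 3.0])
labels = np.array([1.0, 4.0, 1.0])

diff = y_pred - labels   # [ 0., -2.,  2.]
print(diff.sum())        # 0.0, the raw errors cancel each other out

squared = diff ** 2      # [0., 4., 4.]
error = squared.mean()   # 8 / 3, roughly 2.67
print(error)
```

Without the squaring step this model would look perfect, even though two of its three predictions are off by 2.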

With the calculated error we need to set up a training method that tries to minimize the error by adjusting the values of m and b. TensorFlow provides many different so-called optimizers for this. In this case we will use the FtrlOptimizer to perform the training. This optimizer runs a "Follow the regularized leader" algorithm to adjust m and b. If you want to learn more about this see FTRL (Wiki).

    optimizer = tf.train.FtrlOptimizer(0.1)
    train_op = optimizer.minimize(error, global_step=tf.train.get_global_step())

First, we create the FtrlOptimizer and give it a learning rate of 0.1. The learning rate determines how large the adjustments are that the optimizer makes in each step, and therefore how fast it tries to converge to the goal of the training. You can play around with this value and see how the calculated error changes between multiple epochs.
On the optimizer we call minimize to tell it that our training goal is to get the error as low as possible. Therefore our first parameter is the error we calculated above. The second parameter is the global step, a counter of the training steps taken so far. It has to be passed in so that TensorFlow can keep track of the steps (training iterations) it has already taken.
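To get a feeling for what minimizing the error with a given learning rate means, here is a minimal sketch in plain NumPy. It uses ordinary gradient descent, not the FTRL algorithm, and a small noise-free data set generated from y = 2 * x + 1; the function name train, the data and the two learning rates are assumptions chosen for illustration.

```python
import numpy as np

def train(learning_rate, steps=200):
    """Fit y = m * x + b with plain gradient descent and return m, b and the final error."""
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = 2.0 * x + 1.0          # data generated from m = 2, b = 1
    m, b = 0.5, 0.0            # arbitrary starting values

    for _ in range(steps):
        residual = m * x + b - y
        # Gradients of the mean squared error with respect to m and b.
        grad_m = 2.0 * np.mean(residual * x)
        grad_b = 2.0 * np.mean(residual)
        # Step against the gradient, scaled by the learning rate.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b

    error = np.mean((m * x + b - y) ** 2)
    return m, b, error

m_slow, b_slow, error_slow = train(0.01)
m_fast, b_fast, error_fast = train(0.1)

print("learning rate 0.01:", m_slow, b_slow, error_slow)
print("learning rate 0.1: ", m_fast, b_fast, error_fast)
```

With the same number of steps, the larger learning rate ends up much closer to m = 2 and b = 1 on this data. The values are specific to this toy example and do not transfer directly to the FtrlOptimizer or to the noisy data used in the tutorial.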

The last thing that has to be done is to wrap everything together in an EstimatorSpec and return it from our model function. Just as the model function always has to accept the four parameters, it always has to return an EstimatorSpec.

return tf.estimator.EstimatorSpec(mode, predictions=y_pred, loss=error, train_op=train_op)

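The parameters for the EstimatorSpec are

mode: the mode our model function was called with
predictions: the output of our model, here y_pred
loss: the error we want to minimize, here error
train_op: the operation that performs one training step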

Altogether, the model function should look like this:

def model_fn(features, labels, mode, params):
    x = features["x"]

    m = tf.Variable(tf.random_uniform([1]))
    b = tf.Variable(tf.zeros([1]))

    y_pred = m * x + b

    train_op = None
    error = None

    if mode == tf.estimator.ModeKeys.TRAIN:
        error = tf.reduce_mean((y_pred - labels) ** 2)
        optimizer = tf.train.FtrlOptimizer(0.1)
        train_op = optimizer.minimize(error, global_step=tf.train.get_global_step())

    return tf.estimator.EstimatorSpec(mode, predictions=y_pred, loss=error, train_op=train_op)

Perform the training

Now that we have defined our model function we can do the training as in the previous tutorial. The key difference is that we are now using the Estimator class instead of the LinearRegressor. The first parameter of the Estimator is our freshly created model_fn. Note that you have to pass in the function itself, so don't add parentheses after its name.

estimator = tf.estimator.Estimator(model_fn, model_dir="/tmp/tutorial/custom_model")

for i in range(20):
    print("Running epoch ", i+1)
    estimator.train(input_fn)
    print()

x_test = np.linspace(0, 200, 2, dtype=np.float32) + np.random.uniform(-100, 100, size=2).astype(np.float32)
test_input_fn = tf.estimator.inputs.numpy_input_fn({"x": x_test}, shuffle=False)

y_pred = [predictions for predictions in estimator.predict(test_input_fn)]
plt.plot(x, y, '*')
plt.plot(x_test, y_pred, 'r')

As a result of the training you can see the calculated line, using our model y = m * x + b, in red.

If you want to see the code in its entirety you can download it here.

Conclusion

That's it! We completed a training run using our own custom model function. It took a little more code than using the LinearRegressor, but we gained more freedom over what our model looks like and how it is trained. This tutorial is just a starting point to show you the basics of creating your own custom model function. Going forward you can use the same approach to train more complex models like neural networks on more complex data.