Getting Started with TensorFlow (Linear Regression)

A linear regression example using TensorFlow, with Matplotlib for data visualization

Hello everyone. If you have been following python36.com for some time now, you probably already know what TensorFlow is and how to install it. Today, we are going to look at how to perform linear regression with TensorFlow. We will also cover various TensorFlow concepts such as ops and sessions as we go along. Before jumping right into the code, let’s clarify what we will be doing.

What is Linear Regression?

The problem we will be looking at is regression, more specifically, linear regression. Linear regression is a basic and commonly used type of predictive analysis. It is just a fancy way of saying, “Given the value of X, what is the value of y?” The simplest form of the linear regression equation, with one dependent and one independent variable, is defined by the formula y = W*X + b (a small worked example follows the list), where:

y = estimated dependent variable score,
W = regression coefficient, also known as weight,
b = constant, also known as bias, and
X = score on the independent variable
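
To make the formula concrete, here is a tiny NumPy illustration. It is just for intuition and not part of the tutorial code; the values 0.1 and 0.3 are the same weight and bias we will use later on:

import numpy as np

W = 0.1  # weight (regression coefficient)
b = 0.3  # bias (constant)
X = np.array([0.0, 1.0, 2.0])  # a few sample values of the independent variable

y = W * X + b
print(y)  # prints [0.3 0.4 0.5]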

Problem Definition

First, we will generate 20 random x values and compute their corresponding y values using a weight of 0.1 and a bias of 0.3. We will be using NumPy for this. After that, we will write a simple program using TensorFlow, feed our model the generated values of x and y, and expect the model to learn the weight and bias (approximately 0.1 and 0.3 for our data). We will use Matplotlib for data visualization. The libraries we will be using are listed below, followed by a quick snippet to check your installed versions:

  1. tensorflow version 1.5.0
  2. numpy version 1.14.0
  3. matplotlib version 2.1.0
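
If you want to confirm which versions are installed in your own environment, you can print them. This check is optional and not part of the tutorial code:

import tensorflow as tf
import numpy as np
import matplotlib

print(tf.__version__)          # e.g. 1.5.0
print(np.__version__)          # e.g. 1.14.0
print(matplotlib.__version__)  # e.g. 2.1.0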

Getting Started

Let’s begin writing our program then. I will explain the concepts as we go along. First, import all the necessary dependencies:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt 

The following code hides the runtime warnings emitted by TensorFlow. It’s not really a necessity, so you can skip it:

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

The seed for the random number generator is set so that it returns the same random numbers each time, which is sometimes useful for debugging.

np.random.seed(36)
tf.set_random_seed(36)

Now, we will create 20 phony x, y data points using NumPy. We will use NumPy’s random.rand() function to create x_data and generate y_data such that y = x * 0.1 + 0.3:

x_data = np.random.rand(20).astype(np.float32)
y_data = x_data * 0.1 + 0.3

If we print the x_data and y_data, we will see something like:

[Figure: TensorFlow linear regression input samples — the generated x_data and y_data values]
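
The exact numbers come from the seeded random generator; to reproduce the printout yourself, just add these two lines (not needed in the final script) after generating the data:

print(x_data)
print(y_data)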

Next, let’s define the number of training steps and some lists to record the values at each iteration:

n_steps = 100 #Total number of steps
n_iterations = [] #Nth iteration value
n_loss = [] #Loss at nth iteration
learned_weight = [] #weight at nth iteration
learned_bias = [] #bias value at nth iteration

Define TensorFlow Ops

In TensorFlow, constants, variables, and operators are collectively called ops. It is good practice to separate the definition of ops from their execution when you can. Now let’s define the weight and bias. Since their values change after each iteration, we need to define them as TensorFlow variables. We will initialize the weight with a random value and the bias with 0. Let’s also define our linear regression function y = W*x + b:

W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

The loss function we will be using is mean squared error, and the optimizer is gradient descent with a learning rate of 0.5. The objective of each training step is to adjust the weight and bias in such a way that the loss is minimized. We define these as:

loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

Initialize and Evaluate Values of TensorFlow Variables

You have to initialize a variable before using it. If you try to evaluate the variables before initializing them, you’ll run into a FailedPreconditionError (“Attempting to use uninitialized value”). The easiest way to initialize all variables at once is tf.global_variables_initializer(). To get the value of a variable, we need to fetch it within a session. We will run the training op for the number of steps defined and store the value of the weight, bias, and loss at each step in the lists defined earlier.

with tf.Session() as sess:
  # Before starting, initialize the variables.  We will 'run' this first.
  sess.run(tf.global_variables_initializer())
  for step in range(n_steps):
    sess.run(train)
    n_iterations.append(step)
    n_loss.append(loss.eval())
    learned_weight.append(W.eval())
    learned_bias.append(b.eval())

After running the session for the number of steps defined, we print out the final values of the weight and bias:

print("Final Weight: "+str(learned_weight[-1])+"\nFinal Bias: "+str(learned_bias[-1]))

The final values of the weight and bias after evaluation for me are:

Final Weight: [0.09973574]

Final Bias: [0.30012596]

They may be slightly different for you. However, the value of the weight will always be close to 0.1 and the value of the bias will always be close to 0.3.
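
If you prefer to verify this programmatically rather than by eye, a quick optional check with NumPy’s allclose could look like the snippet below; the tolerance of 0.01 is an arbitrary choice for illustration:

# learned_weight and learned_bias are the lists filled in the training loop above
assert np.allclose(learned_weight[-1], 0.1, atol=0.01)
assert np.allclose(learned_bias[-1], 0.3, atol=0.01)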

Plotting Loss, Weight, and Bias vs. Iterations

Finally, visualize the changing values of the loss, weight, and bias after each iteration using the code below:

plt.figure('Loss, Weight, and Bias')

plt.subplot(311)
plt.plot(n_iterations,n_loss)
plt.ylabel('Loss')
plt.xlabel('Iterations')

plt.subplot(312)
plt.plot(n_iterations,learned_weight)
plt.ylabel('Weight')
plt.xlabel('Iterations')

plt.subplot(313)
plt.plot(n_iterations,learned_bias)
plt.ylabel('Bias')
plt.xlabel('Iterations')

plt.show()

The graphs of the loss, weight, and bias vs. the number of iterations are shown below. Here we can see how, over the iterations, the weight is adjusted toward 0.1, the bias toward 0.3, and the loss toward 0.

[Figure: TensorFlow linear regression visualization — loss, weight, and bias vs. iterations]

Below is the entire code for the tutorial:

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt 

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

np.random.seed(36)
tf.set_random_seed(36)

# Create 20 phony x, y data points in NumPy, y = x * 0.1 + 0.3
x_data = np.random.rand(20).astype(np.float32)
y_data = x_data * 0.1 + 0.3

n_steps = 100		#Total number of steps
n_iterations = []	#Nth iteration value
n_loss = []			#Loss at nth iteration
learned_weight = []	#weight at nth iteration
learned_bias = []	#bias value at nth iteration

# Try to find values for W and b that compute y_data = W * x_data + b
# (We know that W should be 0.1 and b 0.3, but TensorFlow will
# figure that out for us.)
W = tf.Variable(tf.random_uniform([1], -1.0, 1.0))
b = tf.Variable(tf.zeros([1]))
y = W * x_data + b

# Minimize the mean squared errors.
loss = tf.reduce_mean(tf.square(y - y_data))
optimizer = tf.train.GradientDescentOptimizer(0.5)
train = optimizer.minimize(loss)

with tf.Session() as sess:
  # Before starting, initialize the variables.  We will 'run' this first.
  sess.run(tf.global_variables_initializer())
  for step in range(n_steps):
    sess.run(train)
    n_iterations.append(step)
    n_loss.append(loss.eval())
    learned_weight.append(W.eval())
    learned_bias.append(b.eval())

print("Final Weight: "+str(learned_weight[-1])+"\nFinal Bias: "+str(learned_bias[-1]))
# Learns best fit is W: [0.1], b: [0.3]

plt.figure('Loss, Weight, and Bias')

plt.subplot(311)
plt.plot(n_iterations,n_loss)
plt.ylabel('Loss')
plt.xlabel('Iterations')

plt.subplot(312)
plt.plot(n_iterations,learned_weight)
plt.ylabel('Weight')
plt.xlabel('Iterations')

plt.subplot(313)
plt.plot(n_iterations,learned_bias)
plt.ylabel('Bias')
plt.xlabel('Iterations')

plt.show()

 
