Dive Into TensorFlow, Part IV: Hello MNIST

This is the fourth article in the series “Dive Into TensorFlow”. Here is an index of all the articles in the series that have been published to date:

Part I: Getting Started with TensorFlow
Part II: Basic Concepts
Part III: Ubuntu16.04+GTX 1080+CUDA8.0+cuDNN5.0+TensorFlow
Part IV: Hello MNIST (this article)

Hello MNIST
Just as programming languages have “Hello World”, machine learning has “Hello MNIST”. The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training and testing machine learning models.

The MNIST data is hosted on Yann LeCun’s website. TensorFlow provides some Python scripts to download and split the MNIST data automatically:

In [1]: from tensorflow.examples.tutorials.mnist import input_data
 
In [2]: mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz
 
In [3]: mnist.train
Out[3]: <tensorflow.contrib.learn.python.learn.datasets.mnist.DataSet at 0x10e219d50>
 
In [4]: dir(mnist.train)
Out[4]: 
['__class__',
 '__delattr__',
 '__dict__',
 '__doc__',
 '__format__',
 '__getattribute__',
 '__hash__',
 '__init__',
 '__module__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 '_epochs_completed',
 '_images',
 '_index_in_epoch',
 '_labels',
 '_num_examples',
 'epochs_completed',
 'images',
 'labels',
 'next_batch',
 'num_examples']
 
In [5]: mnist.train.num_examples
Out[5]: 55000
 
In [6]: mnist.test.num_examples
Out[6]: 10000
 
In [7]: mnist.validation.num_examples
Out[7]: 5000

Every MNIST data point has two parts: an image of a handwritten digit and a corresponding label. We call the images “xs” and the labels “ys”. Both the training set and the test set contain xs and ys; for example, the training images are mnist.train.images and the training labels are mnist.train.labels:

In [22]: mnist.train.images
Out[22]: 
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)
 
In [23]: mnist.train.images.shape
Out[23]: (55000, 784)
 
In [24]: mnist.train.labels
Out[24]: 
array([[ 0.,  0.,  0., ...,  1.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  1.,  0.]])
 
In [25]: mnist.train.labels.shape
Out[25]: (55000, 10)
 
In [26]: mnist.test.images
Out[26]: 
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)
 
In [27]: mnist.test.images.shape
Out[27]: (10000, 784)
 
In [28]: mnist.test.labels
Out[28]: 
array([[ 0.,  0.,  0., ...,  1.,  0.,  0.],
       [ 0.,  0.,  1., ...,  0.,  0.,  0.],
       [ 0.,  1.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]])
 
In [29]: mnist.test.labels.shape
Out[29]: (10000, 10)
 
In [30]: mnist.validation.images
Out[30]: 
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)
 
In [31]: mnist.validation.images.shape
Out[31]: (5000, 784)
 
In [32]: mnist.validation.labels
Out[32]: 
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 1.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  1., ...,  0.,  0.,  0.],
       [ 0.,  1.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  1., ...,  0.,  0.,  0.]])
 
In [33]: mnist.validation.labels.shape
Out[33]: (5000, 10)

Note that mnist.train.images is a tensor (an n-dimensional array) with a shape of [55000, 784]. The first dimension indexes the images and the second dimension indexes the pixels in each image: each image is 28×28 pixels, flattened into a vector of 28 × 28 = 784 numbers. Each entry in the tensor is the pixel intensity, between 0 and 1, for a particular pixel in a particular image.
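For instance, each 784-dimensional row can be reshaped back into its 28×28 pixel grid and displayed. Here is a minimal sketch (it assumes NumPy and matplotlib are available; matplotlib is not otherwise used in this article):

import numpy as np
import matplotlib.pyplot as plt

# Take the first training image: a flat vector of 784 floats in [0, 1]
img = mnist.train.images[0]

# Reshape the flat vector back into the original 28x28 pixel grid
img = np.reshape(img, (28, 28))

# Display the digit: values near 1 are ink, values near 0 are background
plt.imshow(img, cmap='gray_r')
plt.show()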

mnist.train.labels is a [55000, 10] array of floats, and the labels are represented as “one-hot vectors”: the digit n is encoded as a vector that is 1 in the nth dimension and 0 everywhere else. For example, label 5 would be [0,0,0,0,0,1,0,0,0,0].
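To make the one-hot encoding concrete, here is a tiny sketch in plain NumPy (not part of the TensorFlow tutorial code) that converts a digit to its one-hot vector and recovers it with argmax:

import numpy as np

def to_one_hot(digit, num_classes=10):
    # A length-10 vector of zeros with a single 1 at position `digit`
    vec = np.zeros(num_classes)
    vec[digit] = 1.0
    return vec

label = to_one_hot(5)
print(label)             # one-hot vector for digit 5: [0,0,0,0,0,1,0,0,0,0]
print(np.argmax(label))  # 5 -- argmax recovers the original digit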

Build a Softmax Regression Model with TensorFlow

Softmax regression is just another name for multinomial logistic regression; if you are familiar with NLP or text mining, the maximum entropy model is yet another name for the same thing:

In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.).

Multinomial logistic regression is known by a variety of other names, including polytomous LR, multiclass LR, softmax regression, multinomial logit, maximum entropy (MaxEnt) classifier, conditional maximum entropy model.

Back to the MNIST example: it is a classic case where softmax regression is a natural, simple model. We can picture the softmax regression as looking something like the following:

[Figure: softmax regression as a computation graph, with each output y computed as a weighted sum of the input pixels x plus a bias, followed by a softmax]

Write that out as equations:

$$y_i = \text{softmax}\left(\sum_j W_{i,j}\, x_j + b_i\right)$$

Vectorize this procedure, turning it into a matrix multiplication and vector addition:

$$\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \text{softmax}\left( \begin{bmatrix} W_{1,1} & W_{1,2} & W_{1,3} \\ W_{2,1} & W_{2,2} & W_{2,3} \\ W_{3,1} & W_{3,2} & W_{3,3} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} + \begin{bmatrix} b_1 \\ b_2 \\ b_3 \end{bmatrix} \right)$$

More compactly, we can just write:

$$y = \text{softmax}(Wx + b)$$

Where:

$$\text{softmax}(x)_i = \frac{\exp(x_i)}{\sum_j \exp(x_j)}$$
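As a quick sanity check before writing the TensorFlow version, the same forward pass can be written in a few lines of NumPy. This is only an illustrative sketch; the names W, b and x mirror the TensorFlow code below, and the random input “image” is made up:

import numpy as np

def softmax(z):
    # Subtracting the max is a standard trick for numerical stability
    e = np.exp(z - np.max(z))
    return e / np.sum(e)

W = np.zeros((784, 10))   # weights: 784 input pixels -> 10 classes
b = np.zeros(10)          # biases, one per class
x = np.random.rand(784)   # a fake "image" of random pixel intensities

y = softmax(np.dot(x, W) + b)
print(y)        # with zero weights every class gets probability 0.1
print(y.sum())  # softmax outputs always sum to 1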

Now it’s time to implement the softmax regression model with TensorFlow:

In [37]: import tensorflow as tf
 
In [38]: x = tf.placeholder(tf.float32, [None, 784])
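# x is a placeholder for a batch of flattened 28x28 images (784 pixels each);
# the first dimension None means the batch can hold any number of images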
 
In [39]: W = tf.Variable(tf.zeros([784, 10]))
 
In [40]: b = tf.Variable(tf.zeros([10]))
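# W and b are the model parameters (weights and biases); they start at zero
# and are learned during training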
 
# It only takes one line to get the softmax in tensorflow
In [41]: y = tf.nn.softmax(tf.matmul(x, W) + b)
 
In [42]: y_ = tf.placeholder(tf.float32, [None, 10])
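# y_ will be fed the true one-hot labels for each image in the batch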
 
# Training the softmax regression model
In [43]: cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
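# cross_entropy is -sum(y_ * log(y)) over the 10 classes, averaged over the
# batch; it is large when the model puts low probability on the true class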
 
In [44]: train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)
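# each run of train_step performs one gradient descent update on W and b
# with a learning rate of 0.5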
 
In [45]: init = tf.initialize_all_variables()
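# note: tf.initialize_all_variables() is the pre-1.0 API; in TensorFlow 1.0+
# the same operation is spelled tf.global_variables_initializer()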
 
In [46]: sess = tf.Session()
 
In [47]: sess.run(init)
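# run 1000 training steps, each on a random batch of 100 training examples
# (i.e. stochastic gradient descent)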
 
In [49]: for i in range(1000):
    batch_xs, batch_ys = mnist.train.next_batch(100)
    sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
 
In [50]: batch_xs
Out[50]: 
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ..., 
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)
 
In [51]: batch_ys
Out[51]: 
array([[ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.],
       ......
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])
 
# Evaluate the model
In [56]: correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_,1))
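# tf.argmax(y, 1) is the model's most likely digit for each image; comparing
# it with tf.argmax(y_, 1), the true digit, gives a vector of booleans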
 
In [57]: accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
 
In [58]: print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
0.9201
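The argmax/equal/mean logic above is easy to see on a toy example; the following sketch uses made-up predictions in plain NumPy, purely for illustration:

import numpy as np

# Three fake probability vectors over three classes...
y_pred = np.array([[0.1, 0.8, 0.1],
                   [0.3, 0.3, 0.4],
                   [0.6, 0.2, 0.2]])
# ...and the corresponding true one-hot labels
y_true = np.array([[0., 1., 0.],
                   [1., 0., 0.],
                   [1., 0., 0.]])

correct = np.equal(np.argmax(y_pred, 1), np.argmax(y_true, 1))  # [True, False, True]
accuracy = np.mean(correct.astype(np.float32))                  # 2 correct out of 3
print(accuracy)  # 0.6666667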

We get an accuracy of about 92%. Is this good? In fact, it’s pretty bad, because we are using a very simple model; it serves here only as a TensorFlow case study. You can check the best published MNIST classification results here: who is the best in MNIST?

We will get a better result in the next article, based on a convolutional neural network. Stay tuned.

References:
MNIST For ML Beginners
Deep MNIST for Experts
THE MNIST DATABASE of handwritten digits

Posted by TextMiner
