Dive Into TensorFlow, Part IV: Hello MNIST
This is the fourth article in the series “Dive Into TensorFlow”; here is an index of all the articles in the series that have been published to date:
Part I: Getting Started with TensorFlow
Part II: Basic Concepts
Part III: Ubuntu16.04+GTX 1080+CUDA8.0+cuDNN5.0+TensorFlow
Part IV: Hello MNIST (this article)
Hello MNIST
Just as every programming language has its “Hello World”, machine learning has its “Hello MNIST”. The MNIST database (Modified National Institute of Standards and Technology database) is a large database of handwritten digits that is commonly used for training and testing machine learning models.
The MNIST data is hosted on Yann LeCun’s website. TensorFlow provides Python scripts to download and split the MNIST data automatically:
In [1]: from tensorflow.examples.tutorials.mnist import input_data

In [2]: mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
Extracting MNIST_data/train-images-idx3-ubyte.gz
Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
Extracting MNIST_data/train-labels-idx1-ubyte.gz
Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
Extracting MNIST_data/t10k-images-idx3-ubyte.gz
Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

In [3]: mnist.train
Out[3]: <tensorflow.contrib.learn.python.learn.datasets.mnist.DataSet at 0x10e219d50>

In [4]: dir(mnist.train)
Out[4]:
['__class__', '__delattr__', '__dict__', '__doc__', '__format__',
 '__getattribute__', '__hash__', '__init__', '__module__', '__new__',
 '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__',
 '__str__', '__subclasshook__', '__weakref__', '_epochs_completed',
 '_images', '_index_in_epoch', '_labels', '_num_examples',
 'epochs_completed', 'images', 'labels', 'next_batch', 'num_examples']

In [5]: mnist.train.num_examples
Out[5]: 55000

In [6]: mnist.test.num_examples
Out[6]: 10000

In [7]: mnist.validation.num_examples
Out[7]: 5000
Every MNIST data point has two parts: an image of a handwritten digit and a corresponding label. We call the images “xs” and the labels “ys”. Both the training set and the test set contain xs and ys; for example, the training images are mnist.train.images and the training labels are mnist.train.labels:
In [22]: mnist.train.images
Out[22]:
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)

In [23]: mnist.train.images.shape
Out[23]: (55000, 784)

In [24]: mnist.train.labels
Out[24]:
array([[ 0.,  0.,  0., ...,  1.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  1.,  0.]])

In [25]: mnist.train.labels.shape
Out[25]: (55000, 10)

In [26]: mnist.test.images
Out[26]:
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)

In [27]: mnist.test.images.shape
Out[27]: (10000, 784)

In [28]: mnist.test.labels
Out[28]:
array([[ 0.,  0.,  0., ...,  1.,  0.,  0.],
       [ 0.,  0.,  1., ...,  0.,  0.,  0.],
       [ 0.,  1.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]])

In [29]: mnist.test.labels.shape
Out[29]: (10000, 10)

In [30]: mnist.validation.images
Out[30]:
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)

In [31]: mnist.validation.images.shape
Out[31]: (5000, 784)

In [32]: mnist.validation.labels
Out[32]:
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 1.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  1., ...,  0.,  0.,  0.],
       [ 0.,  1.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  1., ...,  0.,  0.,  0.]])

In [33]: mnist.validation.labels.shape
Out[33]: (5000, 10)
Note that mnist.train.images is a tensor (an n-dimensional array) with a shape of [55000, 784]. The first dimension indexes the images and the second dimension indexes the pixels in each image. Each entry in the tensor is a pixel intensity between 0 and 1 for a particular pixel in a particular image.
mnist.train.labels is a [55000, 10] array of floats, and each label is represented as a “one-hot vector”: the nth digit is encoded as a vector that is 1 in the nth dimension and 0 everywhere else. For example, label 5 would be [0,0,0,0,0,1,0,0,0,0].
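To make that layout concrete, here is a small sketch (assuming NumPy and matplotlib are installed alongside TensorFlow) that reshapes one flattened 784-pixel row back into its 28x28 grid and recovers the digit from its one-hot label:

import numpy as np
import matplotlib.pyplot as plt

# Grab the first training example: a flat vector of 784 pixel intensities
# plus a one-hot label vector of length 10.
image = mnist.train.images[0]
label = mnist.train.labels[0]

print(image.shape, label.shape)      # (784,) (10,)
print("digit:", np.argmax(label))    # argmax turns the one-hot vector back into a digit

# Reshape the flat vector into the original 28x28 image and display it.
plt.imshow(image.reshape(28, 28), cmap="gray")
plt.show()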
Build a Softmax Regression Model with TensorFlow
Softmax regression is just another name for multinomial logistic regression; and if you are familiar with NLP or text mining, the maximum entropy model is yet another name for the same technique:
In statistics, multinomial logistic regression is a classification method that generalizes logistic regression to multiclass problems, i.e. with more than two possible discrete outcomes. That is, it is a model that is used to predict the probabilities of the different possible outcomes of a categorically distributed dependent variable, given a set of independent variables (which may be real-valued, binary-valued, categorical-valued, etc.).
Multinomial logistic regression is known by a variety of other names, including polytomous LR, multiclass LR, softmax regression, multinomial logit, maximum entropy (MaxEnt) classifier, and the conditional maximum entropy model.
Back to the MNIST example: it is a classic case where softmax regression is a natural, simple model. The idea is to add up weighted pixel intensities (plus a bias) as evidence for each of the ten classes, and then convert that evidence into probabilities.
Write that out as equations: the evidence for class i is a weighted sum of the pixel intensities plus a bias,

evidence_i = Σ_j W_{i,j} · x_j + b_i,

and the predicted probabilities come from pushing the evidence through softmax, y = softmax(evidence). Vectorizing this procedure turns it into a matrix multiplication and a vector addition, so more compactly we can just write:

y = softmax(Wx + b)

where x is the flattened 784-pixel image, W is the [784, 10] weight matrix, b is the 10-dimensional bias vector, and y is the resulting vector of class probabilities.
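For intuition, here is a minimal NumPy sketch of that forward pass, using hypothetical random weights rather than trained ones; the TensorFlow code below does the same arithmetic:

import numpy as np

def softmax(z):
    # Subtract the row-wise maximum for numerical stability, then normalize.
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# Hypothetical parameters with MNIST shapes: 784 pixels in, 10 classes out.
W = np.random.randn(784, 10) * 0.01
b = np.zeros(10)

x = mnist.train.images[:5]   # a batch of 5 flattened images, shape (5, 784)
y = softmax(x.dot(W) + b)    # shape (5, 10); each row is a probability distribution
print(y.sum(axis=1))         # every row sums to 1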
Now it’s time to implement the softmax regression model with TensorFlow:
In [37]: import tensorflow as tf

In [38]: x = tf.placeholder(tf.float32, [None, 784])

In [39]: W = tf.Variable(tf.zeros([784, 10]))

In [40]: b = tf.Variable(tf.zeros([10]))

# It only takes one line to get the softmax in TensorFlow
In [41]: y = tf.nn.softmax(tf.matmul(x, W) + b)

In [42]: y_ = tf.placeholder(tf.float32, [None, 10])

# Training the softmax regression model
In [43]: cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

In [44]: train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

In [45]: init = tf.initialize_all_variables()

In [46]: sess = tf.Session()

In [47]: sess.run(init)

In [49]: for i in range(1000):
   ....:     batch_xs, batch_ys = mnist.train.next_batch(100)
   ....:     sess.run(train_step, feed_dict={x: batch_xs, y_: batch_ys})
   ....:

In [50]: batch_xs
Out[50]:
array([[ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       ...,
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.],
       [ 0.,  0.,  0., ...,  0.,  0.,  0.]], dtype=float32)

In [51]: batch_ys
Out[51]:
array([[ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.],
       ......
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  1.,  0.,  0.,  0.,  0.,  0.,  0.],
       [ 0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  0.,  1.]])

# Evaluate the model
In [56]: correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))

In [57]: accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

In [58]: print(sess.run(accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
0.9201
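Two small caveats about the session above, assuming a TensorFlow 1.x-style API: tf.initialize_all_variables() was later deprecated in favor of tf.global_variables_initializer(), and computing -sum(y_ * log(y)) by hand can be numerically unstable when a predicted probability reaches 0. A hedged sketch of the same model using the built-in cross-entropy op:

import tensorflow as tf

x  = tf.placeholder(tf.float32, [None, 784])
y_ = tf.placeholder(tf.float32, [None, 10])
W  = tf.Variable(tf.zeros([784, 10]))
b  = tf.Variable(tf.zeros([10]))

# Keep the raw scores (logits); the loss op applies softmax and log internally,
# which avoids taking log(0) for very confident wrong predictions.
logits = tf.matmul(x, W) + b
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits(labels=y_, logits=logits))
train_step = tf.train.GradientDescentOptimizer(0.5).minimize(cross_entropy)

init = tf.global_variables_initializer()  # replacement for the deprecated initialize_all_variables()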
The model reaches about 92% accuracy on the test set. Is that good? In fact, it’s pretty bad, because we’re using a very simple model and treating it only as a TensorFlow case study. You can check the best published MNIST results here: who is the best in MNIST?
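If you want to see where the remaining errors come from, a quick sketch (reusing tf, sess, x, and y from the session above) compares the predicted and true digits on the test set:

import numpy as np

# Predicted digit for every test image, and the true digit from the one-hot labels.
predictions = sess.run(tf.argmax(y, 1), feed_dict={x: mnist.test.images})
truth = np.argmax(mnist.test.labels, axis=1)

wrong = np.where(predictions != truth)[0]
print("misclassified %d of %d test images" % (len(wrong), len(truth)))
for i in wrong[:5]:
    print("index %d: predicted %d, actual %d" % (i, predictions[i], truth[i]))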
We will get a better result in the next chapter, based on a convolutional neural network. Stay tuned.
References:
MNIST For ML Beginners
Deep MNIST for Experts
THE MNIST DATABASE of handwritten digits
Posted by TextMiner