Creating a Neural Network for Image Recognition with Keras in Python

Waleed Mousa
Artificial Intelligence in Plain English
6 min read · Mar 1, 2023

Image recognition is a computer vision task that involves identifying and categorizing objects or features within an image.

It has many practical applications, such as self-driving cars, facial recognition, and medical imaging.

Now that we have a basic understanding of image recognition, let’s get started with the tutorial.

Step 1: Importing the Required Libraries

The first step is to import the required libraries. For this tutorial, we will be using the following libraries:

  • Keras
  • NumPy
  • Matplotlib

To import these libraries, open a new Python file and add the following code:

# Importing the required libraries
import keras
import numpy as np
import matplotlib.pyplot as plt

Step 2: Loading the Dataset

The next step is to load the dataset. For this tutorial, we will be using the MNIST dataset, which is a collection of 70,000 grayscale images of handwritten digits (0 through 9), each 28x28 pixels in size. The dataset is split into 60,000 training images and 10,000 testing images.

To load the dataset, we will use the keras.datasets module, which provides a number of popular datasets for machine learning. To load the MNIST dataset, add the following code:

# Loading the MNIST dataset
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

This will download the dataset the first time it runs (later runs reuse a cached copy) and return it already split into training and testing sets. The train_images and train_labels variables contain the training images and labels, while the test_images and test_labels variables contain the testing images and labels.

Step 3: Exploring the Dataset

Before we start building our neural network, let’s take a look at the dataset. We can use the matplotlib library to display some of the images from the dataset.

To display the first image in the training set, add the following code:

# Displaying the first image in the training set
plt.imshow(train_images[0], cmap='gray')
plt.show()

This will display the first image in the training set as a grayscale image.

Step 4: Preprocessing the Dataset

Before we can train our neural network, we need to preprocess the dataset. This involves converting the images to a format that can be used by the neural network.

The images are already grayscale, so the first step is to add an explicit channel dimension (which the convolutional layers expect) and normalize the pixel values. To do this, add the following code:

# Preprocessing the dataset
train_images = train_images.reshape((60000, 28, 28, 1))
train_images = train_images.astype('float32') / 255

test_images = test_images.reshape((10000, 28, 28, 1))
test_images = test_images.astype('float32') / 255

This code will reshape the images to have a height and width of 28 pixels and a depth of 1 (since the images are grayscale). It will also normalize the pixel values to be between 0 and 1.
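To see what the reshape and normalization do without loading the full dataset, here is a minimal sketch on a small synthetic batch. The array fake_images and its size of 4 are made up for illustration. Passing -1 as the first dimension lets NumPy infer the number of images, which avoids hard-coding 60000 or 10000.

```python
import numpy as np

# Hypothetical stand-in for MNIST: 4 random 28x28 images with uint8 pixels (0-255)
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 28, 28), dtype=np.uint8)

# Add a trailing channel dimension; -1 lets NumPy infer the number of images
reshaped = fake_images.reshape((-1, 28, 28, 1))

# Convert to float32 and scale pixel values from [0, 255] to [0, 1]
normalized = reshaped.astype('float32') / 255

print(reshaped.shape)
print(normalized.dtype, normalized.min(), normalized.max())
```

The same two lines should work unchanged on train_images and test_images, since only the first dimension differs.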

The next step is to one-hot encode the labels. This involves converting each label to a vector of length 10, where the index corresponding to the true class is set to 1 and all other indices are set to 0. To do this, add the following code:

# One-hot encoding the labels
from keras.utils import to_categorical

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)

This code will convert the labels to a one-hot encoded format.
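To make the one-hot format concrete, the same encoding can be reproduced with plain NumPy. This is only a sketch of the idea, not a description of how to_categorical is implemented, and the three example labels are made up for illustration.

```python
import numpy as np

# Hypothetical example labels (digits 0-9), chosen for illustration
labels = np.array([5, 0, 4])

# Row i of the 10x10 identity matrix is the one-hot vector for class i
one_hot = np.eye(10, dtype='float32')[labels]

print(one_hot)

# Taking the argmax of each row recovers the original integer labels
recovered = one_hot.argmax(axis=1)
print(recovered)
```

Each row contains a single 1 at the index of the true digit, so the first row has its 1 at index 5.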

Step 5: Building the Neural Network

Now that we have preprocessed the dataset, we can start building our neural network. For this tutorial, we will be using a simple convolutional neural network (CNN) with the following architecture:

  1. Convolutional layer with 32 filters, each with a size of 3x3
  2. ReLU activation function
  3. Max pooling layer with a pool size of 2x2
  4. Convolutional layer with 64 filters, each with a size of 3x3
  5. ReLU activation function
  6. Max pooling layer with a pool size of 2x2
  7. Convolutional layer with 64 filters, each with a size of 3x3
  8. ReLU activation function
  9. Flatten layer
  10. Dense layer with 64 neurons and a ReLU activation function
  11. Dense layer with 10 neurons and a softmax activation function

To build this network, add the following code:

# Building the neural network
from keras import models
from keras import layers

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

model.summary()

This will create a new sequential model and add the layers specified in the architecture. The summary() method will print out a summary of the model, including the output shape and the number of parameters in each layer.
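The shapes and parameter counts reported for this architecture can be checked by hand. The sketch below assumes the Keras defaults used in the code above: no padding and a stride of 1 for the convolutions, and a pooling stride equal to the pool size. The helper function names are made up for this example.

```python
def conv_output(size, kernel=3):
    # With no padding and stride 1, each side shrinks by kernel - 1
    return size - kernel + 1

def pool_output(size, pool=2):
    # 2x2 max pooling with stride 2 halves each side, rounding down
    return size // pool

def conv_params(kernel, in_channels, filters):
    # One kernel x kernel x in_channels weight block plus one bias per filter
    return (kernel * kernel * in_channels + 1) * filters

def dense_params(inputs, units):
    # One weight per input plus one bias, for each unit
    return (inputs + 1) * units

size = 28
size = conv_output(size)   # 26 after the first convolution
size = pool_output(size)   # 13 after the first pooling layer
size = conv_output(size)   # 11 after the second convolution
size = pool_output(size)   # 5 after the second pooling layer
size = conv_output(size)   # 3 after the third convolution
flat = size * size * 64    # 576 values after the flatten layer

params = [
    conv_params(3, 1, 32),    # first convolutional layer
    conv_params(3, 32, 64),   # second convolutional layer
    conv_params(3, 64, 64),   # third convolutional layer
    dense_params(flat, 64),   # dense layer with 64 neurons
    dense_params(64, 10),     # output layer with 10 neurons
]
total = sum(params)

print(flat)
print(params)
print(total)
```

Under these assumptions the model has 93,322 trainable parameters, most of them in the third convolutional layer and the first dense layer.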

Step 6: Compiling the Model

Before we can train the model, we need to compile it. This involves specifying the loss function, optimizer, and metrics to use during training.

For this tutorial, we will be using the categorical crossentropy loss function (which matches our one-hot encoded labels), the RMSprop optimizer, and the accuracy metric.

To compile the model, add the following code:

# Compiling the model
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
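To build some intuition for the loss function, here is a minimal NumPy sketch of categorical crossentropy. It is an illustration of the formula, not the Keras implementation. The clipping constant eps and the two example predictions are assumptions chosen for this example.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Clip predictions so that log(0) never occurs
    y_pred = np.clip(y_pred, eps, 1.0)
    # Per-sample loss is -sum(true * log(pred)); average over the batch
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# One sample whose true class is the digit 7, one-hot encoded
y_true = np.eye(10)[[7]]

# A confident prediction: 0.91 on class 7, 0.01 on every other class
confident = np.full((1, 10), 0.01)
confident[0, 7] = 0.91

# An unsure prediction: 0.1 on every class
unsure = np.full((1, 10), 0.1)

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)

print(loss_confident)
print(loss_unsure)
```

Because the label is one-hot, only the probability assigned to the true class contributes to the loss, so the confident prediction scores a much lower loss than the unsure one.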

Step 7: Training the Model

Now that we have compiled the model, we can start training it. To do this, we use the fit() method, which takes the training images and labels as input.

For this tutorial, we will train the model for 5 epochs with a batch size of 64.

To train the model, add the following code:

# Training the model
history = model.fit(train_images, train_labels, epochs=5, batch_size=64, validation_data=(test_images, test_labels))

This will train the model on the training set and report the loss and accuracy on the testing set after each epoch. The output looks like this:

Epoch 1/5
938/938 [==============================] - 35s 36ms/step - loss: 0.1774 - accuracy: 0.9447 - val_loss: 0.0551 - val_accuracy: 0.9837
Epoch 2/5
938/938 [==============================] - 47s 50ms/step - loss: 0.0464 - accuracy: 0.9862 - val_loss: 0.0322 - val_accuracy: 0.9894
Epoch 3/5
938/938 [==============================] - 34s 37ms/step - loss: 0.0324 - accuracy: 0.9895 - val_loss: 0.0328 - val_accuracy: 0.9894
Epoch 4/5
938/938 [==============================] - 139s 148ms/step - loss: 0.0241 - accuracy: 0.9927 - val_loss: 0.0320 - val_accuracy: 0.9889
Epoch 5/5
938/938 [==============================] - 129s 137ms/step - loss: 0.0192 - accuracy: 0.9945 - val_loss: 0.0257 - val_accuracy: 0.9917
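The 938 shown at the start of each progress line is the number of batches per epoch. A short calculation, using only the training set size and batch size from this tutorial, shows where it comes from:

```python
import math

train_size = 60000   # number of training images
batch_size = 64      # batch size passed to fit()

# The final batch may be smaller than the rest, so round up
steps_per_epoch = math.ceil(train_size / batch_size)

# Number of images left over for the final, partial batch
last_batch = train_size - (steps_per_epoch - 1) * batch_size

print(steps_per_epoch)  # 938
print(last_batch)       # 32
```

So each epoch consists of 937 full batches of 64 images and one final batch of 32.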

Step 8: Evaluating the Model

Once the model has finished training, we can evaluate its performance on the testing set. To do this, we use the evaluate() method, which takes the testing images and labels as input.

To evaluate the model, add the following code:

# Evaluating the model
test_loss, test_acc = model.evaluate(test_images, test_labels)
print('Test accuracy:', test_acc)

This will print out the test accuracy of the model as follows:

313/313 [==============================] - 2s 6ms/step - loss: 0.0257 - accuracy: 0.9917
Test accuracy: 0.9916999936103821
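As a rough sketch of what these numbers mean: accuracy is the fraction of images whose highest-probability class matches the true class, and the 313 steps follow from the 10,000 test images, assuming evaluate() uses its default batch size of 32. The toy labels and predictions below are made up for illustration and use 3 classes instead of 10.

```python
import math
import numpy as np

# 10,000 test images split into batches of 32 (assumed default batch size)
eval_steps = math.ceil(10000 / 32)
print(eval_steps)  # 313

# Hypothetical toy data: 4 samples, 3 classes, one-hot labels
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]])
y_pred = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.4, 0.3],
                   [0.1, 0.6, 0.3]])

# A prediction counts as correct when the argmax matches the true class
accuracy = float(np.mean(y_pred.argmax(axis=1) == y_true.argmax(axis=1)))
print(accuracy)  # 0.75
```

By the same logic, a test accuracy of 0.9917 on 10,000 images corresponds to 9,917 correct predictions.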

Step 9: Making Predictions

Now that we have trained and evaluated the model, we can use it to make predictions on new images. To do this, we use the predict() method, which takes an array of images as input and returns, for each image, an array of 10 probabilities (one per digit class).

For this tutorial, we will use the first 5 images from the testing set to make predictions.

To make predictions, add the following code:

# Making predictions
predictions = model.predict(test_images[:5])

print(predictions)

This will print out the predictions for the first 5 images in the testing set. Each row holds the 10 class probabilities for one image:

1/1 [==============================] - 0s 111ms/step
[[7.29186069e-12 3.20411386e-09 1.10701659e-09 1.24971899e-09
1.30131211e-10 1.20133530e-11 4.06263604e-16 1.00000000e+00
1.55383987e-11 1.13229692e-08]
[8.00600674e-07 1.82235237e-06 9.99997377e-01 1.34031059e-11
3.01229180e-10 7.96471789e-14 1.69590155e-08 1.49224140e-11
2.20731988e-09 3.26718172e-13]
[8.41643699e-11 9.99996543e-01 1.90355204e-10 1.28589468e-13
1.13481292e-06 9.23991994e-09 5.43194878e-10 2.26726706e-06
2.85804580e-09 4.99099428e-09]
[9.99999046e-01 3.61908142e-10 2.04993089e-09 2.19719301e-08
2.60740648e-11 7.64901387e-09 5.90288607e-07 8.89923513e-08
8.99829455e-09 2.75555749e-07]
[2.56195212e-14 9.25312344e-13 2.94301259e-14 1.95156747e-14
9.99999881e-01 7.27259529e-12 1.04306746e-11 1.31689182e-11
1.21775923e-10 7.51667457e-08]]
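Raw probabilities are hard to read, so a common follow-up is to take the index of the largest value in each row as the predicted digit. The sketch below uses the probabilities from the output above, rounded for brevity, so that it runs without the trained model.

```python
import numpy as np

# Probabilities copied from the output above (rounded)
predictions = np.array([
    [7.29e-12, 3.20e-09, 1.11e-09, 1.25e-09, 1.30e-10,
     1.20e-11, 4.06e-16, 1.00e+00, 1.55e-11, 1.13e-08],
    [8.01e-07, 1.82e-06, 9.99997e-01, 1.34e-11, 3.01e-10,
     7.96e-14, 1.70e-08, 1.49e-11, 2.21e-09, 3.27e-13],
    [8.42e-11, 9.99997e-01, 1.90e-10, 1.29e-13, 1.13e-06,
     9.24e-09, 5.43e-10, 2.27e-06, 2.86e-09, 4.99e-09],
    [9.99999e-01, 3.62e-10, 2.05e-09, 2.20e-08, 2.61e-11,
     7.65e-09, 5.90e-07, 8.90e-08, 9.00e-09, 2.76e-07],
    [2.56e-14, 9.25e-13, 2.94e-14, 1.95e-14, 1.00e+00,
     7.27e-12, 1.04e-11, 1.32e-11, 1.22e-10, 7.52e-08],
])

# The index of the largest probability in each row is the predicted digit
predicted_digits = np.argmax(predictions, axis=1)
print(predicted_digits)  # [7 2 1 0 4]
```

With the variables from this tutorial, np.argmax(predictions, axis=1) can be compared against np.argmax(test_labels[:5], axis=1), since the labels were one-hot encoded earlier.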

Congratulations! You have successfully created a neural network for image recognition with Keras in Python. In this tutorial, you learned how to:

  1. Load the dataset using Keras
  2. Preprocess the dataset
  3. One-hot encode the labels
  4. Build the neural network
  5. Compile the model
  6. Train the model
  7. Evaluate the model
  8. Make predictions

With this knowledge, you can now start exploring the world of neural networks and computer vision. Good luck!
