Case Study: Traffic Signs Classification


Introduction

This case study is particularly relevant to self-driving cars, which, much like human drivers, must make decisions based on what they see through camera data. A self-driving car needs to detect objects and classify traffic signs so that it knows when to stop or yield and how fast to drive, for example 30 km/h versus 50 km/h.

Problem:

  • In this case study, we are given images of traffic signs, and the goal is to train a deep neural network to classify them

Dataset:

  • The dataset contains 43 different classes of images.
  • Classes are as listed below:

    • ( 0, b'Speed limit (20km/h)') ( 1, b'Speed limit (30km/h)')
    • ( 2, b'Speed limit (50km/h)') ( 3, b'Speed limit (60km/h)')
    • ( 4, b'Speed limit (70km/h)') ( 5, b'Speed limit (80km/h)')
    • ( 6, b'End of speed limit (80km/h)') ( 7, b'Speed limit (100km/h)')
    • ( 8, b'Speed limit (120km/h)') ( 9, b'No passing')
    • (10, b'No passing for vehicles over 3.5 metric tons')
    • (11, b'Right-of-way at the next intersection') (12, b'Priority road')
    • (13, b'Yield') (14, b'Stop') (15, b'No vehicles')
    • (16, b'Vehicles over 3.5 metric tons prohibited') (17, b'No entry')
    • (18, b'General caution') (19, b'Dangerous curve to the left')
    • (20, b'Dangerous curve to the right') (21, b'Double curve')
    • (22, b'Bumpy road') (23, b'Slippery road')
    • (24, b'Road narrows on the right') (25, b'Road work')
    • (26, b'Traffic signals') (27, b'Pedestrians') (28, b'Children crossing')
    • (29, b'Bicycles crossing') (30, b'Beware of ice/snow')
    • (31, b'Wild animals crossing')
    • (32, b'End of all speed and passing limits') (33, b'Turn right ahead')
    • (34, b'Turn left ahead') (35, b'Ahead only') (36, b'Go straight or right')
    • (37, b'Go straight or left') (38, b'Keep right') (39, b'Keep left')
    • (40, b'Roundabout mandatory') (41, b'End of no passing')
    • (42, b'End of no passing by vehicles over 3.5 metric tons')

Sources:

  • J. Stallkamp, M. Schlipsing, J. Salmen, and C. Igel. The German Traffic Sign Recognition Benchmark: A multi-class classification competition. In Proceedings of the IEEE International Joint Conference on Neural Networks, pages 1453–1460. 2011.


Libraries and Dataset Import

In [1]:
# Import libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import pickle
In [2]:
# The pickle module implements binary protocols for serializing and de-serializing a Python object structure.
# import dataset
with open("traffic-signs-data/train.p", mode='rb') as training_data:
    train = pickle.load(training_data)
with open("traffic-signs-data/valid.p", mode='rb') as validation_data:
    valid = pickle.load(validation_data)
with open("traffic-signs-data/test.p", mode='rb') as testing_data:
    test = pickle.load(testing_data)
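Each pickle file stores a Python dictionary. A quick look at its keys confirms the layout before unpacking (only the 'features' and 'labels' entries are used below; other keys may be present depending on how the dataset was exported):

# Inspect the structure of the loaded training data
print(train.keys())
print(type(train['features']), train['features'].shape)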
In [3]:
# Data splitting
X_train, y_train = train['features'], train['labels']
X_validate, y_validate = valid['features'], valid['labels']
X_test, y_test = test['features'], test['labels']

Dataset Exploration

In [4]:
# Check train dataset dimension
X_train.shape, y_train.shape
Out[4]:
((34799, 32, 32, 3), (34799,))
In [5]:
# Check validation dataset dimension
X_validate.shape, y_validate.shape
Out[5]:
((4410, 32, 32, 3), (4410,))
In [6]:
# Check test dataset dimension
X_test.shape, y_test.shape
Out[6]:
((12630, 32, 32, 3), (12630,))
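The 43 classes are typically not equally represented in this dataset, so a quick look at the label distribution helps put later per-class errors in context. A minimal sketch using the y_train labels loaded above:

# Count how many training images belong to each of the 43 classes
class_counts = np.bincount(y_train, minlength=43)
plt.bar(range(43), class_counts)
plt.xlabel('Class ID')
plt.ylabel('Number of training images')
plt.show()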
In [7]:
# Check train image
i = 3000
plt.imshow(X_train[i])
y_train[i]
Out[7]:
1
In [8]:
# Check validation image
i = 4000
plt.imshow(X_validate[i])
y_validate[i]
Out[8]:
17
In [9]:
# Check test image
i = 5000
plt.imshow(X_test[i])
y_test[i]
Out[9]:
26

Dataset Preparation

In [10]:
# Shuffle the training dataset
from sklearn.utils import shuffle
X_train, y_train = shuffle(X_train, y_train)
In [11]:
# Convert the images to greyscale by averaging the three colour channels
X_train_gray = np.sum(X_train/3, axis = 3, keepdims = True)
X_validate_gray = np.sum(X_validate/3, axis = 3, keepdims = True)
X_test_gray = np.sum(X_test/3, axis = 3, keepdims = True)
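Averaging the three channels is one simple way to produce greyscale. A common alternative (not used in this notebook) is a weighted luminosity conversion that reflects the eye's greater sensitivity to green; a minimal sketch:

# Alternative greyscale conversion using standard luminance weights (illustrative only)
X_train_luma = np.dot(X_train[..., :3], [0.299, 0.587, 0.114])[..., np.newaxis]
print(X_train_luma.shape)  # (34799, 32, 32, 1)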
In [12]:
# Check the train dimension
X_train_gray.shape, X_validate_gray.shape, X_test_gray.shape
Out[12]:
((34799, 32, 32, 1), (4410, 32, 32, 1), (12630, 32, 32, 1))
In [13]:
# Normalize dataset
X_train_gray_norm = (X_train_gray - 128)/128 
X_validate_gray_norm = (X_validate_gray - 128)/128
X_test_gray_norm = (X_test_gray - 128)/128
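Pixel values lie in [0, 255], so subtracting 128 and dividing by 128 centres the data around zero and maps it roughly into the range [-1, 1], which generally helps gradient-based training. A quick check of the resulting range:

# Verify the normalized pixel range (expected to be close to -1 and 1)
print(X_train_gray_norm.min(), X_train_gray_norm.max())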
In [14]:
# Check the normalized dataset dimensions
X_train_gray_norm.shape, X_validate_gray_norm.shape, X_test_gray_norm.shape
Out[14]:
((34799, 32, 32, 1), (4410, 32, 32, 1), (12630, 32, 32, 1))
In [15]:
# Visualize a training image
i = 6000

# Original image
plt.figure()
plt.imshow(X_train[i])
plt.show()

# Greyscaled image
plt.figure()
plt.imshow(X_train_gray[i].squeeze(), cmap='gray')
plt.show()

# Normalized image
plt.figure()
plt.imshow(X_train_gray_norm[i].squeeze(), cmap='gray')
plt.show()
In [16]:
# Visualize a validation image
i = 1000

# Original image
plt.figure()
plt.imshow(X_validate[i])
plt.show()

# Greyscaled image
plt.figure()
plt.imshow(X_validate_gray[i].squeeze(), cmap='gray')
plt.show()

# Normalized image
plt.figure()
plt.imshow(X_validate_gray_norm[i].squeeze(), cmap='gray')
plt.show()
In [17]:
# Visualize a test image
i = 2000

# Original image
plt.figure()
plt.imshow(X_test[i])
plt.show()

# Greyscaled image
plt.figure()
plt.imshow(X_test_gray[i].squeeze(), cmap='gray')
plt.show()

# Normalized image
plt.figure()
plt.imshow(X_test_gray_norm[i].squeeze(), cmap='gray')
plt.show()

Model Training

The model is a LeNet-5-style network consisting of the following layers:

STEP 1: THE FIRST CONVOLUTIONAL LAYER

- Input = 32x32x1
- Output = 28x28x6
- Output size = (Input - Filter + 1) / Stride => (32 - 5 + 1) / 1 = 28
- Uses a 5x5 filter with an input depth of 1 (greyscale) and an output depth of 6
- Apply a ReLU activation function to the output
- Apply 2x2 average pooling: Input = 28x28x6, Output = 14x14x6


The stride is the amount by which the kernel is shifted each time it passes over the image.

STEP 2: THE SECOND CONVOLUTIONAL LAYER

- Input = 14x14x6
- Output = 10x10x16
- Output size = (Input - Filter + 1) / Stride => (14 - 5 + 1) / 1 = 10
- Apply a ReLU activation function to the output
- Apply 2x2 average pooling: Input = 10x10x16, Output = 5x5x16

STEP 3: FLATTENING THE NETWORK

- Flatten the network: Input = 5x5x16, Output = 400

STEP 4: FULLY CONNECTED LAYER

- Layer 3: Fully connected layer with Input = 400 and Output = 120
- Apply a ReLU activation function to the output

STEP 5: ANOTHER FULLY CONNECTED LAYER

- Layer 4: Fully connected layer with Input = 120 and Output = 84
- Apply a ReLU activation function to the output

STEP 6: FULLY CONNECTED OUTPUT LAYER

- Layer 5: Fully connected layer with Input = 84 and Output = 43 (one unit per class)

The layer-size arithmetic above is verified in the short sketch below.
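As a minimal sanity check (illustrative only; the helper functions below are not part of the notebook), the formula output = (input - filter + 1) / stride for the 5x5 convolutions, combined with halving the spatial size after each 2x2 pooling step, reproduces the sizes 32 → 28 → 14 → 10 → 5:

# Hypothetical helpers to verify the layer-size arithmetic
def conv_output(size, kernel=5, stride=1):
    return (size - kernel + 1) // stride

def pool_output(size, pool=2):
    return size // pool

size = 32
size = pool_output(conv_output(size))   # 28 -> 14 after conv 1 and average pooling
size = pool_output(conv_output(size))   # 10 -> 5 after conv 2 and average pooling
print(size * size * 16)                 # 400 units feed the flatten layer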
In [18]:
# Import libraries
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, AveragePooling2D, Dense, Flatten, Dropout
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import TensorBoard

from sklearn.model_selection import train_test_split
In [19]:
# Build model
cnn_model = Sequential()

# First Layer
cnn_model.add(Conv2D(filters = 6, kernel_size = (5,5), activation = 'relu', input_shape = (32,32,1)))
cnn_model.add(AveragePooling2D())

# Second layer
cnn_model.add(Conv2D(filters = 16, kernel_size = (5,5), activation = 'relu'))
cnn_model.add(AveragePooling2D())

# Flattening
cnn_model.add(Flatten())

# Connecting layer
cnn_model.add(Dense(units = 120, activation = 'relu'))

# Connecting layer
cnn_model.add(Dense(units = 84, activation = 'relu'))

# Connecting layer to output
cnn_model.add(Dense(units = 43, activation = 'softmax'))
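Before compiling, printing the model summary is a quick way to confirm that the layer output shapes match the LeNet-5 plan described above (28x28x6, 14x14x6, 10x10x16, 5x5x16, 400, 120, 84, 43):

# Confirm layer output shapes and parameter counts
cnn_model.summary()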
In [20]:
# Compile the model
cnn_model.compile(loss='sparse_categorical_crossentropy', optimizer=Adam(learning_rate=0.001), metrics=['accuracy'])
In [21]:
# Train the model
history = cnn_model.fit(X_train_gray_norm, y_train, batch_size = 500, epochs = 50, verbose = 1, validation_data = (X_validate_gray_norm, y_validate))
Epoch 1/50
70/70 [==============================] - 25s 352ms/step - loss: 3.1278 - accuracy: 0.1861 - val_loss: 2.7057 - val_accuracy: 0.3170
Epoch 2/50
70/70 [==============================] - 27s 385ms/step - loss: 1.6815 - accuracy: 0.5301 - val_loss: 1.5810 - val_accuracy: 0.5315
Epoch 3/50
70/70 [==============================] - 27s 390ms/step - loss: 0.9810 - accuracy: 0.7160 - val_loss: 1.1104 - val_accuracy: 0.6723
Epoch 4/50
70/70 [==============================] - 25s 351ms/step - loss: 0.6682 - accuracy: 0.8109 - val_loss: 0.8697 - val_accuracy: 0.7456
Epoch 5/50
70/70 [==============================] - 26s 371ms/step - loss: 0.5099 - accuracy: 0.8604 - val_loss: 0.8117 - val_accuracy: 0.7626
Epoch 6/50
70/70 [==============================] - 24s 348ms/step - loss: 0.4154 - accuracy: 0.8875 - val_loss: 0.7400 - val_accuracy: 0.7853
Epoch 7/50
70/70 [==============================] - 30s 432ms/step - loss: 0.3576 - accuracy: 0.9013 - val_loss: 0.7340 - val_accuracy: 0.8016
Epoch 8/50
70/70 [==============================] - 32s 463ms/step - loss: 0.3070 - accuracy: 0.9172 - val_loss: 0.7002 - val_accuracy: 0.8086
Epoch 9/50
70/70 [==============================] - 32s 458ms/step - loss: 0.2713 - accuracy: 0.9290 - val_loss: 0.6669 - val_accuracy: 0.8170
Epoch 10/50
70/70 [==============================] - 34s 481ms/step - loss: 0.2391 - accuracy: 0.9375 - val_loss: 0.6701 - val_accuracy: 0.8209
Epoch 11/50
70/70 [==============================] - 34s 487ms/step - loss: 0.2151 - accuracy: 0.9441 - val_loss: 0.6415 - val_accuracy: 0.8324
Epoch 12/50
70/70 [==============================] - 34s 486ms/step - loss: 0.1961 - accuracy: 0.9492 - val_loss: 0.6648 - val_accuracy: 0.8211
Epoch 13/50
70/70 [==============================] - 34s 492ms/step - loss: 0.1794 - accuracy: 0.9536 - val_loss: 0.6405 - val_accuracy: 0.8324
Epoch 14/50
70/70 [==============================] - 36s 518ms/step - loss: 0.1582 - accuracy: 0.9604 - val_loss: 0.6945 - val_accuracy: 0.8270
Epoch 15/50
70/70 [==============================] - 38s 537ms/step - loss: 0.1419 - accuracy: 0.9641 - val_loss: 0.6714 - val_accuracy: 0.8324
Epoch 16/50
70/70 [==============================] - 38s 541ms/step - loss: 0.1308 - accuracy: 0.9662 - val_loss: 0.6804 - val_accuracy: 0.8286
Epoch 17/50
70/70 [==============================] - 35s 502ms/step - loss: 0.1183 - accuracy: 0.9718 - val_loss: 0.6391 - val_accuracy: 0.8404
Epoch 18/50
70/70 [==============================] - 37s 531ms/step - loss: 0.1116 - accuracy: 0.9721 - val_loss: 0.6685 - val_accuracy: 0.8299
Epoch 19/50
70/70 [==============================] - 36s 514ms/step - loss: 0.1022 - accuracy: 0.9749 - val_loss: 0.6907 - val_accuracy: 0.8338
Epoch 20/50
70/70 [==============================] - 37s 528ms/step - loss: 0.0965 - accuracy: 0.9757 - val_loss: 0.7136 - val_accuracy: 0.8342
Epoch 21/50
70/70 [==============================] - 39s 550ms/step - loss: 0.0879 - accuracy: 0.9780 - val_loss: 0.6324 - val_accuracy: 0.8456
Epoch 22/50
70/70 [==============================] - 35s 496ms/step - loss: 0.0805 - accuracy: 0.9800 - val_loss: 0.6507 - val_accuracy: 0.8410
Epoch 23/50
70/70 [==============================] - 36s 514ms/step - loss: 0.0743 - accuracy: 0.9814 - val_loss: 0.6730 - val_accuracy: 0.8440
Epoch 24/50
70/70 [==============================] - 39s 552ms/step - loss: 0.0691 - accuracy: 0.9834 - val_loss: 0.6789 - val_accuracy: 0.8388
Epoch 25/50
70/70 [==============================] - 38s 538ms/step - loss: 0.0677 - accuracy: 0.9835 - val_loss: 0.6927 - val_accuracy: 0.8420
Epoch 26/50
70/70 [==============================] - 33s 468ms/step - loss: 0.0609 - accuracy: 0.9847 - val_loss: 0.7434 - val_accuracy: 0.8351
Epoch 27/50
70/70 [==============================] - 31s 443ms/step - loss: 0.0566 - accuracy: 0.9860 - val_loss: 0.7316 - val_accuracy: 0.8488
Epoch 28/50
70/70 [==============================] - 32s 456ms/step - loss: 0.0537 - accuracy: 0.9867 - val_loss: 0.7144 - val_accuracy: 0.8435
Epoch 29/50
70/70 [==============================] - 31s 442ms/step - loss: 0.0485 - accuracy: 0.9881 - val_loss: 0.7562 - val_accuracy: 0.8467
Epoch 30/50
70/70 [==============================] - 32s 458ms/step - loss: 0.0469 - accuracy: 0.9883 - val_loss: 0.7367 - val_accuracy: 0.8515
Epoch 31/50
70/70 [==============================] - 29s 419ms/step - loss: 0.0462 - accuracy: 0.9878 - val_loss: 0.6842 - val_accuracy: 0.8565
Epoch 32/50
70/70 [==============================] - 32s 458ms/step - loss: 0.0406 - accuracy: 0.9895 - val_loss: 0.7525 - val_accuracy: 0.8515
Epoch 33/50
70/70 [==============================] - 30s 423ms/step - loss: 0.0370 - accuracy: 0.9905 - val_loss: 0.7316 - val_accuracy: 0.8485
Epoch 34/50
70/70 [==============================] - 31s 442ms/step - loss: 0.0388 - accuracy: 0.9895 - val_loss: 0.7402 - val_accuracy: 0.8420
Epoch 35/50
70/70 [==============================] - 33s 477ms/step - loss: 0.0410 - accuracy: 0.9887 - val_loss: 0.7119 - val_accuracy: 0.8594
Epoch 36/50
70/70 [==============================] - 31s 448ms/step - loss: 0.0311 - accuracy: 0.9922 - val_loss: 0.7780 - val_accuracy: 0.8476
Epoch 37/50
70/70 [==============================] - 30s 422ms/step - loss: 0.0278 - accuracy: 0.9924 - val_loss: 0.7400 - val_accuracy: 0.8537
Epoch 38/50
70/70 [==============================] - 28s 397ms/step - loss: 0.0250 - accuracy: 0.9941 - val_loss: 0.7514 - val_accuracy: 0.8537
Epoch 39/50
70/70 [==============================] - 27s 389ms/step - loss: 0.0237 - accuracy: 0.9944 - val_loss: 0.8924 - val_accuracy: 0.8433
Epoch 40/50
70/70 [==============================] - 27s 390ms/step - loss: 0.0248 - accuracy: 0.9939 - val_loss: 0.8169 - val_accuracy: 0.8528
Epoch 41/50
70/70 [==============================] - 30s 424ms/step - loss: 0.0382 - accuracy: 0.9891 - val_loss: 0.8088 - val_accuracy: 0.8506
Epoch 42/50
70/70 [==============================] - 29s 412ms/step - loss: 0.0394 - accuracy: 0.9882 - val_loss: 0.8586 - val_accuracy: 0.8463
Epoch 43/50
70/70 [==============================] - 30s 430ms/step - loss: 0.0254 - accuracy: 0.9929 - val_loss: 0.7811 - val_accuracy: 0.8560
Epoch 44/50
70/70 [==============================] - 28s 404ms/step - loss: 0.0196 - accuracy: 0.9950 - val_loss: 0.7524 - val_accuracy: 0.8580
Epoch 45/50
70/70 [==============================] - 38s 543ms/step - loss: 0.0190 - accuracy: 0.9954 - val_loss: 0.8337 - val_accuracy: 0.8494
Epoch 46/50
70/70 [==============================] - 36s 515ms/step - loss: 0.0151 - accuracy: 0.9964 - val_loss: 0.7576 - val_accuracy: 0.8592
Epoch 47/50
70/70 [==============================] - 37s 533ms/step - loss: 0.0206 - accuracy: 0.9945 - val_loss: 0.8555 - val_accuracy: 0.8596
Epoch 48/50
70/70 [==============================] - 38s 547ms/step - loss: 0.0216 - accuracy: 0.9941 - val_loss: 0.7775 - val_accuracy: 0.8574
Epoch 49/50
70/70 [==============================] - 39s 556ms/step - loss: 0.0146 - accuracy: 0.9967 - val_loss: 0.7883 - val_accuracy: 0.8583
Epoch 50/50
70/70 [==============================] - 39s 558ms/step - loss: 0.0121 - accuracy: 0.9973 - val_loss: 0.8124 - val_accuracy: 0.8612
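The log shows training accuracy climbing toward 99.7% while the validation loss bottoms out around epoch 20 and then drifts upward, a typical sign of overfitting. One common remedy (not applied in this run) is early stopping; a minimal sketch assuming the standard tensorflow.keras.callbacks.EarlyStopping callback:

# Sketch: stop training when the validation loss stops improving (not used in the run above)
from tensorflow.keras.callbacks import EarlyStopping

early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)
# history = cnn_model.fit(X_train_gray_norm, y_train, batch_size=500, epochs=50,
#                         validation_data=(X_validate_gray_norm, y_validate),
#                         callbacks=[early_stop])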

Model Evaluation

In [23]:
# Check model accuracy on the test set
score = cnn_model.evaluate(X_test_gray_norm, y_test)
print('Test Accuracy: {}'.format(score[1]))
395/395 [==============================] - 7s 17ms/step - loss: 1.3257 - accuracy: 0.8592
Test Accuracy: 0.8592240810394287
In [24]:
history.history.keys()
Out[24]:
dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
In [36]:
# Label the metrics
accuracy = history.history['accuracy']
val_accuracy = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

# Number of epochs
epochs = range(len(accuracy))

# Plot the evaluation metrics
plt.plot(epochs, accuracy, label='Training Accuracy', c = 'green')
plt.plot(epochs, val_accuracy, label='Validation Accuracy', c = 'red')
plt.title('Training and Validation Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epochs')
plt.legend()
plt.show()
In [37]:
# Plot the evaluation metrics
plt.plot(epochs, loss, label='Training Loss', c = 'green')
plt.plot(epochs, val_loss, label='Validation Loss', c = 'red')
plt.title('Training and Validation Loss')
plt.ylabel('Loss')
plt.xlabel('Epochs')
plt.legend()
plt.show()
In [40]:
# Get the predicted class for each test image
predicted_classes = np.argmax(cnn_model.predict(X_test_gray_norm), axis=-1)

# Ground-truth labels for the test set
y_true = y_test
In [42]:
# Create confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_true, predicted_classes)
plt.figure(figsize = (25,25))
sns.heatmap(cm, annot=True)
plt.show()
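Beyond the confusion matrix, a per-class precision/recall breakdown makes it easier to see which sign types the model struggles with. A minimal sketch using scikit-learn's classification_report:

# Per-class precision, recall, and F1-score for the 43 sign classes
from sklearn.metrics import classification_report
print(classification_report(y_true, predicted_classes))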
In [43]:
# Create image subplots for image predictions
L = 7
W = 7
fig, axes = plt.subplots(L, W, figsize = (12,12))
axes = axes.ravel()  # flatten the 7x7 grid of axes into a 1-D array for easy indexing

for i in np.arange(0, L * W):  
    axes[i].imshow(X_test[i])
    axes[i].set_title("Prediction={}\n True={}".format(predicted_classes[i], y_true[i]))
    axes[i].axis('off')

plt.subplots_adjust(wspace=1)
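The grid above shows the first 49 test images regardless of whether they were classified correctly. To focus on the failure cases, a small sketch (illustrative only) that picks out misclassified test images:

# Indices of the test images the model got wrong
wrong = np.where(predicted_classes != y_true)[0]
print(len(wrong), 'misclassified test images out of', len(y_true))

# Show the first few failures with their predicted and true labels
fig, axes = plt.subplots(1, 5, figsize=(12, 3))
for ax, idx in zip(axes, wrong[:5]):
    ax.imshow(X_test[idx])
    ax.set_title('Pred={} True={}'.format(predicted_classes[idx], y_true[idx]))
    ax.axis('off')
plt.show()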

Conclusion

In this case study, a LeNet-5-style convolutional neural network was implemented to classify traffic sign images and achieved roughly 86% test accuracy. Some images are hard to recognize even with the naked eye, which likely made it difficult for the model to identify the correct traffic signs. The results could be improved in several ways, such as removing unrecognizable images, adding more sample signs per class, and tuning the model's hyperparameters.