This document outlines the development pipeline for an image classification model built using a convolutional neural network (CNN) with TensorFlow/Keras. The model is designed to classify images into two categories, leveraging standard deep learning techniques for data cleaning, preprocessing, model architecture, training, and evaluation. Additional details provide context, best practices, and insights into image classification.
The dataset images were cleaned to ensure data integrity before training. A Python script verifies each image file using the PIL library’s Image.verify() method, removing corrupted or unreadable images. This process was applied to the train, test, and validation datasets.
Why Data Cleaning Matters: Corrupted images (e.g., truncated files, incorrect formats) can cause errors during training or degrade model performance. Cleaning ensures robustness and consistency in the dataset. The script below checks for corrupted images and logs the number of files removed.
import os
from PIL import Image

def clean_the_data(files):
    """Delete unreadable images under `files` and return how many were removed."""
    cleaned = 0
    for temp, _, file in os.walk(files):
        for f in file:
            paths = os.path.join(temp, f)
            try:
                # The with-block closes the file handle before any os.remove()
                with Image.open(paths) as img:
                    img.verify()  # Verify image integrity
            except Exception as e:
                print(f"Removing corrupt image: {paths} ({str(e)})")
                os.remove(paths)
                cleaned += 1
    print(f"Cleaned {cleaned} corrupt images")
    return cleaned

# Example usage
clean_the_data("dataset/train")
clean_the_data("dataset/valid")
clean_the_data("dataset/test")
Additional Considerations:
Note: For large datasets, consider parallelizing the cleaning process using multiprocessing to improve efficiency.
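A minimal sketch of a concurrent scan, under stated assumptions. The helper names is_valid_image and find_corrupt_images are hypothetical, and the default here is a thread pool from concurrent.futures rather than the multiprocessing module named in the note, because it runs unchanged in scripts and notebooks. Passing ProcessPoolExecutor as executor_cls should work where the check function can be pickled and the script has a main guard.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def is_valid_image(path):
    """Return True if PIL can open and verify the file at `path`."""
    from PIL import Image  # imported lazily so a custom check needs no PIL
    try:
        with Image.open(path) as img:
            img.verify()
        return True
    except Exception:
        return False

def find_corrupt_images(root, check=is_valid_image, workers=8,
                        executor_cls=ThreadPoolExecutor):
    """Check every file under `root` concurrently; return the paths that fail."""
    paths = [os.path.join(d, f) for d, _, files in os.walk(root) for f in files]
    with executor_cls(max_workers=workers) as pool:
        results = list(pool.map(check, paths))  # results keep the order of paths
    return [p for p, ok in zip(paths, results) if not ok]

# Usage sketch: scan concurrently, then delete in the calling process.
# for path in find_corrupt_images("dataset/train"):
#     os.remove(path)
```

Keeping the deletion outside the workers means only one process ever modifies the dataset directory, which makes the removal step easy to log or dry-run.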
Images were resized to 64x64 pixels and batched for efficient training using TensorFlow’s image_dataset_from_directory function. This approach infers labels from the directory structure and supports batch processing for GPU optimization.
import tensorflow as tf

img_size = (64, 64)
batch_size = 32

trained_data = tf.keras.preprocessing.image_dataset_from_directory(
    "dataset/train",
    seed=42,
    image_size=img_size,
    batch_size=batch_size,
    label_mode='int'  # For sparse categorical crossentropy
)
valid_data = tf.keras.preprocessing.image_dataset_from_directory(
    "dataset/valid",
    seed=42,
    image_size=img_size,
    batch_size=batch_size,
    label_mode='int'
)
tested_data = tf.keras.preprocessing.image_dataset_from_directory(
    "dataset/test",
    seed=42,
    image_size=img_size,
    batch_size=batch_size,
    label_mode='int'
)
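To make the automatic labeling concrete, here is a standard-library sketch of how integer labels follow from the directory layout. It assumes the loader's default inferred-label behaviour, where class names are the subdirectory names in alphanumeric order. The function name infer_class_indices and the folder names in the usage note are hypothetical.

```python
import os

def infer_class_indices(root):
    """Map each class subdirectory of `root` to an integer label.

    Class names are the immediate subdirectories in sorted order, and each
    label is the position of its class name in that list.
    """
    class_names = sorted(
        name for name in os.listdir(root)
        if os.path.isdir(os.path.join(root, name))
    )
    return {name: index for index, name in enumerate(class_names)}
```

For a training folder containing subdirectories named cats and dogs, this would return {"cats": 0, "dogs": 1}. On the real dataset objects the same information is available as trained_data.class_names, which is best read before any .map() call because the mapped dataset no longer carries that attribute.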
Visualization: Sample images from the training set were visualized to verify correct loading and labeling. This step is critical to detect issues like mislabeled data or incorrect preprocessing.
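One possible way to do this check is a small grid plot. The sketch below is not the original visualization code; show_batch is a hypothetical helper that takes a batch already converted to NumPy arrays, so it can be tried without loading the dataset.

```python
import matplotlib.pyplot as plt
import numpy as np

def show_batch(images, labels, class_names, max_images=9):
    """Plot up to `max_images` images in a grid, titled with their class names."""
    n = min(max_images, len(images))
    cols = 3
    rows = max(1, (n + cols - 1) // cols)
    fig, axes = plt.subplots(rows, cols, figsize=(2 * cols, 2 * rows), squeeze=False)
    for ax in axes.flat:
        ax.axis("off")  # hide ticks, including on unused grid cells
    for ax, image, label in zip(axes.flat, images[:n], labels[:n]):
        # The loader yields float pixels in [0, 255]; cast to uint8 for display
        ax.imshow(np.asarray(image).astype("uint8"))
        ax.set_title(class_names[int(label)])
    return fig

# Usage sketch with the datasets defined earlier:
# class_names = trained_data.class_names  # read before any .map() call
# for images, labels in trained_data.take(1):
#     show_batch(images.numpy(), labels.numpy(), class_names)
#     plt.show()
```

If the titles do not match what the images show, the folder layout or the label order is the first thing to inspect.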
Enhanced Preprocessing:
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),
    tf.keras.layers.RandomZoom(0.1),
])
trained_data = trained_data.map(lambda x, y: (data_augmentation(x, training=True), y))
# prefetch() returns a new dataset, so the result must be reassigned
trained_data = trained_data.prefetch(tf.data.AUTOTUNE)  # reduces I/O bottlenecks

The CNN model was built using Keras’ Sequential API for binary image classification, with the following layers:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(64,64,3)),
    MaxPooling2D(2,2),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D(2,2),
    Conv2D(128, (3,3), activation='relu'),
    MaxPooling2D(2,2),
    Conv2D(256, (3,3), activation='relu'),
    MaxPooling2D(2,2),
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.4),
    Dense(2, activation='softmax')
])
model.summary()
Architecture Insights:
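The shapes and parameter counts of the model defined above can be worked out by hand, which is a useful cross-check against the summary output. The sketch below assumes the Keras defaults used in that definition (unpadded 3x3 convolutions with stride 1, and 2x2 pooling with stride 2). The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (rounds down)."""
    return size // pool

def trace_model(input_size=64, channels=3, filters=(32, 64, 128, 256),
                dense_units=256, num_classes=2):
    """Return (flattened feature count, total trainable parameters)."""
    size, total = input_size, 0
    for f in filters:
        total += 3 * 3 * channels * f + f  # 3x3 kernel weights plus biases
        size = pool_out(conv_out(size))
        channels = f
        print(f"after Conv2D({f}) + MaxPooling2D: {size}x{size}x{channels}")
    flat = size * size * channels
    total += flat * dense_units + dense_units         # Dense(256)
    total += dense_units * num_classes + num_classes  # Dense(2)
    return flat, total

flat, total = trace_model()
print(f"flattened features: {flat}, trainable parameters: {total}")
```

The spatial size shrinks 64 -> 31 -> 14 -> 6 -> 2, so Flatten() produces 2 x 2 x 256 = 1024 features, and the total comes to 651,330 trainable parameters. Dropout adds none.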
Improvements: candidate layers to try are BatchNormalization() and GlobalAveragePooling2D().

The model was compiled with the Adam optimizer and sparse categorical crossentropy loss, then trained for up to 5 epochs on the training dataset, with validation on the validation dataset. Early stopping can end training sooner.
model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)
history = model.fit(
    trained_data,
    validation_data=valid_data,
    epochs=5,
    callbacks=[
        tf.keras.callbacks.EarlyStopping(patience=2, restore_best_weights=True)
    ]
)
Training Details:
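The EarlyStopping callback above watches validation loss by default and stops once it has failed to improve for patience consecutive epochs. The following is a simplified pure-Python model of that rule, not the Keras implementation. The function name and the loss values are made up for illustration.

```python
def early_stopping_trace(val_losses, patience=2):
    """Return (epoch training stops at, epoch with the best val_loss), 0-indexed."""
    best, best_epoch, wait = float("inf"), 0, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            best, best_epoch, wait = loss, epoch, 0  # improvement resets the counter
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch  # patience exhausted: stop here
    return len(val_losses) - 1, best_epoch  # ran through every epoch

# Hypothetical validation losses for a 5-epoch run
stopped_at, best_epoch = early_stopping_trace([0.50, 0.40, 0.45, 0.42, 0.30])
print(stopped_at, best_epoch)
```

With these numbers training stops at epoch index 3 and the best epoch is index 1, so the fifth epoch never runs. With restore_best_weights=True the model keeps the weights from the best epoch rather than the last one, which is why a small patience can cut off a run that would have recovered.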
Monitoring: Training and validation accuracy/loss were plotted to diagnose issues like overfitting or underfitting. Example visualization code:
import matplotlib.pyplot as plt
plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
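The loss curves can be plotted the same way. This sketch assumes the dictionary stored in history.history, which contains "loss" and "val_loss" when validation data is passed to fit. plot_loss is a hypothetical helper name.

```python
import matplotlib.pyplot as plt

def plot_loss(history_dict):
    """Plot training and validation loss per epoch from a Keras-style history dict."""
    fig, ax = plt.subplots()
    ax.plot(history_dict["loss"], label="Training Loss")
    ax.plot(history_dict["val_loss"], label="Validation Loss")
    ax.set_title("Model Loss")
    ax.set_xlabel("Epoch")
    ax.set_ylabel("Loss")
    ax.legend()
    return fig

# Usage sketch:
# plot_loss(history.history)
# plt.show()
```

A validation loss that rises while training loss keeps falling is the usual sign of overfitting. Two curves that both stay high suggest underfitting.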
The trained model was evaluated on the test dataset, achieving approximately 96% accuracy and a loss of about 0.124.
test_loss, test_acc = model.evaluate(tested_data)
print(f"Test Accuracy: {test_acc:.5f}")
print(f"Test Loss: {test_loss:.5f}")
Evaluation Insights:
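The reported loss is easier to interpret once the formula is spelled out. Sparse categorical crossentropy is the mean negative log-probability the model assigns to the true class. The sketch below uses hypothetical softmax outputs, not values from the trained model, and ignores the small clipping Keras applies for numerical stability.

```python
import math

def sparse_categorical_crossentropy(probabilities, labels):
    """Mean negative log-probability assigned to the true class."""
    losses = [-math.log(p[y]) for p, y in zip(probabilities, labels)]
    return sum(losses) / len(losses)

# Hypothetical softmax outputs for three images and their integer labels
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 0]
loss = sparse_categorical_crossentropy(probs, labels)
print(f"loss: {loss:.4f}")

# Inverting the formula gives the geometric-mean probability of the true class
print(f"exp(-0.124) = {math.exp(-0.124):.3f}")
```

Reading the reported test loss as a raw cross-entropy value of 0.124, exp(-0.124) is about 0.88. On a geometric-mean basis the model therefore puts roughly 88% probability on the correct class, which is a different quantity from the 96% accuracy.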
Additional Metrics: Compute precision, recall, and F1-score to assess performance, especially for imbalanced datasets.
from sklearn.metrics import classification_report
import numpy as np

y_pred = []
y_true = []
# Collect predictions and labels batch by batch so they stay aligned
for images, labels in tested_data:
    preds = model.predict(images)
    y_pred.extend(np.argmax(preds, axis=1))
    y_true.extend(labels.numpy())
print(classification_report(y_true, y_pred, target_names=['Class 0', 'Class 1']))
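For reference, the numbers in that report come from simple counts. The sketch below computes precision, recall, and F1 for one class by hand. binary_report is a hypothetical helper and the label lists are made up, not model output.

```python
def binary_report(y_true, y_pred, positive=1):
    """Precision, recall, and F1 for the `positive` class from label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels: 4 positives, 3 of them found, plus 1 false alarm
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(binary_report(y_true, y_pred))
```

Here precision, recall, and F1 are all 0.75. On an imbalanced dataset the three can diverge sharply even when accuracy looks high, which is why they are worth reporting alongside it.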
Best Practices for Image Classification:
Future Improvements: