Image Classification Model Documentation

This document outlines the development pipeline for an image classification model built using a convolutional neural network (CNN) with TensorFlow/Keras. The model is designed to classify images into two categories, leveraging standard deep learning techniques for data cleaning, preprocessing, model architecture, training, and evaluation. Additional details provide context, best practices, and insights into image classification.

Table of Contents

Data Cleaning
Data Preprocessing
Model Architecture
Training
Evaluation
Best Practices and Future Improvements

Data Cleaning

The dataset images were cleaned to ensure data integrity before training. A Python script verifies each image file using the PIL library’s Image.verify() method, removing corrupted or unreadable images. This process was applied to the train, test, and validation datasets.

Why Data Cleaning Matters: Corrupted images (e.g., truncated files, incorrect formats) can cause errors during training or degrade model performance. Cleaning ensures robustness and consistency in the dataset. The script below checks for corrupted images and logs the number of files removed.

import os
from PIL import Image

def clean_the_data(dataset_dir):
    """Remove corrupted or unreadable images under dataset_dir."""
    cleaned = 0
    for root, _, files in os.walk(dataset_dir):
        for name in files:
            path = os.path.join(root, name)
            try:
                # Use a context manager so the file handle is closed before
                # any removal attempt (required on Windows).
                with Image.open(path) as img:
                    img.verify()  # Verify image integrity
            except Exception as e:
                print(f"Removing corrupt image: {path} ({e})")
                os.remove(path)
                cleaned += 1
    print(f"Cleaned {cleaned} corrupt images")
    return cleaned

# Example usage
clean_the_data("dataset/train")
clean_the_data("dataset/valid")
clean_the_data("dataset/test")

Additional Considerations:

Note: Image.verify() checks file structure without decoding the pixel data, so a stricter check is to reopen each file and call img.load(). For large datasets, consider parallelizing the cleaning process with multiprocessing to improve efficiency, as sketched below.
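
A minimal sketch of a parallel variant using the standard library's multiprocessing.Pool. The check_image and clean_parallel names are illustrative, not part of the pipeline above; the directory layout is assumed to match the example usage shown earlier.

import os
from multiprocessing import Pool
from PIL import Image

def check_image(path):
    """Return the path if the image fails verification, else None."""
    try:
        with Image.open(path) as img:
            img.verify()
        return None
    except Exception:
        return path

def clean_parallel(dataset_dir, workers=4):
    # Gather all file paths first, then verify them in parallel
    paths = [os.path.join(root, name)
             for root, _, files in os.walk(dataset_dir)
             for name in files]
    with Pool(workers) as pool:
        corrupt = [p for p in pool.map(check_image, paths) if p]
    for p in corrupt:  # Remove sequentially to avoid filesystem races
        os.remove(p)
    print(f"Cleaned {len(corrupt)} corrupt images")

if __name__ == "__main__":  # Guard required for multiprocessing on Windows/macOS
    clean_parallel("dataset/train")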

Data Preprocessing

Images were resized to 64x64 pixels and batched for efficient training using TensorFlow’s image_dataset_from_directory method. This approach leverages directory structures for automatic labeling and supports batch processing for GPU optimization.

import tensorflow as tf

img_size = (64, 64)
batch_size = 32

# tf.keras.utils.image_dataset_from_directory infers labels from
# subdirectory names (tf.keras.preprocessing is the deprecated alias).
trained_data = tf.keras.utils.image_dataset_from_directory(
    "dataset/train",
    seed=42,
    image_size=img_size,
    batch_size=batch_size,
    label_mode='int'  # Integer labels for sparse categorical crossentropy
)

valid_data = tf.keras.utils.image_dataset_from_directory(
    "dataset/valid",
    seed=42,
    image_size=img_size,
    batch_size=batch_size,
    label_mode='int'
)

tested_data = tf.keras.utils.image_dataset_from_directory(
    "dataset/test",
    seed=42,
    image_size=img_size,
    batch_size=batch_size,
    label_mode='int'
)

Visualization: Sample images from the training set were visualized to verify correct loading and labeling. This step is critical to detect issues like mislabeled data or incorrect preprocessing.
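
A short sketch for spot-checking one batch, assuming the datasets created above; class_names is populated automatically by image_dataset_from_directory from the subdirectory names.

import matplotlib.pyplot as plt

class_names = trained_data.class_names  # Inferred from subdirectory names

plt.figure(figsize=(8, 8))
for images, labels in trained_data.take(1):  # A single batch
    for i in range(9):
        plt.subplot(3, 3, i + 1)
        plt.imshow(images[i].numpy().astype("uint8"))  # Raw pixels in [0, 255]
        plt.title(class_names[int(labels[i])])
        plt.axis("off")
plt.show()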

Enhanced Preprocessing: Random flips, rotations, and zooms augment the training set on the fly, helping the model generalize when the number of training images is limited.

# On-the-fly augmentation, applied to the training set only
data_augmentation = tf.keras.Sequential([
    tf.keras.layers.RandomFlip("horizontal"),
    tf.keras.layers.RandomRotation(0.1),  # Up to +/-10% of a full rotation
    tf.keras.layers.RandomZoom(0.1),
])

trained_data = trained_data.map(
    lambda x, y: (data_augmentation(x, training=True), y),
    num_parallel_calls=tf.data.AUTOTUNE
).prefetch(tf.data.AUTOTUNE)  # Overlap preprocessing with training

Model Architecture

The CNN model was built using Keras’ Sequential API for binary image classification, stacking four convolution/pooling blocks, a dense head with dropout, and a two-way softmax output:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

model = Sequential([
    # Four convolution/pooling blocks with doubling filter counts
    Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    MaxPooling2D(2, 2),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Conv2D(256, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.4),                    # Regularization for the dense head
    Dense(2, activation='softmax')   # Two-class probability output
])

model.summary()

Architecture Insights: Filter counts double across the four convolutional blocks (32 → 64 → 128 → 256), so deeper layers can capture increasingly abstract features while max pooling halves the spatial resolution at each stage. Dropout(0.4) regularizes the dense head, and the two-unit softmax output pairs with the sparse categorical crossentropy loss used during training.

Improvements: Because image_dataset_from_directory yields raw pixel values in [0, 255], adding a Rescaling(1./255) layer at the front of the model (or normalizing in the input pipeline) is a worthwhile first step. Other candidates are BatchNormalization after each convolution and replacing Flatten with GlobalAveragePooling2D to shrink the dense head; a sketch of such a variant follows.
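
A minimal sketch of one such variant, assuming the same 64x64 RGB inputs and two classes; this is illustrative, not the trained model documented above.

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Rescaling, Conv2D, BatchNormalization,
                                     MaxPooling2D, GlobalAveragePooling2D,
                                     Dense, Dropout)

improved_model = Sequential([
    Rescaling(1./255, input_shape=(64, 64, 3)),  # Normalize pixels to [0, 1]
    Conv2D(32, (3, 3), activation='relu'),
    BatchNormalization(),                        # Stabilize activations
    MaxPooling2D(2, 2),
    Conv2D(64, (3, 3), activation='relu'),
    BatchNormalization(),
    MaxPooling2D(2, 2),
    Conv2D(128, (3, 3), activation='relu'),
    BatchNormalization(),
    MaxPooling2D(2, 2),
    GlobalAveragePooling2D(),  # Far fewer parameters than Flatten + Dense
    Dropout(0.4),
    Dense(2, activation='softmax')
])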

Training

The model was compiled with the Adam optimizer and sparse categorical crossentropy loss, then trained for up to 5 epochs with per-epoch validation and an early stopping callback (patience of 2, restoring the best weights).

model.compile(
    optimizer='adam',                        # Default learning rate of 1e-3
    loss='sparse_categorical_crossentropy',  # Matches label_mode='int'
    metrics=['accuracy']
)

history = model.fit(
    trained_data,
    validation_data=valid_data,
    epochs=5,
    callbacks=[
        # Stop when validation loss stalls for 2 epochs; keep the best weights
        tf.keras.callbacks.EarlyStopping(patience=2, restore_best_weights=True)
    ]
)

Training Details: Adam runs with its default learning rate of 1e-3; sparse categorical crossentropy matches the integer labels (label_mode='int') produced by the input pipeline; and early stopping monitors validation loss, halting training and restoring the best weights if it stops improving.

Monitoring: Training and validation accuracy/loss were plotted to diagnose issues like overfitting or underfitting. The example below plots accuracy; the same pattern applies to loss via history.history['loss'] and history.history['val_loss'].

import matplotlib.pyplot as plt

plt.plot(history.history['accuracy'], label='Training Accuracy')
plt.plot(history.history['val_accuracy'], label='Validation Accuracy')
plt.title('Model Accuracy')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.show()

Evaluation

The trained model was evaluated on the test dataset, achieving approximately 96% accuracy with a test loss of about 0.124.

test_loss, test_acc = model.evaluate(tested_data)
print(f"Test Accuracy: {test_acc:.5f}")
print(f"Test Loss: {test_loss:.5f}")

Evaluation Insights: Test accuracy in line with validation accuracy indicates the model generalizes rather than memorizing the training set; a large gap between the two would point to overfitting or a distribution mismatch between splits.

Additional Metrics: Compute precision, recall, and F1-score to assess performance, especially for imbalanced datasets.

from sklearn.metrics import classification_report
import numpy as np

# Collect per-batch predictions and ground-truth labels
y_pred = []
y_true = []
for images, labels in tested_data:
    preds = model.predict(images, verbose=0)
    y_pred.extend(np.argmax(preds, axis=1))  # Most probable class per image
    y_true.extend(labels.numpy())

print(classification_report(y_true, y_pred, target_names=['Class 0', 'Class 1']))
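
A confusion matrix makes per-class error patterns explicit; this sketch reuses the y_true and y_pred lists collected in the loop above.

from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix
import matplotlib.pyplot as plt

cm = confusion_matrix(y_true, y_pred)  # Rows: true class, columns: predicted
ConfusionMatrixDisplay(cm, display_labels=['Class 0', 'Class 1']).plot()
plt.show()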

Best Practices and Future Improvements

Best Practices for Image Classification: Verify data integrity before training; keep train, validation, and test splits strictly separate; normalize pixel values; augment only the training set; and monitor validation metrics with early stopping to catch overfitting early.

Future Improvements: Transfer learning from a pretrained backbone typically outperforms a small CNN trained from scratch when data is limited; hyperparameter tuning (learning rate, dropout rate, batch size) and class-balance checks are also natural next steps. A hedged transfer learning sketch follows.
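
A minimal transfer learning sketch, assuming the same 64x64 RGB inputs and two classes. MobileNetV2 is one of several pretrained backbones in tf.keras.applications; nothing here is part of the pipeline documented above.

import tensorflow as tf

# Pretrained ImageNet backbone with its classification head removed
base = tf.keras.applications.MobileNetV2(
    input_shape=(64, 64, 3), include_top=False, weights='imagenet')
base.trainable = False  # Freeze the backbone; train only the new head

model = tf.keras.Sequential([
    tf.keras.layers.Rescaling(1./127.5, offset=-1,
                              input_shape=(64, 64, 3)),  # MobileNetV2 expects [-1, 1]
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(2, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])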