This is more than a beginner's introduction to artificial intelligence. In this blog, we'll explore AI through the lens of deep learning. You might be thinking: I've heard of AI, but what is deep learning? Deep learning is a subset of machine learning based on artificial neural networks that learn their own representations of the data. The adjective 'deep' refers to the use of multiple layers in the network. You won't need an in-depth understanding of artificial intelligence or deep learning for this tutorial; it is a code-along blog, and a basic understanding of Python is enough. And the best part? By the end, we'll have built an image classification application using deep learning.
Image classification is a core task in computer vision. Its goal is to understand the content of an image and assign it to one or more predefined categories or labels. The task gained significant popularity after the introduction of the ImageNet challenge and the availability of the large ImageNet dataset.
The advent of deep neural networks, specifically the groundbreaking AlexNet architecture, revolutionized image classification by achieving remarkable performance on the ImageNet dataset. Deep learning models have become the driving force behind state-of-the-art image classification systems, capable of automatically learning hierarchical representations from raw data.
For this tutorial, I'll be using the Oxford Flower Image dataset, a curated collection of 8,189 flower images spanning 102 categories. It is a good fit for training and evaluating image classification models: each photograph shows a specific flower species, and together the categories cover a wide variety of flowers.
Deep learning projects can get complicated quickly, and fastai aims to keep them simple and accessible. It is an open-source library built on top of the widely used PyTorch framework.
At its core, fastai is designed to streamline the development process by abstracting away many of the low-level details that typically slow deep learning projects down. Its concise API lets developers focus on the problem at hand rather than on intricate implementation details.
Before you get bored with all the information, let's get to coding.
1. Import Necessary Modules
from fastai.vision.all import *
First, we import everything from `fastai.vision.all`, fastai's high-level API for computer vision tasks. This single star import brings in the names used below, such as `DataBlock`, `vision_learner`, and `Path`.
2. Define Dataset Path
You can download the dataset from here.
path = Path('flowers')
We create a `Path` object pointing to the directory named 'flowers', which contains the flower image dataset we'll be working with.
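Because the labels are later taken from folder names, it can help to confirm the layout before training. The sketch below uses only the standard library and assumes one subfolder per flower category inside 'flowers'; the helper name `count_images_per_class` and the list of image extensions are illustrative choices, not part of fastai.

```python
from collections import Counter
from pathlib import Path

# Assumed set of image extensions; adjust to match the downloaded files.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}

def count_images_per_class(root):
    """Count image files under root, keyed by the name of their parent folder."""
    counts = Counter()
    for file in Path(root).rglob("*"):
        if file.is_file() and file.suffix.lower() in IMAGE_SUFFIXES:
            counts[file.parent.name] += 1
    return counts
```

Calling `count_images_per_class('flowers')` and checking `len(...)` and `sum(....values())` against the expected 102 categories and 8,189 images is a quick way to catch an incomplete download or a flat folder layout.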
3. Create Data Block
dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),
    get_items=get_image_files,
    splitter=RandomSplitter(valid_pct=0.2, seed=42),
    get_y=parent_label,
    item_tfms=[Resize(224, method='squish')]
).dataloaders(path)
This step is crucial for loading and preprocessing the data. We create a `DataBlock` object, which handles these tasks. Here's what each parameter does:
- blocks=(ImageBlock, CategoryBlock): Specifies that the data consists of images and categories.
- get_items=get_image_files: The function to use for getting the image file paths.
- splitter=RandomSplitter(valid_pct=0.2, seed=42): Splits the data into training and validation sets, with 20% of the data used for validation. The `seed=42` ensures reproducibility.
- get_y=parent_label: The function to use for getting the label (category) of each image, which is assumed to be the name of the parent directory.
- item_tfms=[Resize(224, method='squish')]: Resizes each image to 224x224 pixels using the 'squish' method, which stretches or compresses the whole image to fit instead of cropping it. Nothing is cut off, but the aspect ratio is not preserved.
- .dataloaders(path): Creates the actual data loaders using the specified `path` for the dataset.
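To make the parameters above more concrete, here is a plain-Python sketch of the three ideas involved: labeling by parent folder, a seeded random split, and squish resizing. The function names are hypothetical stand-ins, not fastai's implementation, and the split will not select the same files as `RandomSplitter` because the random number generator differs.

```python
import random
from pathlib import Path

def label_from_parent(file_path):
    """Use the name of the containing folder as the label."""
    return Path(file_path).parent.name

def random_split(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed and hold out valid_pct of them for validation."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # same seed, same shuffle, same split
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in indices[:cut]]
    train = [items[i] for i in indices[cut:]]
    return train, valid

def squish_scale(width, height, size=224):
    """Return the (horizontal, vertical) scale factors when squishing to size x size."""
    return size / width, size / height

files = [f"flowers/{name}/{i}.jpg" for name in ("rose", "daisy") for i in range(5)]
train, valid = random_split(files)
print(len(train), len(valid))        # 8 2
print(label_from_parent(files[0]))   # rose
print(squish_scale(448, 224))        # (0.5, 1.0)
```

The unequal scale factors in the last line show why squishing distorts a non-square image: a 448x224 photo is halved horizontally but left unchanged vertically.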
4. Display Sample Data
dls.show_batch(max_n=6)
This command displays up to 6 images from a training batch, along with their labels, allowing us to visually inspect the data.
5. Choose a Pre-trained Model
For this tutorial, we'll employ transfer learning, a powerful technique that leverages knowledge gained from solving one problem and applies it to a different but related problem. Specifically, we'll use a pre-trained model that has been trained on a large dataset (such as ImageNet for computer vision tasks) and fine-tune it on our flower image dataset.
We'll use the DenseNet architecture, short for Densely Connected Convolutional Networks. Within each dense block, every layer receives the feature maps of all preceding layers as input, which encourages feature reuse and keeps the parameter count comparatively low while performing well on image classification tasks.
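The dense connectivity can be illustrated with simple arithmetic: each layer in a dense block concatenates a fixed number of new feature maps (the growth rate) onto its input, and a transition layer between blocks halves the channel count. The sketch below assumes the commonly published DenseNet-121 configuration (four blocks of 6, 12, 24, and 16 layers, growth rate 32, 64 initial feature maps); these values are not stated in this tutorial, and the function is only an illustration.

```python
def densenet_channels(block_config=(6, 12, 24, 16), growth_rate=32, init_features=64):
    """Track the channel count at the output of each dense block."""
    channels = init_features
    per_block = []
    for i, num_layers in enumerate(block_config):
        # Each layer concatenates growth_rate new feature maps onto its input.
        channels += num_layers * growth_rate
        per_block.append(channels)
        # A transition layer halves the channels between blocks (not after the last one).
        if i < len(block_config) - 1:
            channels //= 2
    return per_block

print(densenet_channels())  # [256, 512, 1024, 1024]
```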
6. Create Learner and Fine-tune
learn = vision_learner(dls, densenet121, metrics=accuracy)
learn.fine_tune(10)
We create a `Learner` object, which is responsible for training the model. The `vision_learner` function takes the following arguments:
- dls: The data loaders created earlier.
- densenet121: The DenseNet-121 architecture to use as the starting point; `vision_learner` loads its pre-trained weights by default.
- metrics=accuracy: The metric to use for evaluating the model's performance (in this case, accuracy).
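Then, we fine-tune the pre-trained model on the flower image dataset. By default, `fine_tune(10)` first trains only the newly added head for one epoch with the pre-trained layers frozen, then unfreezes the whole network and trains it for 10 epochs (an epoch is one pass over the entire training set).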
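Roughly speaking, fine-tuning in fastai is a two-phase schedule: a short phase with the pre-trained layers frozen, followed by the requested epochs with everything unfrozen. The function below is a simplified sketch of that idea using the `freeze`, `unfreeze`, and `fit_one_cycle` methods of a `Learner`; the name `two_phase_fine_tune` is hypothetical, and the sketch leaves out the learning-rate adjustments that the real `fine_tune` applies between phases.

```python
def two_phase_fine_tune(learn, epochs, freeze_epochs=1):
    """Simplified two-phase schedule: train the head first, then the whole network."""
    learn.freeze()                     # only the newly added head is trainable
    learn.fit_one_cycle(freeze_epochs)
    learn.unfreeze()                   # all layers are trainable from here on
    learn.fit_one_cycle(epochs)
```

In practice, calling `learn.fine_tune(10)` directly is the simpler choice; the sketch is only meant to show what happens in each phase.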
7. Evaluate Model Performance
final_accuracy = learn.recorder.final_record[2]
print(f"Final Accuracy: {final_accuracy:.2f}")
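After fine-tuning, we read the final accuracy from the recorder and print it. The final record holds the last epoch's training loss, validation loss, and then each metric in order, so index 2 is the accuracy here.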
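For reference, accuracy is simply the fraction of validation images whose predicted label matches the true label. A minimal sketch with made-up labels (the helper `accuracy_score` and the sample data are illustrative, not taken from the trained model):

```python
def accuracy_score(predicted, actual):
    """Return the fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one label is required")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

preds = ["rose", "daisy", "tulip", "rose"]
truth = ["rose", "daisy", "rose", "rose"]
print(f"Final Accuracy: {accuracy_score(preds, truth):.2f}")  # Final Accuracy: 0.75
```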
8. Visualize Results
interp = ClassificationInterpretation.from_learner(learn)
interp.plot_confusion_matrix(figsize=(15, 20))
learn.show_results(max_n=20)
These lines create a `ClassificationInterpretation` object from the trained `Learner` and plot the confusion matrix for the model's predictions on the validation set. Additionally, we display up to 20 sample predictions from the validation set, along with the ground truth labels.
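With 102 categories the plotted confusion matrix is large, so it helps to know what it contains: a count for every (actual, predicted) pair, where the off-diagonal entries are the mistakes. The sketch below builds those counts in plain Python from made-up labels; the helper names are hypothetical and the data is not from the trained model.

```python
from collections import Counter

def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))

def most_confused(actual, predicted, min_count=1):
    """List the misclassified (actual, predicted, count) triples, most frequent first."""
    mistakes = [
        (a, p, n)
        for (a, p), n in confusion_counts(actual, predicted).items()
        if a != p and n >= min_count
    ]
    return sorted(mistakes, key=lambda triple: -triple[2])

actual = ["rose", "rose", "rose", "daisy", "daisy", "tulip"]
predicted = ["rose", "tulip", "tulip", "daisy", "rose", "tulip"]
print(most_confused(actual, predicted))
# [('rose', 'tulip', 2), ('daisy', 'rose', 1)]
```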
9. Export Model
learn.export('model.pkl')
Finally, we export the trained model to a file named 'model.pkl' for future use.
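To use the exported file later, one option is to load it with fastai's `load_learner` and call `predict` on a single image. The sketch below is one possible way to wrap that; the function name `classify_flower` is hypothetical, the image path you pass in is up to you, and it assumes 'model.pkl' is in the current working directory.

```python
def classify_flower(image_path, model_path="model.pkl"):
    """Load the exported Learner and classify one image file."""
    # Imported inside the function so fastai is only required when it is called.
    from fastai.vision.all import load_learner, PILImage

    learn = load_learner(model_path)
    # predict returns the decoded label, its index, and the per-class probabilities.
    label, label_idx, probs = learn.predict(PILImage.create(image_path))
    return str(label), float(probs[label_idx])
```

The function returns the predicted category name together with the model's probability for that category, which is usually enough for a simple classification app.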
This step-by-step guide demonstrates how to leverage the powerful fastai library for computer vision tasks, specifically training an image classification model on a flower image dataset using transfer learning and the DenseNet architecture.