Applying CNN on cifar10


Published: March 03, 2024

This is a starting point for my deep learning and machine learning experiments. It mainly covers a basic CNN model and its results. Happy experimenting 🙂

How to think about a DL architecture

CIFAR-10 is a dataset of 32x32 color images in 10 classes, with 50K images for training and 10K for testing. There are state-of-the-art models available, but I wanted to build my own.

I started with a basic model: 2 convolutional layers followed by 3 fully connected layers. Here is the link to my experiment .ipynb file. Try playing with it.
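To get a feel for how large such a starting model is, here is a minimal sketch that counts the trainable parameters of a network with 2 convolutional layers and 3 fully connected layers. The channel widths (16, 32), the fully connected sizes (256, 64, 10), the padding that preserves image size, and the 2x2 max pooling after each convolution are all assumptions made for illustration. They are not taken from the notebook.

```python
def conv_params(in_ch, out_ch, k):
    # Each output channel has one k x k filter per input channel, plus a bias.
    return out_ch * (in_ch * k * k + 1)


def fc_params(n_in, n_out):
    # Each output unit has one weight per input, plus a bias.
    return n_out * (n_in + 1)


def describe_model(conv_channels=(16, 32), fc_sizes=(256, 64, 10),
                   image_size=32, in_ch=3, k=3):
    """Return (layer name, parameter count) pairs for a hypothetical layout."""
    layers = []
    ch, size = in_ch, image_size
    for out_ch in conv_channels:
        layers.append((f"conv {ch}->{out_ch}", conv_params(ch, out_ch, k)))
        ch = out_ch
        size //= 2  # assumed 2x2 max pooling halves height and width
    n_in = ch * size * size  # flattened feature map fed to the first FC layer
    for n_out in fc_sizes:
        layers.append((f"fc {n_in}->{n_out}", fc_params(n_in, n_out)))
        n_in = n_out
    return layers


layers = describe_model()
for name, count in layers:
    print(f"{name:>14}: {count:>7} parameters")
total = sum(count for _, count in layers)
print(f"{'total':>14}: {total:>7} parameters")
```

With these assumed sizes, almost all parameters sit in the first fully connected layer, which is typical for small CNNs of this shape.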

Why this architecture, and not a different one?

I always have this question when I see other people's solutions. I think I found an answer by doing this exercise.

First, start with whatever model components you already know.
Then add complexity step by step. Find out why accuracy is not improving and add whatever small change your intuition suggests.
Finally, tune the hyperparameters of the current architecture to get the best out of the model.
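The tuning step above can be sketched as a small grid search. This is only an illustration: `grid_search` is a hypothetical helper, and `fake_train_and_evaluate` is a toy stand-in that returns a made-up score instead of training a real model. In practice you would replace it with a function that trains the CNN and returns validation accuracy.

```python
import math
from itertools import product


def grid_search(train_and_evaluate, learning_rates, epoch_counts):
    """Try every (learning rate, epochs) pair and keep the best score."""
    best = None
    for lr, epochs in product(learning_rates, epoch_counts):
        accuracy = train_and_evaluate(lr=lr, epochs=epochs)
        if best is None or accuracy > best[0]:
            best = (accuracy, lr, epochs)
    return best


def fake_train_and_evaluate(lr, epochs):
    # Toy stand-in, NOT real training: the score peaks at lr = 0.01
    # and grows with the number of epochs up to 10.
    penalty = 0.1 * abs(math.log10(lr) + 2)
    return 0.5 - penalty + 0.01 * min(epochs, 10)


best = grid_search(fake_train_and_evaluate,
                   learning_rates=[0.1, 0.01, 0.001],
                   epoch_counts=[5, 10])
print("best (accuracy, lr, epochs):", best)
```

A grid like this gets expensive quickly, so it makes sense to keep it small and tune only the parameters that matter most for the current architecture.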

I was able to increase accuracy from 17% to 50%. This is of course not good: even simple, well-tuned CNNs reach accuracy in the 80s, and state-of-the-art models go well above 95%. But it was a start, made without looking at any other models, so I am happy with my experiment.

What I did

I tried to make the model deeper and larger. Using convolutional layers first is the obvious choice, since a CNN exploits the spatial relations in an image. I used only 4 convolutional layers with kernel size 3. Assuming stride 1 and no pooling in between, each output value sees a window of 4*(3-1)+1 = 9 pixels per side (pooling or striding would widen this). I think this is enough. After this I used fully connected layers, which are the easiest guess in deep learning.
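The window that a stack of convolutions can see (its receptive field) can be computed with a short helper. This is a minimal sketch: `receptive_field` is a hypothetical function name, and it assumes plain convolutions without dilation. For stride 1, each layer with kernel size k adds k-1 pixels to the window, so four layers of kernel size 3 give a 9x9 window.

```python
def receptive_field(kernel_sizes, strides=None):
    """Side length of the input window seen by one output value."""
    if strides is None:
        strides = [1] * len(kernel_sizes)
    field, jump = 1, 1  # jump = distance in input pixels between neighbors
    for k, s in zip(kernel_sizes, strides):
        field += (k - 1) * jump
        jump *= s
    return field


print(receptive_field([3, 3, 3, 3]))          # 4 layers, kernel 3, stride 1 -> 9
print(receptive_field([3, 3], strides=[2, 1]))  # a stride-2 layer widens it -> 7
```

Adding a stride or pooling layer early in the stack grows the window much faster than adding more stride-1 layers.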

Observations

When we increase the model size (number of layers), we need to decrease the learning rate, because the output becomes more sensitive to the parameter values. With multiple layers and ReLU activations, we multiply by scalars again and again, so the final output becomes sensitive to small changes in the first layer.
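A toy example makes this sensitivity visible. The sketch below is a deliberate simplification, not the real model: every layer is a single scalar weight followed by ReLU, and all weights are nudged by the same small step, as a crude stand-in for one update at a given learning rate. The function names are made up for this illustration.

```python
def relu(x):
    return max(0.0, x)


def forward(x, weights):
    # A chain of scalar "layers": multiply by a weight, then apply ReLU.
    for w in weights:
        x = relu(w * x)
    return x


def relative_output_change(depth, step, w=1.0, x=1.0):
    """Relative change in the output when every weight moves by `step`."""
    base = forward(x, [w] * depth)
    nudged = forward(x, [w + step] * depth)
    return (nudged - base) / base


for depth in (2, 4, 8):
    change = relative_output_change(depth, step=0.01)
    print(f"depth {depth}: output changes by {change:.2%}")
```

The same step moves the output more as depth grows, because the change compounds through every multiplication. In this simplified picture, that is why a deeper model needs a smaller learning rate.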

I think the two most important hyperparameters are the number of epochs and the learning rate. We can set the learning rate low and increase the number of epochs, but this takes longer, and in some cases the model starts to memorize the training data rather than generalizing.
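One common way to limit this memorization is early stopping: watch the loss on held-out validation data and stop once it no longer improves. This is a technique I did not use in the experiment above. The sketch is only an illustration, and the loss values in it are made up.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops once `patience` epochs pass without a new best loss.
    Epochs are counted from 1.
    """
    best_loss = float("inf")
    best_epoch = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch
    return len(val_losses), best_epoch


# Made-up validation losses: they improve, then rise as the model overfits.
losses = [1.0, 0.8, 0.7, 0.72, 0.75, 0.9, 1.0]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
print(f"stop at epoch {stop_epoch}, keep the weights from epoch {best_epoch}")
```

With a low learning rate and many epochs, a check like this keeps the extra training time from turning into overfitting.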


Mihir Sutariya
