Applying CNN on cifar10
Published:
This is a starting of Deep learning and machine learning. Mainly talking basic model of CNN and results. Happy Experimenting π
How to think an DL architecture
CIFAR-10 is a dataset with 50K images of 10 classes for training and 10K for tests. There are state of the art models available, but i wanted to go with my own.
Started with basic model, 2 CNN layers and 3 fully connected layers. Here is link to my experiment .ipynb file. Try to play with it.
Why this architecture, and not different?
I always have this question when i see otherβs solutions. But i think i got an answer by doing this exercise.
- First you go with whatever knowledge of any model components you have.
- Try to add more and more complexity. Find why accuracy is not getting better and add a little something which your intuition says.
- Try to tune hyperparameters for the current architecture to get best out of the model.
I am able to increase accuracy from 17% to 50%. This is ofcourse not good. Best accuracy is around 80s. But it was a starting and without seeing any other models. So, i am happy with my experiment.
What i did
I tried to make model deeper and larger. Using CNN at first layer is obvious. CNN make use of spacial relation of an image. But i used only 4 layers of kernel size 3. this means it can get information of 4*3-2=10 pixels around it. I think this is enough. After this i used fully connected layer. Using fully connected layer is easiest guess in Deep Learning.
Observations
When we increase the model size (number of layers) we need to decrease learning rate. Because it becomes sensitive to parameters value. Because when we have multiple layers and using relu we are multiplying scalers again and again and the last output becomes sensitive to first layer.
I think 2 most important parameters are # epoch and learning rate. We can set learning rate low and increase # epoch. But, this will take larger time and in some cases model will start to remember things rather than genrelizing.