In the previous article of this series, “Fundamentals of AI”, I explained the term Artificial Intelligence from the viewpoint of the common man. If a general overview of A.I. is what you are looking for, you can start here. In this article, I take an in-depth tour of two approaches within A.I.: Machine Learning vs. Deep Learning.

Why is this distinction important?

When I started my data science journey, I thought these two terms meant the same thing and that it was all just fancy jargon for Artificial Intelligence. It was not until one of my experienced colleagues asked me to differentiate them that I realized I knew nothing of the field I was pursuing (bummer!). This article is a starting point for anyone looking for a data science job or internship, so that you don’t make the same mistake I did.


Let’s Get Started

As mentioned in my previous article, both these approaches to Artificial Intelligence have their differences and uses depending on your situation. Let’s first talk about them individually and then delve into comparisons.

  • Machine Learning

Machine Learning is the basis of Artificial Intelligence and has been around for longer than you might imagine. The first mathematical Machine Learning algorithms were developed as early as the 1940s!

You can read about the history of Machine Learning here.

Machine Learning is the art of training a machine to understand data and draw conclusions from it. When you get buying recommendations on Amazon or a movie recommendation on Netflix, it is because a Machine Learning algorithm at the back end analyzed your usage trends in your historical data and made those recommendations for you.

Businesses also use clustering algorithms to group similar customers together, and then use those clusters to create targeted advertisements or products for each group.
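To make the clustering idea concrete, here is a minimal sketch of k-means, one common clustering algorithm, written with nothing but the Python standard library. The customer numbers and the hand-picked starting centroids are made up for illustration; a real project would more likely use a library implementation and scale the features first.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of the points assigned to it."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Keep the old centroid if a cluster ends up empty.
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters

# Hypothetical customers: (orders per month, average basket value in dollars).
customers = [(1, 20), (2, 25), (1, 30), (10, 200), (12, 220), (11, 180)]

# Starting centroids are picked by hand here so the run is repeatable.
centroids, clusters = kmeans(customers, centroids=[(1, 20), (10, 200)])

for centroid, members in zip(centroids, clusters):
    print(centroid, members)
```

Each printed line is one customer group: the occasional small-basket shoppers land in one cluster and the frequent big spenders in the other, which is the kind of split a marketing team could then target separately.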

Machine Learning algorithms are usually trained on smaller datasets and, since they are comparatively simple, they do not require heavy computational power (in comparison to Deep Learning). However, because of that simplicity, they can struggle to draw finer distinctions in the data.

  • Deep Learning
How deep is your learning

The concept of the Neural Network is older than it looks: the first models date back to the 1940s and 1950s, and interest picked up again in the 1980s when backpropagation made it practical to train networks with several layers. At that time, however, few people appreciated its power and potential, partly because the techniques were still immature but also because the computers available did not come close to the computing power we use today to train a Deep Neural Network.

Today tech giants like Google, Amazon and IBM all use Deep Learning to benefit their clients. Services such as Speech-to-Text transcription, OCR and chatbots all have huge trained Neural Networks behind them. Tesla's AI technology is considered state of the art in autonomous cars.

In the world of A.I., Deep Learning is the big gun that has revolutionized multiple industries. It has transformed the way our world operates.

But how do we differentiate it from Machine Learning? Machine Learning and Deep Learning take very different approaches to solving a given problem.

Machine Learning algorithms look at the data as a whole and usually draw decision boundaries between different samples. The caveat is that they work directly on the input features they are given, so a change in one feature (insignificant as it may be) can push a sample across the boundary and cause the model to misclassify it. A Deep Learning model, on the other hand, consists of multiple layers, each of which works on a different aspect of the input. This is also one of the main reasons why a Deep Learning model is trained on a huge dataset (generally 100,000+ samples): a larger dataset means greater diversity, and a multi-layered deep neural network is well placed to take advantage of that diversity and pick out minute distinctions. A classical Machine Learning model tends to fall short here.
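A tiny, classic illustration of the "single boundary vs. layers" idea is the XOR problem. The sketch below is not taken from the experiment in this article; the weights of the two-layer network are set by hand rather than trained, and the "Machine Learning" side is deliberately limited to a single straight boundary (kernel methods such as SVMs can draw curved ones).

```python
from itertools import product

# XOR: the label is 1 when exactly one of the two inputs is 1.
samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def linear_predict(x, w1, w2, b):
    """A single straight decision boundary: w1*x1 + w2*x2 + b > 0."""
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0

def relu(v):
    return max(0.0, v)

def two_layer_predict(x):
    """Two hidden units followed by one output unit. Weights are set by hand."""
    h1 = relu(x[0] + x[1])        # grows with the number of inputs that are on
    h2 = relu(x[0] + x[1] - 1)    # non-zero only when both inputs are on
    return 1 if h1 - 2 * h2 > 0.5 else 0

# Try a grid of candidate straight boundaries and keep the best score out of 4.
grid = [i / 2 for i in range(-8, 9)]   # -4.0, -3.5, ..., 4.0
best_linear = max(
    sum(linear_predict(x, w1, w2, b) == y for x, y in samples)
    for w1, w2, b in product(grid, repeat=3)
)
layered = sum(two_layer_predict(x) == y for x, y in samples)

print("best single boundary:", best_linear, "of 4")
print("two-layer network:", layered, "of 4")
```

No straight line gets all four XOR points right, while stacking one extra layer does. Real Deep Learning models apply the same trick at a much larger scale, with the weights learned from data instead of written by hand.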

An important caveat here is that the more complex a Deep Learning model gets, the higher its training cost, in both time and money. Even a fairly simple DL model often needs a hefty (and expensive) GPU to train in a reasonable time. Tesla has disclosed that training its current model took over 70,000 GPU hours!
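To get a feel for what 70,000 GPU hours means, here is a back-of-the-envelope calculation. It assumes the work splits perfectly across GPUs, which real training jobs never quite manage, so treat the numbers as lower bounds on wall-clock time.

```python
GPU_HOURS = 70_000  # the figure quoted above

def wall_clock_days(gpu_hours, n_gpus):
    """Days of training if the work splits perfectly across n_gpus (an idealization)."""
    return gpu_hours / n_gpus / 24

for n_gpus in (1, 8, 100, 1000):
    print(f"{n_gpus:>5} GPUs -> {wall_clock_days(GPU_HOURS, n_gpus):8.1f} days")
```

On a single GPU that works out to roughly eight years of non-stop training, which is why models of this size are only trained on large clusters.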


Let’s see a practical example.

I know I said there would not be any coding examples in this series, but a practical comparison is important here to wrap up the discussion.

I used the Fashion-MNIST dataset to train two models: a Machine Learning classifier called SVM (Support Vector Machine), which, it is important to note, is one of the more complex ML classifiers, and a very simple Convolutional Neural Network, whose architecture can be seen in the image below.

CNN Architecture used for comparison.

The total parameter count of 712,202 may seem like a huge number, but just for reference, Google’s latest language model has over 1.6 trillion parameters, and OpenAI’s GPT-3 model has 175 billion 😬!

Ours doesn’t seem so significant now, does it? 😃
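If you are wondering where a parameter count comes from, the sketch below shows the standard formulas for convolutional and fully connected layers. The three layers listed are hypothetical and are NOT the architecture from the figure above; only the 712,202, 175 billion and 1.6 trillion figures come from this article.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Each filter has kernel_h * kernel_w * in_channels weights plus one bias."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """Each unit has one weight per input plus one bias."""
    return (inputs + 1) * units

# Hypothetical layers for illustration, not the model used in this article.
layers = {
    "conv 3x3, 1 -> 32 channels": conv2d_params(3, 3, 1, 32),
    "dense 5408 -> 128": dense_params(5408, 128),
    "dense 128 -> 10": dense_params(128, 10),
}
total = sum(layers.values())

for name, count in layers.items():
    print(f"{name:<28} {count:>10,}")
print(f"{'total':<28} {total:>10,}")

# How many times larger the big language models are than our 712,202 parameters.
OURS = 712_202
print(f"GPT-3 (175 billion): about {175e9 / OURS:,.0f}x")
print(f"1.6 trillion model:  about {1.6e12 / OURS:,.0f}x")
```

Notice that almost all of the parameters sit in the first fully connected layer, which is typical for small CNNs like this one.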

**For a fair comparison, I passed exactly the same data to both models and trained both of them on a CPU.**

  • The dataset

The dataset consists of a total of 70,000 images, each of size 28×28.

The split was 60,000 for training and 10,000 for validation.
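The two models see these images in different shapes, which the sketch below illustrates with a stand-in image so it runs without downloading anything. Scaling pixels to the 0 to 1 range is a common preprocessing step that I am assuming here for illustration; it is not a description of the exact pipeline used for the results in this article.

```python
TRAIN_IMAGES, VALIDATION_IMAGES = 60_000, 10_000
HEIGHT, WIDTH = 28, 28

def flatten(image):
    """Turn a 2-D grid of pixels into one long feature vector (row by row)."""
    return [pixel for row in image for pixel in row]

def scale(image):
    """Map 8-bit pixel values (0-255) into the range 0.0-1.0."""
    return [[pixel / 255 for pixel in row] for row in image]

# A stand-in image (all mid-grey) in place of a real Fashion-MNIST sample.
image = [[128] * WIDTH for _ in range(HEIGHT)]

svm_input = flatten(scale(image))   # 784 numbers, spatial layout discarded
cnn_input = scale(image)            # 28 rows of 28 numbers, layout preserved

print(len(svm_input), "features per image for the SVM")
print(len(cnn_input), "x", len(cnn_input[0]), "grid per image for the CNN")
print(f"{TRAIN_IMAGES * len(svm_input):,} pixel values in the training split")
```

The SVM only ever sees a flat list of 784 numbers, while the CNN keeps the 28×28 grid and can therefore pick up on shapes and edges.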

  • The Results

The SVM model took several hours to train (3 hours and 45 minutes, to be precise), and the result was a bloated model that overfitted certain classes and performed extremely poorly: it gave a validation accuracy of 10%, which on a ten-class dataset is no better than random guessing.

The CNN used is a very basic one. It took around 37 minutes to train (significantly faster than the SVM model), and after 10 epochs we ended with an accuracy of 90.05%!
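To put the two accuracy figures in context, it helps to know what a model scores when it learns nothing at all. The sketch below computes accuracy for a stand-in "model" that gives the same answer for every image, using labels laid out like the Fashion-MNIST validation split (10 classes, 1,000 images each). It is an illustration of the baseline, not a reproduction of either trained model.

```python
def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

NUM_CLASSES = 10
PER_CLASS = 1_000   # the 10,000 validation images are evenly split across classes

labels = [c for c in range(NUM_CLASSES) for _ in range(PER_CLASS)]

# A "model" that has collapsed to a single answer for every image.
always_class_zero = [0] * len(labels)

baseline = accuracy(always_class_zero, labels)
print(f"constant-prediction accuracy: {baseline:.2%}")
```

Anything close to this baseline means the model has not learned anything useful about the images, so it is always worth comparing a reported accuracy against it.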


Conclusion

As is evident from the results above, Deep Learning provides much better results here. However, as mentioned before, this does not mean that it is the right approach for you. Which method to use depends on your problem. When you encounter an AI problem, you should always ask yourself the following questions:

  1. How much data do I have available?
  2. Is the data labeled?
  3. Is time complexity a problem?
  4. How complex are the data features?

The answers to these will help you narrow down which algorithm to use.

I hope this article helped clear all your doubts and questions.

If you are interested in Data Science or A.I. in general, I write articles explaining different aspects of Artificial Intelligence.
