Machine Learning is an umbrella term used to describe a variety of different tools and techniques which allow a machine or a computer program to learn and improve over time. ML tools and techniques include but are not limited to Statistical Reasoning, Data Mining, Mathematics and Programming.
This definition can be divided into 2 subsets: formal and informal. The formal definition deals with the specifics of what constitutes a Machine Learning technique, while the informal one simplifies it so that it is easier for a broader audience to grasp.
1) Formal Definition : Before I quote a definition which effectively captures the essence of Machine Learning, let's understand the prerequisites. To learn, a machine needs Data, Processing Power/Performance and Time. It could be said that if a machine gets better at something over time and improves its performance as more data is acquired, then this machine is said to be learning and we could call this process Machine Learning.
Tom Mitchell very aptly describes Machine Learning as follows :
A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P, if its performance at tasks in T, as measured by P, improves with experience E.
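To make E, T and P concrete, here is a minimal sketch in Python. Everything in it is an assumption made up for illustration: the task T is deciding whether a number is at least 50, the experience E is a growing list of labeled examples, and the performance measure P is accuracy on the numbers 0 to 99. The function names `fit_threshold` and `accuracy` are hypothetical, and the data was picked by hand so that the improvement is easy to see; real learning curves are rarely this tidy.

```python
def fit_threshold(examples):
    """Learn a cut-off halfway between the largest 'no' and the smallest 'yes' seen so far."""
    largest_negative = max(x for x, label in examples if not label)
    smallest_positive = min(x for x, label in examples if label)
    return (largest_negative + smallest_positive) / 2


def accuracy(threshold, test_set):
    """Performance measure P: fraction of test numbers classified correctly."""
    correct = sum((x >= threshold) == label for x, label in test_set)
    return correct / len(test_set)


# Experience E: labeled examples arriving over time (hand-picked, illustrative data).
stream = [(5, False), (60, True), (20, False), (58, True),
          (40, False), (54, True), (48, False), (51, True)]

# Task T: decide whether a number is at least 50 (the rule is hidden from the learner).
test_set = [(x, x >= 50) for x in range(100)]

# Measure P after seeing 2, 4, 6 and 8 examples.
scores = [accuracy(fit_threshold(stream[:n]), test_set) for n in (2, 4, 6, 8)]
print(scores)  # [0.83, 0.89, 0.97, 1.0]
```

The program was never told the rule "at least 50"; its accuracy on T, as measured by P, simply went up as E grew, which is exactly what the definition asks for.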
2) Informal Definition : In relatively simple terms we can summarize Machine Learning as giving machines/computers an ability to learn the way humans do, i.e. without explicitly telling them what to do. Instead we let them learn on their own and even fail in some instances so they learn from that failure. Of course this is an oversimplification, but I think it gets the point across.
Arthur Samuel explains Machine Learning as :
The field of study that gives computers the ability to learn without being explicitly programmed.
The history of Machine Learning is quite convoluted (pun intended), since the term "Machine Learning" itself can be deceptive. Machine Learning is not a monolithic concept but a collection of tools and techniques which have their own separate origins over the past 70 years or more. Still, there were points in time significant enough to have shaped Machine Learning into the form we see today.
Before we take a tour down memory lane, let's pay homage to the father of Automata, Alan Turing.
Year 1950 : Alan Turing developed the Turing Test during this year.
The Turing Test, also called the "Imitation Game", had 1 objective: to determine whether a machine is able to think like a human. While the technique is quite primitive by today's standards, the philosophical implications have had a big impact on the development of AI.
The Turing Test is defined as a game of questions and answers played by a human and a machine, with a human judge asking the questions. The judge's job is to examine the responses provided by the machine and the human player and decide which of the two is the machine. Turing predicted that by around the year 2000 we would have machines capable of passing as humans in a short conversation, which turned out to be optimistic. I mean just take ChatBots for example, even when we do not explicitly know that it's a ChatBot we are talking to, we are often able to see through its disguise and identify it as a program and not a real person.
Year 1957 : Perceptron
The Perceptron, deemed the first ever Neural Network, was designed this year by Frank Rosenblatt. Neural Networks are the foundation of a very popular subset of Machine Learning called Deep Learning, which is one of the most promising Machine Learning tools we have at our disposal today.
Year 1966 : MIT developed a Natural Language Processing program to act as a therapist. The program was called ELIZA, and it was quite a success experimentally, even though it was still using scripting to do its magic. Nonetheless it was a key milestone for the development of NLP - Natural Language Processing, which is again a subset of Machine Learning and is widely used today.
Year 1967 : The advent of the Nearest Neighbor algorithm, very prominently used in Search and Approximation. K-Nearest Neighbor or KNN is one of the most popular Machine Learning algorithms.
Year 1970 : Backpropagation takes shape. Backpropagation is an algorithm used extensively in Deep Learning; it works out how much each weight in a Neural Network contributed to the error so that the network can effectively correct itself. The underlying method was published by Seppo Linnainmaa, but at that time it was described as (reverse mode) Automatic Differentiation (AD) rather than Backpropagation.
Year 1980 : Kunihiko Fukushima successfully built a multilayered Artificial Neural Network (ANN) called the Neocognitron, which acted as a platform for the development of Convolutional Neural Networks down the line.
Year 1981 : Gerald DeJong built a new way to teach machines and called it Explanation Based Learning. This was a very early Machine Learning implementation; it processed data to create a set of rules, which is another way of saying that it created an algorithm.
Year 1989 : Reinforcement Learning is finally realized. The Q-Learning algorithm was developed by Christopher Watkins, which made it possible to use Reinforcement Learning in practical applications, for example, teaching a machine to play a risk vs reward game.
Year 1995 : Rise of 2 very important algorithms in the Machine Learning space; Random Forest Algorithm and Support Vector Machines.
Year 1997/98 : LSTM (Long Short-Term Memory) was introduced by Sepp Hochreiter and Jürgen Schmidhuber, and it went on to revolutionize NLP research and application. Along with this, the MNIST database was developed courtesy of a team led by Yann LeCun. The MNIST database is regarded as a benchmark for training Machine Learning algorithms for Handwriting Recognition.
Year 2006 : Geoffrey Hinton, regarded as the father of Deep Learning, popularized this very term this year. In the same year Netflix started a competition to improve the accuracy of its Recommender System in predicting user scores by 10%. This competition was won in 2009.
Year 2009 : ImageNet is created, which facilitated Computer Vision research by giving researchers access to a vast database categorized by objects and features. It was a project initiated by Fei-Fei Li from Stanford University.
Year 2010 till now : Google Brain and Facebook's DeepFace are now revolutionizing Machine Learning and pushing boundaries. Google Brain trained a Neural Network on frames from YouTube videos, and it learned to identify which of them contain a cat without ever being told what a cat is. Facebook's DeepFace, on the other hand, can now identify people with an accuracy figure exceeding 97%.
To understand the need for Machine Learning and its subsequent benefits, we need to go back to the roots.
Let's ask ourselves: what is a computer program?
Isn't it a set of rules applied to a certain input to get a desired output?
In other words explicitly programming a machine to do a task based on some parameters is what loosely defines traditional programming. While it has served us well till now, at the current pace of technological progress it is getting very complex and hard to write code for higher order problems.
To substantiate what I just said, let's compare 2 programs, one which has to deal with only 2 parameters and another which has to deal with n parameters, n being a very, very large number. Coding explicitly for the former seems plausible, and while it isn't impossible to code explicitly for the latter, it will be a mammoth task and the complexity will rise to a level where it will be very difficult to maintain such convoluted code.
With the data explosion in recent decades, accompanied by the advent of big data and huge strides in high performance computing, industry leaders across multiple disciplines are now asking bigger and better questions: how to improve customer experience (Entertainment), how to improve yield (Semiconductor), how to reduce wait times (E-Commerce) and how to improve diagnosis (Healthcare).
The graph above does a very good job of explaining what a Machine Learning Model encompasses.
To delve deeper - Machine Learning covers such a vast variety of techniques and algorithms, there is no simple answer to this question, but we definitely can deduce the essence of it considering an algorithm.
Linear Regression, Logistic Regression and Neural Networks pretty much work on the same core principles :
Supervised Learning is a technique in which a Machine Learning algorithm takes labeled data as input and then predicts the output for an unlabeled set of data. What that means is that along with a question we also provide our algorithm with the right answer, and then we let the algorithm figure out a relationship between the answer and the given question. Once the algorithm is able to figure this out, it can effectively use this knowledge to predict the answers for new questions which we feed it.
Deep Learning is most often used in a Supervised Learning setting and has been very successful at things like Object Detection and analyzing medical scans for tumors, to name a few.
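The "question plus right answer" idea can be sketched with the simplest Supervised Learning algorithm there is, Linear Regression with a single input. This is only a sketch under stated assumptions: the data is made up and noise free (it follows y = 2x + 1 exactly), and `fit_line` and `predict` are hypothetical helper names, not part of any library.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Answer a new 'question' using the learned relationship."""
    slope, intercept = model
    return intercept + slope * x


# Labeled data: each question x comes with its right answer y (illustrative values).
questions = [1.0, 2.0, 3.0, 4.0, 5.0]
answers = [3.0, 5.0, 7.0, 9.0, 11.0]

model = fit_line(questions, answers)
print(predict(model, 6.0))  # 13.0, a question the algorithm has never seen
```

Real data is noisy, so the fitted line would only approximate the answers, but the workflow stays the same: learn from labeled pairs first, then predict for unlabeled inputs.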
Unlike Supervised Learning, in Unsupervised Learning the algorithm is not given labeled data, meaning it does not have an answer to a given question. Or more aptly put, we do not provide any context to the algorithm about the data; the algorithm is expected to mine that data and derive patterns and relationships to formulate the context itself.
Anomaly Detection is a type of Unsupervised Learning technique used to detect fraudulent transactions in the Finance Industry.
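As a rough illustration of Anomaly Detection without labels, the sketch below flags transaction amounts that sit far away from the typical value, using the median and the median absolute deviation (MAD). It is a simple statistical stand-in, not what any real bank runs: the amounts are invented, `find_anomalies` is a hypothetical name, and the cut-off of 3.5 is a commonly used convention that I am assuming here.

```python
import statistics


def find_anomalies(amounts, cutoff=3.5):
    """Return amounts whose modified z-score exceeds the cutoff.

    No labels are used: 'normal' is inferred from the data itself.
    """
    median = statistics.median(amounts)
    mad = statistics.median(abs(a - median) for a in amounts)
    if mad == 0:
        return []  # every amount is (almost) identical, nothing stands out
    # 0.6745 rescales the MAD so the score is comparable to a standard z-score.
    return [a for a in amounts if 0.6745 * abs(a - median) / mad > cutoff]


# Illustrative card transactions; nobody told the algorithm which one is fraud.
transactions = [25.0, 30.5, 27.8, 31.2, 29.9, 26.4, 28.7, 950.0, 30.1, 27.3]
print(find_anomalies(transactions))  # [950.0]
```

The median is used instead of the mean because a single huge transaction would drag the mean towards itself and partly hide the very thing we are trying to detect.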
Autoencoders are a technique used for compression, but they do more than that: an Autoencoder is capable of capturing the essence of the object it's compressing. It uses this essence to recreate the original object from the compressed version with a very high degree of accuracy. One example is removing noise from an image.
I am sure most of us have played a video game or two while growing up (or still do). In those games, whenever we did well, we were rewarded with some coins, a new ability or a high score. This constant feedback gave us an incentive to try and try again to get better at the game, and Reinforcement Learning algorithms work on the same principle. The goal here is to improve the performance of a machine by providing it cues about how it's doing: if it does well, we reward it, and if it does badly, we don't. Repeating this numerous times steadily improves the performance of the algorithm.
Video Games are a great example of where these techniques are being used and researched the most. I mean it is now possible to teach a computer program to successfully beat the very famous game Doom.
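The reward loop can be sketched with Q-Learning, the algorithm mentioned in the history section, on a toy game far simpler than Doom. The game is my own invention for illustration: a corridor of 5 cells where the player starts in cell 0 and gets a reward of 1 for reaching cell 4. The learning rate, discount and exploration values are arbitrary assumptions, and `step` and `train` are hypothetical names.

```python
import random

N_STATES = 5        # corridor cells 0..4; reaching cell 4 wins the game
ACTIONS = (-1, +1)  # action 0 steps left, action 1 steps right


def step(state, action):
    """Move inside the corridor and hand out the reward."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-Learning: q[state][action] estimates the future reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            # Explore sometimes (and whenever there is no preference yet),
            # otherwise take the action that currently looks best.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                a = rng.randrange(2)
            else:
                a = 0 if q[state][0] > q[state][1] else 1
            nxt, reward, done = step(state, ACTIONS[a])
            # Nudge the estimate towards reward + discounted best future value.
            target = reward + (0.0 if done else gamma * max(q[nxt]))
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
    return q


q_table = train()
policy = ["left" if left > right else "right" for left, right in q_table[:-1]]
print(policy)
```

Nobody tells the program that walking right is the winning strategy. It only ever sees the reward at the end, and repeating the game many times is enough for the preferred action in every cell to become "right".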
Despite Machine Learning's newfound success, it is not the be all end all solution to all of our problems.
Machine Learning is plagued by the following limitations :
The effectiveness of Machine Learning algorithms depends on the amount of data you can provide them. Sometimes this need is very high, and finding that much labeled data to train the algorithm will not be an easy task. And even if we could find a lot of data for our use case, it's possible to run into a dead end when the algorithm faces an unforeseen situation where the previous data cannot help it produce the desired output.
It should also be noted that if the data set being fed to the algorithm itself has discrepancies or inadequacies then the algorithm's output will also be less than ideal.
By verifiability we mean whether a system's inner workings are clear and understandable. As is often the case with complex Machine Learning algorithms, even the best of researchers struggle to diagnose and understand the key points affecting the decisions made by the algorithms. The best example of this would be a Convolutional Neural Network, or CNN for short; it's so utterly complex in its working that if you aren't careful, you could start rolling down a rabbit hole.
In short, unlike traditional programs which can be reverse engineered relatively easily to understand key metrics, Machine Learning algorithms are a much tougher nut to crack.
Machine Learning algorithms need a lot of time and computing power to reach an acceptable level of performance, and even with state of the art technology, models for complex problems can take months to train properly. This is not an optimal situation, as developers quite often realize late in the training process that they could improve the algorithm. By then, thanks to the long time taken to iterate over just 1 version of the Machine Learning algorithm, they have already lost a lot of time.
AI or even Machine Learning could be categorized as Top-Down or Bottom-Up in nature, and the jury is still out on which is the best type of approach here.
To flesh it out a little,
A Bottom-Up AI approach means we have a good grasp of the underlying logic which dictates what our algorithm does. Think of it like having a brain and understanding that the brain works by using billions of cells called neurons, which collectively make up a very complex neural network. Deep Learning is a very prominent example of this.
On the other hand, the Top-Down approach says that we have a good understanding of what we want the algorithm to do, but we do not care about its inner workings. Taking the example of a brain again, let's say we write a few rules and give those rules to our brain; based on those rules and the question being asked, the brain will provide us an answer which adheres to the rules we had previously given it. A good example of this would be Reinforcement Learning.
Machine Learning has invaded most of our daily lives and we barely notice it. Let's go through a few examples which show the extent to which Machine Learning has become part of our daily routine.
Spotify, the music streaming platform, is liked by most of its users thanks to its Machine Learning algorithm, which does a pretty good job of understanding what the user likes. It's definitely not as simple as understanding what genre you like and then picking popular music from that genre for you; it goes much deeper than that. You know you have got a gem of an algorithm if it can segregate music by sub genre, tone, tempo and the general mood of the song. This is why, despite there being competing services out there which arguably provide higher bit rate music, people tend to stick with Spotify.
Ever wondered how Amazon is able to suggest products which are pretty close to something you might be interested in? Yep, you guessed it right, that's Machine Learning at work. It basically tracks your past purchases and browsing patterns to create a profile, based on which it is able to suggest useful products.
Facebook is investing heavily in AI and the results are quite evident to someone who has a keen eye. How do you think Facebook is able to tag people in photographs and suggest people you might know? The former is a technique called Face Recognition, and the latter is more complicated than that, but for simplicity's sake, let's call it a Recommender System.
Spam has become quite an issue in the last decade or so. With the emergence of cheap and accessible internet, people keep coming up with new ways to scam you or to gather data, bombarding you with information you neither need nor desire. In response, Google and other mailing platforms employ Spam Detection mechanisms which segregate arriving mails into 2 classes, Spam and Not Spam, and send them to the appropriate mail folders.
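To give a feel for how mails can be split into the 2 classes, here is a tiny Naive Bayes classifier, a classic textbook approach to spam filtering. I am not claiming this is what Google or any other provider actually uses; the four training mails are invented, and `train_naive_bayes` and `classify` are hypothetical names.

```python
import math
from collections import Counter

LABELS = ("spam", "not spam")


def train_naive_bayes(examples):
    """Count how often each word appears in each class of mail."""
    word_counts = {label: Counter() for label in LABELS}
    mail_counts = Counter()
    for text, label in examples:
        mail_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = set().union(*word_counts.values())
    return word_counts, mail_counts, vocabulary


def classify(text, model):
    """Pick the class with the highest log-probability for this mail."""
    word_counts, mail_counts, vocabulary = model
    total_mails = sum(mail_counts.values())
    scores = {}
    for label in LABELS:
        score = math.log(mail_counts[label] / total_mails)
        total_words = sum(word_counts[label].values())
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


# Illustrative labeled mails; a real filter would train on far more data.
training_mails = [
    ("win free money now", "spam"),
    ("free prize claim now", "spam"),
    ("meeting at noon tomorrow", "not spam"),
    ("lunch with the team tomorrow", "not spam"),
]
model = train_naive_bayes(training_mails)
print(classify("claim your free money", model))  # spam
print(classify("team meeting tomorrow", model))  # not spam
```

Note that this is Supervised Learning again: every training mail comes with its label, and the classifier only generalizes from the words it has seen in each class.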
We have all used either Siri or Google Assistant in our daily lives and know how incredibly useful those assistants are. Isn't it great that they can understand you irrespective of your dialect or accent? This application of Machine Learning is called Natural Language Processing, or NLP for short, and it's getting better day by day. Soon it will be integrated with all our devices and, with the help of IoT, it has the potential to revolutionize our lives.
Banking is one area which requires extra measures to build a brand and assure customers of great security in this day and age of technology. Fraud is a big concern for any financial institution, and Machine Learning is being employed in this sector to identify fraudulent transactions. Banks have a really large user base these days and they simply do not have the bandwidth to have an actual person monitor each and every transaction, so Fraud Detection systems come into play and alert bank employees, who in turn get in touch with the customers to verify whether the transaction was legitimate.
Self-driving cars come last on this list, but they are one of the most important Machine Learning systems in development right now. Just imagine not having to spend all that time behind the wheel and using it instead to work, relax or have fun, without the risk of violating any traffic rules or being in an accident. This tech has the potential to get rid of unnecessary traffic jams and the untimely deaths caused each year by accidents. Most major players in the automobile industry are investing heavily in this right now.
It's unfortunate to see how even today people tend to confuse AI with ML and use the terms interchangeably. It's like saying there is no difference between calling a potato a potato and calling a potato a vegetable. To clarify, vegetable loosely represents AI and potato loosely represents ML. With this it should be clear that ML is a subset of AI, which has a much broader scope and an inadequately defined boundary.
Let's discuss the superset first, i.e. AI. AI is defined as a field of science working towards creating computers, machines and systems capable of displaying intelligence close to that of humans. From a simple Chess playing computer program to a highly sophisticated self driving car program, all of these fall under the term AI. AI has no definite scope or solid definition because it's a rapidly evolving field made up of numerous disciplines which come together to loosely define it. It also doesn't help that this is one of the fastest growing fields of science right now.
To summarize and provide some much needed clarity, any program which can autonomously learn, act, react, adapt and evolve without human intervention and can think and reason like us would be called an AI program.
ML is a subset of AI which primarily deals with the creation of algorithms which learn with experience and improve themselves over time by feeding on data. They do not need to show human like intelligence to be regarded as Machine Learning systems; they simply need to showcase a self improving quality without being explicitly programmed to do so. A simple example of this would be Image Recognition algorithms: the more images, and the more variations of those images, are fed to such an algorithm, the more it improves.
To summarize, ML systems are systems which self improve given adequate data to improve their accuracy.
To drive the point home, philosophically speaking, we can say that AI bestows wisdom to a machine while ML bestows knowledge.
Machine Learning has become such a widely adopted area of research and development that programmers coming from various backgrounds can quickly get started.
To name a few popular programming languages which have adequate community support :
Conclusion and Summary
I am sure everyone must be tired by now after reading through it all, so I promise I will keep this short. AI and ML are going to revolutionize every industry in the coming decades, and the market is ripe for the taking. This has created a growing need for good engineers who will become the driving force behind AI and ML.
I wrote this article to spread awareness about the topic, to get people interested in it, and to help those who want to put their first foot in but are uncertain and afraid of the scope. I can empathize with them: the technology is still growing rapidly, we do not have streamlined paths for aspiring engineers, and there is no common standard in the community when it comes to the choice of algorithms and tools, which all adds to the complexity.
But if you followed this article you should be good to go.