Most likely, when you select a video to watch on YouTube, related videos or ads will appear on your home page. Your personalized playlists on Spotify are also generated automatically with songs you like. You may think: “Damn! How do they know that?” And after googling how companies predict our preferences, you may find theories featuring alien machines that can see the future, or magical creatures that brainwash you at night. But the truth is that companies use a tool that comes from applied science: machine learning.
Machine learning is an area within what’s known as Artificial Intelligence. In simple words, machine learning involves creating programs that are able to generalize behavior from input information in the form of examples. It’s a process that infers knowledge.
There are many ways to train programs to generalize from information and predict behaviors. We will focus on three broad approaches: supervised learning, unsupervised learning and reinforcement learning.
Supervised learning is the branch of machine learning used when we want to obtain a prediction and have data with known answers against which we can validate the result. The model is trained with those known values, which gives it a greater chance of success. For example, we could feed the model pictures of different aircraft so that it learns what they look like and can determine whether an aircraft appears in a given image. Another example: if you have a database of the products purchased by the customers of a supermarket, you can feed the model the kinds of things each customer has and hasn’t purchased, which helps predict whether a certain customer will purchase a certain product. In addition, if you want to advertise a specific product, you can estimate how likely people are to actually want it, so you know whether it’s worth the cost of advertising it to them.
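To make the supermarket example a bit more concrete, here is a minimal sketch of a supervised model. It uses a k-nearest-neighbors vote, which is only one of many possible techniques, and everything in it is an assumption made for illustration: the features (`bought_pasta`, `bought_tomato_sauce`, `weekly_visits`), the label (whether the customer bought cheese) and the six training rows are all made up.

```python
from collections import Counter

# Hypothetical labeled history, invented for this example.
# Features: (bought_pasta, bought_tomato_sauce, weekly_visits)
# Label: 1 if the customer also bought cheese, 0 otherwise.
TRAINING_DATA = [
    ((1, 1, 3), 1),
    ((1, 0, 2), 1),
    ((1, 1, 1), 1),
    ((0, 0, 1), 0),
    ((0, 1, 1), 0),
    ((0, 0, 3), 0),
]


def squared_distance(a, b):
    """Squared Euclidean distance between two feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_neighbors(features, training_data, k):
    """Return the k labeled examples closest to the given features."""
    ranked = sorted(training_data, key=lambda row: squared_distance(row[0], features))
    return ranked[:k]


def predict(features, training_data=TRAINING_DATA, k=3):
    """Predict the label by majority vote among the k nearest known customers."""
    votes = Counter(label for _, label in nearest_neighbors(features, training_data, k))
    return votes.most_common(1)[0][0]


def purchase_likelihood(features, training_data=TRAINING_DATA, k=3):
    """Rough estimate: the share of the k nearest known customers who bought."""
    neighbors = nearest_neighbors(features, training_data, k)
    return sum(label for _, label in neighbors) / k


new_customer = (1, 1, 2)  # buys pasta and sauce, visits twice a week
print(predict(new_customer), purchase_likelihood(new_customer))
```

The likelihood estimate is the kind of number you could compare against the cost of an ad before deciding whether to show it. With real data you would also hold back some labeled examples to check how often the predictions are right.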
Unsupervised learning is the branch of machine learning that uses data that hasn’t been labeled, classified or categorized. It learns how to categorize by relating data points to each other, and then it can tell whether a new input belongs to one of the categories it has previously inferred.
Often, you cannot tell in advance how the classification will turn out or what it will mean. They say that data scientists are sort of alchemists, like Edward Elric. “Similar” data are placed in the same category, based on a similarity criterion defined by the person in charge of training the model, limited only by his or her imagination. The results produced by these algorithms are often surprising. They can discover categories or reveal unknown relationships between data, and they can also show which data don’t belong to any category because they aren’t “similar” to any others. In general, uncategorized data are anomalous.
Unsupervised learning could be useful for analyzing purchases made through a specific electronic payment method. The categories found may or may not be important (they can reveal types of expenditure and buyers) and may or may not be useful for understanding customers. But the data left unclassified are more interesting, precisely because they are anomalous: they can expose illegal transactions, fraud, phishing, etc., and can prompt the system to block the user or get in touch with him or her to prevent other fraudulent transactions.
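The payment example can be sketched in a few lines. This is a deliberately simple, assumed setup: the only feature is the payment amount, the grouping rule is a gap threshold on sorted amounts (a one-dimensional, single-linkage style of clustering), and the amounts, `max_gap` and `min_size` are invented values. Real systems would use many more features and a more robust algorithm.

```python
def group_by_gaps(amounts, max_gap):
    """Group sorted amounts; a new group starts whenever the gap exceeds max_gap."""
    groups = []
    for amount in sorted(amounts):
        if groups and amount - groups[-1][-1] <= max_gap:
            groups[-1].append(amount)  # close enough to the previous amount
        else:
            groups.append([amount])  # too far away: start a new group
    return groups


def split_categories_and_anomalies(amounts, max_gap=10.0, min_size=3):
    """Groups with at least min_size members are categories; the rest are anomalies."""
    groups = group_by_gaps(amounts, max_gap)
    categories = [group for group in groups if len(group) >= min_size]
    anomalies = [amount for group in groups if len(group) < min_size for amount in group]
    return categories, anomalies


# Hypothetical payment amounts: no labels are given, only the raw values.
PAYMENTS = [12.5, 9.9, 15.0, 11.2, 98.0, 102.5, 95.4, 101.0, 2500.0]

categories, anomalies = split_categories_and_anomalies(PAYMENTS)
print("categories:", categories)
print("anomalies:", anomalies)
```

Nobody told the program that there are "small" and "medium" purchases; it inferred those two groups from the data, and the payment that fits neither group is the one worth a closer look.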
Reinforcement learning is the branch of machine learning based on “trial and error”: the algorithm tries different actions to work out the best way to achieve a specific goal. It’s quite useful for self-driving systems and even for playing video games.
One of the disadvantages of reinforcement learning compared to supervised and unsupervised learning is that it is harder to train, given that it learns from positive and negative stimuli, just like a pet does. Besides, it’s not common to find problems where applying it is strictly necessary.
It could be useful, for example, to recommend songs or movies. The model learns the characteristics of different songs and movies and suggests a few to the user, who can then accept the recommendations or not; each acceptance or rejection is feedback the model learns from. A new user, for whom there is no feedback yet, could receive recommendations from a supervised learning model that generalizes from the majority of users.
Reinforcement learning is mainly used in research because it’s expensive to train: in its purest form, it must start from scratch, and achieving useful results can take a long time. In some cases, if combined with other types of machine learning, it can use some information as a starting point so that the learning curve is not as long and complex.
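The song recommendation idea can be illustrated with one of the simplest reinforcement learning setups, an epsilon-greedy "multi-armed bandit". This is only a sketch under stated assumptions: the genres, the hidden acceptance probabilities in `HIDDEN_TASTE`, and the `rounds`, `epsilon` and `seed` values are all invented, and a simulated user stands in for a real one. The model starts from scratch and learns only from accepted (reward 1) and rejected (reward 0) recommendations.

```python
import random


def run_recommender(accept_probability, rounds=2000, epsilon=0.1, seed=0):
    """Learn which item to recommend purely from accept/reject feedback."""
    rng = random.Random(seed)
    items = list(accept_probability)
    shown = {item: 0 for item in items}    # how often each item was recommended
    value = {item: 0.0 for item in items}  # running average reward per item

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: try a random item to keep learning about all of them.
            item = items[int(rng.random() * len(items))]
        else:
            # Exploit: recommend the item that looks best so far.
            item = max(items, key=lambda candidate: value[candidate])

        # Simulated user: accepts with a probability the model never sees directly.
        reward = 1.0 if rng.random() < accept_probability[item] else 0.0

        # Update the running average with the new positive or negative stimulus.
        shown[item] += 1
        value[item] += (reward - value[item]) / shown[item]

    return value, shown


# Hypothetical taste of one simulated user (chance of accepting each genre).
HIDDEN_TASTE = {"rock": 0.2, "jazz": 0.5, "pop": 0.8}

value, shown = run_recommender(HIDDEN_TASTE)
best = max(value, key=value.get)
print("learned values:", value)
print("recommendations shown:", shown)
print("favorite so far:", best)
```

Notice how many rounds are spent on weak recommendations before the estimates settle. That wasted effort is a small-scale picture of why training from scratch is expensive, and why starting from information supplied by another model can shorten the learning curve.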
How Could I Use This in My Business?
You are the only one who can answer this question, but there is a wide variety of possibilities, including applications nobody has imagined yet. The key is to understand that these are tools that can be implemented in virtually all areas of a business.
In our opinion, two things are crucial for a successful implementation. The first is the business information you have collected or are in the process of collecting. The second is a team of data scientists able to choose the right algorithms and help train them. A machine learning implementation takes time, so you won’t see results immediately. Nevertheless, companies that manage to complete the implementation will see machine learning quickly become a handy resource.