Is machine learning too much math?

The 3 basics of AI (excluding math)

There is a lot of talk about machine learning and AI. Even so, many still think that machine learning is a complicated business. But it doesn't have to be.

If you've always been interested in these questions and want to find your own use cases for AI, here's what you need to know:

  1. What is machine learning?
  2. When can machine learning be used?
  3. What are the most common misunderstandings?

1. What is machine learning?

Let's say you want to fill a knowledge gap. You want to use the information you have to effectively predict other information. Conveying knowledge is usually a human task. But the exciting thing about machine learning is that computers can now learn something too.

Machine learning is just a tool: software that can help you find valuable patterns in large amounts of data. Once the software has identified and stored these patterns, you can use them to make predictions.

So machine learning is about automated predictions: automatically finding and using patterns.

Data → Patterns → Predictions

Example

Take a desired outcome (a customer buys a product) and some influencing factors (everything the customer has done before). One might ask: which behavior indicates that a customer will make a purchase, and which does not? The better your answer to this question, the better you will be at predicting each customer's future purchases.

Behavior → pattern → purchase

How do you find patterns without machine learning?

If you're not using machine learning, you can try getting the patterns from an expert. For example, what does your product manager think differentiates buyers from non-buyers?

Based on the answers to this question, you can write down some rules and work with a programmer to automate those rules with software. An example rule could be:

“Every user who has seen more than 4 products in the last 24 hours is marked as high potential”.
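As a rough illustration, the rule above could be automated along these lines. This is only a sketch: the function name, the constants, and the sample timestamps are made up for the example.

```python
from datetime import datetime, timedelta

# Hypothetical values taken from the example rule in the text.
MAX_VIEWS = 4
WINDOW = timedelta(hours=24)

def is_high_potential(view_times, now):
    """Return True if the user viewed more than MAX_VIEWS products within WINDOW."""
    recent = [t for t in view_times if now - WINDOW <= t <= now]
    return len(recent) > MAX_VIEWS

# Made-up sample data: five views within the last 24 hours, one older view.
now = datetime(2024, 1, 2, 12, 0)
views = [now - timedelta(hours=h) for h in (1, 2, 3, 5, 8, 30)]
print(is_high_potential(views, now))
```

Note that the threshold of 4 is hard-coded: someone had to pick it by hand.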

However, this approach has some drawbacks:

It's slow and expensive. Gathering the information and especially writing the software takes a lot of time and resources.

It's imprecise. Humans do not have a very precise idea of the best line between high and low potential: is it 4 product views or 7? The input you get is also influenced by human opinions, which means you won't find any surprising patterns.

It can't handle complexity. We're all bad at writing down very complex interactions, be it on paper or in software. Once you have 6 or more rules, it becomes difficult to add more without losing sight of the bigger picture.

Machine learning automates these steps

Machine learning can automatically find patterns in data and then write them into a rule set (also called a model).

Compared to human intuition, this procedure has clear advantages:

It's based on data. The algorithm has no opinions of its own. It picks up the patterns that are in the data, including quirky, surprising patterns that may not make sense to us at first glance.

It's quick and inexpensive. Even if you train a model on a huge, terabyte-sized data set, it usually doesn't cost you more than $10 in cloud computing. Often we're talking about a few cents.

It's smart. Computers are very good at accurate calculations and have perfect memories. So it's pretty easy for software to create large, interrelated rule sets. The models that a machine learning algorithm generates can easily have thousands of rules that capture many tiny patterns. And they only have one goal in mind: to make the most accurate predictions.
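To make "finding a pattern and writing it into a rule" concrete, here is a deliberately tiny sketch. It is not a real machine learning library, just a search for the single view-count threshold that best separates buyers from non-buyers. The function name and the toy data are invented for illustration.

```python
def learn_threshold(examples):
    """Find the view-count threshold that best separates buyers from non-buyers.

    examples: list of (product_views, bought) pairs.
    Returns (threshold, accuracy) for the rule "views > threshold means buyer".
    """
    best_threshold, best_accuracy = None, -1.0
    # Try every view count that occurs in the data as a candidate threshold.
    for threshold in sorted({views for views, _ in examples}):
        correct = sum((views > threshold) == bought for views, bought in examples)
        accuracy = correct / len(examples)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = threshold, accuracy
    return best_threshold, best_accuracy

# Made-up toy data: (product views in the last 24 hours, did the user buy?)
history = [(1, False), (2, False), (3, False), (4, False), (5, False),
           (6, False), (7, True), (8, True), (9, True), (10, True)]

threshold, accuracy = learn_threshold(history)
print(f"Learned rule: more than {threshold} views -> likely buyer "
      f"(accuracy on this data: {accuracy:.0%})")
```

Real algorithms search over many input factors at once and combine thousands of such small rules, but the principle is the same: the cut-off comes from the data, not from someone's guess.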

2. When can machine learning be used?

Machine learning is a powerful tool, but only if certain conditions are met:

  1. You need lots of examples
  2. The examples must contain the exact outcome that you want to predict
  3. The input data must be relevant
  4. Your data needs to be properly quantified

1. You need lots of examples

To be more specific, as a rule of thumb you need at least 500 examples of each of the outcomes you want to predict. If you want to predict whether a user will buy a particular product, you need at least 500 examples of users who bought that product and 500 examples of users who did not.
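A quick sanity check for this condition might look like the following sketch. The function name and the sample labels are hypothetical. The 500 is the rule of thumb from the text, not a hard limit.

```python
from collections import Counter

MIN_EXAMPLES_PER_OUTCOME = 500  # rule of thumb from the text

def outcomes_with_too_few_examples(labels, expected_outcomes,
                                   minimum=MIN_EXAMPLES_PER_OUTCOME):
    """Return {outcome: count} for every expected outcome below the minimum."""
    counts = Counter(labels)  # outcomes that never occur count as 0
    return {o: counts[o] for o in expected_outcomes if counts[o] < minimum}

# Made-up data: many non-buyers, but only a few buyers.
labels = ["bought"] * 120 + ["did_not_buy"] * 4000
print(outcomes_with_too_few_examples(labels, ["bought", "did_not_buy"]))
```

Here the check would flag the "bought" outcome, which is the typical situation: the outcome you care about is usually the rare one.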

2. The examples must contain the exact outcome that you want to predict

For example, you cannot expect an algorithm trained on how many users like a product to predict how many users will buy that product.

In contrast to humans, machine learning algorithms do not transfer what they have learned from one situation to a similar but fundamentally different one. It has to be the same situation: exactly the same input factors and the same predicted outcome.

3. The input data must be relevant

Machine learning can only find patterns if they are present in your data. The most important information for making a correct prediction needs to be in the data on which you run the model. Only then is the model able to find the right patterns.

For example, if you want to predict the next purchase a user will make, you must have their previous purchases as input data, as these naturally contain valuable information about their next purchase.

If you only have this information in the CRM while other information, e.g. how often the user logged in, sits in your web analytics data, then you should make the effort to connect these records.

There is a quick way to check which records you need as input factors: As a human, what information would you use to make a good prediction? What could be relevant?

Our brains are very powerful at subconsciously finding patterns, so our intuitions about what data could be useful are very good.

Conversely, if you believe that as a human being you are unlikely to be able to predict the outcome based on these inputs, this is a warning sign. Machine learning is unlikely to find anything in that case either, so this may not be a good context to apply it to.

The human subconscious is a master of pattern finding. We may take longer, but we're still better. If we can't find a pattern, an algorithm probably won't either.

4. Your data needs to be properly quantified

Machine learning algorithms can typically only work with data in tabular form, something that looks like an Excel spreadsheet.

The algorithm has neither our senses nor our understanding. For example, it cannot read text. It only sees text as numbers, such as counts of words and letters.

You need to properly prepare the input factors in a machine-readable format. This process is extremely important and is called feature engineering. Feature engineering is much more important than the algorithm you are working with. The better you prepare the data, the better the algorithm can learn and the better your predictions will be.
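As a minimal sketch of what such preparation can look like, the following turns a piece of free text into a row of numbers: the word count, the character count, and how often a few chosen words occur. The function name, the vocabulary, and the sample review are assumptions made for the example. Real projects use far richer features.

```python
import re
from collections import Counter

def text_to_features(text, vocabulary):
    """Turn free text into a list of numbers a model can work with."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    # Features: number of words, number of characters, then one count per vocabulary word.
    return [len(words), len(text)] + [counts[word] for word in vocabulary]

# Hypothetical vocabulary and review text.
vocabulary = ["refund", "great", "broken"]
review = "Great product, great price. No refund needed!"
print(text_to_features(review, vocabulary))
```

Each review becomes one row with the same columns, which is exactly the tabular form described above.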

3. What are the most common misunderstandings?

There are 3 widespread misunderstandings to be aware of:

  1. AI does not replace people
  2. The algorithm is not that important
  3. Machine learning is not a "learning system"

AI does not replace people

If a job is so important that it is currently being done by a human, don't try to automate it with AI.

To some extent, AI software is built the way we think our brain learns. But it is an extreme oversimplification of even the little we know about brains. Our brain is still far superior to any algorithm when it comes to learning and applying knowledge.

Comparing AI to a human brain is like comparing a car to the human body: we certainly don't see cars as competition for our bodies. Under very specific circumstances (long stretches of more or less flat ground) cars can cover distance faster, but they are no better at moving in general. They are just a tool that can do one thing well.

AI is good at finding patterns in well-ordered data sets, but that's about it.

Most human jobs involve work that AI is not good at:

  • We are constantly dealing with new situations.
  • We usually gather information from many different sources. The input factors for our work are not properly quantified and recorded in databases.
  • We are constantly interacting with other people who see the world more or less the way we do and who we can understand.

As a rule of thumb, if a job is so important that it is currently being done by a human, don't try to automate it with AI. You will most likely fail.

You can, however, use AI to replace rules written by humans. Use intelligent automation to replace naive automation.

The algorithm is not that important

Success or failure is not decided by the algorithm, but by the choice of problem, the data sets, and the data processing.

Algorithms get a lot of attention today. Products that use the latest types of "neural networks" are reportedly superior. But the truth is, the ROI of your machine learning project is not determined by the algorithm.

What determines the performance of your system is how well you pick the problem, which data you include, and how well you prepare your data (see feature engineering).

More than 80% of the effort in a machine learning project flows into these decisions. It's not often talked about, but it is precisely these decisions that determine success or failure.

After making the right decisions about the problem, the data, and the preparation, you can spend some time finding a very good algorithm and then a little more time tweaking that algorithm.

Machine learning is not a "learning system"

It's a popular notion that machine learning systems are designed to learn from feedback and improve over time. This continuous learning is supposed to be the genius of it, like an ongoing interaction with users, markets, etc.

It's an attractive idea because it sounds like the way we learn. But that's not how machine learning works in practice. Machine learning requires a large number of structured examples from which it can learn.

At a later point in time, perhaps after a few weeks when you have collected more data, you can train the model again. You can even train a new model every day. But these are small, incremental updates that don't make a big difference to the result.

The most important thing is the large amount of data that you have to begin with. If you don't have that amount of data, you won't be able to build AI into your product.

TL;DR

So machine learning is a great tool for finding complex patterns in your large data sets and then using those patterns for predictions.

If you are already using some automated decision making in your business, it is worth considering whether you can use machine learning to make those decisions more accurate.