Bayes Theorem: The Basis for Self-Driving Cars and Other Machine Learning Applications

Photo by Tomáš Malík from Pexels

Bayes theorem, formulated by Thomas Bayes in the 18th century, describes a simple and powerful method for updating the probability of a belief/hypothesis given a new piece of evidence/observation.

Throughout history, the Bayes theorem has been applied to locate lost nuclear bombs, and it is the basis for a family of machine learning algorithms (classifiers). It's used in spam filtering, self-driving cars, to assess financial risk, and more.

These algorithms estimate the probability of an event occurring and can therefore make better decisions.

Let’s go through a classic example.

Imagine you’re not feeling well and suspect it could be something serious. You take a test, which unfortunately comes back positive. Considering the test has a 95% accuracy rate — what’s the probability of having the disease given that you tested positive? That’s the key question.

Faced with such facts, it’s common to panic and jump to conclusions. It may sound reasonable to think there is a 95% chance, or close to it, of having the disease — that’s the accuracy of the test, after all.

But there is more to the story.

Before I continue, let me be specific about the 95% accuracy of this test: it means that 95 out of 100 people who have the disease will test positive, and 95 out of 100 who don’t have it will test negative. In other words, the false-positive rate is 5% — we’ll need that number in the calculation too.

As per Bayes theorem, to more accurately calculate the probability, you have to account for three components which are then plugged into this formula: P(B|E) = P(E|B)*P(B)/P(E)

P=probability; B=belief; E=evidence. Let’s look at each one.

  • P(B) or P(person has the disease). What’s the probability of the person having the disease (the belief)? It’s called the “prior probability”. Let’s say 2 out of 100 people in the population have the condition. Then the probability is 0.02 or 2%.
  • P(E|B) or P(person tests positive | person has the disease). What’s the probability of testing positive given the person has the disease? As the test has 95% accuracy, the value is 0.95.
  • P(E) or P(person tests positive). What’s the probability of someone testing positive, whether or not they have the disease? This is the sum of two terms: having the disease and testing positive (the “true positive” term — the same product as the numerator) and not having the disease but testing positive anyway (the “false positive” term). For the latter, you multiply the fraction of the population that doesn’t have the disease, which is 1 − 0.02 = 0.98, by the false-positive rate, which is 0.05.
P(B|E) = (0.95 * 0.02) / ((0.95 * 0.02) + (0.98 * 0.05))
= 0.019 / (0.019 + 0.049)
= 0.019 / 0.068
= 0.2794 (about 28%)
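The calculation above can be sketched in a few lines of Python (the variable names are mine; the numbers are the article’s assumed prior, sensitivity, and false-positive rate):

```python
# Bayes theorem for the disease-test example.
prior = 0.02           # P(B): base rate of the disease in the population
sensitivity = 0.95     # P(E|B): probability of testing positive given the disease
false_positive = 0.05  # P(E|not B): probability of testing positive without it

true_pos_term = sensitivity * prior             # 0.95 * 0.02 = 0.019
false_pos_term = false_positive * (1 - prior)   # 0.05 * 0.98 = 0.049
posterior = true_pos_term / (true_pos_term + false_pos_term)  # P(B|E)

print(round(posterior, 4))  # 0.2794
```

Note how the denominator, P(E), is just the numerator plus the false-positive term — that’s why the result is far below 95%.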

The power of the Bayes theorem is most noticeable when you apply the formula iteratively. As you recalculate the probability with each new piece of evidence, the estimate gets more and more accurate. If a second test also comes back positive, the probability of having the disease jumps from about 28% to about 88%! That’s because, after the first test, the prior P(B) becomes 0.28 — it’s not 0.02 anymore.
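The iterative update can be sketched by feeding each posterior back in as the next prior (a minimal illustration, assuming the two tests are independent and both come back positive):

```python
def update(prior, sensitivity=0.95, false_positive=0.05):
    """Return the posterior P(disease | positive test) for a given prior."""
    true_pos_term = sensitivity * prior
    false_pos_term = false_positive * (1 - prior)
    return true_pos_term / (true_pos_term + false_pos_term)

p = 0.02  # start from the population base rate
for test_number in (1, 2):
    p = update(p)  # yesterday's posterior becomes today's prior
    print(f"after positive test {test_number}: {p:.0%}")
# after positive test 1: 28%
# after positive test 2: 88%
```

This chaining of posteriors into priors is exactly what makes Bayesian updating useful for streams of evidence, such as sensor readings in a self-driving car.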

Bayes theorem is the basis for a supervised machine learning algorithm/classifier called Naive Bayes. In the next article, I’ll cover its implementation from scratch in Python.

Thanks for reading.


For further reading:

https://towardsdatascience.com/deep-learning-understand-the-principle-ad146d9f54dd
