Naive Bayes is a Machine Learning classifier based on Bayes' theorem of conditional probability. In this article, we will first understand conditional probability (Bayes' theorem) and how it translates to the Naive Bayes classifier, then work through the mathematics behind the classifier, and finally code it in Python. If you are interested in more classification algorithms, you can start here.
Bayes' Theorem

Named after the statistician Thomas Bayes, this theorem is also known as the theorem of conditional probability. It allows us to calculate the probability of a particular event GIVEN a set of prior conditions. For example, the probability that it will rain tomorrow GIVEN that it rained yesterday.
The formula for calculating conditional probability is shown below.

P(A | B) = P(A ∩ B) / P(B)
The term on the left-hand side is read as ‘the probability of event A occurring given that event B has occurred’. The term on the right-hand side is the probability of both events occurring together divided by the probability of event B occurring. The formula is quite straightforward. I will not be delving into its derivation or intuition, as this article is not about Bayes' theorem but rather the Naive Bayes classifier.
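To make the formula concrete, here is a quick sketch with made-up numbers (the counts are purely illustrative): suppose that out of 100 days, the previous day was rainy on 40, and 18 days were rainy days that followed a rainy day.

p_a_and_b = 18 / 100 #P(rain today AND rain yesterday)
p_b = 40 / 100 #P(rain yesterday)
print(p_a_and_b / p_b) #P(rain today | rain yesterday) = 0.45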
The Naive Bayes Classifier
Since we are classifying data into discrete labels, just like with any other classifier we have a set of input features as well as their corresponding output class. A Naive Bayes classifier calculates the probability of each class using the following formula.

P(y_1 | x_1, x_2, x_3) = ( P(x_1 | y_1) · P(x_2 | y_1) · P(x_3 | y_1) · P(y_1) ) / ( P(x_1) · P(x_2) · P(x_3) )
The left-hand side reads as ‘the probability that our output is y_1 given that our inputs were {x_1, x_2, x_3}’. Now suppose that our problem has a total of 2 classes, i.e. {y_1, y_2}. We use the above formula twice: first to calculate the probability of y_1 occurring and then of y_2 occurring. Whichever has the higher probability will be our predicted class.
An important point to note here is that this classifier makes one assumption: it assumes that every input feature is independent of the others, which is specifically why the term ‘Naive’ appears in the name. Under this assumption the likelihood factorises as P(x_1, x_2, x_3 | y) = P(x_1 | y) · P(x_2 | y) · P(x_3 | y), which is exactly the numerator of the formula above.
This is how Naive Bayes is used for classification.
Naive Bayes Python Implementation
Let’s start coding it in Python.
The entire code can be found in the following GitHub repository.
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
These are the libraries we need for working with the data. If you do not know how to work with Pandas, you might want to read about it here first.
Now let’s load our dataset. I have used the Heart Disease prediction dataset, which can be found on Kaggle.
data = pd.read_csv('heart-disease-data/heart.csv') #Read the dataset
data.head()

For a Naive Bayes classifier, we need discrete variables, since we cannot count probabilities over continuous values directly. So we need to drop the continuous columns here, such as chol (cholesterol) and trestbps (resting blood pressure).
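As an aside, an alternative to dropping the continuous columns would be to discretize them instead, for example with pandas’ cut. A minimal sketch, not used in the rest of this article; the choice of 4 bins is arbitrary:

data["chol"] = pd.cut(data["chol"], bins=4, labels=False) #bin cholesterol into 4 discrete buckets instead of dropping it

In this article, though, we will simply drop them.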
data.drop(["age", "trestbps", "chol", "thalach", "oldpeak", "slope"], axis=1, inplace=True) #drop the continuous columns
data.head()

X = data[data.keys()[:-1]] #input features
y = data[data.keys()[-1]] #output label
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.10, random_state=42) #train-test split
data_train = pd.concat([X_train, y_train], axis=1) #concat back into a training frame
data_test = pd.concat([X_test, y_test], axis=1) #and a test frame
Now we need to code the helper functions that will calculate all the necessary probabilities.
According to the formula, we need the probability of occurrence of every input feature and of the output class, as well as the conditional probability of every input given each class label.
First, we will calculate the probability for each input variable.
#Calculating the probability of each input value independently
def get_probabilities_for_inputs(n, column_name, data_frame):
    temp = data_frame[column_name] #isolate the targeted column
    temp = temp.value_counts() #count the occurrences of each value
    return (temp / n) #divide by the total number of data points to get probabilities
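As a quick check, we can call this on one column of the training data; it returns a pandas Series mapping each discrete value to its relative frequency:

#Illustrative usage: P(sex = v) for each value v in the training data
print(get_probabilities_for_inputs(len(data_train), 'sex', data_train))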
Next, we will calculate the conditional probabilities of each input given an output class.
#Calculating the conditional probability of an input given each output class
def get_conditional_probabilities(data_frame, n, target, given):
    focused_data = data_frame[[target, given]] #isolate the target column and the input column of interest
    targets_unique = data_frame[target].unique() #list of unique outputs in the data
    groups = focused_data.groupby(by=[given, target]).size().reset_index() #joint counts of every (input, output) pair
    groups[0] = groups[0] / n #joint probabilities P(input, output)
    for targets in targets_unique:
        current_target_prob = len(focused_data[focused_data[target] == targets]) / n #P(output)
        #P(input | output) = P(input, output) / P(output)
        groups[0] = np.where(groups[target] == targets, groups[0].div(current_target_prob), groups[0])
    return groups
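The result is a small DataFrame with one row per (input value, class) pair, where the column named 0 holds P(input value | class). A quick illustrative call, assuming the label column is named target, as it is in this dataset:

#Illustrative usage: P(sex = v | target = t) for every (v, t) pair in the training data
print(get_conditional_probabilities(data_train, len(data_train), 'target', 'sex'))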
Next, we will write our ‘fit’ function, which calculates and returns all the necessary probabilities that we will then use for making classifications.
def calculate_probabilities(data):
    #splitting the input data into features and labels
    x = data[data.keys()[:-1]]
    y = data[data.keys()[-1]]
    target = y.name
    #get the length of the dataframe
    n = len(data)
    #get probabilities for each individual input and for the output
    f_in = lambda lst: get_probabilities_for_inputs(n, lst, x)
    input_probabilities = list(map(f_in, x.keys()))
    output_probabilities = get_probabilities_for_inputs(n, target, y.to_frame())
    #get conditional probabilities for every input against every output
    f1 = lambda lst: get_conditional_probabilities(data, n, target, lst)
    conditional_probabilities = list(map(f1, data.keys()[:-1]))
    return input_probabilities, output_probabilities, conditional_probabilities
Now that we have all the necessary calculations done and out of the way, we need a function that gives us the output class label by applying the Naive Bayes formula we wrote above.
def naive_bayes_calculator(target_values, input_values, in_prob, out_prob, cond_prob):
    target_values.sort() #sort the target values to ensure ascending order
    classes = [] #initialise an empty list of class probabilities
    for target_value in target_values:
        num = 1 #initialise the numerator
        den = 1 #initialise the denominator
        #calculate the denominator according to the formula
        for i, x in enumerate(input_values):
            den *= in_prob[i][x]
        #calculate the numerator according to the formula
        for i, x_1 in enumerate(input_values):
            temp_df = cond_prob[i]
            num *= temp_df[(temp_df.iloc[:, 0] == x_1) & (temp_df.iloc[:, 1] == target_value)][0].values[0]
        num *= out_prob[target_value] #multiply by the prior once per class
        final_probability = (num / den) #final conditional probability value
        classes.append(final_probability) #append the probability for the current class
    return (classes.index(max(classes)), classes)
Now that we have all our functions out of the way, we can move on to running them and storing our calculations.
in_prob, out_prob, cond_prob = calculate_probabilities(data_train)#use training data for the initial calculations
#testing with dummy data
naive_bayes_calculator([1,0], [1,1,0,2,1,3,3],in_prob,out_prob,cond_prob)

We have our class prediction and the probabilities for each class inside a tuple.
Testing the Naive Bayes classifier
Now it’s time to test on our ‘test data’.
The following function takes a set of inputs and returns the predicted class for each in a list.
def naive_bayes_predictor(test_data, outputs, in_prob, out_prob, cond_prob):
    final_predictions = [] #initialise an empty list to store test predictions
    for row in test_data:
        #get the prediction for the current row
        predicted_class, probabilities = naive_bayes_calculator(outputs, row, in_prob, out_prob, cond_prob)
        #append it to the list
        final_predictions.append(predicted_class)
    return final_predictions
Now let’s calculate the accuracy.
test_data_as_list = X_test.values.tolist()
unique_targets = y_test.unique().tolist()
predicted_y = naive_bayes_predictor(test_data_as_list,unique_targets,in_prob,out_prob,cond_prob)
print("Accuracy:", (np.count_nonzero(y_test == predicted_y)/len(y_test)) *100)

An accuracy of 77.4% is certainly not a bad number, considering that we dropped some informative columns and that the naivety of the algorithm ignores any correlations between the input variables.
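As an optional sanity check (not part of the original walkthrough), we can compare against scikit-learn's CategoricalNB, which implements the same idea but applies Laplace smoothing by default, so its numbers will differ slightly:

from sklearn.naive_bayes import CategoricalNB

clf = CategoricalNB() #alpha=1 Laplace smoothing by default
clf.fit(X_train, y_train)
print("sklearn accuracy:", clf.score(X_test, y_test) * 100)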
Conclusion
Naive Bayes is a very simple classifier, provided you understand basic probability and the concept of inputs and outputs in Machine Learning. The algorithm does have certain shortcomings, such as ignoring dependencies between the input variables, but it is very simple to build and gives good results if your data meets its requirements.