Bayesian classifiers | Naive Bayes classifiers

July 22, 2023

What are Bayesian classifiers?

Bayesian classifiers are statistical classifiers. They can predict class membership probabilities, such as the probability that a given tuple belongs to a particular class.

Bayesian classification is based on Bayes’ theorem. Studies comparing classification algorithms have found a simple Bayesian classifier known as the naive Bayesian classifier to be comparable in performance with decision trees and selected neural network classifiers. Bayesian classifiers have also exhibited high accuracy and speed when applied to large databases.

Naive Bayesian classifiers assume that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence. It is made to simplify the computations involved and, in this sense, is considered ‘naive’. Bayesian belief networks are graphical models, which, unlike naive Bayesian classifiers, allow the representation of dependencies among subsets of attributes. Bayesian belief networks can also be used for classification.
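To see why the independence assumption simplifies the computation, it can help to count how many probabilities have to be estimated per class. The sketch below is only an illustration: the function name table_sizes and the choice of 10 attributes with 3 values each are assumptions, not something from the text above.

```python
import math

def table_sizes(domain_sizes):
    """Compare how many probabilities must be estimated for one class.

    domain_sizes -- number of distinct values of each attribute (hypothetical).
    """
    # Without the assumption: one probability per combination of attribute values.
    full_joint = math.prod(domain_sizes)
    # With class conditional independence: one probability per attribute value.
    naive = sum(domain_sizes)
    return full_joint, naive

# 10 attributes with 3 possible values each
full_joint, naive = table_sizes([3] * 10)
print(full_joint, naive)  # 59049 30
```

The count for the full table grows as a product over the attributes, while the naive count grows only as a sum.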

Explain the Naive Bayes classifier with an example.

The naive Bayesian classifier, or simple Bayesian classifier, works as follows:

(i) Let D be a training set of tuples and their associated class labels. As usual, each tuple is represented by an n-dimensional attribute vector, X = (X₁, X₂, …, X_n), depicting n measurements made on the tuple from n attributes, respectively, A₁, A₂, …, A_n.

(ii) Suppose that there are m classes, C₁, C₂, …, C_m. Given a tuple, X, the classifier will predict that X belongs to the class having the highest posterior probability, conditioned on X. That is, the naive Bayesian classifier predicts that tuple X belongs to the class C_i if and only if

P(C_i|X)>P(C_j|X) for 1 ≤ j ≤ m, j≠i

Thus we maximize P(C_i|X). The class C_i for which P(C_i|X) is maximized is called the maximum a posteriori hypothesis. By Bayes’ theorem,

P(C_i|X) = P(X|C_i) P(C_i) / P(X)

(iii) As P(X) is constant for all classes, only P(X|C_i) P(C_i) needs to be maximized. If the class prior probabilities are not known, then it is commonly assumed that the classes are equally likely, that is, P(C₁) = P(C₂) = … = P(C_m), and we would therefore maximize P(X|C_i). Otherwise, we maximize P(X|C_i) P(C_i).

(iv) Given data sets with many attributes, it would be extremely computationally expensive to compute P(X|C_i). In order to reduce computation in evaluating P(X|C_i), the naive assumption of class conditional independence is made. This presumes that the values of the attributes are conditionally independent of one another, given the class label of the tuple. Thus,

P(X|C_i) = P(X₁|C_i) × P(X₂|C_i) × … × P(X_n|C_i)

(v) In order to predict the class label of X, P(X|C_i) P(C_i) is evaluated for each class C_i. The classifier predicts that the class label of tuple X is the class C_i if and only if

P(X|C_i) P(C_i) > P(X|C_j) P(C_j) for 1 ≤ j ≤ m, j ≠ i

In other words, the predicted class label is the class C_i for which P(X|C_i) P(C_i) is the maximum.
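Steps (i) to (v) can be sketched in a few lines of Python for categorical attributes. This is a minimal illustration, not a reference implementation: the function names train and predict and the tiny training set (outlook, windy → play) are hypothetical, and the probabilities are plain relative frequencies with no smoothing.

```python
from collections import Counter, defaultdict

def train(D):
    """D is a list of (X, label) pairs; X is a tuple of categorical values.

    Returns the priors P(C_i) and the tables P(X_k = v | C_i),
    both estimated as relative frequencies in D.
    """
    class_counts = Counter(label for _, label in D)
    # value_counts[label][k][v] = number of tuples of class `label` with X_k == v
    value_counts = defaultdict(lambda: defaultdict(Counter))
    for X, label in D:
        for k, v in enumerate(X):
            value_counts[label][k][v] += 1

    priors = {c: n / len(D) for c, n in class_counts.items()}
    cond = {
        c: {k: {v: n / class_counts[c] for v, n in counter.items()}
            for k, counter in value_counts[c].items()}
        for c in class_counts
    }
    return priors, cond

def predict(X, priors, cond):
    """Return the class maximizing P(X|C_i) P(C_i), plus the score of every class."""
    scores = {}
    for c, prior in priors.items():
        score = prior
        for k, v in enumerate(X):
            # A value never seen together with class c gets probability 0 (no smoothing).
            score *= cond[c][k].get(v, 0.0)
        scores[c] = score
    return max(scores, key=scores.get), scores

# Hypothetical training set: (outlook, windy) -> play
D = [
    (("sunny", "no"), "yes"),
    (("sunny", "no"), "yes"),
    (("rainy", "no"), "yes"),
    (("sunny", "yes"), "yes"),
    (("rainy", "yes"), "no"),
    (("rainy", "no"), "no"),
]
priors, cond = train(D)
label, scores = predict(("sunny", "no"), priors, cond)
print(label, scores)
```

For the tuple ("sunny", "no"), the score for class "yes" is 4/6 × 3/4 × 3/4 = 0.375. The score for class "no" is 0, because no "no" tuple in this toy set has outlook "sunny". The sketch therefore predicts "yes". The zero score also shows a limitation of using raw frequencies: a single unseen attribute value wipes out the whole product for that class.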
