Unsupervised Learning in Machine Learning
Unsupervised learning: Unlike supervised learning, unsupervised learning has no labeled training data to learn from and no prediction to be made. The objective is to take a dataset as input and find natural groupings or patterns among its data elements or records. For this reason, unsupervised learning is often termed a descriptive model, and the process is referred to as pattern discovery or knowledge discovery. One critical application of unsupervised learning is customer segmentation.
Clustering is the main type of unsupervised learning. It aims to group or organize similar objects together, so that objects belonging to the same cluster are quite similar to each other while objects belonging to different clusters are quite dissimilar. Hence, the objective of clustering is to discover the intrinsic grouping of unlabelled data and form clusters, as depicted in Figure 1.7. Different measures of similarity can be applied for clustering. One of the most commonly adopted similarity measures is distance. Two data items are considered part of the same cluster if the distance between them is small. In the same way, if the distance between the data items is large, the items generally do not belong to the same cluster. This is known as distance-based clustering. Figure 1.8 depicts the process of clustering at a high level.
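To make the idea of distance-based clustering concrete, here is a minimal sketch in plain Python. It uses k-means, which is only one of several distance-based algorithms and is chosen here purely for illustration; the text above does not name a specific algorithm. The sample points, the hand-picked initial centroids, and the function names `euclidean` and `kmeans` are all assumptions made for this example.

```python
import math

def euclidean(p, q):
    """Straight-line distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def kmeans(points, centroids, iterations=10):
    """Tiny k-means sketch: assign each point to its nearest centroid,
    then move each centroid to the mean of the points assigned to it."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: a point joins the cluster whose centroid is closest.
        labels = [
            min(range(len(centroids)), key=lambda k: euclidean(p, centroids[k]))
            for p in points
        ]
        # Update step: recompute each centroid as the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(coord) / len(members) for coord in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # assignments have stabilized
        centroids = new_centroids
    return labels, centroids

# Hypothetical 2-D data: two visibly separate groups of points.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]

# Initial centroids are picked by hand (one point from each group) to keep
# the example deterministic; real implementations usually choose them randomly.
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
print(centroids)
```

Points that lie close together end up with the same label, while points far apart receive different labels, which is the behavior described above. In practice a library implementation would normally be preferred over a hand-written loop like this one.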
Besides clustering data and getting a summarized view of it, another variant of unsupervised learning is association analysis, in which associations between data elements are identified. Let’s try to understand the approach in the context of one of its most common examples, market basket analysis, as shown in Figure 1.9. From past transaction data in a grocery store, it may be observed that most of the customers who bought item A also bought item B and item C, or at least one of them. This means that there is a strong association of the event ‘purchase of item A’ with the event ‘purchase of item B’ or ‘purchase of item C’. Identifying these sorts of associations is the goal of association analysis. It helps boost the sales pipeline and is therefore a critical input for the sales group. Critical applications of association analysis include market basket analysis and recommender systems.
TransID | Items Bought |
1 | {Butter, Bread} |
2 | {Diaper, Bread, Milk, Beer} |
3 | {Milk, Chicken, Beer, Diaper} |
4 | {Bread, Diaper, Chicken, Beer} |
5 | {Diaper, Beer, Cookies, Ice Cream} |
Market basket transactions
Frequent itemsets -> (Diaper, Beer)
Possible association: Diaper -> Beer
Figure 1.9: Market basket analysis
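The frequent itemset and association in Figure 1.9 can be checked with a short calculation. The sketch below uses the five transactions from the figure and computes two standard measures: support (the fraction of transactions containing an itemset) and confidence (the fraction of transactions containing the antecedent that also contain the consequent). The minimum support threshold of 0.6 and the helper names `support` and `confidence` are assumptions chosen for this illustration, and the brute-force pair enumeration stands in for a proper algorithm such as Apriori.

```python
from itertools import combinations

# The five transactions from Figure 1.9.
transactions = [
    {"Butter", "Bread"},
    {"Diaper", "Bread", "Milk", "Beer"},
    {"Milk", "Chicken", "Beer", "Diaper"},
    {"Bread", "Diaper", "Chicken", "Beer"},
    {"Diaper", "Beer", "Cookies", "Ice Cream"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """Of the transactions containing `antecedent`, the fraction that
    also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Assumed threshold: a pair is "frequent" if it appears in at least 60%
# of the transactions.
min_support = 0.6

# Brute-force enumeration of all item pairs; fine for a toy dataset only.
items = sorted(set().union(*transactions))
frequent_pairs = [
    pair for pair in combinations(items, 2)
    if support(pair, transactions) >= min_support
]

print(frequent_pairs)
print(support({"Diaper", "Beer"}, transactions))
print(confidence({"Diaper"}, {"Beer"}, transactions))
```

With this threshold, the only frequent pair is Beer and Diaper: the two items appear together in four of the five transactions, and every transaction that contains Diaper also contains Beer. That matches the association Diaper -> Beer shown in the figure.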