FP Growth Algorithm


Mining Frequent Itemsets without Candidate Generation

Apriori with candidate generation is costly for two reasons:

1. It may need to generate a huge number of candidate sets.
   For example, if there are 10^4 frequent 1-itemsets, the Apriori
   algorithm will need to generate more than 10^7 candidate 2-itemsets.

2. It must go over every transaction in the database, repeatedly, to determine the
   support of the candidate itemsets.
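
The candidate count in the first point can be checked with a line of arithmetic. The sketch below assumes that Apriori pairs every two frequent 1-itemsets to form a candidate 2-itemset; the variable names are illustrative only.

```python
from math import comb

n_frequent_items = 10**4

# Every unordered pair of frequent 1-itemsets yields one candidate 2-itemset.
n_candidate_pairs = comb(n_frequent_items, 2)

print(n_candidate_pairs)           # 49995000
print(n_candidate_pairs > 10**7)   # True
```

So 10^4 frequent items already give roughly 5 x 10^7 candidate pairs, each of which must then be counted against the database.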

December 7, 2021 Data Mining: Concepts and Techniques 1



“Can we design a method that mines the complete set of frequent itemsets
without candidate generation?”

FP-growth (frequent-pattern growth) adopts a divide-and-conquer strategy
as follows:
1. First, it compresses the database representing frequent items into a
   frequent-pattern tree, or FP-tree.
2. It then divides the compressed database into a set of conditional databases,
   each associated with one frequent item, and mines each such database
   separately.


FP-growth: Example
We re-examine the mining of the transaction database D.


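The first scan of the database is the same as in Apriori: it derives the set of
frequent items (1-itemsets) and their support counts. Let min_sup = 2.

The set of frequent items is sorted in order of descending support count,
giving L = {{I2: 7}, {I1: 6}, {I3: 6}, {I4: 2}, {I5: 2}}.

The items of each transaction are then sorted in the same order (the sorted
itemsets). We say that the items are in L-order.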
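
A minimal sketch of this first scan follows. The table for D is not reproduced in this text, so the nine transactions below are an assumption, chosen so that their support counts match the list L given above. The function name and the tie-breaking rule (by item name) are also illustrative choices, not part of the original.

```python
from collections import Counter

# ASSUMPTION: D is not shown in the text; these transactions are chosen
# so that the support counts match L = {I2: 7, I1: 6, I3: 6, I4: 2, I5: 2}.
transactions = [
    ["I1", "I2", "I5"],
    ["I2", "I4"],
    ["I2", "I3"],
    ["I1", "I2", "I4"],
    ["I1", "I3"],
    ["I2", "I3"],
    ["I1", "I3"],
    ["I1", "I2", "I3", "I5"],
    ["I1", "I2", "I3"],
]
MIN_SUP = 2


def frequent_items_in_l_order(transactions, min_sup):
    """One scan: count each item's support, drop infrequent items,
    and sort by descending support (ties broken by item name)."""
    counts = Counter(item for t in transactions for item in set(t))
    frequent = [(item, c) for item, c in counts.items() if c >= min_sup]
    return sorted(frequent, key=lambda pair: (-pair[1], pair[0]))


L = frequent_items_in_l_order(transactions, MIN_SUP)
rank = {item: position for position, (item, _) in enumerate(L)}

# Rewrite every transaction with its frequent items in L-order.
sorted_transactions = [
    sorted((item for item in t if item in rank), key=rank.__getitem__)
    for t in transactions
]

print(L)
print(sorted_transactions[0])  # ['I2', 'I1', 'I5']
```

Under these assumptions L comes out as I2, I1, I3, I4, I5, and the first transaction is reordered from I1, I2, I5 to I2, I1, I5.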


FP-growth: Example (Constructing the FP-tree)
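
In outline, the tree is built in a second scan of D: starting from a root labeled null, each transaction is inserted with its frequent items in L-order, so transactions that share a prefix share a path and only increment the counts along it. A minimal sketch follows. The nine transactions are an assumption (the table for D is not reproduced in this text; they are chosen to match the support counts in L), and `FPNode`, `build_fp_tree`, and `header` are hypothetical names.

```python
from collections import Counter


class FPNode:
    """One node of the FP-tree: an item, its count, and links up and down."""

    def __init__(self, item, parent):
        self.item = item          # None for the root ("null")
        self.count = 0
        self.parent = parent
        self.children = {}        # item -> FPNode


def build_fp_tree(transactions, min_sup):
    # First scan: support counts and L-order (ties broken by item name).
    counts = Counter(item for t in transactions for item in set(t))
    order = sorted((i for i in counts if counts[i] >= min_sup),
                   key=lambda i: (-counts[i], i))
    rank = {item: r for r, item in enumerate(order)}

    root = FPNode(None, None)
    header = {item: [] for item in order}   # item -> its nodes (node-links)

    # Second scan: insert each transaction in L-order, sharing prefixes.
    for t in transactions:
        node = root
        for item in sorted((i for i in set(t) if i in rank),
                           key=rank.__getitem__):
            child = node.children.get(item)
            if child is None:
                child = FPNode(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += 1
            node = child
    return root, header


# ASSUMPTION: transactions chosen to match the support counts in L.
transactions = [
    ["I1", "I2", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I1", "I2", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I1", "I2", "I3", "I5"], ["I1", "I2", "I3"],
]
root, header = build_fp_tree(transactions, min_sup=2)
print(root.children["I2"].count)                  # 7
print(root.children["I2"].children["I1"].count)   # 4
```

With this data the root has two children, I2 (count 7) and I1 (count 2), and the item I5 appears on two separate branches, both reachable through `header["I5"]`.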


FP-growth: Example (Constructing conditional databases)

Next, the FP-tree is mined as follows:

1. Start with the last item in the table (the least frequent item in L-order) and
   construct its conditional pattern base: a “subdatabase” consisting of the set of
   prefix paths in the FP-tree that co-occur with the suffix pattern.
2. Construct its conditional FP-tree from that pattern base.
3. Find the frequent itemsets by concatenating the suffix pattern with the
   frequent patterns generated from the conditional FP-tree.
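
The three steps above can be sketched as a short recursive procedure. This is one possible illustration, not a reference implementation: conditional pattern bases are represented as lists of (prefix path, count) pairs, and each conditional FP-tree is built by reusing the same tree builder on that list. All names are hypothetical, and the nine transactions are an assumption chosen to match the support counts in L, since the table for D is not reproduced in this text.

```python
from collections import defaultdict


class FPNode:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}


def build_tree(weighted_transactions, min_sup):
    """Build an FP-tree from (items, count) pairs.
    Returns the header table (item -> nodes) and the items in L-order."""
    support = defaultdict(int)
    for items, count in weighted_transactions:
        for item in set(items):
            support[item] += count
    order = sorted((i for i in support if support[i] >= min_sup),
                   key=lambda i: (-support[i], i))
    rank = {item: r for r, item in enumerate(order)}
    root = FPNode(None, None)
    header = {item: [] for item in order}
    for items, count in weighted_transactions:
        node = root
        for item in sorted((i for i in set(items) if i in rank),
                           key=rank.__getitem__):
            if item not in node.children:
                node.children[item] = FPNode(item, node)
                header[item].append(node.children[item])
            node = node.children[item]
            node.count += count
    return header, order


def conditional_pattern_base(header, item):
    """Step 1: prefix paths that co-occur with `item`, with their counts."""
    base = []
    for node in header[item]:
        path, parent = [], node.parent
        while parent is not None and parent.item is not None:
            path.append(parent.item)
            parent = parent.parent
        if path:
            base.append((path[::-1], node.count))
    return base


def fp_growth(weighted_transactions, min_sup, suffix=frozenset()):
    """Returns {frequent itemset: support count}."""
    header, order = build_tree(weighted_transactions, min_sup)
    patterns = {}
    for item in reversed(order):            # start with the last item
        new_suffix = suffix | {item}
        patterns[new_suffix] = sum(n.count for n in header[item])
        base = conditional_pattern_base(header, item)
        # Steps 2 and 3: build the conditional FP-tree from the base and
        # extend the suffix with every frequent pattern found in it.
        patterns.update(fp_growth(base, min_sup, new_suffix))
    return patterns


# ASSUMPTION: transactions chosen to match the support counts in L.
transactions = [
    ["I1", "I2", "I5"], ["I2", "I4"], ["I2", "I3"],
    ["I1", "I2", "I4"], ["I1", "I3"], ["I2", "I3"],
    ["I1", "I3"], ["I1", "I2", "I3", "I5"], ["I1", "I2", "I3"],
]
patterns = fp_growth([(t, 1) for t in transactions], min_sup=2)
print(patterns[frozenset({"I1", "I2", "I5"})])   # 2
```

For the suffix I5, for example, the conditional pattern base under these assumptions is {I2, I1: 1} and {I2, I1, I3: 1}. I3 falls below min_sup there, so the conditional FP-tree keeps only I2 and I1, which yields the frequent patterns {I2, I5}, {I1, I5}, and {I1, I2, I5}, each with support 2.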


