1. Sort Phase
The original transaction database is sorted with customer-id as the major key and transaction time as the minor key. The result is a set of customer sequences. The table shows the sorted transaction data.
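The sort step can be sketched in Python as follows. This is a minimal illustration, not the original implementation: the rows in `transactions`, the tuple layout `(customer_id, transaction_time, items)`, and the name `customer_sequences` are all assumptions made for the example.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical unsorted rows: (customer_id, transaction_time, items).
transactions = [
    (2, 3, {40, 60, 70}),
    (1, 1, {30}),
    (2, 1, {10, 20}),
    (1, 2, {90}),
    (2, 2, {30}),
]

# Sort with customer_id as the major key and transaction_time as the minor key.
sorted_rows = sorted(transactions, key=itemgetter(0, 1))

# Group consecutive rows by customer: each customer's time-ordered list of
# transactions is that customer's sequence.
customer_sequences = {
    customer_id: [items for _, _, items in rows]
    for customer_id, rows in groupby(sorted_rows, key=itemgetter(0))
}

for customer_id, sequence in customer_sequences.items():
    print(customer_id, sequence)
```

Because `groupby` only merges adjacent rows, it relies on the preceding sort to bring all rows of one customer together.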
Itemsets such as {30}, {30, 50}, {50, 70}, {30, 50, 70}, etc. can be read off the above sorted transaction data.
Suppose the minimal support is 40%. In this case the minimal support count is 2 (40% of the 5 customers), and the resulting large itemsets are listed in the table. For example:
{30} is a large itemset because it appears for 4 of the 5 customers (Customer IDs 1, 2, 3, and 4), and 4/5 = 80% ≥ 40%.
{40, 70} is a large itemset because it appears for 2 of the 5 customers (Customer IDs 2 and 4), and 2/5 = 40% ≥ 40%.
{50} is NOT a large itemset because it appears for only 1 of the 5 customers (Customer ID 3), and 1/5 = 20% < 40%.
{60, 70} is NOT a large itemset because it appears for only 1 of the 5 customers (Customer ID 2), and 1/5 = 20% < 40%.
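The support check used in these examples can be sketched as below. The sorted table is not reproduced in this section, so the `customer_sequences` data here is an assumed stand-in chosen only to agree with the customer counts quoted above. The helper names `supporting_customers` and `is_large` are likewise hypothetical. A customer supports an itemset if at least one of that customer's transactions contains every item in it, and each customer is counted at most once.

```python
from math import ceil

# Assumed stand-in data (not the original table): customer ID -> ordered transactions.
customer_sequences = {
    1: [{30}, {90}],
    2: [{10, 20}, {30}, {40, 60, 70}],
    3: [{30, 50, 70}],
    4: [{30}, {40, 70}, {90}],
    5: [{90}],
}

def supporting_customers(itemset, sequences):
    """Return the sorted IDs of customers with a transaction containing itemset."""
    return sorted(
        customer_id
        for customer_id, sequence in sequences.items()
        if any(itemset <= transaction for transaction in sequence)
    )

min_support_percent = 40
# Minimal support count: 40% of 5 customers, rounded up to a whole customer.
min_count = ceil(min_support_percent * len(customer_sequences) / 100)

def is_large(itemset, sequences, min_count):
    """An itemset is large if enough distinct customers support it."""
    return len(supporting_customers(itemset, sequences)) >= min_count

for itemset in ({30}, {40, 70}, {50}, {60, 70}):
    ids = supporting_customers(itemset, customer_sequences)
    print(sorted(itemset), ids, is_large(itemset, customer_sequences, min_count))
# [30] [1, 2, 3, 4] True
# [40, 70] [2, 4] True
# [50] [3] False
# [60, 70] [2] False
```

Counting customers rather than transactions matters here: an itemset bought several times by the same customer still contributes only one to the support count.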