|
3. Transformation Phase
The customer sequences are replaced by those large itemsets they contain. All the large itemsets are mapped into a series of integers to make the mining more efficient. |
|
(10, 20) is dropped because it does not contain any large itemset.
Parentheses ‘(’ and ‘)’ instead of ‘{’ and ‘}’ are used to avoid the ambiguity of doubly using ‘{’ and ‘}’ as shown next.
(40, 60, 70) is replaced by the set of itemsets {(40), (70), (40, 70)} because each itemset is large/frequent.
〈{1} {2, 3, 4}〉.
| Customer-id | Customer Sequence | Transformed DB | After Mapping |
|---|---|---|---|
| 1 | ⟨ (30) (90) ⟩ | ⟨ { (30) } { (90) } ⟩ | ⟨ {1} {5} ⟩ |
| 2 | ⟨ (10, 20) (30) (40, 60, 70) ⟩ | ⟨ { (30) } { (40) (70) (40, 70) } ⟩ | ⟨ {1} {2, 3, 4} ⟩ |
| 3 | ⟨ (30, 50, 70) ⟩ | ⟨ { (30) (70) } ⟩ | ⟨ {1, 3} ⟩ |
| 4 | ⟨ (30) (40, 70) (90) ⟩ | ⟨ { (30) } { (40) (70) (40, 70) } { (90) } ⟩ | ⟨ {1} {2, 3, 4} {5} ⟩ |
| 5 | ⟨ (90) ⟩ | ⟨ { (90) } ⟩ | ⟨ {5} ⟩ |
|
I remember in the old days when people used to get mad if you read their diary. Now people put everything online and get mad when you don’t read it. |