Marzena Kryszkiewicz DaWak 2009 Non-Derivable Item Set and Non- Derivable Literal Set...


  • Slide 1
  • Marzena Kryszkiewicz, DaWaK 2009: Non-Derivable Item Set and Non-Derivable Literal Set Representations of Patterns Admitting Negation
  • Slide 2
  • Outline: Motivation; Preliminary; Representing Frequent Itemsets with Non-derivable Itemsets; Patterns Admitting Negation; Properties of Derivable and Non-derivable Lisets; Representing Frequent Positive and Negative Patterns; Conclusion
  • Slide 3
  • Motivation: Patterns and association rules can be generalized by admitting negation, e.g. "75% of customers who buy coke also buy chips and neither beer nor milk". Admitting negation in patterns usually results in an abundance of mined patterns, which makes analysis of the discovered knowledge infeasible. It is therefore preferable to discover and store a possibly small fraction of patterns from which all other significant patterns can be derived when required.
  • Slide 4
  • (Cont.) In this paper, the properties of derivable and non-derivable patterns are examined. Important relationships among patterns admitting negation that have the same canonical variation are established. Lossless representations of frequent positive and negative patterns are discussed: NDLR (the non-derivable liset representation) and NDIR (the non-derivable itemset representation, which is more concise).
  • Slide 5
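  • Downward Closed Sets: A set of itemsets $\mathcal{F}$ is defined as downward closed if $\forall X \in \mathcal{F}: Y \subseteq X \Rightarrow Y \in \mathcal{F}$. Property: Let $X, Y \subseteq I$. If $X \subseteq Y$, then $\sup(X) \ge \sup(Y)$. The set of all frequent itemsets is downward closed.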
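The two statements on this slide can be checked by brute force on a small example. The following is a minimal Python sketch, assuming a hypothetical five-transaction database; the names `D`, `sup`, `frequent_itemsets` and `is_downward_closed` are illustrative and do not come from the slides.

```python
from itertools import combinations

# Hypothetical toy database (an assumption for illustration only):
# each transaction is a set of items.
D = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b"},
    {"a", "b", "c"},
]

def sup(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def frequent_itemsets(transactions, min_sup):
    """Brute-force enumeration of all itemsets with support >= min_sup."""
    universe = sorted(set().union(*transactions))
    result = {}
    for k in range(len(universe) + 1):
        for combo in combinations(universe, k):
            s = sup(combo, transactions)
            if s >= min_sup:
                result[frozenset(combo)] = s
    return result

def is_downward_closed(family):
    """True if every subset of every member of `family` is also a member."""
    return all(
        frozenset(sub) in family
        for x in family
        for k in range(len(x) + 1)
        for sub in combinations(x, k)
    )

F = frequent_itemsets(D, min_sup=3)
print(sup({"a"}, D), sup({"a", "b"}, D))  # support cannot grow when items are added
print(is_downward_closed(F))
```

Brute-force enumeration is exponential in the number of items, so this only serves to illustrate the definitions on tiny inputs.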
  • Slide 6
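  • Generalized Disjunctive Rules: Let $Z \subseteq I$. $X \to a_1 \lor \dots \lor a_n$ is defined as a generalized disjunctive rule based on $Z$ if $X \subset Z$ and $A = \{a_1, \dots, a_n\} = Z \setminus X$. $\sup(X \to a_1 \lor \dots \lor a_n)$ is defined as the number of transactions in $D$ in which $X$ occurs together with at least one item from $A$.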
  • Slide 7
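  • (Cont.) Thm: Let $X \to a_1 \lor \dots \lor a_n$ be a generalized disjunctive rule and let $A = \{a_1, \dots, a_n\}$. Then: $\sup(X \to a_1 \lor \dots \lor a_n) = \sum_{\emptyset \ne Y \subseteq A} (-1)^{|Y|-1} \sup(X \cup Y)$. $err(X \to a_1 \lor \dots \lor a_n)$ is defined as the number of transactions containing $X$ that do not contain any item from $A$. $X \to a_1 \lor \dots \lor a_n$ is defined as a certain rule if $err(X \to a_1 \lor \dots \lor a_n) = 0$.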
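The support and error of a generalized disjunctive rule can be computed by direct counting. The sketch below does this on a hypothetical toy database and compares the direct count with an inclusion-exclusion sum over the non-empty subsets of the consequent. All names (`D`, `sup`, `rule_sup`, `rule_sup_incl_excl`, `err`) are illustrative assumptions, not taken from the slides.

```python
from itertools import combinations

# Hypothetical toy database (an assumption for illustration only).
D = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b"},
    {"a", "b", "c"},
]

def sup(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in transactions if items <= t)

def rule_sup(X, A, transactions):
    """Transactions containing all of X and at least one item of A."""
    X, A = set(X), set(A)
    return sum(1 for t in transactions if X <= t and A & t)

def rule_sup_incl_excl(X, A, transactions):
    """Same quantity via inclusion-exclusion over non-empty subsets Y of A."""
    X, A = set(X), sorted(A)
    total = 0
    for k in range(1, len(A) + 1):
        for Y in combinations(A, k):
            total += (-1) ** (k - 1) * sup(X | set(Y), transactions)
    return total

def err(X, A, transactions):
    """Transactions containing all of X and no item of A."""
    X, A = set(X), set(A)
    return sum(1 for t in transactions if X <= t and not (A & t))

# The rule a -> b v c holds in every transaction containing a: a certain rule.
print(rule_sup({"a"}, {"b", "c"}, D), err({"a"}, {"b", "c"}, D))
# The rule b -> a v c fails in the transaction {b}.
print(rule_sup({"b"}, {"a", "c"}, D), err({"b"}, {"a", "c"}, D))
```

In this sketch the error always equals `sup(X)` minus the rule support, since each transaction containing `X` either contains some item of `A` or none.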
  • Slide 8
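  • (Cont.) Let $X \to a_1 \lor \dots \lor a_n$ be a generalized disjunctive rule and let $A = \{a_1, \dots, a_n\}$. Then: $err(X \to a_1 \lor \dots \lor a_n) = \sup(X) - \sup(X \to a_1 \lor \dots \lor a_n)$. Let $X \to a_1 \lor \dots \lor a_n$ be a generalized disjunctive rule. Then: $err(X \to a_1 \lor \dots \lor a_n) = \sum_{Y \subseteq A} (-1)^{|Y|} \sup(X \cup Y)$ (doubt!!)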
  • Slide 9
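  • Using generalized association rules to estimate supports of itemsets: for disjoint itemsets $X$ and $Y$, $\sup(X \cup Y) \ge \sum_{Z \subsetneq Y} (-1)^{|Y \setminus Z|+1} \sup(X \cup Z)$ when $|Y|$ is even, and $\sup(X \cup Y) \le \sum_{Z \subsetneq Y} (-1)^{|Y \setminus Z|+1} \sup(X \cup Z)$ when $|Y|$ is odd. Given an itemset $B$, we obtain a set of $2^{|B|}$ inequalities bounding $\sup(B)$, one for each $X \subseteq B$ with $Y = B \setminus X$.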
  • Slide 10
  • (Cont.)
  • Slide 11
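  • Representing Frequent Itemsets with Non-derivable Itemsets: An itemset $X$ is defined as non-derivable if $l(X) \ne u(X)$, where $l(X)$ and $u(X)$ are the lower and upper bounds on $\sup(X)$ obtained from the bounding inequalities. NDR is defined as the set of all frequent non-derivable itemsets stored together with their supports.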
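As an illustration of how $l(X)$ and $u(X)$ might be computed, the sketch below assumes the bounding inequalities take the standard inclusion-exclusion form: for each $X \subseteq B$, the sum of $(-1)^{|B \setminus J|+1} \sup(J)$ over all $J$ with $X \subseteq J \subsetneq B$ is an upper bound on $\sup(B)$ when $|B \setminus X|$ is odd and a lower bound when it is even. That form, the toy database and all function names are assumptions made for this example.

```python
from itertools import combinations

# Hypothetical toy database (an assumption for illustration only).
D = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b"},
    {"a", "b", "c"},
]

def sup(itemset):
    """Number of transactions in D containing every item of `itemset`."""
    items = set(itemset)
    return sum(1 for t in D if items <= t)

def bounds(B):
    """Return (l(B), u(B)) for a non-empty itemset B.

    Uses one inequality per subset X of B, built only from the supports
    of proper subsets of B.
    """
    B = frozenset(B)
    lowers, uppers = [], []
    for k in range(len(B) + 1):
        for X in map(frozenset, combinations(sorted(B), k)):
            Y = B - X
            total = 0
            # Z ranges over proper subsets of Y, so J = X | Z is a proper subset of B.
            for m in range(len(Y)):
                for Z in combinations(sorted(Y), m):
                    J = X | frozenset(Z)
                    total += (-1) ** (len(B - J) + 1) * sup(J)
            # Odd |B minus X| gives an upper bound, even gives a lower bound.
            (uppers if len(Y) % 2 == 1 else lowers).append(total)
    return max(lowers), min(uppers)

def is_non_derivable(B):
    """True if the bounds do not pin down sup(B) exactly."""
    lower, upper = bounds(B)
    return lower != upper

print(bounds({"a", "b"}), sup({"a", "b"}))
print(bounds({"a", "b", "c"}), sup({"a", "b", "c"}))
```

On this toy database the bounds for `{a, b, c}` coincide, so its support can be derived from the supports of its proper subsets and need not be stored.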
  • Slide 12
  • Patterns admitting negation: A liset (literal set) is defined as a set of non-contradictory literals, i.e. items or negated items such that no item occurs together with its own negation. A liset is called positive if all literals contained in it are positive.
  • Slide 13
  • (Cont.) The canonical variation of a liset X is defined as the itemset obtained from X by replacing all negative literals in X with the corresponding positive literals. All lisets having the same canonical variation as liset X are called variations of X.
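The definitions of lisets and canonical variations can be made concrete with a small sketch. Here a literal is modeled as a pair `(item, is_positive)`; this encoding and all function names are assumptions chosen for illustration, not notation from the slides.

```python
from itertools import product

# A literal is modeled as (item, is_positive); ("b", False) stands for "not b".

def is_liset(literals):
    """True if no item occurs with both signs (non-contradictory literals)."""
    items = [item for item, _ in literals]
    return len(items) == len(set(items))

def is_positive(liset):
    """True if every literal in the liset is positive."""
    return all(positive for _, positive in liset)

def canonical_variation(liset):
    """Itemset obtained by replacing each negative literal with its item."""
    return frozenset(item for item, _ in liset)

def variations(liset):
    """All lisets that share the canonical variation of `liset`."""
    items = sorted(canonical_variation(liset))
    return [
        frozenset(zip(items, signs))
        for signs in product([True, False], repeat=len(items))
    ]

X = {("a", True), ("b", False)}          # the liset {a, not b}
print(sorted(canonical_variation(X)))    # items of the canonical variation
print(len(variations(X)))                # one variation per sign assignment
```

A liset over k items has 2^k variations, one of which is the liset itself and one of which is its positive canonical variation.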
  • Slide 14
  • (Cont.) Example:
  • Slide 15
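  • Properties of Derivable and Non-derivable Lisets: Thm: Let B be a liset. … Bound on the length of non-derivable lisets: a non-derivable liset $Z$ contains at most $\lfloor \log_2 |D| \rfloor + 1$ literals, because at least $2^{|Z|-1}$ variations of $Z$ have supports greater than 0. Hence $2^{|Z|-1} \le |D|$, so $|Z| \le \lfloor \log_2 |D| \rfloor + 1$.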
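The counting step behind such a length bound can be illustrated as follows: every transaction supports exactly one variation of an itemset Z, so the supports of all variations sum to |D| and at most |D| of them can be greater than 0. The sketch assumes the bound takes the form $2^{|Z|-1} \le |D|$, i.e. $|Z| \le \lfloor \log_2 |D| \rfloor + 1$; the toy database and function names are hypothetical.

```python
from itertools import product

# Hypothetical toy database (an assumption for illustration only).
D = [
    {"a", "b", "c"},
    {"a", "b"},
    {"a", "c"},
    {"b"},
    {"a", "b", "c"},
]

def liset_sup(liset):
    """Transactions containing every positive item and none of the negated items."""
    pos = {item for item, positive in liset if positive}
    neg = {item for item, positive in liset if not positive}
    return sum(1 for t in D if pos <= t and not (neg & t))

def variation_supports(Z):
    """Support of each variation of itemset Z, keyed by its sign pattern."""
    items = sorted(Z)
    return {
        signs: liset_sup(set(zip(items, signs)))
        for signs in product([True, False], repeat=len(items))
    }

def max_non_derivable_length(n_transactions):
    """floor(log2(n)) + 1 for n >= 1, computed exactly via int.bit_length()."""
    return n_transactions.bit_length()

supports = variation_supports({"a", "b", "c"})
print(sum(supports.values()) == len(D))         # variations partition D
print(sum(1 for s in supports.values() if s > 0))
print(max_non_derivable_length(len(D)))
```

Using `int.bit_length()` avoids floating-point rounding that `math.log2` could introduce for large transaction counts.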
  • Slide 16
  • Representing frequent positive and negative patterns: NDLR (non-derivable liset representation of frequent patterns admitting negation) is defined as the family of all frequent non-derivable lisets stored together with their supports. NDIR (non-derivable itemset representation of frequent patterns admitting negation) is defined as the set of non-derivable itemsets, each of which has at least one frequent variation, stored together with their supports.
  • Slide 17
  • (Cont.)
  • Slide 18
  • Conclusion: The paper introduced two lossless representations of frequent patterns admitting negation, NDLR and NDIR. (doubt!!)