
應用約略集合理論建立電磁干擾診斷系統
Application of Rough Set Theory to EMI Diagnosis

Cheng-Lung Huang (黃承龍)^1, Te-Sheng Li (李得盛)^2, Ting-Kuo Peng (彭定國)^2, Cheng-Nan Shih (施政男)^1
^1 Department of Information Management, Huafan University
^2 Department of Industrial Engineering and Management, Ming Hsin University of Science and Technology

(Revised July 18, 2003; Accepted September 22, 2003)

Abstract (translated from Chinese): Electromagnetic interference (EMI) can cause machines in a factory to break down or to act incorrectly, leading to unpredictable situations. A product with poor EMI quality must be modified until it passes the standard test limits before it can be shipped. EMI diagnosis, in which experienced engineers trace an EMI problem back to its source, is difficult and time consuming, so an EMI diagnostic decision system helps shorten the diagnosis time. This study applies rough set theory (RST), a knowledge discovery technique from the data mining field, to extract the diagnostic knowledge. RST handles vague and imprecise information and needs no prior or additional information about the data set. Mining the rules that relate the input variables to the failure location with RST involves the following steps: data collection, data preprocessing, discretization of attribute values, attribute reduction, filtering of minimal attribute sets, rule generation, rule filtering, classification, calculation of the classification accuracy, and construction of the diagnostic system. Applying these steps to historical EMI data from a well-known domestic motherboard manufacturer, the resulting RST diagnostic rules reach an average accuracy of about 80 percent, which meets the requirements of practical application.

Keywords: Data Mining, Electromagnetic Interference, Rough Set Theory, Attribute Reduction

Journal of Management & Systems (管理與系統), Vol. 11, No. 3, July 2004, pp. 367-385


Abstract: Electromagnetic emissions are radiated from every part of the motherboard of a personal computer, and thus electromagnetic interference (EMI) occurs. EMI has a bad effect on the surrounding environment because it may cause malfunctions or fatal problems in other digital devices. EMI engineers diagnose the EMI problems of a motherboard from the electromagnetic noise data measured by a spectrum analyzer, and it is time consuming to find the sources (PS2, USB, VGA, etc.) of the electromagnetic noise. Rough set theory (RST) is a relatively new mathematical approach to data analysis. This paper constructs an EMI diagnostic system based on RST through the following steps: data collection, data preprocessing, discretization, attribute reduction, reduction filtering, rule generation, rule filtering, classification, and accuracy calculation. Historical EMI noise data, collected from a well-known motherboard company in Taiwan, are used to generate the diagnostic rules. The result of our research (an average diagnostic accuracy of 80%) shows that the RST model is a promising approach to an EMI diagnostic support system.

Keywords: Data Mining, Rough Set Theory, Attribute Reduction, Electromagnetic Interference

1. Introduction

Electromagnetic interference (EMI) is the unwanted electromagnetic energy radiated or conducted by electronic equipment (Morgan, 1994; Mills, 1993). EMI can cause nearby digital devices to malfunction or fail in unpredictable ways, so the electromagnetic emissions of personal-computer motherboards must be kept under control (Archambeault, 2002). Major computer makers (IBM, NEC, HP, Dell, Fujitsu) require the motherboards they purchase to comply with EMI regulations; a board that fails the test must be modified until it meets the standard before it can be shipped. Diagnosing an EMI problem means tracing the excessive noise back to its source on the board, for example the PS2, USB or VGA ports. This work is carried out by experienced EMI engineers and is both difficult and time consuming, so a diagnostic decision-support system that suggests the likely noise source can shorten the diagnosis cycle considerably.

This study applies data mining to historical EMI test data in order to extract such diagnostic knowledge. Figure 1 shows the overall EMI testing flow: a motherboard is measured in an electromagnetic compatibility (EMC) laboratory; if it meets the EMC standard it can be shipped, otherwise it must be diagnosed and modified, and it is this diagnosis step that the proposed system supports. Among the available data mining methods, rough set theory (RST) (Pawlak, 1982, 1994, 1995; Kusiak, 2001) is adopted because it handles vague and imprecise data and produces easily interpreted rules that relate the motherboard design attributes and the measured noise to the EMI failure location.

The remainder of this paper is organized as follows. Section 2 reviews data mining and the basic concepts of RST. Section 3 applies RST to build the EMI diagnostic system and reports the diagnostic accuracy. Section 4 gives the conclusions.

(Figure 1: The EMI testing flow. A board that fails the EMC standard is diagnosed, with data mining supporting the diagnosis, and modified until it passes.)


2. Literature Review

2.1 Data Mining

Data mining extracts implicit, previously unknown and potentially useful knowledge from large databases (Han, 2001); typical tasks include classification, prediction, association and clustering. It has been applied to a wide range of industrial and business problems (Sushmita, 2002; Kusiak, 2000; Kusiak, 2001; Bayardo, 1999; Hui, 2000; McKee, 2002; Ben-Arieh, 1998; Yoshida, 1999). Gardner and Bieker (2000) combined self-organizing map (SOM) neural networks with rule induction to solve difficult semiconductor manufacturing problems, using statistical tests such as the Kruskal-Wallis test to screen influential factors, and other studies mine wafer-map and wafer-probe data for defect patterns (Chen, 2000; Lee, 2001; Skinner, 2002). To the best of our knowledge, however, data mining has not yet been applied to EMI diagnosis, which motivates this study.

2.2 Rough Set Theory

Rough set theory (RST) was proposed by Z. Pawlak in 1982 (Pawlak, 1982) as a mathematical tool for dealing with vague and imprecise information (Pawlak, 1991). RST has three attractive properties: (1) it needs no prior or additional information about the data; (2) it finds minimal subsets of attributes (reducts) that preserve the classification; and (3) it produces decision rules that are easy to interpret (Pawlak, 1982; Kusiak, 2001). It has been applied to problems such as business failure prediction and semiconductor manufacturing (Dimitras, 1999; Kusiak, 2001; Ahna, 2000; McKee, 2002). The basic notions of RST used to build the EMI diagnostic system are summarized below.

(1) Information system: Knowledge is represented in an information system (also called a decision table or information table) $S = (U, A, V, f)$ (Pawlak, 1991), where $U$ is the universe, a finite set of objects (e.g., $U = \{x_1, x_2, \dots, x_{10}\}$); $A$ is a finite set of attributes (e.g., $A = \{a_1, a_2, a_3\}$); $V = \bigcup_{a \in A} V_a$, where $V_a$ is the domain of attribute $a$; and $f: U \times A \to V$ is an information function such that $f(x, a) \in V_a$ for every $a \in A$ and $x \in U$. In a decision table the attributes are divided into condition attributes and a decision attribute.

(2) Indiscernibility relation: For $S$ and any $P \subseteq A$, two objects $x, y \in U$ are indiscernible with respect to $P$, written $(x, y) \in Ind(P)$, if and only if $f(x, a) = f(y, a)$ for every $a \in P$. $Ind(P)$ is an equivalence relation on $U$; its equivalence classes $[x_i]_{Ind(P)}$ are called elementary sets.

(3) Approximation of sets: A set $X \subseteq U$ is described through its lower and upper approximations with respect to $P \subseteq A$:

$\underline{P}X = \{x_i \in U \mid [x_i]_{Ind(P)} \subseteq X\}$

The $P$-lower approximation $\underline{P}X$ contains the objects whose elementary sets are entirely included in $X$; these objects certainly belong to $X$.

$\overline{P}X = \{x_i \in U \mid [x_i]_{Ind(P)} \cap X \neq \emptyset\}$

The $P$-upper approximation $\overline{P}X$ contains the objects whose elementary sets have a non-empty intersection with $X$; these objects possibly belong to $X$. The boundary of $X$ is $BN_P(X) = \overline{P}X - \underline{P}X$; if the boundary is non-empty, $X$ is a rough (undefinable) set, and four cases are distinguished:

1) $\underline{P}X \neq \emptyset$ and $\overline{P}X \neq U$: $X$ is roughly definable in $U$;
2) $\underline{P}X \neq \emptyset$ and $\overline{P}X = U$: $X$ is externally undefinable in $U$;
3) $\underline{P}X = \emptyset$ and $\overline{P}X \neq U$: $X$ is internally undefinable in $U$;
4) $\underline{P}X = \emptyset$ and $\overline{P}X = U$: $X$ is totally undefinable in $U$.

The accuracy of the approximation is $\alpha_P(X) = card(\underline{P}X) / card(\overline{P}X)$, with $0 \le \alpha_P(X) \le 1$.
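To make these definitions concrete, here is a small Python sketch (an illustration, not part of the original study) that computes the elementary sets of Ind(P), the P-lower and P-upper approximations and the accuracy measure; it uses the coded hiring data introduced as Table 2 in Section 2.3, and the function names are our own.

from collections import defaultdict

# Coded hiring data of Table 2: each object maps to (a1, a2, a3, a4, d).
table = {
    'x1': (1, 2, 1, 1, 1), 'x2': (1, 3, 1, 3, 2), 'x3': (2, 3, 1, 2, 2),
    'x4': (3, 1, 1, 3, 1), 'x5': (3, 2, 1, 3, 2), 'x6': (3, 1, 1, 1, 1),
    'x7': (1, 1, 2, 2, 1), 'x8': (2, 3, 2, 1, 2),
}

def elementary_sets(P):
    """Partition U into the equivalence classes of Ind(P); P lists attribute indices 0..3."""
    classes = defaultdict(set)
    for x, values in table.items():
        classes[tuple(values[i] for i in P)].add(x)
    return list(classes.values())

def approximations(P, X):
    """Return the P-lower and P-upper approximations of the concept X."""
    lower, upper = set(), set()
    for E in elementary_sets(P):
        if E <= X:          # elementary set entirely contained in X
            lower |= E
        if E & X:           # elementary set that intersects X
            upper |= E
    return lower, upper

X_accept = {x for x, v in table.items() if v[4] == 1}   # X_{d=1} = {x1, x4, x6, x7}
lower, upper = approximations([1], X_accept)            # P = {a2}, i.e. Experience only
print(sorted(lower))              # ['x4', 'x6', 'x7']
print(sorted(upper))              # ['x1', 'x4', 'x5', 'x6', 'x7']
print(len(lower) / len(upper))    # accuracy alpha_P(X) = 0.6

With P = {a2} alone the concept "Accept" is a rough set (accuracy 0.6); with P = {a1, a2}, one of the reducts found in Section 2.3, the lower and upper approximations coincide and the accuracy is 1.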

(4) Attribute reduction: An attribute $a_i \in A$ is dispensable in $A$ if $Ind(A) = Ind(A - \{a_i\})$; otherwise $a_i$ is indispensable. Removing dispensable attributes does not change the classification ability of the attribute set.


For $S$ and $P \subseteq A$, let $X = \{X_1, X_2, \dots, X_n\}$ with $X_i \subseteq U$, $X_i \cap X_j = \emptyset$ for $i \neq j$ and $\bigcup_i X_i = U$ be a classification (partition) of $U$; the sets $X_i$ are called classes. In a decision table the classification is induced by the decision attribute $d$. The $P$-lower approximation of the classification is $\underline{P}X = \{\underline{P}X_1, \underline{P}X_2, \dots, \underline{P}X_n\}$, and the quality of classification is

$\gamma_P(X) = \sum_{i=1}^{n} card(\underline{P}X_i) \,/\, card(U)$,

the fraction of objects that the attributes in $P$ classify without ambiguity. A subset $R \subseteq P$ is a reduct of $P$ if $\gamma_R(X) = \gamma_P(X)$ and no attribute can be removed from $R$ without lowering the quality of classification; a reduct is thus a minimal subset of attributes preserving the classification, and $RED(P)$ denotes the set of all reducts. The core is the intersection of all reducts:

$CORE(P) = \bigcap RED(P)$.

In practice, the reducts of a decision table can be computed from its discernibility matrix (Walczak, 1999), as illustrated in Section 2.3.

(5) Decision rules: From a reduced decision table, decision rules of the form "IF conditions THEN decision" are generated; the procedure is also illustrated in Section 2.3.

2.3 An Example of Rule Extraction with RST

The following hiring example with eight applicants (Table 1) illustrates how RST reduces attributes and derives decision rules.

The condition attributes are A = {Diploma, Experience, French, Reference} = {a1, a2, a3, a4} and the decision attribute is D = {Decision} = {d}. Table 1 lists the raw data and Table 2 the coded data, using the following coding:

Diploma: {MBA, MCE, MSc} = {1, 2, 3}
Experience: {High, Medium, Low} = {1, 2, 3}
French: {Yes, No} = {1, 2}
Reference: {Excellent, Good, Neutral} = {1, 2, 3}
Decision: {Accept, Reject} = {1, 2}


Table 1: Hiring data (original attribute values)

U     Diploma   Experience   French   Reference   Decision
x1    MBA       Medium       Yes      Excellent   Accept
x2    MBA       Low          Yes      Neutral     Reject
x3    MCE       Low          Yes      Good        Reject
x4    MSc       High         Yes      Neutral     Accept
x5    MSc       Medium       Yes      Neutral     Reject
x6    MSc       High         Yes      Excellent   Accept
x7    MBA       High         No       Good        Accept
x8    MCE       Low          No       Excellent   Reject

Table 2: Coded hiring data

U     a1   a2   a3   a4   d
x1    1    2    1    1    1
x2    1    3    1    3    2
x3    2    3    1    2    2
x4    3    1    1    3    1
x5    3    2    1    3    2
x6    3    1    1    1    1
x7    1    1    2    2    1
x8    2    3    2    1    2

The two decision classes are X_{d=1} = {x1, x4, x6, x7} (Accept) and X_{d=2} = {x2, x3, x5, x8} (Reject).

Table 3 is the discernibility matrix of the coded data relative to the decision: the cell for a pair of objects belonging to different decision classes contains the attributes on which the two objects differ (for example, x1 and x2 differ on a2 and a4), while pairs belonging to the same class are marked "-". Only pairs formed from {x1, x4, x6, x7} and {x2, x3, x5, x8} therefore receive entries.

To find the reducts, each cell is read as a disjunction of its attributes and all cells are combined conjunctively into a Boolean discernibility function, where a1a2 stands for "a1 and a2" and (a1 + a2) stands for "a1 or a2". The function is then simplified with the absorption laws, e.g. (a1 + a2 + a3)a2 = a2 and a2(a1 + a2 + a3)(a1 + a3) = a2(a1 + a3) = a1a2 + a2a3.
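The absorption step can be checked mechanically; the snippet below is only an illustration and assumes the sympy package is available (it is not part of the original paper).

from sympy import symbols
from sympy.logic.boolalg import simplify_logic

a1, a2, a3 = symbols('a1 a2 a3')
expr = (a1 | a2 | a3) & a2 & (a1 | a3)       # (a1 + a2 + a3) * a2 * (a1 + a3)
print(simplify_logic(expr, form='dnf'))      # (a1 & a2) | (a2 & a3), i.e. a1a2 + a2a3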


Table 3: Discernibility matrix of the coded data with respect to the decision (attribute set A)

      x1            x2            x3            x4              x5              x6            x7
x2    a2,a4
x3    a1,a2,a4      -
x4    -             a1,a2         a1,a2,a4
x5    a1,a4         -             -             a2
x6    -             a1,a2,a4      a1,a2,a4      -               a2,a4
x7    -             a2,a3,a4      a1,a2,a3      -               a1,a2,a3,a4     -
x8    a1,a2,a3      -             -             a1,a2,a3,a4     -               a1,a2,a3      a1,a2,a4

The discernibility function is the conjunction of all non-empty cells:

f_A(D) = (a2 + a4)(a1 + a2 + a4)(a1 + a4)(a1 + a2 + a3)(a1 + a2)(a1 + a2 + a4)(a2 + a3 + a4)(a1 + a2 + a4)(a1 + a2 + a4)(a1 + a2 + a3)(a2)(a1 + a2 + a3 + a4)(a2 + a4)(a1 + a2 + a3 + a4)(a1 + a2 + a3)(a1 + a2 + a4)
       = (a1 + a4) a2 = a1a2 + a2a4

The simplified function a1a2 + a2a4 has two prime implicants, so the decision table has two reducts, {a1, a2} and {a2, a4}; their common attribute a2 is the core. Choosing the reduct {a1, a2}, attributes a3 and a4 can be removed, which gives Table 4.

Table 4: Decision table reduced to the reduct {a1, a2}

U     Diploma (a1)   Experience (a2)   Decision (d)
x1    1              2                 1
x2    1              3                 2
x3    2              3                 2
x4    3              1                 1
x5    3              2                 2
x6    3              1                 1
x7    1              1                 1
x8    2              3                 2
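The reduct computation above can be reproduced with a short brute-force sketch (our own illustration, not the procedure or software used by the authors): it rebuilds the non-empty cells of the discernibility matrix in Table 3 and keeps every minimal attribute subset that intersects all of them.

from itertools import combinations

# Coded data of Table 2: (a1, a2, a3, a4, d)
objects = {
    'x1': (1, 2, 1, 1, 1), 'x2': (1, 3, 1, 3, 2), 'x3': (2, 3, 1, 2, 2),
    'x4': (3, 1, 1, 3, 1), 'x5': (3, 2, 1, 3, 2), 'x6': (3, 1, 1, 1, 1),
    'x7': (1, 1, 2, 2, 1), 'x8': (2, 3, 2, 1, 2),
}
attrs = ['a1', 'a2', 'a3', 'a4']

# Non-empty discernibility cells: attributes separating objects of different classes.
cells = []
for x, y in combinations(objects, 2):
    if objects[x][4] != objects[y][4]:
        cells.append({attrs[i] for i in range(4) if objects[x][i] != objects[y][i]})

def hits_all(subset):
    return all(cell & subset for cell in cells)

# A reduct is a minimal attribute subset that intersects every cell.
reducts = []
for r in range(1, len(attrs) + 1):
    for candidate in combinations(attrs, r):
        s = set(candidate)
        if hits_all(s) and not any(hits_all(s - {a}) for a in s):
            reducts.append(s)

print(reducts)                      # [{'a1', 'a2'}, {'a2', 'a4'}]
print(set.intersection(*reducts))   # the core: {'a2'}

Brute force is adequate for four attributes; for the EMI data of Section 3, with eleven condition attributes and hundreds of records, an approximation algorithm is used instead.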

Table 4 may still contain redundant attribute values. To remove them, a discernibility matrix is computed for the reduced attribute set {a1, a2} (Table 5) and a discernibility function is derived for each object (value reduction).


Table 5: Discernibility matrix for the reduct {a1, a2}

      x1        x2        x3        x4        x5        x6        x7        x8
x1              a2        a1,a2     -         a1        -         -         a1,a2
x2    a2                  -         a1,a2     -         a1,a2     a2        -
x3    a1,a2     -                   a1,a2     -         a1,a2     a1,a2     -
x4    -         a1,a2     a1,a2               a2        -         -         a1,a2
x5    a1        -         -         a2                  a2        a1,a2     -
x6    -         a1,a2     a1,a2     -         a2                  -         a1,a2
x7    -         a2        a1,a2     -         a1,a2     -                   a1,a2
x8    a1,a2     -         -         a1,a2     -         a1,a2     a1,a2

f1(D) = a2 (a1 + a2) a1 (a1 + a2) = a1a2
f2(D) = a2 (a1 + a2)(a1 + a2) a2 = a2
f3(D) = (a1 + a2)(a1 + a2)(a1 + a2)(a1 + a2) = a1 + a2
f4(D) = (a1 + a2)(a1 + a2) a2 (a1 + a2) = a2
f5(D) = a1 a2 a2 (a1 + a2) = a1a2
f6(D) = (a1 + a2)(a1 + a2) a2 (a1 + a2) = a2
f7(D) = a2 (a1 + a2)(a1 + a2)(a1 + a2) = a2
f8(D) = (a1 + a2)(a1 + a2)(a1 + a2)(a1 + a2) = a1 + a2

According to these functions, the experience attribute a2 alone suffices to discern x2, x4, x6 and x7 from the objects of the other class, whereas x1 and x5 require both a1 and a2. For x3 and x8 either a1 or a2 would do (f = a1 + a2); since x3 and x8 (a1 = 2, a2 = 3) lead to the same decision d = 2 as x2 (a2 = 3), the value of a2 is kept and the value of a1 is dropped and marked "-". The resulting value-reduced decision table is Table 6.

Table 6: Value-reduced decision table for the reduct {a1, a2} ("-" means the value is not needed)

U     Diploma (a1)   Experience (a2)   Decision (d)
x1    1              2                 1
x2    -              3                 2
x3    -              3                 2
x4    -              1                 1
x5    3              2                 2
x6    -              1                 1
x7    -              1                 1
x8    -              3                 2


Translating Table 6 back to the original attribute values gives four decision rules:

1) Experience = 1 (High) => Decision = 1 (Accept)
2) Experience = 3 (Low) => Decision = 2 (Reject)
3) Experience = 2 (Medium) and Diploma = 1 (MBA) => Decision = 1 (Accept)
4) Experience = 2 (Medium) and Diploma = 3 (MSc) => Decision = 2 (Reject)

The same procedure applied to the other reduct, {a2, a4}, gives the reduced decision table in Table 7 and the value-reduced table in Table 8, from which another four rules are obtained (for Reference the codes 1, 2 and 3 stand for Excellent, Good and Neutral):

1) Experience = 1 (High) => Decision = 1 (Accept)
2) Experience = 3 (Low) => Decision = 2 (Reject)
3) Experience = 2 (Medium) and Reference = 1 (Excellent) => Decision = 1 (Accept)
4) Experience = 2 (Medium) and Reference = 3 (Neutral) => Decision = 2 (Reject)

Table 7: Decision table reduced to the reduct {a2, a4}

U     Experience (a2)   Reference (a4)   Decision (d)
x1    2                 1                1
x2    3                 3                2
x3    3                 2                2
x4    1                 3                1
x5    2                 3                2
x6    1                 1                1
x7    1                 2                1
x8    3                 1                2

Table 8: Value-reduced decision table for the reduct {a2, a4} ("-" means the value is not needed)

U     Experience (a2)   Reference (a4)   Decision (d)
x1    2                 1                1
x2    3                 -                2
x3    3                 -                2
x4    1                 -                1
x5    2                 3                2
x6    1                 -                1
x7    1                 -                1
x8    3                 -                2
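To close the example, the four rules obtained from the reduct {a2, a4} can be applied back to the relevant columns (a2, a4, d) of Table 2; the sketch below (again our own illustration, not the authors' code) confirms that every applicant is classified correctly.

# Rules from Table 8: (condition, decision), with decisions 1 = Accept, 2 = Reject.
rules = [
    ({'a2': 1}, 1),             # Experience = High                          => Accept
    ({'a2': 3}, 2),             # Experience = Low                           => Reject
    ({'a2': 2, 'a4': 1}, 1),    # Experience = Medium, Reference = Excellent => Accept
    ({'a2': 2, 'a4': 3}, 2),    # Experience = Medium, Reference = Neutral   => Reject
]

table2 = {
    'x1': {'a2': 2, 'a4': 1, 'd': 1}, 'x2': {'a2': 3, 'a4': 3, 'd': 2},
    'x3': {'a2': 3, 'a4': 2, 'd': 2}, 'x4': {'a2': 1, 'a4': 3, 'd': 1},
    'x5': {'a2': 2, 'a4': 3, 'd': 2}, 'x6': {'a2': 1, 'a4': 1, 'd': 1},
    'x7': {'a2': 1, 'a4': 2, 'd': 1}, 'x8': {'a2': 3, 'a4': 1, 'd': 2},
}

def classify(obj):
    for cond, decision in rules:
        if all(obj[a] == v for a, v in cond.items()):
            return decision
    return None   # no rule matches

print(all(classify(obj) == obj['d'] for obj in table2.values()))   # True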


3. Applying RST to EMI Diagnosis

3.1 EMI Testing and Measurement

The EMI noise radiated by a motherboard is measured in an EMC laboratory with a spectrum analyzer. Figure 2 shows a typical test report. The frequency axis covers 30 MHz to 1 GHz in 1 MHz steps, and the noise amplitude measured at each frequency is recorded in dB. The applicable limits are 30 dB for 30 MHz-230 MHz and 37 dB for 231 MHz-1 GHz; for every frequency at which the measured noise exceeds the limit, the report lists the frequency (MHz), the EMI reading in dB and a warning mark. A board with such violations fails the EMI test, and the engineer must find the source of the offending noise and modify the design until the board passes.

(Figure 2: An EMI test report measured with the spectrum analyzer)
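A minimal sketch of the limit check described above follows; the function names and the shape of the readings are assumptions made for illustration, not the laboratory's software.

def limit(freq_mhz):
    """Emission limit in dB: 30 dB up to 230 MHz, 37 dB from 231 MHz to 1 GHz."""
    return 30.0 if freq_mhz <= 230 else 37.0

def warnings(readings):
    """readings: list of (frequency in MHz, measured noise in dB); return the flagged points."""
    return [(f, db) for f, db in readings if db > limit(f)]

# hypothetical spectrum-analyzer readings
sample = [(75.0, 28.4), (142.0, 33.1), (310.0, 35.2), (690.0, 39.8)]
print(warnings(sample))   # [(142.0, 33.1), (690.0, 39.8)]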


3.2 Building the EMI Diagnostic System with RST

The RST-based EMI diagnostic system is built through the ten steps shown in Figure 3: data collection, data preprocessing, discretization, attribute reduction, reduction filtering, rule generation, rule filtering, classification, accuracy calculation and, finally, construction of the diagnostic system. Each step is described below.

(Figure 3: The procedure for building the RST-based EMI diagnostic system)

(1) Data collection: Historical EMI test records were collected from a well-known motherboard manufacturer in Taiwan. Each record contains 11 condition attributes and one decision attribute (Table 9). In total 415 records were gathered, of which 281 are used for training and 134 for testing.

(2) Data preprocessing: The decision attribute is the EMI failure area, i.e. the group of ports from which the excessive noise originates. The ports are grouped into three areas (Figure 4): A (Printer, PS2, COM), B (VGA) and C (LAN, USB, Audio, Game).

(3) Discretization: The condition attributes, the meaning of each attribute and the numbers of attribute values before and after discretization are summarized in Table 9.


Table 9: Condition and decision attributes

Attribute         Description                          No. of values   Code
Chipset           Chipset                              3, 3, 3         a1
IO Chipset        I/O chipset (Printer, Com, PS2)      7, 3, 7         a2
Audio             Audio chip                           3, 2, 3         a3
Memory            Memory type                          2, 2, 2         a4
PCB-size          PCB size                             2, 2, 2         a5
CPU-slot          CPU slot                             5, 3, 5         a6
USB               USB ports                            2, 2, 2         a7
LAN               LAN ports                            4, 4, 4         a8
VGA               VGA ports                            4, 3, 4         a9
Failure-Level     EMI noise level                      3, 3, 3         a10
Frequency-Level   Frequency level of the noise         8, 5, 3         a11
Failure-Area      EMI failure area (decision)          3, 3, 3         d

(Figure 4: The three failure areas A, B and C on the motherboard)

(4) Attribute reduction: The noise measurements over 30 MHz-1 GHz (in dB, with separate limits for the 30-230 MHz and 230 MHz-1 GHz bands) enter the analysis through the discretized Failure-Level and Frequency-Level attributes. The next task is to remove the condition attributes that are not needed to predict the failure area.


Because finding all minimal attribute subsets (reducts) is computationally hard, approximate algorithms such as Johnson's greedy algorithm and genetic algorithms with an appropriate fitness function are used in practice (Johnson, 1974; Vinterbo, 2000). The reducts of the EMI decision table were computed with the Rosetta software (Rosetta, 2001). Two reducts containing eight attributes were found:

{Chipset, Audio, Memory, PCB-size, CPU-slot, LAN, Failure-Level, Freq-Level}
{Chipset, Audio, PCB-size, CPU-slot, LAN, VGA, Failure-Level, Freq-Level}

together with three smaller reducts of four or five attributes:

{Chipset, VGA, Failure-Level, Freq-Level}
{Chipset, Memory, LAN, Failure-Level, Freq-Level}
{Chipset, PCB-size, LAN, Failure-Level, Freq-Level}
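The sketch below illustrates a Johnson-style greedy heuristic for approximating one reduct from the non-empty discernibility cells (cf. Johnson, 1974; Vinterbo, 2000); it is our own illustration, not the Rosetta implementation used in this study.

def greedy_reduct(cells):
    """cells: iterable of attribute sets (non-empty discernibility-matrix entries)."""
    uncovered = [set(c) for c in cells if c]
    reduct = set()
    while uncovered:
        # choose the attribute that appears in the largest number of uncovered cells
        counts = {}
        for cell in uncovered:
            for a in cell:
                counts[a] = counts.get(a, 0) + 1
        best = max(counts, key=counts.get)
        reduct.add(best)
        uncovered = [c for c in uncovered if best not in c]
    return reduct

# toy example with four cells over three attributes
print(greedy_reduct([{'a1', 'a2'}, {'a2', 'a4'}, {'a1', 'a4'}, {'a2'}]))
# e.g. {'a2', 'a1'} or {'a2', 'a4'} -- ties are broken arbitrarily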

(5) Reduction filtering: Among the reducts found, one eight-attribute reduct, {Chipset, Audio, Memory, PCB-size, CPU-slot, LAN, Failure-Level, Freq-Level}, and the four-attribute reduct {Chipset, VGA, Failure-Level, Freq-Level} were selected for rule generation.

(6) Rule generation: Decision rules are then generated from the training data projected onto the selected reducts. One of the generated rules, for example, is

Chipset (Via) AND Audio (CMI9738) AND Memory (SDRAM) AND PCB-size (Micro-ATX) AND CPU-slot (Socket478) AND LAN (None) AND Fail-Level (C) AND Freq-Level (2) => Fail-area (A)

That is, if the chipset is Via, the audio chip is CMI9738, the memory is SDRAM, the PCB size is Micro-ATX, the CPU slot is Socket 478, there is no LAN port, the noise Fail-Level is C and the Freq-Level is 2, then the EMI failure area is predicted to be area A (Printer, PS2, COM). The attributes Chipset, Audio, Memory, PCB-size, CPU-slot and LAN describe the motherboard design, while Freq-Level and Fail-Level describe the frequency band and severity of the measured EMI noise.
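Such rules can be stored in a simple machine-readable form. The sketch below (illustrative only; the board record is hypothetical) encodes the example rule as a condition dictionary paired with a failure area and matches it against a board description.

rule = ({'Chipset': 'Via', 'Audio': 'CMI9738', 'Memory': 'SDRAM', 'PCB-size': 'Micro-ATX',
         'CPU-slot': 'Socket478', 'LAN': 'None', 'Fail-Level': 'C', 'Freq-Level': 2},
        'A')   # => Fail-area A (Printer, PS2, COM)

board = {'Chipset': 'Via', 'Audio': 'CMI9738', 'Memory': 'SDRAM', 'PCB-size': 'Micro-ATX',
         'CPU-slot': 'Socket478', 'LAN': 'None', 'Fail-Level': 'C', 'Freq-Level': 2}

conditions, area = rule
if all(board.get(k) == v for k, v in conditions.items()):
    print('Suspected failure area:', area)   # Suspected failure area: A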

(7) Rule filtering: The generated rules are filtered so that only rules useful for diagnosis are kept before classification.

(8) Classification: A new object is classified with the filtered rules according to the strategy of Slowinski and Stefanowski (Slowinski, 1994):

1) if the object matches a deterministic logical rule, it is assigned the decision class of that rule;
2) if it matches several rules, the class supported by the matched rules (for instance the one backed by the larger number of supporting training objects) is chosen;
3) if it matches a non-deterministic logical rule, the stronger of the possible classes is chosen in the same way;
4) if it matches no rule at all, the class of the nearest rules, measured by a closeness relation between the object and the rule conditions, is used.
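A simplified reading of this strategy can be sketched as follows (our own illustration, not the exact procedure of Slowinski, 1994): exactly matching rules decide directly, and otherwise every rule votes for its failure area with a weight equal to the fraction of its conditions that the board satisfies.

from collections import defaultdict

def classify(board, rules):
    exact = [area for cond, area in rules
             if all(board.get(k) == v for k, v in cond.items())]
    if exact:
        return max(set(exact), key=exact.count)    # majority among exactly matching rules
    votes = defaultdict(float)
    for cond, area in rules:
        matched = sum(board.get(k) == v for k, v in cond.items())
        votes[area] += matched / len(cond)          # closeness of the rule to the board
    return max(votes, key=votes.get)

rules = [({'Fail-Level': 'C', 'Freq-Level': 2}, 'A'),
         ({'Fail-Level': 'A', 'Freq-Level': 1}, 'C')]
print(classify({'Fail-Level': 'C', 'Freq-Level': 3}, rules))   # A (partial match with the first rule)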

(9) Accuracy calculation: With the above procedure the classification accuracy is 90.39% on the training set and 71.64% on the testing set; the confusion matrices are given in Tables 10 and 11.

Table 10: Confusion matrix for the training set (281 records)

        A         C         B         Accuracy
A       72        3         0         96%
C       23        179       0         88.61%
B       0         1         3         75%
        75.79%    97.81%    100%      90.39%

Table 11: Confusion matrix for the testing set (134 records)

        A         C         B         Accuracy
A       46        12        0         79.31%
C       18        42        0         70%
B       4         4         8         50%
        67.65%    72.41%    100%      71.64%
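The reported figures can be recomputed directly from the raw confusion-matrix counts; the short sketch below (illustrative only) reproduces the overall accuracies of Tables 10 and 11.

def summarize(matrix):
    """matrix: 3x3 counts with rows and columns in the order A, C, B, as in Tables 10-11."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    row_acc = [matrix[i][i] / sum(matrix[i]) for i in range(len(matrix))]
    col_acc = [matrix[i][i] / sum(r[i] for r in matrix) for i in range(len(matrix))]
    return row_acc, col_acc, correct / total

train = [[72, 3, 0], [23, 179, 0], [0, 1, 3]]     # Table 10
test  = [[46, 12, 0], [18, 42, 0], [4, 4, 8]]     # Table 11
print(round(summarize(train)[2], 4))   # 0.9039 -> 90.39%
print(round(summarize(test)[2], 4))    # 0.7164 -> 71.64%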

(10) Diagnostic system: Finally, the filtered rules are embedded in an EMI diagnostic system. When a motherboard fails the EMI test, the system takes its design attributes and the measured noise levels as input and suggests the most likely failure area (A, B or C), so that the engineer can concentrate the modification effort there. Averaging the training and testing results (about 90% and 72%), the rules reach roughly 81% accuracy, which satisfies the requirement for practical EMI diagnosis support. As more historical records are collected, the steps above can be repeated to refine the rule base.


3.3 Comparison with a Decision Tree

Decision trees are another widely used rule-based classification technique and provide a natural benchmark for the RST approach to EMI diagnosis. Well-known decision tree algorithms include ID3 (Quinlan, 1986) and its successor C4.5 (Quinlan, 1993); the commercial version C5.0 (http://www.rulequest.com/) was applied here to the same EMI data, using the same training and testing sets as the RST model. Table 12 compares the two methods: C5.0 uses all 11 condition attributes and reaches 85.4% training accuracy, 68.7% testing accuracy and a 77% average, whereas the RST rules, generated from reducts of only 8 and 4 attributes, reach 88.4%, 70.6% and 79.5% respectively. RST therefore gives slightly higher accuracy while using considerably fewer attributes.

Table 12: Comparison of the RST rules and the C5.0 decision tree

Method   Attributes used   Training accuracy   Testing accuracy   Average
RST      8 / 4             88.4%               70.6%              79.5%
C5.0     11 / 11           85.4%               68.7%              77%
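C5.0 is commercial software; readers who want to repeat this kind of comparison with openly available tools could use, for example, scikit-learn's DecisionTreeClassifier as a rough stand-in, as sketched below. The arrays are hypothetical placeholders for the encoded motherboard attributes and failure areas; this is not the tool used in the study.

from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X_train = [[0, 1, 2], [1, 0, 2], [2, 1, 0], [0, 2, 1]]   # hypothetical encoded attributes
y_train = ['A', 'C', 'C', 'B']                            # hypothetical failure areas
X_test, y_test = [[1, 0, 2], [0, 1, 2]], ['C', 'A']

tree = DecisionTreeClassifier(criterion='entropy', random_state=0)
tree.fit(X_train, y_train)
print(accuracy_score(y_test, tree.predict(X_test)))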

4. Conclusions

A motherboard that fails the EMI test cannot be shipped until the source of the excessive noise has been found and the design corrected, and this diagnosis has traditionally relied on the experience of EMI engineers. This study applied rough set theory to historical EMI test data from a motherboard manufacturer and built an EMI diagnostic system through steps (1) to (9) described in Section 3. The resulting rules relate the motherboard design attributes and the measured noise to the failure area with an average accuracy of about 80%, which is adequate for practical diagnostic support and shortens the diagnosis time. Future work includes combining RST with other data mining techniques in a hybrid system and applying the RST procedure to other products that must pass EMI testing.

5. Acknowledgements

This research was partially supported by the National Science Council, Taiwan, under grant NSC-92-2213-E-211-012.

References

[1] [Chinese-language article], Vol. 18, No. 4, 2001, pp. 37-48.

[2] Archambeault, B. and Connor, S., "Measurements and Simulations for Ground-to-Ground Plane Noise for SDRAM and DDR RAM Daughter Cards and Motherboards for EMI Emissions," IEEE International Symposium on EMC, 2002, pp. 105-108.

[3] Ahna, B. S., Cho, S. S. and Kim, C. Y., "The Integrated Methodology of Rough Set Theory and Artificial Neural Network for Business Failure Prediction," Expert Systems with Applications, Vol. 18, 2000, pp. 65-74.

[4] Bayardo, R. J., Rakesh, A. and Dimitrios, G., "Constraint-Based Rule Mining in Large, Dense Databases," Proceedings of the 15th International Conference on Data Engineering, 1999, pp. 188-197.

[5] Ben-Arieh, D., Chopra, M. and Bleyberg, M. Z., "Data Mining Application for Real-time Distributed Shop Floor Control," IEEE International Conference on Systems, Man, and Cybernetics, Vol. 3, 1998, pp. 2738-2743.

[6] Chen, F.-L. and Liu, S.-F., "A Neural-Network Approach to Recognize Defect Spatial Pattern in Semiconductor Fabrication," IEEE Transactions on Semiconductor Manufacturing, Vol. 13, No. 3, 2000, pp. 366-373.

[7] Dimitras, A. I., Slowinski, R., Susmaga, R. and Zopounidis, C., "Business Failure Prediction Using Rough Sets," European Journal of Operational Research, Vol. 114, 1999, pp. 263-280.

[8] Gardner, R. and Bieker, J., "Solving Tough Semiconductor Manufacturing Problems Using Data Mining," IEEE/SEMI Advanced Semiconductor Manufacturing Conference, 2000, pp. 46-55.

[9] Han, J. and Kamber, M., Data Mining: Concepts and Techniques, San Francisco, CA: Morgan Kaufmann Publishers, 2001.

[10] Hui, S. C. and Jha, G., "Data Mining for Customer Service Support," Information & Management, Vol. 38, 2000, pp. 1-13.

[11] Johnson, D. S., "Approximation Algorithms for Combinatorial Problems," Journal of Computer and System Sciences, Vol. 9, 1974, pp. 256-278.

[12] Kusiak, A., "Decomposition in Data Mining: An Industrial Case Study," IEEE Transactions on Electronics Packaging Manufacturing, Vol. 23, No. 4, 2000, pp. 345-353.

[13] Kusiak, A., "Rough Set Theory: A Data Mining Tool for Semiconductor Manufacturing," IEEE Transactions on Electronics Packaging Manufacturing, Vol. 24, No. 1, 2001, pp. 44-50.

[14] Lee, J.-H., Yu, S.-J. and Park, S.-C., "Design of Intelligent Data Sampling Methodology Based on Data Mining," IEEE Transactions on Semiconductor Manufacturing, Vol. 17, No. 5, 2001, pp. 637-649.

[15] McKee, T. E. and Lensberg, T., "Genetic Programming and Rough Sets: A Hybrid Approach to Bankruptcy Classification," European Journal of Operational Research, Vol. 138, 2002, pp. 436-451.

[16] Mills, J. P., Electromagnetic Interference, New Jersey: Prentice Hall, 1993.

[17] Sushmita, M., Sankar, K. P. and Pabitra, M., "Data Mining in Soft Computing Framework: A Survey," IEEE Transactions on Neural Networks, Vol. 13, No. 1, 2002, pp. 3-14.

[18] Morgan, D., A Handbook for EMC Testing and Measurement, IEEE Electrical Measurement Series 8, London: Peter Peregrinus, 1994.

[19] Pawlak, Z., "Rough Sets," International Journal of Information and Computer Sciences, Vol. 11, 1982, pp. 341-356.

[20] Pawlak, Z., Rough Sets: Theoretical Aspects of Reasoning About Data, Boston: Kluwer Academic Publishers, 1991.

[21] Pawlak, Z. and Slowinski, R., "Rough Set Approach to Multiattribute Decision Analysis," European Journal of Operational Research, Vol. 72, 1994, pp. 443-459.

[22] Pawlak, Z., Grzymala-Busse, J. W., Slowinski, R. and Ziarko, W., "Rough Sets," Communications of the ACM, Vol. 38, No. 11, 1995, pp. 89-95.

[23] Rosetta, version 1.4.41 (May 25, 2001), http://www.idi.ntnu.no/~aleks/rosetta/

[24] Quinlan, J. R., "Induction of Decision Trees," Machine Learning, Vol. 1, No. 1, 1986, pp. 81-106.

[25] Quinlan, J. R., C4.5: Programs for Machine Learning, San Francisco, CA: Morgan Kaufmann Publishers, 1993.

[26] Skinner, K. R., Montgomery, D. C., Runger, G. C., Fowler, J. W., McCarville, D. R., Rhoads, T. R. and Stanley, J. D., "Multivariate Statistical Methods for Modeling and Analysis of Wafer Probe Test Data," IEEE Transactions on Semiconductor Manufacturing, Vol. 15, No. 4, 2002, pp. 523-530.

[27] Slowinski, R. and Stefanowski, J., "Rough Classification with Valued Closeness Relation," in E. Diday, Y. Lechevallier, M. Schader, P. Bertrand and B. Burtschy (Eds.), New Approaches in Classification and Data Analysis, Berlin: Springer, 1994.

[28] Vinterbo, S. and Øhrn, A., "Minimal Approximate Hitting Sets and Rule Templates," International Journal of Approximate Reasoning, Vol. 25, No. 2, 2000, pp. 123-143.

[29] Walczak, B. and Massart, D. L., "Rough Set Theory," Chemometrics and Intelligent Laboratory Systems, Vol. 47, 1999, pp. 1-16.

[30] Yoshida, T. and Touzaki, H., "A Study on Association Among Dispatching Rules in Manufacturing Scheduling Problems," IEEE International Conference on Emerging Technologies and Factory Automation, Vol. 2, 1999, pp. 1355-1360.