A Method of Classification and Association Rules Obtaining

Abstract

Data mining, popularly known as data or knowledge discovery, is the process of analyzing data from different perspectives and summarizing it into useful information: information that can be used to increase revenue, cut costs, or both. As an interesting field, data mining has attracted many researchers. Extensive research has been conducted on the major data mining techniques, but few studies have addressed the integration of these techniques.

This thesis focuses on integrating two major data mining techniques, classification and association rule mining, to produce an Optimal Class Association Rule Algorithm (OCARA), which was proposed by our study group. Classification and association rule mining are two important aspects of data mining. Classification association rule mining is a promising approach because it uses an association rule mining algorithm to discover classification rules.

OCARA inherits the strengths of both classification and association rule mining algorithms. For this reason, OCARA is a powerful algorithm compared with either classification or association rule mining algorithms alone.

To verify the strength of OCARA, we conducted experiments on eight data sets from UCI (University of California at Irvine). We compared OCARA with three other popular algorithms (C4.5, CBA, and RMR). The results showed that the rule accuracy and the number of rules are greatly influenced by the support threshold. If the support threshold is between 0.02 and 0.03, the accuracy is much better, as discussed in this thesis. In our work, the support threshold was set to 0.02 and the confidence threshold to 0.30. Under these settings, OCARA proved more efficient than the other algorithms and more robust in terms of accuracy.

The reason for OCARA's high accuracy is that it uses an optimal association rule mining algorithm and sorts the rule set by rule priority, which results in a more accurate classifier. We can therefore say with confidence that OCARA is an accurate classifier that performs better and is more efficient than the C4.5, CBA, and RMR algorithms. This thesis contributes to the young field of data mining by proposing and testing a new algorithm, OCARA.

However, OCARA produces more rules than RMR when the support is lower. To overcome this limitation, we encourage other researchers to work on this promising algorithm and improve its efficiency.
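The abstract does not reproduce the OCARA procedure itself, so the following is only a minimal sketch of the general idea it describes: mine rules of the form "itemset -> class" under a support and a confidence threshold, sort them by priority, and classify with the first matching rule. The function names, the brute-force enumeration, the `max_len` cap, and the ordering (confidence, then support, then shorter antecedent) are illustrative assumptions, not the thesis's method. Only the default thresholds (0.02 and 0.30) come from the text.

```python
from collections import Counter
from itertools import combinations


def mine_class_rules(rows, labels, min_support=0.02, min_confidence=0.30, max_len=2):
    """Enumerate rules `itemset -> class` that meet both thresholds.

    Hypothetical helper: a brute-force stand-in, not the OCARA optimal-rule search.
    """
    n = len(rows)
    itemset_counts = Counter()  # rows containing the itemset
    rule_counts = Counter()     # rows containing the itemset and carrying the class
    for items, label in zip(rows, labels):
        for k in range(1, max_len + 1):
            for itemset in combinations(sorted(items), k):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1

    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))

    # Assumed priority: higher confidence, then higher support, then shorter antecedent.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, items, default=None):
    """Return the class of the highest-priority rule whose antecedent matches."""
    items = set(items)
    for itemset, label, _, _ in rules:
        if items.issuperset(itemset):
            return label
    return default


# Toy data, invented purely for illustration.
rows = [
    {"outlook=sunny", "windy=no"},
    {"outlook=sunny", "windy=yes"},
    {"outlook=rain", "windy=yes"},
    {"outlook=rain", "windy=no"},
]
labels = ["play", "play", "stay", "play"]

rules = mine_class_rules(rows, labels)
print(classify(rules, {"outlook=rain", "windy=yes"}))
```

Raising `min_confidence` or `min_support` in this sketch shrinks the rule list, which mirrors the abstract's observation that the number of rules depends on the support threshold.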

Table of Contents

  • DEDICATION
  • ABSTRACT
  • LIST OF FIGURES
  • LIST OF TABLES
  • CHAPTER Ⅰ: INTRODUCTION
  • 1.1 General Introduction
  • 1.2 Data Mining in General
  • 1.3 Benefit of Data Mining
  • 1.3.1 Advantages of Data Mining
  • 1.3.2 Disadvantages of Data Mining
  • 1.4 Data Mining Applications
  • 1.5 Data Mining Techniques
  • 1.6 Data Mining Process
  • 1.7 Motivation and Contribution of Thesis
  • 1.8 Outline
  • CHAPTER Ⅱ: LITERATURE REVIEW
  • 2.1 Types of Rules Related to this Thesis
  • 2.1.1 What is a Rule?
  • 2.1.2 What to do with a Rule?
  • 2.1.3 Caveat: Rules do not imply Causality
  • 2.1.4 How to Evaluate the Rule
  • 2.2 Classification Rules
  • 2.3 Association Rules
  • 2.3.1 Finding Large Itemsets
  • 2.3.2 Generating Rules
  • 2.4 Optimal Association Rule Set
  • 2.4.1 Class Association Rules
  • 2.4.2 Optimal Rule Set
  • 2.5 Types of Algorithms Used for Comparison in this Thesis
  • 2.5.1 C4.5 Algorithm
  • 2.5.2 CBA Algorithm
  • 2.5.3 RMR Algorithm
  • CHAPTER Ⅲ: BASIC CONCEPTS AND THEORY
  • 3.1 Introduction
  • 3.2 Concepts and Theorems
  • CHAPTER Ⅳ: OCARA ALGORITHM
  • 4.1 Introduction
  • 4.2 Discovering the Optimal Rules Set
  • 4.3 Sorting Rules
  • 4.4 Matching Rules
  • CHAPTER Ⅴ: EXPERIMENTAL RESULTS
  • 5.1 Materials
  • 5.2 Description of the Environment
  • 5.2.1 Hardware
  • 5.2.2 Software
  • 5.3 Results and Data Analysis
  • 5.3.1 Graph of the prediction accuracy of C4.5 and OCARA algorithms
  • 5.3.2 Graph of the prediction accuracy of CBA and OCARA algorithms
  • 5.3.3 Graph of the prediction accuracy of RMR and OCARA algorithms
  • 5.3.4 Graph of the prediction accuracy of C4.5, CBA and OCARA algorithms
  • 5.3.5 Graph of the prediction accuracy of C4.5, RMR and OCARA algorithms
  • 5.3.6 Graph of the prediction accuracy of CBA, RMR and OCARA algorithms
  • 5.3.7 Graph of the prediction accuracy of C4.5, CBA, RMR and OCARA algorithms
  • CONCLUSION
  • REFERENCES
  • ACKNOWLEDGMENT
  • APPENDIX A: DETAILED CHINESE ABSTRACT
  • APPENDIX B: PUBLISHED PAPER