Observation Analysis of Decision Tree Extraction from Artificial Neural Network


Abstract

The ability of artificial neural networks to learn and generalize complex relationships from a collection of training examples has been established through numerous research studies in recent years. The knowledge acquired by neural networks, however, is considered incomprehensible and not transferable to other knowledge representation schemes such as expert or rule-based systems.

Two of the commonly used techniques are artificial neural networks and decision trees. Artificial neural networks (ANNs) have empirically been shown to generalize well on several machine learning problems. Reasonably satisfactory answers to questions such as how many examples are needed for a neural network to learn a concept, and what the best network architecture is for a particular problem domain, are available in the form of learning theory, so it is now possible to train neural networks without guesswork. This makes them an excellent tool for data mining, where the focus is on learning data relationships from huge databases. However, there are applications such as credit approval and medical diagnosis where explaining the reasoning of the neural network is important. The major criticism against neural networks in such domains is that their decision-making process is difficult to understand. This is because the knowledge in a neural network is stored as real-valued parameters, the knowledge is encoded in a distributed fashion, and the mapping learnt by the network can be non-linear as well as non-monotonic.

One may wonder why neural networks should be used at all when comprehensibility is an important issue. The reason is that neural networks have an appropriate inductive bias for many machine learning domains: the predictive accuracies obtained with neural networks are often significantly higher than those obtained with other learning paradigms, particularly decision trees. Decision trees, on the other hand, have been preferred when a good understanding of the decision process is essential, such as in medical diagnosis. Decision tree algorithms execute fast, are able to handle a large number of records with a large number of fields with predictable response times, handle both symbolic and numerical data well, are better understood, and can easily be translated into if-then-else rules.

However, decision tree algorithms have a few shortcomings. Firstly, tree induction algorithms are unstable, i.e. the addition or deletion of a few samples can make the induction algorithm yield a radically different tree. This is because the partitioning features (splits) are chosen based on the sample statistics, and even a few samples are capable of changing those statistics. The split selection (i.e. the selection of the next attribute to be tested) is greedy; once a split is selected, there is no backtracking in the search, so tree induction algorithms are subject to all the risks of hill-climbing algorithms, mainly that of converging to locally optimal solutions. Secondly, the size of a decision tree (the total number of internal nodes and leaves) depends on the complexity of the concept being learnt, the sample statistics, the noise in the sample and the number of training examples. It is difficult to control the size of the extracted decision trees, and sometimes very large trees are generated, making them difficult to comprehend. Most decision tree algorithms employ a pruning mechanism to reduce the size of the tree; however, pruning may sometimes reduce the generalization accuracy of the tree.
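To make the greedy, statistics-driven nature of split selection concrete, the following is a minimal Python sketch of C4.5-style gain-ratio split selection; it is not code from the thesis, and all names and the toy data are illustrative. Because the winning attribute depends entirely on the statistics of the current sample, perturbing a few examples can change the chosen split and hence the shape of the whole tree.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(rows, labels, attr):
    """C4.5's gain ratio for splitting on the discrete attribute `attr`."""
    n = len(labels)
    partitions = {}
    for row, y in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(y)
    remainder = sum(len(p) / n * entropy(p) for p in partitions.values())
    gain = entropy(labels) - remainder                 # information gain
    split_info = entropy([row[attr] for row in rows])  # penalises many-valued splits
    return gain / split_info if split_info > 0 else 0.0

def best_split(rows, labels, attrs):
    # Greedy, no backtracking: the attribute chosen here is fixed for good,
    # and it depends entirely on the statistics of the current sample.
    return max(attrs, key=lambda a: gain_ratio(rows, labels, a))

# Toy example: rows are tuples of discrete attribute values.
rows = [("sunny", "hot"), ("sunny", "mild"), ("rain", "mild"), ("rain", "hot")]
labels = ["no", "no", "yes", "yes"]
print(best_split(rows, labels, attrs=[0, 1]))  # -> 0 (the first attribute separates the classes perfectly)
```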
One of the most recent and more sophisticated algorithms is the TREPAN algorithm. TREPAN builds decision trees by recursively partitioning the instance space. However, unlike other decision tree algorithms, where the amount of training data used to select splitting tests and label leaves decreases with the depth of the tree, TREPAN uses membership queries to generate additional training data. For drawing the query instances, TREPAN uses empirical distributions to model discrete-valued features and kernel density estimates to model continuous features.

In our research, we seek to develop heuristics for employing TREPAN, an algorithm for extracting decision trees from neural networks. Typically, several parameters need to be chosen to obtain satisfactory performance from the algorithm, and the interactions between these parameters are currently not well understood. By empirically evaluating the performance of the algorithm on a test set of databases chosen from benchmark machine learning and real-world problems, several heuristics are proposed to explain and improve the performance of the algorithm. The C4.5 decision tree induction algorithm is used to analyze the datasets: the data used for cross-validation and for training the neural network models is used together as the training data for the C4.5 decision tree algorithm, and the C4.5 model is then compared with the best TREPAN model for classification accuracy and comprehensibility. The experimentation is further validated by statistical performance measures. The algorithm is extended to work on multi-class regression problems, and its ability to comprehend generalized feed-forward networks is investigated.

Further, the empirical investigation of TREPAN on these datasets yields the following heuristics: (i) for complex and highly complex datasets, the best model accuracy is obtained within a tree size in the range of the number of inputs; (ii) TREPAN generalizes better at low minimum sample sizes for highly complex datasets that have little data, because TREPAN generates instances and obtains class labels for those instances from the oracle when there is not enough data; and finally (iii) we have also observed that single test TREPAN and the TREPAN algorithm perform better than disjunctive TREPAN most of the time.
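The sketch below illustrates the query-generation step described above: discrete-valued features are resampled from their empirical distributions, continuous features are drawn from per-feature Gaussian kernel density estimates, and the trained network is then used as an oracle to label the synthetic instances. This is a hedged illustration rather than the thesis's implementation; the `predict` interface, the continuous/discrete feature split and the fixed bandwidth are assumptions made for the example.

```python
import numpy as np

def draw_query_instances(X_cont, X_disc, n_queries, bandwidth=0.1, seed=0):
    """Sample synthetic instances feature by feature.

    Continuous features: Gaussian kernel density estimate centred on
    observed training values.  Discrete features: empirical distribution
    (resampling of observed values).  Each feature is modelled
    independently, mirroring TREPAN's per-feature distribution models.
    """
    rng = np.random.default_rng(seed)
    n, d_cont = X_cont.shape
    # Pick one training value per (query, feature) pair and add kernel noise.
    centre_idx = rng.integers(0, n, size=(n_queries, d_cont))
    cont = X_cont[centre_idx, np.arange(d_cont)] + rng.normal(
        0.0, bandwidth, size=(n_queries, d_cont))
    # Resample each discrete feature according to its observed frequencies.
    disc = np.column_stack([rng.choice(X_disc[:, j], size=n_queries)
                            for j in range(X_disc.shape[1])])
    return cont, disc

def label_with_oracle(network, cont, disc):
    # The trained neural network acts as the membership-query oracle:
    # it assigns a class label to every synthetic instance.
    X = np.hstack([cont, disc])
    return network.predict(X)   # `predict` is an assumed interface
```

This is also why the minimum-sample-size parameter matters: when a node has fewer real examples than the minimum, query instances such as these are drawn and labelled by the oracle before a splitting test is selected, which is where heuristic (ii) above comes from.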

Table of Contents

  • DEDICATION
  • Abstract
  • Introduction
  • TABLE OF CONTENTS
  • Acknowledgement
  • List of Figures
  • List of Tables
  • Chapter One Introduction
  • 1.1 Background of the Study
  • 1.2 Classification Algorithms
  • 1.3 Objective of the research
  • 1.4 Thesis overview
  • Chapter Two Background and Literature Review
  • 2.1 Artificial neural networks
  • 2.1.1 Neural network architecture
  • 2.1.2 Neural Network Training
  • 2.1.3 Neural networks for classification
  • 2.1.4 Rule extraction from neural networks
  • 2.2 Decision trees
  • 2.2.1 Decision tree classifications
  • 2.2.2 Decision tree applications
  • 2.3 C4.5 Algorithm
  • 2.3.1 Information Gain, Entropy Measure and Gain Ratio
  • 2.4 TREPAN Algorithm
  • 2.4.1 M-of-N Splitting test
  • 2.4.2 Single Test TREPAN and Disjunctive TREPAN
  • 2.5 Summary
  • Chapter Three Methodology
  • 3.1 Phase 1
  • 3.2 Phase 2
  • 3.2.1 Datasets
  • 3.3 Phase 3
  • 3.4 Phase 4
  • 3.5 Performance measures
  • 3.5.1 Classification accuracy
  • 3.5.2 Comprehensibility
  • 3.6 Summary
  • Chapter Four Results and Discussion
  • 4.0 Investigating and Extending TREPAN
  • 4.1 Dataset analysis
  • 4.1.1 Outages
  • 4.1.2 Body fat
  • 4.1.3 Admissions
  • 4.9 Summary
  • Chapter Five Conclusion and Future Work
  • 5.1 Summary and Discussion
  • 5.1.1 Accuracy
  • 5.1.2 Comprehensibility
  • 5.2 Heuristics
  • 5.3 Conclusions
  • 5.4 Future Work
  • References