Deep Mining Method of Distributed Data Association Based on Decision Tree Algorithm
Abstract
In the era of big data, the rapid growth in the volume of generated data makes it difficult to determine the associations among data. This paper therefore studies the deep mining of distributed data associations based on the decision tree algorithm. A top-down recursive method is adopted to compare attribute values at the internal nodes of the decision tree, determine the downward branches from each node according to the different attribute values, and generate a decision tree through probability estimation over a single tree and multiple trees. The gain ratio algorithm is used to improve on the information gain algorithm, yielding the heuristic information for the decision tree and selecting the most appropriate test attributes. In addition, the pruning strategy applied to the decision tree is optimized by setting multiple thresholds. The optimized decision tree is then used to deeply mine the associations in horizontally and vertically distributed data. The results show that the decision tree constructed by this method can accurately and deeply mine different attributes, that the mining process is stable, and that the mining results meet the needs of practical application.
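As an illustration of the attribute-selection step, the following is a minimal sketch of the standard gain ratio criterion (information gain divided by split information), assuming categorical attributes stored in dictionaries. The function names, the toy records, and the attribute names `id` and `windy` are hypothetical and are not taken from the paper; the authors' implementation may differ.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def gain_ratio(rows, labels, attr):
    """Gain ratio of splitting on `attr`: information gain / split information."""
    total = len(labels)
    # Group the class labels by the value each row takes on `attr`.
    partitions = {}
    for row, label in zip(rows, labels):
        partitions.setdefault(row[attr], []).append(label)
    # Expected entropy remaining after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = -sum(
        (len(p) / total) * math.log2(len(p) / total) for p in partitions.values()
    )
    return info_gain / split_info if split_info > 0 else 0.0


def best_attribute(rows, labels, attrs):
    """Pick the candidate test attribute with the highest gain ratio."""
    return max(attrs, key=lambda a: gain_ratio(rows, labels, a))


# Hypothetical toy data: `id` is unique per record, `windy` is a real predictor.
rows = [
    {"id": 1, "windy": "no"},
    {"id": 2, "windy": "no"},
    {"id": 3, "windy": "yes"},
    {"id": 4, "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

print(best_attribute(rows, labels, ["id", "windy"]))
```

In this toy set both attributes have the same information gain, since each separates the classes perfectly. The unique `id` attribute has a larger split information, so its gain ratio is lower and `windy` is selected. This is the bias of plain information gain toward many-valued attributes that the gain ratio is intended to correct.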
Keywords: Decision tree algorithm; Distributed data; Association; Deep mining; Probability estimation; Pruning strategy.
Cite As
J. Cai, Y. Ding, "Deep Mining Method of Distributed Data Association Based on Decision Tree Algorithm", Engineering Intelligent Systems, vol. 31, no. 3, pp. 229-273, 2023.