Research on Multidimensional Association Rules Algorithm based on Hadoop
Yuanyuan CHENG
School of Information Sciences and Engineering, Chongqing Jiaotong University, Chongqing, 400074, CHINA
Abstract: Existing parallel multidimensional association rules algorithms have several problems. Because the data are large and unordered, communication traffic is heavy; because the data are not uniformly distributed, the load cannot be balanced; and system I/O performance is low as a result. To address these problems, a new parallel multidimensional association rules algorithm based on the Hadoop platform is proposed. The algorithm uses equi-width discretization to split each attribute domain into bins. After a single scan, it deletes the attribute values in each bin that do not meet the requirement, and then maps the remaining values to a unified space. After this preprocessing, an improved Apriori algorithm is implemented using the MapReduce programming model. The results show that the improved algorithm better solves the above problems while also improving mining efficiency, and that it has great value in research and utilization.
Keywords: Multidimensional association rules algorithm; Huge communication traffic; Load balancing; Equi-width; Preprocessing
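The pipeline summarized in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: it uses plain Python in place of Hadoop, it assumes the deletion criterion is a minimum support count (`min_count`), and it counts only candidate 2-itemsets in a single map and reduce pass instead of running the full improved Apriori iteration. All function names and parameters (`equi_width_bin`, `preprocess`, `map_pairs`, `reduce_counts`, `n_bins`) are hypothetical.

```python
from collections import Counter
from itertools import combinations


def equi_width_bin(value, lo, hi, n_bins):
    """Return the 0-based index of the equi-width bin that contains value."""
    if hi == lo:
        return 0
    width = (hi - lo) / n_bins
    # Clamp so that the maximum value falls into the last bin.
    return min(int((value - lo) / width), n_bins - 1)


def preprocess(records, n_bins, min_count):
    """Discretize each attribute, drop infrequent bins, map the rest to integer ids.

    Assumption: a bin is deleted when fewer than min_count records fall into it.
    """
    n_attrs = len(records[0])
    bounds = [(min(r[a] for r in records), max(r[a] for r in records))
              for a in range(n_attrs)]
    # Each item is an (attribute index, bin index) pair.
    binned = [[(a, equi_width_bin(r[a], bounds[a][0], bounds[a][1], n_bins))
               for a in range(n_attrs)]
              for r in records]
    counts = Counter(item for row in binned for item in row)
    frequent = sorted(item for item, c in counts.items() if c >= min_count)
    # Unified space: one integer id per surviving (attribute, bin) pair.
    item_id = {item: i for i, item in enumerate(frequent)}
    transactions = [sorted(item_id[it] for it in row if it in item_id)
                    for row in binned]
    return transactions, item_id


def map_pairs(transaction):
    """Map step: emit (candidate 2-itemset, 1) for one transaction."""
    return [(pair, 1) for pair in combinations(transaction, 2)]


def reduce_counts(emitted, min_count):
    """Reduce step: sum the counts per itemset and keep the frequent ones."""
    totals = Counter()
    for key, one in emitted:
        totals[key] += one
    return {k: v for k, v in totals.items() if v >= min_count}


if __name__ == "__main__":
    records = [(1.0, 10.0), (2.0, 12.0), (9.0, 90.0), (1.5, 11.0)]
    transactions, item_id = preprocess(records, n_bins=3, min_count=2)
    emitted = [kv for t in transactions for kv in map_pairs(t)]
    print(reduce_counts(emitted, min_count=2))
```

In a real Hadoop job, `map_pairs` and `reduce_counts` would correspond to the mapper and reducer, and mapping items to compact integer ids beforehand is one way to reduce the volume of data shuffled between them.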