Abstract
Changes in weather can substantially alter users' electricity consumption behavior. To discover associations between electricity consumption behavior and weather factors, and to mine massive data efficiently, this paper presents a parallel association rule mining method based on MapReduce. First, to convert the electricity data into a form suitable for association rule mining, a density-based K-means method is used to discretize the consumption data, and a parallel method for calculating sample density is presented. The clustering results are then combined with the weather data. Next, parallel frequent itemset mining and association rule mining methods are designed and implemented based on Apriori. A Combiner function is added to the frequent itemset mining MapReduce job; it applies a local support threshold to filter out some candidate itemsets, thereby reducing the time needed to transmit data among slave nodes. Finally, experiments are conducted on a Hadoop cluster. The experimental results validate the effectiveness and scalability of the proposed method.
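The Combiner step can be illustrated with a minimal single-process sketch in Python. It is not the paper's Hadoop implementation: the function names, the item labels, the two data splits, and both threshold values are assumptions chosen for illustration, and the abstract does not say how the local threshold is set.

```python
from collections import Counter


def map_phase(transactions, candidates):
    """Emit (itemset, 1) for every candidate contained in a transaction."""
    for transaction in transactions:
        items = set(transaction)
        for candidate in candidates:
            if candidate <= items:
                yield candidate, 1


def combine_phase(pairs, local_min_count):
    """Sum counts for one split and drop itemsets below the local threshold."""
    local = Counter()
    for itemset, count in pairs:
        local[itemset] += count
    return {k: v for k, v in local.items() if v >= local_min_count}


def reduce_phase(partials, global_min_count):
    """Merge the per-split counts and keep the globally frequent itemsets."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # adds counts key by key
    return {k: v for k, v in total.items() if v >= global_min_count}


# Hypothetical transactions: a discretized load level joined with a weather label.
split_a = [["load=high", "temp=hot"]] * 3 + [["load=low", "temp=mild"]]
split_b = [["load=high", "temp=hot"]] * 2 + [
    ["load=low", "temp=mild"],
    ["load=low", "temp=hot"],
]

# Candidate 2-itemsets, as an Apriori pass would supply them.
candidates = [
    frozenset({"load=high", "temp=hot"}),
    frozenset({"load=low", "temp=mild"}),
    frozenset({"load=low", "temp=hot"}),
]

# Each split is mapped and combined independently, as on separate nodes.
partials = [
    combine_phase(map_phase(split, candidates), local_min_count=2)
    for split in (split_a, split_b)
]
frequent = reduce_phase(partials, global_min_count=4)
print(frequent)
```

In this sketch only one itemset per split survives the combine step, so fewer pairs would cross the network. The trade-off is that an itemset dropped on one split loses that split's count in the global total, so the local threshold has to be chosen with care.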