Abstract
In the atmospheric science, the scale of meteorological data is massive and growing rapidly. K-means is a fast and available cluster algorithm which has been used in many fields. However, for the large-scale meteorological data, the traditional K-means algorithm is not capable enough to satisfy the actual application needs efficiently. This paper proposes an improved MK-means algorithm (MK-means) based on MapReduce according to characteristics of large meteorological datasets. The experimental results show that MK-means has more computing ability and scalability.