Big sensing data is prevalent in both industrial and scientific research applications, where data is generated with high volume and velocity. Current big sensing data processing on Cloud has adopted some data compression techniques. However, because of the high volume and velocity of big sensing data, traditional data compression techniques lack sufficient efficiency and scalability for data processing. Based on specific on-Cloud data compression requirements, a scalable data compression approach is proposed that calculates the similarity among partitioned data chunks. Instead of compressing basic data units, compression is conducted over the partitioned data chunks. To restore the original data sets, restoration functions and predictions are designed. MapReduce is used for algorithm implementation to achieve extra scalability on Cloud. The proposed scalable compression approach based on data chunk similarity can significantly improve data compression efficiency with an affordable loss of data accuracy.
Keywords—Cloud Computing; Big Sensing Data; MapReduce Algorithm for Big Data; Compression Techniques.
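
The idea of compressing over partitioned data chunks rather than basic data units can be illustrated with a minimal single-machine sketch. It is not the implementation proposed here: the fixed chunk size, the maximum-absolute-difference similarity measure, the `tolerance` threshold, and all function names are assumptions chosen for illustration, and the MapReduce distribution step is not modelled.

```python
from typing import List, Sequence, Tuple

Chunk = Tuple[float, ...]


def partition(data: Sequence[float], size: int) -> List[Chunk]:
    """Split a sensor stream into consecutive chunks of at most `size` readings."""
    return [tuple(data[i:i + size]) for i in range(0, len(data), size)]


def max_abs_diff(a: Chunk, b: Chunk) -> float:
    """Assumed similarity measure: largest pointwise difference between two chunks."""
    return max(abs(x - y) for x, y in zip(a, b))


def compress(data: Sequence[float], size: int, tolerance: float):
    """Keep one representative per group of similar chunks.

    Returns (representatives, refs), where refs[i] is the index of the
    representative that stands in for the i-th chunk.
    """
    representatives: List[Chunk] = []
    refs: List[int] = []
    for chunk in partition(data, size):
        for idx, rep in enumerate(representatives):
            if len(rep) == len(chunk) and max_abs_diff(rep, chunk) <= tolerance:
                refs.append(idx)  # similar enough: store only a reference
                break
        else:
            representatives.append(chunk)  # no similar chunk seen yet
            refs.append(len(representatives) - 1)
    return representatives, refs


def restore(representatives: List[Chunk], refs: List[int]) -> List[float]:
    """Approximate restoration: expand each reference to its representative."""
    restored: List[float] = []
    for idx in refs:
        restored.extend(representatives[idx])
    return restored


# Hypothetical temperature readings, four chunks of three values each.
readings = [20.0, 20.1, 20.2, 20.0, 20.1, 20.3,
            35.0, 35.2, 35.1, 20.1, 20.1, 20.2]
reps, refs = compress(readings, size=3, tolerance=0.15)
approx = restore(reps, refs)
```

In this sketch the restored data differs from the original by at most `tolerance` per reading, which mirrors the trade-off between compression efficiency and data accuracy described above.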