TY - JOUR
T1 - An efficient divide-and-conquer approach for big data analytics in machine-to-machine communication
AU - Ahmad, Awais
AU - Paul, Anand
AU - Rathore, M. Mazhar
N1 - Publisher Copyright:
© 2015 Elsevier B.V.
PY - 2016/1/22
Y1 - 2016/1/22
N2 - Machine-to-Machine (M2M) communication relies on the physical objects (e.g., satellites, sensors, and so forth) interconnected with each other, creating mesh of machines producing massive volume of data about large geographical area (e.g., living and non-living environment). Thus, the M2M is an ideal example of Big Data. On the contrary, the M2M platforms that handle Big Data might perform poorly or not according to the goals of their operator (in term of cost, database utilization, data quality, processing and computational efficiency, analysis and feature extraction applications). Therefore, to address the aforementioned needs, we propose a new effective, memory and processing efficient system architecture for Big Data in M2M, which, unlike other previous proposals, does not require whole set of data to be processed (including raw data sets), and to be kept in the main memory. Our designed system architecture exploits divide-and-conquer approach and data block-wise vertical representation of the database follows a particular petitionary strategy, which formalizes the problem of feature extraction applications. The architecture goes from physical objects to the processing servers, where Big Data set is first transformed into a several data blocks that can be quickly processed, then it classifies and reorganizes these data blocks from the same source. In addition, the data blocks are aggregated in a sequential manner based on a machine ID, and equally partitions the data using fusion algorithm. Finally, the results are stored in a server that helps the users in making decision. The feasibility and efficiency of the proposed system architecture are implemented on Hadoop single node setup on UBUNTU 14.04 LTS core™i5 machine with 3.2 GHz processor and 4 GB memory. The results show that the proposed system architecture efficiently extract various features (such as River) from the massive volume of satellite data.
AB - Machine-to-Machine (M2M) communication relies on the physical objects (e.g., satellites, sensors, and so forth) interconnected with each other, creating mesh of machines producing massive volume of data about large geographical area (e.g., living and non-living environment). Thus, the M2M is an ideal example of Big Data. On the contrary, the M2M platforms that handle Big Data might perform poorly or not according to the goals of their operator (in term of cost, database utilization, data quality, processing and computational efficiency, analysis and feature extraction applications). Therefore, to address the aforementioned needs, we propose a new effective, memory and processing efficient system architecture for Big Data in M2M, which, unlike other previous proposals, does not require whole set of data to be processed (including raw data sets), and to be kept in the main memory. Our designed system architecture exploits divide-and-conquer approach and data block-wise vertical representation of the database follows a particular petitionary strategy, which formalizes the problem of feature extraction applications. The architecture goes from physical objects to the processing servers, where Big Data set is first transformed into a several data blocks that can be quickly processed, then it classifies and reorganizes these data blocks from the same source. In addition, the data blocks are aggregated in a sequential manner based on a machine ID, and equally partitions the data using fusion algorithm. Finally, the results are stored in a server that helps the users in making decision. The feasibility and efficiency of the proposed system architecture are implemented on Hadoop single node setup on UBUNTU 14.04 LTS core™i5 machine with 3.2 GHz processor and 4 GB memory. The results show that the proposed system architecture efficiently extract various features (such as River) from the massive volume of satellite data.
KW - Big Data
KW - Data fusion domain
KW - Divide-and-conquer
KW - M2M
UR - http://www.scopus.com/inward/record.url?scp=84945582178&partnerID=8YFLogxK
U2 - 10.1016/j.neucom.2015.04.109
DO - 10.1016/j.neucom.2015.04.109
M3 - Article
AN - SCOPUS:84945582178
SN - 0925-2312
VL - 174
SP - 439
EP - 453
JO - Neurocomputing
JF - Neurocomputing
ER -