Abstract
Social data such as comments are massive and unstructured, so the existing relational data model falls short in processing this kind of data. Moreover, it is difficult to analyze users' responses in real time from dynamically growing social data. In this paper, to quickly analyze social users' comments, we design a fast social-user reaction analysis system based on Hadoop for storing big data and on Spark, a distributed in-memory engine, for data processing. In the experiments, about one terabyte of social data, composed of around 1.6 billion records, is first stored and pre-processed. An n-gram algorithm is then used to analyze the comment responses. While running this algorithm, the data is loaded directly into memory rather than onto cluster disk, which makes it possible to process social users' responses in real time.
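
The abstract does not spell out how the n-gram step is implemented, so the following is only a minimal single-machine sketch of word-level n-gram counting over comments, written in plain Python. The function names (`word_ngrams`, `count_ngrams`), the whitespace tokenization, the choice of bigrams, and the sample comments are all illustrative assumptions, not details taken from the paper.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Return the word-level n-grams of `text` as a list of tuples.

    Assumption: tokens are lowercased and split on whitespace; the paper
    may tokenize differently.
    """
    tokens = text.lower().split()
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def count_ngrams(comments, n=2):
    """Count how often each n-gram occurs across an iterable of comments."""
    counts = Counter()
    for comment in comments:
        counts.update(word_ngrams(comment, n))
    return counts


# Hypothetical sample comments, not data from the paper.
sample_comments = [
    "great post thanks",
    "great post indeed",
    "thanks for sharing",
]
bigram_counts = count_ngrams(sample_comments, n=2)
```

In a distributed setting the same idea would plausibly map onto Spark as a per-comment n-gram extraction followed by a keyed aggregation (for example `flatMap` followed by `reduceByKey` on an RDD), with the data kept in memory between steps. Whether the system described here is structured that way is not stated in the abstract.
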
Original language | English |
---|---|
Pages (from-to) | 9345-9349 |
Number of pages | 5 |
Journal | International Journal of Applied Engineering Research |
Volume | 11 |
Issue number | 18 |
State | Published - 2016 |
Keywords
- Hadoop
- N-gram
- Social data
- Spark
- SparkSQL