LADRA: Log-based abnormal task detection and root-cause analysis in big data processing with Spark

Siyang Lu, Xiang Wei, Bingbing Rao, Byungchul Tak, Long Wang, Liqiang Wang

Research output: Contribution to journal › Article › peer-review

31 Scopus citations

Abstract

As big data processing is adopted across many domains, the massive amounts of data being generated rely increasingly on parallel computing platforms for analysis, among which Spark is one of the most widely used frameworks. Abnormal tasks in Spark may cause significant performance degradation, and detecting them and diagnosing their root causes is extremely challenging. To that end, we propose an innovative tool, named LADRA, for log-based abnormal task detection and root-cause analysis using Spark logs. In LADRA, a log parser first converts raw log files into structured data and extracts features. Then, a detection method is proposed to determine where and when abnormal tasks happen. To analyze root causes, we further extract pre-defined factors based on these features. Finally, we leverage a General Regression Neural Network (GRNN) to identify root causes for abnormal tasks. The likelihoods of the reported root causes are presented to users according to the factors weighted by GRNN. LADRA is an off-line tool that can accurately analyze abnormality without extra monitoring overhead. Four potential root causes, i.e., CPU, memory, network, and disk I/O, are considered. We have tested LADRA on three Spark benchmarks by injecting the aforementioned root causes. Experimental results show that our proposed approach is more accurate in root-cause analysis than other existing methods.
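The final step of the abstract, scoring root causes with a GRNN, can be illustrated with a minimal sketch. It uses the textbook GRNN formulation (a Gaussian-kernel-weighted average of training targets), not the authors' implementation. The factor vectors, the `sigma` smoothing value, and the names `grnn_predict`, `train_x`, `train_y`, and `query` are all assumptions made up for illustration. Only the four cause labels come from the abstract.

```python
import math

# The four root causes named in the abstract.
CAUSES = ["CPU", "memory", "network", "disk I/O"]


def grnn_predict(train_x, train_y, query, sigma=0.5):
    """Textbook GRNN: kernel-weighted average of the training targets.

    Each training sample is weighted by a Gaussian of its squared
    Euclidean distance to the query; sigma is the smoothing parameter.
    """
    weights = []
    for x in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase sigma")
    n_out = len(train_y[0])
    return [
        sum(w * y[k] for w, y in zip(weights, train_y)) / total
        for k in range(n_out)
    ]


# Hypothetical factor vectors (one per injected cause), invented for this
# sketch; real factors would be derived from parsed Spark log features.
train_x = [
    [0.9, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.9, 0.1],
    [0.1, 0.1, 0.1, 0.9],
]
# One-hot targets: row i marks cause i as the injected root cause.
train_y = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Hypothetical factors extracted for one abnormal task.
query = [0.8, 0.2, 0.1, 0.1]
scores = grnn_predict(train_x, train_y, query)

# Rank causes by score, highest first.
ranked = sorted(zip(CAUSES, scores), key=lambda pair: pair[1], reverse=True)
for cause, score in ranked:
    print(f"{cause}: {score:.3f}")
```

Because the targets are one-hot, the scores sum to one and can be read as relative likelihoods of each cause. How LADRA actually defines and weights its factors is described in the paper itself.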

Original language: English
Pages (from-to): 392-403
Number of pages: 12
Journal: Future Generation Computer Systems
Volume: 95
State: Published - Jun 2019

Keywords

  • Abnormal task
  • Log analysis
  • Root cause
  • Spark
