Apache Eagle: Secure Hadoop in real time
Apache Eagle is an open source monitoring solution to instantly identify access to sensitive data, recognize malicious activities, and take action. Arun Karthick Manoharan, Edward Zhang, and Chaitali Gupta explain how Eagle helps secure a Hadoop cluster using policy-based and machine-learning user-profile-based detection and alerting.
Talk Title | Apache Eagle: Secure Hadoop in real time |
Speakers | Arun Karthick Manoharan (eBay), Yong Zhang (eBay), Chaitali Gupta (eBay) |
Conference | Strata + Hadoop World |
Conf Tag | Making Data Work |
Location | London, United Kingdom |
Date | June 1-3, 2016 |
URL | Talk Page |
Slides | Talk Slides |
Apache Eagle is an open source monitoring solution to instantly identify access to sensitive data, recognize malicious activities, and take action. Eagle is built for real-time policy evaluation and real-time machine-learning detection using Kafka, Storm, and Spark infrastructure. Eagle audits access to HDFS files and to Hive and HBase tables in real time, enforces policies defined on sensitive data access, and alerts on or blocks users’ access to that sensitive data in real time. Eagle also creates user profiles based on typical access behavior for HDFS and Hive and sends alerts when anomalous behavior is detected. Eagle can also import sensitive data information classified by external classification engines to help define its policies.
Eagle uses Kafka to process more than 10 billion security events per day and generates actionable alerts within seconds. Eagle provides an easy programming API and configuration for consuming any data source, and it ingests high-volume Hadoop audit logs into Kafka through a Log4j appender or Logstash agent, which requires considerable performance tuning of Kafka operations. To keep alert latency to a minimum, Eagle rebalances its Storm topology in real time for maximum elasticity.
Arun Karthick Manoharan, Edward Zhang, and Chaitali Gupta offer an overview of Eagle, explain how Eagle helps secure a Hadoop cluster using policy-based and machine-learning user-profile-based detection and alerting, and explore how Eagle is built with scalability and usability in mind. Topics include: