Stream analytics in the enterprise: A look at Intel’s internal IoT implementation

Moty Fania shares Intel IT’s experience implementing an on-premises IoT platform for internal use cases. The platform was based on open source big data technologies and containers and was designed as a multitenant platform with built-in analytical capabilities. Moty highlights the key lessons learned from this journey and offers a thorough review of the platform’s architecture.


Talk Title	Stream analytics in the enterprise: A look at Intel’s internal IoT implementation
Speakers	Moty Fania (Intel)
Conference	Strata + Hadoop World
Conf Tag	Making Data Work
Location	London, United Kingdom
Date	June 1-3, 2016

Recent years have seen significant evolution of the Internet of Things. It has become increasingly easy to connect devices to the Internet and send sensor data to the public cloud. However, adoption of IoT platforms and stream analytics within the enterprise is lagging, due in part to companies’ lack of the expertise and skills required to deploy an on-premises platform and demonstrate high value through real-life use cases.

Moty Fania shares Intel IT’s experience implementing an on-premises IoT platform for internal use cases. The platform was based on open source big data technologies and containers and was designed as a multitenant platform with built-in analytical capabilities. Moty highlights the key lessons learned from this journey and offers a thorough review of the platform’s architecture.

Intel IT’s goal was to allow users and organizations in Intel to gain insights and business value from real-time analytics and become more proactive. Intel deployed a platform based on several open source technologies, including Akka, Kafka, and Spark Streaming, with a full stack of algorithms such as multisensor change detection, anomaly detection, and more. Unlike other IoT analytics implementations that settle for basic statistics or make many assumptions about the collected data, Intel’s implementation includes a generic analytics layer that uses machine learning and advanced statistical tests to provide meaningful insights to users in different use cases and business domains.

Moty outlines Intel’s “smart data pipe”/stream processing framework, Pigeon, which enables stream analytics at scale. Pigeon, based on Akka, implements a cluster that runs processing topologies, each of which processes data according to arbitrary user-defined logic. It handles the creation of topologies, balances them across the cluster, and allows nodes to join or leave dynamically. Pigeon is optimized to be easily deployed with Docker and CoreOS and reduces development effort by enabling a single developer to deploy a massive real-time, elastic processing cluster with the click of a button.

Spark Streaming was used to deploy self-service data monitors that allow users to define their own rules and get an actuation when a certain condition is met. These user-defined rules are monitored in near-real-time on the stream.

Moty then explains how Pigeon and its analytics capabilities were applied to several use cases, both internally and externally, with interesting results. In one POC, Pigeon helped identify a fab tool causing a yield problem; in another POC, it showed malfunctions of electrical network voltage sensors. Moty concludes by exploring how operational activities can be “translated” into IoT stream analytics scenarios to allow a higher level of proactivity and a shift from manual monitoring and firefighting to higher-value work.
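
To make the idea of self-service data monitors more concrete, here is a minimal sketch of evaluating user-defined rules against a stream of sensor readings. It is an illustration only, not Intel's implementation: it uses plain Python rather than Spark Streaming, and the names Reading, Rule, and monitor, as well as the sensor IDs and the threshold, are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Sequence, Tuple


@dataclass(frozen=True)
class Reading:
    """One sensor measurement (hypothetical schema)."""
    sensor_id: str
    value: float


@dataclass(frozen=True)
class Rule:
    """A user-defined rule: a named condition on one sensor's values."""
    name: str
    sensor_id: str
    condition: Callable[[float], bool]


def monitor(stream: Iterable[Reading],
            rules: Sequence[Rule]) -> Iterator[Tuple[str, Reading]]:
    """Yield (rule name, reading) each time a reading satisfies a rule."""
    for reading in stream:
        for rule in rules:
            if rule.sensor_id == reading.sensor_id and rule.condition(reading.value):
                yield rule.name, reading


# Assumed example data: one rule and a short in-memory "stream".
rules = [Rule("overheat", "temp-1", lambda v: v > 80.0)]
stream = [
    Reading("temp-1", 72.5),
    Reading("volt-7", 229.0),
    Reading("temp-1", 85.1),
]

alerts = list(monitor(stream, rules))
for name, reading in alerts:
    # In a real deployment this is where an actuation would be triggered.
    print(f"{name}: {reading.sensor_id}={reading.value}")
```

In a production setting the in-memory list would be replaced by a live source such as a message queue, and the print call by whatever actuation the user configured; the rule-matching loop itself would stay conceptually the same.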
