January 4, 2020


Classification of telecom network traffic: Insight gained using statistical learning on a big data platform


Statistical learning techniques applied to network data provide a comprehensive view of traffic behavior that would not be possible using traditional descriptive statistics alone. Amie Elcan shares an application of the random forest classification method using network data queried from a big data platform and demonstrates how to interpret the model output and the value of the data insight.

Talk Title: Classification of telecom network traffic: Insight gained using statistical learning on a big data platform
Speakers: Amie Elcan (CenturyLink)
Conference: Strata Data Conference
Conf Tag: Make Data Work
Location: New York, New York
Date: September 26-28, 2017
URL: Talk Page
Slides: Talk Slides

The availability of large volumes of machine-generated network data from diverse sources, combined with performance advances in data storage and retrieval platforms, is enabling the successful application of statistical learning techniques to explain and predict network phenomena. Open source programming languages, online training, and visualization tools are accelerating the use of statistical learning. Catalogs of the most common classification and prediction algorithms are abundant and have eliminated the time-consuming need to code the algorithms from scratch.

For large service providers, data network traffic behavior is dynamic over time, and patterns are not always decipherable from observation or summarization alone. The volume of data that can now be collected is uninformative unless it is systematically analyzed.

Statistical learning models built on data queried from a big data platform are being used to classify characteristics of traffic patterns, using statistical properties of the traffic as model attributes. The output from these models is used to perform forensics on recent historical traffic measurements to detect concurrent anomalies in intraday traffic behavior.
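
The idea of using statistical properties of the traffic as model attributes can be sketched roughly as follows. This is a minimal illustration, not the method presented in the talk: the function name `traffic_features`, the choice of attributes, and the sample data are all assumptions made up for this example.

```python
from statistics import mean, pstdev


def traffic_features(hourly_mbps):
    """Summarize one day of hourly traffic samples as model attributes.

    The attribute set here is hypothetical; the talk does not list
    which statistical properties were actually used.
    """
    avg = mean(hourly_mbps)
    peak = max(hourly_mbps)
    return {
        "mean": avg,
        "std": pstdev(hourly_mbps),                  # population standard deviation
        "peak_to_mean": peak / avg if avg else 0.0,  # burstiness of the day
        "peak_hour": hourly_mbps.index(peak),        # first hour at which the peak occurs
    }


# Hypothetical 24-hour profile in Mbps, invented for illustration:
# flat overnight and daytime load with a four-hour evening peak.
evening_heavy = [10] * 18 + [40] * 4 + [10] * 2
features = traffic_features(evening_heavy)
```

Rows of attributes like these, one per link per day and paired with a known traffic class, are the kind of input a random forest implementation such as scikit-learn's `RandomForestClassifier` could be trained on.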
