Your easy move to serverless computing and radically simplified data processing
February 7, 2020
Most analytic flows can benefit from serverless, starting with simple cases and moving to complex data preparation for AI frameworks like TensorFlow. To address the challenge of how to easily integrate serverless without major disruptions to your system, Gil Vernik explores the "push to the cloud" experience, which dramatically simplifies serverless for big data processing frameworks.
Data science + design thinking: A perfect blend to achieve the best user experience
February 6, 2020
Design thinking is a methodology for creative problem-solving developed at the Stanford d.school. The methodology is used by world-class design firms like IDEO and many of the world's leading brands like Apple, Google, Samsung, and GE. Michael Radwin prepares a recipe for how to apply design thinking to the development of AI/ML products.
Executive Briefing: Unpacking AutoML
February 5, 2020
Paco Nathan outlines the history and landscape for vendors, open source projects, and research efforts related to AutoML. Starting from the perspective of an AI expert practitioner who speaks business fluently, Paco unpacks the ground truth of AutoML, translating from the hype into business concerns and practices in a vendor-neutral way.
Introducing Kubeflow (with special guests TensorFlow and Apache Spark)
February 4, 2020
Modeling is easy; productizing models, less so. Distributed training? Forget about it. Say hello to Kubeflow with Holden Karau, a system that makes it easy for data scientists to containerize their models to train and serve on Kubernetes.
TFX: Production ML pipelines with TensorFlow
February 2, 2020
Putting together an ML production pipeline for training, deploying, and maintaining ML and deep learning applications involves much more than just training a model. Robert Crowe explores TensorFlow Extended (TFX), Google's open source version of the tools and libraries it uses internally, built on its years of experience developing production ML pipelines.
The moral responsibility of AI builders (sponsored by Dataiku)
February 2, 2020
With the adoption of AI in the enterprise accelerating, its impacts, both positive and negative, are rapidly increasing. Triveni Gandhi explores why the builders of these new AI capabilities all bear some moral responsibility for ensuring that their products create maximum benefit and minimal harm.
Building machine learning inference pipelines at scale
January 31, 2020
Real-life ML workloads require more than training and predicting: data often needs to be preprocessed and postprocessed. Developers and data scientists have to train and deploy a sequence of algorithms that collaborate in delivering predictions from raw data. Julien Simon outlines how to build machine learning inference pipelines using open source libraries and how to scale them on AWS.
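The sequence of collaborating algorithms the abstract describes can be sketched in plain Python. This is a minimal, hypothetical illustration of chaining preprocessing, prediction, and postprocessing stages; the talk itself covers open source libraries and scaling on AWS:

```python
# Minimal sketch of an inference pipeline: each stage transforms the
# output of the previous one, so preprocessing, prediction, and
# postprocessing stay decoupled and individually testable.

def preprocess(raw):
    # Normalize raw feature values into [0, 1].
    lo, hi = min(raw), max(raw)
    return [(x - lo) / (hi - lo) for x in raw] if hi > lo else [0.0] * len(raw)

def predict(features):
    # Stand-in model: a fixed linear score over the features.
    weights = [0.2, 0.5, 0.3]
    return sum(w * f for w, f in zip(weights, features))

def postprocess(score):
    # Turn the raw score into a business-facing label.
    return "high" if score > 0.5 else "low"

def run_pipeline(raw, stages=(preprocess, predict, postprocess)):
    result = raw
    for stage in stages:
        result = stage(result)
    return result

print(run_pipeline([10, 40, 30]))  # raw data in, label out
```

Because each stage has the same call shape, stages can be swapped or extended (for example, adding a feature-encoding step) without touching the rest of the pipeline.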
End-to-end ML streaming with Kubeflow, Kafka, and Redis at scale
January 30, 2020
With ML models now ubiquitous, model serving and pipelining are more important than ever. Comcast runs hundreds of models at scale with Kubernetes and Kubeflow. Together with popular open source streaming platforms such as Apache Kafka and Redis, Comcast invokes models billions of times per day while maintaining high availability guarantees and quick deployments. Join Nick Pinckernell to learn how.
Machine learning vital signs: Metrics and monitoring of AI in production
January 27, 2020
Production artificial intelligence systems are interacting with the real world, and it's terrifying that oftentimes nobody has any idea how they're performing on live data. Donald Miner details why you should track your models in production over time, explains how you can implement proper logging and metrics for models, and details metrics you should probably be capturing.
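One simple "vital sign" of the kind the talk argues for can be sketched as follows. The drift rule, window size, and thresholds here are hypothetical assumptions, not the speaker's method: keep a rolling window of live predictions and flag drift when the window mean moves too far from the mean observed during training.

```python
from collections import deque

class VitalSignMonitor:
    """Track a rolling window of live predictions and flag drift
    when their mean departs from the training-time baseline."""

    def __init__(self, training_mean, window=100, tolerance=0.15):
        self.training_mean = training_mean
        self.tolerance = tolerance
        self.window = deque(maxlen=window)  # oldest entries drop off automatically

    def record(self, prediction):
        self.window.append(prediction)

    def drifted(self):
        if not self.window:
            return False
        live_mean = sum(self.window) / len(self.window)
        return abs(live_mean - self.training_mean) > self.tolerance

monitor = VitalSignMonitor(training_mean=0.4)
for p in [0.35, 0.42, 0.38]:   # live traffic resembles training data
    monitor.record(p)
print(monitor.drifted())

for p in [0.9] * 50:           # live traffic shifts sharply
    monitor.record(p)
print(monitor.drifted())
```

In practice this check would feed a metrics system rather than a print statement, and the monitored signal could be any model output statistic, not just the mean.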
Model as a service for real-time decisioning
January 27, 2020
Hosting models and productionizing them is a pain point. ML models used for real-time processing require data scientists to have a defined workflow that gives them the agility to make seamless, self-service deployments to production. Niraj Tank and Sumit Daryani detail open source technologies for building a generic service-based approach to serving ML decisioning and achieving operational excellence.
Optimizing analytical queries on Cassandra by 100x
January 26, 2020
Cassandra is one of the most popular datastores in big data and ML applications. Data analysis at scale with fast query response is critical for business needs, and while Cassandra with Spark integration allows running an analytical workload, it can be slow. Shradha Ambekar dives into the challenges faced at Intuit and the solutions her team implemented to improve performance by 100x.
Overview of Data Governance
January 26, 2020
Paco Nathan offers an overview of data governance's history, themes, tools, processes, standards, and more, based in part on interviews with experts in the field about issues and best practices. Join in to learn what impact machine learning has on data governance and vice versa, along with an overview of open source projects and open standards in this space.
Unlocking your serverless functions with OpenFaaS for AI chatbot projects
January 23, 2020
Sergio Mendez examines critical challenges in implementing AI chatbots and explains how Movistar designed an open source serverless architecture using OpenFaaS on top of Kubernetes, along with complementary technologies such as NoSQL databases and message brokers, to deploy Telegram AI chatbots. Sergio then compares these technologies to "vendor lock-in" services offered by major cloud providers.
What's your machine learning score?
January 23, 2020
ML in production is different from ML in an R&D environment. Tania Allard dives into a number of techniques for testing your ML systems for quality and decay in both R&D and production environments. You'll see examples of issues commonly encountered in ML and learn how to test and monitor your data, model development, and infrastructure.
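The kinds of checks the abstract mentions, covering data, model quality, and decay, can be sketched as plain assertions that run in CI for both R&D and production environments. All names, fields, and thresholds below are illustrative assumptions, not the speaker's test suite:

```python
def check_schema(rows, required):
    """Data test: every record carries the fields the model expects."""
    return all(required <= set(row) for row in rows)

def check_ranges(rows, field, lo, hi):
    """Data test: values stay inside the range seen during training."""
    return all(lo <= row[field] <= hi for row in rows)

def check_quality(y_true, y_pred, threshold=0.8):
    """Model test: holdout accuracy must not decay below a floor."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true) >= threshold

rows = [{"age": 34, "income": 52_000}, {"age": 29, "income": 48_000}]
print(check_schema(rows, {"age", "income"}))
print(check_ranges(rows, "age", 18, 90))
print(check_quality([1, 0, 1, 1, 0], [1, 0, 1, 0, 0]))  # 4/5 accuracy
```

Running the same checks against live production data (rather than a static holdout) is what turns these from one-off unit tests into decay monitoring.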
Data-driven digital transformation and jobs: The new software hierarchy and ML
January 13, 2020
Robert Cohen discusses the skills that employers are seeking from employees in digital jobs, linked to the new software hierarchy driving digital transformation. Robert describes this software hierarchy as one that ranges from DevOps, CI/CD, and microservices to Kubernetes and Istio. This hierarchy is used to define the jobs that are central to data-driven digital transformation.
Combining WrapFS and eBPF to Provide a Lightweight File System Sandboxing Framework
January 12, 2020
Filesystem (FS) sandboxing is a useful technique to protect sensitive data from untrusted binaries. However, existing approaches do not allow fine-grained control over policy enforcement (e.g., seccom …
Evaluating cybersecurity defenses with a data science approach
January 12, 2020
Cybersecurity analysts are under siege to keep pace with the ever-changing threat landscape. Analysts are overworked, bombarded with and burned out by the sheer number of alerts they must carefully investigate. Brennan Lodge and Jay Kesavan explain how to use a data science model for alert evaluations to empower your cybersecurity analysts.
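As a hypothetical illustration of the idea (not the speakers' actual model), an alert-evaluation model can be as simple as a scoring function that ranks incoming alerts so analysts triage the highest-risk ones first instead of working the queue in arrival order. The fields and weights below are invented for the sketch:

```python
SEVERITY = {"low": 1, "medium": 2, "high": 3}

def score_alert(alert):
    # Weight severity, asset criticality, and whether the source
    # already appears on a threat intelligence list.
    return (SEVERITY[alert["severity"]] * 2
            + alert["asset_criticality"]
            + (3 if alert["on_threat_list"] else 0))

alerts = [
    {"id": 1, "severity": "low", "asset_criticality": 1, "on_threat_list": False},
    {"id": 2, "severity": "high", "asset_criticality": 3, "on_threat_list": True},
    {"id": 3, "severity": "medium", "asset_criticality": 2, "on_threat_list": False},
]
triaged = sorted(alerts, key=score_alert, reverse=True)
print([a["id"] for a in triaged])  # highest-risk alerts first
```

A real deployment would replace the hand-tuned weights with a model trained on historical analyst dispositions, but the triage loop around it looks the same.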
Executive Briefing: Overview of data governance
January 11, 2020
Effective data governance is foundational for AI adoption in enterprise, but it's an almost overwhelming topic. Paco Nathan offers an overview of its history, themes, tools, process, standards, and more. Join in to learn what impact machine learning has on data governance and vice versa.
The Lyft data platform: Now and in the future
January 5, 2020
Lyft's data platform is at the heart of the company's business. Decisions from pricing to ETAs to business operations rely on Lyft's data platform. Moreover, it powers the enormous scale and speed at which Lyft operates. Mark Grover and Deepak Tiwari walk you through the choices Lyft made in the development and sustenance of the data platform, along with what lies ahead.