Creating smaller, faster, production-worthy mobile machine learning models
Getting machine learning models ready for use on device is a major challenge. Drag-and-drop training tools can get you started, but the models they produce aren't small enough or fast enough to ship. Jameson Toole walks you through optimization, pruning, and compression techniques to keep app sizes small and inference speeds high.
| Talk Title | Creating smaller, faster, production-worthy mobile machine learning models |
|------------|---------------------------------------------------------------------------|
| Speakers | Jameson Toole (Fritz AI) |
| Conference | O’Reilly Artificial Intelligence Conference |
| Conf Tag | Put AI to Work |
| Location | London, United Kingdom |
| Date | October 15-17, 2019 |
| URL | Talk Page |
| Slides | Talk Slides |
| Video | |
Getting machine learning models ready for use on device is a major challenge. Drag-and-drop training tools can get you started, but the models they produce aren’t small enough or fast enough to ship. Jameson Toole walks you through optimization, pruning, and compression techniques to keep app sizes small and inference speeds high. Jameson explores flexible model architectures that meet performance and accuracy requirements across devices and platforms. You’ll discover pruning and distillation techniques to optimize model performance and quantization tools to compress models to a fraction of their original size. Jameson gives you a practical example of this process as he creates an artistic style transfer model that’s just 17 KB. All of these techniques are applied to mobile machine learning frameworks such as Core ML and TensorFlow Lite.
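To make the compression idea concrete, here is a minimal sketch of the core operation behind weight quantization: affine mapping of float32 weights to int8, which shrinks storage 4x at the cost of a small, bounded reconstruction error. This is an illustrative standalone implementation, not the talk's actual pipeline; in practice frameworks like TensorFlow Lite or Core ML Tools perform this conversion for you.

```python
import numpy as np

def quantize_int8(weights):
    """Affine-quantize float32 weights to int8.

    Returns (q, scale, zero_point) such that
    weights ~= (q - zero_point) * scale.
    """
    lo, hi = float(weights.min()), float(weights.max())
    # Map the observed [lo, hi] range onto the 256 int8 levels.
    scale = (hi - lo) / 255.0 if hi > lo else 1.0
    zero_point = int(round(-lo / scale)) - 128
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float32 weights from int8 values."""
    return (q.astype(np.float32) - zero_point) * scale

# Toy weight tensor (illustrative values, not from a real model).
w = np.array([-1.0, -0.5, 0.0, 0.5, 1.0], dtype=np.float32)
q, scale, zp = quantize_int8(w)
w_hat = dequantize(q, scale, zp)
# int8 storage is 4x smaller than float32; per-weight error stays within one scale step.
```

The same principle, applied tensor-by-tensor with calibrated ranges, is what lets tools like `tf.lite.TFLiteConverter` (with `optimizations = [tf.lite.Optimize.DEFAULT]`) compress a model to a fraction of its original size.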