February 11, 2020

Practical feature engineering

Feature engineering is generally the section that gets left out of machine learning books, but it's also the most critical part in practice. Ted Dunning explores feature engineering techniques, a few well known and others rarely discussed outside the institutional knowledge of top teams, including how to handle categorical inputs, natural language, transactions, and more.


Talk Title	Practical feature engineering
Speakers	Ted Dunning (MapR, now part of HPE)
Conference	Strata Data Conference
Conf Tag	Make Data Work
Location	New York, New York
Date	September 24-26, 2019

Feature engineering is generally the section that gets left out of machine learning books, but it’s also the most important part of successful models, even in today’s world of deep learning. While academic courses on machine learning focus on gradients and the latest flavor of recurrent network, practitioners in the real world are seeking out better features and figuring out how to extract value using a variety of time-honored (and occasionally exceptionally clever) heuristics. Ted Dunning explores these techniques. In a sense, feature engineering is the Rodney Dangerfield of machine learning, never getting any respect. It is, however, the task that will get you the most value for time spent in terms of model performance. This is not the work of the data scientist alone. Good features also encode business realities and are the cross-product of good business sense and good data engineering.


Deep learning for recommender systems

January 12, 2020

The success of deep learning has reached the realm of structured data in the past few years, where neural networks have been shown to improve the effectiveness and predictability of recommendation engines. Oliver Gindele offers a brief overview of such deep recommender systems and explains how they can be implemented in TensorFlow.

Deep learning for speech synthesis: The good news, the bad news, and the fake news

January 12, 2020

Modern deep learning systems allow us to build speech synthesis systems with the naturalness of a human speaker. While there are myriad benevolent applications, this also ushers in a new era of fake news. Scott Stevenson explores the danger of such systems and details how deep learning can also be used to build countermeasures to protect against political disinformation.

Synthetic video generation: Why seeing should not always be believing

January 6, 2020

The advent of "fake news" has led us to doubt the truth of online media, and advances in machine learning give us an even greater reason to question what we are seeing. Despite the many beneficial applications of this technology, it's also potentially very dangerous. Alex Adam explains how synthetic videos are created and how they can be detected.

Deploying deep learning models on GPU-enabled Kubernetes clusters

January 1, 2020

Interested in deep learning models and how to deploy them on Kubernetes at production scale? Not sure if you need to use GPUs or CPUs? Mathew Salvaris and Fidan Boylu Uz help you out by providing a step-by-step guide to creating a pretrained deep learning model, packaging it in a Docker container, and deploying it as a web service on a Kubernetes cluster.

Industrialized capsule networks for text analytics

December 30, 2019

Vijay Agneeswaran and Abhishek Kumar offer an overview of capsule networks and explain how they help in handling spatial relationships between objects in an image. They also show how to apply them to text analytics. Vijay and Abhishek then explore an implementation of a recurrent capsule network (RCN) and benchmark the RCN against capsule networks with dynamic routing on text analytics tasks.

Understanding and integrating Intel Deep Learning Boost (Intel DL Boost)

December 28, 2019

Banu Nagasundaram offers an overview of Intel's Deep Learning Boost (Intel DL Boost) technology, featuring integer vector neural network instructions targeting future Intel Xeon Scalable processors. Banu walks you through the 8-bit integer convolution implementation in the Intel MKL-DNN library to demonstrate how these new instructions are used in optimized code.