Sparklyr: An R interface for Apache Spark
Sparklyr makes it easy and practical to analyze big data with R. You can filter and aggregate Spark DataFrames to bring data into R for analysis and visualization, and use R to orchestrate distributed machine learning in Spark using Spark ML and H2O Sparkling Water. Edgar Ruiz walks you through these features and demonstrates how to use sparklyr to create R functions that access the full Spark API.
Talk Title | Sparklyr: An R interface for Apache Spark |
Speakers | Edgar Ruiz (RStudio) |
Conference | Strata + Hadoop World |
Conf Tag | Big Data Expo |
Location | San Jose, California |
Date | March 14-16, 2017 |
URL | Talk Page |
Slides | Talk Slides |
Sparklyr, a free and open source package developed by RStudio in conjunction with IBM, Cloudera, and H2O, makes it easy and practical to analyze big data with R. The package provides an R interface to Spark’s distributed machine-learning algorithms and much more. With sparklyr, you can filter and aggregate Spark DataFrames to bring data into R for analysis and visualization, and use R to orchestrate distributed machine learning in Spark using Spark ML and H2O Sparkling Water. Edgar Ruiz walks you through these features and demonstrates how to use sparklyr to create R functions that access the full Spark API.