November 1, 2019

The enterprise geospatial platform: A perfect fusion of cloud and open source technologies

Recently, the volume of data collected from farmers' fields via sensors, rovers, drones, in-cabin technologies, and other sources has forced Monsanto to rethink its geospatial processing capabilities. Naghman Waheed and Martin Mendez-Costabel explain how Monsanto built a scalable geospatial platform using cloud and open source technologies.

Talk Title The enterprise geospatial platform: A perfect fusion of cloud and open source technologies
Speakers Naghman Waheed (Bayer Crop Science), Martin Mendez-Costabel (Bayer Crop Science)
Conference Strata + Hadoop World
Conf Tag Big Data Expo
Location San Jose, California
Date March 14-16, 2017
URL Talk Page
Slides Talk Slides
Video

Geospatial datasets and systems were introduced at Monsanto over a decade ago, and their significance and use have only increased over time. Moreover, the volume and variety of geospatially tagged datasets being collected is increasing exponentially. However, the systems in use today have struggled to keep up with the ever-increasing demand.

To address this, the Monsanto Data Platform Architecture and Engineering team embarked on a journey to create a scalable geospatial platform in the cloud using only open source components. The result is a fully scalable geospatial platform that is used across the globe to process geospatial datasets for both visualization and analytics services.

Naghman Waheed and Martin Mendez-Costabel explain how Monsanto built this platform, focusing on the technical design and build of the entire system and covering the technical architecture, how and why the team chose certain open source components, and the lessons learned along the way. Naghman and Martin also highlight the value derived from the new platform through examples of how the system is being used to provide analytics on top of large geospatial datasets.

The entire platform was designed with several key architecture and engineering principles in mind: it needed to use open source, be instantiated in the AWS cloud, be easily scalable for both processing and storage needs, have automated monitoring and failure recovery, and integrate with existing technologies such as API gateway and identity management. The platform also supports a pay-as-you-use model, with spend visibility and accountability passed back to the user of the platform.

The platform was built using open source software, including CKAN as the data searching catalog, GeoServer as the geospatial processing engine, QGIS as the visualization tool, and S3, Amazon Elastic File System, PostGIS, and AWS ECS for data processing. The platform is fully integrated with AKAN and VDS (virtual directory service) and utilizes the OAuth 2.0 security model.
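To make the catalog-plus-map-server pattern concrete, here is a minimal sketch of how a client might address such a stack. It is not taken from the talk: the hostnames, the layer name, and the helper functions are all hypothetical. It only assumes CKAN's standard Action API (`package_search`), a standard WMS 1.1.1 `GetMap` request as served by GeoServer, and an OAuth 2.0 bearer token passed in the `Authorization` header.

```python
from urllib.parse import urlencode

# Hypothetical endpoints; the abstract does not give real hostnames.
CKAN_BASE = "https://catalog.example.com"
GEOSERVER_BASE = "https://geo.example.com/geoserver"


def ckan_search_url(query, rows=10):
    """Build a CKAN Action API package_search URL for a free-text query."""
    params = urlencode({"q": query, "rows": rows})
    return f"{CKAN_BASE}/api/3/action/package_search?{params}"


def wms_getmap_url(layer, bbox, width=512, height=512):
    """Build a WMS 1.1.1 GetMap URL.

    bbox is (minx, miny, maxx, maxy) in EPSG:4326 (longitude/latitude).
    """
    params = urlencode({
        "service": "WMS",
        "version": "1.1.1",
        "request": "GetMap",
        "layers": layer,
        "styles": "",  # empty value selects the layer's default style
        "srs": "EPSG:4326",
        "bbox": ",".join(str(v) for v in bbox),
        "width": width,
        "height": height,
        "format": "image/png",
    })
    return f"{GEOSERVER_BASE}/wms?{params}"


def auth_headers(access_token):
    """Return an OAuth 2.0 bearer-token header for requests sent via a gateway."""
    return {"Authorization": f"Bearer {access_token}"}
```

A client would first search the catalog for a dataset, then request a rendered map tile for the matching layer, attaching the bearer token to both requests. The functions above only build URLs and headers, so they can be checked without network access.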
