Going serverless: Security outside the box
January 22, 2020
The advent of serverless technologies and infrastructure as code has changed how we build and deploy security services, empowering teams to create low-cost, scalable, and secure services to protect organizations. Drawing on their real-world experiences, Jack Naglieri and Austin Byers explore tools and techniques for successfully building, deploying, and debugging serverless security applications.
How to make a lion bulletproof: Setting up site reliability engineering (SRE) in a global financial organization
January 19, 2020
Did you read the O'Reilly book about Google SREs but doubt that SRE will work for your more traditional or more regulated company? Janna Brummel and Robin van Zijll explain how they implemented SRE in a global financial organization, providing an overview of methods and technologies and sharing lessons learned from a year of doing SRE.
Application scaling over the edge: Microservice architecture in industrial applications
January 14, 2020
Driven by the need for data analytics in Industry 4.0, edge computing is gaining momentum to bring intelligence to the devices at the network's edge. Fei Li offers insights on a microservice-based architecture that keeps analytics applications on edge devices while dynamically utilizing resources on the cloud to achieve resilience and scalability in critical industrial applications.
January 8, 2020
With the recent flourishing of observability systems, there's no shortage of things to monitor. Sadly, humans have limited capacity to process them all. Mark McBride outlines three key metrics (request rate, success rate, and the latency histogram) that provide a high-level abstraction of the customer experience. If these three metrics are good, your system is healthy from a customer perspective.
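As a rough illustration of the three metrics named in this abstract, the sketch below computes request rate, success rate, and a latency histogram over one window of requests. It is not taken from the talk: the `Request` record, the `summarize` helper, and the bucket boundaries in `BUCKET_BOUNDS_MS` are all hypothetical names and values chosen for the example.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Request:
    """Hypothetical record of one handled request."""
    ok: bool            # True if the request succeeded
    latency_ms: float   # time taken to serve it, in milliseconds


# Assumed bucket upper bounds (inclusive), in milliseconds.
# Anything slower than the last bound lands in a final overflow bucket.
BUCKET_BOUNDS_MS = [50, 100, 250, 500, 1000]


def summarize(
    requests: List[Request], window_seconds: float
) -> Tuple[float, Optional[float], List[int]]:
    """Return (request rate per second, success rate, latency histogram)."""
    total = len(requests)
    request_rate = total / window_seconds

    # Success rate is undefined for an empty window, so report None.
    success_rate = sum(r.ok for r in requests) / total if total else None

    # One counter per bound, plus one overflow bucket at the end.
    histogram = [0] * (len(BUCKET_BOUNDS_MS) + 1)
    for r in requests:
        # bisect_left finds the first bound that is >= the latency.
        histogram[bisect_left(BUCKET_BOUNDS_MS, r.latency_ms)] += 1

    return request_rate, success_rate, histogram


if __name__ == "__main__":
    sample = [
        Request(True, 20),
        Request(True, 50),
        Request(False, 120),
        Request(True, 1500),
    ]
    rate, success, hist = summarize(sample, window_seconds=2)
    print(rate, success, hist)
```

Keeping the full histogram rather than a single average preserves the shape of the latency distribution, so slow outliers stay visible instead of being smoothed away.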
Thriving under a continuous self-inflicted DDoS attack
January 6, 2020
New Relic customers send monitoring data to New Relic servers every minute: a continuous firehose of data. Drawing on his experience at New Relic, Kevin Beck shares best practices for building a streaming service based on Apache Kafka, self-monitoring for reliability and fault tolerance, and building a DevOps culture that anticipates and prevents outages.
Geospatial big data analysis at Uber
January 3, 2020
Uber's geospatial data is increasing exponentially as the company grows. As a result, its big data systems must also grow in scalability, reliability, and performance to support business decisions, user recommendations, and experiments for geospatial data. Zhenxiao Luo and Wei Yan explain how Uber runs geospatial analysis efficiently in its big data systems, including Hadoop, Hive, and Presto.
From Zero to Hero: Scalable 4K Video Encoding with Kubernetes and Other Open Source Tools
December 31, 2019
Hygo Reinaldo (Xite Networks): Encoding 4K videos can be very challenging due to aspects like encoding time, …
Jupyter notebooks and production data science workflows
December 22, 2019
Jupyter notebooks are a great tool for exploratory analysis and early development, but what do you do when it's time to move to production? A few years ago, the obvious answer was to export to a pure Python script, but now there are other options. Andrew Therriault dives into real-world cases to explore alternatives for integrating Jupyter into production workflows.
Database reliability engineering: What, why, and how?
December 16, 2019
SRE is becoming quite the ubiquitous term, but what about DBRE? Laine Campbell and Charity Majors dive into DBRE, exploring the paths to this craft and how to culturally evolve and support it. Laine and Charity focus on organizational scale, self-service, and force multipliers in recoverability, observability, availability, security, release management, and infrastructure.
Kubernetes in the Datacenter: Squarespace's Journey Towards Self-Service Infrastructure [I]
December 2, 2019
As Squarespace's engineering organization evolved, microservices became an obvious solution to quickly deliver new features and improve infrastructure reliability. We encountered significant challenge …
The state of Spark in the cloud
November 29, 2019
Nicolas Poggi evaluates the out-of-the-box support for Spark and compares the offerings, reliability, scalability, and price-performance from major PaaS providers, including Azure HDInsight, Amazon Web Services EMR, Google Dataproc, and Rackspace Cloud Big Data, with an on-premises commodity cluster as a baseline.
Kubernetes Scheduling Features or How Can I Make the System Do What I Want? [I]
November 27, 2019
Each user has their own set of requirements and constraints on where their Pods should be placed in a cluster. Some want to increase utilization, so they want to pack Pods as densely as possible. Othe …
Databases and Docker: A survival guide
November 26, 2019
Containers are great ephemeral vessels for your applications. But what about the data that drives your business? It must survive containers coming and going, maintain its availability and reliability, and grow when you need it. Alvin Richards does some live coding to show key strategies to help you survive the transition to production.
Tales from Lastminute.com Machine Room: Our Journey Towards a Full On-Premise Kubernetes Architecture in Production [I]
November 25, 2019
We sell travel services to more than 10 million customers worldwide in 15 languages across 35 countries, through hundreds of micro-services. What happens if you challenge the way you deliver your pr …