Presto Newsletter – August 2019
Welcome to the 14th issue of the Presto Newsletter. Please sign up to receive future issues in your mailbox. We aim to track the relevant news in the Presto community and keep you updated every month.
Presto Summit (East Coast):
We’re excited to announce the Presto Summit is coming to New York City in mid-December!
More details to follow soon. If you’d like to speak at this event, please let us know at email@example.com.
Webinar – Building a Federated Semantic Layer using Starburst Presto (Sept 26th @ 2pm ET):
In this upcoming webinar, we’ll discuss the value of building a Federated Semantic Layer within your data infrastructure and how to accomplish this with Starburst Presto. Space is limited, so register using the link above.
Meetup – Presto on Kubernetes: Query Anything, Anywhere (Sept 12th, Tel Aviv)
Presto Summit in India – Presented by Qubole (Sept 5th)
Presto News & Knowledge
Presto at Lyft
With thousands of dashboards powered by Presto and about 1.5K weekly active users running a couple of million queries every month, Presto is a critical tool for managing Lyft’s data. In this recent Lyft Engineering blog post, Puneet Jaiswal examines Lyft’s use of Presto in detail, from its introduction to the present day.
Presto on Kubernetes
Kubernetes (K8s) eases the burden and complexity of configuring, deploying, managing, and monitoring containerized applications. With the recent availability of Starburst Presto on Kubernetes, deploying and using Presto across hybrid and multi-cloud environments has become even simpler. Read more about how this is accomplished in the blog above.
Starburst Presto on Azure Kubernetes Services (video)
This video looks at Starburst Presto on Azure and how to deploy Presto on Azure Kubernetes Service (AKS). It walks you through deploying a Kubernetes cluster and then a Starburst Presto cluster on top of it.
Autoscaling Presto on Google Kubernetes Engine (video)
This video explores how autoscaling works for Starburst Presto on Kubernetes, including a live demo.
Webinar Video & Slides – Presto on Kubernetes
In this August webinar, we showcase how Presto on Kubernetes makes deploying and using Presto across hybrid and multi-cloud environments simpler, allowing you to easily deploy Presto on Red Hat OpenShift Container Platform, Google Kubernetes Engine (GKE), Azure Kubernetes Service (AKS), and Amazon Elastic Container Service for Kubernetes (Amazon EKS). A live demo is also included at the end.
VLDB 2019 Test of Time Award
Congratulations to Starburst Data CTO & Co-Founder, Kamil Bajda-Pawlikowski, and his co-authors on their recent award for their paper, HadoopDB: An Architectural Hybrid of MapReduce and DBMS Technologies for Analytical Workloads.
Running Presto Queries
In this recent blog post published by Severalnines, the author discusses Presto’s architecture and takes a look at how to set up a basic Presto environment on a Docker server from the tar file.
MinIO gains traction as an alternative to S3 for on-prem deployments
This Datanami article looks at the demise of Hadoop and the impact it has had on object storage vendors, particularly MinIO, which saw a gap in on-premises object storage.
Presto Releases & New Features
We recently announced the general availability of Starburst Enterprise Presto 312e. The main features we’ve added include the following:
- Google Cloud Platform: Including Cloud Storage, Kubernetes Service, and Dataproc
- Kubernetes: Support for Presto on Kubernetes environments
- Data Source Connectivity: Including parallel Teradata connector, MapR connector, and support for Azure Data Lake Storage Gen2
- Presto Core: Including performance and security features
UNNEST performance improvements
The execution plans for queries with a CROSS JOIN UNNEST clause contain an Unnest Operator. The previous implementation of the Unnest Operator performed a deep copy of all input blocks to generate output blocks. This caused high CPU consumption and memory allocation for the operator, and impacted the performance of such queries. This post explores how this was solved to achieve a ~10x gain in CPU time and a 3x to 5x gain in memory allocation.
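For readers new to the clause, here is a rough Python sketch of what CROSS JOIN UNNEST computes logically: each input row is repeated once per element of its array column. It is only an illustration of the query semantics and says nothing about how Presto's Unnest Operator is actually implemented. The function name, column names, and sample rows are all made up for this example.

```python
from typing import Any, Dict, Iterator, List


def cross_join_unnest(
    rows: List[Dict[str, Any]], array_column: str, element_name: str
) -> Iterator[Dict[str, Any]]:
    """Yield one output row per array element (hypothetical helper).

    Rows whose array is empty or None produce no output, which is the
    usual cross-join behaviour when the unnested side has no rows.
    """
    for row in rows:
        elements = row[array_column] or []
        for element in elements:
            out = dict(row)              # copy of the original columns
            out[element_name] = element  # add the unnested element
            yield out


# Made-up sample data: one array-valued column named "tags".
sample_rows = [
    {"id": 1, "tags": ["a", "b"]},
    {"id": 2, "tags": []},
    {"id": 3, "tags": None},
]

for out_row in cross_join_unnest(sample_rows, "tags", "tag"):
    print(out_row["id"], out_row["tag"])
```

Note that even this toy version copies every input row for each element it emits, which hints at why per-row copying can get expensive on large arrays.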
Google Sheets connector was recently merged
This doc looks at the recent release of the Google Sheets connector, which allows for reading spreadsheets as tables in Presto.