Presto Newsletter – February 2020
Welcome to the 19th issue of the Presto Newsletter. Please sign up to get future issues delivered to your mailbox. We aim to track the relevant news in the Presto community and keep you updated every month.
Videos – Presto Summit NYC
The presentation videos are here! If you missed the Presto Summit in NYC, don’t worry: you can now view all the presentations on demand at the link above.
We’re looking forward to seeing everyone at one of the upcoming Presto Summits throughout 2020. Stay tuned!
Starburst Presto Workshops by Insight Digital Innovation
In these interactive workshops, learn how Starburst Presto integrates with a wide variety of data sources and reporting tools to accelerate intelligence discovery and delivery. Get a detailed demonstration of provisioning and administration processes, explore use cases and gain hands-on experience connecting, querying and creating Business Intelligence (BI) visualizations with this powerful tool.
- Boston – Thursday, Feb 27th, 2020
- New York – Tuesday, March 3rd, 2020
- Reston – Wednesday, March 11th, 2020
Presto News & Knowledge
This post recaps the amazing achievements of the Presto community throughout 2019. From launching the Presto Software Foundation, to hosting 5 Presto Summits throughout the globe, to making countless improvements to Presto, it’s safe to say that 2019 was a BIG year.
Starburst, The Presto Company, had quite a year as well, helping to solidify Presto’s spot as a top big data tool. This post recaps the $22m Series A fundraising, as well as several key hires, growth achievements, and Presto improvements.
In a previous post, the author set up a Presto data warehouse using Docker. In this latest post, he updates and improves upon that Presto cluster, moving everything, including the Hive Metastore, to run in Kubernetes. Learn more about his process and how he configured Kubernetes to best run Presto.
This blog post by Tom Nats looks at Starburst Secrets, a feature of Starburst Enterprise Presto that allows administrators to separate configuration files from sensitive data. Read how this is accomplished at the link above.
Presto Releases & New Features
New features in Presto:
Delta Lake 0.5.0 introduces even better Presto performance
Did you know?
1 Get the list and size of the files behind a table on distributed storage (HDFS, S3, ADLS, Ceph, etc.):
select distinct "$path", "$file_size" from your_table_name; (you might want to add a LIMIT for very large tables)
2 The Presto cost-based optimizer uses statistics for queries whether the table is external or internally managed.
3 To use Glue as a catalog in Presto, you simply create a file named glue.properties (or whatever you want to call the catalog) with the following lines:
hive.metastore = glue
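Tips 1 and 2 can be tried out with a few statements. This is a minimal sketch rather than a definitive recipe: it assumes a Hive connector catalog, keeps the your_table_name placeholder from tip 1, and assumes a Presto version that exposes the "$path" and "$file_size" hidden columns and supports ANALYZE. The aliases file_path, file_size_bytes, file_count and total_bytes, and the limit of 100, are illustrative choices.

```sql
-- Tip 1: list the largest files backing the table, capped at 100 rows.
-- your_table_name is a placeholder; the column aliases are illustrative.
SELECT
    "$path" AS file_path,
    "$file_size" AS file_size_bytes
FROM your_table_name
GROUP BY "$path", "$file_size"
ORDER BY file_size_bytes DESC
LIMIT 100;

-- Tip 1, summarized: number of files and total bytes behind the table.
SELECT
    count(*) AS file_count,
    sum(file_size_bytes) AS total_bytes
FROM (
    SELECT DISTINCT "$path" AS file_path, "$file_size" AS file_size_bytes
    FROM your_table_name
) AS files;

-- Tip 2: collect statistics, then inspect what the optimizer has available.
-- Assumes the connector behind this catalog supports ANALYZE.
ANALYZE your_table_name;
SHOW STATS FOR your_table_name;
```

Both file queries and ANALYZE read the whole table, so on very large tables it may be worth restricting them to a few partitions with a WHERE clause first.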