Presto Newsletter – July 2018

Welcome to the first issue of the Presto Newsletter.

Events

Presto Summit 2018 recap
The first-ever all-day Presto Summit brought together Presto users, committers, and other big data analytics fans. Participants from over 40 companies joined us on July 16th. The agenda was filled with high-quality talks from some of the leading members of the Presto community.
Here is a link to the topics covered and the slides:
https://www.starburstdata.com/technical-blog/presto-summit-2018-recap/

Presto News and Knowledge

Querying 8.66 Billion Records – a Performance and Cost Comparison between Presto and Redshift (including Spectrum)
This is a very detailed post from Ernesto at Concurrency Labs comparing Presto to Redshift. The comparison includes cost and performance for both solutions and is worth the read:
https://www.concurrencylabs.com/blog/starburst-presto-vs-aws-redshift/

Using Presto to query on-premises object stores
Nitish at Minio, a distributed object store for private clouds (roll your own S3), wrote a great post on creating your own object store analytics hub using Presto:
https://blog.minio.io/presto-modern-interactive-sql-query-engine-for-enterprise-ce56d7aea931

Demo: Querying Presto from Qlik Sense
This demo shows how easy it is to use Qlik Sense to query Presto:
https://www.youtube.com/watch?v=X9lFdues_wE

Presto query optimizer: Pursuit of performance
Starburst CTO Kamil Bajda-Pawlikowski and Facebook’s Martin Traverso presented at the DataWorks Summit on Presto’s new cost-based optimizer:
https://dataworkssummit.com/san-jose-2018/session/presto-query-optimizer-pursuit-of-performance/

Using Presto for GeoSpatial Analytics
Also at DataWorks Summit, Uber engineers talked about using Presto for GeoSpatial Analytics:
https://dataworkssummit.com/san-jose-2018/session/geospatial-data-platform-at-uber/

Presto at TiVo, Boston Hadoop Meetup
See how TiVo uses Presto for SQL analytics. This excellent presentation covers a few important topics:
– TiVo’s decision-making process for choosing Presto over Redshift Spectrum
– Choosing the correct AWS instance type for their Presto workloads
– How the different memory structures in Presto work together
– Using MySQL and S3 together to create TiVo’s data warehouse
https://www.slideshare.net/JustinBorgman1/presto-at-tivo-boston-hadoop-meetup

Presto TPC-DS benchmark on AWS
Before the cost-based optimizer was introduced, Presto had issues running all of the TPC-DS queries. That’s no longer the case, and performance is much better than in older versions of Presto:
https://www.starburstdata.com/technical-blog/starburst-presto-on-aws-18x-faster-than-emr/

Big Data File Formats – ORC, Parquet & AVRO
At Starburst, we field a lot of questions from customers and prospects about which source file format to use. The answer is usually situation-dependent. This Datanami article by Alex Woodie does an excellent job of breaking down each format and its advantages and disadvantages in different situations:
https://www.datanami.com/2018/05/16/big-data-file-formats-demystified/

Third-party Presto benchmarks
Here are two excellent articles that benchmark Presto against other SQL query engines. It’s no wonder Presto’s popularity has exploded over the last few years:
http://bytes.schibsted.com/bigdata-sql-query-engine-benchmark/
https://virtuslab.com/blog/benchmarking-spark-sql-presto-hive-bi-processing-googles-cloud-dataproc/

Releases and New Features

Starburst Presto 203e released:
https://www.starburstdata.com/technical-blog/starburst-enterprise-distribution-of-presto-203e-now-available/

-AWS Glue Integration
-New geospatial functions and improved geospatial function performance
-Additional SQL subquery support
-SQL FILTER clause for aggregations
-Column-level access control
-Support for authentication with JWT access token
-Various bug fixes that continue to improve the robustness of Presto
-Improvements to query scheduling and resource management

We would like to thank the members of the Presto community for the following contributions:
-Maria Basmanova from Facebook – new geospatial functions and optimizations
-Rentao Wu from AWS – Glue Catalog support
-Li Ding – SQL FILTER clause for aggregations
and many, many more!

Engineer’s Corner

Iceberg – A modern table format for big data from Netflix
During the first-ever Presto Summit last week, Netflix presented “Iceberg,” a new table format for storing large, slow-moving tabular data. Their presentation and GitHub links:
https://www.slideshare.net/kbajda/presto-summit-2018-09-netflix-iceberg/
https://github.com/Netflix/iceberg
