Starburst 332-e Release
by Tom N.
Another quarter, another huge release from our engineering team here at Starburst. The main focus of this release is performance and security.
This major release combines features that have been contributed back to the open source project as well as curated for Starburst Enterprise Presto customers. For more information on this major release, register for our upcoming webinar or read our blog.
Faster data access, across all data sources
Starting with version 332, Open Source Presto now has embedded caching in response to customer demand for faster access to frequently accessed data. While already widely deployed among leading companies for ad-hoc SQL, BI, and reporting use cases, Presto is now able to take on new use cases such as:
- Dashboarding applications that frequently refresh the same data over and over.
- Multi-cloud analytics where a table in Cloud A needs to be frequently joined with a table in Cloud B, further delivering on Starburst’s promise to enable analytics anywhere.
- Querying data from an operational or OLTP database without straining the resources of the underlying mission-critical system.
Presto caching is now available in beta preview with Presto 332 for caching data from data lakes (object storage, Hadoop, etc.), while Starburst Enterprise Presto with Caching offers additional features and support, and will soon be able to cache data across any data source.
Optimized Delta Lake Reader
Now Databricks customers can take advantage of the speed, concurrency, and scalability that Presto is known for, to query their Delta Lake. With this new reader, Starburst Enterprise Presto makes it possible to extend the benefits of data enrichment to data lakes. This bridges functional gaps between data warehouses and data lakes and gives joint Starburst + Databricks clients greater cost control, flexibility, and speed of access to mutable data in data lakes.
The reader is implemented through a new connector called Delta Lake, which we built from scratch with the assistance of Databricks. It can query any Delta Lake tables created by Databricks or the open source Delta Lake API.
Starburst is hosting a separate webinar exclusively focused on the Delta Lake Reader.
Ensuring Fine-Grained Access Control, across all data sources
Starburst Enterprise Presto now includes a single control point for fine-grained access control across all data sources. In addition to reducing the burden of having to lock down each data source, Global Security also reduces vulnerabilities by providing a simpler way to ensure consistent access across sources.
Using the easy interface provided by Apache Ranger, administrators can control access down to the table, column, and row level, meeting the stringent security requirements the industry now considers table stakes.
In addition, Starburst Enterprise Presto now provides an integrated Ranger deployment using our Kubernetes and AWS CloudFormation deployment methods.
Okta has become one of the most popular authentication platforms in the world. With this release, Starburst Enterprise Presto supports authentication using Okta’s Single Sign On (SSO) feature. More information can be found in our Okta authentication documentation.
Oracle Parallel Connector
Our Starburst Enterprise Presto Oracle connector is getting some horsepower. The new parallel option helps when queries against an Oracle database involve a large amount of data. Sometimes that means moving data out of Oracle; other times it means joining Oracle data with a data lake to run large analytical queries.
This new feature will give a boost to existing queries that are executed against a traditional Oracle database in the cloud or on-premises or even an Exadata system.
Read more about this new feature in our Oracle connector documentation.
Real-Time Query Logging
One of the most requested features of Starburst Enterprise Presto is query logging. There are many reasons for this:
– Compliance –
As data volume constantly increases, companies are under more and more pressure to monitor data access within their organization. With Starburst Enterprise Presto’s event logging functionality, a full, GDPR-level audit trail is available in real time. This allows tracking of access to all data sources resulting from queries submitted to Starburst Enterprise Presto.
– Chargeback –
Starburst Enterprise Presto is used by many different departments and user groups. It can be difficult for a centralized IT organization to determine the resource usage of these different users. With event logging, each query is logged into a database. Along with the user who executed the query, the query text, and the elapsed time, metrics such as RAM and CPU usage are recorded, giving a more granular picture of actual usage.
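To illustrate the chargeback idea, here is a minimal Python sketch that totals CPU time per user and turns it into a share of overall usage. The field names `user` and `cpu_time` follow the metrics table later in this post; the sample records, the function names, and the assumption that CPU time is stored in seconds are ours, for illustration only. In practice the rows would come from the database that event logging writes to.

```python
from collections import defaultdict

# Hypothetical log records (illustrative values, CPU time assumed in seconds).
query_log = [
    {"user": "alice", "cpu_time": 120.0, "execution_time": 30.0},
    {"user": "bob", "cpu_time": 45.5, "execution_time": 12.0},
    {"user": "alice", "cpu_time": 80.0, "execution_time": 20.0},
]


def cpu_seconds_by_user(records):
    """Sum CPU time per user across all logged queries."""
    totals = defaultdict(float)
    for record in records:
        totals[record["user"]] += record["cpu_time"]
    return dict(totals)


def cpu_share_by_user(records):
    """Return each user's fraction of total CPU time, for cost allocation."""
    totals = cpu_seconds_by_user(records)
    grand_total = sum(totals.values())
    if grand_total == 0:
        # No CPU recorded at all: everyone gets a zero share.
        return {user: 0.0 for user in totals}
    return {user: cpu / grand_total for user, cpu in totals.items()}


if __name__ == "__main__":
    for user, share in sorted(cpu_share_by_user(query_log).items()):
        print(f"{user}: {share:.1%} of cluster CPU")
```

The same grouping could be done per department or per data source, depending on how your organization allocates cost.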
– Performance Tuning –
The data collected for each query includes resource utilization. This data shows resource usage per query and can be used to identify queries, users, and data sources where performance tuning might be worthwhile. Reports can easily be created to monitor Starburst Enterprise Presto usage by user, connector, and table.
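As a rough sketch of how logged data might point to tuning candidates, the Python snippet below picks the queries that consumed the most CPU. The field names `query_id`, `user`, `query`, and `cpu_time` match the logged metrics described in this post; the sample records and the `top_queries_by_cpu` helper are hypothetical.

```python
import heapq

# Hypothetical log records; values are illustrative only.
query_log = [
    {"query_id": "q1", "user": "alice", "query": "SELECT ...", "cpu_time": 120.0},
    {"query_id": "q2", "user": "bob", "query": "SELECT ...", "cpu_time": 45.5},
    {"query_id": "q3", "user": "carol", "query": "SELECT ...", "cpu_time": 310.0},
    {"query_id": "q4", "user": "alice", "query": "SELECT ...", "cpu_time": 80.0},
]


def top_queries_by_cpu(records, n=3):
    """Return the n records with the highest cpu_time, largest first."""
    return heapq.nlargest(n, records, key=lambda record: record["cpu_time"])


if __name__ == "__main__":
    for record in top_queries_by_cpu(query_log, n=2):
        print(record["query_id"], record["user"], record["cpu_time"])
```

Sorting by `execution_time` instead would highlight the queries users wait on longest, which is not always the same list.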
We’re happy to provide real-time query logging. In addition to just logging queries, we also log many different performance metrics.
Here are just a few:
|Field|Description|
|query_id|Randomly generated ID of the query|
|execution_time|How long the query took to complete|
|user|The user that executed the query|
|query|The text of the query|
|total_rows|How many rows the query produced|
|written_rows|How many rows the query wrote to the target|
|cpu_time|Total CPU consumed on the cluster|
|client_info|Detailed information about the client (JDBC, etc.)|
|query_plan|The plan the cost-based optimizer produced|
To enable this feature, see our documentation.
Enterprise security often requires users to have different roles for different S3 sources. IAM Passthrough allows Starburst Enterprise Presto to transparently assume these roles to comply with authorization policies.
Presto supports flexible security mapping for S3, allowing for separate credentials or IAM roles for specific users or buckets/paths. The IAM role for a specific query can be selected from a list of allowed roles by providing it as an extra credential.
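The selection logic described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Presto's implementation: the mapping keys (`user`, `prefix`, `default_role`, `allowed_roles`), the first-match rule order, the function names, and the placeholder role ARNs are all assumptions made for the example.

```python
# Hypothetical security mappings, checked in order; the first match wins.
MAPPINGS = [
    {
        "user": "alice",
        "prefix": "s3://finance/",
        "default_role": "arn:aws:iam::111111111111:role/finance-read",
        "allowed_roles": ["arn:aws:iam::111111111111:role/finance-admin"],
    },
    {
        # No "user" key: applies to every user for any S3 path.
        "prefix": "s3://",
        "default_role": "arn:aws:iam::111111111111:role/general-read",
        "allowed_roles": [],
    },
]


def find_mapping(mappings, user, path):
    """Return the first mapping whose user and path prefix both match."""
    for mapping in mappings:
        user_ok = mapping.get("user") in (None, user)
        prefix_ok = path.startswith(mapping.get("prefix", ""))
        if user_ok and prefix_ok:
            return mapping
    return None


def select_iam_role(mappings, user, path, requested_role=None):
    """Pick the IAM role for a query, honoring an optional requested role.

    requested_role stands in for the value a user would pass as an
    extra credential.
    """
    mapping = find_mapping(mappings, user, path)
    if mapping is None:
        raise PermissionError(f"no S3 security mapping for {user} on {path}")
    if requested_role is None:
        return mapping["default_role"]
    permitted = [mapping["default_role"], *mapping.get("allowed_roles", [])]
    if requested_role not in permitted:
        raise PermissionError(f"role {requested_role} is not allowed for {user}")
    return requested_role


if __name__ == "__main__":
    print(select_iam_role(MAPPINGS, "alice", "s3://finance/2020/q1.orc"))
```

Rejecting any role outside the allowed list keeps the extra credential from becoming a way to escalate privileges.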
Combined with Global Security covered earlier, organizations can provide end-to-end security to their Presto users allowing them to meet and exceed their organizations’ security standards.
For more information on this major release, join our upcoming webinar.