Prestogres, ODBC & JDBC connectivity for Presto
Sadayuki Furuhashi
Founder & Software Architect
ODBC & JDBC connectivity for Presto
Treasure Data, Inc.
A little about me...
> Sadayuki Furuhashi github/twitter: @frsyuki
> Treasure Data, Inc. Founder & Software Architect
> Open source projects
• MessagePack - efficient object serializer
• Fluentd - data collection tool
• ServerEngine - Ruby framework to build multiprocess servers
• LS4 - distributed object storage system (suspended)
• kumofs - distributed key-value data store (suspended)
Background + Intro:
Background
(Diagram: Pig; Tableau, Pentaho, Web apps; RDB, HTTP, etc.; “Plazma” columnar cloud storage)
This is us (Treasure Data)
Data collection
> “Fluentd” streaming data collection tool
> Plugin architecture
> github.com/fluent/fluentd
Hadoop as a service
> “BigData” processing
• Funnel analysis for web services
• Correlation analysis for ad-tech (DSP/SSP/DMP)
• Creating OLAP cubes
> Multi-tenant scheduling
• Utilize idle resources purchased by other users
Presto as a service
> Interactive queries
> Multi-tenant scheduling (in progress)
Here is the problem…
ODBC/JDBC
Missing!
The problem to solve
• Providing open-source ODBC/JDBC connectivity for Presto quickly
• Tableau • Pentaho • Web apps
ODBC/JDBC
• ODBC/JDBC are VERY complicated APIs
> PostgreSQL ODBC driver: 60,000 lines
> PostgreSQL JDBC driver: 43,000 lines
A solution
• Using PostgreSQL ODBC/JDBC drivers
• Creating PostgreSQL protocol gateway
PostgreSQL protocol gateway for Presto
feature-complete & matured for many years
some middleware already implemented
Architecture
Tableau, Pentaho, Web apps…
PostgreSQL protocol
PostgreSQL ODBC/JDBC driver, Other PostgreSQL clients
pgpool-II (patched)
Internal Architecture
Tableau… select count(*) from x;
run_presto_as_temp_table( …, 'select count(*) from x');
patched pgpool-II wraps the SQL in a function call
PostgreSQL
the function sends the original SQL to Presto
select count(*) from x;
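The wrapping step above can be sketched roughly as follows. This is only an illustration of the rewrite, not the actual pgpool-II patch: the slide elides the leading arguments of run_presto_as_temp_table, so the single `result_table` argument, the `wrap_for_presto` helper name, and the use of a plain SELECT to call the function are all assumptions made for the example.

```python
def quote_literal(text: str) -> str:
    """Quote a string as a SQL literal, doubling embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"


def wrap_for_presto(sql: str, result_table: str = "presto_result") -> str:
    """Wrap a client query in a call to run_presto_as_temp_table.

    `result_table` is a hypothetical stand-in for the arguments that the
    slide elides; the real function signature may differ.
    """
    # Drop surrounding whitespace and a trailing semicolon so the query
    # can be embedded as a single string literal.
    inner = sql.strip().rstrip(";").strip()
    return (
        "SELECT run_presto_as_temp_table("
        f"{quote_literal(result_table)}, {quote_literal(inner)});"
    )


if __name__ == "__main__":
    print(wrap_for_presto("select count(*) from x;"))
```

The point of the sketch is the quoting: because the original query travels as a string literal, any single quotes inside it have to be escaped before it is handed to the function.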
SELECT from system catalogs
pgpool-II (patched) Tableau…
get table list
PostgreSQL
run CREATE TABLE for each actual table
run the original query
to get metadata of tables
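The metadata flow above (get the table list, then run CREATE TABLE for each actual table) might look something like the sketch below. The input format, the `create_table_statements` helper, and the column type names are hypothetical; the slides do not show how the table list is fetched or how Presto types are mapped to PostgreSQL types.

```python
from typing import Dict, List, Tuple


def quote_ident(name: str) -> str:
    """Quote a SQL identifier, doubling embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_table_statements(
    tables: Dict[str, List[Tuple[str, str]]]
) -> List[str]:
    """Build one CREATE TABLE statement per table.

    `tables` maps a table name to its (column name, PostgreSQL type) pairs.
    This shape is an assumption for the example, standing in for whatever
    the gateway receives when it asks for the table list.
    """
    statements = []
    for table, columns in sorted(tables.items()):
        column_defs = ", ".join(
            f"{quote_ident(col)} {pg_type}" for col, pg_type in columns
        )
        statements.append(f"CREATE TABLE {quote_ident(table)} ({column_defs});")
    return statements


if __name__ == "__main__":
    schema = {"x": [("id", "bigint"), ("name", "varchar")]}
    for stmt in create_table_statements(schema):
        print(stmt)
```

Once such empty tables exist in PostgreSQL, the client's original query against the system catalogs can be answered locally, which is how tools get the metadata of tables.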
Demo
Limitations
• Server-side prepare is not supported
• Cursor (DECLARE/FETCH) is not supported
• JDBC driver needs ?protocolVersion=2 option
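As a small illustration of the last limitation, the connection URL handed to the PostgreSQL JDBC driver would carry the protocolVersion=2 option. The helper below only assembles such a URL string; the host, port, and database values are placeholders, and `prestogres_jdbc_url` is a hypothetical name introduced for this example.

```python
from urllib.parse import urlencode


def prestogres_jdbc_url(host: str, port: int, database: str, **params: str) -> str:
    """Build a PostgreSQL JDBC URL that always sets protocolVersion=2.

    Extra keyword arguments become additional URL parameters; a caller
    supplied protocolVersion is overridden, since the slide states that
    the option is required.
    """
    query = {**params, "protocolVersion": "2"}
    return f"jdbc:postgresql://{host}:{port}/{database}?{urlencode(query)}"


if __name__ == "__main__":
    # Placeholder connection details, not taken from the slides.
    print(prestogres_jdbc_url("localhost", 5439, "default"))
```

Because server-side prepare and cursors are listed as unsupported, a client would presumably also need to avoid driver settings that rely on them, but which settings those are is not covered in the slides.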