SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns...

27

Transcript of SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns...

Page 1: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),
Page 2: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

SkyServer Database SkyServer Database Past, Present, and FuturePast, Present, and Future

Jim Gray: MicrosoftJim Gray: MicrosoftAlex Szalay (and friends): Johns HopkinsAlex Szalay (and friends): Johns Hopkins

Help from:Help from:Cathan Cook (personal SkyServer),Cathan Cook (personal SkyServer),Maria A. Nieto-Santisteban (image cutout service)Maria A. Nieto-Santisteban (image cutout service)Peter Kukol, Gyorgy Fekete, Wil O’Mullane (evolved HTM to native C#)Peter Kukol, Gyorgy Fekete, Wil O’Mullane (evolved HTM to native C#)Cesar Galindo-Legaria, Ming-Chuan Wu (statistics for optimizer)Cesar Galindo-Legaria, Ming-Chuan Wu (statistics for optimizer)

Page 3: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Promised AbstractPromised Abstract Can database technology help manage and mine Can database technology help manage and mine

scientific data? That is the question I have been scientific data? That is the question I have been trying to answer with my astronomy colleagues trying to answer with my astronomy colleagues (especially Alex Szalay.) We have had some success (especially Alex Szalay.) We have had some success but still face many problems. I will start by but still face many problems. I will start by describing the odyssey of putting the Sloan Digital describing the odyssey of putting the Sloan Digital Sky Survey online and give some statistics about Sky Survey online and give some statistics about how it is used and what we are doing now. That how it is used and what we are doing now. That segment will end with a discussion of how the segment will end with a discussion of how the integration of SQL with the CLR (common language integration of SQL with the CLR (common language run time) makes it much easier for us to handle run time) makes it much easier for us to handle scientific datatypes and spatial access methods. The scientific datatypes and spatial access methods. The World-Wide Telescope is an attempt to federate all World-Wide Telescope is an attempt to federate all the astronomy archives of the world. I will briefly the astronomy archives of the world. I will briefly describe the architecture of SkyQuery a prototype describe the architecture of SkyQuery a prototype portal to several archives each of which is a web portal to several archives each of which is a web service. Urls: service. Urls: http://research.microsoft.com/~grayhttp://research.microsoft.com/~gray http://skyserver.pha.jhu.edu/dr1/http://skyserver.pha.jhu.edu/dr1/ http://skyQuery.nethttp://skyQuery.net

Page 4: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Instructions from Last YearInstructions from Last Year

Less marketectureLess marketecture More SubstanceMore Substance More demos and codeMore demos and code

Page 5: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

SkyServer Overview SkyServer Overview (10 min)(10 min) 10 minute SkyServer tour10 minute SkyServer tour

Pixel space Pixel space http://skyserver.sdss.org/en/http://skyserver.sdss.org/en/

Record space: Record space: http://skyserver.sdss.org/en/tools/explore/obj.asp?id=2255030989160697http://skyserver.sdss.org/en/tools/explore/obj.asp?id=2255030989160697 Doc space: NedDoc space: Ned Set space: Set space: Web & Query Logs: Web & Query Logs: Dr1 WebServiceDr1 WebService

You can download You can download (thanks to Cathan Cook )(thanks to Cathan Cook ) Data + Database code:Data + Database code: Website: Website:

Data Mining the SDSS SkyServer DatabaseData Mining the SDSS SkyServer Database MSR-TR-2002-01MSR-TR-2002-01

select top 10 * from weblog..weblog where yy = 2003 and mm=7 and dd =25 order by seq desc

select top 10 * from weblog..sqlLogorder by theTime Desc

http://skyserver.pha.jhu.edu/dr1/en/tools/chart/navi.asp

http://research.microsoft.com/~gray/SDSS/personal_skyserver.htm

Page 6: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Cutout Service (10 min)Cutout Service (10 min)A typical web serviceA typical web service Show itShow it Show WSDLShow WSDL Show fixing a bugShow fixing a bug Rush through code.Rush through code. You can download it.You can download it.

Maria A. Nieto-Santisteban did most of this (Alex and I started it)Maria A. Nieto-Santisteban did most of this (Alex and I started it)

http://research.microsoft.com/~gray/SDSS/personal_skyserver.htm

Page 7: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

SkyQuery: SkyQuery: http://skyquery.net/http://skyquery.net/ Distributed Query tool using a set of web servicesDistributed Query tool using a set of web services Four astronomy archives from Four astronomy archives from

Pasadena, Chicago, Baltimore, Cambridge (England).Pasadena, Chicago, Baltimore, Cambridge (England). Feasibility study, built in 6 weeks Feasibility study, built in 6 weeks

Tanu Malik (JHU CS grad student) Tanu Malik (JHU CS grad student) Tamas Budavari (JHU astro postdoc)Tamas Budavari (JHU astro postdoc) With help from Szalay, Thakar, GrayWith help from Szalay, Thakar, Gray

Implemented in C# and .NETImplemented in C# and .NET Allows queries like:Allows queries like:

SELECT o.objId, o.r, o.type, t.objId FROM SDSS:PhotoPrimary o,

TWOMASS:PhotoPrimary t WHERE XMATCH(o,t)<3.5

AND AREA(181.3,-0.76,6.5) AND o.type=3 and (o.I - t.m_j)>2

Page 8: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

2MASS

INT

SDSS

FIRST

SkyQueryPortal

ImageCutout

SkyQuery StructureSkyQuery Structure Each SkyNode publishes Each SkyNode publishes

Schema Web ServiceSchema Web Service Database Web ServiceDatabase Web Service

Portal is Portal is Plans Query (2 phase) Plans Query (2 phase) Integrates answersIntegrates answers Is itself a web serviceIs itself a web service

Page 9: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Four Database TopicsFour Database Topics

Sparse tables: Sparse tables: column vs row storecolumn vs row storetag and index tablestag and index tablespivotpivot

Maplist (cross apply)Maplist (cross apply) Bookmark bug Bookmark bug Object Relational has arrived.Object Relational has arrived.

Page 10: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Column Store PyramidColumn Store Pyramid Users see fat base tablesUsers see fat base tables

(universal relation)(universal relation)

Define popular columns index Define popular columns index tag tabletag table 10% ~ 100 columns 10% ~ 100 columns

Make many skinny indices Make many skinny indices 1% ~ 10 columns 1% ~ 10 columns

Query optimizer picks right planQuery optimizer picks right plan Automate definition & useAutomate definition & use Fast read, slow insert/update Fast read, slow insert/update Data warehouseData warehouse

Note: prior to Yukon, Note: prior to Yukon, index had 16 column limit. index had 16 column limit. A bane of my existence. A bane of my existence. Simpl

e

Typical Semi-join Fat quer

y

Obese query

BASEBASE

INDICIESINDICIES

TAGTAG

Page 11: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

ExamplesExamples

create table base (id bigint, f1 int primary key,f2 int, …,f1000 int)

create index tag on base (id) include (f1, …, f100)

create index skinny on base(f2,…f17)

Simple

Typical Semi-join

Fat quer

y

Obese query

BASE

INDICIESINDICIES

TAGTAG

Page 12: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

A Semi-Join ExampleA Semi-Join Example

create table fat(a int primary key, b int, c int, fat char (988))declare @i int, @j int; set @i = 0again: insert fat values(@i, cast(100*rand() as int), cast (100*rand() as int), ' ') set @i = @i + 1; if (@i < 1000000) goto again

create index ab on fat(a,b)create index ac on fat(a,c)

dbcc dropcleanbuffers with no_infomsgs

select count(*) from fat with(index (0)) where c = b-- Table 'fat'. Scan 3, reads 137,230, CPU : 1.3 s, elapsed 31.1s.

dbcc dropcleanbuffers with no_infomsgs

select count(*) from fat where b=c-- Table 'fat'. Scan 2, reads: 3,482 CPU 1.1 s, elapsed: 1.4 s.

1GB

8MB 8MB

b=c 3.4K IO 1.4 sec

abab acac

b=c 137 K IO

31 sec

Page 13: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Moving From Rows to ColumnsMoving From Rows to ColumnsPivotPivot & & UnPivotUnPivotWhat if the table is sparse? What if the table is sparse? LDAP has 7 mandatory LDAP has 7 mandatory

and 1,000 optional attributesand 1,000 optional attributes

Store row, col, valueStore row, col, value

create table Features ( object varchar , attribute varchar,

value varchar, primary key ( object,

attribute)) select *from (features pivot value on attribute

in (year, color) ) as Twhere object = ‘4PNC450’

Featuresobject attribute value

●●●●4PNC450 year 20004PNC450 color white4PNC450 make Ford4PNC450 model Taurus●●●●

TObject year color 4PNC450 2000 white

Page 14: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Maplist Meets SQL – cross applyMaplist Meets SQL – cross apply

Your table-valued function F(a,b,c) Your table-valued function F(a,b,c) returns all objects related to a,b,c.returns all objects related to a,b,c.

spatial neighbors, spatial neighbors, sub-assemblies, sub-assemblies, members of a group, members of a group, items in a folder,…items in a folder,…

Apply this function to each rowApply this function to each row Classic drill-downClassic drill-down

use use outer applyouter apply if f() may be null if f() may be null

select p.*, q.*from parent as p cross apply f(p.a, p.b, p.c) as qwhere p.type = 1

p1

f(p1)

p2

f(p2)

pn

f(pn)

Page 15: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

The Bookmark BugThe Bookmark Bug SQL is a non-procedural language.SQL is a non-procedural language. The compiler/optimizer picks the The compiler/optimizer picks the

procedureprocedurebased on statistics.based on statistics.

If the stats are wrong or missing….If the stats are wrong or missing….Bad things happen.Bad things happen.Queries can run VERY slowly.Queries can run VERY slowly.

Strategy 1: allow users to specify plan.Strategy 1: allow users to specify plan. Strategy 2: make the optimizer smarterStrategy 2: make the optimizer smarter

(and accept hints from the user.)(and accept hints from the user.)

Page 16: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

An Example of the ProblemAn Example of the Problem A query selects some fields A query selects some fields

of an index of an index and of huge table.and of huge table.

Bookmark plan: Bookmark plan: look in index for a subset.look in index for a subset. Lookup subset in Fat table.Lookup subset in Fat table.

This is This is great if subset << table.great if subset << table. terrible if subset ~ table.terrible if subset ~ table.

If statistics are wrong, If statistics are wrong, or if predicates not independent,or if predicates not independent,you get the wrong plan.you get the wrong plan.

How to fix the statistics?How to fix the statistics?

Ind

ex

Huge table

Page 17: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

A Fix: Let user ask for statsA Fix: Let user ask for stats Create Statistics on View(f1,..,fn)Create Statistics on View(f1,..,fn) Then the optimizer has the right dataThen the optimizer has the right data

Picks the right plan.Picks the right plan.Statistics on ViewsStatistics on Views, , C. Galindo-Legaria, M. Josi, F. Waas, M. Wu, VLDB 2003C. Galindo-Legaria, M. Josi, F. Waas, M. Wu, VLDB 2003, ,

Q3: Q3: Select count(*) from Galaxy Select count(*) from Galaxy where r < 22 and r_extinction > 0.120where r < 22 and r_extinction > 0.120

Bookmark: 34 M random IO, Bookmark: 34 M random IO, 520 minutes520 minutes Create Statistics on Galaxy(objID ) Create Statistics on Galaxy(objID ) Scan: 5 M sequential IOScan: 5 M sequential IO 18 minutes18 minutes

Ultimately this should be automated,Ultimately this should be automated, but for now,… it’s a step in the right direction.but for now,… it’s a step in the right direction.

Page 18: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Object Relational Has ArrivedObject Relational Has Arrived VMs are moving inside the DBVMs are moving inside the DB Yukon includes Common Language Runtime Yukon includes Common Language Runtime

(Oracle & DB2 have similar mechanisms).(Oracle & DB2 have similar mechanisms). So, C++, VB, C# and Java are So, C++, VB, C# and Java are

co-equal with TransactSQL.co-equal with TransactSQL. You can define classes and methodsYou can define classes and methods

SQL will store the instancesSQL will store the instancesAccess them via methodsAccess them via methods

You can put your analysis code You can put your analysis code INSIDEINSIDE the database. the database.

Minimizes data movement.Minimizes data movement.You can’t move petabytes to the clientYou can’t move petabytes to the clientBut we will soon have petabyte databases.But we will soon have petabyte databases.

datacode

datacode

+code

Page 19: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

And..And.. Fully-async and synchronous (blocking) callsFully-async and synchronous (blocking) calls

and multi-concurrent-result sets and multi-concurrent-result sets per connection (transaction)per connection (transaction)

Queues built in:Queues built in: Fire-and forget asynchronous processingFire-and forget asynchronous processing

It listens to Port 80 for SOAP calls :It listens to Port 80 for SOAP calls : TP-lite is backTP-lite is back

It’s a web service It’s a web service Notification service and Notification service and

data mining and data mining and olap and olap and reporting and reporting and xml and xml and xquery andxquery and.... ).... )

But, back to OR.But, back to OR.

Page 20: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

The HTM code body

The Pre CLR designThe Pre CLR design

Transact SQL sp_HTM(20 lines) 469 lines of

“glue”looking like: // Get Coordinates param datatype, and param length information of if (srv_paraminfo(pSrvProc, 1, &bType1, &cbMaxLen1, &cbActualLen1, NULL, &fNull1) == FAIL) ErrorExit("srv_paraminfo failed...");

// Is Coordinate param a character stringif (bType1 != SRVBIGVARCHAR && bType1 != SRVBIGCHAR &&

bType1 != SRVVARCHAR && bType1 != SRVCHAR)ErrorExit("Coordinate param should be a string.");

// Is Coordinate param non-nullif (fNull1 || cbActualLen1 < 1 || cbMaxLen1 <= cbActualLen1)

ErrorExit("Coordinate param is null.");

// Get pointer to Coordinate parampzCoordinateSpec = (char *) srv_paramdata (pSrvProc, 1);if (pzCoordinateSpec == NULL)

ErrorExit("Coordinate param is null.");pzCoordinateSpec[cbActualLen1] = 0;

// Get OutputVector datatype, and param length information if (srv_paraminfo(pSrvProc, 2, &bType2, &cbMaxLen2, &cbActualLen2, NULL, &fNull2) == FAIL) ErrorExit("Failed to get type info on HTM Vector param...");

Page 21: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

The “glue” CLR designThe “glue” CLR designDiscard 450 lines of UGLY Discard 450 lines of UGLY codecode

The HTM code body

C# SQL sp_HTM(50 lines)

using System;using System.Data;using System.Data.SqlServer;using System.Data.SqlTypes;using System.Runtime.InteropServices;namespace HTM {

public class HTM_wrapper {[DllImport("SQL_HTM.dll")] static extern unsafe void * xp_HTM_Cover_get (byte *str);public static unsafe void HTM_cover_RS(string input) {

// convert the input from Unicode (array of 2 bytes) to an array of bytes (not shown) byte * input; byte * output;

// invoke the HTM routine output = (byte *)xp_HTM_Cover_get(input);

// Convert the array to a tableSqlResultSet outputTable = SqlContext.GetReturnResultSet();

if (output[0] == 'O') { // if Output is “OK”uint c = *(UInt32 *)(s + 4); // cast results as datasetInt64 * r = ( Int64 *)(s + 8); // Int64 r[c-1,2]

for (int i = 0; i < c; ++i) { SqlDataRecord newRecord = outputTable.CreateRecord(); newRecord.SetSqlInt64(0, r[0]);

newRecord.SetSqlInt64(1, r[1]); r++;r++;outputTable.Insert(newRecord);

} } // return outputTable;} } }

Thanks!!! To Peter Kukol (who wrote this)

Page 22: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

The Clean CLR designThe Clean CLR designDiscard all glue codeDiscard all glue codereturn array cast as tablereturn array cast as table

CREATE ASSEMBLY HTM_AFROM '\\localhost\HTM\HTM.dll' CREATE FUNCTION HTM_cover( @input NVARCHAR(100) )RETURNS @t TABLE ( HTM_ID_START BIGINT NOT NULL PRIMARY KEY, HTM_ID_END BIGINT NOT NULL

)ASEXTERNAL NAME HTM_A:HTM_NS.HTM_C::HTM_cover

using System;using System.Data;using System.Data.Sql;using System.Data.SqlServer;using System.Data.SqlTypes;using System.Runtime.InteropServices;namespace HTM_NS {

public class HTM_C {public static Int64[,2] HTM_cover(string input) {

// invoke the HTM routine return (Int64[,2]) xp_HTM_Cover(input); // the actual HTM C# or C++ or Java or VB code goes here.

} } }

Your/My code

goes here

Page 23: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Performance (Beta1)Performance (Beta1) On a 2.2 Ghz Xeon On a 2.2 Ghz Xeon

Call a Transact SQL functionCall a Transact SQL function

3333μμss

Call a C# functionCall a C# function

5050μμss

Table valued functionTable valued function

not good in not good in ββ11

Array (== table) valued function 200 Array (== table) valued function 200 μμss

+ per row 27 + per row 27 μμs s

Page 24: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

CREATE ASSEMBLY ReturnOneAFROM '\\localhost\C:\ReturnOne.dll'GOCREATE FUNCTION ReturnOne_Int( @input INT) RETURNS INTAS EXTERNAL NAME ReturnOneA:ReturnOneNS.ReturnOneC::ReturnOne_IntGO----------------------------------------------- time echo an integerdeclare @i int, @j int, @cpu_seconds float, @null_loop floatdeclare @start datetime, @end datetimeset @j = 0set @i = 10000set @start = current_Timestampwhile(@i > 0) begin

set @j = @j + 1set @i = @i -1

endset @end = current_Timestampset @null_loop = datediff(ms, @start,@end) / 10.0

set @i = 10000set @start = current_Timestampwhile(@i > 0) begin

select @j = dbo.ReturnOne_Int(@i)set @j = @j + 1set @i = @i -1

endset @end = current_Timestamp set @cpu_seconds = datediff(ms, @start,@end) / 10.0 - @null_loopprint 'average cpu time for 1,000 calls to ReturnOne_Int was ' + str(@cpu_seconds,8,2)+ ' micro seconds'

The CodeThe Code

using System;using System.Data;using System.Data.SqlServer;using System.Data.SqlTypes;using System.Runtime.InteropServices;namespace ReturnOneNS { public class ReturnOneC {

public static int ReturnOne_Int(int input) {return input;

} }}

Function written in C# inside the DB

Program in DB in different

language (Tsql) calling function

Page 25: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

What Is the Significance?What Is the Significance?

No more inside/outside DB dichotomy.No more inside/outside DB dichotomy. You can put your code near the data.You can put your code near the data. Indeed, we are letting users put Indeed, we are letting users put

personal databases near the data personal databases near the data archive.archive.

This avoids moving large datasets.This avoids moving large datasets. Just move questions and answers.Just move questions and answers.

Page 26: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

Meta-MessageMeta-Message

Trying to fit science data into Trying to fit science data into databasesdatabases

When it does not fit, something is When it does not fit, something is wrong.wrong.

Look for solutionsLook for solutions Many solutions come from OR extensionsMany solutions come from OR extensions Some are fundamental engine changesSome are fundamental engine changes

More structure in DBMore structure in DB Richer operator setsRicher operator sets Better statisticsBetter statistics

Page 27: SkyServer Database Past, Present, and Future Jim Gray: Microsoft Alex Szalay (and friends): Johns Hopkins Help from: Cathan Cook (personal SkyServer),

© 2002 Microsoft Corporation. All rights reserved.© 2002 Microsoft Corporation. All rights reserved.This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.This presentation is for informational purposes only. MICROSOFT MAKES NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY.