NoSQL and Column Store
CompSci 516: Database Systems
Lecture 20: NoSQL and Column Store
Instructor: Sudeepa Roy
Duke CS, Fall 2018
Reading Material
NoSQL:
• “Scalable SQL and NoSQL Data Stores”, Rick Cattell, SIGMOD Record, December 2010 (Vol. 39, No. 4)
• see webpage http://cattell.net/datastores/ for updates and more pointers
• MongoDB manual: https://docs.mongodb.com/manual/
Column Store:
• D. Abadi, P. Boncz, S. Harizopoulos, S. Idreos and S. Madden. The Design and Implementation of Modern Column-Oriented Database Systems. Foundations and Trends in Databases, vol. 5, no. 3, pp. 197–280, 2012.
• See VLDB 2009 tutorial: http://nms.csail.mit.edu/~stavros/pubs/tutorial2009-column_stores.pdf
Optional:
• “Dynamo: Amazon’s Highly Available Key-value Store”, Giuseppe DeCandia et al., SOSP 2007
• “Bigtable: A Distributed Storage System for Structured Data”, Fay Chang et al., OSDI 2006
NoSQL
See the optional/additional slides on MongoDB on the course website. May be useful for HW3.
So far: RDBMS
• Relational Data Model
• Relational Database Systems (RDBMS)
• RDBMSs have
– a complete pre-defined fixed schema
– a SQL interface
– and ACID transactions
Today
• NoSQL: “new” database systems
– not typically RDBMS
– relax some requirements to gain efficiency and scalability
• New systems choose to use/not use several concepts we learnt so far
– e.g. “System ---” does not use locks but uses multi-version CC (MVCC), or
– “System ---” uses asynchronous replication
• therefore, it is important to understand the basics (Lectures 1-18) even if they are not used in some new systems!
Warnings!
• Material from Cattell’s paper (2010-11): some info will be outdated
– see webpage http://cattell.net/datastores/ for updates and more pointers
• We will focus on the basic ideas of NoSQL systems
• Optional reading slides at the end on MongoDB
– may be useful for HW3
– there are also comparison tables in Cattell’s paper if you are interested
OLAP vs. OLTP
• OLTP (OnLine Transaction Processing)
– Recall transactions!
– Multiple concurrent read-write requests
– Commercial applications (banking, online shopping)
– Data changes frequently
– ACID properties, concurrency control, recovery
• OLAP (OnLine Analytical Processing)
– Many aggregate/group-by queries
– Multidimensional data
– Data mostly static
– Will study OLAP Cube soon
New Systems
• We will examine a number of SQL and so-called “NoSQL” systems or “data stores”
• Designed to scale simple OLTP-style application loads
– to do updates as well as reads
– in contrast to traditional DBMSs and data warehouses
– to provide good horizontal scalability (?) for simple read/write database operations distributed over many servers
• Originally motivated by Web 2.0 applications
– these systems are designed to scale to thousands or millions of users
New Systems vs. RDBMS
• When you study a new system, compare it with RDBMSs on its
– data model
– consistency mechanisms
– storage mechanisms
– durability guarantees
– availability
– query support
• These systems typically sacrifice some of these dimensions
– e.g. database-wide transaction consistency, in order to achieve others, e.g. higher availability and scalability
NoSQL
• Many of the new systems are referred to as “NoSQL” data stores
• NoSQL stands for “Not Only SQL” or “Not Relational”
– not entirely agreed upon
• Next: six key features of NoSQL systems
NoSQL: Six Key Features
1. the ability to horizontally scale “simple operations” throughput over many servers
2. the ability to replicate and to distribute (partition) data over many servers
3. a simple call level interface or protocol (in contrast to SQL binding)
4. a weaker concurrency model than the ACID transactions of most relational (SQL) database systems
5. efficient use of distributed indexes and RAM for data storage
6. the ability to dynamically add new attributes to data records
Important Examples of New Systems
• Three systems provided a “proof of concept” and inspired many other data stores
1. Memcached
2. Amazon’s Dynamo
3. Google’s BigTable
1. Memcached: main features
• popular open source cache
• supports distributed hashing (later)
• demonstrated that in-memory indexes can be highly scalable, distributing and replicating objects over multiple nodes
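One common way to spread cached objects over several nodes is consistent hashing. The sketch below is only an illustration of that general idea and is not Memcached's own client code; the class name `HashRing`, the node names, and the `replicas` parameter are assumptions made for the example.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a large integer position on the ring."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes."""

    def __init__(self, nodes, replicas=100):
        # Each physical node is placed on the ring `replicas` times.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._positions = [pos for pos, _ in self._ring]

    def node_for(self, key: str) -> str:
        """Return the node owning the first ring position after the key."""
        idx = bisect.bisect_right(self._positions, _hash(key))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["cache-a", "cache-b", "cache-c"])
owner = ring.node_for("user:42")  # one of the three node names
```

With this layout, removing a node only reassigns the keys that node owned; keys owned by the remaining nodes stay where they were.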
2. Dynamo: main features
• pioneered the idea of eventual consistency as a way to achieve higher availability and scalability
• data fetched are not guaranteed to be up-to-date
• but updates are guaranteed to be propagated to all nodes eventually
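The two guarantees above (reads may be stale, but updates reach every node eventually) can be illustrated with a toy model. This is a minimal sketch under simplifying assumptions, not Dynamo's design: it ignores failures and conflicting concurrent writes, and the names `EventuallyConsistentStore`, `write`, `read`, and `propagate` are hypothetical.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy model: a write lands on one replica and spreads to the rest later."""

    def __init__(self, n_replicas=3):
        self.replicas = [{} for _ in range(n_replicas)]
        self.pending = deque()  # queued (replica_index, key, value) updates

    def write(self, key, value, coordinator=0):
        # The write is acknowledged after reaching a single replica.
        self.replicas[coordinator][key] = value
        for i in range(len(self.replicas)):
            if i != coordinator:
                self.pending.append((i, key, value))

    def read(self, key, replica=0):
        # A read from a lagging replica may return stale or missing data.
        return self.replicas[replica].get(key)

    def propagate(self):
        # Background step: deliver every queued update to its replica.
        while self.pending:
            i, key, value = self.pending.popleft()
            self.replicas[i][key] = value


store = EventuallyConsistentStore()
store.write("cart", "book")
stale = store.read("cart", replica=2)   # None: update not yet delivered
store.propagate()
fresh = store.read("cart", replica=2)   # "book": replicas have converged
```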
3. BigTable: main features
• demonstrated that persistent record storage could be scaled to thousands of nodes
• “column families”
• https://cloud.google.com/bigtable/
• https://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf
BASE (not ACID)
• Recall ACID for RDBMS, the desired properties of transactions:
– Atomicity, Consistency, Isolation, and Durability
• NoSQL systems typically do not provide ACID
• Basically Available
• Soft state
• Eventually consistent
ACID vs. BASE
• The idea is that by giving up ACID constraints, one can achieve much higher performance and scalability
• The systems differ in how much they give up
– e.g. most of the systems call themselves “eventually consistent”, meaning that updates are eventually propagated to all nodes
– but many of them provide mechanisms for some degree of consistency, such as multi-version concurrency control (MVCC)
“CAP” Theorem
• Eric Brewer’s CAP theorem is often cited for NoSQL
• A system can have only two out of three of the following properties:
– Consistency: do all clients see the same data?
– Availability: is the system always on?
– Partition-tolerance: even if communication is unreliable, does the system function?
• The NoSQL systems generally give up consistency
– However, the trade-offs are complex
Two foci for NoSQL systems
1. “Simple” operations
2. Horizontal Scalability
1. “Simple” Operations
• Reading or writing a small number of related records in each operation
– e.g. key lookups
– reads and writes of one record or a small number of records
• This is in contrast to complex queries, joins, or read-mostly access
• Inspired by the web, where millions of users may both read and write data in simple operations
– e.g. search and update multi-server databases of electronic mail, personal profiles, web postings, wikis, customer records, online dating records, classified ads, and many other kinds of data
2. Horizontal Scalability
• Shared-Nothing Horizontal Scaling
• The ability to distribute both the data and the load of these simple operations over many servers
– with no RAM or disk shared among the servers
• Not “vertical” scaling
– where a database system utilizes many cores and/or CPUs that share RAM and disks
• Some of the systems we describe provide both vertical and horizontal scalability
2. Horizontal vs. Vertical Scaling
• Effective use of multiple cores (vertical scaling) is important
– but the number of cores that can share memory is limited
• Horizontal scaling is generally less expensive
– can use commodity servers
• Note: horizontal and vertical partitioning are not related to horizontal and vertical scaling (Lecture 18)
– except that they are both useful for horizontal scaling
What is different in NoSQL systems
• When you study a new NoSQL system, notice how it differs from RDBMS in terms of
1. Concurrency Control
2. Data Storage Medium
3. Replication
4. Transactions
Choices in NoSQL systems: 1. Concurrency Control
a) Locks
– some systems provide one-user-at-a-time read or update locks
– MongoDB provides locking at a field level
b) MVCC
c) None
– do not provide atomicity
– multiple users can edit in parallel
– no guarantee which version you will read
d) ACID
– pre-analyze transactions to avoid conflicts
– no deadlocks and no waits on locks
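Option (b), MVCC, keeps several versions of each value so that readers do not need locks: a reader sees the newest version committed at or before its snapshot. The following is a minimal single-process sketch of that idea, assuming a simple logical clock; the class `MVCCStore` and its methods are hypothetical and do not model any particular system.

```python
class MVCCStore:
    """Toy multi-version store: writers append versions, readers pick one."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), ascending
        self._clock = 0      # logical commit timestamp

    def write(self, key, value):
        # Every write creates a new version instead of overwriting in place.
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        # A reader remembers the clock value at the time it starts.
        return self._clock

    def read(self, key, snapshot_ts):
        # Return the newest version committed at or before the snapshot.
        result = None
        for ts, value in self._versions.get(key, []):
            if ts > snapshot_ts:
                break
            result = value
        return result


db = MVCCStore()
db.write("x", 1)
snap = db.snapshot()
db.write("x", 2)                    # a later writer does not block the reader
old = db.read("x", snap)            # 1: the reader still sees its snapshot
new = db.read("x", db.snapshot())   # 2: a new reader sees the latest version
```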
Choices in NoSQL systems: 2. Data Storage Medium
a) Storage in RAM
– snapshots or replication to disk
– poor performance when data overflows RAM
b) Disk storage
– caching in RAM
Choices in NoSQL systems: 3. Replication
• whether mirror copies are always in sync
a) Synchronous
b) Asynchronous
– faster, but updates may be lost in a crash
c) Both
– local copies synchronously, remote copies asynchronously
Choices in NoSQL systems: 4. Transaction Mechanisms
a) support
b) do not support
c) in between
– support local transactions only within a single object or “shard”
– shard = a horizontal partition of data in a database
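Option (c) can be pictured as a store that applies a multi-key update as one unit only when all keys live in the same shard. This is a hedged sketch, not the behavior of any specific product; `ShardedStore`, `shard_of`, and `local_transaction` are hypothetical names, and hashing keys with CRC32 is just one possible way to assign them to shards.

```python
import zlib


class ShardedStore:
    """Toy store whose data is horizontally partitioned into shards."""

    def __init__(self, n_shards=4):
        self.shards = [{} for _ in range(n_shards)]

    def shard_of(self, key: str) -> int:
        # Deterministic assignment of a key to one shard.
        return zlib.crc32(key.encode("utf-8")) % len(self.shards)

    def put(self, key, value):
        self.shards[self.shard_of(key)][key] = value

    def get(self, key):
        return self.shards[self.shard_of(key)].get(key)

    def local_transaction(self, updates: dict):
        """Apply all updates together, but only within a single shard."""
        shard_ids = {self.shard_of(k) for k in updates}
        if len(shard_ids) != 1:
            raise ValueError("keys span several shards: not supported")
        # All keys are local to one shard, so one in-memory update suffices.
        self.shards[shard_ids.pop()].update(updates)
```

A cross-shard update would need coordination between servers, which is exactly the cost such systems choose to avoid.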
Comparison from Cattell’s paper (2011)
Data Model Terminology for NoSQL
• Unlike SQL/RDBMS, the terminology for NoSQL is often inconsistent
– we are following the notation in Cattell’s paper
• All systems provide a way to store scalar values
– e.g. numbers and strings
• Some of them also provide a way to store more complex nested or reference values
Data Model Terminology for NoSQL
• The systems all store sets of attribute-value pairs
– but use four different data structures
1. Tuple
2. Document
3. Extensible Record
4. Object
1. Tuple
• Same as before
• A “tuple” is a row in a relational table
– attribute names are pre-defined in a schema
– the values must be scalar
– the values are referenced by attribute name
– in contrast to an array or list, where they are referenced by ordinal position
2. Document
• Allows values to be nested documents or lists as well as scalar values
– think about XML or JSON
• The attribute names are dynamically defined for each document at runtime
• A document differs from a tuple in that the attributes are not defined in a global schema
– and a wider range of values are permitted
3. Extensible Record
• A hybrid between a tuple and a document
• families of attributes are defined in a schema
• but new attributes can be added (within an attribute family) on a per-record basis
• Attributes may be list-valued
4. Object
• Analogous to an object in programming languages
– but without the procedural methods
• Values may be references or nested objects
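The four data structures can be contrasted with small in-memory stand-ins. These are illustrative Python analogies only, not the storage formats of any system; every name and value below (`CustomerTuple`, `customer_document`, `ALLOWED_FAMILIES`, `add_attribute`, `Customer`) is made up for the example.

```python
from dataclasses import dataclass, field
from typing import NamedTuple, Optional


# 1. Tuple: fixed attribute names from a schema, scalar values only.
class CustomerTuple(NamedTuple):
    id: int
    name: str
    city: str


# 2. Document: attribute names chosen per document; nesting and lists allowed.
customer_document = {
    "id": 1,
    "name": "Ann",
    "emails": ["ann@example.org"],
    "address": {"city": "Durham", "zip": "27708"},
}

# 3. Extensible record: attribute families are fixed by a schema,
#    but attributes inside a family can be added per record.
ALLOWED_FAMILIES = {"profile", "activity"}
customer_record = {"profile": {"name": "Ann"}, "activity": {}}


def add_attribute(record, family, name, value):
    if family not in ALLOWED_FAMILIES:
        raise KeyError(f"unknown attribute family: {family}")
    record.setdefault(family, {})[name] = value


# 4. Object: like a programming-language object without methods;
#    values may be references to other objects or nested objects.
@dataclass
class Customer:
    id: int
    name: str
    referred_by: Optional["Customer"] = None
    orders: list = field(default_factory=list)
```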
Example NoSQL systems
• Key-value Stores:
– Project Voldemort, Riak, Redis, Scalaris, Tokyo Cabinet, Memcached/Membrain/Membase
• Document Stores:
– Amazon SimpleDB, CouchDB, MongoDB, Terrastore
• Extensible Record Stores:
– HBase, HyperTable, Cassandra, Yahoo’s PNUTS
• Relational Databases:
– MySQL Cluster, VoltDB, Clustrix, ScaleDB, ScaleBase, NimbusDB, Google Megastore (a layer on BigTable)
SQL vs. NoSQL
Arguments for both sides: still a controversial topic
Why choose RDBMS over NoSQL: 1/3
1. If new relational systems can do everything that a NoSQL system can, with analogous performance and scalability (?), and with the convenience of transactions and SQL, NoSQL is not needed
2. Relational DBMSs have taken and retained majority market share over other competitors in the past 30 years
– (network, object, and XML DBMSs)
Why choose RDBMS over NoSQL: 2/3
3. Successful relational DBMSs have been built to handle other specific application loads in the past:
– read-only or read-mostly data warehousing
– OLTP on multi-core multi-disk CPUs
– in-memory databases
– distributed databases, and
– now horizontally scaled databases
Why choose RDBMS over NoSQL: 3/3
4. While there is no “one size fits all” among the SQL products themselves, they share a common interface (SQL, transactions, and relational schema) that gives advantages in training, continuity, and data interchange
Why choose NoSQL over RDBMS: 1/3
1. We haven’t yet seen good benchmarks showing that RDBMSs can achieve scaling comparable with NoSQL systems like Google’s BigTable
2. If you only require a lookup of objects based on a single key
– then a key-value store is adequate and probably easier to understand than a relational DBMS
– Likewise for a document store on a simple application: you only pay the learning curve for the level of complexity you require
Why choose NoSQL over RDBMS: 2/3
3. Some applications require a flexible schema
– allowing each object in a collection to have different attributes
– While some RDBMSs allow efficient “packing” of tuples with missing attributes, and some allow adding new attributes at runtime, this is uncommon
Why choose NoSQL over RDBMS: 3/3
4. A relational DBMS makes “expensive” (multi-node multi-table) operations “too easy”
– NoSQL systems make them impossible or obviously expensive for programmers
5. While RDBMSs have maintained majority market share over the years, other products have established smaller but non-trivial markets in areas where there is a need for particular capabilities
– e.g. indexed objects with products like BerkeleyDB, or graph-following operations with object-oriented DBMSs
Column Store
Row vs. Column Store
• Row store
– store all attributes of a tuple together
– storage like “row-major order” in a matrix
• Column store
– store all rows for an attribute (column) together
– storage like “column-major order” in a matrix
• e.g.
– MonetDB, Vertica (earlier, C-store), SAP/Sybase IQ, Google Bigtable (with column groups)
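The two layouts can be mimicked with plain Python containers. This is a minimal sketch with made-up data, meant only to show which values sit together in each layout; the names `NAMES`, `rows`, `to_column_store`, and `fetch_tuple` are assumptions for illustration and say nothing about how the systems listed above store data on disk.

```python
NAMES = ("id", "name", "age")

# Row store: all attributes of one tuple are stored together.
rows = [(1, "Ann", 30), (2, "Bob", 25), (3, "Cy", 41)]


def to_column_store(rows, names):
    """Column store: all values of one attribute are stored together."""
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


def average_age_row_store(rows):
    # Has to walk over whole tuples even though only one attribute is used.
    age_idx = NAMES.index("age")
    return sum(row[age_idx] for row in rows) / len(rows)


def average_age_column_store(columns):
    # Touches only the "age" column.
    ages = columns["age"]
    return sum(ages) / len(ages)


def fetch_tuple(columns, names, position):
    # Rebuilding one full tuple needs one access per column.
    return tuple(columns[name][position] for name in names)


columns = to_column_store(rows, NAMES)
```

An aggregate over a single attribute reads one list in the column layout, while reassembling a complete tuple is the operation that becomes more expensive.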
Ack: Slide from VLDB 2009 tutorial on Column store