Googol Data
Googol records (with MySQL)
IPC | October 2008 | Alex Aulbach
© MAYFLOWER GmbH 2008
„Googol records“
Definition: Googol
10^100
or
10 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000 000
000 000 000 000 000 000 000 000 000
or
“Imaginable big number”
Overview
What will the future bring for databases?
Is the way we usually access data the best one?
Patterns (or suggestions),
and a demonstration of how they could work with MySQL.
Discuss!
The (performance) future of the web
Only 10-20 % of the world population is “on the Internet”.
What will it be like with 80 %?
The (performance) future of the web
The world population is growing and people are getting older.
The (performance) future of the web
More specialized databases
More ways to access them
Much easier to access
Sharing knowledge vs. closed knowledge: Who wins?
Services become more dependent on each other
The web grows faster than Moore's Law! (Moore's Law: “only” a factor of 1000 in 20 years.)
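As a rough check of the “factor 1000 in 20 years” figure, assuming the common reading of Moore's Law as a doubling every two years (the slide itself does not state the doubling period):

```latex
\[
2^{20/2} = 2^{10} = 1024 \approx 1000
\]
```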
What does this mean for us?
We will surely run into problems
But we cannot say when, where or why
No boss: the data belongs to everyone
It’s like new roads
It’s “real life”!
Consequences of growth
New hardware will no longer solve speed problems
Even a new database will not
Even a rewrite of the application won’t
Need to rethink the problems from scratch!
Of course...
… need for splitting, sharding, partitioning, clustering etc.
… need to plan for growth from the beginning of the project.
… hardware resources can no longer be planned.
… need to distinguish data by importance.
… need to estimate instead of being exact.
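A minimal sketch of the sharding idea from the list above, assuming a fixed set of shards and a stable hash over the record key. The shard names and the `shard_for` and `distribute` helpers are hypothetical and only illustrate the routing step; they are not taken from the talk.

```python
import hashlib

# Hypothetical shard names; in a real setup these would be MySQL hosts or schemas.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]


def shard_for(key: str, shards=SHARDS) -> str:
    """Map a record key to a shard using a stable hash.

    hashlib is used instead of the built-in hash() because hash() is
    randomized per process for strings.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


def distribute(keys, shards=SHARDS):
    """Group keys by the shard they are routed to."""
    placement = {name: [] for name in shards}
    for key in keys:
        placement[shard_for(key, shards)].append(key)
    return placement


if __name__ == "__main__":
    keys = [f"user:{i}" for i in range(1000)]
    for name, members in distribute(keys).items():
        print(name, len(members))
```

With plain modulo routing like this, changing the number of shards moves most keys to a different shard, which is one reason growth has to be planned from the beginning of the project.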
But ...
Is this enough?
Patterns (or better: suggestions)
Brain storage engine.
Reading differs from writing.
Redundancy and specialization.
The storage itself can keep the information.
Time (and sleep).
The journey is the reward.
1 :: Brain storage engine :: 1
Short term memory (working memory)
Unsorted, unfiltered, any data
Fast reads
Very many fast updates/changes
Remembers which data is changed/invalid
Limited
1 :: Brain storage engine :: 2
Long-term memory
Presorted, well filtered data
Unlimited (well, more or less)
Extremely fast read access (sometimes)
Updates/inserts by repeating in working memory
Sleep helps to store better
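One way the two memories could be modeled is with a small unindexed table for writes and an indexed table that is only filled in batches. This is a sketch under assumptions: `sqlite3` stands in for MySQL so that it runs without a server, and the table names and the `remember`, `sleep_consolidate` and `recall` functions are invented for illustration.

```python
import sqlite3

# sqlite3 stands in for MySQL here so the sketch runs without a server.
conn = sqlite3.connect(":memory:")

# Working memory: small, no secondary indexes, cheap to insert and update.
conn.execute("CREATE TABLE working (id INTEGER, payload TEXT, changed_at INTEGER)")

# Long-term memory: keyed and indexed, written only in batches.
conn.execute(
    "CREATE TABLE longterm (id INTEGER PRIMARY KEY, payload TEXT, changed_at INTEGER)"
)
conn.execute("CREATE INDEX longterm_changed ON longterm (changed_at)")


def remember(record_id, payload, now):
    """Write into working memory only; nothing is sorted or indexed yet."""
    conn.execute("INSERT INTO working VALUES (?, ?, ?)", (record_id, payload, now))


def sleep_consolidate():
    """Move the newest version of each record into long-term memory."""
    conn.execute(
        """
        INSERT OR REPLACE INTO longterm (id, payload, changed_at)
        SELECT w.id, w.payload, w.changed_at
        FROM working AS w
        WHERE w.changed_at = (SELECT MAX(changed_at) FROM working WHERE id = w.id)
        """
    )
    conn.execute("DELETE FROM working")
    conn.commit()


def recall(record_id):
    """Prefer the fresh copy in working memory, fall back to long-term."""
    row = conn.execute(
        "SELECT payload FROM working WHERE id = ? ORDER BY changed_at DESC LIMIT 1",
        (record_id,),
    ).fetchone()
    if row is None:
        row = conn.execute(
            "SELECT payload FROM longterm WHERE id = ?", (record_id,)
        ).fetchone()
    return row[0] if row else None
```

Calling `remember` several times and then `sleep_consolidate` leaves the working table empty and one row per record in the long-term table, while `recall` keeps returning the newest value throughout.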
How does that model fit into real life?
Nobody expects to find old things fast
Telephone books
90/10 problems
Show
Searching in long-term memory.
Scaling of working/long-term memory vs. one table with inserts/updates/deletes.
2 :: Reading differs from writing
Look at the physical processes
Reading with the fingertips: no reading and writing at the same time
Handling reading and writing as different aspects of the same thing is a compromise
Only specialization enables good optimization
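The separation of reading and writing could look roughly like the sketch below. It assumes one writer node and several reader nodes that receive copies of the data; the `Store` and `ReadWriteRouter` classes are hypothetical stand-ins (plain dictionaries instead of MySQL servers), and the diagrams from the talk are not reproduced here.

```python
import itertools


class Store:
    """Stand-in for one database node (hypothetical; a dict instead of MySQL)."""

    def __init__(self, name):
        self.name = name
        self.rows = {}


class ReadWriteRouter:
    """Send every write to one writer and spread reads over the readers."""

    def __init__(self, writer, readers):
        self.writer = writer
        self.readers = readers
        self._next_reader = itertools.cycle(readers)

    def write(self, key, value):
        """Writes only ever touch the writer node."""
        self.writer.rows[key] = value

    def replicate(self):
        """Copy the writer's state to all readers (replication, simplified)."""
        for reader in self.readers:
            reader.rows = dict(self.writer.rows)

    def read(self, key):
        """Reads rotate over the readers and never touch the writer."""
        reader = next(self._next_reader)
        return reader.name, reader.rows.get(key)
```

Because readers only see data after `replicate` has run, a read directly after a write can return an old value. That is the price of letting each side specialize.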
Reader/Writer: Simplest layout
The web as storage?
Web can work like this
Recursive definition of the catalog
Scaling, setup as “black box”
Share everything
Comments
How does this scale?
What doesn’t work with this?
3 :: Redundancy and specialization :: 1
We cannot back up a googol
Nobody needs a backup, but everybody needs to restore
3 :: Redundancy and specialization :: 2
Redundancy:
Store the information in many places
Store more important information in more places
Specialization:
“Materialized views”
EAV modeling and pivoting
Take ideas from data warehouses and repositories
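The combination of EAV modeling, pivoting and a “materialized view” can be sketched as follows. MySQL has no built-in materialized views, so the read-optimized table is rebuilt by hand. `sqlite3` is used as a stand-in so the example runs without a server, and the table names, the `refresh_person_view` function and the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# EAV: one row per (entity, attribute, value); flexible but awkward to query.
conn.execute("CREATE TABLE eav (entity INTEGER, attribute TEXT, value TEXT)")
conn.executemany(
    "INSERT INTO eav VALUES (?, ?, ?)",
    [
        (1, "name", "Alice"),
        (1, "city", "Munich"),
        (2, "name", "Bob"),
        (2, "city", "Berlin"),
    ],
)


def refresh_person_view():
    """Rebuild a read-optimized table by pivoting the EAV rows."""
    conn.execute("DROP TABLE IF EXISTS person_view")
    conn.execute(
        """
        CREATE TABLE person_view AS
        SELECT entity AS id,
               MAX(CASE WHEN attribute = 'name' THEN value END) AS name,
               MAX(CASE WHEN attribute = 'city' THEN value END) AS city
        FROM eav
        GROUP BY entity
        """
    )
    conn.commit()


refresh_person_view()
rows = conn.execute("SELECT id, name, city FROM person_view ORDER BY id").fetchall()
print(rows)
```

The pivoted table is redundant on purpose: it stores the same information a second time, in the shape that readers need.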
3 :: Redundancy and specialization :: 3
The wheel comes full circle:
More important: more access.
More access: more need for redundancy.
More redundancy: more speed and reliability.
More speed and reliability: more important.
Implementation with Reader/Writer
4 :: The storage itself can keep the information.
“Storage always has physical limitations. The logical information in data that belongs together does not have any physical limitations.”
Alex Aulbach, Sept. 2008
The index is the problem!
The googol-universe is limited.
The index can take “half of the galaxies”.
Only the “rest” can be used for the data.
Less index means:
Faster search in the “needed” index.
Less time to write data and index.
Less time to warm up.
More space for the records.
Show
Access full table or split data into several parts.
Index size
Write
Presorted tables
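The original demo is not part of this transcript. As a rough illustration of splitting data into several parts, the sketch below keeps one small table and one small index per year, so that a write or a search only touches the part it needs. `sqlite3` stands in for MySQL, and the `events_<year>` tables, the `YEARS` list and both functions are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

YEARS = [2006, 2007, 2008]  # made-up parts for illustration

# One small table (and one small index) per year instead of one big table.
for year in YEARS:
    conn.execute(
        f"CREATE TABLE events_{year} (id INTEGER PRIMARY KEY, day INTEGER, note TEXT)"
    )
    conn.execute(f"CREATE INDEX events_{year}_day ON events_{year} (day)")


def insert_event(year, day, note):
    """A write touches only the table and index of one part."""
    if year not in YEARS:
        raise ValueError(f"no part for year {year}")
    conn.execute(f"INSERT INTO events_{year} (day, note) VALUES (?, ?)", (day, note))


def events_on(year, day):
    """Search only the part that can contain the answer."""
    if year not in YEARS:
        return []
    cursor = conn.execute(
        f"SELECT note FROM events_{year} WHERE day = ? ORDER BY id", (day,)
    )
    return [row[0] for row in cursor]
```

The table name is built from `year` only after it has been checked against `YEARS`, so no user input reaches the SQL text unchecked.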
5 :: Time (and sleep) :: 1
Human brain: Only three bits per second!
We all have been babies.
Trust! Just wait and see.
Developers (and customers) need to think in decades, not in days until the end of the project.
5 :: Sleep (and time) :: 2
Again the human brain: it learns while sleeping!
Why not apply this to databases?
Premise: Redundancy!
Dolphins sleep with only one hemisphere at a time.
The wheel comes full circle:
Redundancy.
Distinct read and write.
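The dolphin idea could translate into maintenance that takes one redundant copy offline at a time. The following is only a sketch under that assumption: the `Replica` and `Cluster` classes are hypothetical, and “reorganizing” is reduced to sorting a list.

```python
class Replica:
    """Stand-in for one redundant copy of the data (hypothetical)."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = list(rows)  # unsorted, as written
        self.asleep = False

    def reorganize(self):
        """Offline maintenance, here simply presorting the rows."""
        self.rows.sort()


class Cluster:
    """A group of replicas that never sleep all at once."""

    def __init__(self, replicas):
        if len(replicas) < 2:
            raise ValueError("sleeping needs redundancy: at least two replicas")
        self.replicas = replicas

    def awake(self):
        """Replicas that can currently answer queries."""
        return [r for r in self.replicas if not r.asleep]

    def nightly_maintenance(self):
        """Let one replica sleep at a time so the others keep answering."""
        served_by = []
        for replica in self.replicas:
            replica.asleep = True
            served_by.append([r.name for r in self.awake()])
            replica.reorganize()
            replica.asleep = False
        return served_by
```

The return value of `nightly_maintenance` lists which replicas were still available during each sleep phase; with a single replica the constructor refuses to build the cluster at all.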
Show
Well, I can’t show this, because it takes … time.
6 :: The journey is the reward
Future: Not so important how to search, but where.
Store step by step where to find the result, not the result.
You can find faster ways only by trying a shortcut.
It comes full circle:
Search many different ways and take the fastest.
While sleeping try out new things (dreaming).
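“Search many different ways and take the fastest” can be sketched as a small component that times several search strategies and remembers which one won, rather than caching the result itself. The two search functions and the `RouteMemory` class are illustrative assumptions, not the author's implementation.

```python
import time


def linear_search(data, target):
    """Scan from the start; works on any list."""
    for index, value in enumerate(data):
        if value == target:
            return index
    return None


def binary_search(data, target):
    """Halve the search range each step; requires sorted data."""
    low, high = 0, len(data) - 1
    while low <= high:
        mid = (low + high) // 2
        if data[mid] == target:
            return mid
        if data[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return None


class RouteMemory:
    """Remember which way was fastest for a kind of query, not the result."""

    def __init__(self, strategies):
        self.strategies = strategies
        self.best = {}  # query kind -> name of the fastest strategy so far

    def explore(self, kind, data, target):
        """Try every strategy ("dreaming") and store the fastest one."""
        timings = {}
        for name, search in self.strategies.items():
            start = time.perf_counter()
            search(data, target)
            timings[name] = time.perf_counter() - start
        self.best[kind] = min(timings, key=timings.get)
        return self.best[kind]

    def search(self, kind, data, target):
        """Use the remembered route; explore first if none is known."""
        if kind not in self.best:
            self.explore(kind, data, target)
        return self.strategies[self.best[kind]](data, target)
```

Calling `explore` again later, for example during a quiet period, lets the stored route change when the data or the workload has changed.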
Conclusion
Dreams may come true while sleeping.
We must invent the tools now
to solve the problems of the future.
Speed is not a matter of hardware
but of how things are done.
Never take speed for granted:
in a googol-universe wormholes exist!
Moore's Law may help, but do not trust it.
Thank you!
Alex Aulbach
Mayflower GmbH
Pleichertorstr. 2
97070 Würzburg, Germany
+49 (931) 35 9 65 - [email protected]