Databases in Internet Applications: Case Studies
Anil Nori, CTO
AserA Inc., Palo Alto, USA
2
Acknowledgements
Sources for some of the material:
• Oracle Corporation
• CNN Custom News
• Excite
• Cisco
3
Database Technology Timeline
[Figure: database technology timeline, progressing from Simple Data Management to Global Enterprise Management]
Periods: Early 80s; Late 80s; Early - Mid 90s; Late 90s - 21st C
Generations: Pre-relational; Early Relational; Client-server Relational; Enterprise-capable Relational; Internet Computing
Workloads: Simple OLTP; Active Database; Data Warehouse & Hi-end OLTP; Packaged & Vertical Applications
Capabilities:
• Simple transactions, on-line backup & recovery
• Stored procedures, triggers
• Scalable OLTP, parallel query, partitioning, cluster support, row-level locking, high availability
• Middleware (messaging, queues, events), Java, CORBA, Web interfaces
• Support for all types of data, extensibility, objects
Current State of DBMSs
OLTP applications
• Large amounts of data
• Simple data, simple queries and updates
  Update statement from debit/credit transaction:
    UPDATE accounts
    SET abalance = abalance + :delta
    WHERE aid = :aid;
• Typically update intensive
• Large number of concurrent users (transactions)
Data warehousing applications
• Large amounts of data
• Simple data but complex querying
• Typically read intensive
• Large number of users
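The debit/credit update in the OLTP list above can be run end to end. The following is a minimal sketch in Python, assuming SQLite (through the standard `sqlite3` module) as a stand-in for the DBMS. The `accounts` table layout and the `apply_delta` helper are illustrative assumptions, not part of the original slides.

```python
import sqlite3


def apply_delta(conn, aid, delta):
    """Apply one debit/credit update in its own transaction; return rows changed."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE accounts SET abalance = abalance + :delta WHERE aid = :aid",
            {"delta": delta, "aid": aid},
        )
    return cur.rowcount


# Hypothetical schema: one row per account, balance kept as an integer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (aid INTEGER PRIMARY KEY, abalance INTEGER NOT NULL)"
)
conn.execute("INSERT INTO accounts VALUES (1, 100)")
conn.commit()

apply_delta(conn, 1, -30)  # debit 30 from account 1
balance = conn.execute("SELECT abalance FROM accounts WHERE aid = 1").fetchone()[0]
print(balance)
```

The named placeholders mirror the `:delta` and `:aid` bind variables in the slide's statement, so the statement text stays constant across calls.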
Current State of DBMSs
These applications require:
• Large numbers of users/transactions
• High performance
• High availability (7x24 operations)
• Scalability
• High levels of security
• Administrative support
• Good utilities
6
Internet Applications: Challenges
• Size: Terabytes vs. Gigabytes
• Usage: Immediate vs. Batch
• Importance: Business-Critical vs. Useful
• Users: Every Employee vs. Analysts
• Larger User Populations: Self-Service vs. Trained
• Network Systems: Integrated vs. Independent
• Systems Management: Intelligent vs. Simple
• Operations Hours: Global vs. Local
Areas: Transaction Processing, Data Warehousing
7
Internet Applications: Challenges
Information Management
• Type: Heterogeneous vs. Tabular
• Delivery: Personalized vs. Generic
• Access: Lots of read-only vs. Read/write
• Content: Search vs. Direct
E-commerce/Apps
• APIs: Open vs. Proprietary
• Applications: Integrated vs. Standalone
Site Operation
• Management: Low TCO, Mission Critical
• Availability: 24X7 vs. Occasional
8
Internet Challenges
Availability
• Need near 100% availability
• Must be easy to manage
• Replication, hot standby, foolproof system?
Scalability
• Number of users is orders of magnitude higher
Security
• Global users
• Managing millions of users
• Encryption
• Performance
Internet user expectations
• Speed vs correctness (e.g. Search engines vs blade/cartridge/extender)
• Availability vs correctness
9
Internet Application Architecture: Today
[Figure: three-tier architecture diagram]
Client Tier: Browsers; authoring tools etc. (HTTP)
Physical Middle Tier: WEB/APP Server; Middle Tier Application (application messages, remote messages)
Data Sources: ORDBMS; Other Data Sources; Gateways; OLE/DB data source
Other label: Data Integration, Storage, Query, Management
10
Case Studies
CNN Custom News
Excite
Cisco Internet Applications
11
CNN Custom News
On-line news service
Allows users to customize news in a personalized manner
Offers a variety of news items (e.g. national, international, business etc.)
12
Custom News Application Architecture
[Figure: Custom News architecture diagram]
Client Tier: Browsers (HTTP)
Hardware Load Balancing
Physical Middle Tier: multiple WEB Servers and multiple Application Servers
Database Tier: two Oracle DBMS instances (OPS)
13
CNN Custom News
Backend:
• SUN Solaris enterprise servers
• Oracle Parallel Server 7.3.4
Middle-Tier (9 Machines)
• Web Servers
• Oracle Application Servers
• PL/SQL Cartridges
Load Balancing
• Hardware based
• DNS router
• Round-robin
Oracle Application Server
[Figure: Oracle Application Server. Labels: CORBA Backend; Adapter; three Cartridges]
15
CNN Custom News
Data feeds into the database
Keeps text in the database
Images in files
Images accessed in the middle-tier
PL/SQL Cartridge
16
PL/SQL Cartridge
[Figure: OAS hosting the PL/SQL Cartridge, connected to PL/SQL in the Oracle DBMS]
PL/SQL Cartridge functions:
• Connection pooling
• Session caching
• Parameter marshalling
• Validation
• Result processing
17
PL/SQL
Server-side
Used to generate HTML
Suited for database logic
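PL/SQL is described above as server-side code used to generate HTML. The sketch below shows the same idea in Python with SQLite instead of PL/SQL and Oracle; both substitutions are assumptions made so the example is self-contained. The `news` table and the `render_headlines` function are hypothetical.

```python
import html
import sqlite3


def render_headlines(conn, category):
    """Query headlines for one category and return them as an HTML list."""
    rows = conn.execute(
        "SELECT title FROM news WHERE category = ? ORDER BY id", (category,)
    ).fetchall()
    # Escape each title so markup characters in the data cannot break the page.
    items = "".join(f"<li>{html.escape(title)}</li>" for (title,) in rows)
    return f"<ul>{items}</ul>"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE news (id INTEGER PRIMARY KEY, category TEXT, title TEXT)")
conn.executemany(
    "INSERT INTO news (category, title) VALUES (?, ?)",
    [("business", "Markets & rates"), ("national", "Election update")],
)
conn.commit()

page = render_headlines(conn, "business")
print(page)
```

Keeping the query and the markup generation in one place is what makes this style database-centric: the page is a function of the rows.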
18
Searching
Uses Oracle ConText cartridge
Content-based searching
Uses bitmap indexes
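To illustrate how a bitmap index can support content-based search, the sketch below keeps one bit vector per term, with bit i set when document i contains the term. This is a toy model in Python under stated assumptions (whitespace tokenization, Python integers as bit vectors). It is not how the ConText cartridge is implemented, and `BitmapIndex` is a hypothetical name.

```python
from collections import defaultdict


class BitmapIndex:
    """Toy term index: each term maps to an int used as a bit vector of doc ids."""

    def __init__(self):
        self._bitmaps = defaultdict(int)
        self._docs = []

    def add(self, text):
        doc_id = len(self._docs)
        self._docs.append(text)
        for term in set(text.lower().split()):
            self._bitmaps[term] |= 1 << doc_id  # set this document's bit
        return doc_id

    def search_all(self, *terms):
        """Return ids of documents containing every given term (bitwise AND)."""
        if not terms:
            return []
        result = -1  # all bits set, so the first AND yields the first bitmap
        for term in terms:
            result &= self._bitmaps.get(term.lower(), 0)
        return [i for i in range(len(self._docs)) if (result >> i) & 1]


index = BitmapIndex()
index.add("Stocks rally in Asia")
index.add("Asia summit opens")
index.add("Stocks fall")
print(index.search_all("stocks", "asia"))
```

Conjunctive queries reduce to a bitwise AND over the term bitmaps, which is the property that makes bitmap indexes attractive for read-heavy search.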
19
CNN Custom News: Observations
Database-centric
Uses PL/SQL based scripting
Application Server for scalability
20
Excite
Personalized online service that gives Web users everything they want, all in one place
Builds tools that manage vast amounts of information available on the internet
Provides a variety of user services (apps):
• News
• Money and Investing -- stock quotes
• Message boards and Chat
• Mail
• Communities
• Classifieds
• Jobs
21
Excite
Supports a suite of applications
Each application uses three-tier architecture
Federated approach
• Many databases
• Databases specific to applications
Application logic in the middle-tier as multi-threaded embedded C programs (Pro*C programs)
22
Excite: An Application Architecture
[Figure: Excite application architecture diagram]
Client Tier: Browsers (HTTP)
Physical Middle Tier: WEB Servers and multiple Middle Tier Application processes
Database Tier: Oracle DBMS
23
Excite - PFP Application
Personalized front page application
Application is deployed as 50 middle-tier daemon processes
The middle-tier application daemons perform:
• Application logic in C
• Connection pooling: each daemon keeps about 40 connections to the database (about 2000 total connections to the database)
• Load balancing
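The connection figures above follow from simple arithmetic: 50 daemons with about 40 connections each gives about 2000 database connections. The sketch below shows a fixed-size pool of the kind a single daemon might hold. It is written in Python with SQLite connections as stand-ins, whereas the original daemons are C/Pro*C programs talking to Oracle; `ConnectionPool` is a hypothetical helper.

```python
import queue
import sqlite3
from contextlib import contextmanager


class ConnectionPool:
    """Fixed-size pool: connections are opened once and reused across requests."""

    def __init__(self, size, connect):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        conn = self._idle.get(timeout=timeout)  # blocks while all are in use
        try:
            yield conn
        finally:
            self._idle.put(conn)  # always return the connection to the pool

    def idle_count(self):
        return self._idle.qsize()


DAEMONS = 50                 # from the slide
CONNECTIONS_PER_DAEMON = 40  # "about 40" per the slide
total_connections = DAEMONS * CONNECTIONS_PER_DAEMON

pool = ConnectionPool(CONNECTIONS_PER_DAEMON, lambda: sqlite3.connect(":memory:"))
with pool.connection() as conn:
    idle_while_borrowed = pool.idle_count()
    value = conn.execute("SELECT 1").fetchone()[0]
print(total_connections, idle_while_borrowed, pool.idle_count())
```

Pooling caps the number of sessions the database has to manage no matter how many browser requests arrive at the middle tier.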
24
Excite - PFP Database Configuration
Oracle8 on SUN Solaris server
• 2 SUN 6500s -- 28-way SMP
PFP database is split into multiple databases for load balancing and scalability
Scalar data stored in the database in relational tables
About 20 tables for storing user profiles; 100 tables for content
25
Excite - PFP Database Configuration
Multi-media content (e.g. stock quotes or news items) stored in memory-mapped files for fast access. File references stored in the database
Much of the content is read-only; it need not be backed up and can be reconstructed from the original sources
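The split described above, content in memory-mapped files and only file references in the database, can be sketched as follows. This is an illustrative Python example using the standard `mmap` and `sqlite3` modules. The `content_refs` table, the helper names, and the sample payload are assumptions rather than details of Excite's system.

```python
import mmap
import os
import sqlite3
import tempfile


def store_content(conn, directory, key, payload):
    """Write the payload to a file and record only its path in the database."""
    path = os.path.join(directory, f"{key}.bin")
    with open(path, "wb") as f:
        f.write(payload)
    with conn:
        conn.execute(
            "INSERT OR REPLACE INTO content_refs (key, path) VALUES (?, ?)",
            (key, path),
        )


def read_content(conn, key):
    """Look up the file reference, then read the bytes through a memory map."""
    (path,) = conn.execute(
        "SELECT path FROM content_refs WHERE key = ?", (key,)
    ).fetchone()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            return mapped[:]  # copy out so the map can be closed


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE content_refs (key TEXT PRIMARY KEY, path TEXT NOT NULL)")

with tempfile.TemporaryDirectory() as tmp:
    store_content(conn, tmp, "news-item-1", b"sample news payload")
    payload = read_content(conn, "news-item-1")
print(payload)
```

Because the files can be rebuilt from the original feeds, only the small reference table needs the database's backup and recovery machinery.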
26
Excite - Scalability
By partitioning the application across multiple databases
Each application partition supported by multiple middle-tier daemon processes
Multiple web servers to reduce traffic congestion
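Partitioning across multiple databases needs a stable rule for deciding which database holds a given user's data. One common option is hashing the user id, sketched below in Python. The slides do not say which rule Excite used, so `partition_for` and the partition count are assumptions.

```python
import hashlib


def partition_for(user_id, partitions):
    """Map a user id to a partition number in [0, partitions), stably across runs."""
    if partitions <= 0:
        raise ValueError("partitions must be positive")
    # A cryptographic hash is used only for its even, deterministic spread.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions


PARTITIONS = 4  # hypothetical number of PFP databases
assignments = {user: partition_for(user, PARTITIONS) for user in ("alice", "bob", "carol")}
print(assignments)
```

A middle-tier daemon would use the returned number to pick the connection pool for that database. Changing the partition count remaps most users, which is the usual cost of plain modulo hashing.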
27
Excite - Availability
Using replication and hot standby
Uses Oracle8 hot standby feature
Uses asynchronous replication. Data replicated at 10 sec latency
Almost every database is replicated for failover
Replication preferred over hot standby. Hot standby cannot be used for normal usage
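Asynchronous replication means a write is acknowledged on the primary before the replica has seen it, so the replica lags by up to the shipping interval (about 10 seconds above). The following Python sketch models that behaviour with in-memory dictionaries. It is a conceptual illustration, not Oracle's replication mechanism, and all class and function names are hypothetical.

```python
from collections import deque


class Primary:
    def __init__(self):
        self.data = {}
        self.log = deque()  # changes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))  # acknowledged before the replica sees it


class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, changes):
        for key, value in changes:
            self.data[key] = value


def ship_log(primary, replica):
    """Send pending changes in order; in practice this would run on a timer."""
    batch = list(primary.log)
    primary.log.clear()
    replica.apply(batch)
    return len(batch)


primary, replica = Primary(), Replica()
primary.write("headline", "v1")
replica_before = dict(replica.data)  # replica has not caught up yet
shipped = ship_log(primary, replica)
print(replica_before, shipped, replica.data)
```

Unlike a hot standby, the replica here is an ordinary readable copy, which matches the slide's point that replicated databases can also serve normal usage.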
28
Excite - Other Applications
Most of the Excite applications have similar three-tier architecture
29
Excite - Observations
Some content (especially for communities applications) could be stored in the database. The management benefits are attractive. If content is stored in the database, access performance is critical
Need fast replication
Currently not using middle-tier caching. Caching could be quite useful but coherency is an issue
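The coherency concern noted above can be made concrete with a small sketch. A middle-tier cache that invalidates entries on its own writes stays correct for those writes, but a write that reaches the database by another path leaves a stale entry behind. The Python code below uses a dictionary as a stand-in for the database; `CachedStore` is a hypothetical name.

```python
class CachedStore:
    """Read-through cache that invalidates an entry when it is written through."""

    def __init__(self, backing):
        self._backing = backing  # stands in for the database
        self._cache = {}
        self.misses = 0

    def get(self, key):
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._backing[key]
        return self._cache[key]

    def put(self, key, value):
        self._backing[key] = value
        self._cache.pop(key, None)  # invalidate so the next read refetches


database = {"headline": "v1"}
store = CachedStore(database)

first = store.get("headline")    # miss, loads v1
store.put("headline", "v2")      # write through this cache: entry invalidated
second = store.get("headline")   # miss again, loads v2

database["headline"] = "v3"      # another process updates the database directly
stale = store.get("headline")    # served from the cache: still v2
print(first, second, stale, store.misses)
```

With many middle-tier daemons each holding a cache, every daemon is "another process" to the others, which is why coherency needs either invalidation messages or bounded staleness.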
30
Cisco
Successfully implemented applications for the internet
Internet commerce
• Order placement
• Checking order status
• On-line, guided product configuration
• Price quotes
Employee self-service
• Provides all employee services electronically
• Employee directories
• Employee benefits
• Expense reports
31
Cisco
Supply chain management
• Networked suppliers, resellers and customers
• Enables business partners to manage and operate major portions of its supply chain
• Entire supply chain works off one central demand forecast
Customer care
• Exchange of technical information
• Software upgrades (90% of software upgrades via internet)
• On-line support (70% of support on-line)
• On-line, assisted trouble-shooting
32
Cisco
Communications and collaboration
• Sales and technical training
• Virtual classrooms
• Company-wide meetings and broadcasts
33
Cisco Commerce Server Architecture
[Figure: Cisco commerce server architecture diagram]
Client Tier: Browsers (HTTP)
Physical Middle Tier: WEB Server; Commerce Server
Database Tier: Oracle DBMS (two instances); Oracle Applications
34
Cisco Commerce Server
Typical three-tier architecture
Proprietary web server
• Performs content aggregation
• Encryption
• Accesses Oracle DBMS
• Runs on a dedicated SUN server
Proprietary commerce server
• Proprietary application server
• Performs a variety of commerce functions
35
Cisco Commerce Server
Scalability and availability
• Big servers for scalability
• Multiple commerce server processes for load balancing
• Databases replicated
• Hot standby for availability
36
Case Studies: Observations
Database is being used mostly for storage
Application in the middle-tier
Middle-tier also provides:
• scalability
• load balancing
• large number of users
37
Analyzing Internet Applications
Web integration
Web publishing
Application integration
E-commerce
38
WEB Integration
Heterogeneous data sources
Heterogeneous data types
1000s of data sources
Dynamic data
Warehousing
Web Publishing
Problem: internet placing new requirements on content management
• Heterogeneity: access different types of content from browsers e.g. Email, data warehouses, reports, HTML files
• Personalized: structured, dynamic, customized content
• Transactive: content blending with application
• Aggregation: portalization via major “gateways”
40
Application Integration
Integrating Multiple Applications (e.g. ERP/Front Office)
• Application workflow specification
Asynchronous communication
• Queuing and propagation
Message tracking
Message warehouse (persistence)
• Message broker/server
Data transformation
• Transforming messages to different application formats (e.g. SAP, CLARIFY, …)
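Data transformation between applications often comes down to renaming and selecting fields. The sketch below shows a declarative field mapping in Python. The field names on both sides are invented for illustration and do not correspond to any real SAP or Clarify format.

```python
def transform(message, field_map):
    """Build a target-format message by renaming mapped source fields.

    Unmapped source fields are dropped; a missing source field raises KeyError.
    """
    return {target: message[source] for source, target in field_map.items()}


# Hypothetical mapping from a front-office order to an ERP order.
FRONT_OFFICE_TO_ERP = {
    "cust_name": "CustomerName",
    "order_no": "OrderNumber",
}

source_message = {"cust_name": "Acme", "order_no": "42", "note": "internal only"}
erp_message = transform(source_message, FRONT_OFFICE_TO_ERP)
print(erp_message)
```

Keeping the mapping as data rather than code is one reason a database is a natural home for transformation rules: they can be stored, queried, and versioned like any other table.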
41
Electronic Commerce
Automating business-to-business, business-to-consumer interactions
• Selling and buying: order management, product catalogs, product configuration
• Sales and marketing
• Education and training
• Service
• Communities
Database Technology Uses
Business/workflow transactions
• Support across multiple database/ERP systems
• Transactional
• Tools to generate compensating actions
• Transformations
Queuing
• Support for heterogeneous messages
• Transactional
• Querying, e.g. on attribute, value pairs
• Indexing, e.g. on attribute, value pairs
• Publish/subscribe
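The queuing items above combine two ideas: messages carry attribute/value pairs, and subscribers register interest by attribute values. A minimal in-memory sketch in Python follows. `Broker` is a hypothetical class, and a real message broker would add persistence and transactions, which are omitted here.

```python
class Broker:
    """Deliver each published message to subscribers whose filters all match."""

    def __init__(self):
        self._subscriptions = []

    def subscribe(self, filters, callback):
        # filters: attribute -> required value; an empty dict matches everything.
        self._subscriptions.append((dict(filters), callback))

    def publish(self, message):
        delivered = 0
        for filters, callback in self._subscriptions:
            if all(message.get(attr) == value for attr, value in filters.items()):
                callback(message)
                delivered += 1
        return delivered


broker = Broker()
received = []
broker.subscribe({"type": "order", "region": "EU"}, received.append)

eu_count = broker.publish({"type": "order", "region": "EU", "id": 1})
us_count = broker.publish({"type": "order", "region": "US", "id": 2})
print(eu_count, us_count, received)
```

The linear scan over subscriptions is where indexing on attribute/value pairs would pay off once the number of subscribers grows.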
Database Technology Uses
Rule engines
• Complex business processing rules
• Customization/profiling rules: business domain rules, presentation rules
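A customization/profiling rule can be modelled as a condition over a user profile paired with an action. The Python sketch below evaluates such rules in order. The `Rule` structure, the profile fields, and the actions are all hypothetical, chosen only to show the shape of a rule engine.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]  # decides whether the rule fires
    action: Callable[[dict], str]      # produces the rule's result


def evaluate(rules, profile):
    """Return the results of every rule whose condition holds, in rule order."""
    return [rule.action(profile) for rule in rules if rule.condition(profile)]


rules = [
    Rule(
        "show_stocks",
        lambda p: "investing" in p.get("interests", []),
        lambda p: "stock-quotes panel",
    ),
    Rule(
        "local_news",
        lambda p: p.get("region") is not None,
        lambda p: f"news for {p['region']}",
    ),
]

panels = evaluate(rules, {"interests": ["investing"], "region": "Palo Alto"})
print(panels)
```

Storing rules in the database would let them be changed without redeploying the middle-tier application, which is one motivation for treating rule engines as a database technology use.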
Repositories for Application Development
• Managing Java objects, interfaces, etc.
• Must for application integration
• Standardized object models and protocols
• Directories vs repositories
Database Technology Uses
XML support
• XML schema/storage
• XML caching
• XML querying
• Coexistence with SQL -- current efforts seem disjoint
Multiple caches
• Consistency of middle-tier and database caches
Data mining
• Algorithms need to become more pragmatic
45
Database Technology Uses
Internet user expectations
• Speed vs correctness (e.g. Search engines vs blade/cartridge/extender)
• Availability vs correctness
Component Architecture
• Caching
• XML support
• Querying
• Transactions
• Rule engines
• Metadata management
• Queueing
Database Technology Uses
Availability
• Need near 100% availability
• Must be easy to manage
• Replication, hot standby, foolproof system?
Scalability
• Number of users is orders of magnitude higher
Security
• Global users
• Managing millions of users
• Encryption
• Performance
47
Internet Applications Architecture: Future
[Figure: future architecture diagram; components exchange XML]
Client Tier: Browsers; XML enabled tools (authoring tools etc.)
Logical Middle Tier: WEB/APP Server; XML enabled Application Messages; XML Integration & Query Server; Warehouse Server
Data Sources: ORDBMS; XML Database; XML documents on the Web; other documents on the Web (e.g. HTML, WORD); OLE/DB data source
Other labels: XML enabled; XML Transformer & Gateway
48
XML in the Database
XML has the potential to impact four important markets
• Web integration
• Web publishing
• Application integration
• Electronic commerce
XML-enable the DBMS
“XML-enable” the database system
• Store XML data/documents in the database server
• Querying and searching of structured and unstructured XML
• Generate XML data from the database server
• Add XML capabilities in supporting database facilities
[Figure: XML-enabled DBMS. Labels: Store XML; Generate XML; Integrate with other facilities]
Store XML Data
Enhance XML storage facilities in the database with support in utilities
• Facilities to load XML data into the database
• Provide more efficient database storage (componentized storage, compression, indexing,…)
• XML export facilities from the server
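One way to load XML data into a database is to break each element into a relational row. The Python sketch below does this with the standard `xml.etree.ElementTree` and `sqlite3` modules. The `orders` document shape, the table, and the `load_orders` helper are assumptions for illustration, and SQLite stands in for the database server.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical input document.
ORDERS_XML = """
<orders>
  <order id="A1"><customer>Acme</customer><total>120.50</total></order>
  <order id="A2"><customer>Globex</customer><total>75</total></order>
</orders>
"""


def load_orders(conn, xml_text):
    """Parse the document and insert one row per <order>; return rows loaded."""
    root = ET.fromstring(xml_text)
    rows = [
        (order.get("id"), order.findtext("customer"), float(order.findtext("total")))
        for order in root.iter("order")
    ]
    with conn:  # load the whole document in one transaction
        conn.executemany(
            "INSERT INTO orders (id, customer, total) VALUES (?, ?, ?)", rows
        )
    return len(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, customer TEXT, total REAL)")
loaded = load_orders(conn, ORDERS_XML)
grand_total = conn.execute("SELECT SUM(total) FROM orders").fetchone()[0]
print(loaded, grand_total)
```

Storing the pieces as columns makes them indexable and queryable with ordinary SQL, at the cost of fixing the document shape in advance.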
51
Search and Query XML Data
Search XML data efficiently
• Special SQL queries over structured + unstructured XML
• Content-based indexing (e.g. text indexes) for searching XML data efficiently
• Support for XML query languages (e.g. XQL) on XML data
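The two kinds of search named above, structured queries and content-based search, can be illustrated on a small document. The Python sketch below uses the limited XPath subset of `xml.etree.ElementTree` as a stand-in for an XML query language such as XQL, and a plain substring scan in place of a text index. The document and function names are hypothetical.

```python
import xml.etree.ElementTree as ET

NEWS_XML = """
<news>
  <item category="business">
    <title>Markets open higher</title>
    <body>Stocks rose in early trading.</body>
  </item>
  <item category="national">
    <title>Budget vote</title>
    <body>The vote is expected today.</body>
  </item>
</news>
"""


def titles_in_category(root, category):
    """Structured query: select items by attribute value.

    The category is interpolated into the path, so it must not contain quotes.
    """
    return [
        item.findtext("title")
        for item in root.findall(f".//item[@category='{category}']")
    ]


def titles_mentioning(root, word):
    """Content-based search: scan all text inside each item for a word."""
    word = word.lower()
    return [
        item.findtext("title")
        for item in root.iter("item")
        if word in "".join(item.itertext()).lower()
    ]


root = ET.fromstring(NEWS_XML)
print(titles_in_category(root, "business"), titles_mentioning(root, "vote"))
```

A database-resident version would answer the first query from an index on the attribute and the second from a text index, rather than scanning every document.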
Generate XML
Generate XML from the database server
• Map SQL92, SQL3 and PL/SQL datatypes to XML
• Provide mappings between Java, SQL and XML types
Script XML content from the database
• Allow SQL queries to return XML results
• Provide embedded XML in stored procedures
• Java scripting: support embedded XML in Java
• Common APIs to access any XML content in databases
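Letting a SQL query return XML amounts to mapping each result row to an element and each column to a child element. The Python sketch below does this generically using cursor metadata. It assumes SQLite as the database, column names that are valid XML element names, and a simple everything-as-text type mapping; `query_to_xml` is a hypothetical helper.

```python
import sqlite3
import xml.etree.ElementTree as ET


def query_to_xml(conn, sql, params=(), root_tag="rows", row_tag="row"):
    """Run a query and serialize the result set as one XML string."""
    cur = conn.execute(sql, params)
    columns = [description[0] for description in cur.description]
    root = ET.Element(root_tag)
    for record in cur:
        row = ET.SubElement(root, row_tag)
        for name, value in zip(columns, record):
            # Simplistic type mapping: NULL becomes empty text, the rest str().
            ET.SubElement(row, name).text = "" if value is None else str(value)
    return ET.tostring(root, encoding="unicode")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (aid INTEGER PRIMARY KEY, abalance INTEGER)")
conn.execute("INSERT INTO accounts VALUES (1, 70)")
conn.commit()

xml_result = query_to_xml(conn, "SELECT aid, abalance FROM accounts")
print(xml_result)
```

A production mapping would carry SQL type information into the XML (for example through a schema), which is the datatype-mapping work the slide calls for.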
XML and Supporting Facilities
Provide XML capabilities in supporting database facilities
• Support XML in database utilities - loader, export/import ..
• Allow server-to-server replication of XML data
• Fine grained access to XML documents
54
XML Caching
Need to temporarily cache XML, index it, update the cached copy, transact it
Need to query XML caches
Also requires a store for managing it in the middle-tier
Provides XML logical views
55
DBMS Architecture for Internet Applications
Monolithic architecture
• Enhance the DBMS with all the features necessary for supporting internet applications
Component architecture
• Provide components for supporting internet applications
• Components can reside in the DBMS or in the middle-tier
56
Monolithic Approach
+ Database is the platform
+ Leverage DBMS infrastructure
+ Uniform management
- Not flexible
- Forces 2-tier architecture
- May not be suitable for high-end configurations
- Not suitable for heterogeneous application integration
57
Component Approach
+ Flexible
+ Accommodates multi-tier architecture - components can be deployed in the middle or database tier
+ Facilitates heterogeneous integration of applications
- Need to manage multiple components
Looking Ahead
Database technology has a lot to offer for building internet applications!
Componentized Databases?