ADBMS Unit 5 Modified

  • 7/29/2019 ADBMS Unit 5 Modified

    1/78

    Parallel and Distributed Databases and Client-Server Architecture

    Architectures for Parallel Databases.

    Parallel Query Evaluation.

    Parallelizing individual Operations.

    Sorting.

    Joins.

    Distributed database concepts.

    Data Fragmentation.

    Replication.

    Allocation Techniques for distributed database design.

    Query processing in distributed databases.

    Concurrency Control and Recovery in Distributed Databases.

    An Overview of Client Server Architecture


    Parallel Databases

    Why do we need parallel databases?

    Some databases hold very large amounts of data, on the order of 10 terabytes.

    Some data applications need to process data at very high speeds.

    A parallel database system seeks to improve performance through parallelization of various operations, such as loading data, building indexes, and evaluating queries.

    Benefits of Parallel Databases:

    Improves throughput.

    INTERQUERY PARALLELISM

    A number of transactions can be processed in parallel with each other.

    Improves response time.

    INTRAQUERY PARALLELISM

    Sub-tasks of a single transaction can be processed in parallel with each other.

    How to Measure the Benefits?

    Speed-Up.

    As you multiply resources by a certain factor, the time taken to execute a transaction should be reduced by the same factor:

    10 seconds to scan a DB of 10,000 records using 1 CPU.

    1 second to scan a DB of 10,000 records using 10 CPUs.

    [Figure: speed-up. Number of transactions/second plotted against number of CPUs, comparing linear speed-up (ideal) with sub-linear speed-up. Labels: 1000/sec, 5 CPUs; 2000/sec, 10 CPUs; 1600/sec, 16 CPUs.]

    Cont..

    Scale-up.

    As you multiply resources, the size of a task that can be executed in a given time should be increased by the same factor.

    1 second to scan a DB of 1,000 records using 1 CPU.

    1 second to scan a DB of 10,000 records using 10 CPUs.
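
    The speed-up and scale-up measures can be written as simple ratios. The sketch below uses the usual textbook definitions (elapsed time on the small system divided by elapsed time on the large system); the function names are illustrative and do not come from the slides.

```python
def speed_up(time_small_system: float, time_large_system: float) -> float:
    """Same task on both systems: how many times faster the larger system is."""
    return time_small_system / time_large_system


def scale_up(time_small_task: float, time_large_task: float) -> float:
    """Small task on the small system versus a proportionally larger task on a
    proportionally larger system. A value of 1.0 is linear scale-up."""
    return time_small_task / time_large_task


# Speed-up example from the slides: 10 s on 1 CPU, 1 s on 10 CPUs, same 10,000 records.
linear_speed_up = speed_up(10.0, 1.0)

# Scale-up example from the slides: 1 s for 1,000 records on 1 CPU,
# 1 s for 10,000 records on 10 CPUs.
linear_scale_up = scale_up(1.0, 1.0)

print(linear_speed_up, linear_scale_up)
```

    A speed-up equal to the resource factor (10 here) is linear; anything lower is sub-linear. A scale-up below 1.0 likewise indicates sub-linear scale-up.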

    [Figure: scale-up. Number of transactions/second plotted against number of CPUs and database size, comparing linear scale-up (ideal) with sub-linear scale-up. Labels: 1000/sec, 5 CPUs, 1 GB database; 900/sec, 10 CPUs, 2 GB database.]

    Architectures for Parallel Databases

    Three main architectures for building parallel DBMS.

    Shared Memory.

    Shared Disk.

    Shared Nothing.

    Shared memory architecture, where multiple processors share the main memory space, as well as mass storage (e.g. hard disk drives).

    Shared disk architecture, where each node has its own main memory, but all nodes share mass storage, usually a storage area network. In practice, each node usually also has multiple processors.

    Shared nothing architecture, where each node has its own mass storage as well as main memory.

    Parallel Query Evaluation

    Shared nothing architecture is preferable.

    Goal

    Minimize data shipping by partitioning the data and structuring the algorithms to do most of the processing at individual processors.

    If one operator consumes the output of a second operator, we have pipelined parallelism.

    Data Partitioning

    Partitioning a large dataset horizontally across several disks enables us to exploit the I/O bandwidth of the disks by reading and writing them in parallel.

    We can assign tuples using:

    Round robin partitioning.

    Hash partitioning.

    Range partitioning.

    Partitioning

    Round robin partitioning.

    If there are n processors, the ith tuple is assigned to processor i mod n.

    Hash partitioning.

    A hash function is applied to (selected fields of) a tuple to determine its processor.

    Range partitioning.

    Tuples are sorted and n ranges are chosen for the sort key values so that each range contains roughly the same number of tuples; the tuples in range i are assigned to processor i.
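
    The three partitioning schemes can be sketched in a few lines. This is a minimal illustration, assuming tuples are plain Python tuples and "processors" are just list indexes; the function names, the sample data, and the range boundaries are made up for the example.

```python
from bisect import bisect_right


def round_robin_partition(tuples, n):
    """The ith tuple goes to processor i mod n."""
    parts = [[] for _ in range(n)]
    for i, t in enumerate(tuples):
        parts[i % n].append(t)
    return parts


def hash_partition(tuples, n, key):
    """A hash of the chosen field(s) picks the processor.
    Integer keys are used here because Python hashes small ints deterministically."""
    parts = [[] for _ in range(n)]
    for t in tuples:
        parts[hash(key(t)) % n].append(t)
    return parts


def range_partition(tuples, boundaries, key):
    """boundaries is a sorted list of split points; partition i holds keys
    from boundaries[i-1] (inclusive) up to boundaries[i] (exclusive)."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for t in tuples:
        parts[bisect_right(boundaries, key(t))].append(t)
    return parts


people = [("ann", 20), ("bob", 35), ("cy", 20), ("dee", 52), ("eve", 41)]
age = lambda t: t[1]

rr = round_robin_partition(people, 2)
hp = hash_partition(people, 2, age)
rp = range_partition(people, [30, 45], age)
print(rr, hp, rp, sep="\n")
```

    With hash or range partitioning, a selection such as age=20 only needs to visit the one partition that can hold matching tuples, whereas round robin spreads them over every processor.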

    Round robin partitioning is suitable for efficiently evaluating queries that access the entire relation.

    If only a subset of the tuples is required (for example, the condition age=20), hash and range partitioning are better than round robin partitioning.

    If range selections such as 15

    Parallelizing Sequential Operator Evaluation Code

    The basic idea is to use parallel data streams.

    Streams are merged as needed to provide the inputs for a relational operator, and the output of an operator is split as needed to parallelize subsequent processing.

    A parallel evaluation plan consists of a dataflow network of relational, merge, and split operators.

    The merge and split operators should be able to buffer some data and to halt the operators producing their input data.

    They can then regulate the speed of the execution according to the execution speed of the operator that consumes their output.

    Parallelizing individual operations

    We assume that each relation is horizontally partitioned across several disks, although this partitioning may or may not be appropriate for a given query.

    The evaluation of a query must take the initial partitioning criteria into account and repartition if necessary.

    Bulk Loading and Scanning

    Consider two operations: scanning a relation and loading a relation.

    If the relation is partitioned across several disks, pages can be read in parallel while scanning a relation, and the retrieved tuples can then be merged.

    If hashing or range partitioning is used, selection queries can be answered by going to only those processors that contain relevant tuples.

    If a relation has associated indexes, any sorting of data entries required for building the indexes during bulk loading can also be done in parallel.

    Sorting

    A simple idea is to let each CPU sort the part of the relation that is on its local disk and then merge these sorted sets of tuples.

    A better idea is to first redistribute all tuples in the relation using range partitioning.

    Each processor then sorts the tuples assigned to it, using a sequential sorting algorithm.

    The entire sorted relation can be obtained by visiting the processors in an order corresponding to the ranges assigned to them and simply scanning the tuples.
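
    The range-partitioned sort described above can be sketched as follows. This is an illustration only: a thread pool stands in for the separate processors, the split points are chosen by hand, and the function name is an assumption of this example.

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor


def parallel_range_sort(values, boundaries):
    # Step 1: redistribute the values by range (one bucket per "processor").
    parts = [[] for _ in range(len(boundaries) + 1)]
    for v in values:
        parts[bisect_right(boundaries, v)].append(v)

    # Step 2: each worker sorts its own bucket with an ordinary sequential sort.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        sorted_parts = list(pool.map(sorted, parts))

    # Step 3: visit the buckets in range order and scan them; no merge is needed
    # because every value in bucket i is smaller than every value in bucket i+1.
    result = []
    for part in sorted_parts:
        result.extend(part)
    return result


print(parallel_range_sort([42, 7, 19, 88, 3, 56, 23], boundaries=[20, 50]))
```

    The quality of the split points matters: if one range receives most of the tuples, that processor does most of the sorting and the benefit of parallelism is lost.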

    Joins

    The basic idea for joining A and B in parallel is to decompose the join into a collection of k smaller joins.

    We can decompose the join by partitioning both A and B into a collection of k logical buckets or partitions.

    By using the same partitioning function, applied to the join attributes, for both A and B, we ensure that the union of the k smaller joins computes the join of A and B.
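
    A minimal sketch of this decomposition for an equijoin is shown below. It assumes hash partitioning on the join attribute and joins each pair of buckets with a simple in-memory hash join; the helper names and sample rows are invented for the example, and the k bucket joins run one after another here rather than on separate processors.

```python
def partition(rows, k, key):
    """Hash-partition rows into k buckets on the join attribute."""
    buckets = [[] for _ in range(k)]
    for row in rows:
        buckets[hash(key(row)) % k].append(row)
    return buckets


def local_hash_join(a_rows, b_rows, a_key, b_key):
    """Join one bucket of A with the matching bucket of B."""
    index = {}
    for a in a_rows:
        index.setdefault(a_key(a), []).append(a)
    return [(a, b) for b in b_rows for a in index.get(b_key(b), [])]


def partitioned_join(a_rows, b_rows, k, a_key, b_key):
    a_parts = partition(a_rows, k, a_key)
    b_parts = partition(b_rows, k, b_key)
    result = []
    # Matching tuples always land in buckets with the same index,
    # so only k bucket pairs need to be joined.
    for a_part, b_part in zip(a_parts, b_parts):
        result.extend(local_hash_join(a_part, b_part, a_key, b_key))
    return result


employees = [("ann", 5), ("bob", 4), ("cy", 5)]
departments = [(5, "Research"), (4, "Admin"), (1, "HQ")]

joined = partitioned_join(employees, departments, 3,
                          a_key=lambda e: e[1], b_key=lambda d: d[0])
print(sorted(joined))
```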

    Distributed Database Concepts

    Distributed Computing Systems

    A distributed computing system (DCS) consists of a number of processing elements, not necessarily homogeneous, that are interconnected by a computer network and that cooperate in performing certain assigned tasks.

    A DCS partitions a big, unmanageable problem into smaller pieces and solves it efficiently in a coordinated manner.

    Distributed Database

    A distributed database is a collection of multiple logically interrelated databases distributed over a computer network.

    A distributed database management system (DDBMS) is a software system that manages a distributed database while making the distribution transparent to the user.

    Parallel versus Distributed Technology

    There are two main types of multiprocessor system architectures that are common.

    Shared memory (tightly coupled) architecture:

    Multiple processors share secondary storage and also share primary memory.

    Shared disk (loosely coupled) architecture:

    Multiple processors share secondary storage, but each has its own primary memory.

    Database management systems developed using the above types of architecture are termed parallel database management systems.

    Shared nothing architecture

    Every processor has its own primary and secondary (disk) memory.

    No common memory exists.

    The processors communicate over a high-speed interconnection network (bus or switch).

    Difference between Shared Nothing and Distributed Database Computing

    The major difference is in the mode of execution.

    In shared nothing multiprocessor systems, there is symmetry and homogeneity of nodes.

    In a distributed database environment, heterogeneity of hardware and operating system at each node is very common.

    [Figure: a networked architecture with a centralized database at one of the sites]

    [Figure: a truly distributed database]

    Advantages of Distributed Databases

    Management of distributed data with different levels of transparency.

    The DBMS should be distribution transparent in the sense of hiding the details of where each file is physically stored within the system.

    Transparency

    Types of transparency:

    Distribution or network transparency.

    Location transparency and naming transparency.

    Replication transparency.

    Fragmentation transparency.

    Horizontal and vertical fragmentation.

    Increased reliability and availability.

    Improved performance.

    Data localization.

    Easier expansion.

    Additional Functions of Distributed Databases

    Keeping track of data.

    DDBMS catalog.

    Distributed query processing.

    Communication network.

    Distributed transaction management.

    Replicated data management.

    Distributed database recovery.

    Security.

    Authorization/access privileges.

    Distributed directory (catalog) management.

    Data Fragmentation

    Horizontal Fragmentation:

    A horizontal fragment of a relation is a subset of the tuples in that relation.

    The tuples that belong to a horizontal fragment are specified by a condition on one or more attributes of the relation.

    These fragments can be assigned to different sites in the distributed system.

    To reconstruct the relation R from a complete horizontal fragmentation, we need to apply the UNION operation to the fragments.

    E.g.: we may define 3 horizontal fragments on the EMPLOYEE relation with conditions like (DNO=5), (DNO=4), and (DNO=1).

    Vertical Fragmentation:

    Divides a relation vertically by columns.

    A vertical fragment of a relation keeps only certain attributes of the relation.

    A vertical fragment on a relation R can be specified by a $\pi_{L_i}(R)$ operation in the relational algebra.

    $L_1 \cup L_2 \cup \dots \cup L_n = \mathrm{ATTRS}(R)$

    Mixed (Hybrid) Fragmentation.

    Intermixes the two types of fragmentation.

    A fragment of a relation R can be specified by a SELECT-PROJECT combination of operations.

    A fragmentation schema of a database is a definition of a set of fragments that includes all attributes and tuples in the database and satisfies the condition that the whole database can be reconstructed from the fragments.
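
    The fragmentation ideas above can be illustrated on a tiny in-memory relation. The sketch below is hypothetical: rows are Python dicts, the sample employees are invented, and repeating the key ENO in every vertical fragment is an assumption of this example (the usual way to make the relation reconstructible by a join).

```python
employee = [
    {"ENO": 1, "FNAME": "Ann", "SALARY": 50000, "DNO": 5},
    {"ENO": 2, "FNAME": "Bob", "SALARY": 42000, "DNO": 4},
    {"ENO": 3, "FNAME": "Cy", "SALARY": 61000, "DNO": 1},
    {"ENO": 4, "FNAME": "Dee", "SALARY": 38000, "DNO": 5},
]


def horizontal_fragment(relation, predicate):
    """SELECT: keep the tuples that satisfy the fragment's condition."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, attributes):
    """PROJECT: keep only the listed attributes of every tuple."""
    return [{a: row[a] for a in attributes} for row in relation]


# Horizontal fragments on DNO=5, DNO=4, DNO=1; their UNION rebuilds the relation.
h_fragments = [horizontal_fragment(employee, lambda r, d=d: r["DNO"] == d)
               for d in (5, 4, 1)]
rebuilt_h = sorted((row for frag in h_fragments for row in frag),
                   key=lambda r: r["ENO"])

# Vertical fragments; joining them on the shared key rebuilds the relation.
v1 = vertical_fragment(employee, ["ENO", "FNAME"])
v2 = vertical_fragment(employee, ["ENO", "SALARY", "DNO"])
by_key = {row["ENO"]: row for row in v2}
rebuilt_v = [{**row, **by_key[row["ENO"]]} for row in v1]

# A mixed fragment is a SELECT followed by a PROJECT.
mixed = vertical_fragment(
    horizontal_fragment(employee, lambda r: r["DNO"] == 5), ["ENO", "FNAME"])
print(rebuilt_h == employee, rebuilt_v == employee, mixed)
```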

    An allocation schema describes the allocation of fragments to sites of the DDBS; hence, it is a mapping that specifies for each fragment the site(s) at which it is stored.

    If a fragment is stored at more than one site, it is said to be replicated.

    Data Replication and Allocation

    Replication is useful in improving the availability of data.

    The most extreme case is replication of the whole database at every site in the distributed system, thus creating a fully replicated distributed database.

    This improves availability and performance.

    A retrieval query can be processed at the local site.

    The disadvantage of full replication is that it slows down update operations drastically.

    It also makes concurrency control and recovery techniques more expensive.

    The inverse of full replication is no replication.

    Each fragment is stored at exactly one site.

    This is called nonredundant allocation.

    Between these two extremes is partial replication.

    Some fragments may be replicated whereas others may not.

    Data Allocation

    Each fragment, or each copy of a fragment, must be assigned to a particular site in the distributed system.

    This process is called data distribution or data allocation.

    The choice of the sites and degree of replication depend on the performance and availability goals of the system and on the types and frequencies of transactions submitted at each site.

    Query Processing in Distributed Databases

    Data Transfer Costs of Distributed Query Processing.

    In distributed systems, several other factors complicate query processing.

    The first is the cost of transferring data over the network.

    This data includes intermediate files that are transferred to other sites for further processing, as well as the final result files that may have to be transferred to the site where the query result is needed.

    These costs are not very high if the sites are connected by a high-performance LAN.

    Suppose the EMPLOYEE and DEPARTMENT relations are distributed as shown below.

    SITE 1:

    EMPLOYEE

    FNAME MINIT INITIAL ENO DOB ADDRESS SEX SALARY SUPERENO DNO

    10,000 RECORDS

    EACH RECORD IS 100 BYTES LONG

    ENO FIELD IS 9 BYTES LONG

    FNAME FIELD IS 15 BYTES LONG

    DNO FIELD IS 4 BYTES LONG

    LNAME FIELD IS 15 BYTES LONG

    SITE 2:

    DEPARTMENT

    DNAME DNUMBER MGRENO MGRSTARTDATE

    100 RECORDS

    EACH RECORD IS 35 BYTES LONG

    DNUMBER FIELD IS 4 BYTES LONG

    DNAME FIELD IS 10 BYTES LONG

    MGRENO FIELD IS 9 BYTES LONG

    According to the example, the size of the EMPLOYEE relation is 100 * 10,000 = 10^6 bytes.

    The size of the DEPARTMENT relation is 35 * 100 = 3,500 bytes.

    Consider the query

    Q: For each employee, retrieve the employee name and the name of the department for which the employee works.

    This is stated as

    Q: $\pi_{FNAME, LNAME, DNAME}(EMPLOYEE \bowtie_{DNO=DNUMBER} DEPARTMENT)$

    The result of the query will include 10,000 records, assuming every employee is related to a department. Suppose that each record in the query result is 40 bytes long.

    The query is submitted at a distinct site 3, which is called the result site because the query result is needed there.

    Neither the EMPLOYEE nor the DEPARTMENT relation resides at site 3.

    There are 3 strategies for executing this distributed query.

    Case 1:

    Transfer both the EMPLOYEE and the DEPARTMENT relations to the result site, and perform the join at site 3.

    In this case, a total of 1,000,000 + 3,500 = 1,003,500 bytes must be transferred.

    Case 2:

    Transfer the EMPLOYEE relation to site 2, execute the join at site 2, and send the result to site 3.

    The size of the query result is 40 * 10,000 = 400,000 bytes, so 400,000 + 1,000,000 = 1,400,000 bytes must be transferred.

    Case 3:

    Transfer the DEPARTMENT relation to site 1, execute the join at site 1, and send the result to site 3.

    In this case, 400,000 + 3,500 = 403,500 bytes must be transferred.

    IF MINIMIZING THE AMOUNT OF DATA TRANSFER IS OUR OPTIMIZATION CRITERION, WE SHOULD CHOOSE STRATEGY 3.
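
    The comparison of the three strategies is plain byte arithmetic, so it can be checked with a short script. The sketch below uses the sizes given in the example (10,000 EMPLOYEE records of 100 bytes, 100 DEPARTMENT records of 35 bytes, 40-byte result records); the function and dictionary key names are illustrative only.

```python
EMPLOYEE_BYTES = 10_000 * 100    # relation stored at site 1
DEPARTMENT_BYTES = 100 * 35      # relation stored at site 2


def strategy_costs(result_records, result_record_bytes=40):
    """Bytes shipped over the network when the result is needed at site 3."""
    result_bytes = result_records * result_record_bytes
    return {
        "case 1": EMPLOYEE_BYTES + DEPARTMENT_BYTES,  # ship both relations to site 3
        "case 2": EMPLOYEE_BYTES + result_bytes,      # ship EMPLOYEE to site 2, result to site 3
        "case 3": DEPARTMENT_BYTES + result_bytes,    # ship DEPARTMENT to site 1, result to site 3
    }


costs = strategy_costs(result_records=10_000)
cheapest = min(costs, key=costs.get)
print(costs, cheapest)
```

    Calling `strategy_costs` with a different result size shows how strongly the choice depends on the number of result records.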

    Q': For each department, retrieve the department name and the name of the department manager.

    Q': $\pi_{FNAME, LNAME, DNAME}(DEPARTMENT \bowtie_{MGRENO=ENO} EMPLOYEE)$

    Case 1:

    Transfer both the EMPLOYEE and the DEPARTMENT relations to the result site, and perform the join at site 3.

    In this case, a total of 1,000,000 + 3,500 = 1,003,500 bytes must be transferred.

    Case 2:

    Transfer the EMPLOYEE relation to site 2, execute the join at site 2, and send the result to site 3.

    The size of the query result is 40 * 100 = 4,000 bytes, so 4,000 + 1,000,000 = 1,004,000 bytes must be transferred.

    Case 3:

    Transfer the DEPARTMENT relation to site 1, execute the join at site 1, and send the result to site 3.

    In this case, 4,000 + 3,500 = 7,500 bytes must be transferred, because the result of this query has only 100 records of 40 bytes each.

    WE SHOULD AGAIN CHOOSE STRATEGY 3 IF MINIMIZING THE AMOUNT OF DATA TRANSFER IS OUR OPTIMIZATION CRITERION.

    However, if site 2 is the result site instead of site 3, then we have 2 strategies:

    (Case 1)

    Transfer the EMPLOYEE relation to site 2, execute the query, and present the result to the user at site 2.

    Here the same number of bytes, 1,000,000, must be transferred for both Q and Q' (the department-manager query).

    (Case 2)

    Transfer the DEPARTMENT relation to site 1, execute the query at site 1, and send the result back to site 2.

    In this case, 400,000 + 3,500 = 403,500 bytes must be transferred for Q and 4,000 + 3,500 = 7,500 bytes for Q'.

    Distributed Query Processing Using Semijoin

    The idea behind distributed query processing using the semijoin operation is to reduce the number of tuples in a relation before transferring it to another site.

    Send the joining column of one relation R to the site where the other relation S is located.

    This column is then joined with S. Following that, the join attributes, along with the attributes required in the result, are projected out and shipped back to the original site and joined with R.

    Hence, only the joining column of R is transferred in one direction, and a subset of S with no extraneous tuples is transferred in the other direction.

    Example:

    1. Project the join attributes of DEPARTMENT at site 2, and transfer them to site 1. For Q, we transfer

    $F = \pi_{DNUMBER}(DEPARTMENT)$

    whose size is 4 * 100 = 400 bytes, whereas for Q' (the department-manager query) we transfer

    $F' = \pi_{MGRENO}(DEPARTMENT)$

    whose size is 9 * 100 = 900 bytes.

    2. Join the transferred file with the EMPLOYEE relation at site 1, and transfer the required attributes from the resulting file to site 2.

    For Q, we transfer

    $R = \pi_{DNO, FNAME, LNAME}(F \bowtie_{DNUMBER=DNO} EMPLOYEE)$

    whose size is 34 * 10,000 = 340,000 bytes, whereas for Q' we transfer

    $R' = \pi_{MGRENO, FNAME, LNAME}(F' \bowtie_{MGRENO=ENO} EMPLOYEE)$

    whose size is 39 * 100 = 3,900 bytes.

    3. Execute the query by joining the transferred file R or R' with DEPARTMENT, and present the result to the user at site 2.

    Using this strategy, we transfer 340,400 bytes for Q and 4,800 bytes for Q'.
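
    The semijoin totals can be verified the same way. The sketch below adds the two transfers of the strategy for each of the two queries in the example (written Q and Q' here); the function and parameter names are illustrative, and the byte sizes are the ones stated in the slides.

```python
def semijoin_transfer(join_attr_bytes, department_records,
                      returned_record_bytes, returned_records):
    # Step 1: the projected join column of DEPARTMENT goes from site 2 to site 1.
    step1 = join_attr_bytes * department_records
    # Step 2: the reduced, projected EMPLOYEE tuples go from site 1 back to site 2.
    step2 = returned_record_bytes * returned_records
    return step1 + step2


# Q: DNUMBER is 4 bytes; 10,000 employees come back with DNO, FNAME, LNAME (34 bytes).
q_cost = semijoin_transfer(4, 100, 34, 10_000)

# Q': MGRENO is 9 bytes; only 100 managers come back with MGRENO, FNAME, LNAME (39 bytes).
q_prime_cost = semijoin_transfer(9, 100, 39, 100)

print(q_cost, q_prime_cost)
```

    For Q the semijoin saves little, because every employee joins with some department; for Q' it removes almost all EMPLOYEE tuples before anything is shipped.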

    Query and Update Decomposition

    For query decomposition, the DDBMS can determine which fragments may contain the required tuples by comparing the query condition with the guard conditions.

    Consider the query

    Retrieve the names and hours per week for each employee who works on some project controlled by department 5.

    The SQL query for the schema would be

    SELECT FNAME, LNAME, HOURS

    FROM EMPLOYEE, PROJECT, WORKS_ON

    WHERE DNUM=5 AND PNUMBER=PNO AND ESSN=SSN;

    Concurrency Control and Recovery in Distributed Databases

    Problems that arise when handling concurrency control and recovery in distributed databases, and not in centralized systems, are:

    Dealing with multiple copies of the data items.

    Failure of individual sites.

    Failure of communication links.

    Distributed commit.

    Distributed deadlock.

    Dealing with multiple copies of the data items:

    The concurrency control method is responsible for maintaining consistency among these copies.

    The recovery method is responsible for making a copy consistent with other copies if the site on which the copy is stored fails and recovers later.

    Failure of individual sites.

    The DDBMS should continue to operate with its running sites, if possible, when one or more individual sites fail.

    When a site recovers, its local database must be brought up to date with the rest of the sites before it rejoins the system.

    Failure of communication links.

    The system must be able to deal with failure of one or more of the communication links that connect the sites. An extreme case of this problem is that network partitioning may occur.

    This breaks up the sites into 2 or more partitions, where the sites within each partition can communicate only with one another and not with sites in other partitions.

    Distributed commit:

    Problems can arise with committing a transaction that is accessing databases stored on multiple sites if some sites fail during the commit process.

    The two-phase commit protocol is often used to deal with this problem.
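
    The slides only name the two-phase commit protocol, so the following is a simplified sketch of its general shape rather than a description from the source. The `Participant` class and its methods are hypothetical, and a real implementation would also need logging to stable storage, time-outs, and recovery handling, all of which are omitted here.

```python
class Participant:
    """Hypothetical site taking part in a distributed transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: vote yes only if the local work can be committed.
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    # Phase 1 (voting): the coordinator asks every site whether it can commit.
    votes = [p.prepare() for p in participants]
    decision = all(votes)

    # Phase 2 (decision): the same outcome is sent to every site,
    # so either all sites commit or all sites abort.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return decision


healthy = [Participant("site1"), Participant("site2"), Participant("site3")]
one_failing = [Participant("site1"), Participant("site2", can_commit=False)]
print(two_phase_commit(healthy), two_phase_commit(one_failing))
```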

    Distributed Deadlock.

    Deadlock may occur among several sites, so techniques for dealing with deadlocks must be extended to take this into account.

    Distributed Concurrency Control Based on a Distinguished Copy of a Data Item.

    Several concurrency control methods have been proposed that extend the concurrency control techniques for centralized databases.

    PRIMARY SITE TECHNIQUE

    A single primary site is designated to be the coordinator site for all database items.

    All locks are kept at that site, and all requests for locking or unlocking are sent there.

    This is an extension of the centralized approach.

    Disadvantages:

    All locking requests are sent to a single site, possibly overloading it and causing a system bottleneck.

    Failure of the primary site paralyzes the system, since all locking information is stored there.

    Primary Site with Backup Site.

    A second site is designated as the backup of the primary site.

    All locking information is maintained at both the primary and backup sites.

    In case of primary site failure, the backup site takes over as primary site and a new backup site is chosen.

    This simplifies recovery from failure of the primary site.

    It slows down the process of acquiring locks.

    Primary Copy Technique

    Distributes the load of lock coordination among various sites by having the distinguished copies of different data items stored at different sites.

    Failure of one site affects any transactions that are accessing locks on items whose primary copies reside at that site, but other transactions are not affected.

    Choosing a New Coordinator in Case of Failure.

    When a coordinator site fails in any of the preceding techniques, the sites that are still running must choose a new coordinator.

    In the case of the primary site approach with no backup site, all executing transactions must be aborted and restarted in a tedious recovery process.

    For methods that use backup sites, transaction processing is suspended while the backup site is designated as the new primary site and a new backup site is chosen and is sent copies of all the locking information from the new primary site.

    An election can be used to choose the new coordinator site if both the primary and backup sites are down.

    Any site Y that attempts to communicate with the coordinator site repeatedly and fails to do so can assume that the coordinator is down.

    Y can start the election process by sending a message to all running sites proposing that Y become the new coordinator.

    As soon as Y receives a majority of yes votes, Y can declare that it is the new coordinator.
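
    The election step can be sketched as a vote count. The code below is a simplified illustration under stated assumptions: message passing is replaced by a direct function call, the candidate counts its own vote, and the majority is taken over all running sites. The names `elect` and `accepts` are hypothetical.

```python
def elect(candidate, running_sites, accepts):
    """Return True as soon as the candidate holds a majority of yes votes.

    accepts(site, candidate) stands in for sending the proposal to a site
    and receiving its yes/no reply.
    """
    needed = len(running_sites) // 2 + 1
    yes_votes = 1  # assumption: the candidate votes for itself
    for site in running_sites:
        if yes_votes >= needed:
            break  # majority reached; no need to wait for further replies
        if site != candidate and accepts(site, candidate):
            yes_votes += 1
    return yes_votes >= needed


sites = ["A", "B", "C", "D", "Y"]
won = elect("Y", sites, lambda site, cand: site in {"A", "B"})
lost = elect("Y", sites, lambda site, cand: site == "A")
print(won, lost)
```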

    Distributed Concurrency Control Based on Voting

    In the voting method, there is no distinguished copy; rather, a lock request is sent to all sites that include a copy of the data item.

    Each copy maintains its own lock and can grant or deny the request for it.

    If a transaction that requests a lock is granted that lock by a majority of the copies, it holds the lock and informs all copies that it has been granted the lock.

    If a transaction does not receive a majority of votes granting it a lock within a certain time-out period, it cancels its request and informs all sites of the cancellation.
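
    The voting method can be sketched with one small lock object per copy. This is a simplified, single-process illustration: the `ReplicaCopy` class and `acquire_by_voting` function are hypothetical names, direct method calls stand in for messages, and the time-out is not modeled (a missing reply would simply count as a denial).

```python
class ReplicaCopy:
    """One copy of a data item; each copy maintains its own lock."""

    def __init__(self, site):
        self.site = site
        self.locked_by = None

    def vote(self, txn):
        # Grant the request if the copy is free or already held by this transaction.
        if self.locked_by in (None, txn):
            self.locked_by = txn
            return True
        return False

    def cancel(self, txn):
        if self.locked_by == txn:
            self.locked_by = None


def acquire_by_voting(txn, copies):
    granted = [copy for copy in copies if copy.vote(txn)]
    if len(granted) > len(copies) // 2:
        return True  # a majority of the copies granted the lock
    # No majority: cancel the request at every copy that had granted it.
    for copy in granted:
        copy.cancel(txn)
    return False


copies = [ReplicaCopy(site) for site in ("site1", "site2", "site3")]
first = acquire_by_voting("T1", copies)
second = acquire_by_voting("T2", copies)
print(first, second)
```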


    3 Tier Client-Server Architecture

    Three layers exist

    Presentation Layer (Client)

    Application Layer (Business logic)

    Database Server

    Presentation Layer

    Provides the user interface and interacts with the user.

    Provides Web interfaces and forms to the clients in order to interface with the application.

    Web browsers are often utilized, and the languages used include HTML, Java, JavaScript, Perl, Visual Basic, and so on.

    Handles user input, output, and navigation by accepting user commands and displaying the needed information.

    Application Layer

    Programs the application logic.

    Performs security checks, identity verification, and other functions.

    Can interact with one or more databases and data sources as needed by connecting to the database using ODBC, JDBC, SQL/CLI, or other database access techniques.

    Database Server

    Handles query and update requests from the application layer, processes the requests, and sends the results.

    Usually SQL is used.

    Query results may be formatted into XML when transmitted between the application server and the database server.


    End of Unit 5