
Testing Database Applications

MTech Seminar Report

by

Rishi Raj Gupta

Roll No: 05305014

under the guidance of

Prof. S. Sudarshan

Computer Science and Engineering, IIT Bombay

Department of Computer Science and Engineering

Indian Institute of Technology, Bombay
Mumbai


Acknowledgments

I would like to thank my guide, Prof. S. Sudarshan, for the consistent motivation and direction he has given my work.

Rishi Raj Gupta
MTech-I

CSE, IIT Bombay


Abstract

Database applications play an important role in nearly every organization, yet little work has been done on testing them. They are becoming increasingly complex, are subject to constant change, and are often designed to be executed concurrently by many clients. Testing database applications is therefore of utmost importance, since a single fault in a database application can result in unrecoverable data loss and may affect the working of the whole organization. Many tools and frameworks for testing database applications, such as AGENDA, have been proposed to populate a test database and generate test cases that check the correctness of the application; they check database applications for consistency constraints and transaction concurrency. In this report we first present various such procedures and strategies proposed in the literature for testing database applications. Strategies for testing embedded SQL queries within an imperative language are presented next. Finally, we present strategies for performing efficient regression tests of database applications and for parallel execution of test runs.


Contents

1 Introduction

2 Database Testing Tool Set-AGENDA
  2.1 Overview of Architecture
  2.2 Test Database Generation
  2.3 Testing Database Transactions and Concurrency

3 Database Generation
  3.1 Automatic Generation of DB Instances
  3.2 Data Generation Language (DGL)
    3.2.1 Overview of Data Types and Iterators
    3.2.2 DGL Programs and Expressions
    3.2.3 Annotated Schemas Using DGL

4 Testing SQL Queries
  4.1 Static Checking of Generated Queries
  4.2 Testing of SQL Evaluation Engine by RAGS

5 Efficient Regression Test of Database
  5.1 Database Applications Regression Tests
    5.1.1 Regression Test Framework and Overview
    5.1.2 Centralized Scheduling Strategies
    5.1.3 Discussion and Results
  5.2 Parallel Execution of Test Runs

6 Conclusions and Future Work


Chapter 1

Introduction

Database application programs play a central role in the operation of almost every modern organization, so it is essential that they are thoroughly tested. Testing of database applications, however, has not received much attention. We have many testing tools, such as JUnit, for testing programs, but thus far only a limited amount of work has been done on testing database applications. Current regression testing tools are not optimized for database testing because they were designed for stateless applications. Regression testing seeks to uncover regression bugs: its purpose is to detect changes in the behavior of an application after the application or its configuration has been changed. Testing of database applications requires various tasks:

1. Database schema parsing for extracting information.

2. Test database generation.

3. Test cases generation.

4. Validation of state and output of test cases.

Based on the report generated by the test tool, the tester can check the consistency and correctness of the database application. Various such tools for database testing have been proposed. The test database can also be generated in different ways; depending on the technique and tool used, it provides different amounts of coverage. Since real data is hard to gather and may not cover all test cases, different strategies for generating synthetic test databases have been proposed.

David Chays proposed the AGENDA tool set [2] for testing relational database applications. First, we cover the prototype of the tool and its extensions to test transactions [11] and concurrency [3]. The tool set uses a parser to extract schema information from the database, populates the test database with either live or synthetic data, generates and executes test cases, observes the database state before and after test execution, and validates the state and output of the system to check the database application program. AGENDA also generates interesting and legal transaction schedules, and it has been extended to test transactions with multiple queries. In reality, transactions do not run serially; they run concurrently. For the tool to be practical, it must be capable of identifying concurrent transactions and must check for concurrency faults. To reveal concurrency faults, a dataflow analysis technique for identifying schedules of transaction execution was proposed in [3] and is covered in this report.

Second, we look at techniques to test SQL queries. Database application programs typically contain statements written in an imperative programming language with embedded data manipulation commands, such as SQL, and good test coverage is needed for the database systems that execute them. Typical SQL test libraries contain a large number of statements that take time to compose manually, and they cover an important, but tiny, fraction of the SQL input domain. A system called RAGS (Random Generation of SQL) [9] was built to explore automated testing. It generates SQL statements stochastically, which provides speed as well as wider coverage of the input domain.

Third, we consider a fault-based approach to generating database instances to support white-box testing of embedded SQL programs. SQL embedded in an imperative language has to be handled in order to generate good test cases and database instances. The approach generates database instances that respect the semantics of the SQL statements by producing a set of constraints; these constraints collectively represent a property against which the program is tested, and database instances can be derived by solving them with existing constraint solvers. Data-intensive applications also often construct database query strings dynamically and execute them; the underlying SQL query has to be tested to avoid runtime errors in the application program. A technique for static checking of such dynamically generated queries, proposed by Carl Gould [1], is also covered.

Finally, we present heuristics for centralized scheduling strategies to perform efficient regression tests [4] and for parallel execution of test runs [5] on database application systems such as Shared Nothing (SN) and Shared Database (SDB). Database applications are becoming increasingly complex, and most databases are subject to constant change, so the application and its configuration must be changed frequently. Unfortunately, such changes are costly, mostly because of the tests that must be carried out to ensure the integrity of the application, and testing a database application requires a great deal of manual work. Common methods of regression testing include re-running previous tests and checking whether previously fixed faults have re-emerged. Regression test tools can store the requests and responses, and heuristics can be applied to schedule the test runs efficiently [4]. By controlling the state of the database during testing and by ordering the test runs well, the time for regression testing can be reduced.

This report discusses issues that arise in testing database applications. Chapter 2 presents an overview of the AGENDA tool set and its handling of data generation and concurrency. Chapter 3 discusses two different techniques for database generation. Chapter 4 focuses on the role SQL queries play in testing database applications. Chapter 5 presents frameworks for efficient regression testing and parallel execution of test runs using heuristics. Chapter 6 concludes with a summary of the report and related future work.


Chapter 2

Database Testing Tool Set-AGENDA

The AGENDA system [2] is a comprehensive tool set that semi-automatically generates a test database and test cases and assists the tester in checking the results. AGENDA is designed to satisfy various integrity constraints while generating the test database, and it has been extended to handle transaction and concurrency issues; it addresses the question of whether the application program behaves as specified. In this chapter we will see the different components of the AGENDA tool and how it handles these issues.

2.1 Overview of Architecture

AGENDA, as shown in Figure 2.1, takes as input the database schema of the database on which the application runs, the application source code, and a sample-values file. A sample-values file contains suggested values for the attributes, partitioned into groups called data groups. The user interactively selects test heuristics and provides information about the expected behavior of test cases. Using this information, AGENDA populates the database, generates inputs to the application, executes the application on those inputs, and checks some aspects of the correctness of the resulting database state and the application output. Live data generally does not reflect a sufficiently wide variety of possible situations, so synthetic data must be generated for database testing. AGENDA consists of five interacting components that operate with guidance from the tester.

1. Agenda Parser: The core of this component is an SQL parser. Given a schema, the PostgreSQL parser creates an abstract syntax tree that contains relevant information about the tables, attributes, and constraints (note that different SQL DDL syntactic constructs can express the same underlying information about a table). The Agenda Parser extracts information from the database schema, the application queries, and the tester-supplied sample-values files, and makes this information available to the other four components as shown in Figure 2.1. It does this by creating an internal database, which we refer to as the AgendaDB, to store the extracted information. The AgendaDB is used and/or modified by the remaining four components.


Figure 2.1: Architecture of AGENDA Tool Set.

2. State Generator: Using the database schema along with information from the tester's sample-values files indicating useful values for attributes, this component populates the database tables with data satisfying the constraints. It retrieves the information from the AgendaDB and generates an initial database state for the application, which we refer to as the ApplicationDB.

3. Input Generator: This component generates input data to be supplied to the application. The data is created using information generated by the Agenda Parser and State Generator, along with information derived from parsing the SQL statements in the application program and information that is useful for checking the test results. Using the AgendaDB, together with the tester's choice of heuristics, the Input Generator instantiates the input parameters of the application with actual values, thus generating test inputs.

4. State Validator: This component investigates the changes in the state of the application database during execution of a test. It logs the information in the application tables and checks the state change.

5. Output Validator: This component captures the application's outputs and checks them against query preconditions and postconditions that have been generated by the tool or supplied by the tester. A precondition could be a join condition supplied by the user, and a postcondition could be the condition satisfying a constraint.
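The postcondition check performed by the Output Validator can be sketched as follows. This is a minimal illustration, not AGENDA's actual interface: the function name, the row format, and the salary-range predicate are all assumptions.

```python
# Sketch: check each application output row against a postcondition
# (here: every returned salary lies in the queried range).

def validate_output(rows, postcondition):
    """Return the rows that violate the postcondition."""
    return [row for row in rows if not postcondition(row)]

rows = [{"ename": "alice", "salary": 7500.0},
        {"ename": "bob", "salary": 9500.0}]       # bob's row violates
violations = validate_output(rows, lambda r: 7000.0 < r["salary"] <= 9000.0)
print(violations)   # → [{'ename': 'bob', 'salary': 9500.0}]
```

A real validator would derive the predicate from the parsed query and the schema constraints rather than take it as a lambda.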


Some of the information stored in the Agenda database is also stored in the DBMS's internal catalog tables. However, building and then querying a separate Agenda database decouples the remaining components from those details, which allows AGENDA to be ported to a different DBMS by changing only the Agenda Parser.

2.2 Test Database Generation

A relational database schema is a set of relation schemas along with a set of integrity constraints. There are several types of constraints: domain constraints, uniqueness constraints, referential integrity constraints, semantic integrity constraints, not-NULL constraints, etc. AGENDA takes care of these while generating the test database and test cases. The tool has also been extended to test transactions with multiple queries, and concurrency-related faults can be handled as presented in [3].

AGENDA handles the constraints present in the database tables. It extracts information about uniqueness constraints, referential integrity constraints, and not-NULL constraints. It also extracts limited information from semantic constraints, namely boundary values from sufficiently simple boolean expressions.

Chays [2] initially considers transactions that consist of a single parameterized query, but later extends the approach to transactions consisting of multiple queries. A host variable in an SQL query represents either an input parameter or an output parameter. Generating a test case involves instantiating each input parameter with an appropriate value. Validating a test case involves examining the output of a SELECT query and/or examining the resulting database state to check that it changed, or did not change, appropriately. In general, an SQL query can retrieve many tuples; typically, the host-language program processes the retrieved tuples one at a time via a cursor, which can be thought of as a pointer to a single tuple (row) of the result of a query. The goal is the selection of data that is consistent (does not violate integrity constraints) and comprehensive (covers different situations to increase the likelihood of revealing faults).
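Cursor-based processing of query results can be illustrated with Python's sqlite3 module (standing in for the embedded-SQL host program; the table and data are invented for the example):

```python
# The host-language program iterates over the query result one tuple
# at a time via a cursor.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, salary REAL)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("alice", 7500.0), ("bob", 6500.0), ("carol", 8900.0)])

names = []
cur = conn.execute("SELECT ename FROM emp WHERE salary > ?", (7000.0,))
for (ename,) in cur:          # the cursor yields one tuple (row) at a time
    names.append(ename)
print(sorted(names))          # → ['alice', 'carol']
```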

Consider the following example:

SELECT ename, bonus INTO :out_name, :out_bonus
FROM emp
WHERE (emp.deptno = :in_deptno)
  AND (salary > 7000.00 AND salary <= 9000.00);

Here the host variables in the INTO clause are output parameters, whereas those in the WHERE clause are input parameters. When the above SELECT query is parsed, the Agenda Parser stores in the Agenda database the host variables and the corresponding attributes of the relation associated with them. Depending on the direct or indirect association between them, the Input Generator initializes the values of the corresponding host variables. Boundary values can also be easily identified: the Agenda Parser extracts the boundary values 7000.00 and 9000.00 along with the attribute and stores these too in the Agenda database. If the tester also selects boundary values as a heuristic to guide data generation, then these stored values are used.
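The boundary-value heuristic can be sketched as follows. The function name and the epsilon of 0.01 are illustrative assumptions, not AGENDA's code; the idea is simply to test just below, at, and just above each extracted bound.

```python
# Given the bounds 7000.00 and 9000.00 extracted from the WHERE clause,
# generate test values around each boundary.

def boundary_values(bounds, eps=0.01):
    vals = []
    for b in bounds:
        vals.extend([b - eps, b, b + eps])
    return vals

vals = boundary_values([7000.00, 9000.00])
print(vals)
```

Instantiating the salary attribute with these six values exercises both sides of each comparison in the predicate.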

Constraints are handled as follows:
a) Uniqueness constraints are handled by looking at the frequency field for an attribute in the AgendaDB table, to avoid selecting the same value more than once.


b) Referential integrity constraints: when selecting a value for attribute A in table T, where this attribute references attribute A′ in table T′, the State Generator refers to the records already generated for attribute A′ in table T′ and selects a value that has already been used. The implementation uses a topological sort to impose an ordering on the application table names stored in the AgendaDB.
c) Not-NULL constraints: each attribute that does not have a not-NULL constraint is considered a candidate for NULL by the State Generator, so a NULL group is implicitly added for this attribute in the AgendaDB. The State Generator thereby knows that it can choose NULL when generating a value for such an attribute, and the Input Generator knows that it can instantiate an input parameter with a NULL value for this attribute.
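A toy sketch of generating a value under these three kinds of constraints might look like the following. This is an illustration of the idea, not AGENDA's implementation; all names and the candidate pools are invented.

```python
# Uniqueness: never reuse a value. Referential integrity: pick a value
# already used in the referenced table. Nullability: NULL (None) is a
# candidate only when no not-NULL constraint is present.
import random

random.seed(0)

def gen_value(candidates, used, unique=False, parent_values=None,
              not_null=True):
    if parent_values is not None:       # referential integrity: reuse a
        pool = list(parent_values)      # value from the referenced table
    else:
        pool = list(candidates)
    if not not_null:
        pool.append(None)               # the implicit NULL group
    if unique:
        pool = [v for v in pool if v not in used]
    value = random.choice(pool)
    used.add(value)
    return value

used = set()
ids = [gen_value(range(1, 4), used, unique=True) for _ in range(3)]
print(sorted(ids))   # three distinct ids drawn from 1..3 → [1, 2, 3]
```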

2.3 Testing Database Transactions and Concurrency

Deng and Chays [11] focus on testing database transactions to check whether, when run in isolation, they are consistent with their requirements. AGENDA's approach to generating and executing tests is based on integrity constraints, specifically state constraints (predicates over database states) and transition constraints (which involve the database states before and after execution of a transaction). Transaction consistency has two aspects: when run in isolation, a transaction should keep the database consistent, and the relation between the old and new states should satisfy the requirements of the transaction's specification.

To check a state constraint that is not enforced by the DBMS, AGENDA creates temporary tables to store the relevant data and translates such constraints into constraints on the temporary tables, which the DBMS can enforce. The tester can specify preconditions and postconditions to test a global consistency constraint. Temporary tables are created to join relevant attributes from different tables and to replace calls to aggregate functions by single attributes representing the aggregate returned; for example, a reference to SUM(X) in a constraint gives rise to a temporary attribute X_SUM. The constraint to be checked is translated into a constraint on a temporary table. To populate the temporary table, the constraints are added to it and the contents are copied in, with the constraint checked automatically after each insertion. Constraint violations are reported to AGENDA, indicating that the transaction violated a global state constraint.

The current version, AGENDA-0.1, can check relatively simple transition constraints that involve a single table. The tool modifies the schema so that each table has an additional log table that records all modifications made to the table during execution of the application program; the log tables are filled in response to each insert, update, or delete operation using triggers. The tester supplies simple constraints involving the old and new values in a row, and AGENDA translates these into constraints on the log tables. Logging is based on temporary tables with attributes representing the old and new values of attributes and aggregates from all the tables involved in the constraint. The check constraint on these tables comes from the postcondition of the transition constraint. A SELECT statement selects the relevant attributes from the application and log tables and stores its result in a cursor, which then fills the temporary table, with the constraints being checked automatically for each row.
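The trigger-based logging idea can be shown with a small runnable example (SQLite standing in for the DBMS AGENDA targets; the account table and the "balance must not decrease" transition constraint are invented for illustration):

```python
# A trigger fills a log table with the old and new values on every
# UPDATE, so a transition constraint over old/new rows can be checked
# afterwards.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE account (id INTEGER PRIMARY KEY, balance REAL);
CREATE TABLE account_log (id INTEGER, old_balance REAL, new_balance REAL);
CREATE TRIGGER log_update AFTER UPDATE ON account
BEGIN
    INSERT INTO account_log VALUES (OLD.id, OLD.balance, NEW.balance);
END;
INSERT INTO account VALUES (1, 100.0);
UPDATE account SET balance = 250.0 WHERE id = 1;
""")

# Transition check: the assumed constraint is that balance never decreases.
log = conn.execute("SELECT * FROM account_log").fetchall()
violations = conn.execute(
    "SELECT * FROM account_log WHERE new_balance < old_balance").fetchall()
print(log)          # → [(1, 100.0, 250.0)]
print(violations)   # → []
```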

Initially, all global consistency constraints are validated on the initial database. After a transaction commits, AGENDA checks the log tables for all the application tables associated with each global constraint. If they are all empty, nothing relevant to that constraint has changed; otherwise the global constraint is checked. A faulty update of tables by a transaction cannot always be exposed by the transition check alone, but if the violation of a transition consistency constraint causes a violation of a state constraint, AGENDA will find the problem. State checking and transition checking thus complement each other to check consistency efficiently.

Testing Database Concurrency

Deng, Frankl, and Chen [3] present a dataflow analysis technique for identifying schedules of transaction execution that reveal concurrency faults, potentially due to the offline concurrency problem. Although the DBMS employs sophisticated mechanisms to ensure that transactions satisfy the ACID properties (atomicity, consistency, isolation, and durability), an application can still have concurrency-related faults if the application programmer erroneously placed queries in separate transactions when they should have been executed as a unit. The authors consider application programs written in embedded SQL, where multiple clients executing an application program concurrently each have their own host variables. This section summarizes the work in their paper.

Consistency constraints can be violated if more than one application instance tries to modify the same data element concurrently. To avoid interference from other instances, related operations should be grouped into a transaction; to improve efficiency, operations on different tables or on different tuples of the same table may be performed concurrently. A concurrency problem that occurs when data is manipulated across multiple database transactions is termed an offline concurrency problem.

Consider a ticket booking system in which a host variable avail contains the number of available seats on a flight. Suppose the application erroneously commits between the operation that books the ticket and the operation that updates avail. We then get two different transactions which, if executed concurrently by, say, two different clients, can leave avail with an inconsistent value.

During the execution of concurrent database transactions, different phenomena are possible [3]:
a) P0 (Lost Update / Dirty Write)
b) P1 (Dirty Read)
c) P2 (Non-repeatable Read)
d) P3 (Phantom Phenomenon)

DBMSs allow transactions to run at four isolation levels, from least stringent (highest concurrency) to most stringent (lowest concurrency): READ UNCOMMITTED (level 0), READ COMMITTED (level 1), REPEATABLE READ (level 2), and SERIALIZABLE (level 3). P0 is impossible at any isolation level, while P1, P2, and P3 may happen at levels 0, 1, and 2. When a transaction is executed at the SERIALIZABLE level, the DBMS transaction manager ensures that none of the phenomena can occur: it allows concurrency only if the result of the concurrent execution is the same as that of some sequential execution of the transactions. The authors consider transactions running at the SERIALIZABLE level.


Procedure to test Transaction Concurrency

Testing concurrent systems is difficult due to non-determinism and synchronization problems, and it is even harder to diagnose and debug the faults that are detected. Concurrent programs have their own problems of data access control and synchronization. The authors propose the following procedure to test transaction concurrency:

1. Find interesting and legal schedules (or partial schedules). A schedule is interesting if it could potentially produce concurrency errors, and it is legal if it is a feasible path in the application execution. A schedule can be represented in the form S = < T^m_A, T^l_B, T^k_A >, where T^m, T^l, and T^k are transactions (not necessarily distinct) and A and B are two different application instances. Schedule S is interesting if T^m_A, T^l_B, and T^k_A access the same data element.

2. Generate test cases to execute the given transactions for each generated schedule.Transactions in a schedule usually access the same data element (table and at-tribute). To guarantee that all transactions in a schedule access the same tuple inthe same database table, the test case generator should populate the inputs to thesetransactions properly.

3. Execute the test cases in such a way that the interleaving of the application instances conforms to the specified schedule. This can be achieved by two methods: modification of the concurrency control engine in the DBMS back-end, or instrumentation of the application source code.

A flow-sensitive dataflow analysis (DFA) technique is used to find interesting and legal schedules. First, a transaction control flow graph (TCFG) is constructed from the application source code. To determine the set of data elements and the set of host variables a transaction accesses, the AGENDA tool set can be used: when the SQL queries in a transaction are parsed by the Agenda Parser, information about all data elements accessed (read or written) is stored in the table xact_read_write, and information about host variables accessed (defined or used) and their associated tables and attributes is stored in the table xact_parameter. In the example considered above, if two transactions T^m and T^k in the same instance access the same data element on some path of the TCFG, there is a potential concurrency error. To generate interesting and legal schedules, the DFA algorithm propagates data information over the TCFG. Each node in the TCFG corresponds to a transaction, and the DFA information at each node is a set of tuples consisting of the data elements read/written and the host variables defined/used by that node (transaction). Using simple heuristics on the data sets associated with each node, interesting schedules can be generated.
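The core of the schedule-finding step can be sketched as follows. The transaction names and read/write sets below are invented (loosely following the ticket-booking example), and the pairwise search stands in for the full DFA over the TCFG:

```python
# Each transaction carries the set of data elements (table, attribute)
# it reads/writes. Two transactions of instance A that touch the same
# element, with an instance-B transaction interleaved between them,
# form a candidate interesting schedule <T@A, T'@B, T''@A>.

xact_rw = {
    "T_book":   {("flight", "avail")},    # books a seat
    "T_update": {("flight", "avail")},    # updates avail
    "T_log":    {("audit", "entry")},     # touches unrelated data
}

def interesting_schedules(xacts):
    scheds = []
    for t1, rw1 in xacts.items():
        for t2, rw2 in xacts.items():
            if rw1 & rw2:                 # A's two xacts share an element
                for tb, rwb in xacts.items():
                    if rwb & rw1:         # B's xact touches it too
                        scheds.append((f"{t1}@A", f"{tb}@B", f"{t2}@A"))
    return scheds

scheds = interesting_schedules(xact_rw)
print(("T_book@A", "T_update@B", "T_book@A") in scheds)  # → True
```

The real analysis additionally checks that the two instance-A transactions lie on a feasible path of the TCFG, which this sketch omits.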

To enforce a generated schedule (or partial schedule), test cases must be generated in such a way that the transactions in the given schedule manipulate the same tuple (row) in the database table. To guarantee that the test is executed according to the desired schedule, two methods are suggested: modification of the DBMS transaction manager, and instrumentation of the application. Both methods use shared memory and semaphores for synchronization and coordination of the different application instances. In the first case, the transaction manager is modified so that it reads information about the desired schedule (e.g., from a configuration file) into its shared memory; a semaphore controls the access by all DBMS back-ends, and based on the connection identifier (i.e., process identifier) and transaction identifier, the transaction manager can determine whether the current application instance is the desired one.

In the second approach, a partial schedule is generated and the execution order of a subset of transactions is specified. Different transactions may contain the same query, so to distinguish transactions a static identifier (xid) is assigned to each one, and control code is added to the application source. This approach has been used in [3] to check transactions for consistency. The authors also provide a procedure for testing transaction atomicity/durability; details can be found in [3].
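The semaphore-based coordination used by both enforcement methods can be sketched with threads standing in for application instances. The structure and names below are assumptions for illustration, not the paper's code:

```python
# Each instance waits on its own semaphore before running a transaction;
# a coordinator releases the semaphores in the desired schedule order.
import threading

order = []
go = {"A": threading.Semaphore(0), "B": threading.Semaphore(0)}
done = {"A": threading.Semaphore(0), "B": threading.Semaphore(0)}

def instance(name):
    go[name].acquire()            # wait until the scheduler says go
    order.append(f"T1@{name}")    # run this instance's transaction
    done[name].release()          # tell the scheduler the step finished

threads = [threading.Thread(target=instance, args=(n,)) for n in "AB"]
for t in threads:
    t.start()

for step in ["B", "A"]:           # desired schedule: B's xact before A's
    go[step].release()
    done[step].acquire()

for t in threads:
    t.join()
print(order)                      # → ['T1@B', 'T1@A']
```

Because the coordinator waits for each step to complete before releasing the next semaphore, the interleaving is deterministic regardless of thread scheduling.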


Chapter 3

Database generation

Test databases are largely generated from synthetic data. We saw in Section 2.2 how generation of the test database is handled by the AGENDA tool set; there are various alternatives. Database generation can be achieved by construction of database instances that respect the semantics of SQL statements [6]. Special-purpose languages also exist, such as DGL (Data Generation Language) and FGL (Fact Generation Language), which can be used to build database generation frameworks: DGL [8] provides various data types that can be used to populate the database, and Genie [10], a database generation tool for testing inference detection, uses FGL. In this chapter we discuss database generation using database instances and using the DGL framework.

3.1 Automatic Generation of DB instances

In this section we look at a generation strategy for database instances that respect the semantics of the SQL statements embedded in a database application program [6]. Tools are available that can generate a set of constraints; these constraints collectively represent a property against which the program is tested, and database instances can be derived from them using existing constraint solvers.

To test a program with an embedded SQL statement, we need to find appropriate input data, including a database instance. Test data generation requires tackling the following general problem: "Given an SQL statement and a property, find a database instance such that the result of executing the statement satisfies the property."

The property depends on the requirements of the application. For instance:
(P1) The result is empty, i.e., it does not have any row.
(P2) The result table has a row with a negative attribute value.

These two scenarios represent a null case and an exception, respectively. They can be expressed by the following two formulas in the syntax of SQL: NOT EXISTS Rst and 0 > ANY Rst, where Rst denotes the result table.

To generate the test data, we use a formula to describe the relationship between the input (database instance) and the output (result table). If we can find values for the variables that satisfy the formula, so that all the constraints hold, then we can obtain the desired database instances. It is known that relational algebra with the basic operators such as union and join has the same expressive power as first-order predicate logic. Thus, expressing the constraints in the form of formulas and satisfying all of them generates suitable database instances.

Depending on the form of the constraints, different automatic constraint-solving tools can be used. One such tool is BoNuS [6]. It can solve constraints involving variables of different types (e.g., enumerations, integers, reals). Suppose there is a conditional expression such as ((x ≥ 10) or (x ≤ 4)) and (x ≥ 2y + 1) and (y ≥ 6). We may give the following as input to BoNuS:

int x, y;
bool B1 = (x >= 10);
bool B2 = (x <= 4);
bool B3 = (x >= 2*y + 1);
bool B4 = (y >= 6);
and(or(B1, B2), B3, B4);

A solution will be found, e.g., x = 13, y = 6.

Apart from constraints generated from the SQL, we can have constraints of other kinds as well, such as uniqueness constraints, business rules, etc. An automatic tool for generating database instances along these lines has been described in [6]. Its input consists of an SQL statement and the database schema definition, together with an assertion which represents the requirement of the tester. The output is a set of constraints which can be given to existing constraint solvers; if they are satisfiable, we obtain the desired database instances. Having a fully automatic tool for test data generation is close to impossible, but we can reasonably expect that a powerful tool will assist the tester significantly.
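To make the constraint-solving step concrete, the following is a naive Java stand-in for a solver (purely illustrative; BoNuS and real constraint solvers are far more capable): it exhaustively searches a small integer box for a pair satisfying the example constraints. The class and method names are our own, not part of [6].

```java
// Naive stand-in for the constraint-solving step: brute-force search
// over a small integer box for the example constraints
// ((x >= 10) or (x <= 4)) and (x >= 2y + 1) and (y >= 6).
class ConstraintSearch {
    static boolean satisfies(int x, int y) {
        return (x >= 10 || x <= 4) && (x >= 2 * y + 1) && (y >= 6);
    }

    static int[] firstSolution() {
        for (int x = -50; x <= 50; x++) {
            for (int y = -50; y <= 50; y++) {
                if (satisfies(x, y)) {
                    return new int[]{x, y}; // e.g., x = 13, y = 6
                }
            }
        }
        return null; // unsatisfiable within the search box
    }
}
```

For these constraints, y ≥ 6 forces x ≥ 2·6 + 1 = 13, so the search finds x = 13, y = 6, matching the solution quoted above.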

3.2 Data Generation Language (DGL)

Bruno [8] has presented a flexible framework for database generation. In many situations, synthetic databases are the only choice for generating data to test database applications: real data might not be available, or might not be comprehensive enough, to evaluate certain applications. Unfortunately, no existing data generation framework is capable of modelling rich, varied data distributions in the context of an RDBMS. To avoid the use of ad-hoc generation tools by individual researchers, this framework has been proposed; it uses a special-purpose language with a functional flavor, called DGL.

DGL uses iterators as basic units that can be composed to produce streams of tuples. DGL can also interact with an underlying RDBMS and leverage its well-tuned, scalable algorithms. Specifications can be further simplified by adding "DGL annotations", described in Section 3.2.3, to the SQL CREATE TABLE statement, specifying how a table should be populated.

3.2.1 Overview of Data Types and Iterators

DGL supports various datatypes for generating synthetic data distributions. We discuss some of the main data types to convey the power and use of DGL.


Scalars are the most basic type in DGL, and are further subdivided into integers (Int), double-precision numbers (Double), strings (String), and dates (Datetime).
Rows are fixed-size heterogeneous sequences of scalars. For instance, R = [1.0, 2, 'Rajdeep'] has type [Double, Int, String]. The expression dim(R) returns the number of columns in R. The operator ++ combines rows, so [2,'Vishal'] ++ ['deep',3] returns a new row [2,'Vishal','deep',3].
Iterators are objects in DGL. They support operations such as open (to initialize the iterator) and getNext (to return the next Row, or an end-of-iterator special value). As an example to sum up the basics of DGL, consider I1 = Step(1,100,2), I2 = Step(5,21,3), and I3 = Constant([10,20,30]). In that case,

I3[1] = [20], [20], ...
I1 ++ I2 = [1,5], [3,8], ..., [11,20]
I1 + I2 - I3[0] = [-4], [1], ..., [21]
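The iterator examples above can be sketched in Java. Step and the ++ combinator are assumed to behave as the text describes (row streams, pairwise concatenation, stopping at the shorter stream); all class names here are illustrative, not DGL internals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Minimal sketch of a DGL-style Step iterator plus the ++ combinator.
class StepIterator implements Iterator<int[]> {
    private int cur;
    private final int max, step;

    StepIterator(int start, int max, int step) {
        this.cur = start; this.max = max; this.step = step;
    }

    public boolean hasNext() { return cur <= max; }

    public int[] next() { int[] row = {cur}; cur += step; return row; }

    // ++ : zip two streams, concatenating paired rows; stops at the shorter
    static List<int[]> concat(Iterator<int[]> a, Iterator<int[]> b) {
        List<int[]> out = new ArrayList<>();
        while (a.hasNext() && b.hasNext()) {
            int[] ra = a.next(), rb = b.next();
            int[] row = Arrays.copyOf(ra, ra.length + rb.length);
            System.arraycopy(rb, 0, row, ra.length, rb.length);
            out.add(row);
        }
        return out;
    }
}
```

For Step(1,100,2) ++ Step(5,21,3), this sketch yields the six rows [1,5], [3,8], ..., [11,20].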

Similarly, there are various primitive iterators that take care of distributions (uniform, exponential, etc.), duplicate elimination, and providing a bridge between DGL and the underlying RDBMS: Persist and Query. Persist bulk-loads all rows provided by an iterator I into a database table s, and returns the string s. Query takes a parameterized query string and a sequence of iterators, materializes the i-th iterator, replaces the parameter with a temporary table, and executes the resulting query. The i-th parameter is denoted as <i> in sqlStr. The expression below returns a random permutation of all odd numbers below 1000:

Query("SELECT v0 FROM <0> ORDER BY v1", Step(1,1000,2) ++ Uniform(0,1))

In this expression, Query takes an iterator of two columns, where the first column consists of all odd numbers between 1 and 1000 and the second is a random number. This is very primitive functionality of DGL, but far more complex expressions with much additional functionality can be built.

3.2.2 DGL Programs and Expressions

A DGL program is a set of function definitions followed by an expression (called the main expression). For instance, we can define a DGL function simpleF that parameterizes a DGL expression; it can be called as simpleF(0.65, 10000).

simpleF(P, count) =
  LET U = Uniform([5,7], [15,13]),
      N = Normal([5,5], [1,2])
  IN Top(ProbUnion(U, N, P), count)

Evaluating a DGL program is equivalent to evaluating its main expression, casting theresult to an iterator, and returning all the rows produced by this iterator.
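Under the assumption that ProbUnion(U, N, P) draws each next row from U with probability P and from N otherwise, and that Top keeps the first count rows, the pair can be sketched as follows. All names are illustrative, not DGL's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import java.util.function.Supplier;

// Sketch of assumed ProbUnion/Top semantics: each of `count` rows is drawn
// from source u with probability p, otherwise from source n.
class ProbUnionSketch {
    static List<double[]> top(Supplier<double[]> u, Supplier<double[]> n,
                              double p, int count, Random rnd) {
        List<double[]> out = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            out.add(rnd.nextDouble() < p ? u.get() : n.get());
        }
        return out;
    }
}
```

With p = 0.65 and count = 10000, roughly 65% of the produced rows would come from the Uniform source and the rest from the Normal source, mirroring the simpleF example.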

3.2.3 Annotated Schemas Using DGL

Annotations on the SQL CREATE TABLE statement additionally specify how to populate the created table [8]. The syntax is as follows:


CREATE TABLE T (col1 type1, ..., coln typen)
[ other CREATE TABLE options ]
POPULATE N AS ((col11, ..., col1n) = expr1,
               ...
               (colk1, ..., colkn) = exprk)

where N is an integer that specifies an upper bound on the size of the created table, each column colj in T is mentioned exactly once in the POPULATE clause, and each expri is a DGL expression with some additional syntactic sugar. A database specification consists of a batch of annotated CREATE TABLE statements and is processed as follows. First, each table is created, omitting the POPULATE clause in its CREATE TABLE statement. Then, a single DGL program is built from all the POPULATE annotations. Finally, the DGL program is evaluated, populating the database tables as a side effect. DGL is believed to be an important first step towards the generation of reusable synthetic databases.


Chapter 4

Testing SQL Queries

Testing is a critical activity for database application programs, since undetected faults can lead to unrecoverable data loss. Database application programs typically contain statements written in an imperative programming language with embedded data-manipulation commands, such as SQL. Generation of SQL test statements with wider coverage of the input domain, and static checking of the queries in a database application, are a few of the efforts towards effective testing of database applications with regard to embedded SQL queries. These are discussed briefly in the following sections.

4.1 Static Checking of Generated Queries

Testing database applications has another flavor, apart from testing database state transitions and testing database systems: the SQL queries dynamically or statically generated within an imperative programming language can themselves be tested. Proper testing should bring down the probability of SQL queries raising run-time exceptions. Many data-intensive applications dynamically construct and execute queries in response to client requests; the servlet programmer, for example, enjoys static checking of the Java code via Java's strong type system, but not of the query strings that code builds.

In this section we discuss results from [1] to illustrate the errors that programmers might make when programming Java servlet applications. Consider a front-end Java servlet for a store, with an SQL-driven database back-end. The database table INVENTORY contains a list of all items in the store, while the table TYPES is used to look up type-codes for items. The TYPE column is an integer, representing the product type-code of an item.

ResultSet getPrices(String lowerBound) {
    String query = "SELECT '$' || "
                 + "(RETAIL/100) FROM INVENTORY "
                 + "WHERE ";
    if (lowerBound != null) {
        query += "WHOLESALE > " + lowerBound + " AND ";
    }
    query += "TYPE IN (" + getTypeCode() + ");";
    return statement.executeQuery(query);
}


String getTypeCode() {
    return "SELECT TYPECODE, TYPEDESC FROM TYPES "
         + "WHERE NAME = 'fish' OR NAME = 'meat' ";
}

The method getPrices constructs the string query to hold an SQL SELECT statement that returns the prices of items, and executes the query. It uses the string returned by the method getTypeCode as a sub-query. If lowerBound is "250", then the query to be executed is:

SELECT '$' || (RETAIL/100) FROM INVENTORY
WHERE WHOLESALE > 250 AND TYPE IN
(SELECT TYPECODE, TYPEDESC FROM TYPES
WHERE NAME = 'fish' OR NAME = 'meat');

However, the Java type system does little to check for possible errors in the dynamically generated SQL query strings. Several different runtime errors can arise with this example, as described below; note that none of them would be caught by Java's type system.
Error (1): The expression '$' || (RETAIL/100) concatenates the character '$' with the result of the numeric expression RETAIL/100. While some database systems will implicitly type-cast the numeric result to a string, many do not, and will issue a runtime error.
Error (2): Consider the expression WHOLESALE > lowerBound. The variable lowerBound is declared as a string, and the WHOLESALE column is of type integer. As long as lowerBound is indeed a string representing a number, there is no type error. However, this is risky: nothing (certainly not the Java type system itself) keeps the string variable lowerBound from containing non-numeric characters.
Error (3): The string returned by the method getTypeCode() constitutes a subquery that selects two columns from the table TYPES. Because the IN clause of SQL supports only subqueries returning a single column (in this context), a runtime error would arise. This can happen if getTypeCode() originally returned a single column but was inadvertently changed to return two.
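As a purely illustrative aside (not part of the analysis in [1]), a defensive runtime guard against Error (2) might look like the following; the class and method names are our own.

```java
// Hypothetical guard: reject non-numeric bounds before splicing them into
// a dynamically built query string, so Error (2) fails fast and loudly.
class NumericGuard {
    static String requireNumeric(String s) {
        if (s == null || !s.matches("-?\\d+")) {
            throw new IllegalArgumentException("not a numeric literal: " + s);
        }
        return s;
    }
}
```

Such a check only catches the problem at run time, which is exactly the limitation that motivates the static analysis discussed next.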

A static analysis to flag potential errors, or guarantee their absence, in dynamically generated SQL queries has been proposed in [1]. This work uses a variant of the context-free language (CFL) reachability problem. As a first step, the analysis builds upon a static string analysis to construct a conservative representation of the generated query strings as a finite-state automaton. It then statically checks the finite-state automaton with a modified version of the context-free language reachability algorithm.

4.2 Testing of SQL Evaluation Engine by RAGS

A system called RAGS (Random Generation of SQL) [9] was built to explore automated testing of SQL database systems. The input domain (all SQL statements, from any number of users, combined with all states of the database) is gigantic. It is also difficult to verify output for positive tests because the semantics of SQL are complicated. In such a case, a tool such as RAGS comes to the rescue: it can generate, rapidly and randomly, a very large number of SQL statements without human intervention.

The SQL test libraries typically provided cover only a narrow region of the input domain. RAGS can increase the size of an SQL test library a million-fold, making all aspects of the generated SQL statements configurable and thereby maximizing the bug detection rate.

The RAGS program first reads a configuration file (which identifies the SQL systems), connects to the DBMS, and reads the schema information. RAGS then loops, generating SQL statements and optionally executing them. Several instances of the DBMS can be executed concurrently to represent multiple users. We can even execute the same query, say an SQL SELECT, on multiple vendors' DBMSs and compare the results; this can be used to record errors and print a report accordingly. Different tests, such as multi-user tests, comparison tests, visualization, etc., can then be used to exercise the SQL statements and help validate the outputs or return values of the SQL queries.

SQL Statement Generation

RAGS [9] generates SQL statements by walking a stochastic parse tree and printing it out. Consider the SQL statement:

SELECT name, salary + commission
FROM Employee
WHERE (salary > 10000) AND (department = 'sales')

We can build the parse tree for the statement, and a program could walk the tree and print out the SQL text. RAGS is like that program, except that it builds the tree stochastically as it walks it. RAGS follows the semantic rules of SQL by carrying state information and directives on its walk down the tree, and the results of stochastic outcomes as it walks up. For example, the datatype of an expression is carried down the expression tree, and the name of a column reference that comprises an entire expression is carried up the tree.
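A toy version of such a stochastic walk, for a deliberately tiny expression grammar, could look like the following. The grammar, column names, and class names are illustrative, not RAGS's actual rules.

```java
import java.util.Random;

// Toy sketch of RAGS-style stochastic generation: walk a tiny grammar at
// random, carrying the remaining depth bound down the tree.
class TinyRags {
    static final String[] COLUMNS = {"name", "salary", "commission"};

    // expression ::= column | expression + expression
    static String expr(Random r, int depth) {
        if (depth == 0 || r.nextInt(3) == 0) {
            return COLUMNS[r.nextInt(COLUMNS.length)];
        }
        return expr(r, depth - 1) + " + " + expr(r, depth - 1);
    }

    static String select(Random r) {
        return "SELECT " + expr(r, 2) + " FROM Employee WHERE salary > "
                + r.nextInt(100000);
    }
}
```

Each call to select produces a different, syntactically valid statement; the real RAGS additionally consults the schema so that the generated statements are semantically valid as well.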


Chapter 5

Efficient Regression Tests of Database Applications

Database applications are becoming increasingly complex. They are subject to constant change: business processes are re-engineered, authorization rules are changed, and optimizations are added for better performance. Unfortunately, changing a database application is very costly. In order to ensure the integrity of the applications, most organizations have test installations of their software components and special test database instances.

5.1 Database Applications Regression Tests

There are various tools that support regression testing, the most popular being JUnit, but testing a database application with these tools requires a great deal of manual work. The reason is that these tools are designed for stateless applications, whereas for database applications the order of execution matters. To carry out regression tests, companies use various commercial tools, one being HT Trace. This tool has a teach-in component to record test runs and a batch component to automatically run test runs and detect changes in the answers produced by the application for each user action. The test tool can be extended by a control component that controls the order in which the test runs are executed and when the test database is reset.

5.1.1 Regression Test Framework and Overview

The user interacts with the database application, as in Figure 5.1, in the form of requests, and receives answers from the application, e.g., query results, acknowledgments, and error messages. The purpose of the regression tests is to detect changes in the behavior of the application after the application or its configuration has been changed. To carry out the regression tests we focus on so-called black-box tests; i.e., no knowledge of the implementation of the application is available. In the first phase, test engineers or a test-case generation tool create test cases; in other words, interesting requests are generated and issued to a regression test tool. Phase 1 is expected to work correctly, so that the answers returned by the application are correct and the new state of the test database is correct as well. For complex applications, many thousands of such requests (and answers) are stored in a repository.


Figure 5.1: Regression Test’s test run execution phase.

After the application has changed, the regression test tool is started in order to find out how the changes have affected the behavior of the application. This is done by automatically re-issuing the requests recorded in the repository and comparing the answers of the updated application with the answers stored in the repository, as shown in Figure 5.1. At the end, a report is created for all the requests that failed.

Usually, several requests are bundled into test runs, and failures are reported at the granularity of test runs. This has a two-fold advantage: first, bundling requests into test runs improves the manageability of the regression tests; second, a whole business process can be tested as a specific sequence of requests. Testing a database application involves the following components; the terminology is as follows:

Test Database D: The state of the application at the beginning of each test. In general, this state can involve several database instances, network connections, message queues, etc.
Reset R: An operation that brings the application back into state D. This operation is potentially needed after the execution of a test that updated the database. Since testing changes the state of the application, this operation needs to be carried out in order to be able to repeat tests.
Request Q: The execution of a function of the application. The result of the function depends on the parameters of the function call (encapsulated in the request) and the state of the test database at the time the request is executed. A request can have side effects, i.e., change the state of the test database.
Test Run T: A sequence of requests Q1, ..., Qn that are always executed in the same order. For instance, a test run tests a specific business process that is composed of several actions (login, view product catalog, place order, specify payment, etc.). The test run is the unit in which failures are reported. It is assumed that the test database is in state D at the beginning of the execution of a test run. During the execution of a test run the state may change due to the execution of requests with side effects.
Schedule S: A sequence of test runs and resets. The test runs and reset operations are carried out one at a time; there is no concurrency in this framework.
Failed Test Run: A test run for which at least one request does not return the expected result. A failed test run indicates a bug in the application program.
False Negative: A test run fails although there is no bug in the application. A possible reason is that the application was in the wrong state at the beginning of the execution. False negatives are expensive, for obvious reasons.
False Positive: A test run does not fail although the application has a bug. The reason can be the same as above.

5.1.2 Centralized Scheduling Strategies

A control strategy determines the schedule by deciding the order of test run execution and when the database is reset (the R operation). With respect to how they deal with false negatives, there are two possible alternatives: (i) avoidance and (ii) resolution. No-Update and Reset Always are representative avoidance-based strategies. The strategies are briefly described below.
a) No-Update:
Test runs are created in such a way that the test database remains unchanged. For example, for every request inserting into the database, the test run also contains a request that deletes the inserted value. Thus, test runs can be executed in any order and there is no need for resetting the test database.
b) Reset Always:
This is the simplest control strategy. Here, too, the test runs can be executed in any order; this is possible because the strategy resets the test database before each execution of a test run. It executes like:

R T1 R T2 R T3 ... R Tn

c) Optimistic:
The Reset Always strategy is obviously sub-optimal, since it requires a reset after every test run, and resets are not always necessary. The Optimistic control strategy overcomes this disadvantage by resetting the test database only when a test run fails. In that case, the strategy reports the test run as a failure and re-runs it after a reset. Here, too, the test runs can be executed in any order. As an example, the Optimistic control strategy could result in the following schedule:

R T1 T2 T3 R T3 T4 ... Tn

d) Optimistic++:
The Optimistic strategy has to carry out some test runs twice (although this is still better than the unnecessary resets of the Reset Always strategy). The Optimistic++ strategy remembers invalidations and therefore avoids executing a test run twice. In other words, Optimistic++ behaves exactly like the Optimistic strategy the first time a regression test is executed, but avoids double execution of test runs in later iterations.


The Optimistic++ strategy is known to perform better than both the Reset Always and Optimistic strategies. Continuing the example of the Optimistic strategy, Optimistic++ produces the following schedule for the second and later iterations:

R T1 T2 R T3 T4 ... Tn

In the Optimistic++ strategy, a test run failure can have two possible explanations: (i) the application program has a bug, or (ii) the test database was in a wrong state due to the earlier execution of other test runs. To be sure that only bugs are reported, the Optimistic++ strategy first resets the test database (i.e., executes R) and then re-executes the test run that failed. If the test run fails again, it is reported as a potential bug in the application program. If not, the first failure was obviously due to the test database being in the wrong state; in this case the test run is not reported, and Optimistic++ remembers the test runs that were executed before this test run and records a conflict in a conflict database.
e) Slice:
Slice extends the Optimistic++ strategy. The Slice approach reorders whole sequences of test runs that can be executed without a reset; these sequences are called slices. The Slice heuristics use the conflict information in order to find a schedule in which as few resets as possible are necessary. The conflict information is gathered in the same way as for the Optimistic++ strategy. If there is a conflict between test runs <Ti> and T, then Slice executes T before <Ti>. At the same time, however, Slice does not change the order in which the test runs in <Ti> are executed, because those test runs can be executed in that order without requiring a database reset; such a sequence of test runs is called a slice. The Slice heuristics can best be described by an example with five test runs T1, ..., T5. Initially, no conflict information is available. Assume that the random-order execution of the test runs results in the following schedule:

R T1 T2 T3 R T3 T4 T5 R T5

From this schedule, we can derive two conflicts: <T1 T2> → T3 and <T3 T4> → T5. Correspondingly, there are three slices: <T1 T2>, <T3 T4>, and <T5>. Based on the conflict information in the conflict database and the collected slices, Slice executes T3 before <T1 T2> and T5 before <T3 T4> in the next iteration. In other words, the test runs are executed in the following order: T5 T3 T4 T1 T2. Assume that this execution results in the following schedule:

R T5 T3 T4 T1 T2 R T2

In addition to the already known conflicts, the following conflict is added to the conflict database: <T5 T3 T4 T1> → T2. As a result, the next time the test runs are executed, the Slice heuristics try the following order: T2 T5 T3 T4 T1.

The Slice heuristics reorder the test runs with every iteration until reordering no longer helps, either because the schedule is perfect or because of cycles in the conflict data. The details can be found in [4].
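For the worked example above, the reordering amounts to splitting the observed schedule at resets into slices and then executing the slices in reverse order, keeping the order inside each slice, so that each failing run moves in front of the slice it conflicted with. The following is a sketch of just that step; the full heuristics in [4] consult the conflict database more generally, and the class name is our own.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the Slice reordering from the example: each inner list is a
// slice (test runs that executed without a reset, in observed order); the
// next iteration runs the slices in reverse order.
class SliceReorder {
    static List<String> nextOrder(List<List<String>> slices) {
        List<String> order = new ArrayList<>();
        for (int i = slices.size() - 1; i >= 0; i--) {
            order.addAll(slices.get(i));
        }
        return order;
    }
}
```

For the slices <T1 T2>, <T3 T4>, <T5>, this produces the order T5 T3 T4 T1 T2; for <T5 T3 T4 T1>, <T2> it produces T2 T5 T3 T4 T1, matching the example.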


f) Graph-based heuristics:

Graph-based heuristics are an alternative to the Slice heuristics. The idea is to model conflicts as a directed graph in which the nodes are test runs and an edge Tj → Tk indicates that Tj might update the test database in such a way that a reset (R) is necessary, with probability w, in order to execute Tk after Tj. The edges are weighted by this probability w, based on which a graph reduction algorithm can be applied to find a good order in which to execute the test runs. The best-known heuristics for the graph reduction is called MaxWeightedDiff [4].

5.1.3 Discussion and Results

As noted earlier, the Optimistic++ strategy performs better than both the Reset Always and Optimistic strategies in all cases. The Slice heuristics are always better than the basic Optimistic++ approach. It can be shown that the number of resets decreases monotonically with the number of iterations; in other words, re-ordering slices never results in a worse schedule. Therefore, the Slice heuristics always converge after a finite number of iterations. On the negative side, it is easy to see that Slice is not always optimal; for details refer to the example in [4]. The Slice and graph-based heuristics are progressive algorithms, since they extend the Optimistic++ approach: the basic idea is to progressively change the order in which test runs are executed, based on the experience obtained in previous regression tests.
In general, the Slice and MaxWeightedDiff heuristics perform better than the Optimistic strategy due to re-ordering of the test runs, so that conflicting test runs are not executed consecutively. The Optimistic strategy performs well, on average, if there are few conflicts, whereas Slice is better if there are many conflicts. MaxWeightedDiff was found to be the best graph-based heuristic, and was better than Slice when there were few conflicts.

5.2 Parallel Execution of Test Runs

Executing test runs in parallel [5] is obviously very important when many test runs need to be executed. The goal is to exploit the available resources as well as possible. To achieve linear speed-up, the load on all machines needs to be balanced, the state of the test database(s) needs to be controlled, and test runs need to be executed in such a way that the number of database reset operations is minimized. Parallel testing involves solving a two-dimensional optimization problem: (i) partitioning: deciding which test run to execute on which machine; (ii) ordering: deciding the order of execution of the test runs on each machine.

Here we look at two architectures for parallel execution of test runs: Shared-Nothing (SN) and Shared-Database (SDB). SN executes test runs concurrently on different machines and database instances, so the executions of concurrent test runs do not interfere. SDB executes test runs concurrently on different threads using the same database instance, and thus the concurrent test runs do interfere. Depending on the architecture, a parallel execution can increase the number of resets due to interference (SDB), or decrease the number of resets (SN) by executing test runs that are in conflict concurrently. We discuss three important heuristics and compare them for SN and SDB.

i) Parallel Optimistic++:
The parallel Optimistic++ strategy for SN and SDB works in exactly the same way: the test runs are put into the scheduler's input queue in random order, and the scheduler always selects the first test run from its input queue when a test run has completed (i.e., when a test thread becomes available, in the case of SDB). The Optimistic++ strategies for SN and SDB differ only in the way the conflicts are recorded.
ii) Parallel Slice:
The key idea of the parallel Slice heuristics for SN is to schedule whole slices rather than individual test runs. In SN, naturally, the test runs of a slice should be executed sequentially on the same machine. In the SDB architecture, the idea is to execute test runs from the same slice concurrently on different threads, because these test runs are known not to be in conflict. In SDB, a database reset terminates the execution of test runs in all concurrent threads. For a detailed explanation, readers may refer to [5]. The parallel Slice heuristics use timestamps to define a total order on the concurrent test runs.
iii) Parallel MaxWeightedDiff:
In the SN architecture, conflict-graph construction and order determination are done similarly to the regular MaxWeightedDiff heuristics for a serial execution. When the execution of a test run completes on, say, machine Mx, the scheduler selects the first test run from its input queue if the cumulated weight of the conflicts from the test runs in the history of Mx to that test run is lower than a certain threshold. Otherwise, this criterion is tested for the second test run, the third test run, and so on, until a suitable test run is found; if no such test run exists, the first test run is selected. Similarly, for SDB, the scheduler always selects the first test run from its input queue when a test thread becomes available.


Chapter 6

Conclusions and Future Work

In response to a lack of existing approaches specifically designed for testing database applications, the framework AGENDA [2], discussed here, addresses various database issues. Its ability to handle constraints such as not-NULL, uniqueness, and referential integrity, along with its handling of transaction concurrency, has made it a prominent framework for testing database applications. Currently, AGENDA is believed to be the only framework specifically addressing the testing of database applications. Various other techniques have been proposed to improve the testing process, but successfully bringing them all under a single functional tool remains a task for researchers. A method for automatic generation of database instances has been proposed [6], which can be used for white-box testing, but it has some limitations; for instance, it does not handle string variables. Improvements to such constraint generation tools will help in the generation of database instances, for the selection of test cases, according to the semantics of the SQL statements embedded in an application program.

Applying regression tests to a database application naively does not scale well and places a heavy burden on test engineers; it often limits the number of tests that can be carried out automatically. To avoid this, alternative approaches in the form of heuristics have been proposed in [4]. Among the various heuristics described, Slice and MaxWeightedDiff were identified as the ones that performed well. These heuristics, apart from assisting in efficient regression tests, are also useful for parallel execution of test runs for database application systems. Three strategies, Optimistic++, Slice, and MaxWeightedDiff, were discussed that support parallel execution of test runs for two common architectures, Shared-Nothing (SN) and Shared-Database (SDB). A flexible database generation framework [8] based upon DGL (Data Generation Language) was also covered in the report. DGL, a simple specification language, has been introduced to generate databases with complex synthetic distributions. It is believed that such techniques will be very useful for the creation of reusable synthetic databases.

Beyond the current proposals, the whole topic of testing database applications is still in its infancy. Although many methodologies have been devised and proposed, no single methodology has been able to satisfy all the demands of database application testing. Several issues remain open, such as the automatic generation and evolution of test runs, the generation of test databases, and the development of platform-independent tools. These and other issues are under study and are expected to be addressed in the near future and integrated into a single tool for the efficient testing of database applications.



References

[1] C. Gould, Z. Su, and P. Devanbu. Static Checking of Dynamically Generated Queries in Database Applications. In Proceedings of the International Conference on Software Engineering, May 2004.

[2] D. Chays. Test Data Generation for Relational Database Applications. PhD thesis, Polytechnic University, Jan. 2004.

[3] Yuetang Deng, Phyllis G. Frankl, and Zhongqiang Chen. Testing Database Transaction Concurrency. In Proceedings of ASE, pages 184–195, 2003.

[4] F. Haftmann, D. Kossmann, and A. Kreutz. Efficient Regression Tests for Database Applications. In Proceedings of the CIDR Conference, 2005.

[5] F. Haftmann, D. Kossmann, and E. Lo. Parallel Execution of Test Runs for Database Application Systems. In Proceedings of the 31st VLDB Conference, Trondheim, Norway, pages 589–600, 2005.

[6] J. Zhang, C. Xu, and S. C. Cheung. Automatic Generation of Database Instances for White-Box Testing. In Proceedings of the 25th Annual International Computer Software and Applications Conference, October 2001.

[7] M. J. Suarez Cabal and J. Tuya. Improvement of Test Data by Measuring SQL Coverage. In Proceedings of the Eleventh International Workshop on Software Technology and Engineering Practice, pages 234–240, 2003.

[8] N. Bruno and S. Chaudhuri. Flexible Database Generators. In Proceedings of the 31st VLDB Conference, Trondheim, Norway, 2005.

[9] D. Slutz. Massive Stochastic Testing of SQL. In Proceedings of the 24th VLDB Conference, New York, USA, 1998.

[10] T. H. Hinke, H. S. Delugach, and R. P. Wolf. Genie: A Database Generator for an Inference Detection Tool. University of Alabama in Huntsville, USA, Feb. 1995.

[11] Y. Deng, P. Frankl, and D. Chays. Testing Database Transactions with AGENDA. In Proceedings of the 27th International Conference on Software Engineering, pages 78–87, 2005.
