A DBMS With SQL Interpreter


Final Project End Term Report

A DBMS WITH SQL INTERPRETER

A Project Report

Submitted in Partial Fulfillment of the Requirements of the Award of Degree of

Bachelor of Technology

of

National Institute of Technology Calicut

By

NADAR JERIL DAVID Y2.358

NASIM AKHTER MONDAL Y2.270

PRANAB KUMAR JHA Y2.252

Under

Dr. Vineeth Kumar Paleri

Group No. 12

DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

NATIONAL INSTITUTE OF TECHNOLOGY CALICUT- 673601

March 2006

ACKNOWLEDGEMENT

We thank Dr. Vineeth Kumar Paleri, Professor, Department of Computer Science and Engineering, NIT Calicut, for his guidance and co-operation. We also acknowledge the advice and help given to us by our friends. We would like to extend our gratitude to the entire faculty and staff of the CSED, NITC, who stood by us through all the pitfalls we had to face during the development phase of this project.

CERTIFICATE

We certify that material contained in this dissertation is our own work and does not contain unreferenced or unacknowledged material. We also warrant that the above statement applies to the implementation of the project and also associated documentation.

Signed:

Date:

Table of Contents

1 INTRODUCTION
   1.1 Project Specification
   1.2 Literature Survey
   1.3 Motivation
   1.4 Report Overview

2 DESIGN
   2.1 Lexical Analysis
   2.2 Syntax Analysis
   2.3 Semantic Analysis and Preprocessing
   2.4 Type Checking
   2.5 Intermediate File Generation
   2.6 Evaluation
   2.7 GUI

3 WORK DONE

4 RESULTS

5 CONCLUSION

6 REFERENCES

Table of Figures

Figure 1-1 DESIGN

ABSTRACT

This project aims to implement a database management system that interprets a subset of database administration and SQL commands. It provides a subset of the functionality of MySQL 4.0. Interpreting a command results in actions on the underlying database. The development includes the design of the database architecture and of a SQL interpreter. The stages involved are Lexical Analysis, Syntax Analysis, Semantic Analysis, Type Checking, Intermediate File Generation, Evaluation and display of the results on a GUI.

1. INTRODUCTION

1.1. PROBLEM SPECIFICATION

This project develops a database management system that provides basic functionality. It accepts database administration and SQL commands, which the SQL interpreter then interprets. The result of interpretation is transformed into an intermediate form, and this form is evaluated to give the result. The result could be a modification of the database or data retrieved from it. The system prompts the user with an error message if the query is not correct.

1.2. LITERATURE SURVEY

Extensive study was done on topics relevant to the project. These topics include the development of compilers and interpreters [6]. Issues regarding the database architecture were dealt with [1]. A SQL query can be processed in different ways [3]; we looked into several strategies and finally chose one of them to implement in our project. Evaluation of a SQL query can also be done in a number of ways [8], and this task of evaluation was thus an integral part of the project. We had to look into the syntax of SQL queries of all types to make an interpreter [2]. File handling and other implementation issues in C and Java were also learnt [4][5].

1.3. MOTIVATION

Database management systems are a vast field in computer science, and there is a lot to explore in it. The field has great potential in the real world, where what matters is data and how efficiently one stores and retrieves it. This is a major challenge faced by leading software companies like Google, Yahoo! and Oracle.

Someone who wants to learn DBMS in great depth has to understand the underlying database architecture and its variants. We wanted to explore to these depths by developing a DBMS along the lines of a simple RDBMS, MySQL 4.0. This project would enable us to learn the intricacies involved in developing an interpreter for the query language used as well as in designing the database. Additional support was given by our teachers and guide to continue along these lines.

1.4. REPORT OVERVIEW

The remainder of the report deals with the design of the project, work done, results obtained, conclusion and references. Details are explained in the respective sections.

Section 2: This section describes the design of the whole project. It starts from the first step in the design, where input is taken, and continues to the last step, where we obtain the output. Phases like lexical analysis, syntax analysis, semantic analysis, type checking, intermediate file generation and evaluation are explained.

Section 3: This section describes the work done in the project. It sheds light on the various stages in which the implementation was done.

Section 4: This section mainly deals with the results obtained by running the program.

Section 5: This section concludes the whole document. It sheds some light on what was achieved by doing this project and what more has to be done.

2. DESIGN

The overall design of the project:

Fig 1.1 DESIGN (flow diagram; stages in order: SQL Query, Lexical Analysis, Syntax Analysis, Semantic Analysis, Type Checking, Intermediate File Generation, Evaluation, Output. Other labels in the figure: Data, Sequence of tokens, Query Compilation, Query Execution.)

The details of the design stages:

2.1. LEXICAL ANALYSIS

This stage takes the SQL Query as input and produces a sequence of tokens. In case of any lexical errors an error message is reported. The input is taken in a file “input.txt” from the GUI and is passed to the lexical analyzer. The tool used was Flex 2.5 [11].
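To illustrate what "a sequence of tokens" might look like, the following is a minimal sketch of a hand-written tokenizer in Java. It is not the project's scanner, which was generated with Flex; the class name SqlLexerSketch, the token categories, and the keyword list are all assumptions made for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: the project itself uses a Flex-generated scanner in C.
final class SqlLexerSketch {
    // Assumed keyword list; the report does not enumerate the real one.
    private static final Set<String> KEYWORDS = new HashSet<>(Arrays.asList(
        "SELECT", "FROM", "WHERE", "UPDATE", "SET", "DELETE", "INSERT", "INTO",
        "VALUES", "CREATE", "DROP", "TABLE", "DATABASE", "AND", "OR"));

    // One named group per assumed token category, tried left to right.
    private static final Pattern TOKEN = Pattern.compile(
        "(?<NUMBER>\\d+(?:\\.\\d+)?)"
        + "|(?<STRING>'[^']*')"
        + "|(?<WORD>[A-Za-z_][A-Za-z0-9_]*)"
        + "|(?<OP><=|>=|<>|[=<>+\\-*/(),;])");

    // Returns the tokens of the query, or throws on a lexical error.
    static List<String> tokenize(String query) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(query);
        int pos = 0;
        while (pos < query.length()) {
            if (Character.isWhitespace(query.charAt(pos))) {
                pos++;                       // whitespace separates tokens
                continue;
            }
            m.region(pos, query.length());
            if (!m.lookingAt()) {            // no token category matches here
                throw new IllegalArgumentException(
                    "Lexical error at position " + pos + ": '" + query.charAt(pos) + "'");
            }
            String text = m.group();
            if (m.group("NUMBER") != null) {
                tokens.add("NUMBER(" + text + ")");
            } else if (m.group("STRING") != null) {
                tokens.add("STRING(" + text + ")");
            } else if (m.group("WORD") != null) {
                String upper = text.toUpperCase();
                tokens.add(KEYWORDS.contains(upper)
                    ? "KEYWORD(" + upper + ")" : "IDENT(" + text + ")");
            } else {
                tokens.add("OP(" + text + ")");
            }
            pos = m.end();
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("SELECT name FROM emp WHERE age >= 30;"));
    }
}
```

A character that matches no category, such as `@`, raises an exception here, which corresponds to the lexical error message mentioned above.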

Flex is a tool for generating scanners: programs which recognize lexical patterns in text. Flex reads the given input files, or its standard input if no file names are given, for a description of a scanner to generate.

2.2. SYNTAX ANALYSIS

This stage checks the query/statement for syntax errors against the grammar. If the input does not conform to the grammar, an error message is reported. The tool used to implement this stage was Bison 2.1 [10].

Bison is a general-purpose parser generator that converts a grammar description for an LALR(1) context-free grammar into a C program to parse that grammar. Once we are proficient with Bison, we can use it to develop a wide range of language parsers, from those used in simple desk calculators to complex programming languages.

2.3. SEMANTIC ANALYSIS AND PREPROCESSING

Semantic actions have been written for each clause in the grammar wherever necessary. They check whether the input query/statement is meaningful. The preprocessor checks the constraints that apply to the action of each SQL command. We used file handling in C [5] for carrying out such checks. The details of the databases and tables were stored in “database.txt”.

2.4. TYPE CHECKING

The types of two scalar expressions are checked before applying any mathematical operation on them. It is also checked whether the user supplies the data in the order in which the columns are described in the table.

2.5. INTERMEDIATE FILE GENERATION

After all validation checks the file “eval.txt” is generated. The format of the file is different for different commands. This file is the output of the parser and is passed on to the evaluator.

Format of file:

1) SELECT: printable attributes, table name, conditional attributes, conditions
2) UPDATE: table name, assignment statements, conditional attributes, conditions
3) DELETE: table name, conditional attributes, conditions

2.6. EVALUATION

The data is retrieved in this phase. All the tables are kept as files and named as tablename.txt. For all the three commands that are used, each row in this file is retrieved just once.

Some basic steps are always followed in executing each of these commands. First, we find out the type of query to be processed (e.g. SELECT, UPDATE, etc.). Next, we retrieve the table name from eval.txt and open the corresponding table file. Similarly, the printable attributes, conditional attributes, assignment attributes and the condition (in postfix form) are retrieved from eval.txt.
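The retrieval steps above can be sketched as follows. The report gives only the order of the fields in eval.txt, so this sketch assumes a hypothetical layout with one field per line and whitespace-separated items within a line; the class EvalFileSketch and its field names are likewise illustrative and not taken from the project.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of reading the intermediate file.
// Assumption: one field per line, in the field order given in the report.
final class EvalFileSketch {
    final String command;
    final String tableName;
    final List<String> printable;     // used by SELECT only
    final List<String> assignments;   // used by UPDATE only, e.g. "salary=5000"
    final List<String> conditional;
    final String postfixCondition;

    private EvalFileSketch(String command, String tableName, List<String> printable,
                           List<String> assignments, List<String> conditional,
                           String postfixCondition) {
        this.command = command;
        this.tableName = tableName;
        this.printable = printable;
        this.assignments = assignments;
        this.conditional = conditional;
        this.postfixCondition = postfixCondition;
    }

    // Splits a line into whitespace-separated items; an empty line gives an empty list.
    private static List<String> items(String line) {
        String trimmed = line.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.split("\\s+")));
    }

    // 'lines' would be the contents of eval.txt, e.g. from Files.readAllLines.
    static EvalFileSketch parse(List<String> lines) {
        String command = lines.get(0).trim().toUpperCase();
        List<String> none = new ArrayList<>();
        switch (command) {
            case "SELECT":
                return new EvalFileSketch(command, lines.get(2).trim(),
                    items(lines.get(1)), none, items(lines.get(3)), lines.get(4).trim());
            case "UPDATE":
                return new EvalFileSketch(command, lines.get(1).trim(),
                    none, items(lines.get(2)), items(lines.get(3)), lines.get(4).trim());
            case "DELETE":
                return new EvalFileSketch(command, lines.get(1).trim(),
                    none, none, items(lines.get(2)), lines.get(3).trim());
            default:
                throw new IllegalArgumentException("Unsupported command: " + command);
        }
    }

    public static void main(String[] args) {
        EvalFileSketch q = parse(Arrays.asList(
            "SELECT", "name age", "emp", "age", "age 30 >"));
        System.out.println(q.command + " on " + q.tableName + ".txt");
    }
}
```

Keeping the command on the first line lets the evaluator decide how to read the rest of the file before touching any table data.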

This evaluation program is written in Java [4]. As the file containing the data, “tablename.txt”, was an unformatted one, it would have been difficult to implement the evaluator in C. We needed a two-dimensional array to store the details of printable attributes and conditional attributes. The two-dimensional array was implemented in Java as a Vector of Vectors.

Basically, there are four fields in each row of this two-dimensional vector. The fields are:

1) Attribute name (printable or conditional)
2) Flag (0 or 1, to show whether it is used for printing in the case of SELECT, used to set values in the case of UPDATE, or used to check the condition)
3) Index (to show the column number of the attribute in the table)
4) Value (for conditional and printable attributes, this value is retrieved from each row in the table and put here; for assignment attributes, it is the value to be set)

The indexes of all the attributes are obtained only when we read the first line of tablename.txt. We find all the indexes and then sort the rows of the vector by index value.
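The four fields and the index step can be sketched as below. This is only an illustration: it swaps the Vector of Vectors for a small class and an ArrayList, and it assumes that the first line of the table file holds whitespace-separated column names and that flag 1 marks a printable or assignment attribute and flag 0 a condition-only one. The names AttributeTableSketch, AttributeRow, resolveIndexes and fillValues are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch of the attribute table used during evaluation.
final class AttributeTableSketch {
    static final class AttributeRow {
        final String name;   // 1) attribute name
        final int flag;      // 2) assumed: 1 = printable/assignment, 0 = condition only
        int index = -1;      // 3) column number, unknown until the header is read
        String value = "";   // 4) value taken from the current data row

        AttributeRow(String name, int flag) {
            this.name = name;
            this.flag = flag;
        }
    }

    // Looks up each attribute in the header line, then sorts rows by column index.
    static void resolveIndexes(List<AttributeRow> rows, String headerLine) {
        List<String> columns = Arrays.asList(headerLine.trim().split("\\s+"));
        for (AttributeRow row : rows) {
            row.index = columns.indexOf(row.name);
            if (row.index < 0) {
                throw new IllegalArgumentException("Unknown attribute: " + row.name);
            }
        }
        rows.sort(Comparator.comparingInt(r -> r.index));
    }

    // Copies the cells of one data line into the value field of each row.
    static void fillValues(List<AttributeRow> rows, String dataLine) {
        String[] cells = dataLine.trim().split("\\s+");
        for (AttributeRow row : rows) {
            row.value = cells[row.index];
        }
    }

    public static void main(String[] args) {
        List<AttributeRow> rows = new ArrayList<>();
        rows.add(new AttributeRow("age", 0));
        rows.add(new AttributeRow("name", 1));
        resolveIndexes(rows, "id name age");
        fillValues(rows, "7 Asha 31");
        for (AttributeRow row : rows) {
            System.out.println(row.index + " " + row.name + " = " + row.value);
        }
    }
}
```

Sorting by index means the attributes are visited in the same left-to-right order as the columns of each data line.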

After this, each row of the table that contains data is read. As the row is read, its values are put in the 4th column of the vector. Once this is over, we turn our attention to the condition, which is in postfix form. There are still some attributes in this condition which have to be substituted by their values, which are in the vector. This is done by the substitution function. After this, the condition is finally evaluated. If the condition returns true, then we select the row and do further processing.
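The substitution and evaluation of a postfix condition can be sketched with a stack, as below. This is a simplified illustration under stated assumptions: tokens are separated by whitespace, all operands are numeric, and only the comparison operators and AND/OR shown here are supported. The class PostfixConditionSketch and its method names are hypothetical and are not the project's own substitution function.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: evaluates a whitespace-separated postfix condition.
final class PostfixConditionSketch {
    // Replaces each attribute name in the condition by its value for the current row.
    static String substitute(String postfix, Map<String, String> values) {
        StringBuilder out = new StringBuilder();
        for (String token : postfix.trim().split("\\s+")) {
            if (out.length() > 0) {
                out.append(' ');
            }
            out.append(values.getOrDefault(token, token));
        }
        return out.toString();
    }

    // Evaluates a fully substituted condition; operands are assumed to be numbers.
    static boolean evaluate(String postfix) {
        Deque<Object> stack = new ArrayDeque<>();
        for (String token : postfix.trim().split("\\s+")) {
            switch (token) {
                case "AND":
                case "OR": {
                    boolean right = (Boolean) stack.pop();
                    boolean left = (Boolean) stack.pop();
                    stack.push(Boolean.valueOf(token.equals("AND") ? left && right : left || right));
                    break;
                }
                case "=": case "<>": case "<": case "<=": case ">": case ">=": {
                    double right = (Double) stack.pop();
                    double left = (Double) stack.pop();
                    stack.push(Boolean.valueOf(compare(token, left, right)));
                    break;
                }
                default:
                    stack.push(Double.valueOf(token));   // operand
            }
        }
        return (Boolean) stack.pop();
    }

    private static boolean compare(String op, double left, double right) {
        switch (op) {
            case "=":  return left == right;
            case "<>": return left != right;
            case "<":  return left < right;
            case "<=": return left <= right;
            case ">":  return left > right;
            default:   return left >= right;   // ">="
        }
    }

    public static void main(String[] args) {
        Map<String, String> row = new HashMap<>();
        row.put("age", "31");
        row.put("salary", "4000");
        String condition = substitute("age 30 > salary 5000 < AND", row);
        System.out.println(condition + " -> " + evaluate(condition));
    }
}
```

Postfix form needs no parentheses or precedence rules at evaluation time, which keeps the per-row work to a single left-to-right pass over the condition.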

This is done for each row. This process is efficient because the rows are read just once.

2.7. GUI

The GUI only serves the purpose of inputting the query and displaying the result. The GUI writes the input to the file input.txt and passes it to the lexical analyzer. The evaluator generates the output file output.txt, which is displayed on the GUI. This file may contain results from a table or error messages. The tool used for this was NetBeans 3.6, as it has an extensive GUI library.

3. WORK DONE

The project was divided into modules covering all stages and implemented stage by stage.

The remaining part, around 70% of the syntax analyzer and preprocessor, was completed by writing the grammar according to Bison 2.1 specifications and SQL syntax.

Semantic actions were coded in C for each of the commands available to the user. Type checking was done for all scalar expressions involved in mathematical operations.

The structure of the database file and table files was designed so that data is inserted into simple unformatted text files. Coding for the generation of an intermediate file was done, keeping in mind the required format for each command.

A Java program was written to evaluate the intermediate file obtained after parsing. A GUI was developed for taking the input and displaying the result or error messages.

4. RESULTS

The output of the program was tested with appropriate test data and compared with the expected output for all commands, and was observed to be correct for all runs. The test strategy was to exercise all the possible paths taken during execution of the program. The code thus runs successfully for the commands CREATE, DROP, INSERT, SELECT, UPDATE and DELETE. The test data was chosen such that the error conditions handled could be checked. The error checking is not exhaustive, but most error cases were checked.

The implementation results in the creation of several files, which are used by different modules to display the end result.

Thus we have successfully created our own database in the form of files, and we use it to create and maintain the database for all users. We have provided facilities for the user to run different basic commands to manage the database system.

5. CONCLUSION

We implemented our own small-scale DBMS, which interprets a subset of SQL commands. It can be extended to a larger subset of SQL by adding rules to the grammar and writing the corresponding semantic actions along the same lines. The database architecture, which was implemented using files, could have been more efficient with the use of B-Trees or B+-Trees, which can also be easily extended. Other features of a DBMS, like server handling, transactions, etc., could also have been added. Overall, this project helped us to learn the development of an interpreter, query processing techniques and database architecture.

6. REFERENCES

1. Elmasri and Navathe, Fundamentals of Database System, 4th edition, Pearson Education Asia, 2004.

2. SQL tutorial, http://www.mysql.com.

3. Hector Garcia-Molina, Jeffrey D. Ullman and Jennifer Widom, Database System Implementation, International Edition, Prentice Hall, 2000.

4. Herbert Schildt, Java 2: The Complete Reference, 5th edition, Tata McGraw-Hill, 2002.

5. Yashwant Kanetkar, Let Us C, 3rd edition, BPB, 1999.

6. Alfred V. Aho, Ravi Sethi and Jeffrey D. Ullman, Compilers: Principles, Techniques and Tools, 2nd edition, Addison Wesley, 1986.

7. Garret Swart, MinSQL: A Simple Componentized Database for the Classroom, University College Cork, 2003.

8. Peter Buneman, Robert E. Frankel and Rishiyur Nikhil, An Implementation Technique for Database Query Language, ACM Trans. on Database Systems, Vol 7, June 1982, Pages 164-186.

9. V Kevin M. Whitney, Relational Data Management Implementation Techniques, ACM-SIGFIDET Workshop on Data Description, Access and Control, May 1974, Pages 321-348.

10. Bison Manual, http://www.gnu.org/software/bison/manual/html_mono/bison.html.

11. Flex Manual, http://www.gnu.org/software/flex/manual/.