Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book:...

16
Data Structures Wen-Chih Peng ( 彭文志) Department of Computer Science National Chiao Tung University Fall, 2012

Transcript of Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book:...

Page 1: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Data Structures

Wen-Chih Peng (彭文志) Department of Computer Science National Chiao Tung University

Fall, 2012

Page 2: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Course Information

Instructor: Wen-Chih Peng (彭文志) Office: EC 542 (Tel:31478) E-mail: [email protected] TA: 廖忠訓 (EC621) Text book: Fundamentals of Data Structures in C++,

E. Horowitz, S. Sahni and D. Mehta, Computer Science Press.

Reference book: Data Structures & Algorithm Analysis in Java, M. A. Weiss, Pearson Education.

Web site (e3.nctu.edu.tw)

Page 3: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Course Information (Cont’d)

Prerequisite object oriented programming C++

Grading(Tentative) Midterm 30%, final 30% and 4-5 homeworks

(40%)

Page 4: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Course materials Introduction (Concepts, Recursion,

Algorithm Analysis) Arrays Stacks & Queues Lists Trees Graphs Sorting Hashing Advanced Data Structures

Page 5: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Importance Basis of all fields in computer

science Data structures + Algorithms +

Coding skills = Programming

Page 6: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Web search engine

1. Web crawling 2. Indexing 3. Searching

Page 7: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Big-Table for Storing Web pages

Page 8: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Indexing

Problem: Given a document and a string, find the all

appearances of the string in the document.

Related techniques: String Matching (Brute-Force, KMP)

Page 9: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Example

Tiger Woods Clinches PGA Player of the Year Award 2006/ 9/ 8 By Michael Buteau NEW YORK, Bloomberg Tiger Woods, who has won his past five tournaments and seven overall this season, clinched his eighth PGA Player of the Year Award.

Tiger Woods Clinches PGA Player of the Year Award 2006/ 9/ 8 By Michael Buteau NEW YORK, Bloomberg Tiger Woods, who has won his past five tournaments and seven overall this season, clinched his eighth PGA Player of the Year Award.

Search keyword is “Tiger Woods”

Page 10: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Keywords as an Indexing Problem:

Given a document and a list of stop words, find all words and their frequencies.

Related techniques: Array Linked list Binary search tree

Page 11: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Ranking Web pages

Page 12: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Ranking Web Pages Problem:

There are many web pages containing the search keyword. But, how to score them?

Algorithms: PageRank or HITS: Graph

Page A

Page C

Page B

Page D

Page 13: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

How to program Requirements Analysis: bottom-up vs. top-down Design: data objects and

operations Refinement and Coding Verification

Program Proving Testing Debugging

Page 14: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Algorithm Definition

An algorithm is a finite set of instructions that accomplishes a particular task

Criteria input output definiteness: clear and unambiguous finiteness: terminate after a finite number of steps effectiveness: instruction is basic enough to be carried

out

Page 15: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Specification vs. Implementation

Operation specification function name the types of arguments the types of the results

Implementation Program language (e.g., C++, Java) How to implement using one of data

structures learned

Page 16: Data Structures - National Chiao Tung Universityocw.nctu.edu.tw/course/ds011/00.pdfText book: Fundamentals of Data Structures in C++, E. Horowitz, S. Sahni and D. Mehta, Computer Science

Evaluation (Measurement)

Criteria Is it correct? Is it readable?

Performance Analysis (machine independent) space complexity: storage requirement time complexity: computing time

Performance Measurement (machine dependent)