File I/O 1. Goals of this chapter 2. General Concept, lots of examples 3. High-level I/O 1. Built-in...


File I/O

1. Goals of this chapter

2. General Concept, lots of examples

3. High-level I/O
   1. Built-in functions that do everything in one step

4. Low-level I/O
   1. Built-in functions that are specific to one task


1. Goals of this chapter

- Understand why external files (such as .txt, .xls, .dat…) are so widely used in programming
- Recognize when to use a file vs. hardcoding data
- Understand how MATLAB sees any file
- Define and recognize delimiting characters
- Depending on the type of file, choose an appropriate built-in function
- Remember the overall idea behind each built-in function
- Remember the logical order of reading/writing a file


2. General Concept

What is an “input-file”?

It is a file that contains the given data needed to solve the entire problem, or maybe just a part of the problem.

LOTS of data (as little as 5 lines to billions of lines…)

What is an “output-file”?

It is a file which contains the results (once the problem has been solved).

Most common example: I _ V O _ C _

Some examples of input files…


Property Sales in Flagler (North of Daytona Beach)

INPUT FILE


Property Listing in Ormond Beach (North of Daytona Beach)


Hurricane Data


INPUT FILE


Hurricane Data... What?


Same as previous file…

A text file doesn’t always look pretty. Try opening it with different programs until one clearly shows the columns and rows!

- Notepad
- Wordpad
- Internet Explorer
- Firefox
- Word

INPUT FILE


Coefficient of Friction

.xls, Excel format; .txt, text format (INPUT FILES)


File Formats

Files are stored in essentially two different formats: ASCII text and binary.

ASCII text consists entirely of characters which have meaning within the ASCII table. The only special codes are already present in the ASCII table, such as newlines and tabs. Notepad, for example, works only with ASCII text files.


File Formats, cont.

Binary files may contain some ASCII text, but will also contain data which has meaning to the program which was intended to use the file. MS Word, for example, embeds codes for bold, underline, HTML, etc in the file. These codes only have meaning to Word – not to programs like Notepad.

Word can work with Notepad files, but Notepad cannot work with Word files.


General Concept, cont.

When reading data from a file, the program continually requests values from the file data stream until the end of the file is reached. As the data is received, the program must identify the delimiting characters (tabs, spaces, enter key, special characters...) and reformat the data to reconstruct its organization as represented in the file.

When writing data to a file: to preserve the organization of the data, the appropriate delimiting characters must be inserted into the stream of information.


File → Program: an input stream of information

Program → File: an output stream of information
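
The two streams above can be sketched in MATLAB as follows. This is only an illustration: the file name `stream_demo.txt` and the small matrix are made up for the sketch. Tabs and newlines are inserted on the way out, and the same delimiters are used to rebuild the rows and columns on the way in.

```matlab
% Hypothetical file name and data, for illustration only.
fname = 'stream_demo.txt';
data  = [1 2 3; 4 5 6];

% Output stream: insert a tab between columns and a newline after each row.
fid = fopen(fname, 'w');
for r = 1:size(data, 1)
    fprintf(fid, '%d\t%d\t%d\n', data(r, :));
end
fclose(fid);

% Input stream: grab the raw characters, then split on the delimiters.
stream  = fileread(fname);                        % one long character array
rows    = strsplit(strtrim(stream), sprintf('\n'));
rebuilt = zeros(numel(rows), 3);
for r = 1:numel(rows)
    fields = strsplit(rows{r}, sprintf('\t'));    % cell array of text fields
    rebuilt(r, :) = str2double(fields);           % text back to numbers
end
disp(rebuilt)

delete(fname);                                    % clean up the demo file
```

Without the tab and newline characters, the file would contain only a run of digits and the original 2-by-3 layout could not be recovered.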


General Concept, cont.

Using the data contained inside input files:
- Analyze: min, max, average
- Estimate trends (stock market)
- Use it as a database for one or two variables (coefficient of friction)
- Use it to give the user possibilities to pick from
- Plot

Using output files, once populated with data:
- Print them
- Email them
- Publish in a report
- Reuse as an input file for another program

Examples…


Analyzing Flagler Sales

Price of properties as a function of square footage

Linear Fitting of all data points (red)

Standard deviation (green)
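
A rough sketch of how such an analysis might be computed, once the data has been read into arrays. The numbers below are invented for illustration and are not the Flagler sales data. Treating the "standard deviation" as the spread of the prices around the fitted line is an assumption about what the green lines represent.

```matlab
% Hypothetical data (NOT the real Flagler file): square footage and
% price in thousands of dollars.
sqft  = [1000 1500 2000 2500 3000];
price = [ 150  210  240  310  350];

% Linear fit: price is approximately p(1)*sqft + p(2)
p      = polyfit(sqft, price, 1);
fitted = polyval(p, sqft);

% Spread of the data around the fitted line (assumed meaning of the
% "standard deviation" lines in the plot).
spread = std(price - fitted);

fprintf('slope = %.3f, intercept = %.1f, spread = %.2f\n', p(1), p(2), spread);
```

To draw something similar to the slide, `plot(sqft, fitted, 'r')` would give the red line, and `fitted + spread` and `fitted - spread` could be plotted in green.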


Attendance Sheet


OUTPUT FILE


Analyze grades


INPUT FILE OUTPUT FILE


More Hurricane Data

This one shows all coordinates (latitude/longitude) of the storm at different times during the day, as well as the wind speed.


INPUT FILE


Plot Hurricane Data

This is a final project written in Java (CS225, Fall 2007) that analyzed the hurricane input file.


General Concept, cont.

Although there are only a few ADVANTAGES to using files, the ones that do exist are sufficient in themselves!!!

- LOTS of data, usually contained in very few variables (possibly only 1 cell-array!)
- The software is developed knowing that the file cannot change format, but definitely knowing that the file can grow in size! Although the hurricane data obviously gets updated yearly, the software to analyze the data does not change a bit!
- User does not have to enter the data: less prone to errors.
- Long-term saving/backup of data.


General Concept, cont.

Only certain formats can be analyzed: .txt, .xls, .csv, .dat, .jpg, .png, .gif, .trk

The following is not possible under MATLAB: .doc


NOW THAT YOU HAVE AN IDEA... How is all this done?


General Concept, cont.

There are two types of built-in functions:

- High-level functions: one (and only one) command opens the file, reads or writes the data, and closes the file. Everything is stored immediately in variable(s).
- Low-level functions:
  1. one command opens the file,
  2. another command (maybe more than one) reads or writes the data,
  3. and finally another command closes the file.

Learn both. High-level functions can only be used with some very specific files.
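
To make the contrast concrete, here is a minimal sketch that reads the same small file both ways. The file name `levels_demo.txt` and its contents are invented for the example, and `fscanf` is just one of several low-level reading functions that could be used for the second step.

```matlab
% Hypothetical demo file: two rows of space-separated numbers.
fname = 'levels_demo.txt';
dlmwrite(fname, [10 20; 30 40], ' ');

% High-level: one command opens the file, reads it, and closes it.
A = dlmread(fname);

% Low-level: three separate steps.
fid = fopen(fname, 'r');             % 1. open the file
B = fscanf(fid, '%f', [2 Inf])';     % 2. read the numbers (fscanf fills
                                     %    column by column, hence the transpose)
fclose(fid);                         % 3. close the file

disp(isequal(A, B))                  % both approaches give the same matrix

delete(fname);                       % clean up the demo file
```

The low-level version is longer, but each step can be adapted when the file does not fit what a high-level function expects.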


3. HIGH LEVEL FUNCTIONS

In one command-line, the function grabs all the data


High Level, cont.

- Load data from external sources (INPUT)
- Save data back to a file (OUTPUT)

When reading data, it is a good habit to explore the data in the file first.
- What type of file is it? (Excel, text, jpg...)
- Find the overall organization (data in rows, or data in columns)
- Recognize the delimiters (spaces, tabs, new lines, dashes, colons, or any specific symbol). What makes it obvious it is a new column? What makes it obvious it is a new row?
- Recognize the data types (all numerical? all strings? or a combination of both?)
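
Opening the file in a text editor is usually enough for this exploration, but it can also be done from MATLAB. The sketch below peeks at the first two lines of a file and checks whether they contain tabs. The file name `peek_demo.txt` and its contents are invented for the example.

```matlab
% Hypothetical demo file with a header line and tab-separated columns.
fname = 'peek_demo.txt';
fid = fopen(fname, 'w');
fprintf(fid, 'Name\tAge\nFred\t19\njoe\t22\n');
fclose(fid);

% Peek at the first two lines without reading the whole file.
fid = fopen(fname, 'r');
preview = {};
for k = 1:2
    txt = fgetl(fid);            % one line, without the newline character
    if ~ischar(txt)              % fgetl returns -1 at the end of the file
        break
    end
    preview{end+1} = txt;
end
fclose(fid);

% Is the tab character one of the delimiters?
hasTab = any(preview{1} == sprintf('\t'));
disp(preview{1})
disp(hasTab)

delete(fname);                   % clean up the demo file
```

Here the first line reveals both a delimiter (tab) and the fact that the file mixes text and numbers, which rules out the numbers-only functions.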


High Level, cont.

NUMERICAL AND CHARACTERS
- Excel worksheet: xlsread() (sometimes does not work depending on which system you have)

NUMERICAL DATA ONLY (of course, not in an Excel file…)
- General-delimited: dlmread()
- Tab-separated: dlmread() (same function)
- Comma-separated: csvread()

Most can be interchanged. These are powerful functions. Remember xlsread() and dlmread() mostly!


Whatever you do, the files (text, .m, .mat) should all be in the same directory for function calls to work!


3.1 Using xlsread()

The function recognizes the data type of each cell and groups similar data types together.

>> [v1 v2 v3] = xlsread('grades.xlsx')

- The first variable (v1) will have all the numerical data.
- The second variable (v2) will have all the character data.
- The third variable (v3) groups ALL the data together in one cell-array.

The data is not returned column by column, so the following does not do what the variable names suggest:

>> [names age grade] = xlsread('grades.xlsx')


xlsread(), cont.


1st return value: numerical

2nd return value: characters

3rd return value: the data all together


xlsread(), cont.

>> [txt nbs raw] = xlsread('grades.xlsx')

txt =
    19    78
    22    83
    98    99
    21    56
    23    89
    19    51

nbs =
    'Name'       'Age'    'Grade'
    'Fred'       ''       ''
    'joe'        ''       ''
    'SaLLy'      ''       ''
    'CharliE'    ''       ''
    'mary'       ''       ''
    'Ann'        ''       ''

raw =
    'Name'       'Age'     'Grade'
    'Fred'       [ 19]     [ 78]
    'joe'        [ 22]     [ 83]
    'SaLLy'      [ 98]     [ 99]
    'CharliE'    [ 21]     [ 56]
    'mary'       [ 23]     [ 89]
    'Ann'        [ 19]     [ 51]

Variable names are up to the programmer. If placed in the wrong order by mistake… BAD. (Here txt ends up holding the numbers and nbs the text.)

The order of the return values is important.


xlsread(), cont.

Simply omit the 2nd and 3rd return values to collect only numerical values.

values = xlsread('grades.xlsx');

If a project needs all the data together, collect the 1st and 2nd return values into a dummy variable.

[trash trash data] = xlsread('grades.xlsx');

If there happen to be ‘holes’ in the spreadsheet, MATLAB fills them with a NaN value (not a number) in the numerical return value. The function isnan() can help determine where those ‘holes’ are.

(This function will not work under Citrix, since Excel is not installed on Citrix. Come to the lab to do labs and homework.)
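
A small sketch of how isnan() can locate the holes. To keep it runnable without Excel, the `values` array below is typed in by hand and only stands in for what xlsread() might return for a sheet with two empty cells. It is not data from grades.xlsx.

```matlab
% Hypothetical numerical result, standing in for the output of xlsread().
values = [19  78;
          22  NaN;
          NaN 99];

holes = isnan(values);              % logical array, true where data is missing
[r, c] = find(holes);               % row and column of each hole
nHoles = sum(holes(:));             % how many holes in total

% Keep only the entries of column 2 that are actual numbers.
col2 = values(~holes(:, 2), 2);

fprintf('%d hole(s) found\n', nHoles);
disp([r c])
```

Note that a test such as `values == NaN` never finds anything, because NaN is not equal to itself. That is why isnan() is needed.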


Writing to an excel sheet?

After manipulating, analyzing, or creating an array, excel spreadsheets can also be written-to using:

xlswrite(<filename>, <array>, <sheet>, <range>)

The <sheet> and <range> arguments are optional.


clc
clear

% create phony data
table = rand(5,3)*100;

% print to excel
xlswrite('testingTesting.xls', table)

Then use excel to plot and analyze the data!


3.2 Using delimited files: numbers only

- Columns are delimited by tabs and/or spaces.
- Rows are delimited by the new line character (enter key, invisible to the human eye).


Matlab automatically filled in with zeros where data was missing.
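
The zero-filling can be checked with a small experiment. The sketch below builds a made-up tab-delimited file (`missing_demo.txt`, not one of the course files) in which the middle value of the second row is missing, and reads it back with dlmread().

```matlab
% Hypothetical demo file: tab-delimited, second row has an empty middle field.
fname = 'missing_demo.txt';
fid = fopen(fname, 'w');
fprintf(fid, '1\t2\t3\n4\t\t6\n');
fclose(fid);

% One command reads the whole file; '\t' names the tab delimiter explicitly.
M = dlmread(fname, '\t');
disp(M)

delete(fname);   % clean up the demo file
```

Because a genuine 0 and a missing value look the same after reading, it is worth checking the original file when zeros appear in unexpected places.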


The same function is used for comma-delimited numbers, with the added ability to ‘skip’ columns.
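
As a sketch of the skipping feature, dlmread() accepts a row offset and a column offset after the delimiter. Both offsets count from zero. The file `skip_demo.csv` below is made up for the example.

```matlab
% Hypothetical demo file: comma-delimited numbers, 2 rows by 3 columns.
fname = 'skip_demo.csv';
fid = fopen(fname, 'w');
fprintf(fid, '1,2,3\n4,5,6\n');
fclose(fid);

% Read everything.
everything = dlmread(fname, ',');

% Start at row offset 0 and column offset 1, i.e. skip the first column.
skipped = dlmread(fname, ',', 0, 1);

disp(everything)
disp(skipped)

delete(fname);   % clean up the demo file
```

The same offsets can be used to skip leading rows, for example a numeric header row, by raising the row offset instead.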


High-Level, conclusion

The variables are automatically created by MATLAB.
- Sometimes a regular array [ ]
- Other times, a cell array { }

Use knowledge from the previous chapter to analyze the data in the arrays: FOR loops, A(position), A(row,col), A{row,col}, max(arrayName), mean(arrayName), sort(..)
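
As a short illustration of that last step, the sketch below applies a few of these tools to the numerical data from the grades example. The numbers are typed in directly so that the code runs without the spreadsheet, and the variable names are arbitrary.

```matlab
% Numerical data copied from the grades example (columns: age, grade).
nbs = [19 78; 22 83; 98 99; 21 56; 23 89; 19 51];

grades  = nbs(:, 2);          % A(row,col) indexing: all rows, column 2
best    = max(grades);
average = mean(grades);
ranked  = sort(grades);       % ascending order

% FOR loop over positions: count the grades at or above the average.
count = 0;
for k = 1:numel(grades)
    if grades(k) >= average
        count = count + 1;
    end
end

fprintf('best = %d, average = %.1f, at or above average = %d\n', ...
        best, average, count);
```

If the data had come back as a cell array instead (such as the 3rd return value of xlsread()), curly braces would be needed to reach the contents, for example `raw{2,3}`.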