Uploaded by: melinda-turner
File I/O
1. Goals of this chapter
2. General Concept, lots of examples
3. High-level I/O
   1. Built-in functions that do everything in one step
4. Low-level I/O
   1. Built-in functions that are specific to one task
1. Goals of this chapter
Understand why external files (.txt, .xls, .dat…) are so widely used in programming
Recognize when to use a file vs. hardcoding data
Understand how Matlab sees any file
Define and recognize delimiting characters
Depending on the type of file, choose an appropriate built-in function
Remember the overall idea behind each built-in function
Remember the logical order of reading/writing a file
2. General Concept
What is an “input-file”?
It is a file that contains the givens needed to solve the entire problem, or maybe just a part of the problem.
LOTS of data (as little as 5 lines to billions of lines…)
What is an “output-file”?
It is a file which contains the results (once the problem has been solved).
Most common example: I _ V O _ C _
Some examples of input files…
Property Sales in Flagler (North of Daytona Beach)
INPUT FILE
Property Listing in Ormond Beach (North of Daytona Beach)
Hurricane Data
INPUT FILE
Hurricane Data... What?
Same as previous file…
A text file doesn’t always look pretty. Try opening it with different programs until one clearly shows the columns and rows!
  Notepad
  Wordpad
  Internet Explorer
  Firefox
  Word
INPUT FILE
Coefficient of Friction
.xls, Excel format
.txt, text format
(INPUT FILES)
File Formats
Files are stored in essentially two different formats: ASCII text and binary.
ASCII text consists entirely of characters that have meaning within the ASCII table. The only special codes are those already present in the ASCII table, such as newlines and tabs. Notepad, for example, works only with ASCII text files.
File Formats, cont.
Binary files may contain some ASCII text, but they also contain data that has meaning only to the program intended to use the file. MS Word, for example, embeds codes for bold, underline, HTML, etc. in the file. These codes only have meaning to Word, not to programs like Notepad.
Word can work with Notepad files, but Notepad cannot work with Word files.
General Concept, cont.
When reading data from a file, the program continually requests values from the file data stream until the end of the file is reached. As the data is received, the program must identify the delimiting characters (tabs, spaces, new lines from the enter key, special characters…) and reformat the data to reconstruct its organization as represented in the file.
When writing data to a file, the appropriate delimiting characters must be inserted into the stream of information to preserve the organization of the data.
File → Program: an input stream of information
Program → File: an output stream of information
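A minimal sketch of what "identifying the delimiters" and "inserting the delimiters" can look like for one comma-delimited row. The variable names and the sample row are made up for illustration; strsplit() and str2double() are standard MATLAB functions.

```matlab
% One row of text, as it might arrive from a comma-delimited file (illustrative values)
textLine = '19,78,22,83';

% READING: split the text at each delimiter, then convert each piece to a number
pieces = strsplit(textLine, ',');   % 1x4 cell array of character vectors
values = str2double(pieces);        % 1x4 numeric array: 19 78 22 83

% WRITING: put the delimiter back between the values and end the row with a new line
rowText = sprintf('%d,%d,%d,%d\n', values);
```

The same idea applies to tabs or spaces: only the delimiter passed to strsplit() and written by sprintf() changes.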
General Concept, cont.
Using the data contained inside input-files
  Analyze: min, max, average
  Estimate trends (stock market)
  Use it as a database for one or two variables (coefficient of friction)
  Use it to give the user possibilities to pick from
  Plot
Using output-files, once populated with data
  Print them
  Email them
  Publish them in a report
  Reuse them as input files for another program
Examples…
Analyzing Flagler Sales
Price of properties as a function of square footage
Linear fit of all data points (red)
Standard deviation (green)
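A rough sketch of how such a plot could be produced once the two columns are in memory. The numbers below are hypothetical and are NOT taken from the Flagler file; drawing the green lines one standard deviation above and below the fit is also an assumption about what the slide shows.

```matlab
% Hypothetical data (not from the Flagler sales file)
sqft  = [1000 1500 2000 2500 3000];
price = [152000 198000 251000 303000 346000];

% Straight-line fit: coeffs(1) is the slope, coeffs(2) is the intercept
coeffs = polyfit(sqft, price, 1);
fitted = polyval(coeffs, sqft);

% Standard deviation of the differences between the data and the fit
spread = std(price - fitted);

% Data in blue, fit in red, fit +/- one standard deviation in green
plot(sqft, price, 'bo', sqft, fitted, 'r-', ...
     sqft, fitted + spread, 'g--', sqft, fitted - spread, 'g--');
xlabel('Square footage');
ylabel('Price');
```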
Attendance Sheet
OUTPUT FILE
Analyze grades
INPUT FILE OUTPUT FILE
Another Hurricane Data
This one shows all coordinates (latitude/longitude) of the storm at different times during the day, as well as the wind speed.
INPUT FILE
Plot Hurricane Data
This is a final project in JAVA (CS225) in Fall 2007 that analyzed the hurricane input file.
General Concept, cont.
Although there are only a few ADVANTAGES to using files, the ones that do exist are sufficient in themselves!!!
LOTS of data, usually contained in very few variables (possibly only 1 cell-array!)
The software is developed knowing that the file cannot change format, but definitely knowing that the file can grow in size! Although the hurricane data obviously gets updated yearly, the software to analyze the data does not change a bit!
The user does not have to enter the data: less prone to errors.
Long-term saving/backup of data.
General Concept, cont.
Only certain formats can be analyzed:
.txt .xls .csv .dat .jpg .png .gif .trk
The following is not possible under MATLAB:
.doc
NOW THAT YOU HAVE AN IDEA… How is all this done?
General Concept, cont.
There are two types of built-in functions:
High-level functions: one (and only one) command opens the file, reads or writes the data, and closes the file. All is stored immediately in variable(s).
Low-level functions:
1. one command needs to open the file,
2. another command (maybe more than one) needs to read or write the data,
3. and finally another command closes the file.
Learn both. High-level functions can only be used with some very specific files.
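The three low-level steps can be sketched as follows. This is only an illustration: the file name is hypothetical, and the file is first created in the temporary folder so the example stands on its own. fopen(), fprintf(), fscanf() and fclose() are standard MATLAB functions.

```matlab
fname = fullfile(tempdir, 'lowlevel_demo.txt');   % hypothetical file name

% WRITING a small file: open, write, close
fid = fopen(fname, 'w');              % 1. open the file for writing
fprintf(fid, '19 78\n22 83\n');       % 2. write two rows, space-delimited
fclose(fid);                          % 3. close the file

% READING it back: open, read, close
fid = fopen(fname, 'r');              % 1. open the file for reading
if fid == -1
    error('Could not open %s', fname);
end
data = fscanf(fid, '%d %d', [2 Inf])';   % 2. read two numbers per row
fclose(fid);                          % 3. close the file
```

fscanf() fills its result column by column, which is why the size is given as [2 Inf] and the result is transposed to get one file row per array row.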
3. HIGH LEVEL FUNCTIONS
In one command-line, the function grabs all the data
High Level, cont.
Load data from external sources (INPUT)
Save data back to a file (OUTPUT)
When reading data, it is a good habit to explore the data in the file first.
  What type of file is it? (Excel, text, jpg…)
  Find the overall organization (data in rows, or data in columns)
  Recognize the delimiters (space, tabs, new lines, dash, colon, or any specific symbol). What makes it obvious it is a new column? What makes it obvious it is a new row?
  Recognize the data types (all numerical? all strings? or a combination of both?)
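One possible way to explore a text file from within MATLAB before choosing a function, sketched under the assumption of a small tab-delimited file. The file name and content are made up, and the file is created first so the example is self-contained.

```matlab
% Create a small tab-delimited file to explore (hypothetical name and content)
fname = fullfile(tempdir, 'explore_demo.txt');
fid = fopen(fname, 'w');
fprintf(fid, '19\t78\n22\t83\n98\t99\n');
fclose(fid);

% Read the whole file as one long row of characters and look at it
txt = fileread(fname);
disp(txt)

% Count candidate delimiters: char(9) is a tab, char(10) is a new line
nTabs   = sum(txt == char(9));
nCommas = sum(txt == ',');
nRows   = sum(txt == char(10));

% Numbers only? True if every character is a digit, sign, dot, tab or new line
allNumeric = all(ismember(txt, ['0123456789.-' char(9) char(10)]));
```

Here the counts suggest tabs between columns and new lines between rows, with numerical data only.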
High Level, cont.
NUMERICAL AND CHARACTERS
  Excel worksheet: xlsread() (sometimes does not work depending on which system you have)
NUMERICAL DATA ONLY (of course, not in an Excel file…)
  General-delimited: dlmread()
  Tab-separated: dlmread() (same function as for general-delimited files)
  Comma-separated: csvread()
Most can be interchanged. These are powerful functions. Remember xlsread() and dlmread() mostly!
Whatever you do, the files (text, .m, .mat) should all be in the same directory for function calls to work!
3.1 Using xlsread()
The function recognizes the data type of each cell, and groups similar data types together.
>> [v1 v2 v3] = xlsread('grades.xlsx')
First variable (v1) will have all the numerical data
Second variable (v2) will have all the character data
Third variable (v3) groups ALL the data together in one cell-array.
The return values are NOT column by column. This does not give one variable per column:
>> [names age grade] = xlsread('grades.xlsx')
xlsread(), cont.
1st return value: numerical
2nd return value: characters
3rd return value: the data all together
xlsread(), cont.
>> [txt nbs raw] = xlsread('grades.xlsx')
txt =
    19    78
    22    83
    98    99
    21    56
    23    89
    19    51
nbs =
    'Name'       'Age'    'Grade'
    'Fred'       ''       ''
    'joe'        ''       ''
    'SaLLy'      ''       ''
    'CharliE'    ''       ''
    'mary'       ''       ''
    'Ann'        ''       ''
raw =
    'Name'       'Age'    'Grade'
    'Fred'       [ 19]    [ 78]
    'joe'        [ 22]    [ 83]
    'SaLLy'      [ 98]    [ 99]
    'CharliE'    [ 21]    [ 56]
    'mary'       [ 23]    [ 89]
    'Ann'        [ 19]    [ 51]
The order of the return values is important: numerical data first, character data second, all the data together third.
Variable names are up to the programmer. If placed in the wrong order by mistake… BAD. Here txt ends up holding the numbers and nbs ends up holding the text.
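A sketch of the same call with names that match the order of the return values, followed by pulling out one variable per column. It assumes grades.xlsx is laid out as in the output above (one header row, names in the first column) and sits in the current directory; the variable names are only suggestions.

```matlab
if exist('grades.xlsx', 'file') == 2
    % Names follow the fixed order: numbers, text, everything
    [nums, txt, raw] = xlsread('grades.xlsx');

    % The numerical block has two columns: ages, then grades
    ages   = nums(:, 1);
    grades = nums(:, 2);

    % The text block still contains the header row, so skip row 1
    names = txt(2:end, 1);
else
    disp('grades.xlsx was not found in the current directory.');
end
```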
xlsread(), cont.
Simply omit the 2nd and 3rd return values to collect only numerical values.
values = xlsread('grades.xlsx');
If a project needs all the data together, collect the 1st and 2nd return values into a dummy variable.
[trash trash data] = xlsread('grades.xlsx');
If there happen to be ‘holes’ in the spreadsheet, Matlab fills them with a NaN value (not a number). The function isnan() can help determine where those ‘holes’ are.
(xlsread() will not work under Citrix, since Excel is not installed on Citrix. Come to the lab to do labs and homeworks.)
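To illustrate locating the ‘holes’ with isnan(), here is a small sketch on a hand-typed array standing in for the numerical return value of xlsread(). The values and variable names are for illustration only.

```matlab
% Stand-in for numerical spreadsheet data with two empty cells (NaN)
nums = [19 78; 22 NaN; NaN 99];

% Logical array: true wherever a value is missing
holes = isnan(nums);

% How many holes, and where (row and column of each)
nHoles = sum(holes(:));
[rowIdx, colIdx] = find(holes);

% Keep only the rows that have no holes at all
completeRows = nums(~any(holes, 2), :);
```

find() scans column by column, so the hole in column 1 (row 3) is reported before the hole in column 2 (row 2).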
Writing to an Excel sheet?
After manipulating, analyzing, or creating an array, Excel spreadsheets can also be written to using:
xlswrite(<filename>, <array>, <sheet>, <range>)
The <sheet> and <range> arguments are optional.
clc
clear
%create phony data
table = rand(5,3)*100;
%print to excel
xlswrite('testingTesting.xls',table)
Then use Excel to plot and analyze the data!
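A short sketch of the two optional arguments. It assumes a system where Excel is installed; the sheet name 'Results' and the starting cell 'B2' are arbitrary choices for illustration.

```matlab
% Phony data: a 4x4 array
data = magic(4);

% Write the array to the sheet named 'Results', starting at cell B2
% (assumes Excel is available on this system)
xlswrite('testingTesting.xls', data, 'Results', 'B2');
```

If the sheet does not exist yet, MATLAB adds it and displays a warning.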
3.2 Using delimited files – Numbers only
Columns are delimited by tabs and/or spaces.
Rows are delimited by the new line character (enter key, invisible to the human eye).
Matlab automatically fills in with zeros where data is missing.
Same function used for comma-delimited numbers
Added possibility to ‘skip’ columns
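A sketch of these points using dlmread() on a small comma-delimited file. The file name and numbers are made up, and the file is created first so the example is self-contained. The row and column offsets passed to dlmread() start counting at 0.

```matlab
% Create a small comma-delimited file with one missing value in row 2
fname = fullfile(tempdir, 'dlm_demo.txt');   % hypothetical file name
fid = fopen(fname, 'w');
fprintf(fid, '1,2,3\n4,,6\n7,8,9\n');
fclose(fid);

% Read everything: the missing field comes back as 0
M = dlmread(fname, ',');

% Skip the first column: start at row offset 0, column offset 1
skipFirst = dlmread(fname, ',', 0, 1);
```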
High-Level, conclusion
The variables are automatically created by MATLAB.
  Sometimes a regular array [ ]
  Other times, a cell array { }
Use knowledge from previous chapter to analyze the data in the arrays.
  FOR loops, A(position), A(row,col), A{row,col}, max(arrayName), mean(arrayName), sort(..)
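As a small illustration of that last point, the cell array below is typed in by hand with the same content as the raw output shown for grades.xlsx, so no file is needed. The variable names are only suggestions.

```matlab
% Same content as the 3rd return value of xlsread() shown earlier
raw = {'Name',    'Age', 'Grade'; ...
       'Fred',     19,    78; ...
       'joe',      22,    83; ...
       'SaLLy',    98,    99; ...
       'CharliE',  21,    56; ...
       'mary',     23,    89; ...
       'Ann',      19,    51};

% Cell array { } to regular array [ ]: grades are in column 3, below the header
grades = cell2mat(raw(2:end, 3));

bestGrade    = max(grades);              % 99
avgGrade     = mean(grades);             % 76
sortedGrades = sort(grades, 'descend');  % highest grade first

% FOR loop over the data rows, using A{row,col} to reach inside each cell
for row = 2:size(raw, 1)
    fprintf('%s: %d\n', raw{row, 1}, raw{row, 3});
end
```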