Introduction to Matlab Programming with Applications

Interpolation [2]

• Interpolation is a method of constructing new data points within the range of a discrete set of known data points.

• The function interp1(x, v, xq) returns interpolated values of a 1D function at specific query points. [1]

  ◦ Optional parameters: nearest, linear, spline, pchip, and extrap.
  ◦ Note that x must be in ascending order.

• Try interp2, interp3, interpn, and interpft.

[1] See http://www.mathworks.com/help/matlab/ref/interp1.htm.
[2] See https://en.wikipedia.org/wiki/Interpolation.
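A minimal sketch of how the query points and the optional arguments of interp1 interact. The sample values are made up for illustration and are not part of the lecture.

```matlab
% Illustrative data (an assumption): v = 10 * x at three sample points.
x = [1 2 3];
v = [10 20 30];

vq1 = interp1(x, v, 2.5);                    % linear by default: 25
vq2 = interp1(x, v, 2.4, 'nearest');         % value at the nearest sample: 20
vq3 = interp1(x, v, 4);                      % outside the range of x: NaN
vq4 = interp1(x, v, 4, 'linear', 'extrap');  % linear extrapolation: 40
disp([vq1, vq2, vq3, vq4]);
```

Without 'extrap', the linear and nearest methods return NaN for query points outside the range of x.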

Zheng-Liang Lu 278 / 332

Example

    clear; clc; close all;

    x = 0 : 4 * pi;
    y = sin(x) .* exp(-x / 5);
    xi = 0 : 0.1 : 4 * pi;
    y1 = interp1(x, y, xi, 'nearest');
    y2 = interp1(x, y, xi, 'linear');
    y3 = interp1(x, y, xi, 'pchip');
    y4 = interp1(x, y, xi, 'spline');
    plot(x, y, 'o', xi, y1, xi, y2, xi, y3, xi, y4);
    legend('Original', 'Nearest', 'Linear', 'Pchip', ...
        'Spline');

[Figure: the original samples together with the nearest, linear, pchip, and spline interpolants.]

Exercise

[Figure: the original points and the spline curve passing through them.]

    clear; clc; close all;

    x = [0 2 4 3 1 2 1];
    y = [4 1 1 4 5 2 0];
    t = 1 : length(x);                   % one parameter value per point
    tq = linspace(1, length(x), 100);    % query points within the range of t
    xx = interp1(t, x, tq, 'spline');
    yy = interp1(t, y, tq, 'spline');
    plot(x, y, 'o', xx, yy, '-');
    legend('Original', 'Spline');

• You may try various methods for interpolation.

Reconstructing a Surface

• In practice, we collect data from experiments.

• A common follow-up analysis is to find curves or surfaces that fit the observed data points.

• The function griddata interpolates the surface at the query points specified by (xq, yq) and returns the interpolated values.

• That is, griddata(x, y, v, xq, yq) fits a surface of the form v = f(x, y) to the scattered data in the vectors (x, y, v) and evaluates it at (xq, yq).

Example

    clear; clc; close all;

    x = 6 * rand(100, 1) - 3;
    y = 6 * rand(100, 1) - 3;
    z = peaks(x, y); % 100 sample points in total
    [xq, yq] = meshgrid(-3 : 0.1 : 3);
    zq = griddata(x, y, z, xq, yq, 'cubic');
    mesh(xq, yq, zq); hold on;
    plot3(x, y, z, '.', 'markersize', 16); axis tight;

• Note that the resulting figure differs from run to run, because the sample points are drawn with rand.
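Because the sample points come from rand, one way to make the figure repeatable is to seed the random number generator first. A minimal sketch, assuming an arbitrary seed of 1 (the seed value and the names x1 and x2 are illustrative only):

```matlab
rng(1);                      % fix the seed (the value 1 is arbitrary)
x1 = 6 * rand(100, 1) - 3;   % first draw of sample coordinates in [-3, 3]
rng(1);                      % reset the generator to the same seed
x2 = 6 * rand(100, 1) - 3;   % second draw repeats the first one
disp(isequal(x1, x2));       % displays 1 (true)
```

Calling rng with the same seed before the rand calls in the example would then produce the same scattered points on every run.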

[Figure: mesh of the interpolated surface together with the scattered sample points.]

3D Graph Viewpoint Specification

• The function view(az, el) sets the view angle from which an observer sees the current 3D plot.

  ◦ az: azimuth (horizontal) rotation
  ◦ el: vertical elevation

• Try some predetermined values for view.

• Also, you can use the Rotate 3D button in the figure or call rotate3d on.

    clear; clc;

    peaks;
    view([60, -15]); % degree
    colorbar; % appends a colorbar to the current axes
    colormap spring; % change the colormap
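As a sketch of the predetermined values mentioned above, view(2) and view(3) select the default 2D and 3D viewpoints, and calling view with two outputs reads the current angles back. The hidden figure is an assumption made so that the sketch also runs without a display.

```matlab
figure('Visible', 'off');   % hidden figure (assumption: no display needed)
surf(peaks);
view(2);                    % preset 2D view, looking straight down the z-axis
[az2, el2] = view;          % az2 = 0, el2 = 90
view(3);                    % preset 3D view
[az3, el3] = view;          % az3 = -37.5, el3 = 30
fprintf('view(2): az = %g, el = %g\n', az2, el2);
fprintf('view(3): az = %g, el = %g\n', az3, el3);
```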

[Figure: the Peaks surface after view([60, -15]), with axes x and y and a colorbar.]

colormap [3]

[3] The default colormap is parula since R2014b.

Exporting to Files

• From the menu of the figure window, you can save the figure as any of the supported picture file types.

• Alternatively, use the hot key ctrl + s.

• The function print(gh, '-type', 'fileName') saves the contents of the figure with handle gh (here, the current figure) with the specified file type and file name.

• For example,

    clear; clc;

    surf(peaks);
    print(gcf, '-djpeg', 'peaks.jpg');

• More optional arguments are listed in the documentation of print.
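As a sketch of two of those optional arguments, the following saves a PNG file at a chosen resolution. The output path and the 150 dpi value are assumptions picked for illustration.

```matlab
fig = figure('Visible', 'off');                 % hidden figure (assumption)
surf(peaks);
outFile = fullfile(tempdir, 'peaks_demo.png');  % hypothetical output path
print(fig, '-dpng', '-r150', outFile);          % PNG format at 150 dots per inch
close(fig);
disp(exist(outFile, 'file') == 2);              % displays 1 if the file was written
```

Here '-dpng' selects the file type in the same way as '-djpeg' in the example, and '-r150' sets the resolution.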

    >> Lecture 5
    >>
    >> -- User-Controlled Input and Output
    >>

American Standard Code for Information Interchange (ASCII) [5]

• Everything in the computer is encoded in binary.

• ASCII codes represent text in computers, communications equipment, and other devices that use text.

• ASCII is a character-encoding scheme, originally based on the English alphabet, that encodes 128 specified characters as 7-bit binary integers (see the next page).

• Unicode [4] became a standard for modern systems from 2007.

• Unicode includes ASCII.

[4] See Unicode 8.0 Character Code Charts.
[5] The first edition was published in 1963, followed by a major revision in 1967. See ASCII.
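A short sketch of the encoding in Matlab: double converts characters to their numeric codes and char converts codes back to characters. The sample characters are arbitrary.

```matlab
code  = double('A');              % 65, the ASCII code of 'A'
ch    = char(97);                 % 'a', the character with code 97
codes = double('Hi!');            % [72 105 33]
bits  = dec2bin(double('A'), 7);  % '1000001', the 7-bit pattern of 'A'
fprintf('%c = %d = %s\n', 'A', code, bits);
```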

[Figure: the ASCII code chart.]

Import Tool

• One can load data from a file with importdata(filename), which recognizes the common file extensions.

  ◦ For example, txt, csv, jpg, bmp, wav, avi, and xls. [6]

• Note that the file name must be a string.

  ◦ If the file extension is not recognized, importdata interprets the file as a delimited ASCII file by default.
  ◦ importdata('-pastespecial') loads data from the system clipboard rather than from a file.

• Try uiimport.

[6] See http://www.mathworks.com/help/matlab/import_export/supported-file-formats.html.

Example

    >> A = importdata('ngc6543a.jpg');
    >> image(A); % show image

[Figure: the image ngc6543a.jpg displayed with pixel axes.]

Example

• Use a text editor to create a space-delimited ASCII file with column headers like this:

    Day1 Day2 Day3 Day4 Day5 Day6 Day7
    35.627 48.483 35.94 41.978 42.941 48.429 37.958
    37.976 45.544 54.247 53.332 54.411 45.959 53.038
    45.23 47.361 54.34 51.759 44.33 40.981 51.937
    46.924 36.816 42.832 41.372 38.775 45.613 40.885
    45.632 36.362 40.214 51.419 49.265 44.252 44.048

    clear; clc;

    A = importdata('myfile01.txt', ' ', 1);
    for k = [3, 5]
        disp(A.colheaders{1, k}); % headers of columns
        disp(A.data(:, k)); % numeric data
    end

Access to Delimited Text Files

• dlmread(filename, delimiter) reads an ASCII-delimited file of numeric data.

• dlmwrite(filename, M, delimiter) writes the array M to the file using the specified delimiter to separate array elements.

  ◦ The default delimiter is the comma (,).
  ◦ dlmwrite(filename, M, '-append') appends the data to the end of the existing file.

• dlmread(filename, delimiter, R, C) reads data whose upper left corner is at row R and column C in the file.

  ◦ R and C start from 0.
  ◦ (R, C) = (0, 0) specifies the first element in the file.

Example

• Write to the file.

    >> M = gallery('integerdata', 100, [5 8], 0);
    >> dlmwrite('myfile.txt', M, 'delimiter', '\t');

• Read from the file.

    >> dlmread('myfile.txt', '\t')
    >> dlmread('myfile.txt', '\t', 2, 3)

textread

• textread is useful for reading text files with known formats.

• [A, B, C, ...] = textread(filename, format, N) reads data from the file filename into the variables (A, B, C, ...) using the specified format, format, for N lines.

• format determines the number and types of return arguments. (See next page.)

• Without N, textread reads until the end of file.

• The common conversions are as follows:

  ◦ %d: signed integer values.
  ◦ %f: floating-point values.
  ◦ %s: strings.

Example

• Create a file with the following lines:

    Sally Level1 12.34 45 Yes
    Arthur Level2 19.85 29 No

    >> [names, types, x, y, answer] = ...
        textread('mydata.dat', '%s %s %f %d %s') % normal usage

    >> [names, types, x, answer] = ...
        textread('mydata.dat', '%s Level%d %f %*d %s', 1) % check the difference!

• In %*d, the * skips the matching field, so no output variable is assigned for it.

Access to Excel Files

• xlsread(filename, sheet, xlRange) reads from the specified sheet and range.

  ◦ The variable sheet can be the sheet name [7] or the sheet number in the Excel file.
  ◦ The optional variable xlRange specifies the rectangular portion of the worksheet to read.
  ◦ For example, xlRange = 'B:B' is used to import column B.
  ◦ To read a single value, use xlRange = 'B1:B1'. [8]

• xlswrite(filename, A, sheet, xlRange) writes the array A to the specified range of the sheet.

[7] The default sheet name is “工作表1” in the Traditional Chinese edition of Excel (“Sheet1” in the English edition).
[8] Contribution by Mr. Tsung-Yu Hsieh (MAT24409) on August 27, 2014.

Example

    >> values = {1, 2, 3; 4, 5, 'x'; 7, 8, 9};
    >> headers = {'First', 'Second', 'Third'};
    >> xlswrite('myExample.xlsx', [headers; values]); % write

    >> subsetA = xlsread('myExample.xlsx', 1, 'B2:C3') % read

    subsetA =

         2     3
         5   NaN

Low-Level File I/O

• Low-level file I/O functions allow the most control over reading and writing data to files.

• However, these functions require you to specify more detailed information about the files than the easier-to-use high-level functions, such as importdata.

• When the high-level functions cannot import your data, consider using low-level file I/O.

• The normal procedure is as follows:

  1. Open a file.
  2. Read data from or write data to the file.
  3. Close the file.

Open Files

• The command f = fopen(filename, permission) is used to open a file, where

  ◦ filename is the name of the file,
  ◦ permission is the file access code, specified as a string, and
  ◦ the return value f is an integer file identifier (at least 3). [9]

• Note that you can create a new file with the permission code 'w'.

• If fopen cannot open the file, for example when the file does not exist and the permission is 'r', it returns -1.

[9] Matlab reserves 0, 1, and 2 for the standard input, the standard output, and the standard error, respectively.
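A minimal sketch of checking the identifier before using it. The second output of fopen carries a system error message; the file name below is hypothetical and is assumed not to exist.

```matlab
% Hypothetical file name, assumed not to exist on this machine.
fileName = fullfile(tempdir, 'no_such_file_for_demo_12345.txt');

[f, msg] = fopen(fileName, 'r');   % msg describes the failure, if any
if f == -1
    fprintf('Cannot open %s: %s\n', fileName, msg);
else
    fclose(f);                     % close the file if it was opened after all
end
```

Testing f against -1 right after fopen avoids confusing errors from later read or write calls.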

Permission Codes [10]

[10] See http://www.mathworks.com/help/matlab/ref/fopen.html
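The table on this slide is not reproduced here. As a sketch under that caveat, three of the codes listed in the fopen documentation ('w', 'a', and 'r') behave as follows; the file name is hypothetical.

```matlab
fileName = fullfile(tempdir, 'permission_demo.txt');  % hypothetical file name

f = fopen(fileName, 'w');      % 'w': create the file or discard old contents
fprintf(f, 'first line\n');
fclose(f);

f = fopen(fileName, 'a');      % 'a': append to the end of the file
fprintf(f, 'second line\n');
fclose(f);

f = fopen(fileName, 'r');      % 'r': open for reading
line1 = fgetl(f);
line2 = fgetl(f);
fclose(f);
disp({line1, line2});
```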

Example

• The function feof (short for “end-of-file”) returns 1 if a previous operation set the end-of-file indicator for the specified file.

• The function fgetl returns the next line of the specified file, removing the newline characters.

  ◦ If the line contains only the end-of-file marker, then the return value is -1.

    clear; clc;

    f = fopen('fgetl.m', 'r');
    while ~feof(f)
        disp(fgetl(f));
    end
    fclose(f);

Close Files

• fclose(f) closes the opened file referenced by f.

• fclose('all') closes all opened files.

• fclose returns a status of 0 when the close operation is successful.

• Otherwise, it returns -1.

fprintf

• Recall that fprintf(format, A1, ..., An) displays the results on the screen according to the given format.

  ◦ format: a format for the output fields; it should be a string.
  ◦ A1, ..., An: arrays for each output field.

• In fact, fprintf(f, format, A1, ..., An) applies the format to all elements of the arrays A1, ..., An in column order, and writes to the text file referenced by f.

• File identifiers 1 and 2 are reserved for the standard output and the standard error, respectively:

  ◦ fprintf(1, 'This is standard output!\n');
  ◦ fprintf(2, 'This is standard error!\n');

• Similarly, the function sprintf returns the formatted result as a string instead of writing it to a file.
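A small sketch of the column-order rule, using sprintf so that the result can be inspected as a string. The matrix A is an arbitrary example.

```matlab
A = [1 2; 3 4];
s = sprintf('%d ', A);           % elements are consumed column by column
disp(s);                         % 1 3 2 4
t = sprintf('%d and %d\n', A);   % the format is reused until A is exhausted
fprintf(1, '%s', t);             % writes to the standard output (identifier 1)
```

The format is applied repeatedly, so a format with two fields consumes the four elements of A in two passes.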

Example

    >> formatSpec = 'X is %4.2f meters or %8.3f mm.\n';
    >> fprintf(formatSpec, 9.9, 9900); % print on the screen

    X is 9.90 meters or 9900.000 mm.

• %4.2f specifies that the first value in each line of output is a floating-point number with a minimum field width of four characters (the decimal point counts as one), including two digits after the decimal point.

• Can you explain %8.3f?

Escape Characters

• %%: percent sign

• \\: backslash

• \b: backspace

• \n: new line

• \t: horizontal tab

Format Conversion

• Signed integer: %d

• Unsigned integer: %u

• Octal and hexadecimal integer

  ◦ %o: base 8
  ◦ %x: base 16

• Floating-point number

  ◦ %f: fixed-point notation
  ◦ %e: scientific notation, such as 3.141593e+00

• Text

  ◦ %c: single character
  ◦ %s: string
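The conversions above can be tried with sprintf, which returns the formatted text instead of printing it. The input values in this sketch are arbitrary.

```matlab
s1 = sprintf('%d', -42);             % '-42'
s2 = sprintf('%o', 8);               % '10' (base 8)
s3 = sprintf('%x', 255);             % 'ff' (base 16)
s4 = sprintf('%f', pi);              % '3.141593'
s5 = sprintf('%e', pi);              % '3.141593e+00'
s6 = sprintf('%8.3f', pi);           % '   3.142' (width 8, 3 decimals)
s7 = sprintf('%c|%s', 'A', 'text');  % 'A|text'
disp({s1, s2, s3, s4, s5, s6, s7});
```

The padded result in s6 shows that the field width counts every character, including the decimal point.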

Example

• The function findstr(S1, S2) returns the starting indices of any occurrences of the shorter string in the longer one. [11] Newer Matlab releases recommend strfind instead.

    function findit(f, pattern)
        f = fopen(f, 'r');
        cnt = 0; % line counter
        while ~feof(f)
            t = fgetl(f);
            cnt = cnt + 1;
            w = findstr(t, pattern);
            if ~isempty(w)
                fprintf('%d: %s\n', cnt, t); % line number and matching line
            end
        end
        fclose(f);
    end

[11] See Knuth–Morris–Pratt string searching algorithm (1974).

Exercise

• Write a program which writes a multiplication table into a file.

      1  2  3  4  5  6  7  8  9
      2  4  6  8 10 12 14 16 18
      3  6  9 12 15 18 21 24 27
      4  8 12 16 20 24 28 32 36
      5 10 15 20 25 30 35 40 45
      6 12 18 24 30 36 42 48 54
      7 14 21 28 35 42 49 56 63
      8 16 24 32 40 48 56 64 72
      9 18 27 36 45 54 63 72 81

    clear; clc;

    f = fopen('multiplicationTable.txt', 'w');
    for i = 1 : 9
        for j = 1 : 9
            fprintf(f, '%3d', i * j);
        end
        fprintf(f, '\n');
    end
    fclose(f);

Access to Internet [12]

• urlread(URL, name, value) returns the contents of a URL.

    A = urlread('http://www.csie.ntu.edu.tw/~d00922011/matlab.html');
    f = fopen('matlab.html', 'w');
    fprintf(f, '%s', A); % write the page source to the file
    fclose(f);
    dos('start matlab.html'); % open the file (Windows only)

• Try sendmail, ftp.

[12] See http://www.mathworks.com/help/matlab/internet-file-access.html.

Example: Yahoo Finance API

• Current market and historical data from the Yahoo! data server

• Blog: “研究雅虎股票API” (“Studying the Yahoo stock API”)

• Google: yahoo-finance-managed

• Historical Stock Data downloader by Josiah Renfree (2008)

    >> Lecture 6
    >>
    >> -- String and Regular Expressions
    >>

Characters and Strings

• A character array is a sequence of characters, just as a numeric array is a sequence of numbers.

• A string array is a container for pieces of text, providing a set of functions for working with text as data.

Example: Caesar Cipher [14]

• Write a program which implements the Caesar cipher algorithm to encrypt an input string.

• The cipher alphabet is the plain alphabet rotated left or right by an integer, called the shifter.

    Input: plain text x and an integer shifter
    Output: cipher text y

• Note that every character is encoded in binary. [13]

[13] See https://en.wikipedia.org/wiki/ASCII.
[14] See https://en.wikipedia.org/wiki/Caesar_cipher.

• It performs the following actions in order on the input string x:

  ◦ Convert all letters to uppercase, since classical Latin had one case.
  ◦ For letters with even ASCII values, perform a Caesar shift using the given shift number.
  ◦ For letters with odd ASCII values, perform a Caesar shift using the negative of the shift number.
  ◦ Replace all instances of 'J' with 'I' and 'U' with 'V', since classical Latin had no J's or U's (Julius Caesar was written as IVLIVS CAESAR).
  ◦ Concatenate the number of consonants in the resulting string with the output string.

• You may use the function length to calculate the number of characters and mod(a, b) to calculate the remainder after a is divided by b.
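As a starting point only, here is a sketch of the uppercase conversion and a plain Caesar shift with mod. It applies the same shift to every letter, so the even/odd rule, the J and U replacement, and the consonant count are deliberately left out. The sample text and shift are assumptions.

```matlab
x = 'Hello, World';      % sample plain text (an assumption for illustration)
shifter = 3;             % sample shift

y = upper(x);                                % convert all letters to uppercase
isLetter = (y >= 'A') & (y <= 'Z');          % only letters are shifted
offset = double(y(isLetter)) - double('A');  % letter positions 0 .. 25
y(isLetter) = char(mod(offset + shifter, 26) + double('A'));
disp(y);                                     % KHOOR, ZRUOG
```

Because mod returns a non-negative result for a positive divisor, the same line also handles negative shift values.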

Introduction

• A regular expression, also called a pattern, is an expression used to specify a set of strings required for a particular purpose. [15]

• Such patterns are widely used by string searching algorithms for "find" or "find and replace" operations on strings.

[15] See https://en.wikipedia.org/wiki/Regular_expression.

Example

    >> str = 'bat cat can car coat court CUT ct CAT-scan';
    >> expression = 'c[aeiou]+t';
    >> startIndex = regexp(str,expression)

    startIndex =

         5    17

• The regular expression 'c[aeiou]+t' specifies this pattern:

  ◦ c must be the first character.
  ◦ c must be followed by one of the characters inside the brackets, [aeiou].
  ◦ The bracketed pattern must occur one or more times, as indicated by the + operator.
  ◦ t must be the last character, with no characters between the bracketed pattern and the t.
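To see which substrings the pattern actually matches, the same call can be repeated with extra outputs. This sketch reuses the string from the example and adds regexpi, the case-insensitive variant, for comparison.

```matlab
str = 'bat cat can car coat court CUT ct CAT-scan';
expression = 'c[aeiou]+t';

[startIndex, endIndex] = regexp(str, expression);  % [5 17] and [7 20]
matchStr = regexp(str, expression, 'match');       % {'cat', 'coat'}
matchAny = regexpi(str, expression, 'match');      % also matches 'CUT' and 'CAT'
disp(matchStr);
disp(matchAny);
```

The words court and ct are not matched: court has an r between the vowels and the t, and ct has no vowel at all.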

Formalisms

    Operator   Definition
    |          Boolean OR
    *          0 or more times consecutively
    ?          0 times or 1 time
    +          1 or more times consecutively
    {n}        exactly n times consecutively
    {m,}       at least m times consecutively
    {,n}       at most n times consecutively
    {m,n}      at least m times, but no more than n times consecutively
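A few of these operators applied to one test string, as a sketch. The strings are made up for illustration.

```matlab
str = 'ab abb abbb';
m1 = regexp(str, 'ab?', 'match');      % {'ab', 'ab', 'ab'}
m2 = regexp(str, 'ab+', 'match');      % {'ab', 'abb', 'abbb'}
m3 = regexp(str, 'ab{2}', 'match');    % {'abb', 'abb'}
m4 = regexp(str, 'ab{2,}', 'match');   % {'abb', 'abbb'}
m5 = regexp('cat dog cow', 'cat|cow', 'match');  % {'cat', 'cow'}
disp(m4);
```

Each quantifier applies to the element just before it, here the single character b.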

    Operator    Definition
    .           any single character, including white space
    [c1c2c3]    any character contained within the brackets
    [^c1c2c3]   any character not contained within the brackets
    [c1-c2]     any character in the range of c1 through c2
    \s          any white-space character
    \w          a word character: any alphabetic, numeric, or underscore character
    \W          any character that is not a word character
    \d          any numeric digit; equivalent to [0-9]
    \D          any character that is not a numeric digit; equivalent to [^0-9]
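The character classes can be checked on a short string, as in this sketch. The string is an arbitrary example.

```matlab
str = 'Room 101, Floor 7';
nums    = regexp(str, '\d+', 'match');        % {'101', '7'}
words   = regexp(str, '[A-Za-z]+', 'match');  % {'Room', 'Floor'}
caps    = regexp(str, '[A-Z]');               % [1 11]
spaces  = regexp(str, '\s');                  % [5 10 16]
nonWord = regexp(str, '\W', 'match');         % {' ', ',', ' ', ' '}
disp(nums);
```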

Example

    >> str = {'Madrid, Spain', 'Romeo and Juliet', ...
        'MATLAB is great'};
    >> capExpr = '[A-Z]';
    >> spaceExpr = '\s';

    >> capStartIndex = regexp(str, capExpr)
    >> spaceStartIndex = regexp(str, spaceExpr)

Keywords for Outputs

    Output Keyword   Returns
    'start'          starting indices of all matches (the default output)
    'end'            ending indices of all matches
    'match'          text of each substring that matches the pattern
    'tokens'         text of each captured token
    'split'          text of nonmatching substrings
    'names'          name and text of each named token
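Several keywords can be requested in one call, and the outputs come back in the order in which the keywords are listed. A sketch with made-up input strings:

```matlab
str = 'key=value';
expr = '(\w+)=(\w+)';

[s, e] = regexp(str, expr, 'start', 'end');   % s = 1, e = 9
tok = regexp(str, expr, 'tokens');            % {{'key', 'value'}}
parts = regexp(str, '=', 'split');            % {'key', 'value'}
[m, rest] = regexp('a1b22c', '\d+', 'match', 'split');
% m = {'1', '22'}, rest = {'a', 'b', 'c'}
disp(tok{1});
```

Each pair of parentheses in the expression defines one token, so tok holds one cell per match with two tokens inside.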

Example: Match

    >> str = 'EXTRA! The regexp function helps you relax.';
    >> expression = '\w*x\w*';
    >> matchStr = regexp(str, expression, 'match')

Example: Split

    >> str = 'Split ^this text into ^several pieces';
    >> expression = '\^';
    >> splitStr = regexp(str, expression, 'split')

• You may try the function strtok [16], which splits the input string into a token and the rest of the original string.

[16] See https://www.mathworks.com/help/matlab/ref/strtok.html.
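A brief sketch of strtok with the default white-space delimiter and with an explicit one. The input strings are illustrative.

```matlab
[tok, rest] = strtok('Split this text');   % tok = 'Split', rest = ' this text'
[tok2, rest2] = strtok('a^b^c', '^');      % tok2 = 'a',    rest2 = '^b^c'
fprintf('[%s] [%s]\n', tok, rest);
fprintf('[%s] [%s]\n', tok2, rest2);
```

Note that the remainder keeps the delimiter that ended the token, so repeated calls on the remainder walk through the string one token at a time.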

Exercise: Web Crawler

• Write a script which collects the names of HTML tags by defining a token within a regular expression.

• For example,

    >> str = '<title>My Title</title><p>Here is some text.</p>';
    >> expression = '<(\w+).*>.*</\1>';
    >> [tokens, matches] = regexp(str, expression, ...
        'tokens', 'match')

Example: Names

• You can associate names with tokens so that they are more easily identifiable.

• For example,

    >> str = 'Here is a date: 01-Apr-2020';
    >> expr = '(?<day>\d+)-(?<month>\w+)-(?<year>\d+)';
    >> mydate = regexp(str, expr, 'names')

    mydate =

          day: '01'
        month: 'Apr'
         year: '2020'

More Relevant Functions

• Check contains, regexpi, regexprep, regexptranslate, replace, strfind, strjoin, strrep, and strsplit.

• See the following links:

  ◦ https://www.mathworks.com/help/matlab/ref/regexp.html
  ◦ https://en.wikipedia.org/wiki/Regular_expression
  ◦ https://regexone.com/
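A quick tour of some of the functions listed above, as a sketch on an arbitrary sentence. It assumes a release that provides contains (R2016b or later).

```matlab
s = 'The quick brown fox';

hasFox = contains(s, 'fox');       % true (assumes R2016b or later)
idx = strfind(s, 'o');             % [13 18], positions of the letter o
r1 = strrep(s, 'quick', 'slow');   % 'The slow brown fox'
r2 = regexprep(s, '\s+', '_');     % 'The_quick_brown_fox'
w = strsplit(s, ' ');              % {'The', 'quick', 'brown', 'fox'}
joined = strjoin(w, '-');          % 'The-quick-brown-fox'
disp(joined);
```

strrep and strfind work with literal text, while regexprep accepts a regular expression as its search pattern.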
