String Processing in C C Programming and Software Tools N.C. State Department of Computer Science
Standard Library: <ctype.h>
• Many functions for checking whether a character is a digit, is upper case, … – isalnum(c), isalpha(c), isspace(c),…
• Also, functions for converting to upper case and converting to lower case – toupper(c), tolower(c), …
• Argument is an int and return is an int
– Works fine with unsigned chars or 7-bit character types
– Need to cast to unsigned char for safety
CSC230: C and Software Tools © NC State University Computer Science Faculty 2
<ctype.h> (cont’d)
Checking:
isalnum(c) c is a letter or a digit
isalpha(c) c is a letter
isdigit(c) c is a decimal digit
islower(c) c is a lower-case letter
isspace(c) c is white space (space, \f, \n, \r, \t, \v)
isupper(c) c is an upper-case letter
Converting:
tolower(c) convert c to lower case
toupper(c) convert c to upper case
Only a partial list (see p. 612-613 or library for full list)
Strings
• Simply 1-D arrays of type char, terminated by null character ('\0')
• A variety of standard library functions provided for processing
scanf() and printf() for Strings
• sscanf(s, "…", …) scans a string (instead of stdin) for expected input
• sprintf(s, "…", …) outputs to a string (instead of stdout) the specified output
Standard Library: <string.h>
• Lots of string processing functions for
– copying one string to another
– comparing two strings
– determining the length of a string
– concatenating two strings
– finding a substring in another string
– …
• Function headers at end of slides
• More details in King text book (Section 23.6)
A Useful Memory Operation: memcpy()
• Must #include <string.h>
• Syntax: void *memcpy(void *dest, const void *src, size_t n)
– Note the order: destination first, then source!
• Copy n bytes from memory pointed to by src to memory pointed to by dest – memory areas must not overlap!
• Returns pointer to dest
memcpy() (cont’d)
• Since C does not have an operator to assign one array to another, this is a handy function
#define SZ 1000
int A[SZ], B[SZ];
/* … assign some values to A … */
memcpy(B, A, SZ * sizeof(int));
Variant: memmove()
• memmove() works just like memcpy(), except src and dest areas may overlap
Another Useful Operation: memcmp()
• Syntax: int memcmp(const void *s1, const void *s2, size_t n)
• Returns 0 if n bytes starting at s1 are equal to n bytes starting at s2
• Else, return val < 0 if first non-equal byte of s1 < byte of s2, > 0 if …
• Useful for comparing arrays, but byte-by-byte comparison only
– e.g., don't use for comparing arrays of ints, floats, structs, etc.
memcmp()... (cont'd)
char X[1000], Y[1000];
int A[1000], B[1000];
/* … assign some values to A, B, X, Y … */
if (memcmp(X, Y, 1000) < 0) {
    /* … X is less than Y … */
}
Do not try this as-is with A and B; why not?
String function summary

| Raw memory | String | String with limit | Purpose |
|---|---|---|---|
| memcpy, memmove (1) | strcpy | strncpy | Copy |
| - | strcat | strncat | Concatenate (append) strings |
| memcmp | strcmp | strncmp | Compare |
| memchr | strchr, strrchr (2) | - | Find a char |

| String | Purpose |
|---|---|
| strspn, strcspn, strpbrk | Find any of a set of chars in a string |
| strstr | Find one string within another |
| strtok | Split a string into tokens |
| strlen | Find the length of a string |

| Raw mem | Purpose |
|---|---|
| memset | Fill a block of memory |

(1) Allows overlapping memory
(2) Reverse (right-to-left) search
Reference: http://www.cplusplus.com/reference/cstring/
Danger zone (1)
• What’s wrong with this?
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
int main(int argc, char* argv[]) {
char filename[32];
strcpy(argv[1],filename);
printf("Opening %s...\n",filename);
// more code goes here
return 0;
}
Danger zone (2)
• What’s wrong with this, then?
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
int main(int argc, char* argv[]) {
char filename[32];
strcpy(filename,argv[1]);
printf("Opening %s...\n",filename);
// more code goes here
return 0;
}
Safety zone (1)
• The common way to fix this
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
int main(int argc, char* argv[]) {
char* filename;
filename = argv[1];
printf("Opening %s...\n",filename);
// more code goes here
return 0;
}
Safety zone (2)
• If you absolutely need a copy of the string.
#include <stdlib.h>
#include <string.h>
#include <stdio.h>
int main(int argc, char* argv[]) {
char* filename = malloc(strlen(argv[1])+1);
strcpy(filename,argv[1]);
printf("Opening %s...\n",filename);
// more code goes here
return 0;
}
Good Practice
• You should be able to write the code for any of the standard library functions
– e.g., computing the length of a string…
char s[1000] = "a string";
char *p = s;
while (*p)   /* advance only while not at the null character; */
    p++;     /* while (*p++) would step one past it and give length + 1 */
return (p - s);
<stdlib.h> String Functions
• double atof(const char *s) converts a string to a double, ignoring leading white space
• int atoi(const char *s) converts a string to an int, ignoring leading white space
– These don’t return information about errors. That sucks.
• Could also use
– strtol, strtod/f: fine, but error reporting is a little complicated.
– sscanf: nicest, but expensive.
Arrays of Strings
• Creating a two dimensional array of chars is inefficient
– Wasted space when strings of different lengths
• Instead we want a ragged array
– Create an array where the elements are pointers to strings
char *planets[] = {"Mercury",
"Venus", "Earth", "Mars", "Jupiter",
"Saturn", "Uranus", "Neptune"};
Arrays of Strings (cont’d)
• Accessing a string in the array – planets[i]
• Accessing a character in a string – planets[i][j]
Example:
for (int i = 0; i < 8; i++)
    if (planets[i][0] == 'M')
        printf("%s\n", planets[i]);
Reminder: Go to course web page for link to exercise form. Paste code into ideone.com and submit the link.
Exercise 15a
• Make a function that does this:
Upper-case-ify
#include <stdio.h>

void uppercaseify(char* c) {
    // YOUR CODE HERE
}
int main() {
    char s[] = "Hey everyone!";
    printf("%s\n",s); // Hey everyone!
    uppercaseify(s);
    printf("%s\n",s); // HEY EVERYONE!
}
Pro-mode: Don’t use any brackets in uppercaseify
HERE’S A BUNCH OF FUNCTION PROTOTYPES YOU CAN READ YOURSELF
Better yet, read the manpages, or a C reference library like cplusplus.com.
<string.h>: Copying
• void *memcpy(void * restrict s1, const void * restrict s2, size_t n);
• void *memmove(void *s1, const void *s2, size_t n);
• char *strcpy(char * restrict s1, const char * restrict s2);
• char *strncpy(char * restrict s1, const char * restrict s2, size_t n);
<string.h>: Concatenation
• char *strcat(char * restrict s1, const char * restrict s2);
• char *strncat(char * restrict s1, const char * restrict s2, size_t n);
<string.h>: Comparison
• int memcmp(const void *s1, const void *s2, size_t n);
– Compares up to n bytes
• int strcmp(const char *s1, const char *s2);
– Stops when it reaches the null character in either string
• int strcoll(const char *s1, const char *s2);
– Locale dependent
• int strncmp(const char *s1, const char *s2, size_t n);
– Stops when it reaches the null character in either string or after n comparisons, whichever comes first
<string.h>: Search
• void *memchr(const void *s, int c, size_t n);
– Like strchr, but stops searching after n bytes (and does not stop at a null character)
• char *strchr(const char *s, int c);
– Searches a string for a particular character
– Use pointer arithmetic to find additional occurrences
• size_t strcspn(const char *s1, const char *s2);
– Index of first character in s1 that’s in the set s2 (the length of s1 if there is none)
• char *strpbrk(const char *s1, const char *s2);
– Pointer to leftmost character in s1 that matches any character in s2
<string.h>: Search
• char *strrchr(const char *s, int c);
– Searches string in reverse order
• size_t strspn(const char *s1, const char *s2);
– Index of first character that’s NOT in the set s2
• char *strstr(const char *s1, const char *s2);
– Pointer to first occurrence of s2 in s1
• char *strtok(char * restrict s1, const char * restrict s2);
– Scans s1 for the next non-empty sequence of characters that are not in s2
– Use to tokenize strings
<string.h>: Other Functions
• void *memset(void *s, int c, size_t n);
– Stores a copy of c in each of the first n bytes of the memory area s
• size_t strlen(const char *s);
– Length of the string, not counting the null character
Command Line Arguments
• To use command line arguments, define main as:
int main(int argc, char *argv[]) {}
– argc: argument count
• Includes the program itself
– argv: argument vector
• Array of pointers to command line arguments stored as strings
• argv[0]: name of program
• argv[1]-argv[argc-1]: other arguments
• argv[argc]: null pointer
Processing Command Line Args
• Using arrays
for (int i = 1; i < argc; i++)
    printf("%s\n", argv[i]);
• Using pointers
for (char **p = &argv[1]; *p != NULL; p++)
    printf("%s\n", *p);