The Relations between Document Familiarity, Frequency, and Prevalence and Document Literacy Performance among Adult Readers
Author(s): Dale J. Cohen and Jessica L. Snowden
Source: Reading Research Quarterly, Vol. 43, No. 1 (Jan. - Mar., 2008), pp. 9-26
Published by: International Reading Association
Stable URL: http://www.jstor.org/stable/20068327


The Relations Between Document Familiarity, Frequency, and Prevalence and Document Literacy Performance Among Adult Readers

Dale J. Cohen

University of North Carolina Wilmington, USA

Jessica L. Snowden

University of Nebraska-Lincoln, USA

This study assessed the utility of document prevalence and familiarity as predictors of adult document literacy performance. Three indexes (quantifying document prevalence, document familiarity, and the frequency of document use) were constructed using survey responses from an adult community sample and documents collected from government agencies and businesses. All three indexes significantly predicted document task performance on the 1992 National Adult Literacy Survey and the 2003 National Assessment of Adult Literacy, both of which were administered by the U.S. Department of Education. The three indexes, as individual predictors, accounted for 70% (familiarity), 51% (frequency of use), and 31% (prevalence) of the variation in document task performance. Document familiarity may aid in the search and retrieval of information from documents, thereby facilitating document literacy.

Reading Research Quarterly 43(1), pp. 9-26. dx.doi.org/10.1598/RRQ.43.1.2 © 2008 International Reading Association

A document is a symbolic display of information that does not consist predominantly of written prose (Guthrie, Weber, & Kimmerly, 1993).¹ Whereas prose materials are principally read to entertain or educate, documents are primarily used to obtain or distribute information (Guthrie, Seifert, & Kirsch, 1986). Documents distribute information efficiently by presenting information in a format that can be searched without reading the entire content (Guthrie & Kirsch, 1987; Guthrie & Mosenthal, 1987; Mosenthal & Kirsch, 1992). Although documents often contain text, the text is usually "noncontinuous" (White & Dillow, 2005, p. 4). Examples of documents include diagrams, charts, graphs, tables, forms, lists, floor plans, blueprints, and maps. In this article, we address two basic issues associated with document processing: (1) the types of documents that are encountered in society and with what frequency and (2) the relation between document familiarity and document literacy.

Document literacy is a core component of an individual's ability to function in modern society. It is essential for effective participation in financial transactions (e.g., filling out checks, deposit slips, and loan applications and comprehending bills and benefit statements from insurance companies), promotion of health and well-being (e.g., understanding nutritional information and risk and dosage information on food and pharmaceutical packaging, respectively), and engaging in transportation and leisure activities (e.g., deciphering bus and television schedules, airport arrival and departure listings, and sports results). Because of the importance and pervasiveness of documents, an inability to use them effectively can dramatically inhibit societal participation.

Guthrie et al. (1986) revealed the ubiquity of documents in society. The authors randomly selected and interviewed 102 adult wage earners with different occupations in a town of 6,000 households. The occupations held by those in the sample were clerical/small business (34%), skilled (29%), unskilled (19%), and professional (18%). The average amount of time that these individuals spent reading documents each day (99 minutes) far exceeded the daily time spent reading any other form of written material such as reference (32 minutes), fiction/viewpoint (32 minutes), society/science (21 minutes), news/business (16 minutes), or recreation (14 minutes). In addition, documents were read more at work than anywhere else.

Given the centrality of documents in modern society, the U.S. government devotes considerable resources to the measurement of the U.S. population's ability to use documents successfully. The 2003 National Assessment of Adult Literacy (NAAL; see Kutner, Greenberg, & Baer, 2005; White & Dillow, 2005) and its predecessor, the 1992 National Adult Literacy Survey (NALS; see Kirsch, Jungeblut, Jenkins, & Kolstad, 2002), constitute the most comprehensive measures of adult literacy to date in the United States. These assessments measure three types of literacy: prose, quantitative, and document. The inclusion of document literacy in these assessments reflects the federal government's determination that it is as important as prose and quantitative literacy to the successful functioning of U.S. adults. In 2003, only 13% of the adults surveyed in the NAAL displayed proficient document literacy (with proficient defined by the National Center for Education Statistics as the ability to "perform more complex and challenging literacy activities"; Kutner et al., 2005, p. 3).

Several researchers have proposed that readers' familiarity with a document type is likely associated with how efficiently they process documents of that type (Guthrie, Britten, & Barker, 1991; Guthrie & Mosenthal, 1987; Kirsch & Mosenthal, 1990; Mosenthal & Kirsch, 1998; Shah & Hoeffner, 2002; Shah, Mayer, & Hegarty, 1999; Winn, 1993). In this article, we use the term document type to refer to a set of documents that have similar standardized formats. The critical issue is that all exemplars of a specific document type adhere to the same structural and graphical conventions. Familiarity likely predicts performance on any task; however, because documents make extensive use of signaling conventions, familiarity with specific document types may be more predictive of the effective use of documents than other, more general predictors of literacy such as reading comprehension ability or education. Signaling conventions are the customary use of spatial and graphical arrangements to indicate associations between pieces of information; that is, most documents follow specific spatial and graphical conventions that dictate where important information can be found, independent of the document content. Signaling conventions often serve as substitutes for explanatory text and are generally not explained in the document. To illustrate, a typical form may contain the following:

    Date ____________

This text should be interpreted as "Write today's date on the following underline." Here, the graphical convention of an underline symbolically identifies the place where the reader should insert information, and the spatial convention of placing a word adjacent to that underline symbolically identifies the word as indicating the type of information to be inserted. Successful interpretation of any document relies almost exclusively on the reader's knowledge of the signaling conventions of that document rather than on the document's actual content or the reader's reading comprehension ability. Thus, even people who are highly literate may misinterpret specific document types simply because they are not familiar with the signaling conventions associated with those document types.

Unfortunately, little research has directly addressed the role of familiarity in general document literacy. Spratt, Seckinger, and Wagner (1991) provided limited evidence for the influence of familiarity on document use. These researchers tested the document literacy skills of Moroccan third through sixth graders, whose performance was scored as correct or incorrect across 14 common household tasks. These tasks (e.g., interpreting dosage from a medicine box, reading an electric bill) required familiarity with document structure and content. The study showed that children who had previous exposure to similar documents performed the tasks more successfully than children who had not had such exposure.

One reason for the limited research addressing the relation between document familiarity and effective document use might be the difficulties associated with measuring such a relation. To determine the relation between familiarity with specific document types and effective document use, researchers must (a) know the types and frequency of documents found in a society, (b) have a general measure of relative familiarity with each document type from a large and representative sample, and (c) have a general measure of individuals' abilities to use different document types from a large and representative sample. This article attempts to satisfy these three criteria and to assess the relations between document familiarity, document frequency, document prevalence, and document literacy.

To gain knowledge of the relative prevalence of different document types, we collected and indexed a large sample of documents from the print and electronic media. Any such sampling is difficult to conduct and inevitably imperfect. Nevertheless, obtaining even an imperfect document sample can be a valuable first step: Such a sample provides information on the frequency with which documents may be encountered in society and, accordingly, can serve as the foundation for predictions of document literacy based solely on a reader's familiarity with specific document types.


It is unlikely, however, that the distribution of various kinds of documents in society correlates perfectly with the average familiarity of adults with each kind of document. For example, although financial tables abound in the media (e.g., the business sections of major daily newspapers around the world contain numerous financial tables), the average reader may not use them. Therefore, we also assessed perceived familiarity with document types within a large, representative sample of adults. From these data, we developed three document indexes: the first based on the relative frequency of each document type in the document sample, the second on adults' self-reported familiarity with each document type, and the third on adults' self-reported frequency of use of each document type. Finally, we assessed the relations between these three measures of document familiarity and performance percentiles on document literacy tasks from the NALS and the NAAL.

Method

Participants

Adult participants (aged 18 years or above) recruited from two North Carolina, USA, Division of Motor Vehicles (DMV) offices, in Raleigh and Wilmington, were asked to serve as paid volunteers in the present study. The DMV offices were chosen as the data collection locations because (a) a diverse and representative adult population was found at these locations, (b) adults at these offices often had time to complete a survey while waiting for DMV services, and (c) recruiting participants at these locations was economically feasible.

Of the 660 people approached, 548 agreed to participate in the study (a refusal rate of 17%; 67 female, 45 male). The main reason for refusal was lack of interest (43%), followed by inability to read or write English (28%), lack of time (18%), a medical condition that would have impaired participation (6%), and the need to supervise children (5%). An 83% acceptance rate exceeds the acceptance rates of most representative surveys (Hochstim, 1967; Locander, Sudman, & Bradburn, 1976; Siemiatycki, 1979).

Of those who participated, 76 (13.9%) were removed for administrative reasons. Specifically, 3.3% were removed because they failed to respond to at least five items in the survey, and 10.6% were removed for lack of variability in their responses (i.e., they provided the same response to more than 25% of the survey questions). The data from the remaining 472 (86.1%) participants were analyzed.

To assess how well the sample demographics matched those of the U.S. and North Carolina (NC) adult populations, we compared the demographic characteristics reported by participants to those found in the U.S. 2000 Census and the NC 2000 census data (United States Census Bureau, 2005a, 2005b) using either chi-square or z tests. The chi-square is an omnibus test; therefore, if just one level of a variable is significantly different from the expected values (here the census values), the chi-square value will be significant. If the omnibus chi-square value was significant, then we calculated posthoc chi-square tests to discover which level(s) differed significantly from the expected value(s). To compensate for alpha inflation, the chi-square tests at each level were Bonferroni corrected (alpha = 0.025 for two-level chi-square tests; alpha = 0.017 for three-level chi-square tests; etc.). For inferential tests, the alpha level was held at .05. See Table 1 for a comparison of the demographic characteristics of the sample, U.S., and NC populations.

Our sample demographics corresponded closely with those of the U.S. and NC populations. Nevertheless, because of the large N, statistical differences in some sample demographics were found between the sample and the U.S. and NC populations. In these cases, we weighted the variable appropriately and tested whether the small differences found between the sample demographics and those of the larger U.S. and NC populations affected our results in a meaningful way (see Results, following).
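The comparison procedure described above can be illustrated with a minimal sketch. It assumes a chi-square goodness-of-fit test of the sample age counts in Table 1 against the U.S. census proportions; the function names are hypothetical, and the authors' exact computations (including how the per-level post hoc tests were formed) are not given in the text. For a three-level variable the omnibus test has 2 degrees of freedom, for which the upper-tail p value has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

def bonferroni_alpha(alpha, n_tests):
    """Per-test alpha after a Bonferroni correction (alpha divided by the number of tests)."""
    return alpha / n_tests

def chi_square_gof(observed, expected_props):
    """Chi-square goodness-of-fit statistic of observed counts against expected proportions."""
    total = sum(observed)
    # Published proportions are rounded and may not sum to exactly 1, so renormalize.
    scale = sum(expected_props)
    stat = 0.0
    for obs, prop in zip(observed, expected_props):
        expected = total * prop / scale
        stat += (obs - expected) ** 2 / expected
    return stat

def p_value_df2(stat):
    """Upper-tail p value for a chi-square statistic with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2.0)

# Age counts from Table 1 (Under 35, 35-60, Over 60) and U.S. census proportions.
age_counts = [240, 213, 12]
us_age_props = [0.3205, 0.4610, 0.2183]

stat = chi_square_gof(age_counts, us_age_props)
p = p_value_df2(stat)
print(f"omnibus chi-square = {stat:.1f}, p = {p:.3g}")
print(f"post hoc alpha, two levels   = {bonferroni_alpha(0.05, 2):.3f}")
print(f"post hoc alpha, three levels = {bonferroni_alpha(0.05, 3):.3f}")
```

The corrected alphas reproduce the values stated in the text (0.025 and 0.017), and the omnibus p value for age falls below 0.001, in line with Table 1.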

Materials

Document Sample

First, it is important to clarify some terms. We use document element to refer to the smallest meaningful structural component of a document. Document elements may form a specific exemplar of a given document in isolation (e.g., a two-dimensional bar graph is a specific type of graph and may meaningfully exist in isolation) or in combination with other document elements (e.g., an electric bill may contain a calendar element, a bar graph element, and several list elements). Document elements constitute the building blocks of documents; they represent a constellation of features that may or may not be present in a given example of a document. Document category refers to a broad categorization of the document elements or their combinations (e.g., forms, lists, graphs). Finally, document refers to the combination of one or more document elements on a single page (e.g., a financial report containing several graphs, lists, and tables). More complex documents usually contain multiple document elements. For example, a simple form may contain only checkboxes, whereas a complex form may contain checkboxes, intersected table forms, left-labeled lines, and so on.

Table 1. Comparison of Sample Demographics With Those of North Carolina and the United States

Variable                      Count   Sample       U.S.         NC           Sample vs. U.S.   Sample vs. NC
                                      proportion   proportion   proportion   omnibus p value   omnibus p value

Age                                                                          < 0.001           < 0.001
  Under 35                    240     51.61%       32.05%*      33.09%*
  35-60                       213     45.81%       46.10%       45.70%
  Over 60                     12      2.58%        21.83%*      21.10%*
Education                                                                    < 0.001           < 0.001
  Less than HS diploma        32      6.85%        19.60%*      21.80%*
  Only HS diploma             269     57.60%       55.90%       55.70%
  At least bachelor's degree  166     35.55%       24.40%*      22.50%*
Sex                                                                          0.16              0.17
  Female                      248     53.22%       50.90%       51.00%
  Male                        218     46.78%       49.10%       49.00%
Hispanic Ethnicity                                                           < 0.001           0.34
  No                          404     95.73%       87.50%*      95.30%
  Yes                         18      4.27%        12.50%*      4.70%
Income                                                                       < 0.001           0.0035
  0-24K                       165     35.71%       28.60%*      30.70%
  25-49K                      157     33.98%       29.30%       31.60%
  50K+                        140     30.30%       42.00%*      37.70%*
Race                                                                         < 0.001           < 0.001
  White                       294     63.23%       78.39%*      73.57%*
  Black                       150     32.26%       12.84%*      22.04%*
  Other                       21      4.52%        8.64%*       4.37%

Note. The source for the U.S. and NC data is the United States Census Bureau (2005a, 2005b). HS: high school; K: thousands of U.S. dollars.
*p < .05.

Because there was no published U.S. document census, we conducted a systematic document sample. During November 2004, we collected all printed material that contained information presented in a nonprose format (e.g., tables, maps, diagrams, lists, graphs) from the following sources: (a) the 10 magazines with the highest U.S. circulation (Better Homes and Gardens, Family Circle, Good Housekeeping, Ladies' Home Journal, National Geographic, The New York Times Style Magazine, Reader's Digest, Time Magazine, TV Guide, and Woman's Day) and (b) the first two levels of the 10 websites with the most unique visitors as ranked by ranking.com (msn.com, passport.com, yahoo.com, passport.net, google.com, trafficmp.com, microsoft.com, tickle.com, aol.com, geocities.com, and ebay.com). Further, for one week, we collected the five newspapers with the highest U.S. circulation (the Los Angeles Times, The New York Times, USA Today, The Wall Street Journal, and The Washington Post). During the same month, we also systematically collected documents from NC public service agencies (e.g., the DMV; the Wilmington Housing Authority; and New Hanover County government offices, including the Department of Social Services, the Tax Administration, and the Health Department). We visited the local offices of public service agencies and obtained one of each piece of literature in the lobby area. From those agencies that provided a response, we collected additional written materials noted as being commonly used and referenced (e.g., application forms for Section 8 housing, the U.S. Department of Housing and Urban Development's Housing Choice Voucher program).

In addition, two graduate research assistants, both of whom had undergraduate psychology backgrounds, collected the documents they encountered during a two-week period. This student collection was intended to produce personal documents (e.g., receipts, bills) that would supplement the document types obtained from government and public service agencies. All duplicate documents were removed, and a representative sample of the remaining documents was scanned into a computer. The document sample resulted in scans of 3,118 unique document elements.

We categorized these document elements into 10 broad document categories: bar graphs, line graphs, pie charts, diagrams, maps, lists, tables, forms, bills and receipts, and Internet elements. These categories were then further subclassified according to subcomponent features, which may or may not have been present in any particular example of a document. The subcomponent features of the document categories are referred to as document elements. We subdivided the broad document categories with the intention of having 50-100 different document elements in total. (For example, the document category "Tables" was divided into six subcomponent document elements: intersected, borderless, feature, schedule, split, and diagonal.) We felt that this would be a manageable number of document elements for participants to review in our familiarity study. Our subdivision of the 10 broad document categories resulted in the identification of 74 different document elements (see Appendix A).

Stimulus Booklets

Using examples from the document sample, we created two versions of a stimulus booklet that illustrated and described each of the 74 document elements. The two booklets, presented in three-ring binders, were identical with the exception of the specific examples illustrating each document element. The descriptions of each document element did not vary across booklets. The booklets were divided into 10 sections, one for each of the broad document categories, in the following order: bar graphs, line graphs, pie charts, diagrams, maps, lists, tables, forms, bills and receipts, and Internet elements.

The first page of each section in the stimulus booklet displayed the title of the document category and a verbal description or definition of that category. Within each section, the different document elements of a given document category were presented individually, one per page (see Appendix A for the element order). In most cases, the examples illustrating each document element were complete documents in which the element being highlighted was the only or the major element present (e.g., a bar graph). In some cases, only the portion of a document that contained the document element was included (e.g., a list of checkboxes from a larger form). Each page within a section contained the names of the broad document category and the document element, a one- or two-sentence description of the document element, and an example illustrating the document element. The names of the document elements were included in the stimulus booklet to facilitate participants' ability to locate the appropriate page in a corresponding response booklet. Each page in the stimulus booklet was protected by a clear plastic page protector. See Figure 1 for an example of a stimulus booklet page.

Response Booklets

A response booklet was created for each participant to record familiarity with and frequency of use of each of the 74 document elements depicted in the stimulus booklet. The response booklet, which was stapled along the left-hand side and printed in black and white, consisted of 20 pages, front and back, for a total of 40 pages. The first two pages of the response booklet contained demographic questions and directions. The last page was blank, with the remaining 37 pages depicting 2 elements each, for a total of 74 elements.

Figure 1. Example of a Stimulus Booklet Page
[Figure: a stimulus booklet page headed "Three-Dimensional Bar Graphs," with the description "The bars in three-dimensional bar graph appear to go back into space and looks like posts or pillars." and an example graph titled "How Much Disposable Income Is Enough?" with bars labeled Individual, Family of 2, Family of 3, and Family of 4 and the values $600, $850, $1,050, and $1,250. Example source: Loan Services and Procedures. (2007). [Brochure]. Raleigh, NC: State Employees' Credit Union (of North Carolina).]

The first page of the response booklet contained questions about the demographic characteristics of the participant, including age, sex, race, annual income, and highest level of education. The back of the first page (page 2) consisted of instructions on how to complete the response booklet. In particular, the instructions, which survey administrators read aloud, asked participants to indicate

    how familiar you are with each document type and how frequently you use each document type. Please do not let the specific content of the example of a document type influence your decision. For example, we may present you with a bar graph displaying the income level of families in the United States. We are interested in how familiar you are with bar graphs in general and how frequently you use bar graphs in general. We are not interested in how familiar you are with the income level of families in the United States.

We further instructed the participants on how to use the response scale:

    We will ask you to report your familiarity with a document type using a visual scale. The visual scale will simply be a long, horizontal line. One end of the line indicates that you are completely unfamiliar with the document type. The other end of the line indicates that you are very familiar with the document type. We ask you to simply put a mark on the line that indicates your true familiarity with the document type. So, if you are only vaguely familiar with the document type, put a line closer to the end that indicates complete unfamiliarity. If you are moderately familiar with the document type, put a line near the middle of the line and, if you are extremely familiar with the document type, put a line closer to the end that indicates very familiar. Please see the example below.

Figure 2 shows the familiarity and frequency visual scales from the response booklet. The scales used to rate familiarity and frequency were presented as bracketed lines upon which participants were instructed to mark a perpendicular line indicating how familiar they were with the document element and how often they used it. For example, the familiarity question for a two-dimensional bar graph read, "How familiar are you with two-dimensional bar graphs?" The corresponding frequency question read, "How frequently do you use two-dimensional bar graphs?" Below each question was a bracketed, horizontal line with anchors at each end. The anchors for the familiarity question read Not At All (left anchor) and Very Familiar (right anchor). The anchors for the frequency question read Never (left anchor) and Very Often (right anchor). The lines were both either 129 mm or 135 mm in length. The 6 mm difference resulted from slight differences in reproduction of a subset of booklets. These differences, however, did not pose a problem in the survey because they occurred between, not within, booklets. Therefore, each participant saw only one scale length. In addition, before analysis, the raw response values were converted from the absolute distance between the left scale bracket to the participant's mark (in mm) to the proportion of the scale to the left of the participant's mark (i.e., distance in mm from the left scale bracket to the participant's mark divided by the total length of the visual scale).

The remaining 37 printed pages of the response booklet each presented two document elements to be rated. The top half of each page consisted of a thumbnail image of the first document element to be rated with the two visual rating scales printed underneath (one for familiarity and one for frequency); the bottom half contained the thumbnail image of the second document element to be rated, also with visual rating scales printed below (for an example of a document element and rating scales, see Figure 2).

Figure 2. Example of Visual Scales From the Response Booklet
[Figure: a response booklet entry headed "THREE-DIMENSIONAL BAR GRAPHS (page 4)" with a thumbnail of the "How Much Disposable Income Is Enough?" graph, followed by the question "How familiar are you with three-dimensional bar graphs?" on a scale anchored Not At All and Very Familiar, and the question "How frequently do you use three-dimensional bar graphs?" on a scale anchored Never and Very Often. Example source: Loan Services and Procedures. (2007). [Brochure]. Raleigh, NC: State Employees' Credit Union (of North Carolina).]

We created four versions of the response booklet, varying the order in which the familiarity and frequency rating scales were presented and the specific examples of document elements pictured above the rating scales. Two stimulus booklet-response booklet pairs were created such that the thumbnail images contained in the response booklet were identical to the 74 document element examples found in the corresponding stimulus booklet. (Some of the elements in the stimulus booklets had more than one example; however, the response booklets showed a thumbnail of only one of the images.) Half of the response booklets in each of the two pairings asked for the participants' familiarity with the document element before asking how frequently they used that document element; the other half asked how frequently the participants used the document element before asking how familiar they were with the document element. No definitions were provided in the response booklet; however, the name of the document element to be rated, along with the page number on which it could be found in the stimulus booklet, was listed above each thumbnail image. This information was provided to assist the participants in synchronizing the stimulus and response booklets. Below the thumbnail images, participants used the visual scales to rate their familiarity with and frequency of use of each document element.
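The conversion of raw marks to proportions described in this section (distance from the left scale bracket divided by the total scale length) can be sketched as follows. This is an illustration only: the function name and the example distance of 64.5 mm are hypothetical, and only the two scale lengths (129 mm and 135 mm) come from the text.

```python
def scale_proportion(mark_mm, scale_length_mm):
    """Proportion of the visual scale lying to the left of the participant's mark."""
    if scale_length_mm <= 0:
        raise ValueError("scale length must be positive")
    if not 0 <= mark_mm <= scale_length_mm:
        raise ValueError("mark must fall on the scale")
    return mark_mm / scale_length_mm

# The same physical distance yields slightly different proportions on the two
# reproduced scale lengths, which is why raw millimeters were not analyzed directly.
for length_mm in (129.0, 135.0):
    print(length_mm, round(scale_proportion(64.5, length_mm), 3))
```

Dividing by each booklet's own scale length puts responses from the 129 mm and 135 mm booklets on a common 0 to 1 range.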

Procedure

Survey administrators approached potential participants and asked if they would like to participate in a study in exchange for $5. To obtain a representative sample, every individual waiting in the lobby of the two DMV offices who was not involved in a distracting task (e.g., talking on a cell phone, studying a driver's license manual) and who confirmed that he or she was at least 18 years of age was invited to participate in the survey. Upon consent, each participant was handed a stimulus booklet-response booklet pair. Most participants were able to complete the survey before their turn came to obtain the services that had brought them to the DMV office.

The participants began by reporting their demographic information on the first page of the response booklet. Then, the survey administrator read the instructions aloud to the participants, who followed along on the second page of the response booklet. The participants were instructed to rely on the examples and definitions of the document elements found in the stimulus booklet to determine their responses and to use the thumbnail images in the response booklet only as a guide to ensure that they were judging the correct element. Then, they were told to ignore the content information presented in the document element examples and to focus instead on the general format of the document element. Finally, the participants were instructed on how to mark the line on the familiarity and frequency visual scales to indicate their ratings, and they were given an opportunity to ask for clarification regarding the instructions. The survey administrator then observed participants unobtrusively while they completed the response booklet.

Questions about the meaning of familiarity or frequency were answered prior to beginning the survey. The survey administrator answered additional questions as they arose during the survey. Each response booklet contained responses from a single participant. Upon completion of the response booklet, participants were paid $5 for their involvement. The survey generally lasted from 15 to 20 minutes.

Results

To obtain familiarity and frequency scores from the survey, we calculated the proportion of each visual scale to the left of the line drawn by the participant. Familiarity and frequency rating values ranged from 0 to 1 with larger numbers indicating higher familiarity and higher frequency. These proportion scores were standardized by participant and within scales (familiarity and frequency) to eliminate each participant's individual bias associated with using the rating scales. The resulting scores for each rating scale by participant had a mean of 0 and a standard deviation of 1.
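The within-participant standardization described above amounts to converting each participant's ratings on one scale to z scores. The sketch below is a minimal illustration with hypothetical ratings; it assumes the population standard deviation is used, which the text does not specify.

```python
from statistics import mean, pstdev

def standardize(ratings):
    """Convert one participant's ratings on one scale to z scores (mean 0, SD 1)."""
    m = mean(ratings)
    sd = pstdev(ratings)  # assumption: population SD; the text does not say which SD was used
    if sd == 0:
        raise ValueError("ratings have no variability and cannot be standardized")
    return [(r - m) / sd for r in ratings]

# Hypothetical familiarity proportions from one participant for four elements.
familiarity = [0.20, 0.50, 0.90, 0.40]
z_scores = standardize(familiarity)
print([round(z, 2) for z in z_scores])
```

Because the scores are centered and scaled per participant, a respondent who habitually marks near the right end of every scale contributes the same relative ordering of elements as one who marks near the left.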

From these standardized scores, we created a Document Familiarity Index and a Document Frequency Index. Specifically, the average standardized frequency and average standardized familiarity ratings were computed for each document element. Within each scale (familiarity and frequency), each document element was assigned a ranked value corresponding to the magnitude of that element's standardized average relative to that of the other document elements (from lowest to highest, such that the largest ranking value corresponded to the largest standardized average). These ranked values constituted the Document Familiarity Index and the Document Frequency Index.
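As a rough sketch of the ranking step, the following Python averages standardized ratings per element and converts the averages to ranks, with 1 assigned to the lowest average. The function name and the example z-scores are hypothetical, and ties are not handled because the study does not describe a tie rule.

```python
from statistics import mean


def build_index(z_by_participant):
    """Rank elements by mean standardized rating (1 = lowest mean)."""
    elements = list(z_by_participant[0].keys())
    averages = {e: mean(p[e] for p in z_by_participant) for e in elements}
    ordered = sorted(averages, key=averages.get)  # lowest average first
    return {element: rank for rank, element in enumerate(ordered, start=1)}


# Hypothetical standardized ratings from two participants.
p1 = {"Calendar": 1.2, "Pie chart": 0.1, "Venn diagram": -1.3}
p2 = {"Calendar": 0.9, "Pie chart": 0.4, "Venn diagram": -1.3}
index = build_index([p1, p2])
```

Because the index keeps only rank order, it is ordinal: it records which element is more familiar than another, not by how much.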

In addition to these unweighted scales, we created scales by weighting each participant's standardized score so that the data set would be consistent with the U.S. or NC demographic values (see Table 1). We created separate scales weighted by age, education, sex, Hispanic ethnicity, income, and race for the U.S. and the NC demographic values. The scales derived from these weighted scores were designed to compensate for the small differences between the national and state censuses and the DMV sample. If the sample was inordinately skewed, then these weighted scales should have been quite different from the unweighted scales. In contrast, if the differences between the sample and the U.S. or NC censuses were negligible, then these scales should have correlated highly with their unweighted counterpart. The weighted familiarity scales were highly correlated with the unweighted familiarity scale (mean correlation = .99, SD = .01). Similarly, the weighted frequency scales were highly correlated with the unweighted frequency scale (mean correlation = .99, SD = .01). Thus, the differences between the DMV sample and those predicted from the national and state censuses were negligible. Because of these highly linear relationships, the unweighted scores were used to create the Document Familiarity and Frequency Indexes.

When collecting participants' self-reported frequency and familiarity data, the four versions of the response booklet were factorially combined based on whether familiarity or frequency ratings were requested first and on whether the first or second version of the stimulus booklet, both of which contained different examples of each of the 74 document elements, had been used to provide a thumbnail example. These variations were intended to ensure the generalizability of the data beyond the specific examples of document elements presented to the participants. To assess the effect of these variations (i.e., order of rating and document element example), Document Familiarity Indexes and Document Frequency Indexes were created from the data for each of the four versions of the response booklet. These indexes were then correlated with one another and with the aggregate Document Familiarity and Document Frequency Indexes. The familiarity indexes were highly correlated with one another across response booklets (mean correlation = .98, SD = .02). Similarly, the frequency indexes were highly correlated with one another across response booklets (mean correlation = .98, SD = .02). Thus, the variations in booklets did not meaningfully affect the results.
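The booklet comparison amounts to averaging the correlations between every pair of per-booklet indexes. The sketch below shows one way to compute such a mean pairwise correlation; the booklet labels and the five-element rank lists are invented for illustration, and the study does not state which correlation coefficient was used (Pearson on ranks is assumed here).

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def mean_pairwise_correlation(indexes):
    """Average the correlation over every pair of booklet indexes."""
    pairs = combinations(sorted(indexes), 2)
    return mean(pearson(indexes[a], indexes[b]) for a, b in pairs)


# Hypothetical ranks of the same five elements under four booklet versions.
booklets = {
    "A": [1, 2, 3, 4, 5],
    "B": [1, 2, 3, 5, 4],
    "C": [2, 1, 3, 4, 5],
    "D": [1, 2, 3, 4, 5],
}
agreement = mean_pairwise_correlation(booklets)
```

A value near 1 indicates that the booklet versions order the elements almost identically, which is the pattern the study reports.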

Table 2 shows the final document indexes created from the aggregate data set (Appendix B presents the indexes organized from least familiar to most familiar document element). The table contains the Document Familiarity Index and the Document Frequency Index, as well as a ranking of each document element's estimated prevalence based on the document sample (termed the Collected Prevalence Index). The purpose of the document sample was to identify a general set of document elements to be assessed in the present study; the document sample was not intended to provide an accurate assessment of the frequency of document elements in the environment. Nevertheless, the frequency of elements in the document sample can provide some information about the frequency of these elements in the environment (however, future research directed solely for that purpose would be valuable). Thus, we used this document sample information to create the Collected Prevalence Index, which was also examined as a predictor of document literacy. Although the Document Familiarity Index and the Document Frequency Index were highly correlated, r = .90, the two scales were not identical (the 99% confidence interval around the correlation was .85 to .95). The Collected Prevalence Index did not correlate well with the Document Familiarity Index, r = .08, or with the Document Frequency Index, r = .17. These relatively weak correlations were likely a function of the way the document sample was assembled and the fact that, although many document elements occurred frequently in the environment, they might not have been used frequently by most viewers (e.g., tables in the business sections of newspapers). These issues are further elaborated in the Discussion section.

Document Indexes and Performance

We assessed the relation between each of the three document indexes and performance on those items on the 2003 NAAL and the 1992 NALS designed to assess document literacy. The NAAL and NALS performance measure, p value, roughly corresponds to the percentage of the adult population that correctly responded to the item. Thus, the p values were bounded by 0 and 1, and greater p values indicated that more respondents correctly answered the item. We assumed that the successful completion of an item required the reader to understand all document elements in the task. Hence, the presence of a single document element with low familiarity within a task should have decreased the probability that the reader would complete that task successfully. Given this assumption, the least familiar document element in a NAAL or a NALS document literacy task should have predicted the overall performance on that item. We therefore used the lowest ranked document element within a task to predict performance. We used the minimum familiarity rating when predicting performance with the Document Familiarity Index, the minimum frequency rating when predicting performance with the Document Frequency Index, and the minimum collected prevalence rating when predicting performance with the Collected Prevalence Index.

There were 51 unique document tasks across the NAAL and the NALS (for examples of such tasks, see Kutner et al., 2005). We identified the minimum familiarity rating for each of these document tasks. There were 19 unique document elements that satisfied this criterion. Because multiple factors may have affected performance on a task item and we wanted to identify the underlying effect of document familiarity (if, indeed, such an effect existed), we averaged the p values by the minimum familiarity ratings. This resulted in 19 data points that represented the average performance for each least familiar document element in the NAAL and NALS.
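The two steps just described (finding each task's lowest ranked element, then averaging p values across tasks that share that element) can be sketched as follows. The familiarity ranks are taken from Table 2, but the tasks, their element lists, and their p values are invented for illustration, as is the function name.

```python
from collections import defaultdict
from statistics import mean


def average_p_by_minimum_element(tasks, index):
    """Group tasks by their lowest ranked element and average the p values.

    tasks: list of (elements_in_task, p_value) pairs
    index: mapping from element name to its rank on a document index
    Returns {element: (rank, mean p value)}, one entry per data point.
    """
    grouped = defaultdict(list)
    for elements, p_value in tasks:
        least = min(elements, key=index.__getitem__)  # lowest rank in the task
        grouped[least].append(p_value)
    return {e: (index[e], mean(ps)) for e, ps in grouped.items()}


# Familiarity ranks from Table 2; tasks and p values are hypothetical.
familiarity_index = {"Bill": 71, "Calendar": 74, "Pie chart": 57, "Schedule table": 51}
tasks = [
    (["Bill", "Schedule table"], 0.80),
    (["Calendar", "Schedule table"], 0.70),
    (["Pie chart", "Calendar"], 0.85),
]
points = average_p_by_minimum_element(tasks, familiarity_index)
```

Each entry in `points` is one (rank, average p value) pair of the kind that would enter the regression; in this toy example three tasks collapse to two data points.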

Although it would have been ideal to assess performance on a wider range of document elements, the NAAL and NALS were the best estimates of document literacy available. One item (the only item containing a line graph) was removed from the analysis because it was an extreme outlier (the data point had both high distance and leverage and, thus, had undue influence on the regression surface; e.g., Howell, 2002). Because this was the only line graph item and a particularly difficult question was posed, the data point skewed low. The same process was used to calculate the minimum frequency rating and the minimum prevalence rating (which had 25 unique document elements) for each task item.

Table 2. Document Indexes in Alphabetical Order by Document Element

| Document element | Document Familiarity Index | Document Frequency Index | Collected Prevalence Index |
|---|---|---|---|
| Address list | 39 | 45 | 58 |
| Alphabetical index | 54 | 58 | NA |
| Below-labeled line form | 47 | 43 | 44 |
| Bill | 71 | 72 | 7 |
| Boat sign diagram | 1 | 1 | 4 |
| Borderless table | 34 | 39 | 55 |
| Bubble form | 42 | 31 | 9 |
| Bulleted hyperlink | 38 | 50 | NA |
| Bulleted list | 43 | 48 | 57 |
| Calendar | 74 | 74 | 20 |
| Categorical map | 33 | 22 | 45 |
| Checkbox | 55 | 47 | 49 |
| Checklist | 61 | 53 | 27 |
| Circle form | 58 | 46 | 17 |
| Classified index | 24 | 38 | 38 |
| Column-only table form | 25 | 28 | 32 |
| Comma-separated list | 16 | 23 | 42 |
| Conceptual diagram | 6 | 7 | 12 |
| Cover | 70 | 67 | 24 |
| Crossword | 66 | 25 | 29 |
| Diagonal table | 9 | 11 | 3 |
| Distance diagram | 11 | 9 | 38 |
| Divisional map | 29 | 29 | 51 |
| Drop-down menu | 69 | 69 | NA |
| Exploded diagram | 3 | 4 | 1 |
| Feature map | 27 | 36 | 21 |
| Feature table | 22 | 30 | 17 |
| Floor plan diagram | 36 | 21 | 17 |
| Geographic map | 19 | 15 | 34 |
| Horizontal hyperlink | 46 | 59 | NA |
| Icon hyperlink | 59 | 66 | NA |
| Implied table | 13 | 19 | 60 |
| Insert diagram | 7 | 8 | 40 |
| Insert map | 20 | 20 | 31 |
| Internet button | 65 | 68 | NA |
| Internet checkbox | 53 | 62 | NA |
| Internet form box | 48 | 54 | NA |
| Internet radio button | 18 | 27 | NA |
| Internet search element | 68 | 71 | NA |
| Intersected table | 32 | 41 | 56 |
| Intersected table form | 17 | 16 | 22 |
| Labeled box form | 49 | 44 | 34 |
| Labeled diagram | 26 | 24 | 41 |
| Labeled individual box form | 62 | 56 | 37 |
| Labeled list | 40 | 42 | 61 |
| Left-labeled line form | 52 | 51 | 53 |
| Line graph | 44 | 33 | 52 |
| Mailing form | 64 | 52 | 16 |
| Menu | 73 | 73 | 15 |
| Movement diagram | 14 | 10 | 32 |
| Numbered list | 41 | 49 | 54 |
| Overlayed stacked bar graph | 5 | 5 | 7 |
| Pie chart | 57 | 37 | 26 |
| Political location map | 12 | 12 | 50 |
| Receipt | 72 | 70 | 11 |
| Recipe | 60 | 61 | 27 |
| Road map | 67 | 65 | 47 |
| Road sign | 63 | 57 | 36 |
| Row-only table form | 21 | 32 | 4 |
| Schedule table | 51 | 55 | 25 |
| Split bar graph | 4 | 2 | 14 |
| Split table | 15 | 13 | 9 |
| Stacked bar graph | 8 | 6 | 22 |
| Structure/Building location map | 30 | 35 | 43 |
| Tab/New line list | 10 | 17 | 59 |
| Tab menu | 56 | 64 | NA |
| Three-dimensional bar graph | 35 | 18 | 6 |
| Tournament diagram | 31 | 26 | 1 |
| Two-dimensional bar graph | 37 | 14 | 48 |
| Unclassified index | 28 | 40 | 46 |
| Underlined hyperlink | 45 | 60 | NA |
| Venn diagram | 2 | 3 | 12 |
| Vertical hyperlink | 50 | 63 | NA |
| Weather map | 23 | 34 | 30 |

Note. The Internet elements were eliminated from the Collected Prevalence Index because the collection method (i.e., the first 2 levels of the 10 most popular websites) skewed their measured prevalence in the environment (the eliminated elements are designated in the table as NA). Therefore, whereas the Document Familiarity Index and the Document Frequency Index range from 1-74, the Collected Prevalence Index ranges from 1-61 (higher numbers indicate elements rated as more familiar, frequent, or prevalent).

A linear regression with average p value (M = 0.80, SD = 0.10) as the outcome measure and minimum perceived familiarity from the Document Familiarity Index (M = 26.1, SD = 14.6) as the predictor was significant, F(1, 16) = 37.90, p < .001, r = 0.84. Document familiarity was highly predictive of performance and accounted for 70% of the overall variance. Figure 3 shows the linear regression.

A linear regression with average p value (M = 0.79, SD = 0.09) as the outcome measure and minimum perceived frequency from the Document Frequency Index (M = 27.6, SD = 14.6) as the predictor was significant, F(1, 16) = 16.5, p < .001, r = 0.71. Document frequency was predictive of performance and accounted for 51% of the overall variance. Figure 4 shows the linear regression.

Figure 3. Linear Regression Showing Relation Between Perceived Familiarity and Performance on Document Literacy Tasks
[Scatterplot. x-axis: Least Familiar Document Element Ranking on the Document Familiarity Index (0 to 60); y-axis: p value (0 to 1); fitted line: y = 0.0055x + 0.6512, r² = 0.70]

Figure 4. Linear Regression Showing Relation Between Perceived Frequency of Use and Performance on Document Literacy Tasks
[Scatterplot. x-axis: Least Frequent Document Element Ranking on the Document Frequency Index (0 to 60); y-axis: p value (0 to 1); fitted line: y = 0.0045x + 0.6661, r² = 0.51]

A third regression was computed with average p value (M = 0.80, SD = 0.11) as the outcome measure and minimum prevalence from the Collected Prevalence Index (M = 35.4, SD = 17.7) as the predictor. There were 25 unique document elements that were ranked lowest on the Collected Prevalence Index across document tasks. This regression was not significant, F(1, 23) = 1.50, r = 0.24 (see Figure 5). However, when two outliers were removed (i.e., tasks associated with tab-delimited lists and intersected tables), the regression became significant, F(1, 21) = 9.30, p < .01, r = 0.55. Collected prevalence was predictive of performance and accounted for 31% of the overall variance (see Figure 6).
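As a consistency check on these statistics, the F ratio of a simple linear regression with one predictor can be recovered from the correlation through the standard identity F(1, df) = r² × df / (1 − r²), where df is the residual degrees of freedom. The short Python sketch below applies that identity to the reported values; the function name is illustrative. Because the reported r values are rounded to two decimals, the recomputed F values should match the published ones only approximately.

```python
def f_from_r(r, df_residual):
    """F ratio for a one-predictor regression, from r and residual df."""
    r_squared = r * r
    return r_squared * df_residual / (1.0 - r_squared)


# (label, reported r, residual df, reported F) as given in the text.
reported = [
    ("familiarity", 0.84, 16, 37.90),
    ("frequency", 0.71, 16, 16.5),
    ("prevalence, all tasks", 0.24, 23, 1.50),
    ("prevalence, two outliers removed", 0.55, 21, 9.30),
]

for label, r, df, f_reported in reported:
    print(f"{label}: r^2 = {r * r:.2f}, "
          f"F from rounded r = {f_from_r(r, df):.2f}, reported F = {f_reported}")
```

The squared correlations also reproduce, to rounding, the proportions of variance quoted for each index.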

Discussion

The data from this study reveal a strong correlation between the perceived familiarity of document elements and performance on document literacy tasks. Correlations also are present between the perceived frequency of document element use and performance and between document prevalence in the environment and performance; however, these correlations are not as strong as that between familiarity and performance. The present study is the first to demonstrate broadly the relation between the perceived familiarity, perceived frequency, and estimated prevalence of document elements and adult performance on document literacy tasks.

The Document Familiarity Index accounted for approximately 70% of the variance in the document items of the NAAL and NALS. Because the stimuli and data used to create the Document Familiarity Index were independent of the items and data from the NAAL and NALS, the results of the present survey were not predetermined. That is, the participants in the present experiment who rated familiarity and frequency were not asked to complete, nor did they ever see, the performance tasks of the NAAL or NALS. Therefore, these participants could not directly anticipate performance on these tasks. Furthermore, many of the document elements that participants were asked to rate were isolated elements from a document; thus, these elements were not analogous to the complete documents used in the performance tasks. Finally, because we used isolated document elements, we expect it would have been very difficult for the participants to accurately imagine a task (similar to those used in the NAAL and NALS) for which they could anticipate performance. Thus, the present results demonstrate the strong relation between document familiarity and effective document use. Indeed, once an individual is familiar with particular document categories and their subcomponent elements, this familiarity may facilitate document use in several ways.

It is likely that document familiarity facilitates the information search and extraction stages common to most models of document processing (e.g., Guthrie & Mosenthal, 1987). The majority of theories of document processing agree that the efficiency with which readers can search documents is critical to their ability to extract information successfully from documents (Fisher, 1981; Guthrie, 1988; Guthrie & Mosenthal, 1987; Kirsch & Mosenthal, 1990; Mosenthal & Kirsch, 1991). Guthrie et al. (1991) presented the same information in three different text formats: two document formats (a table and a directory) and a prose format. Participants were asked to answer a question and were presented with one of the three formats. The readers searched the text using headings on a computer screen, which, when chosen, led them further through the text. Once a heading was selected, new information appeared, and the participants had to continue choosing the appropriate categories of information to narrow their search until they located the answer to the question. The computer recorded time spent searching for the answer, as well as the errors made during the search.

Readers in the Guthrie et al. (1991) study spent more time searching a given format than engaging in any other process associated with the task. Furthermore, when readers searched the document formats, they spent proportionally more time selecting the appropriate categories and proportionally less time extracting information from this text format relative to the prose format. Participants also used different search strategies to complete the task: While most readers used an efficient search strategy that relied on the headings to narrow their search, some readers used either an exhaustive strategy that entailed reviewing all the information in the text or an erratic strategy that involved searching randomly for the information within the text.

Document familiarity may facilitate document search because many documents are highly structured. By definition, highly structured documents contain similar document elements in a structured format. These structured documents contain vital information in similar places across exemplars. In such instances, it is economical for the reader to develop a mental model of the structure and document elements contained within different document types (i.e., causal mental models based on long-term domain knowledge; for a review, see Markman & Gentner, 2001). Thus, readers likely have a different mental model for each specific document type with which they are familiar. When confronted with a document, readers may recall and use these mental models, which, if accurate, should aid them in locating the vital information. For example, menus often contain the price of a dish to the right of the listing for that dish. For those with an accurate "menu" mental model, a request to locate price should be facilitated when the information is near the predicted location and inhibited when it is not.

Document mental models also likely contain information about the spatial and graphical conventions associated with individual document elements. Such information is often critical to the accurate interpretation of documents. Thus, familiarity with a document type suggests knowledge of the spatial and graphical conventions associated with the document elements contained within that document type. Indeed, the extensive use of spatial and graphical conventions in documents may motivate readers to form document mental models.

The prediction that familiarity might interact with document search by providing a mental model that the reader can recall and use is consistent with previous literature on prose literacy. Meyer (1985), for example, noted that skilled readers develop a structure strategy, based on prose customs, which they apply when confronted with a text. In addition to guiding the search for pertinent content for immediate use, these structure strategies might help with cognitive storage of such content for later use. This strategy approach is consistent with the finding that readers with prior knowledge of a content area (e.g., in chemistry, familiarity with mass and volume) recall text information from that area in more depth than those without prior knowledge (Mayer, 1985).

Figure 5. Linear Regression Showing Relation Between Document Prevalence and Performance on Document Literacy Tasks
[Scatterplot. x-axis: Least Prevalent Document Element on the Collected Prevalence Index (0 to 70); y-axis: p value (0 to 1); fitted line: y = 0.0015x + 0.7484, r² = 0.06]

Figure 6. Linear Regression Showing Relation Between Document Prevalence and Performance on Document Literacy Tasks With Two Outliers Removed
[Scatterplot. x-axis: Least Prevalent Document Element on the Collected Prevalence Index (10 to 70); y-axis: p value (0 to 1); fitted line: y = 0.0032x + 0.7123, r² = 0.31]

Similarly, familiarity with document types might spur the development of similar document structure strategies or document mental models that aid in the retrieval of information from a document. In particular, skilled readers who are unfamiliar with a particular document type might base their strategies on what they believe is a similar document type. Thus, readers with more numerous or more complete document mental models might be able to use a novel document type or document element more successfully than readers with fewer or less complete document mental models. Indeed, readers with fewer or less complete document mental models might rely on a default strategy that is inappropriate, potentially leading to a poor understanding of the text and poor performance on tasks requiring such understanding.

Document familiarity may also facilitate information extraction from a document because similar document types often contain comparable information. For example, bills often contain an itemization of purchases, as well as due dates, addresses, and prices. If readers are familiar with one bill, then they are likely familiar with the kind of information contained within most bills. Familiarity with document types thus implies that readers can understand and use the information commonly found in those document types relatively easily. As a result, information extraction should be facilitated if the requested information is consistent with what is usually found in a particular document type and inhibited if it is different.

The present study is the first to show a strong correlation between document familiarity and document use across a variety of document elements; however, several questions remain unanswered. Perhaps most important, the present study measured familiarity by means of participants' self-reports. Because document familiarity accrues over the course of years and decades, it is very difficult to measure document familiarity in a context that does not include self-report. Nevertheless, when measures are based on self-report, we cannot determine exactly what information participants used to make their judgments. Furthermore, there are many correlates of familiarity (e.g., frequency of use), and the impact of these correlates on the data is unknown.

It may be that familiarity as measured in this study could be described as the perceived ease with which the participants processed the items in the booklet. This would not be unexpected if familiarity is directly related to ease of processing, as we claim. It is important, therefore, to follow up the present correlational research with true experimental research that tests the predictions of the document familiarity, or document mental model, hypothesis described above. For example, if document familiarity aids document search by allowing readers to predict the placement of vital information, then the placement of information in a familiar document type could be manipulated. If our model is correct, manipulated placements consistent with the canonical placement should facilitate document search, whereas those inconsistent with the canonical placement should inhibit search. Future experimental research would also permit a determination of the causal directionality of the relation between document familiarity and document use.

Our findings should not, however, be disregarded because they rely to a degree on self-report. The fact that the participants' scores were different for perceived familiarity and perceived frequency of use supports the validity of these measures. The significant difference between the scores for familiarity and frequency (i.e., the Document Familiarity Index and the Document Frequency Index were not statistically identical) indicates that the participants could distinguish the two measures and responded accordingly when making their ratings. They were able to identify when they were familiar with a specific document element and also how often they used the document element. Importantly, whereas frequency of document use and document familiarity may be expected to correlate highly, as established in this study, the frequency of document element use is not directly proportional to an individual's familiarity with a document element. For example, an individual may be very familiar with a three-dimensional bar graph because this document element is similar to a two-dimensional bar graph. This individual may not, however, encounter a three-dimensional bar graph often in her or his environment. Thus, familiarity and frequency of use should be distinguished. The results of the present study support the assumption that participants understood this difference, thereby validating the creation of separate Document Familiarity and Document Frequency Indexes.

Notably, even our relatively objective measure of prevalence, the Collected Prevalence Index, showed a significant relation to document literacy task performance once two outliers were removed. Nevertheless, consistent with the results of the current study, the prevalence of document elements in society should have less predictive ability for document literacy than frequency of use. Specifically, individuals may be exposed to a large number of document elements every day but not use these elements. To revisit a previous example, the business sections of newspapers are often filled with line graphs, bar graphs, and tables presenting financial information. The sheer number of document elements presented in these pages may increase their prevalence in the environment; however, these document elements may not be used by the average individual. Thus, the actual rate of document element occurrence in the environment may only weakly predict how frequently the element is used by an individual or how familiar an individual is with the element. To gain a better idea of what types of document elements are being used and how often, future research could ask participants to carry a diary to record their use of document elements in day-to-day life (e.g., Smith, 2000). This method might yield a more accurate reflection of the document elements being used in everyday life, uninfluenced by the rate of document element occurrence in the environment.

The Document Familiarity Index has the potential to be an important resource to researchers, individuals interested in literacy or literacy education, and individuals seeking to increase organizational efficiency. The Document Familiarity Index is an ordinal list of the ease with which adults will likely process various document elements. Accordingly, the Document Familiarity Index, in light of our conceptualization of document mental model processing, might be useful for conceiving more user-friendly documents. By minimizing the elements in documents that are less familiar to adults, efficient document use might be increased, both in terms of speed and accuracy of use.

The Document Familiarity Index might be useful in informing educational practices. Given our supposition that readers will have less complete mental models for those document elements with low familiarity, as determined by the Document Familiarity Index, educators could focus extra attention on teaching readers how to use these elements efficiently. This additional instruction might help readers to develop mental models for less familiar document elements, which in turn might lead to their possessing a greater breadth of mental models and coping better with less familiar or novel document elements. Accordingly, training designed to increase familiarity with document elements at the primary and secondary level might strengthen readers' document literacy, thereby benefiting their occupational and personal lives.

The Document Familiarity Index might also benefit government agencies or individuals in industry seeking to improve organizational efficiency. Presumably, the use of more familiar document elements in government and corporate documents, such as annual or technical reports, might increase the number of employees and consumers able to use such documents with ease, leading to faster document use with greater accuracy. For example, one state assistance application form collected during this study included several document elements that the Document Familiarity Index identifies as less familiar to adults. A revision of this form to include document elements that are more familiar to readers might facilitate the ease with which the public can use it. Such revisions might, for example, reduce the amount of time state employees spend advising members of the public on how to complete such forms and reduce errors in paperwork that could substantially delay the processing of applications. The end result might be more efficient processing of paperwork, smaller backlogs, and an increase in employee productivity.

Perhaps the most important use for the Document Familiarity Index is to further research in document literacy. Specifically, the Document Familiarity Index provides a resource for researchers to use when studying document literacy, designing stimuli, manipulating familiarity, and so on. Such a resource may prove highly valuable.

In sum, the present study assessed the relations between document element familiarity, frequency of use, and prevalence and performance on document literacy tasks among adult readers. The results indicate that document literacy is directly related to how familiar a reader is with various document elements and, to a lesser degree, to how often they use the document element and how prevalent a document element is in the environment. Thus, the three document indexes created may have potential utility in increasing document literacy and related task performance. Familiarity and frequency likely facilitate development of document mental models, which subsequently aid readers in document search and information extraction. Future research should determine whether the relation between document familiarity and document literacy performance is a causal one and, if so, whether facilitation of search and information extraction processes is the mechanism by which document familiarity and frequency exert their beneficial influence.

Notes

1 Our definition of the term document is consistent with the use of the term in the document literacy literature, which derives largely from the definition that the U.S. Department of Education's National Center for Education Statistics uses for the document literacy assessment component of the National Assessment of Adult Literacy (NAAL), that is, "noncontinuous texts in various formats" (see White & Dillow, 2005, p. 4). The term, as we use it, refers to those written materials that present information in a manner largely independent of prose, although some prose may be present.

This project was funded in part by the U.S. Department of Education, National Center for Education Statistics, under contract number ED-99-CO-0110. The content of this publication does not necessarily reflect the views or policies of the U.S. Department of Education, National Center for Education Statistics, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. government. We would like to thank the North Carolina Division of Motor Vehicles for cooperating and permitting us to use its facilities for data collection.

References

Fisher, D.L. (1981). Functional literacy tests: A model of question

answering and an analysis of errors. Reading Research Quarterly, 16, 418-448.

Guthrie, J.T. (1988). Locating information in documents: Examination

of a cognitive model. Reading Research Quarterly, 23, 178-199.

Guthrie, J.T., Britten, T., & Barker, K.G. (1991). Roles of document

structure, cognitive strategy, and awareness in searching for infor

mation. Reading Research Quarterly, 26, 300-324.

Guthrie, J.T., & Kirsch, I.S. (1987). Distinctions between reading com

prehension and locating information in text. Journal of Educational

Psychology, 79, 220-227.

Guthrie, J.T., & Mosenthal, P. (1987). Literacy as multidimensional

Locating information and reading comprehension. Educational

Psychologist, 22, 279-297.

Guthrie, J.T, Seifert, M., & Kirsch, I.S. (1986). Effects of education,

occupation, and setting on reading practices. American Educational

Research Journal, 23, 151-160.

Guthrie, J.T., Weber, S., & Kimmerly, N. (1993). Searching documents: Cognitive processes and deficits in understanding graphs, tables, and illustrations. Contemporary Educational Psychology, 18, 186-221.

Hochstim, J.R. (1967). A critical comparison of three strategies of collecting data from households. Journal of the American Statistical

Association, 62, 976-989.

Howell, D.C. (2002). Statistical methods for psychology (5th ed.). Pacific

Grove, CA: Duxbury.

Kirsch, I.S., Jungeblut, A., Jenkins, L., & Kolstad, A. (2002). Adult literacy in America: A first look at the findings of the National Adult

Literacy Survey (National Center for Education Statistics Publication

No. 1993-275). Washington, DC: National Center for Education

Statistics, U.S. Department of Education.

Kirsch, I.S., & Mosenthal, P.B. (1990). Exploring document literacy: Variables underlying the performance of young adults. Reading Research Quarterly, 25, 5-30.

Kutner, M., Greenberg, E., & Baer, J. (2005). A first look at the literacy

of America's adults in the 21st century (National Center for Education

Statistics Publication No. 2006-470). Washington, DC: National

Center for Education Statistics, U.S. Department of Education.

Locander, W., Sudman, S., & Bradburn, N. (1976). An investigation of

interview method, threat and response distortion. Journal of the

American Statistical Association, 71, 269-275.

Markman, A.B., & Gentner, D. (2001). Thinking. Annual Review of

Psychology, 52, 223-247.

Mayer, R.E. (1985). Structural analysis of science prose: Can we increase problem-solving performance? In B.K. Britton & J.B. Black (Eds.), Understanding expository text: A theoretical and practical handbook for analyzing explanatory text (pp. 65-87). Hillsdale, NJ: Erlbaum.

Meyer, B.J.F. (1985). Prose analysis: Purposes, procedures, and problems. In B.K. Britton & J.B. Black (Eds.), Understanding expository text: A theoretical and practical handbook for analyzing explanatory text

(pp. 11-64). Hillsdale, NJ: Erlbaum.

Mosenthal, P.B., & Kirsch, I.S. (1991). Toward an explanatory model

of document literacy. Discourse Processes, 14, 147-180.

Mosenthal, P.B., & Kirsch, I.S. (1992). Types of document knowledge: From structures to strategies. Journal of Reading, 36, 64-67.

Mosenthal, P.B., & Kirsch, I.S. (1998). A new measure for assessing document complexity: The PMOSE/IKIRSCH document readability formula. Journal of Adolescent & Adult Literacy, 41, 638-657.

Shah, P., & Hoeffner, J. (2002). Review of graph comprehension research: Implications for instruction. Educational Psychology Review,

14, 47-69.

Shah, P., Mayer, R.E., & Hegarty, M. (1999). Graphs as aids to knowledge construction: Signaling techniques for guiding the process of

graph comprehension. Journal of Educational Psychology, 91, 690-702.

Siemiatycki, J. (1979). A comparison of mail, telephone, and home

interview strategies for health surveys. American Journal of Public

Health, 69, 238-245.

Smith, M.C. (2000). The real-world reading practices of adults. Journal

of Literacy Research, 32, 25-52.

Spratt, J.E., Seckinger, B., & Wagner, D.A. (1991). Functional literacy in Moroccan school children. Reading Research Quarterly, 26, 178-195.

United States Census Bureau. (2005a). North Carolina quickfacts.

Washington, DC: Author. Retrieved December 26, 2005, from

quickfacts.census.gov/qfd/states/37000.html.

United States Census Bureau. (2005b). USA quickfacts. Washington, DC: Author. Retrieved December 26, 2005, from quickfacts.census.gov/qfd/states/00000.html.

White, S., & Dillow, S. (2005). Key concepts and features of the 2003

National Assessment of Adult Literacy (National Center for Education

Statistics Publication No. 2006-471). Washington, DC: National

Center for Education Statistics, U.S. Department of Education.

Winn, W. (1993). An account of how readers search for information in

diagrams. Contemporary Educational Psychology, 18, 162-185.

Submitted July 31, 2006

Final revision received March 14, 2007

Accepted April 10, 2007

Dale J. Cohen teaches in the Department of Psychology,

University of North Carolina Wilmington, USA; e-mail

[email protected].

Jessica Snowden is currently pursuing a J.D. and a Ph.D. in

Social Psychology at the University of Nebraska-Lincoln,

USA; e-mail [email protected].




Appendix A

Names and Definitions of the Document

Categories and Elements

This appendix lists the names and definitions of the document elements and the document categories (shown in italics) in the order that participants viewed them in the stimulus booklet. Participants viewed the following 10 document categories: bar graphs, line graphs, pie charts, diagrams, maps, lists, tables, forms, bills and receipts, and Internet elements. The document categories were presented on a single page with only the title and definition. The 74 document elements were presented, within their associated document category, on separate pages that each displayed the document element name, definition, and an example image.

Bar graphs: a graphical way of showing quantitative comparisons by using rectangular shapes with lengths proportional to the measure of what is being compared.

Two-dimensional bar graphs: The bars in two-dimensional bar graphs are flat rectangles.

Three-dimensional bar graphs: The bars in three-dimensional bar graphs appear to go back into space and look like posts or pillars.

Split bar graphs: These are bar graphs in which the

bars of each condition are presented on different

sides of a centerline.

Stacked bar graphs: These are bar graphs in which

each bar is divided into segments, of which each

segment represents a different condition.

Overlayed stacked bar graphs: These are bar graphs in which the bars that represent one condition

overlay those bars that represent another condition.

Line graphs: A line graph can be used to show how one

or more things change over time, distance, etc. Line

graphs have an x-axis (horizontal) and a y-axis (vertical). Usually, the x-axis has numbers for the time period (e.g., month) and the y-axis has numbers for what is being measured (e.g., average rainfall).

Pie charts: A pie chart is a circular chart cut into segments

illustrating relative magnitudes or frequencies.

Diagrams: Diagrams are drawings intended to show the

relation between the parts of objects, concepts, etc.

Floor plan diagrams: These diagrams depict the

layout of a house, building, etc.

Movement diagrams: These diagrams use arrows, or

other symbols, to depict the movement of objects.

Distance diagrams: These diagrams use arrows, or other symbols, to depict the size of, or distance between, objects.

Exploded diagrams: These diagrams show the parts of objects and their relative placement by separating the pieces by small amounts.

Labeled diagrams: These diagrams identify the names and placement of object parts.

Insert diagrams: These diagrams have inserts that

provide extra information.

Road sign: These diagrams use symbols to identify the rules, layout, etc., of roads.

Boat sign diagrams: These diagrams use symbols to identify the rules, hazards, etc., of boating.

Conceptual diagrams: These diagrams present abstract conceptual ideas in an organized way.

Tournament diagrams: These diagrams present the

matches between teams in a tournament.

Venn diagrams: These diagrams use circles, or other shapes, to show relationships between different sets of information.

Maps: A map is a graphical representation of physical, political, or conceptual features of a part or the whole of the Earth's surface.

Divisional maps: These maps primarily illustrate

the boundaries between geographic or conceptual categories.

Feature maps: These maps primarily illustrate features (such as designated parks) of a geographic area.

Geographic maps: These maps primarily illustrate geographic and topographic features of a geographic area.

Insert maps: These maps have inserts that provide extra information.

Structure/Building location maps: These maps primarily illustrate the location of buildings or structures along a stylized road map.

Political location maps: These maps primarily illustrate the location of cities or other territories.

Categorical maps: These maps primarily illustrate

the location of various political or other entities.

Road maps: These maps primarily illustrate the detailed location of roads.

Weather maps: These maps primarily illustrate the

weather of geographic or political regions.

Lists: Lists are text that is categorized but that does not

contain row or column headers.

Address lists: This is a list format used solely to

present an address.

Tab/New line lists: This is a list that uses tabs or

new lines to separate elements of the list.

Bulleted lists: This is a list that uses bullets to identify elements of the list.

Numbered lists: This is a list that uses numbers to

identify elements of the list.

Comma-separated lists: This is a list that uses commas to separate elements of the list.

Labeled lists: Labeled lists describe elements in a

list, often by presenting a label, followed by a

colon, followed by the element.

Implied tables: Implied tables are lists that are structured like a table, but they do not contain column headers. The column contents either are assumed to be so familiar to readers that the column headers are unnecessary (e.g., movie times and titles) or are generally described in a title or text.

Unclassified indexes: Indexes are a specific form of implied table: they identify items and their location in a book, newspaper, etc. Unclassified indexes do not categorize the items for readers.

Classified indexes: Indexes are a specific form of implied table: they identify items and their location in a book, newspaper, etc. Classified indexes group the items in the list into several categories.

Menus: Menus are a specific form of implied table (they identify items and their prices) that often classify elements of the list by type.

Recipes: Recipes are a specific form of implied table: they identify items and their amount.

Calendars: Calendars are a specific form of implied table: they often identify dates in a weekly or monthly format.

Covers: Covers often contain an abbreviated list of

the contents of a magazine, book, etc.

Tables: Tables are a set of data arranged in rows and/or

columns that contain identifiers (headers) for the rows

and columns.

Intersected tables: Intersected tables have row and

column headers.

Borderless tables: Borderless tables have row and

column headers but no vertical or horizontal lines

to separate rows or columns.

Feature tables: Feature tables identify the features of a product, etc., by constructing a table of all available features and indicating with a symbol, color, etc., whether that feature is present in each product.

Schedule tables: Schedule tables present the time

and/or date of events in tabular format with the

merged cells indicating events that extend across

time periods.

Split tables: Split tables present the row headers in the center of the table instead of on the more traditional left side.

Diagonal tables: In diagonal tables the row and column headers are identical and, therefore, are presented once on a diagonal.

Forms: Forms are documents designed to collect information from the reader; therefore, they contain spaces in which the reader is to write information.

Bubble forms: Bubble forms are form elements in which the user fills in a circle, square, etc., to indicate a choice.

Checkboxes: Checkboxes are form elements in

which the user must put a check or "x" in a square, etc., to indicate a choice.

Checklists: Checklists are form elements in which the user must put a check or "x" over a line to indicate a choice.

Circle forms: Circle forms are elements in which

the user puts a circle around a number, letter, etc.,

to indicate a choice.

Intersected table forms: Intersected table forms are

form elements in a tabular format in which both the

columns and rows are labeled.

Column-only table forms: Column-only table

forms are form elements in a tabular format in

which only the columns are labeled.

Row-only table forms: Row-only table forms are

form elements in a tabular format in which only the

rows are labeled.

Left-labeled line forms: Left-labeled line forms are form elements in which there is a line for the user to put information on, and the type of information to be entered on each line is labeled to the left of that line.

Below-labeled line forms: Below-labeled line forms are form elements in which there is a line for the user to put information on, and the type of information to be entered on each line is labeled below that line.

Labeled box forms: Labeled box forms are form elements in which there is a box into which the user is to put information, and the type of information to be entered is labeled inside the box.

Labeled individual box forms: Labeled individual

box forms are form elements in which there are

boxes into which the user is to put information, one number or letter at a time.

Mailing forms: Mailing forms are elements into

which one puts addresses.

Crosswords: Crossword puzzles are form elements

that have a specific and distinctive format.

Bills and receipts: Bills and receipts are documents with a

specific format in which purchases are itemized and totaled.

Bills: Bills are document elements in which purchases are itemized and totaled, with an amount due section.

Receipts/Invoices: Receipts and invoices are document elements in which purchases are itemized and totaled, but the customer has already paid so there is no amount due section.

Internet elements: Internet elements are those commonly found on websites.

Underlined hyperlinks: A hyperlink is an image or portion of text on a webpage that is linked to another webpage. Underlined hyperlinks use underlines to identify the presence of the hyperlink.

Icon hyperlinks: A hyperlink is an image or portion of text on a webpage that is linked to another webpage. Icon hyperlinks use icons (or pictures) to identify the presence of the hyperlink.

Bulleted hyperlinks: A hyperlink is an image or portion of text on a webpage that is linked to another webpage. Bulleted hyperlinks use bullets to identify the presence of the hyperlink.

Vertical hyperlink menus: A menu is a categorized list of options on a webpage that are linked to other webpages (e.g., hyperlinks). Vertical menus present the hyperlink options vertically.

Horizontal hyperlink menus: A menu is a categorized list of options on a webpage that are linked to other webpages (e.g., hyperlinks). Horizontal menus present the hyperlink options horizontally.

Drop-down menus: A menu is a categorized list of options on a webpage that are linked to other webpages (e.g., hyperlinks). Drop-down menus present options that pop up when one "clicks" on the arrow next to the top option.

Tab menus: A menu is a categorized list of options on a webpage that are linked to other webpages (e.g., hyperlinks). Tab menus present the options as tabs that extend horizontally across the screen.

Alphabetical indexes: An alphabetical index is a

list of the alphabet that serves as a link to an index that is sorted alphabetically. When one clicks on a

letter, the sorted list that starts with that letter will

pop up.

Internet checkboxes: An Internet checkbox is a form element in which one clicks on the box to indicate a choice. More than one checkbox can be clicked at any one time.

Internet radio buttons: An Internet radio button is a form element in which one clicks the circle to indicate a choice. Only one radio button can be clicked at any one time.

Internet form boxes: An Internet form box consists of a series of text boxes into which the user puts a

word or list.

Internet buttons: An Internet button is a button

that, when clicked, starts or ends a computer process (e.g., a download process).

Internet search elements: An Internet search element consists of an Internet button and a text box. The user puts a word or list of words into the text box and clicks the button to begin an Internet search.




Appendix B

Document Indexes Organized From Least

Familiar (1) to Most Familiar (74) Document

Element

                                   Document      Document    Collected
                                   Familiarity   Frequency   Prevalence
Document element                   Index (a)     Index (b)   Index (c)
Boat sign diagram                  1             1           4
Venn diagram                       2             3           12
Exploded diagram                   3             4           1
Split bar graph                    4             2           14
Overlayed stacked bar graph        5             5           7
Conceptual diagram                 6             7           12
Insert diagram                     7             8           40
Stacked bar graph                  8             6           22
Diagonal table                     9             11          3
Tab/New line list                  10            17          59
Distance diagram                   11            9           38
Political location map             12            12          50
Implied table                      13            19          60
Movement diagram                   14            10          32
Split table                        15            13          9
Comma-separated list               16            23          42
Intersected table form             17            16          22
Internet radio button              18            27          NA
Geographic map                     19            15          34
Insert map                         20            20          31
Row-only table form                21            32          4
Feature table                      22            30          17
Weather map                        23            34          30
Classified index                   24            38          38
Column-only table form             25            28          32
Labeled diagram                    26            24          41
Feature map                        27            36          21
Unclassified index                 28            40          46
Divisional map                     29            29          51
Structure/Building location map    30            35          43
Tournament diagram                 31            26          1
Intersected table                  32            41          56
Categorical map                    33            22          45
Borderless table                   34            39          55
Three-dimensional bar graph        35            18          6
Floor plan diagram                 36            21          17
Two-dimensional bar graph          37            14          48
Bulleted hyperlink                 38            50          NA
Address list                       39            45          58
Labeled list                       40            42          61
Numbered list                      41            49          54
Bubble form                        42            31          9
Bulleted list                      43            48          57
Line graph                         44            33          52
Underlined hyperlink               45            60          NA
Horizontal hyperlink               46            59          NA
Below-labeled line form            47            43          44
Internet form box                  48            54          NA
Labeled box form                   49            44          34
Vertical hyperlink                 50            63          NA
Schedule table                     51            55          25
Left-labeled line form             52            51          53
Internet checkbox                  53            62          NA
Alphabetical index                 54            58          NA
Checkbox                           55            47          49
Tab menu                           56            64          NA
Pie chart                          57            37          26
Circle form                        58            46          17
Icon hyperlink                     59            66          NA
Recipe                             60            61          27
Checklist                          61            53          27
Labeled individual box form        62            56          37
Road sign                          63            57          36
Mailing form                       64            52          16
Internet button                    65            68          NA
Crossword                          66            25          29
Road map                           67            65          47
Internet search element            68            71          NA
Drop-down menu                     69            69          NA
Cover                              70            67          24
Bill                               71            72          7
Receipt                            72            70          11
Menu                               73            73          15
Calendar                           74            74          20

(a) Document Familiarity Index shows the order of participants' self-reported document element familiarity ranked numerically from least familiar (1) to most familiar (74).

(b) Document Frequency Index shows participants' self-reported frequency of document element use ranked numerically from least frequently used (1) to most frequently used (74).

(c) Collected Prevalence Index shows the numerical ranking of the prevalence of document elements collected in the document sample, with the exception of the Internet elements, which were eliminated because the collection method for this document category (i.e., the first 2 levels of the 10 most popular websites) skewed the Internet elements' estimated prevalence in the environment (the eliminated elements are designated in the table as NA). Therefore, this index shows the estimated prevalence of document elements ranked numerically from least prevalent (1) to most prevalent (61).
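Because the Appendix B indexes are ranks, a rank-order statistic is one natural way to compare two of them. The following is a minimal illustrative sketch in Python, not the analysis reported in the article. It assumes a hand-picked subset of eight rows copied from the table (familiarity and frequency ranks only), and the helper names `rerank` and `spearman` are hypothetical. Since only a subset is used, the ranks are recomputed within that subset before correlating.

```python
from statistics import mean

# Illustrative subset of Appendix B rows: (element, familiarity rank, frequency rank).
# The choice of rows is an assumption made for this sketch only.
ROWS = [
    ("Boat sign diagram", 1, 1),
    ("Venn diagram", 2, 3),
    ("Diagonal table", 9, 11),
    ("Weather map", 23, 34),
    ("Line graph", 44, 33),
    ("Pie chart", 57, 37),
    ("Crossword", 66, 25),
    ("Calendar", 74, 74),
]


def rerank(values):
    """Rank values from 1 (smallest) upward; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the re-ranked values."""
    rx, ry = rerank(xs), rerank(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


familiarity = [row[1] for row in ROWS]
frequency = [row[2] for row in ROWS]
rho = spearman(familiarity, frequency)
print(f"Spearman rho on the {len(ROWS)}-row subset: {rho:.3f}")
```

A value computed on a subset says nothing about the full 74-element table. Running the same functions on all rows (and skipping the NA entries for the prevalence column) would be the obvious extension.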



