
An Overview of Record Linkage in Canada

Martha Fair Statistics Canada

Canada's health care system is best described as an interlocking set of 10 provincial and 2 territorial health insurance schemes. Each is part of the national health care system, and each provincial and territorial scheme is linked through adherence to national standards set at the Federal level. Provincial and territorial hospital and medical care insurance plans are Federally regulated in areas such as the financial extent of coverage, accessibility, portability of benefits, and the requirement that the plans be non-profit.

The 1984 Canada Health Act defines the criteria and conditions that each provincial health insurance plan must meet to receive full Federal contributions. The Act discourages direct charges to patients for physician and hospital services. The five legislated principles of the Canada Health Act are accessibility, comprehensiveness, public administration, portability, and universality. The Act is built on the Hospital Insurance and Diagnostic Services Act (1957) and the Medical Care Act (1966-67). The Hospital Insurance and Diagnostic Services Act had made a range of hospital and diagnostic services available at little or no direct cost to the patient. The purposes of these acts were to increase the supply of health personnel and facilities and to make the services available regardless of socio-economic circumstances and geography. The Medical Care Act required the Federal government to make payments to provinces and territories operating medical care insurance plans.

How the provinces organize, finance, and administer their health insurance plans varies. Some administer their plans directly through provincial health departments; other plans are managed by separate public agencies reporting to the provincial health minister. Some provinces have two plans, one controlled by the province and another by a public agency (Minister of Industry, Science and Technology, 1993).

A recent article has presented an overview of health care systems in Canada and selected OECD countries (Australia, France, Finland, Germany, Sweden, the United States, and the United Kingdom). It discusses the organization of the health care systems, health care expenditure, the availability and utilization of health services, and the health status of the population (Nair and Karim, 1993). A publication by the World Bank (World Bank, 1984) points out that the ultimate goal of public policy is to improve living standards, to increase individual choice, and to create conditions that enable people to realize their potential.

It has been stated that improvements in longevity, and health policy in particular, are very much a success story in Canada: life expectancy has progressed steadily, and there is only a handful of countries with better health conditions than Canada (Beaujot, 1991). However, certain issues are subject to continued discussion, such as cost, the best curative and preventive approaches, better methods for the quality of prolonged life, and the delayed effects of occupational, lifestyle, socio-economic, and environmental factors on health. Health policy should take into account the demographics of health, morbidity, and mortality.

In Canada, neither the Federal nor provincial authorities nor researchers have yet made more than a modest start at tapping the immense data resources and statistical tools at their disposal for health services management and research in areas such as health care, health status, epidemiology, demography, genetics, and occupational and environmental health. The desire to understand and improve the performance of the health care system has been

evident in other countries such as the United States (Donaldson and Lohr, 1994) and the United Kingdom (Gill et al., 1993).

One useful tool that has been developed over the years in Canada is computerized record linkage. This paper will briefly consider the definition, history, methods, and uses of record linkage, drawing on a variety of experiences in Canada. Probabilistic methods, generalized systems, and provincial and national files have been developed to help bring together records relating to the same individual (e.g., to create patient histories) and to carry out two-file linkages (e.g., linkage of breast screening and mortality records). Several examples will be described that use national files at Statistics Canada, which is Canada's centralized statistical agency. Emphasis will be placed on use of this tool for the administrative and statistical data needs for the development of health care policies.

The Definition of Record Linkage

Record linkage is simply the bringing together of information from one or more independent source records that are believed to relate to the same individual, family, or entity. With successive linkages, the information may take on the characteristics of a collection of personal or family histories.

The term record linkage was first used in 1946 by H.L. Dunn, Chief of the United States National Bureau of Vital Statistics (Dunn, 1946). Dunn introduced the new term to a group of Canadian vital statisticians in this way: Each person in the world creates a Book of Life. This Book starts with birth and ends with death. Its pages are made up of the records of the principal events in life. Record linkage is the name given to the process of assembling the pages of this Book into a volume. Dunn envisaged the assembly of records of the principal events of life into a personal Book of Life over a lifetime, with linkage implying records brought together by means of common identification data (for example, birth number) which ensure that records are assigned to the correct file. He envisaged that numerous national, state, and local official organizations would rely heavily on knowing certain chapters in the Life Records Index. He applied the record linkage principle to many social and economic services and benefits so that they might be better directed at the real needs of people. About the time when this speech was given, Canada was introducing a national scheme of Family Allowance whereby the Federal Government was to pay a monthly allowance on behalf of each child. Some 3.5 million children were eligible for claims, and verification of the fact of birth was required by linkage with birth records.

The History of Record Linkage

The process of systemizing the approach to computerized record linkage was first undertaken by the geneticist Howard Newcombe and his associates at Atomic Energy of Canada (Newcombe et al., 1959). He recognized the full implication of extending the principle to the arrangement of personal files and into family histories. Early studies linked British Columbia health and vital records, showing that various source records could be linked into individual, family, and multi-generation pedigrees (Newcombe, 1967).

Development of the theory of record linkage in a more rigorous mathematical fashion was carried out in 1969 at Statistics Canada (Fellegi and Sunter, 1969).

Key technical issues were identified early, namely: using personal identifiers to discriminate between the individual to whom the record refers and all other persons in the population; deciding whether discrepancies in identifiers are due to mistakes in reporting for a single individual or to the presence of additional individuals; and processing the large volume of data required for record linkage while using a reasonable amount of computer time.

A generalized record linkage system was developed at Statistics Canada in collaboration with the Epidemiology Unit at the National Cancer Institute of Canada in the early 1980s (Howe and

Lindsay, 1981; Smith and Silins, 1981). Its primary benefit is that it enables record linkage to be carried out for both an internal linkage (e.g., to create patient histories) and a variety of two-file linkages (e.g., linkage of a cancer file with death records) without reprogramming each time. When Statistics Canada used probabilistic linkage in the 1980s, based on the Fellegi-Sunter theory, it was to search the newly established Canadian Mortality Data Base, which is a file of all deaths in Canada extending back to 1950 (Smith and Newcombe, 1980).

Over the past fifteen years, numerous health studies have been carried out at Statistics Canada using probabilistic matching techniques incorporated in the generalized linkage system. Various applications have been initiated in provinces such as Ontario, Manitoba, and British Columbia. The software utilized has varied. More recent work on generalized software at Statistics Canada (GRLS.V2) has been for computers having a UNIX operating system (Nuyens, 1993).

There has been a series of seven workshops describing the methods, software, and results of such research (e.g., Howe and Spasoff, 1986; Carpenter and Fair, 1989). Statistics Canada also has a collection of preprocessing routines and generalized systems which are useful for the preparation of files for record linkage applications. The last record linkage workshop was held at Niagara-on-the-Lake in April 1994 with the North American Association of Central Cancer Registries. This included discussions regarding the development and use of cancer registries, software, and collaboration and cooperation.

Communication and collaboration with other agencies in the various provinces in Canada, in the United States, and in the United Kingdom have aided record linkage developmental work (e.g., Tepping, 1968; Kilss and Alvey, 1985; Newcombe, 1988; Jaro, 1989; Newcombe, Fair and Lalonde, 1992; Gill et al., 1993; Scheuren and Winkler, 1993). In particular, there have been efforts to study undercoverage in the United States census. A comparison of three different computer matchers has recently been undertaken. There is considerable record linkage work being carried out in the United Kingdom at the University of Oxford. The types of research being carried out in Oxford include such topics as trends in the workload in hospital specialties; descriptive epidemiology of hospitalised diseases; geographic variation in admission rates; social class, marital status, and place of birth; and inter-relationships between diseases.

The Methods Used in Probabilistic Record Linkage

The emphasis in this paper is on the matching of records where no unique identifier is available. In probabilistic linkage, the comparison or matching algorithm yields, for each record pair, a probability or weight which indicates whether the records relate to the same entity. Where linkages can be based on some sort of personal identifying number, there is much greater certainty of achieving a correct match. However, that number may be incorrectly recorded on some records, so that disagreement of it is not necessarily proof of a nonmatch. Such numbers are also occasionally improperly borrowed from other people, so that agreement is not always positive proof of a correct match. Although rare, such occurrences make it prudent to check names and other usual identifiers as a backup when doing numerical linkages. Thus the methods of probabilistic record linkage should still be applied to improve the accuracy of even the numerical linkages.

Often the decision to link particular records depends on the similarities of names, birth dates, birth places, and other items. Each of these identifiers carries only limited discriminating power and is fallible, so the records may sometimes seem dissimilar when in fact the same person is involved. One must first search the files to bring pairs of records together for comparison. Conceptually, each record on file A (e.g., a cohort of women who have participated in a national breast screening study) is compared with each record on file

B (e.g., the Canadian Mortality Data Base) to form a set of records which one attempts to classify as links or nonlinks. In practice, the files are blocked using identifiers (e.g., by the phonetic code of surname and gender code) to limit the number of pairs compared. Multiple passes of the file are possible using different blocking items. The process of separating out the true matches is in reality a stepwise elimination of the false ones. It is not so much a matter of picking needles out of the haystack as of progressively getting rid of the haystack without losing the needles. The problem of uniquely identifying an individual in a file of 28 million records is more difficult than identifying the person in a file of 100,000, because one has to discard from consideration far more unlinkable pairs. In estimating the information needed for linkage, the file size needs to be taken into account. Second, a decision is made as to whether the link is true, and then finally a group of the appropriate records relating to the same individual or entity is formed. The group can consist of a one-to-one match (such as one event record with a death), a one-to-many match (such as one event linking to cancer records, where an individual may have several cancer incidence records), a many-to-one match (such as a cancer incidence file linking to a death file), or a many-to-many match.

The generalized system used at Statistics Canada estimates how likely it is that a pair of records refers to the same entity. It does this by comparing corresponding fields, one at a time, between records and seeing if the values agree, partially agree, disagree, or are missing. The types of comparison outcomes can be similar to those that a human clerk would use in carrying out the same task.

Quite briefly, when comparing value $A_x$ from record A (e.g., the record which is used to initiate the search) with value $B_y$ from record B (e.g., a death record, which is in the file being searched), the ODDS in favour of a correct LINK associated with the outcome $A_x.B_y$ (i.e., the comparison pair of values) may be written in terms of the relative probability of occurrence of the particular outcome in LINKS as compared with NONLINKS, that is

\[ \mathrm{ODDS}(A_x.B_y) = \frac{P(A_x.B_y \mid \mathrm{LINKS})}{P(A_x.B_y \mid \mathrm{NONLINKS})} \]

As in information theory, the odds are usually expressed as logarithms to the base 2 and are often multiplied by ten and rounded to avoid decimals:

\[ \text{Outcome Weight} = 10 \log_2 (\mathrm{ODDS}) \]

A rule is created to compare each of the fields (x, y) in the records. The comparisons can be straight comparisons, cross comparisons (e.g., comparing the first forename on record A with the second forename on record B), or specially written functions.

Because the weights are logarithms, the total odds in favour of a match may be expressed as the sum of a number of outcome weights:

\[ \text{Total Weight} = \text{Outcome Weight}(O_1) + \text{Outcome Weight}(O_2) + \cdots + \text{Outcome Weight}(O_n) \]

where $O_1, O_2, \ldots, O_n$ are the outcomes for the rules 1 to n (including any used for blocking) used to compare the fields on the records. The outcomes are assumed to be statistically independent.

The total weight becomes the overall relative probability that the potential link is in fact a definite link. As well, one may take into account the greater need for discriminating power where the population being searched is large and/or the particular individual is unlikely to be represented in it. Details of the calculation of absolute odds are described elsewhere (Newcombe et al., 1983; Newcombe, 1993).

By comparing the total weight against two thresholds, this estimate is converted into a decision as to whether or not the link is a true one. The first check is whether the total weight is above the upper threshold.
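The weight arithmetic described above can be illustrated with a minimal Python sketch. The function names, the field outcomes, and every probability below are hypothetical values chosen for illustration; they are not taken from the generalized system or from any Statistics Canada file.

```python
import math


def outcome_weight(p_given_links, p_given_nonlinks):
    """Outcome weight = 10 * log2(ODDS), rounded to avoid decimals."""
    odds = p_given_links / p_given_nonlinks
    return round(10 * math.log2(odds))


def total_weight(outcome_probabilities):
    """Sum the outcome weights; this assumes the outcomes are independent."""
    return sum(outcome_weight(p_l, p_n) for p_l, p_n in outcome_probabilities)


# Hypothetical pairs: (P(outcome | LINKS), P(outcome | NONLINKS)).
example = {
    "surname agrees": (0.8, 0.0125),      # odds of 64 in favour of a link
    "birth year agrees": (0.8, 0.1),      # odds of 8 in favour of a link
    "birthplace disagrees": (0.1, 0.4),   # odds of 1/4, i.e. against a link
}

weights = {name: outcome_weight(*probs) for name, probs in example.items()}
total = total_weight(example.values())
print(weights)
print(total)
```

With these made-up probabilities, agreement on surname contributes 60, agreement on birth year contributes 30, and disagreement on birthplace contributes -20, giving a total weight of 70. An outcome that is more common among nonlinks than among links yields a negative weight and so counts against the link.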

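The two-threshold decision can be sketched in the same way. This is a minimal illustration only: the function name and the threshold values are assumptions, the status labels follow the three temporary statuses used in this paper (definite link, possible, unlinked), and the treatment of a weight exactly equal to a threshold is a choice made here, since the text does not specify it.

```python
def classify(total_weight, lower, upper):
    """Assign a temporary status by comparing a total weight to two thresholds."""
    if lower > upper:
        raise ValueError("lower threshold must not exceed upper threshold")
    if total_weight > upper:
        return "definite"
    if total_weight < lower:
        return "unlinked"
    # Assumption: a weight equal to either threshold is treated as "possible".
    return "possible"


# Hypothetical thresholds and total weights.
LOWER, UPPER = 20, 50
statuses = [classify(w, LOWER, UPPER) for w in (70, 35, 10)]
print(statuses)
```

Setting the two thresholds to the same value reproduces the single-threshold case, in which every pair not exactly at the threshold is classified as either a link or a nonlink.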

A link whose total weight is above the upper threshold is assigned a temporary status of definite link; if it is below the lower threshold, the temporary status is unlinked; if it is between the two thresholds, the temporary status is possible.

                  Lower threshold        Upper threshold
    Unlinked            |    Possibly linked    |       Linked

Possible links are then examined in more detail, perhaps on a sample basis, to fine-tune the setting of the thresholds. In smaller projects, manual resolution can be carried out on these links. Further reference is usually made to the original source documents, where further identifying information may be available. In large projects, one threshold value is sometimes chosen to classify records as links and nonlinks. The analysis may be run using different threshold values. Considerable work has also been carried out in looking at the availability and type of identifiers used to identify entities (e.g., persons, families, households, farms). In some studies there is a need for better separation of personal and family linkages when thresholds are being chosen (Newcombe, 1993). The ultimate uses of the files and the required accuracy must be taken into account.

The selection of thresholds involves two types of error. A Type I error occurs if an unlinked pair is erroneously classified as linked because it falls above the upper threshold (i.e., false positives). A Type II error occurs if a pair is erroneously classified as nonlinked (i.e., false negatives). The upper and lower thresholds are determined by setting acceptable error bounds that limit the number of errors to a level judged appropriate for the analysis.

The Files Being Linked

To a large extent, the quality of record linkage is dependent on the quality of the files being linked (i.e., quality in, quality out). Substantial benefits may result from efforts to improve the accuracy, completeness, and standardization of source files.

There is a need for standard coding procedures (e.g., causes of disease, geographic and other coding) and data collection (Carpenter and Fair, 1990).

There is a need to have uniform data sets, not only for diagnostic information; sufficient information is also required to identify the individual and entities. Cases where one might anticipate problems include distinguishing twins, babies who die shortly after birth, persons who are similarly named in families, hyphenated names, and various groups where the naming conventions may not be the same. The ideal criteria that personal identifying information on a medical record should satisfy are shown in Table 1.

There is a requirement to have data available not only on a regional basis but also at the national level. For example, traditionally Canada's population has been very mobile: over each five-year period since 1961, almost half the population moved from one neighbourhood, town, city, province, or territory to another. The most mobile group of people in Canada between the 1986 and 1991 censuses was those aged 25 to 29, with seven of every ten people in this age group reporting that they lived at a different address in the 1991 census than they had in 1986.

At Statistics Canada, efforts have been directed not only at maintaining a national file of mortality which goes back to 1950, but also at developing the files required for carrying out alive (since 1984), birth (since 1987), and cancer (since 1969) follow-ups.

The Linkage Review and Approval Process

It is important to mention that all studies involving record linkage at Statistics Canada must satisfy a prescribed review and approval process. For example, the purpose of the record linkage activity must be statistical or research in nature and must be consistent with the mandate of Statistics Canada as described in the Statistics Act. The Statistics Act has clauses regarding the secrecy of information. The record linkage activity must have

demonstrable cost or respondent burden savings over other alternatives, or be the only feasible option. It must be shown to be in the public interest.

Table 1.--Criteria for Assessing the Value of Personal Identifying Information

In ideal circumstances, personal identifying information on a medical record should satisfy the following requirements (Cheeseman and Butler, 1972; Smith, 1973):

The identifying information should be permanent; that is, it should exist at the birth of the person to whom it relates, or be allocated to him/her at birth, and it should remain unchanged throughout life.

The identifying information should be universal; that is, similar information should exist for every member of the population.

The identifying information should be reasonable; that is, the person to whom it relates and others should have no objection to its disclosure for medical purposes.

The identifying information should be economical; that is, it should not consist of more alphabetic or numeric digits and other characters than necessary.

The identifying information should be simple; that is, it should be capable of being handled easily by a clerk and computers.

The identifying information should be available.

The identifying information should be known; that is, either the person to whom it relates or an informant acting on his/her behalf should be able to provide it on demand.

The identifying information should be accurate; that is, it should not contain errors that could result in its discrepancy on two records relating to the same person.

The identifying information should be unique; that is, each member of the population should be identified differently.

No single identifier or identity set has been devised to satisfy all of these items. The efficiency of the record linkage operation depends on how well the items selected for comparison satisfy this standard.

Major Uses of Linkage

Record linkage is being carried out at a number of provincial centres as well as at the national level in Canada. There are numerous uses. It is being used for longitudinal mortality and cancer follow-up of various cohorts, as well as for clinical trial and case control studies; for the creation of registries; for the creation of patient-oriented rather than event-oriented statistics (e.g., to examine the number of patients admitted to hospital rather than the number of events); for follow-up of surveys; for the preparation of sampling frames; and for examining factors which influence health care costs and curative and preventive approaches. Details of some specific examples are as follows.

Longitudinal Studies of Mortality

Long-term medical follow-up has been carried out with a number of different occupational groups, such as uranium miners (Kusiak et al., 1993), asbestos workers, nickel miners, farmers (Fair, 1993), workers at Atomic Energy of Canada (Gribbin et al., 1993), synthetic textile workers (Goldberg et al., 1993), a ten percent sample of the Canadian labour force, and so forth. A starting-point file, which identifies an individual as belonging to an exposed or at-risk population, is linked to an endpoint such as cancer and/or death (Statistics Canada, 1992).

Collaboration is often required among various countries to carry out analysis. For example, radon is a chemically inactive, colourless, radioactive gas which occurs naturally. Underground miners who are heavily exposed to the radioactive decay products of radon gas suffer especially high rates of lung cancer. A joint analysis of 11 studies of underground miners was recently published (Lubin et al., 1994). In this joint analysis, three

Canadian cohort studies were included. In the March 10, 1994 issue of Nature, Julian Peto and Sarah Darby have stated that extrapolation from underground miners' data suggests that radon in people's homes may cause some 2,000 lung cancers each year in Britain and 15,000 in the United States, figures which imply that radon is among the most serious environmental (as distinct from self-inflicted) causes of cancer mortality yet identified in the world. They go on to say that there is a dearth of direct evidence on the effect of domestic radon exposure. Other such international studies have included nickel workers, and plans are being made for further studies of radiation workers as well as individuals in the pulp and paper industry.

Follow-up of Clinical Trials

A generalized linkage capacity is important for the follow-up of clinical trials. For example, the Canadian National Breast Screening Study is an individually randomized trial designed to evaluate the efficacy of the combination of annual mammography, physical examinations of the breasts, and the teaching of self-examination in reducing the rate of death from breast cancer in women (Miller et al., 1992a and 1992b). The study recruited approximately 90,000 women aged 40-59 from across Canada to participate for a five-year period from 1980. The main endpoint for assessing the results of screening is the mortality rate from breast cancer in the study groups compared with the controls.

A number of important issues arise from such studies. These include, for example, the comparative value of mammography and physical examination among different age groups, and the value of screening women under 50 years of age.

Building, Maintaining and Using Disease Registries

Record linkage has been important in building, maintaining, and using disease registries. For example, in Ontario a number of data sources are utilized for the creation of a cancer registry (Dale, 1989). The four major sources of information are hospital separations with any mention of cancer, pathology reports with any mention of cancer, death certificates in which cancer is the underlying cause of death, and reports describing patients referred to the Regional Cancer Centres and Princess Margaret Hospital. The generalized record linkage system (GRLS.V1) has been used in creating the Ontario Cancer Registry.

Canada is one of the few countries in the world with a cancer reporting system covering the complete population. This coverage is achieved through the cooperation of the various provincial/territorial registries, which have provided data to Statistics Canada since 1969. The National Cancer Incidence Reporting System has recently been processed into a form suitable for record linkage. This forms what we call the Canadian Cancer Data Base. This file is an endpoint for a number of studies (e.g., follow-up of a study of farmers, and of tuberculosis patients who have been exposed to fluoroscopy). The file is also being internally linked for a study of childhood cancers.

Death clearance of cancer registries can be achieved by linking existing cancer patients with mortality. Ascertainment of new cancer cases from mortality may result. Statistical uses such as the calculation of survival rates require the linkage of cancer and death information. Registries can be used for the follow-up of cohort studies, in clinical trials, for case control studies, for the follow-up of screening programs, and in genetic studies.

Other types of disease registries have been created and utilized for a variety of purposes (e.g., the British Columbia Health Surveillance Registry for the investigation of reproductive problems).

Regional Variation in the Incidence of Disease

Generalized linkage facilities can aid in looking at regional variation in the incidence of disease. For example, cardiovascular diseases (CVD) are the

Page 8: An Overview of Record Linkage Canada · An Overview of Record Linkage in Canada Martha Fair Statistics Canada anadas health care system is best described public agency Minister of

FAIR

leading causes of death, potential years of life lost and utilization of health care services in Canada. Since the late 1960s, Canada and many other western countries have experienced a noticeable decline in mortality from CVD, and particularly from ischemic heart disease (IHD). Efforts are under way worldwide to study the determinants of change in cardiovascular mortality in prospective studies. The idea of linking routinely collected hospital separation records of IHD to the Canadian Mortality Data Base was initiated and field tested in the provinces of Nova Scotia and Saskatchewan. A feasibility study relating to acute myocardial infarction (AMI) was initiated, and it used information from provincial morbidity files and the Canadian Mortality Data Base. A sample of AMI cases from the hospital admission databases of Nova Scotia and Saskatchewan was field validated, and individuals' histories were created to differentiate recurrent from incident cases. Incidence rates for fatal and non-fatal AMIs were higher in Nova Scotia than in Saskatchewan over three time periods (1977, 1981, 1985). A number of provincial surveys and related studies are now being carried out across the country.

Comprehensive Multifile Databases

Clinical data collection and surveys are often episodic, depending upon such factors as funding and investigator interest. Alternatively, administrative databases are generally continuous and are characteristically available to researchers a few months after the fiscal year end.

In Manitoba, the population health database records all patient contacts with physicians, hospitals and nursing homes. It was derived from information contained in the population registry and from health insurance claims routinely filled out by physicians and health care facilities with the Manitoba Health Services Commission (Roos et al., 1992). Information from the various files has been described earlier. Seven files provide the bulk of the information contained in the Manitoba health database, including the registration file, hospital file, medical claims file, personal care home file, the Manitoba Immunization Monitoring system, cancer registry and mortality file. These data files have been used to address many research issues, including those of surgery (Roos and Roos, 1987).

Roos, Shapiro and Roos (1984) have observed that dying is a much more important factor than aging per se in the high usage of hospitals. For the elderly, as for the non-elderly, a dramatic increase in use occurs in the relatively short period before death. Persons aged forty-five and over in Manitoba used an average of forty-two hospital bed days in the last year of life, while even persons aged eighty-five and over who survived the next four years had less than seven bed days per year. Projections of hospital needs might profitably take into account expected deaths as well as population aging (Beaujot, 1991).

Discussion and Future Directions

There are a number of areas where record linkage could be used for additional administrative applications. This includes the creation, maintenance and use of files for persons enrolled in the health care system. For example, many provinces have found the need to issue health cards. This may require creation of an up-to-date file of eligible registrants. There must be checks to verify that duplicate entries are not present on the file. Some provinces in Canada have had problems with ensuring that the registrant file is up-to-date and without duplication.

With advanced medicine and medical insurance, one might expect that social differences in mortality might disappear in Canada. However, this is not the case. Mortality differences by income or social class, as well as education, are important. There is a need to study mortality, births and cancer by income and education level in greater depth.

Cancer and other registries are likely to become greater users of record linkage techniques. There will be a need to create and compare data across wider geographic areas. The recent meeting of the North American Central Cancer Registries, with its recent name change as well as its chosen theme -- co-operation and collaboration -- is evidence of this. Other types of registries include registries containing exposure data, such as the Canadian National Dose Registry of persons exposed to radiation, registries of specific diseases, and those used for studies of reproductive outcomes.

Quality control will play a greater part in every aspect of record linkage applications. This includes such things as the quality of the data sources, timeliness (e.g., production speed), customer satisfaction, relevance of the output, measurement of errors (e.g., detecting and correcting errors), and the security of sensitive information. Availability of resources (personnel, time, money, computer hardware and software) is an important factor.

Advances in information technology hold great potential not only for increasing efficiency, but also for redefining organizational and geographic boundaries and for improving information management. For some applications, one may have to implement a re-engineering plan.

It is anticipated that more applications will be conducted using software that is developed for a variety of personal computers that may be in a network environment. There is a trend to move from the mainframe computing environments, as was discussed at the recent Record Linkage Workshop held at Niagara-on-the-Lake.

There is also a trend to acquire and maintain information from a variety of sources and to put databases to multiple uses. Agencies will require careful attention in their stewardship of data for policy decisions and research. A comprehensive list of recommendations for Federal statistical agencies in the United States is given in the recent book Private Lives and Public Policies -- Confidentiality and Accessibility of Government Statistics (Duncan et al., 1993). Administrative and statistical research uses of the files will need to be differentiated.

In order for statistics to be relevant to decision-makers, Dr. Fellegi, Statistics Canada's Chief Statistician, has proposed that researchers need to identify social problems of recognized importance, determine through analysis the factors related to such problems, find out which of these can be influenced through decisions, and effectively communicate the results (Beaujot, 1991).

References

Beaujot (1991). Population Change in Canada: The Challenges of Policy Adaptation. McClelland and Stewart Inc., The Canadian Publishers, 481 University Avenue, Toronto, M5G 2E9.

Carpenter and Fair, eds. (1989). Canadian Epidemiology Research Conference: Proceedings of the Record Linkage Session and Workshop.

Carpenter and Fair (1990). Standard Data Collection Package for Medical Follow-up Studies. Ottawa, Ontario, K1A 0T6: Statistics Canada, Health Statistics Division. Health Reports, Cat. 82-003, 1990, 157-173.

Cheeseman and Butler (1972). Computer-based Routine Systems of Linked Medical Records with Particular Reference to Northern Ireland. Northern Ireland Med Record Linkage Research Unit, Report No RLU, Queen's University at Belfast.

Dale (1989). Linkage as Part of a Production System: The Ontario Cancer Registry, in Canadian Epidemiology Research Conference: Proceedings of the Record Linkage Sessions and Workshop, eds. Carpenter and M.E. Fair.


Donaldson and Lohr, eds. (1994). Health Data in the Information Age: Use, Disclosure and Privacy. National Academy Press, Washington, D.C.

Duncan, Jabine and de Wolf, eds. (1993). Private Lives and Public Policies -- Confidentiality and Accessibility of Government Statistics. National Academy Press, Washington, D.C.

Dunn (1946). Record Linkage. American Journal of Public Health, 36, 1412-1416.

Fair (1993). Recent Advances in Matching and Record Linkage from a Study of Canadian Farm Operators and Their Farming Practices, in Proceedings of the International Conference on Establishment Surveys, American Statistical Association.

Fellegi and Sunter (1969). A Theory for Record Linkage. Journal of the American Statistical Association, 64, 1183-1210.

Gribbon, Weeks and Howe (1993). Cancer Mortality (1956-1985) Among Male Employees of Atomic Energy of Canada Limited with Respect to Occupational Exposure to External Low-Linear-Energy-Transfer Ionising Radiation. Radiation Research, 133, 375-380.

Gill, Goldacre, Simmons, Bettley and Griffith (1993). Computerized Linking of Medical Records: Methodological Guidelines. Journal of Epidemiology and Community Health, 47, 316-319.

Goldberg, Carpenter, Theriault and Fair (1993). The Accuracy of Ascertaining Vital Status in a Historical Cohort of Synthetic Textile Workers Using Computerized Record Linkage to the Canadian Mortality Data Base. Canadian Journal of Public Health, 84, 201-204.

Howe and Lindsay (1981). A Generalized Iterative Record Linkage Computer System for Use in Medical Follow-Up Studies. Computers and Biomedical Research, 14, 327-340.

Howe and Spasoff, R.A., eds. (1986). Proceedings of the Workshop on Computerized Record Linkage in Health Research, Ottawa, Ontario, May 21-23, 1986. University of Toronto Press, Toronto.

Jaro (1989). Advances in Record Linkage Methodology as Applied to Matching the 1985 Census of Tampa, Florida. Journal of the American Statistical Association, 84, 414-420.

Kilss and Alvey, eds. (1985). Record Linkage Techniques -- 1985: Proceedings of the Workshop on Exact Matching Methodologies, Arlington, Virginia, May 9-10, 1985. Department of Treasury, Internal Revenue Service, Washington, D.C.

Kusiak, Ritchie, Muller and Springer (1993). Mortality from Lung Cancer in Ontario Uranium Miners. British Journal of Industrial Medicine, 50, 920-928.

Lubin, Boice, Edling, Hornung, Howe, Kunz, Kusiak, Morrison, Radford, Tirmarche, Woodward, Xiang and Pierce (1994). Radon and Lung Cancer Risk: A Joint Analysis of 11 Underground Miners Studies. U.S. Department of Health and Human Services, Public Health Services, National Institutes of Health, NIH Publication No. 94-3644.

Miller, Baines, To and Wall (1992a). Canadian National Breast Screening Study: Breast Cancer Detection and Death Rates Among Women Aged 40 to 49 Years. Canadian Medical Association Journal, 147, 1459-1476.


Miller, Baines, To and Wall (1992b). Canadian National Breast Screening Study: Breast Cancer Detection and Death Rates Among Women Aged 50 to 59 Years. Canadian Medical Association Journal, 147, 1477-1488.

Ministry of Industry, Science and Technology (1993). Canada Year Book 1994. Catalogue No. 11-402E/1994. Available from Publication Sales and Services, Statistics Canada, Ottawa, K1A 0T6.

Nair and Karim (1993). An Overview of Health Care Systems: Canada and Selected OECD Countries. Health Reports, 259-279.

Newcombe (1967). Record Linking: The Design of Efficient Systems for Linking Records into Individual and Family Histories. American Journal of Human Genetics, 19, 335-359.

Newcombe (1988). Handbook of Record Linkage: Methods for Health and Statistical Studies, Administration and Business. Oxford University Press, Oxford, U.K.

Newcombe (1993). Distinguishing Individual Linkages of Personal Records from Family Linkages. Methods of Information in Medicine, 32, 358-364.

Newcombe, Fair and Lalonde (1992). The Use of Names for Linking Personal Records. Journal of the American Statistical Association, 87, 1193-1204.

Newcombe, Kennedy, Axford and James (1959). Automatic Linkage of Vital Records. Science, 130, 954-959.

Newcombe, Smith, Howe, Mingay, Strugnell and Abbatt (1983). Reliability of Computer Versus Manual Death Searches in a Study of the Health of Eldorado Uranium Workers. Computers in Biology and Medicine, 13, 157-169.

Nuyens (1993). Generalized Record Linkage at Statistics Canada, in Proceedings of the International Conference on Establishment Surveys, American Statistical Association, pp. 927-930.

Peto and Darby (1994). Radon Risk Assessed. Nature, 368, 97-98.

Roos and Roos (1987). Using Large Data Bases for Research on Surgery, in Statistical Uses of Administrative Data: An International Symposium, eds. J.W. Coombs and M.P. Singh. Statistics Canada, Ottawa, 75-94.

Roos, Shapiro and Roos (1984). Aging and the Demand for Health Services: Which Aged and Whose Demand? The Gerontologist, 24, 31-36.

Roos, Wajda, Nicol and Roberts (1992). Record Linkage: An Overview, in Medical Effectiveness Research Data Methods, eds. M.L. Grady and H.A. Schwartz. AHCPR Pub. No. 92-0056, U.S. Department of Health and Human Services, Rockville, MD, 119-129.

Scheuren and Winkler, W.E. (1993). Regression Analysis of Data Files that Are Computer Matched. Survey Methodology, 19, 39-58.

Smith (1973). Record Linkage of Hospital Admission-Separation Records. Atomic Energy of Canada, AECL-4507, Chalk River, Ontario.

Smith and Newcombe (1980). Automated Follow-up Facilities in Canada for Monitoring Delayed Health Effects. American Journal of Public Health, 73, 39-46.


Smith and Silins (1981). Generalized Iterative Record Linkage System. Proceedings of the Social Statistics Section, American Statistical Association, pp. 128-137.

Statistics Canada (1992). Studies and References Relating to the Uses of the Canadian Mortality Data Base.

Tepping, B.J. (1968). A Model for Optimum Linkage of Records. Journal of the American Statistical Association, 63, 1321-32.

World Bank (1984). World Development Report. The World Bank, Washington, D.C.

Copies available from Statistics Canada, Occupational and Environmental Health Research Section, R.H. Coats Building, 18th Floor, Tunney's Pasture, Ottawa, Ontario, K1A 0T6.
