

Genome responses of trypanosome infected cattle

The encounter between cattle and trypanosomes elicits changes in the activities of both genomes: that of the host and that of the parasite. These changes determine the fate of the host and parasite and the outcome of the encounter. Although the outcome in most cattle is a slow death following progressive anaemia and loss of body condition, in other cattle the outcome is more favourable; these cattle regain the initiative, suppress parasite growth, recover from the initial clinical signs, gain weight, and reproduce normally. Cattle exhibiting this latter outcome are said to be trypanotolerant. Below is a synopsis of the genome responses that set apart the trypanotolerant cattle from the susceptible cattle during experimental trypanosome infection.

MetaCore GeneGo was used to identify networks among the genes that were differentially expressed. The figure (right) shows the largest network of connected genes that were differentially expressed between trypanosome-infected resistant (N’Dama) and susceptible (Boran) cattle.

STAT3 and c-Fos are the most highly connected genes in the network. STAT3 is a transcription factor that is activated in response to the IL-6 family of cytokines and is involved in the acute phase response in the liver.

Interestingly, STAT3 is modulated by RAC1, which is in turn controlled by VAV1 and ARHGAP15, both of which are located in the QTLs controlling trypanotolerance.

Trypanosome infection induces profound changes in genome function, manifested as changes in the steady-state expression level of many genes.

The differences in the genome responses of resistant and susceptible cattle correspond to some of the phenotypic attributes that correlate with susceptibility.

In this study, 20 N’Dama (tolerant) cattle and 20 Boran (susceptible) cattle were experimentally infected with a lethal dose of T. congolense IL1180. Liver biopsy samples were taken from each individual on specified days before and after infection, such that at each time point there were samples from at least 5 Boran and 5 N’Dama (A). The mRNA profiles in the biopsy material were assayed using the Affymetrix system (B). The gene data were fed into an analysis workflow (C) that integrates the expression measures, gene ontology, QTL information, and gene pathway data.

Rennie C (1), Hulme H (2), Fisher P (2), Hall L (3), Agaba M (4), Noyes HA (1), Kemp SJ (1,4), Brass A (2,5)

Acknowledgements: This work was wholly supported by The Wellcome Trust. The authors would also like to thank Dr Park, based in Dr McHugh's group at University College Dublin, for sharing bovine gene symbol information for Affymetrix probes.

Abstract

High-throughput technologies inevitably produce vast quantities of data. This presents challenges in terms of developing effective analysis methods, particularly where the analysis involves combining data derived from different experimental technologies.

In this investigation, we applied a systematic approach to combine microarray gene expression data, QTL data and pathway analysis resources in order to identify functional candidate genes underlying tolerance of Trypanosoma congolense infection in cattle (see Agaba et al poster at this conference). We automated much of the analysis using Taverna workflows previously developed for the study of trypanotolerance in the mouse model.

We identified pathways represented by genes within the QTL regions, and subsequently ranked this list according to which pathways were over-represented in the set of genes that were differentially expressed (over time or between tolerant N'dama and susceptible Boran breeds) at various timepoints after T. congolense infection. The genes within the QTL that played a role in the highest-ranked pathways were flagged as strong candidates for experimental confirmation.

1 School of Biological Sciences, BioSciences Building, University of Liverpool, Crown Street, Liverpool, L69 7ZB, UK
2 School of Computer Science, Kilburn Building, University of Manchester, Oxford Road, Manchester, M13 9PL, UK
3 Roslin Institute, Roslin, Midlothian, EH25 9PS, Scotland, UK
4 ILRI, PO Box 30709, Nairobi, 00100, Kenya
5 Faculty of Life Sciences, University of Manchester, Smith Building, Oxford Road, Manchester, M13 9PT, UK

A systematic, data-driven approach to the combined analysis of microarray and QTL data

QTL location    Phenotype
BTA2            Anaemia
BTA4            Parasitaemia
BTA7            Anaemia and parasitaemia
BTA16           Anaemia
BTA27           Anaemia

Background

African bovine trypanosomiasis is one of the most important diseases affecting African livestock production. West African taurine cattle, such as the N'dama, are more resistant to the pathological consequences of trypanosomiasis (trypanotolerant) than East African zebu cattle, such as the Boran.

A microarray timecourse experiment was carried out to investigate gene expression in N'dama and Boran cattle infected with Trypanosoma congolense, in order to identify the genes underlying trypanotolerance (see Agaba et al poster at this conference for more details).

Trypanotolerance

Trypanotolerance is a complex phenotype involving several distinct components, which are likely to be under separate genetic control. Key features include the ability to control anaemia, control parasitaemia and maintain bodyweight. Data on trypanotolerance QTL suggest that the phenotypic traits involved in trypanotolerance may be influenced by multiple genetic loci and possibly by complex epistatic or environmental effects (Proc Natl Acad Sci USA 2003;100(13);7443-7448).

Microarray data

Microarray data for liver samples taken from Boran and N'dama cattle at 0, 12, 15, 18, 21, 26, 29, 32 and 35 days post-infection were analysed. Outliers were identified using dChip and removed before the remaining hybridisations were normalised using the Robust Multi-array Average (RMA) method. Principal Components Analysis (PCA) was used to check that the hybridisations clustered as expected.

T-tests were used to identify genes that were differentially expressed (p<=0.01) between the two breeds at each timepoint, and paired T-tests (using data for the same individual animals at different timepoints) were used to identify genes that were differentially expressed (p<=0.01) within breed at any timepoint compared to day 0.
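The two kinds of test described above can be illustrated with a minimal Python sketch. It is not the software used in the study: the choice of Welch's unequal-variance form for the between-breed comparison, the function names, and the expression values are all assumptions made for illustration. The p-value is obtained by numerically integrating the Student's t density, so only the standard library is needed.

```python
import math
from statistics import mean, variance


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
    coef = math.exp(log_coef) / math.sqrt(df * math.pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value via Simpson integration of the density on [0, |t|]."""
    a = abs(t)
    if a == 0:
        return 1.0
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return max(0.0, 1 - 2 * area)


def welch_t_test(x, y):
    """Unpaired comparison, e.g. two breeds at one timepoint (Welch form assumed)."""
    vx, vy = variance(x) / len(x), variance(y) / len(y)
    t = (mean(x) - mean(y)) / math.sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx ** 2 / (len(x) - 1) + vy ** 2 / (len(y) - 1))
    return t, two_sided_p(t, df)


def paired_t_test(before, after):
    """Paired comparison, e.g. the same animals at day 0 and a later timepoint."""
    diffs = [b - a for a, b in zip(before, after)]
    t = mean(diffs) / math.sqrt(variance(diffs) / len(diffs))
    return t, two_sided_p(t, len(diffs) - 1)


# Hypothetical log2 expression values for one probeset (not data from the study).
ndama_day21 = [5.1, 5.3, 4.9, 5.2, 5.0]
boran_day21 = [6.0, 6.2, 5.9, 6.1, 6.3]
t, p = welch_t_test(ndama_day21, boran_day21)
print(f"between breeds: t = {t:.2f}, differentially expressed = {p <= 0.01}")
```

The paired form uses the per-animal differences, which is why it requires samples from the same individuals at both timepoints.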

QTL data

Sixteen trypanotolerance QTL had been identified in a previous mapping study (Proc Natl Acad Sci USA 2003;100(13);7443-7448). Five of these QTL were selected based on the phenotypic trait involved, the mapping resolution and the strength of the effect (see table on the left for a summary of the QTL and associated phenotypes). The base-pair positions of these QTL relative to the EnsEMBL bovine genome preliminary build Btau2.0 were determined manually.
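Once a QTL has base-pair boundaries, finding the genes it contains is an interval-overlap check. The sketch below is illustrative only: the coordinates, gene identifiers and the `Interval` and `genes_in_qtl` names are hypothetical, and nothing here reproduces the positions determined in the study.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # base-pair position, inclusive
    end: int    # base-pair position, inclusive


def overlaps(a, b):
    """True when two intervals share a chromosome and at least one base."""
    return a.chrom == b.chrom and a.start <= b.end and b.start <= a.end


def genes_in_qtl(genes, qtl_regions):
    """Return {qtl_name: sorted gene ids} for genes overlapping each QTL."""
    return {
        name: sorted(g for g, loc in genes.items() if overlaps(loc, region))
        for name, region in qtl_regions.items()
    }


# Hypothetical coordinates, for illustration only.
qtl_regions = {"BTA7_example": Interval("BTA7", 1_000_000, 5_000_000)}
genes = {
    "GENE_A": Interval("BTA7", 900_000, 1_100_000),    # straddles the QTL start
    "GENE_B": Interval("BTA7", 6_000_000, 6_100_000),  # outside the QTL
    "GENE_C": Interval("BTA2", 2_000_000, 2_050_000),  # different chromosome
}
print(genes_in_qtl(genes, qtl_regions))
```

A gene that only partly overlaps the QTL boundary is counted as inside it here; whether to include such genes is a design choice that the source does not specify.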

Combined analysis approach

The combined analysis approach is described in Figure 1 (right). In brief, it involves mapping QTL genes and Affymetrix microarray probes to genes in the EnsEMBL bovine preliminary build Btau2.0, then identifying KEGG pathways that include the EnsEMBL genes. The two resulting pathway lists are compared to generate a list of KEGG pathways that include at least one differentially expressed gene and at least one gene in the QTL. The pathway list is then ranked according to the results of a Fisher exact test performed on the microarray data using DAVID, and annotated using literature searches and various public databases of gene and pathway information.
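The pathway comparison step described above amounts to a set intersection. The following minimal sketch assumes a gene-to-pathway lookup is already available as a plain dictionary; the gene and pathway names are hypothetical, and the Taverna workflows used in the study are not reproduced here.

```python
def pathways_for(genes, gene_to_pathways):
    """Map each pathway to the subset of `genes` that belong to it."""
    hits = {}
    for gene in genes:
        for pathway in gene_to_pathways.get(gene, ()):
            hits.setdefault(pathway, set()).add(gene)
    return hits


def candidate_pathways(qtl_genes, de_genes, gene_to_pathways):
    """Keep pathways with at least one QTL gene and at least one DE gene."""
    qtl_hits = pathways_for(qtl_genes, gene_to_pathways)
    de_hits = pathways_for(de_genes, gene_to_pathways)
    shared = qtl_hits.keys() & de_hits.keys()
    return {
        p: {"qtl_genes": sorted(qtl_hits[p]), "de_genes": sorted(de_hits[p])}
        for p in sorted(shared)
    }


# Hypothetical annotation, for illustration only.
gene_to_pathways = {
    "GENE_A": ["pathway_1", "pathway_2"],
    "GENE_B": ["pathway_2"],
    "GENE_C": ["pathway_3"],
}
qtl_genes = {"GENE_A"}
de_genes = {"GENE_B", "GENE_C"}
print(candidate_pathways(qtl_genes, de_genes, gene_to_pathways))
```

Keeping the contributing genes alongside each pathway makes it straightforward to read off the QTL genes that become candidates once the pathways are ranked.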

Large sections of the analysis were automated (shown in blue in Figure 1) by adapting Taverna workflows previously developed for the study of trypanosomiasis responses in mice (Nucl Acids Res 2007;35(16);5625-5633). The adaptations required involved mapping genes to human homologues and using bovine IDs and human IDs in the analysis, rather than murine IDs.

Results

The analysis procedure itself could be reused or adapted for studying another species or another phenotypic trait for which QTL data are available.

In the case of the bovine trypanotolerance study, the result can be quantified as the reduction of an enormous set of potential targets for investigation to a manageable shortlist of the most likely targets. Out of 24128 probesets on the array, 12591 were significantly differentially expressed (p <= 0.01 in one or more T-tests comparing expression between breeds or over time). Of these, 8342 probesets could be mapped to a known gene, representing 7071 unique gene symbols in total. In contrast, there were 127 genes in the QTL that were involved in pathways identified by the combined analysis protocol. If we include only pathways with a significant (p <= 0.05) score on the DAVID Fisher exact test, the list of targets is reduced to only 51 genes (shown in the table below). Note that these results are based on an analysis with the EnsEMBL bovine genome preliminary build Btau2.0. A more recent preliminary build is available; the analysis will be repeated and key findings discussed in a future publication.
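The pathway ranking and the p <= 0.05 filter can be illustrated with a plain one-sided Fisher exact test, computed from the hypergeometric distribution. This is a sketch under stated assumptions rather than a reimplementation of DAVID, whose reported score may differ from the unmodified test shown here; the counts and pathway names are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher exact p-value for over-representation.

    N: genes in the background, K: background genes in the pathway,
    n: differentially expressed genes, k: DE genes in the pathway.
    Returns P(X >= k) under the hypergeometric distribution.
    """
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def rank_pathways(counts, n_de, n_background, alpha=0.05):
    """Sort pathways by p-value and keep those with p <= alpha.

    counts maps pathway name -> (DE genes in pathway, pathway size).
    """
    scored = sorted(
        (fisher_enrichment_p(k, n_de, K, n_background), name)
        for name, (k, K) in counts.items()
    )
    return [(name, p) for p, name in scored if p <= alpha]


# Hypothetical counts, for illustration only.
counts = {"pathway_A": (8, 10), "pathway_B": (1, 10)}
for name, p in rank_pathways(counts, n_de=20, n_background=100):
    print(f"{name}: p = {p:.3g}")
```

Under this scheme, the QTL genes belonging to the pathways that survive the filter would form the shortlist of candidates.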

Discussion

Automated approaches are becoming increasingly necessary to enable researchers to handle the output from modern high-throughput technologies. Data-driven methods are useful in studying complex phenotypes where an analysis based solely on biological processes already known to be involved may be insufficient. Pathway-based approaches provide a means to link microarray data to QTL data in a biologically meaningful way.

Pathway-based, data-driven, systematic, semi-automated analysis approaches provide an excellent means to triage data from high-throughput technologies, providing a shortlist of viable targets for thorough manual investigation and experimental confirmation.

Figure 1. Summary of the combined analysis procedure. Stages of the analysis that were automated using Taverna workflows are in blue.

[Figure: panels A, B and C]

The Affymetrix chips contained approximately 24,000 probe sets. Of these, between 600 and 750 probes were differentially expressed between infected and uninfected cattle. Principal component analysis of the expression data clearly shows genome-wide differences between the transcriptomes of tolerant and susceptible cattle (Top right, PCA component 3), and some of these differences are associated with the presence and progression of trypanosome infection (Top left, PCA component 1).

Agaba M (1), Hulme H (2), Rennie C (3), Mwakaya J (1), Ogugo M (1), Brass A (2), Kemp SJ (1,3), Hall L (4)

Addresses

1 International Livestock Research Institute, Box 30709 - 00100, Nairobi, Kenya
2 The University of Manchester, LF8 Kilburn Bldg, Oxford Rd, Manchester M13 9PL, UK
3 School of Biological Sciences, University of Liverpool, Liverpool, L69 7ZB, UK
4 Roslin Institute, Roslin, Midlothian, EH25 9PS, Scotland, UK

Acknowledgements: We thank all the staff at the ILRI large animal facility and all colleagues in the Wellcome Trust Consortium. This work was supported by the Wellcome Trust.

Data from an experiment showing the expression of thousands of genes on a single GeneChip® probe array. Image courtesy of Affymetrix.