Single-step GBLUP and quality control (biotech.dld.go.th/webnew/Data/KM-BBLP/Animal...)

Sayan Buaban, BBLP, DLD, 09-2018


Page 2

Genetic evaluation

• Inputs: performance data and pedigree information
• A: relationships based on pedigree data
– Henderson’s rules to create the inverse of the relationship matrix
• BLUP
– Iteration-on-data methods
• Output: EPD / EBV

Page 3

Single-step genetic evaluation

• Inputs: performance data, pedigree information, and SNP genotypes
• A: relationships based on pedigree data
• G: relationships based on genomic data
• H: blended relationships
• BLUP
• Output: GE-EPD / GEBV

Page 4

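Single-step approach to genomic evaluation

• Traditional genetic evaluation

$$
\begin{bmatrix} \mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\ \mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \alpha\mathbf{A}^{-1} \end{bmatrix}
\begin{bmatrix} \mathbf{b} \\ \mathbf{u} \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}
$$

• Single-step genomic evaluation

$$
\begin{bmatrix} \mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\ \mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \alpha\mathbf{H}^{-1} \end{bmatrix}
\begin{bmatrix} \mathbf{b} \\ \mathbf{u} \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}
$$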
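The two systems on this slide differ only in the relationship-matrix inverse added to the Z'Z block. As a minimal sketch, the function below builds and solves the equations with NumPy for whichever inverse is passed in. The function name `solve_mme`, the toy data, and the value of α are illustrative assumptions; α is taken here as the usual ratio of residual to additive genetic variance, which the slide does not state. BLUPF90 itself uses iterative solvers, not the dense solve shown here.

```python
import numpy as np

def solve_mme(X, Z, y, K_inv, alpha):
    """Build and solve the mixed model equations.

    K_inv is the inverse relationship matrix: A^-1 for the traditional
    evaluation, H^-1 for the single-step evaluation.
    """
    lhs = np.block([
        [X.T @ X, X.T @ Z],
        [Z.T @ X, Z.T @ Z + alpha * K_inv],
    ])
    rhs = np.concatenate([X.T @ y, Z.T @ y])
    sol = np.linalg.solve(lhs, rhs)
    n_fixed = X.shape[1]
    return sol[:n_fixed], sol[n_fixed:]

# Toy data (hypothetical): 3 unrelated animals, one record each, overall mean only.
X = np.ones((3, 1))
Z = np.eye(3)
y = np.array([10.0, 12.0, 14.0])
A_inv = np.eye(3)   # unrelated, non-inbred animals, so A = I
alpha = 1.0         # assumed variance ratio

b, u = solve_mme(X, Z, y, A_inv, alpha)
print("b =", b, "u =", u)
```

Passing an H inverse instead of `A_inv` gives the single-step solutions without any other change to the system.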

Page 5

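Single-step genomic evaluation

$$
\begin{bmatrix} \mathbf{X}'\mathbf{X} & \mathbf{X}'\mathbf{Z} \\ \mathbf{Z}'\mathbf{X} & \mathbf{Z}'\mathbf{Z} + \alpha\mathbf{H}^{-1} \end{bmatrix}
\begin{bmatrix} \mathbf{b} \\ \mathbf{u} \end{bmatrix}
=
\begin{bmatrix} \mathbf{X}'\mathbf{y} \\ \mathbf{Z}'\mathbf{y} \end{bmatrix}
$$

• Inverses

$$
\mathbf{H}^{-1} = \mathbf{A}^{-1} + \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1} \end{bmatrix}
$$

$\mathbf{A}$ = numerator relationship matrix
$\mathbf{A}_{22}$ = pedigree relationships between genotyped animals
$\mathbf{G}$ = genomic relationship matrix

Aguilar et al., 2010; Christensen & Lund, 2010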
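The H inverse only modifies the block of the A inverse that belongs to genotyped animals. The sketch below, with a hypothetical three-animal pedigree and a made-up G, adds G⁻¹ − A22⁻¹ to that block and then checks the known property that the genotyped block of H equals G. The function name `h_inverse` and all numbers are assumptions for illustration; preGSf90 builds these matrices with its own, more efficient routines.

```python
import numpy as np

def h_inverse(A_inv, A22_inv, G_inv, genotyped):
    """Return H^-1: A^-1 with (G^-1 - A22^-1) added to the genotyped block."""
    H_inv = A_inv.copy()
    block = np.ix_(genotyped, genotyped)
    H_inv[block] += G_inv - A22_inv
    return H_inv

# Toy pedigree (hypothetical): animals 0 and 1 are unrelated parents of animal 2.
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5],
              [0.5, 0.5, 1.0]])
genotyped = [0, 2]                       # positions of genotyped animals
A22 = A[np.ix_(genotyped, genotyped)]
G = np.array([[1.0, 0.6],
              [0.6, 1.0]])               # hypothetical genomic relationships

H_inv = h_inverse(np.linalg.inv(A), np.linalg.inv(A22), np.linalg.inv(G), genotyped)
H = np.linalg.inv(H_inv)
print(np.round(H, 3))
```

If G were equal to A22, the added block would be zero and the H inverse would reduce to the A inverse.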

Page 6

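Extra matrices required for single-step

• Inverses, created by PREGSF90

$$
\mathbf{H}^{-1} = \mathbf{A}^{-1} + \begin{bmatrix} \mathbf{0} & \mathbf{0} \\ \mathbf{0} & \mathbf{G}^{-1} - \mathbf{A}_{22}^{-1} \end{bmatrix}
$$

$\mathbf{A}_{22}$ = pedigree relationships between genotyped animals
$\mathbf{G}$ = genomic relationship matrix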

Page 7

Genomic relationship matrix G

• G = ZZ'/k
– Z = matrix of SNP markers
– Dimension of Z = n × p
– n animals
– p markers

Data file with SNP markers

Genotype codes
0 – Homozygous
1 – Heterozygous
2 – Homozygous
5 – No call (missing)
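A minimal sketch of G = ZZ'/k follows. The slide does not say how Z is centered or how k is chosen, so this example assumes the common VanRaden-style construction: Z is the 0/1/2 genotype matrix minus twice the allele frequency, and k = 2Σp(1−p), with frequencies calculated from the genotypes themselves. The function name and the toy genotypes are hypothetical, and missing calls (code 5) are assumed to have been handled beforehand.

```python
import numpy as np

def genomic_relationship(M):
    """G = ZZ'/k for an n x p matrix M of genotype codes 0/1/2 (no missing).

    Assumed scaling: Z = M - 2p and k = 2 * sum(p * (1 - p)),
    with p estimated from the genotypes in M.
    """
    M = np.asarray(M, dtype=float)
    p = M.mean(axis=0) / 2.0          # allele frequency per SNP
    Z = M - 2.0 * p                   # centered marker matrix, n x p
    k = 2.0 * np.sum(p * (1.0 - p))   # scaling constant
    return Z @ Z.T / k

# Toy genotypes (hypothetical): 3 animals, 3 SNP.
M = np.array([[0, 1, 2],
              [1, 1, 0],
              [2, 1, 1]])
G = genomic_relationship(M)
print(np.round(G, 3))
```

Because each column of Z is centered, every row of G sums to zero in this construction.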

Page 8

How BLUPF90 performs single-step genomic evaluation

• RENUMF90
• PREGSF90 (genomic module)
• BLUPF90 programs (genomic module): blupf90, (ai)remlf90, gibbsxf90, etc.

Page 9

Single-step in the BLUPF90 package: RENUMF90

• Input: data, pedigree, and marker data
• Parameter file: add a keyword to the “animal effect”
– SNP_FILE marker_file_name
• Output files: renf90.dat, renaddx.ped

Page 10

How BLUPF90 performs single-step genomic evaluation

• BLUPF90 programs with the genomic module
– OPTION SNP_file marker.file
• The genomic module
– performs quality control
– creates the extra matrices: genomic relationships and pedigree relationships for genotyped animals

Page 11

PreGSf90

• Interface program to the genomic module; it processes the genomic information for the BLUPF90 family of programs
• Efficient methods for
– creating the genomic relationship matrix and the pedigree-based relationship matrix
– inverting the relationship matrices
• Performs quality control of SNP information

Page 12

Input files for genomic BLUPF90

• Same parameter file as for all BLUPF90 programs
– But with “OPTION SNP_file marker_file_name”
– This option tells the program to run the genomic subroutines
• Pedigree file
• Marker information (SNP file)
• Cross-reference file for renumbered IDs
– Links the genotype file with the codes in the pedigree, etc.
– Generated by renumf90

Page 13

SNP map file

• OPTION chrinfo <file>
• For some genomic analyses (GWAS) or QC
• Format:
– SNP number: index number of the SNP in the map sorted by chromosome and position
– Chromosome number
– Position
– SNP name (optional)
• The first row corresponds to the first SNP column in the genotype file!

Page 14

Parameter files

• RENUMF90: renum.par
• BLUPF90: renf90.par

Page 15

Pedigree file from RENUMF90

• 1 - animal number
• 2 - parent 1 number or UPG
• 3 - parent 2 number or UPG
• 4 - 3 minus number of known parents
• 5 - known or estimated year of birth
• 6 - number of known parents; if the animal is genotyped, 10 + number of known parents
• 7 - number of records
• 8 - number of progenies as parent 1
• 9 - number of progenies as parent 2
• 10 - original animal ID
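Column 6 of this file doubles as a genotyping flag, which is easy to overlook. The sketch below reads one whitespace-separated line in the column order listed above and decodes that column. The function name, the dictionary keys, and the example lines are hypothetical; only the column meanings come from the slide.

```python
def parse_renum_pedigree_line(line):
    """Decode one line of the renumbered pedigree file (10 columns)."""
    fields = line.split()
    code = int(fields[5])              # column 6: known parents, +10 if genotyped
    return {
        "animal": int(fields[0]),
        "parent1": int(fields[1]),
        "parent2": int(fields[2]),
        "genotyped": code >= 10,
        "known_parents": code % 10,
        "original_id": fields[9],
    }

# Hypothetical lines: a genotyped animal with both parents known,
# and a non-genotyped founder.
rec = parse_renum_pedigree_line("3 1 2 1 2010 12 1 0 0 COW003")
founder = parse_renum_pedigree_line("1 0 0 3 2005 0 2 1 0 BULL001")
print(rec["genotyped"], rec["known_parents"], founder["genotyped"])
```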

Page 16

SNP file & cross-reference ID file

SNP file
• First column: identification, may be alphanumeric
• Second column: SNP markers (codes 0, 1, 2, and 5 for missing)

Cross-reference ID file and pedigree file (from RENUMF90)
• Original ID
• Renumbered ID

Page 17

Quality control: excluded by default

• MAF
– SNP with MAF < 0.05
• Call rate
– SNP with call rate < 0.90
– Individuals with call rate < 0.90
• Monomorphic
– Exclude monomorphic SNP, ONLY when MAF <> 0
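The default filters above can be sketched on a small genotype matrix. The thresholds (0.05 and 0.90) come from the slide; the function name, the toy data, and the choice to compute every statistic once on the unfiltered matrix are assumptions, since the slide does not describe the order in which preGSf90 applies the checks.

```python
import numpy as np

MISSING = 5  # "no call" code in the SNP file

def default_qc(M, min_maf=0.05, snp_call_rate=0.90, animal_call_rate=0.90):
    """Return boolean masks (keep_snp, keep_animal) for an n x p code matrix."""
    M = np.asarray(M)
    called = M != MISSING
    snp_cr = called.mean(axis=0)        # call rate per SNP
    animal_cr = called.mean(axis=1)     # call rate per individual
    n_called = called.sum(axis=0)
    allele_count = np.where(called, M, 0).sum(axis=0)
    freq = allele_count / np.maximum(2 * n_called, 1)
    maf = np.minimum(freq, 1.0 - freq)  # monomorphic SNP have MAF = 0
    keep_snp = (snp_cr >= snp_call_rate) & (maf >= min_maf)
    keep_animal = animal_cr >= animal_call_rate
    return keep_snp, keep_animal

# Toy genotypes (hypothetical): 4 animals, 4 SNP.
M = np.array([[0, 1, 2, 0],
              [1, 1, 2, 0],
              [2, 0, 2, 5],
              [1, 2, 2, 0]])
keep_snp, keep_animal = default_qc(M)
print(keep_snp, keep_animal)
```

In this toy matrix the third SNP is monomorphic, the fourth SNP fails the call-rate threshold, and the third animal has one missing call out of four.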

Page 18

Quality control: excluded by default (cont.)

• Parent-progeny conflicts (SNP and individuals)
– Exclusion -> opposite homozygous
– For SNP: > 10% parent-progeny exclusions out of the total number of pairs evaluated
– For individuals: > 1% parent-progeny exclusions out of the total number of SNP
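An exclusion is counted when a parent and its progeny carry opposite homozygous genotypes at the same SNP (codes 0 and 2). The sketch below counts exclusions for one parent-progeny pair and computes the rate that would be compared with the 1% threshold for individuals. The function name and the genotypes are hypothetical, and skipping SNP with a missing call in either animal is an assumption.

```python
import numpy as np

def opposite_homozygous(parent, progeny, missing=5):
    """Flag SNP where parent and progeny are opposite homozygotes (0 vs 2).

    Returns (conflict, compared): boolean arrays over SNP. SNP with a
    missing call in either animal are not compared.
    """
    parent = np.asarray(parent, dtype=int)
    progeny = np.asarray(progeny, dtype=int)
    compared = (parent != missing) & (progeny != missing)
    conflict = compared & (np.abs(parent - progeny) == 2)
    return conflict, compared

# Hypothetical pair: one exclusion among four comparable SNP.
parent = [0, 2, 1, 0, 5]
progeny = [2, 2, 1, 0, 0]
conflict, compared = opposite_homozygous(parent, progeny)
rate = conflict.sum() / compared.sum()
print(rate)
```

The per-SNP check is the same count taken down a column across all evaluated pairs, compared with the 10% threshold.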

Page 19

Control default values

• For MAF
– OPTION minfreq x
• Call rate
– OPTION callrate x
– OPTION callrateAnim x
• Mendelian conflicts
– OPTION exclusion_threshold x
– OPTION exclusion_threshold_snp x

Page 20

Parent-progeny conflicts

• The presence of these conflicts results in a non-positive-definite H matrix!
• Problems in the estimation of variance components by REML, programs do not converge, etc.
• Solution:
– Report all conflicts, with counts for each individual as parent or progeny, to trace the conflicts
– Remove the progeny genotype
• Maybe not the best option
• But results in a positive-definite H matrix!

Page 21

Genomic matrix options

• OPTION whichfreq x
– 0: read from file freqdata or other specified
– 1: 0.5
– 2: current, calculated from genotypes (default)
• OPTION FreqFile file
– Reads allele frequencies from a file
• OPTION maxsnps x
– Sets the maximum length of the string for reading marker data from file => BovineHD chip

Page 22

Saving ‘clean’ files

• SNP excluded by QC are set as missing (i.e. code = 5)
• Excluded individuals are treated as unrelated in G and A22
– For individual i: G[i,:] = 0; G[:,i] = 0; G[i,i] = 1; same for A22, so G - A22 will cancel out
• OPTION saveCleanSNPs
• Saves clean genotype data, with excluded SNP and individuals removed
– For example, for a SNP_file named gt
– Clean files will be:
• gt_clean
• gt_clean_XrefID
– Removed SNP and animals will be output in files:
• gt_SNPs_removed
• gt_Animals_removed
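The treatment of excluded individuals described on this slide can be checked numerically. The sketch below applies the same masking to a made-up G and A22 and shows that the row and column of the excluded animal vanish in G − A22, and also in G⁻¹ − A22⁻¹, which is the term that enters the H inverse. The function name and the matrices are hypothetical.

```python
import numpy as np

def mask_individual(K, i):
    """Treat individual i as unrelated: zero its row and column, diagonal = 1."""
    K = K.copy()
    K[i, :] = 0.0
    K[:, i] = 0.0
    K[i, i] = 1.0
    return K

# Hypothetical relationship matrices for 3 genotyped animals.
G = np.array([[1.0, 0.4, 0.2],
              [0.4, 1.1, 0.3],
              [0.2, 0.3, 0.9]])
A22 = np.array([[1.0, 0.5, 0.25],
                [0.5, 1.0, 0.25],
                [0.25, 0.25, 1.0]])

i = 1  # excluded individual
Gm, Am = mask_individual(G, i), mask_individual(A22, i)
diff = Gm - Am
diff_inv = np.linalg.inv(Gm) - np.linalg.inv(Am)
print(diff[i, :], diff_inv[i, :])
```

With both rows equal to zero, the excluded animal contributes nothing beyond its pedigree information.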

Page 23

No quality control

• ONLY use:
– If QC was performed in a previous run of preGSf90
– and the “clean” genotype file is used
• OPTION no_quality_control

Page 24

PreGSf90 wiki