Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles...
![Page 1: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/1.jpg)
Welcome to Intro to Bioinformatics
![Page 2: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/2.jpg)
Intergalactic Border Patrol
Bioinformatics in Space
Tribbles: Warning! Highly dangerous!
Trogs: Cute and harmless.
![Page 3: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/3.jpg)
![Page 4: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/4.jpg)
Welcome to the Intergalactic Detention Center
Please answer the following questions
1. Like broccoli
2. Floss every brushing
3. Enjoy ballet
4. Always pair socks
5. Liked Moby Dick
6. Eat the Maraschino cherry
(Scale: 1 to 10)
![Page 5: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/5.jpg)
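Responses to questionnaire

| Question | T1 | T2 | T3 | T4 | T5 | T6 | T7 | . . . |
|---|---|---|---|---|---|---|---|---|
| 1. Broccoli | 9.2 | 1.6 | 4.0 | 5.2 | 2.2 | 9.1 | 1.0 | . . . |
| 2. Floss | 2.2 | 1.9 | 1.0 | 4.6 | 7.6 | 9.8 | 1.0 | . . . |
| 3. Ballet | 8.3 | 3.1 | 2.4 | 6.1 | 9.3 | 9.2 | 1.0 | . . . |
| 4. Pair socks | 9.6 | 5.5 | 1.3 | 8.4 | 9.8 | 9.0 | 1.0 | . . . |
| 5. Moby Dick | 4.2 | 2.1 | 1.0 | 4.1 | 5.2 | 4.4 | 1.0 | . . . |
| 6. Maraschino | 6.4 | 8.9 | 7.1 | 3.3 | 1.9 | 2.0 | 1.0 | . . . |
| . . . | | | | | | | | |
| 6817. MacArthur’s Park | 1.2 | 1.5 | 5.1 | 3.4 | 1.1 | 1.7 | 9.9 | . . . |
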
You need a plan
![Page 6: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/6.jpg)
A Plan
• Release all Tribbles / Trogs
• Note outcome for each individual
• Deduce identities
• Integrate identities into results
• Figure out which questions/answers informative
![Page 7: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/7.jpg)
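Responses to questionnaire (individuals now identified as Tribbles or Trogs)

| Question | T1 | T2 | T3 | T4 | T5 | T6 | T7 | . . . |
|---|---|---|---|---|---|---|---|---|
| 1. Broccoli | 9.2 | 1.6 | 4.0 | 5.2 | 2.2 | 9.1 | 1.0 | . . . |
| 2. Floss | 2.2 | 1.9 | 1.0 | 4.6 | 7.6 | 9.8 | 1.0 | . . . |
| 3. Ballet | 8.3 | 3.1 | 2.4 | 6.1 | 9.3 | 9.2 | 1.0 | . . . |
| 4. Pair socks | 9.6 | 5.5 | 1.3 | 8.4 | 9.8 | 9.0 | 1.0 | . . . |
| 5. Moby Dick | 4.2 | 2.1 | 1.0 | 4.1 | 5.2 | 4.4 | 1.0 | . . . |
| 6. Maraschino | 6.4 | 8.9 | 7.1 | 3.3 | 1.9 | 2.0 | 1.0 | . . . |
| . . . | | | | | | | | |
| 6817. MacArthur’s Park | 1.2 | 1.5 | 5.1 | 3.4 | 1.1 | 1.7 | 9.9 | . . . |

(what now?)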
![Page 8: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/8.jpg)
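Responses to questionnaire

| Question | T1 | T2 | T3 | T4 | T5 | T6 | T7 | Mean (Tribbles) | Mean (Trogs) |
|---|---|---|---|---|---|---|---|---|---|
| 1. Broccoli | 9.2 | 1.6 | 4.0 | 5.2 | 2.2 | 9.1 | 1.0 | 6.4 | 2.2 |
| 2. Floss | 2.2 | 1.9 | 1.0 | 4.6 | 7.6 | 9.8 | 1.0 | 6.0 | 1.3 |
| 3. Ballet | 8.3 | 3.1 | 2.4 | 6.1 | 9.3 | 9.2 | 1.0 | 8.2 | 2.2 |
| 4. Pair socks | 9.6 | 5.5 | 1.3 | 8.4 | 9.8 | 9.0 | 1.0 | 9.2 | 2.6 |
| 5. Moby Dick | 4.2 | 2.1 | 1.0 | 4.1 | 5.2 | 4.4 | 1.0 | 4.4 | 1.4 |
| 6. Maraschino | 6.4 | 8.9 | 7.1 | 3.3 | 1.9 | 2.0 | 1.0 | 4.4 | 3.7 |
| . . . | | | | | | | | | |
| 6817. MacArthur’s Park | 1.2 | 1.5 | 5.1 | 3.4 | 1.1 | 1.7 | 9.9 | 1.8 | 5.5 |
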
![Page 9: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/9.jpg)
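Which questions are informative?
Which can be used to predict class?
The responses to which questions are correlated with class?
[Figure: two panels showing response distributions on the 1 to 10 scale, each with Δμ marked]
$$\text{Correlation of question with class} = \frac{\Delta\mu}{\sigma_{\text{Tribble}} + \sigma_{\text{Trog}}}$$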
![Page 10: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/10.jpg)
Which questions are informative?
Which can be used to predict class?
Strategy
$$\text{Correlation} = \frac{\Delta\mu}{\sigma_{\text{Tribble}} + \sigma_{\text{Trog}}}$$
• Calculate correlation for each question
• Look for questions with largest correlations with class
Implementation
$$\mu = \frac{\sum s}{N}$$
[Figure: responses on the 1 to 10 scale]
![Page 11: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/11.jpg)
Which questions are informative?
Which can be used to predict class?
Strategy
$$\text{Correlation} = \frac{\Delta\mu}{\sigma_{\text{Tribble}} + \sigma_{\text{Trog}}}$$
• Calculate correlation for each question
• Look for questions with largest correlations with class
Implementation
$$\sigma^2 = \frac{\sum (s - \mu)^2}{N - 1}, \qquad \sigma = \sqrt{\sigma^2}$$
[Figure: responses on the 1 to 10 scale]
![Page 12: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/12.jpg)
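Which questions are informative?
Which can be used to predict class?
Strategy
$$\text{Correlation} = \frac{\Delta\mu}{\sigma_{\text{Tribble}} + \sigma_{\text{Trog}}}$$
• Calculate correlation for each question
• Look for questions with largest correlations with class
Implementation
$$\text{Correlation} = \frac{\dfrac{\sum s_{\text{Tribble}}}{N_{\text{Tribble}}} - \dfrac{\sum s_{\text{Trog}}}{N_{\text{Trog}}}}{\sqrt{\dfrac{\sum (s_{\text{Tribble}} - \mu_{\text{Tribble}})^2}{N_{\text{Tribble}} - 1}} + \sqrt{\dfrac{\sum (s_{\text{Trog}} - \mu_{\text{Trog}})^2}{N_{\text{Trog}} - 1}}}$$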
![Page 13: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/13.jpg)
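Which questions are informative?
Which can be used to predict class?
$$\text{Correlation} = \frac{\Delta\mu}{\sigma_{\text{Tribble}} + \sigma_{\text{Trog}}} = \frac{\dfrac{\sum s_{\text{Tribble}}}{N_{\text{Tribble}}} - \dfrac{\sum s_{\text{Trog}}}{N_{\text{Trog}}}}{\sqrt{\dfrac{\sum (s_{\text{Tribble}} - \mu_{\text{Tribble}})^2}{N_{\text{Tribble}} - 1}} + \sqrt{\dfrac{\sum (s_{\text{Trog}} - \mu_{\text{Trog}})^2}{N_{\text{Trog}} - 1}}}$$
Implementation
Read_Responses_To_Question();
$numerator = Mean(@tribble_scores) - Mean(@trog_scores);       # Δμ
$denominator = StDev(@tribble_scores) + StDev(@trog_scores);   # σ + σ
$correlation = $numerator / $denominator;
push @question_info, [$question_number, $correlation];         # Save for ranking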
![Page 14: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/14.jpg)
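Which questions are informative?
Which can be used to predict class?
$$\text{Correlation} = \frac{\Delta\mu}{\sigma_{\text{Tribble}} + \sigma_{\text{Trog}}} = \frac{\dfrac{\sum s_{\text{Tribble}}}{N_{\text{Tribble}}} - \dfrac{\sum s_{\text{Trog}}}{N_{\text{Trog}}}}{\sqrt{\dfrac{\sum (s_{\text{Tribble}} - \mu_{\text{Tribble}})^2}{N_{\text{Tribble}} - 1}} + \sqrt{\dfrac{\sum (s_{\text{Trog}} - \mu_{\text{Trog}})^2}{N_{\text{Trog}} - 1}}}$$
Implementation
while (<INPUT>) {                                                  # One pass per question
    Read_Responses_To_Question();
    $numerator = Mean(@tribble_scores) - Mean(@trog_scores);       # Δμ
    $denominator = StDev(@tribble_scores) + StDev(@trog_scores);   # σ + σ
    $correlation = $numerator / $denominator;
    push @question_info, [$question_number, $correlation];         # Save for ranking
}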
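Once every question has a [question number, correlation] pair in @question_info, looking for the questions with the largest correlations comes down to a sort. The following is a minimal sketch, assuming @question_info holds array references laid out as in the push above. The first three sample pairs are taken from the Results slide; question 42 and its value are made up for illustration.

```perl
use strict;
use warnings;

# Sample entries in the [question_number, correlation] layout.
# The first three pairs are from the Results slide; question 42 is hypothetical.
my @question_info = ([281, 1.72], [42, 0.10], [3497, 1.76], [1114, 1.71]);

# Sort by correlation, largest first.
my @ranked = sort { $b->[1] <=> $a->[1] } @question_info;

# Print the three best-scoring questions.
foreach my $info (@ranked[0 .. 2]) {
    my ($question_number, $correlation) = @$info;
    printf "%-8s %.2f\n", $question_number, $correlation;
}
```

If strongly negative correlations should also count as informative, sorting on abs($b->[1]) <=> abs($a->[1]) would rank them too; the slides do not say which is intended.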
![Page 15: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/15.jpg)
Which questions are informative?
Which can be used to predict class?
Implementation
sub Mean {
    my @scores = @_;      # Grab Tribble or Trog scores
    my $s_sum = 0;        # Start Σ at 0
    my $N = 0;            # Need to count N
    foreach my $score (@scores) {
        $s_sum = $s_sum + $score;
        $N = $N + 1;
    }
    return $s_sum / $N;   # mean = (Σ s)/ N
}
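The implementation also calls StDev, which the slides never show. Here is a minimal sketch of what it might look like, following the sample formula σ² = Σ (s - μ)² / (N-1) from the earlier slide. The body of StDev, the guard against fewer than two scores, and the sample scores are all assumptions, not the course's own code; Mean is repeated in compact form so the block runs by itself.

```perl
use strict;
use warnings;

sub Mean {
    my @scores = @_;
    my $s_sum = 0;
    $s_sum += $_ foreach @scores;
    return $s_sum / scalar(@scores);     # mean = (sum of s) / N
}

sub StDev {
    my @scores = @_;                     # Grab Tribble or Trog scores
    my $N = scalar @scores;
    die "StDev needs at least two scores\n" if $N < 2;
    my $mu = Mean(@scores);
    my $sq_sum = 0;                      # Start sum of (s - mu)^2 at 0
    foreach my $score (@scores) {
        $sq_sum += ($score - $mu) ** 2;
    }
    return sqrt($sq_sum / ($N - 1));     # sigma = sqrt(sum of (s - mu)^2 / (N-1))
}

# Hypothetical scores, for illustration only.
my @scores = (2, 4, 4, 4, 5, 5, 7, 9);
printf "mean = %.3f, stdev = %.3f\n", Mean(@scores), StDev(@scores);
```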
![Page 16: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/16.jpg)
Which questions are informative?
Which can be used to predict class?
Results

| Question | Correlation |
|---|---|
| 3497 | 1.76 |
| 281 | 1.72 |
| 1114 | 1.71 |
| … | … |

Are these questions good predictors of class?
Suppose there are NO good predictors of class…
![Page 17: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/17.jpg)
(Interlude)
NEWS!
Precinct in Harrisonburg has voted for the winning senatorial candidate every time for the past ten elections!
Probability if by chance:
$$\left(\tfrac{1}{2}\right) \cdot \left(\tfrac{1}{2}\right) \cdot \left(\tfrac{1}{2}\right) \cdots = \left(\tfrac{1}{2}\right)^{10} = \frac{1}{1024} \approx \frac{1}{1000}$$
Suppose there are 1000 precincts in Virginia…
(BLAST from the past) E = (probability) · (number of combinations)
Beware the fallacy of the unlikely result!
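Working the precinct example through the E formula makes the point concrete. This sketch takes the slide's 1/2 chance per election and its 1000 precincts at face value:

$$E = \left(\tfrac{1}{2}\right)^{10} \times 1000 = \frac{1000}{1024} \approx 0.98$$

So about one precinct in the state would be expected to show a perfect ten-election record by chance alone. If the precincts are additionally assumed to be independent (an assumption the slide does not state), the chance that at least one of them does so is

$$1 - \left(1 - \tfrac{1}{1024}\right)^{1000} \approx 0.62$$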
![Page 18: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/18.jpg)
Which questions are informative?
Which can be used to predict class?
Results

| Question | Correlation |
|---|---|
| 3497 | 1.76 |
| 281 | 1.72 |
| 1114 | 1.71 |
| … | … |

Are these questions good predictors of class?
Suppose there are NO good predictors of class…
… what would be the expected correlation?
![Page 19: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/19.jpg)
Which questions are informative?
How to test class predictors?
Choice #1
Rerun time with the different (?) reality that Tribbles are no different from Trogs
Choice #2
Use random data
![Page 20: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/20.jpg)
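Random responses to questionnaire

| Question | T1 | T2 | T3 | T4 | T5 | T6 | T7 | . . . |
|---|---|---|---|---|---|---|---|---|
| 1. Broccoli | 9.2 | -1600 | 33 1/3 | 99 | 3.14159 | -0 | 1.0 | . . . |
| 2. Floss | | | | | | | | |
| 3. Ballet | | | | | | | | |
| 4. Pair socks | | | | | | | | |
| 5. Moby Dick | | | | | | | | |
| 6. Maraschino | | | | | | | | |
| . . . | | | | | | | | |
| 6817. MacArthur’s Park | | | | | | | | |
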
Random doesn’t mean crazy
![Page 21: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/21.jpg)
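Random responses to questionnaire

| Question | T1 | T2 | T3 | T4 | T5 | T6 | T7 | . . . |
|---|---|---|---|---|---|---|---|---|
| 1. Broccoli | 9.2 | 1.6 | 4.0 | 5.2 | 2.2 | 9.1 | 1.0 | . . . |
| 2. Floss | 2.2 | 1.9 | 1.0 | 4.6 | 7.6 | 9.8 | 1.0 | . . . |
| 3. Ballet | 8.3 | 3.1 | 2.4 | 6.1 | 9.3 | 9.2 | 1.0 | . . . |
| 4. Pair socks | 9.6 | 5.5 | 1.3 | 8.4 | 9.8 | 9.0 | 1.0 | . . . |
| 5. Moby Dick | 4.2 | 2.1 | 1.0 | 4.1 | 5.2 | 4.4 | 1.0 | . . . |
| 6. Maraschino | 6.4 | 8.9 | 7.1 | 3.3 | 1.9 | 2.0 | 1.0 | . . . |
| . . . | | | | | | | | |
| 6817. MacArthur’s Park | 1.2 | 1.5 | 5.1 | 3.4 | 1.1 | 1.7 | 9.9 | . . . |
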
Maybe but…
![Page 22: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/22.jpg)
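Random responses to questionnaire

| Question | T1 | T2 | T3 | T4 | T5 | T6 | T7 | . . . |
|---|---|---|---|---|---|---|---|---|
| 1. Broccoli | 9.2 | 1.6 | 4.0 | 5.2 | 2.2 | 9.1 | 1.0 | . . . |
| 2. Floss | 2.2 | 1.9 | 1.0 | 4.6 | 7.6 | 9.8 | 1.0 | . . . |
| 3. Ballet | 8.3 | 3.1 | 2.4 | 6.1 | 9.3 | 9.2 | 1.0 | . . . |
| 4. Pair socks | 9.6 | 5.5 | 1.3 | 8.4 | 9.8 | 9.0 | 1.0 | . . . |
| 5. Moby Dick | 4.2 | 2.1 | 1.0 | 4.1 | 5.2 | 4.4 | 1.0 | . . . |
| 6. Maraschino | 6.4 | 8.9 | 7.1 | 3.3 | 1.9 | 2.0 | 1.0 | . . . |
| . . . | | | | | | | | |
| 6817. MacArthur’s Park | 1.2 | 1.5 | 5.1 | 3.4 | 1.1 | 1.7 | 9.9 | . . . |
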
Keep the data, shuffle the players
![Page 23: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/23.jpg)
Which questions are informative?
How to test class predictors?
Choice #1
Rerun time with the different (?) reality that Tribbles are no different from Trogs
Choice #2
Use random data
Choice #3
Shuffle data
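Choice #3 can be sketched in a few lines: keep every score exactly as recorded, reassign the Tribble/Trog labels at random, and recompute the correlation to see how often chance alone does as well as the real labels. This is a minimal sketch, not the course's own code. The scores, the labels, the helper names Correlation and Split_By_Label, and the choice of 1000 trials are all assumptions made for illustration.

```perl
use strict;
use warnings;
use List::Util qw(shuffle sum);

sub Mean  { return sum(@_) / scalar(@_); }
sub StDev {
    my $mu = Mean(@_);
    return sqrt(sum(map { ($_ - $mu) ** 2 } @_) / (scalar(@_) - 1));
}

# Correlation of one question with class: delta mu / (sigma + sigma)
sub Correlation {
    my ($tribble_scores, $trog_scores) = @_;    # Two array references
    return (Mean(@$tribble_scores) - Mean(@$trog_scores))
         / (StDev(@$tribble_scores) + StDev(@$trog_scores));
}

# Split scores into the two classes according to a parallel list of labels.
sub Split_By_Label {
    my ($scores, $labels) = @_;
    my (@tribble, @trog);
    foreach my $i (0 .. $#$scores) {
        if ($labels->[$i] eq 'tribble') { push @tribble, $scores->[$i]; }
        else                            { push @trog,    $scores->[$i]; }
    }
    return (\@tribble, \@trog);
}

# Hypothetical responses to one question, with hypothetical class labels.
my @scores = (9.2, 8.8, 9.5, 8.1, 2.2, 1.6, 3.0, 2.4);
my @labels = qw(tribble tribble tribble tribble trog trog trog trog);

my $actual = Correlation(Split_By_Label(\@scores, \@labels));

# Keep the data, shuffle the players: same scores, labels reassigned at random.
my $trials  = 1000;
my $as_good = 0;
foreach my $trial (1 .. $trials) {
    my @shuffled_labels = shuffle(@labels);
    my $shuffled = Correlation(Split_By_Label(\@scores, \@shuffled_labels));
    $as_good++ if $shuffled >= $actual;
}

printf "actual correlation = %.2f\n", $actual;
printf "%d of %d shuffles did at least as well\n", $as_good, $trials;
```

Because the labels are only permuted, each shuffled run has the same number of Tribbles and Trogs and the same set of scores as the real data, which is what separates this from Choice #2.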
![Page 24: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/24.jpg)
Which questions are informative?
How to test class predictors?
[Figure: number of questions with better correlations (axis ticks at 0, 10, 100, 1000, 10000) plotted against correlation (2.0 down to -0.5). Curve shown: 5% of shuffled responses.]
![Page 25: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/25.jpg)
Which questions are informative?
How to test class predictors?
[Figure: number of questions with better correlations (axis ticks at 0, 10, 100, 1000, 10000) plotted against correlation (2.0 down to -0.5). Curves shown: 1% of shuffled responses; actual responses.]
![Page 26: Welcome to Intro to Bioinformatics. Intergalactic Border Patrol Bioinformatics in Space Tribbles Warning! Highly dangerous! Trogs Cute and harmless.](https://reader030.fdocuments.net/reader030/viewer/2022032804/56649e545503460f94b4b5ab/html5/thumbnails/26.jpg)
Which questions are informative?
How to test class predictors?
[Figure: number of questions with better correlations (axis ticks at 0, 10, 100, 1000, 10000) plotted against correlation (2.0 down to -0.5). Curves shown: 1% of shuffled responses; actual responses. Annotations: "If class predictors don’t work" and "If class predictors are valid".]
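The quantity on the vertical axis of these plots, the number of questions with better correlations, is a count above a cutoff. It can be computed once for the actual responses and once for shuffled ones. The following is a minimal sketch, assuming the correlations have already been collected into plain arrays. The helper name Count_At_Least and every number in the two arrays are made up for illustration, apart from the three top values taken from the Results slide.

```perl
use strict;
use warnings;

# Count how many correlations are at least as large as a cutoff.
sub Count_At_Least {
    my ($cutoff, @correlations) = @_;
    return scalar grep { $_ >= $cutoff } @correlations;
}

# Hypothetical correlations (only 1.76, 1.72, 1.71 come from the Results slide).
my @actual   = (1.76, 1.72, 1.71, 0.90, 0.40, 0.10, -0.20);
my @shuffled = (0.80, 0.55, 0.30, 0.10, 0.00, -0.10, -0.30);

# Same cutoffs as the horizontal axis of the plot: 2.0 down to -0.5.
foreach my $cutoff (2.0, 1.5, 1.0, 0.5, 0.0, -0.5) {
    printf "cutoff %4.1f: actual %d, shuffled %d\n",
        $cutoff,
        Count_At_Least($cutoff, @actual),
        Count_At_Least($cutoff, @shuffled);
}
```

If the actual responses keep producing questions above cutoffs that shuffled responses almost never reach, that supports the predictors being valid; if the two counts track each other, it does not.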