Introduction to Computational Thinking Vicky Chen.

Fundamental Theorem of Informatics

Friedman CP. J Am Med Inform Assoc. 2009;16:169-170.

What Informatics Is Not

Friedman CP. J Am Med Inform Assoc. 2009;16:169-170.

Computational Thinking

Computational thinking is a way of solving problems, designing systems, and understanding human behavior that draws on concepts fundamental to computer science. To flourish in today's world, computational thinking has to be a fundamental part of the way people think and understand the world.

http://www.cs.cmu.edu/~CompThink/

Computational Thinking

• Analyzing and logically organizing data
• Data modeling, data abstractions, and simulations
• Formulating problems so computers may assist
• Identifying, testing, and implementing possible solutions
• Automating solutions via algorithmic thinking
• Generalizing and applying this process to other problems

Algorithm

• A finite list of instructions that describes all the steps required to perform a computation, written in general language rather than in a specific programming language
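
To make the definition concrete, here is a small illustration that is not taken from the slides: an algorithm written first in plain language (as comments) and then in Python. Python is an assumption here, and the function name `largest` and the sample numbers are hypothetical.

```python
# Plain-language algorithm (illustrative):
#   1. Take the first number as the largest seen so far.
#   2. Compare each remaining number with it.
#   3. If a number is larger, it becomes the largest seen so far.
#   4. After the last number, report the largest seen so far.

def largest(numbers):
    """Return the largest value in a non-empty list."""
    biggest = numbers[0]
    for value in numbers[1:]:
        if value > biggest:
            biggest = value
    return biggest

mutation_counts = [3, 12, 7, 5]  # hypothetical mutation counts per sample
print(largest(mutation_counts))
```

The list of steps is finite and complete, which is what lets it be translated line by line into code.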

Programming Steps

• Specification
  – What the code should do
• Design
  – Pseudocode
• Implement
  – Programming
• Test
  – Debugging
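
The four steps can be sketched on a toy task. This is only an illustration, assuming Python; the task, the function name `count_mutated`, and the 0/1 flag encoding are hypothetical and not from the slides.

```python
# Specification: given a list of 0/1 mutation flags for one sample,
# report how many genes are mutated.

# Design (pseudocode):
#   set count to 0
#   for each flag in the list: if the flag is 1, add 1 to count
#   return count

# Implement
def count_mutated(flags):
    count = 0
    for flag in flags:
        if flag == 1:
            count += 1
    return count

# Test: check small cases whose answers are known in advance
assert count_mutated([0, 1, 1, 0]) == 2
assert count_mutated([]) == 0
print("all tests passed")
```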

Data Type / Data Structure

• Integer
• Floating point
• Boolean
• Character
• String

• List
• Dictionary
• Hash Table

Data Types
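
A minimal sketch of the basic data types listed earlier, assuming Python (the slide content itself was not preserved in this transcript). The variable names and the rate value are hypothetical; note that Python has no separate character type and uses a one-character string instead.

```python
gene_count = 2337        # integer
mutation_rate = 0.05     # floating point (hypothetical value)
is_mutated = True        # Boolean
base = "A"               # character: a string of length 1 in Python
gene_symbol = "TP53"     # string

# Print each value together with the name of its type.
for value in (gene_count, mutation_rate, is_mutated, base, gene_symbol):
    print(type(value).__name__, value)
```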

List
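
A short sketch of common list operations, assuming Python lists; the gene symbols are only example values.

```python
genes = ["TP53", "KRAS", "BRCA1"]   # an ordered collection of values

print(genes[0])        # first element; indexing starts at 0
genes.append("EGFR")   # add an element to the end
genes[1] = "PIK3CA"    # replace the second element
print(len(genes))      # number of elements
print(genes)
```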

Dictionary / Hash Table
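
A brief sketch of key-value storage, assuming Python, where the built-in `dict` is implemented as a hash table. The dictionary name `chromosome_of` is hypothetical.

```python
# Map each gene symbol (key) to its chromosome (value).
chromosome_of = {"TP53": "17", "KRAS": "12"}

chromosome_of["EGFR"] = "7"                    # add a new key-value pair
print(chromosome_of["TP53"])                   # look up a value by its key
print("BRCA1" in chromosome_of)                # test whether a key is present
print(chromosome_of.get("BRCA1", "unknown"))   # look up with a default value
```

Unlike a list, a dictionary is searched by key rather than by position, so a lookup does not require scanning every entry.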

Exercise 1

We have a matrix with mutation information for different tumor samples.

How can this data be represented?

List of Lists

• Data is a sparse matrix
• Stores a lot of extra uninformative information
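
One possible list-of-lists layout, sketched in Python under the assumption that rows are samples, columns are genes, and 1 marks a mutation. The sample names and values are hypothetical, not the course data.

```python
genes = ["TP53", "KRAS", "EGFR", "BRCA1"]        # column labels
samples = ["sample_1", "sample_2", "sample_3"]   # row labels

# One inner list per sample; 1 = mutated, 0 = not mutated.
matrix = [
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]

# Look up one cell by finding its row and column positions.
row_index = samples.index("sample_3")
col_index = genes.index("KRAS")
print(matrix[row_index][col_index])

# Count how much of the matrix is uninformative zeros.
total_cells = len(matrix) * len(genes)
zero_cells = sum(sample_row.count(0) for sample_row in matrix)
print(zero_cells, "of", total_cells, "cells are zeros")
```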

Dictionary
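
A sketch of a dictionary representation, assuming Python and a layout in which each sample maps to the list of its mutated genes. The layout and the sample data are assumptions; the slide content was not preserved.

```python
# Only mutations are stored; genes without a mutation are simply absent.
mutations = {
    "sample_1": ["TP53"],
    "sample_3": ["KRAS"],
}

print(mutations["sample_1"])                     # genes mutated in sample_1
print("KRAS" in mutations.get("sample_3", []))   # is KRAS mutated in sample_3?
print(mutations.get("sample_2", []))             # a sample with no mutations
```

Compared with a list of lists, this stores two entries instead of a full grid of mostly zeros.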

Opening Files

• Mutation matrix contains data on 2337 genes and 779 samples

• Inputting data by hand is not feasible
• Data is usually read in and processed from files

Opening Files
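
A minimal sketch of opening and reading a file, assuming Python. The file name `mutation_matrix.txt` and its tab-delimited layout are hypothetical; the sketch first writes a tiny stand-in file so that it can run on its own.

```python
import os
import tempfile

# Create a small stand-in file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "mutation_matrix.txt")
with open(path, "w") as handle:
    handle.write("sample\tTP53\tKRAS\n")
    handle.write("sample_1\t1\t0\n")
    handle.write("sample_2\t0\t1\n")

# Open the file for reading; 'with' closes it automatically afterwards.
with open(path) as handle:
    lines = handle.read().splitlines()

print(len(lines))   # number of lines in the file
print(lines[0])     # the header line
```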

Input and print
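
A small sketch of reading a value from the user and printing a result, assuming Python's built-in `input` and `print`. The prompt text is hypothetical, and the `input` call is commented out so that the sketch runs unattended.

```python
# gene = input("Gene of interest: ")   # waits for the user to type a value
gene = "TP53"                           # fixed stand-in for the typed value

message = f"Looking up samples with a mutation in {gene}"
print(message)
print("Gene symbol length:", len(gene))   # print accepts several values
```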

For Loops
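
A short sketch of a for loop, assuming Python: the loop body runs once for each item in a collection. The sample names are hypothetical.

```python
samples = ["sample_1", "sample_2", "sample_3"]

# Visit every element of a list in order.
for sample in samples:
    print(sample)

# Loop over a range of numbers: 1, 2, 3, 4 (the end value is excluded).
total = 0
for number in range(1, 5):
    total += number
print(total)
```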

While Loops
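
A short sketch of a while loop, assuming Python: the body repeats for as long as the condition stays true, so the number of repetitions does not need to be known in advance. The variable names are hypothetical.

```python
value = 1
doublings = 0

# Keep doubling until the value reaches at least 100.
while value < 100:
    value *= 2
    doublings += 1

print(doublings, value)
```

The condition must eventually become false; otherwise the loop never ends.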

Conditional Statements

Conditional Statements

• If, else if, else
• and
• or
• not
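
A sketch combining these keywords, assuming Python, which spells "else if" as `elif`. The function `describe`, the threshold of 10, and the labels are hypothetical.

```python
def describe(mutation_count, is_tumor):
    """Return a label; conditions are checked from top to bottom."""
    if mutation_count == 0:
        return "no mutations"
    elif mutation_count > 10 and is_tumor:       # both must be true
        return "heavily mutated tumor"
    elif mutation_count > 10 or not is_tumor:    # at least one must be true
        return "needs review"
    else:                                        # none of the above matched
        return "lightly mutated tumor"

print(describe(0, True))
print(describe(20, True))
print(describe(3, False))
print(describe(5, True))
```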

Exercise 2

We have a dictionary that contains tumor sample mutation information.

We want to ask the user for a mutated gene of interest and then print the tumor samples that carry a mutation in that gene.
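
One possible approach to this exercise, sketched in Python. It assumes the dictionary maps each sample to a list of its mutated genes; that layout, the function name `samples_with_mutation`, and the data are hypothetical, and other solutions are equally valid.

```python
mutations = {
    "sample_1": ["TP53", "KRAS"],
    "sample_2": ["EGFR"],
    "sample_3": ["TP53"],
}

def samples_with_mutation(mutations, gene):
    """Return the samples whose mutated-gene list contains the gene."""
    matches = []
    for sample, genes in mutations.items():
        if gene in genes:
            matches.append(sample)
    return matches

# gene = input("Gene of interest: ")   # interactive version
gene = "TP53"                           # fixed stand-in so the sketch runs unattended
for sample in samples_with_mutation(mutations, gene):
    print(sample)
```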

Opening Files Revisited

Opening Files Revisited
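
With loops and conditionals available, a file can be processed line by line into a dictionary. The sketch below assumes Python and a tab-delimited matrix with a header row of gene names; the layout and data are hypothetical. `io.StringIO` stands in for a real open file so that the sketch runs on its own.

```python
import io

# Stand-in for: handle = open("mutation_matrix.txt")  (hypothetical file name)
handle = io.StringIO(
    "sample\tTP53\tKRAS\tEGFR\n"
    "sample_1\t1\t0\t0\n"
    "sample_2\t0\t0\t0\n"
    "sample_3\t1\t1\t0\n"
)

# The first line holds the column labels; gene names start at column 2.
header = handle.readline().rstrip("\n").split("\t")
genes = header[1:]

mutations = {}
for line in handle:                      # one line per sample
    fields = line.rstrip("\n").split("\t")
    sample = fields[0]
    flags = fields[1:]
    mutated = []
    for gene, flag in zip(genes, flags):
        if flag == "1":                  # values read from a file are strings
            mutated.append(gene)
    if mutated:                          # skip samples with no mutations
        mutations[sample] = mutated

print(mutations)
```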

Data Extraction from Files

• Many files will contain extra information
• Focus on extracting only pertinent data
• Applicable to many types of data
  – Natural language documents (e.g. articles)
  – Sequence data (e.g. FASTA files)
  – Files from databases (e.g. NCBI Gene, TCGA)
  – Etc.
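
As an illustration of keeping only the pertinent data, the sketch below reads FASTA-formatted text and keeps each record's identifier and sequence while discarding the rest of the header line. It assumes Python; the records are made up, and `io.StringIO` stands in for an open file.

```python
import io

fasta = io.StringIO(
    ">seq1 hypothetical example record\n"
    "ATGCGT\n"
    "TTAG\n"
    ">seq2 another hypothetical record\n"
    "GGCA\n"
)

sequences = {}
current_id = None
for line in fasta:
    line = line.strip()
    if not line:
        continue                              # ignore blank lines
    if line.startswith(">"):                  # header line starts a new record
        current_id = line[1:].split()[0]      # keep only the identifier
        sequences[current_id] = ""
    elif current_id is not None:
        sequences[current_id] += line         # join sequence lines together

print(sequences)
```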

Regular Expressions
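
A minimal sketch of pattern matching with Python's standard `re` module. The sentence is made up, and the pattern assumes identifiers of the form `ENSG` followed by exactly 11 digits.

```python
import re

text = "Mutations were reported in ENSG00000141510 and ENSG00000133703."

# 'ENSG' followed by exactly 11 digits.
pattern = re.compile(r"ENSG\d{11}")

print(pattern.findall(text))      # every match, as a list of strings

first = pattern.search(text)      # the first match only, or None
if first:
    print(first.group(0), "starts at position", first.start())
```

A regular expression describes the shape of the text to find, so one pattern can match identifiers that were not known in advance.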

Reusing Code

• Some code can be useful in multiple situations
• It is possible to just rewrite (or copy) the code each time
  – Less efficient
  – Multiple locations to fix when debugging

Functions
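
A sketch of wrapping reusable code in a function, assuming Python. The function name `count_samples_with` and the data are hypothetical.

```python
def count_samples_with(mutations, gene):
    """Count the samples whose mutated-gene list contains the gene."""
    count = 0
    for genes in mutations.values():
        if gene in genes:
            count += 1
    return count

mutations = {
    "sample_1": ["TP53", "KRAS"],
    "sample_2": ["EGFR"],
    "sample_3": ["TP53"],
}

# The same code is reused with different arguments instead of being copied.
print(count_samples_with(mutations, "TP53"))
print(count_samples_with(mutations, "KRAS"))
```

If the counting logic has a bug, it is fixed in one place and every call benefits.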

Exercise 3

We have a document containing human gene information downloaded from NCBI.

We want to extract and store the Ensembl ID of each gene with its corresponding gene symbol.
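
One possible approach to this exercise, sketched in Python. It assumes a tab-delimited file in which the gene symbol is the third column and cross-references appear as text such as `Ensembl:ENSG...`; the actual downloaded file may be laid out differently. The sample lines are for illustration only (the `FAKE1` entry is invented), and `io.StringIO` stands in for the open file.

```python
import io
import re

gene_info = io.StringIO(
    "#tax_id\tGeneID\tSymbol\tLocusTag\tSynonyms\tdbXrefs\n"
    "9606\t1\tA1BG\t-\tA1B|ABG\tMIM:138670|HGNC:HGNC:5|Ensembl:ENSG00000121410\n"
    "9606\t2\tA2M\t-\tA2MD\tMIM:103950|HGNC:HGNC:7|Ensembl:ENSG00000175899\n"
    "9606\t999999\tFAKE1\t-\t-\t-\n"
)

# Capture the identifier that follows the 'Ensembl:' prefix.
ensembl_pattern = re.compile(r"Ensembl:(ENSG\d+)")

ensembl_to_symbol = {}
for line in gene_info:
    if line.startswith("#"):             # skip the header line
        continue
    fields = line.rstrip("\n").split("\t")
    symbol = fields[2]                   # assumed position of the gene symbol
    match = ensembl_pattern.search(line)
    if match:                            # some genes have no Ensembl ID
        ensembl_to_symbol[match.group(1)] = symbol

print(ensembl_to_symbol)
```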