Session ii g3 lab behavior science mmc

Information Management Systems in Behavior Science
Theme: Statistics, Epidemiology Modeling and Diabetes
Lab #2
Etienne Z. Gnimpieba, BRIN WS 2013
Mount Marty College, June 24th, 2013




Context

0. Specification & Aims


Statement of problem / Case study: Interdisciplinary and transversal research invites each scientific field to evolve toward integration with other fields. The use of bioinformatics and computational biology in behavior science remains difficult to describe. We propose here to use information management tools (data collection, mining, loading, statistical analysis) and systems biology modeling (epidemiology modeling) as the key points for that translational interaction. Epidemiology is the study of the distribution and determinants of health-related states or events (including disease), and the application of this study to the control of diseases and other health problems. Various methods can be used to carry out epidemiological investigations: surveillance and descriptive studies can be used to study distribution; analytical studies are used to study determinants.

Bioinformatics and Information Management in Behavior Science

Aim: The aim of this lab is to build a broader understanding of behavioral science data analysis and mining using statistical tools. Because behavior science is part of the life sciences, we propose here to use bioinformatics and systems biology tools in epidemiology modeling to predict disease spread.

Acquired skills
Online Server Tools:
- Survey design (from hypothesis to questionnaire)
- Google Apps (design forms, updating questions)
- Data analysis, data learning, data mining
- Using NetLogo (modeling approach)


Resolution Process

T1. Creating a Google Survey
Objective: Learn how to make a Google survey
T1.1. Setting up the Survey
T1.2. Creating the Questions
T1.3. Collecting Data and Visualizing the Summary

T2. Descriptive and Inference Statistics in Excel
Objective: Extract, load, and treat (ELT) a data set for statistical analysis in Excel
T2.1. Import/edit/export from different formats (text, tab, xml, …)
T2.2. Descriptive analysis in Excel (max, min, typos, count, stdev, average, sum) and data visualization in Excel (histogram, scatter plot, …)
T2.3. Case Study: Social Readjustment Rating Scale (SRRS) question on diabetes

T3. Inference Statistics in Tanagra (data mining)
Objective: Learn and extract knowledge from data using data mining tools (association rules, clustering, neural networks) in Tanagra
T3.1. Data Mining using Association Rules on a Dataset
T3.2. K-Means Clustering Method for Data Learning

T4. Epidemiology Modeling
Objective: How to use a modeling approach to analyze epidemiology problems
T4.1. Using NetLogo
T4.2. Running the epiDEM simulation


Information Management in Behavior Science

T1. Creating a Google Survey

Objective: Learn how to make a Google Survey

T1.1. Setting up the Survey
On the Google website: http://www.google.com
o Click on the “Drive” tab
o Log in or create an account
o Click on the red “Create” button and select “Form”
o In “Title”, name your survey. You can also select your desired theme. For our example we will name our survey “Diabetes”
o Click “Ok” to begin creating your survey
o A screen will open where you can complete a number of functions, such as:
    o Providing a description
    o Titling your questions
    o Providing help text if needed
    o Choosing your question type
    o Adding more questions
o When you are finished with your survey questions click “Done”
o You can now choose the options to:
    o Show a link to submit another response
    o Publish and show the link to the results of the form to all the responders
    o Allow responders to edit responses after submitting
o You can now share the link to your survey through Google+, Facebook, and Twitter, or send the form via e-mail
o When finished selecting recipients click “Done”


T1.2. Creating the Questions

T1.3. Collecting Data and Visualizing the Summary

Statistics in Behavior Science


T2. Descriptive and Inference Statistics in Excel

Objective: Extract, load, and treat (ELT) a data set for statistical analysis in Excel

T2.1. Import/Edit/Export from different Formats (text, tab, xml…)


o Open Excel and go to the “Data” tab. Click on “From Text” and select the file “data_lab8” from the student folder.
o Click “Next”. Under “Delimiters” select only “Comma”. Click “Finish” and “OK”.
o Click on the letter of the column you want to sort by and click on “Sort and Filter” on the top left of the “Data” tab.
o Select “Sort A to Z”. Make sure “Expand the selection” is selected and click “Sort”.
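For readers who prefer a script to the Excel dialogs, the import-and-sort steps above can be sketched in Python using only the standard library. The column names and rows below are made up for illustration, because the layout of “data_lab8” is not shown in this lab, and the text is read from an in-memory string rather than from the real file.

```python
import csv
import io

# Hypothetical comma-delimited content standing in for "data_lab8";
# the real file's columns are not given in the lab, so these are assumptions.
SAMPLE_TEXT = """city,cars_travel,accidents
Yankton,120,4
Aberdeen,95,2
Vermillion,150,7
"""


def load_rows(text):
    """Parse comma-delimited text into a list of dicts, one per data row."""
    reader = csv.DictReader(io.StringIO(text), delimiter=",")
    return list(reader)


def sort_rows(rows, column):
    """Sort whole rows by one column, like 'Sort A to Z' with the selection expanded."""
    return sorted(rows, key=lambda row: row[column])


rows = load_rows(SAMPLE_TEXT)
sorted_rows = sort_rows(rows, "city")
for row in sorted_rows:
    print(row["city"], row["cars_travel"], row["accidents"])
```

Sorting complete rows, rather than one column on its own, is what keeps each city paired with its own counts. That is the same reason the Excel step asks you to expand the selection.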

T2.2. Descriptive Analysis in Excel and Data Visualization in Excel
o Click on the cell under the title “Average”. Under the “Formulas” tab click on “More Functions”, “Statistical”, and “AVERAGE”.
o To find the average number of cars that traveled, enter “D2:D21” in “Number 1” and click “OK”.
o To find the standard deviation of the number of accidents, select the cell under the title “Stdev” and click on “More Functions”, “Statistical”, and “STDEV.S”. In “Number 1” put “E2:E21”.
o Highlight the data under “Cars Travel” (D2:D21). Go to the “Insert” tab, click on “Line”, and select the first dot line option given.
o You can edit the chart in multiple ways and change its look using the tabs that show up when the chart is selected.
o Under the “Layout” tab you can add titles for the X and Y axes.
o Right click the chart and select “Change Chart Type…” if you want a different type of graph.

T2.3. Case Study: Social Readjustment Rating Scale (SRRS) question on Diabetes
o Open “Case Study-Diabetes” from your student folder.
o Select cell C2, type in the formula “=$B2*$B2”, and press Enter.
o Hover your mouse over the lower right corner of the cell and drag the black box down to C7 to apply this formula to the rest of the cells.
o Select cell E2, type in the formula “=$D2*$D2”, and again drag down to cell E7.
o Select cell F2, type in the formula “=$B2*$D2”, and again drag down to cell F7.
o Select cell B8, click on the “Math and Trig” button, and select “SUM”. In “Number 1” type “B2:B7” and then press “OK”.
o Drag the black box across to F8. Now your table should be complete.
o Click on the “Insert” tab, pull down the “Scatter” button, and select the first chart.
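The spreadsheet work in T2.2 and T2.3 can be mirrored in a few lines of Python. This is a minimal sketch with made-up numbers, since the contents of the student files are not shown here. `statistics.mean` and `statistics.stdev` play the role of AVERAGE and STDEV.S, and the squared and cross-product columns are summed the same way as row 8 of the case-study table. The last step, combining those sums into Pearson's correlation coefficient, is an assumption about what the table is for; the lab itself stops at the scatter chart.

```python
import math
import statistics

# Made-up paired values standing in for columns B and D, rows 2 to 7 (assumption).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0, 7.0]

mean_x = statistics.mean(x)     # Excel: =AVERAGE(B2:B7)
stdev_x = statistics.stdev(x)   # Excel: =STDEV.S(B2:B7), the sample standard deviation

# Helper columns of the case-study table.
x_sq = [v * v for v in x]            # column C: =$B2*$B2
y_sq = [v * v for v in y]            # column E: =$D2*$D2
xy = [a * b for a, b in zip(x, y)]   # column F: =$B2*$D2

# Row 8: the column totals.
n = len(x)
sum_x, sum_y = sum(x), sum(y)
sum_x_sq, sum_y_sq, sum_xy = sum(x_sq), sum(y_sq), sum(xy)

# Pearson's r computed from the totals (assumed purpose of the table).
numerator = n * sum_xy - sum_x * sum_y
denominator = math.sqrt((n * sum_x_sq - sum_x ** 2) * (n * sum_y_sq - sum_y ** 2))
r = numerator / denominator

print("mean of x:", mean_x)
print("sample stdev of x:", round(stdev_x, 4))
print("Pearson r:", round(r, 4))
```

A value of r near 1 or -1 corresponds to points that fall close to a straight line in the scatter chart, while a value near 0 corresponds to no visible linear trend.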


T3. Inference Statistics in Tanagra

Objective: Learn and extract knowledge from data using data mining tools (association rules, clustering, neural networks) in Tanagra.

Data Mining


T3.1. Data Mining using Association Rules on a Dataset
o Open Tanagra. Pull down “File” and select “Open…”. Pull down “Files of Type” and select “Binary data mining diagram (*.bdm)”. Then open the file “T41Bin_transactions.bdm” in your student folder.
o Click on the “Define Attributes Status” icon [ ]. Select all “Attributes” by clicking on [ ] and move them over into “Input” by clicking on the arrow. Click “OK”.
o Right click “Define Status 1” on the left and select “Execute”.
o Click on “Association” at the bottom. Drag “A priori” on top of “Define Status 1” at the top left. Then double click “A priori 1” to see the results.
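To see what an association-rule search computes, here is a small, simplified Apriori-style sketch in plain Python. It is not Tanagra's implementation: the transactions are made up (the contents of “T41Bin_transactions.bdm” are not shown in this lab), and the thresholds 0.5 and 0.7 are arbitrary choices for the example.

```python
from itertools import combinations

# Made-up transactions for illustration only (assumption).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"butter", "milk"},
    {"bread", "butter"},
    {"bread", "butter", "milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def frequent_itemsets(transactions, min_support):
    """Level-wise search: larger candidates are built only from frequent smaller sets."""
    items = sorted({item for t in transactions for item in t})
    current = [frozenset([i]) for i in items
               if support(frozenset([i]), transactions) >= min_support]
    frequent = {}
    while current:
        for itemset in current:
            frequent[itemset] = support(itemset, transactions)
        # Join step: merge pairs of frequent sets that differ by exactly one item.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == len(a) + 1}
        current = [c for c in candidates
                   if support(c, transactions) >= min_support]
    return frequent


def rules(frequent, min_confidence):
    """Return (antecedent, consequent, support, confidence) for each strong rule."""
    found = []
    for itemset, supp in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                confidence = supp / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, supp, confidence))
    return found


frequent = frequent_itemsets(TRANSACTIONS, min_support=0.5)
strong_rules = rules(frequent, min_confidence=0.7)
for antecedent, consequent, supp, conf in strong_rules:
    print(sorted(antecedent), "->", sorted(consequent),
          "support", round(supp, 2), "confidence", round(conf, 2))
```

Support measures how often the items occur together, and confidence measures how often the consequent appears when the antecedent does. A rule is reported only when both pass their thresholds.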

T3.2. K-Means Clustering Method for Data Learning
o Open Tanagra, pull down “File”, and click on this icon [ ]. Then click on this icon [ ]. Change the file type to “Excel File..” and select the file “cars” from your student folder. Click “Open”.
o Click on this icon [ ]. Move “MPG”, “Weight”, and “Drive Ratio” into “Input” and click “OK”.
o Click on “Clustering” at the bottom, and drag “HAC” on top of “Define status 1”. Then right click “HAC 1” and change “Best Clusters” to “Detect”.
o You can open the “Dendrogram” tab at the top of the view window. Move back to the “Report” view.
o Add another “Define Status” under “HAC 1” and move all of the “Attributes” into “Input” except “Car” and “Cluster_HAC_1”.
o Open the “Target” tab and put “Cluster_HAC_1” in it.
o Click on “Statistics” and put “Group characterization” under “Define status 2”.
o Click on “Factorial Analysis” and put “Principal Component Analysis 1” under “HAC 1”. Add a “Define status” under “Principal Component Analysis 1” as well. Change the “Parameters” for the new “Define status 3” by putting “MPG”, “Weight”, “Drive_Ratio”, “Horsepower”, “Displacement”, and “cylinders” in “Input” and “PCA_1_Axis_1” and “PCA_1_Axis_2” in “Target”.
o Click on “Data Visualization” and put “Correlation scatterplot” under “Define status 3”. In the top right pull-down of the view window select “Cluster_HAC_1”. In the other two pull-downs put “PCA_1_Axis_1” and “PCA_1_Axis_2”.
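The HAC component used in the steps above performs hierarchical agglomerative clustering. The sketch below shows the basic idea in plain Python under several assumptions: the five cars and their values are invented (the “cars” file is not shown in this lab), average linkage is used for simplicity, the variables are not standardized, and the number of clusters is fixed at two by hand. Tanagra's own linkage rule and its automatic detection of the best number of clusters are not reproduced.

```python
import math

# Made-up (MPG, Weight, Drive Ratio) values for illustration only (assumption).
CARS = {
    "car_a": (30.0, 2.0, 3.5),
    "car_b": (32.0, 2.1, 3.6),
    "car_c": (16.0, 4.0, 2.5),
    "car_d": (17.0, 4.2, 2.4),
    "car_e": (31.0, 1.9, 3.7),
}


def average_linkage(cluster_a, cluster_b, points):
    """Mean pairwise Euclidean distance between the members of two clusters."""
    total = sum(math.dist(points[a], points[b])
                for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def hac(points, n_clusters):
    """Start with one cluster per point and repeatedly merge the two closest clusters."""
    clusters = [[name] for name in sorted(points)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = average_linkage(clusters[i], clusters[j], points)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]   # merge cluster j into cluster i
        del clusters[j]
    return [sorted(c) for c in clusters]


clusters = hac(CARS, n_clusters=2)
for index, members in enumerate(clusters, start=1):
    print("cluster", index, members)
```

The sequence of merges is what the dendrogram view draws. Cutting the tree at a given height gives the cluster labels that Tanagra stores in “Cluster_HAC_1”.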


T4. Epidemiology Modeling

Objective: How to use a modeling approach to analyze epidemiology problems

T4.1. Using NetLogo
On the NetLogo website: http://ccl.northwestern.edu/netlogo/ (make sure you are using the Chrome browser)
o Click on “Library”, scroll to the bottom of the page, and select “epiDEM Travel and Control”

T4.2. Running epiDEM Simulation
o Click “Run epiDEM Travel and Control in your browser” to launch it
o You can change the parameters of the system as you please and simulate the experiment
o Click “Setup” and the system will automatically build a population
o Execute by clicking “Go”
o You will be able to see the results of the population changes in the graphs on the left
o Right click and select “Copy Image” in order to copy the image of your results
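To get a feel for how changing parameters alters the course of an epidemic, a very small compartmental model can be run in Python. This is a sketch of a discrete-time SIR (susceptible, infected, recovered) model and not a port of epiDEM, which simulates individual agents in NetLogo. The function name, the parameter names `beta` and `gamma`, and all numeric values are assumptions chosen for illustration.

```python
def simulate_sir(population, initial_infected, beta, gamma, days):
    """Discrete-time SIR model: returns a list of (S, I, R) tuples, one per day.

    beta  = assumed infection rate per day
    gamma = assumed recovery rate per day
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    history = [(s, i, r)]
    for _ in range(days):
        new_infections = beta * s * i / population
        new_recoveries = gamma * i
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        history.append((s, i, r))
    return history


history = simulate_sir(population=1000, initial_infected=10,
                       beta=0.3, gamma=0.1, days=100)

# Find the day on which the number of infected people is highest.
peak_day, peak_state = max(enumerate(history), key=lambda item: item[1][1])
print("peak day:", peak_day)
print("infected at peak:", round(peak_state[1], 1))
print("recovered at end:", round(history[-1][2], 1))
```

Rerunning with a smaller `beta` or a larger `gamma` flattens and delays the peak, which is the same kind of experiment the epiDEM sliders let you perform interactively.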
