SAURABH_CV_MACHINE



Saurabh Gawande

E-mail: [email protected]
Website: in.linkedin.com/pub/saurabh-gawande/27/663/42
Phone: +919822903716

Summary and Objective

A Computational Biologist with experience in Healthcare Data Analytics. Recently completed a certification course, "Big Data Analytics & Optimization", certified by The Language Technologies Institute, Carnegie Mellon University.

Worked on predictive analytics consulting assignments for clients in the Operations Research, Healthcare & Social Analytics domains. Was proactively involved in the analytical workflow from start to finish: translating the business case into a proper analytics agenda, developing hypotheses, wrangling & understanding the data, exploring patterns, executing models on the data via statistics & data mining algorithms, unearthing insights from the data, and communicating findings & recommendations to technical as well as non-technical stakeholders.

Currently, I am a Technical Contributor to The District Informatics Officer at The National Informatics Center, Akola, working on creating better e-governance frameworks through technical resource optimization & virtualization.

Previously, I was a Junior Research Fellow at The National Centre for Cell Science, where I was involved in setting up next-gen sequencing pipelines and extracting insights from the huge amounts of data generated by sequencing experiments, using R & Bioconductor. Working here sparked my interest in Big Data Analytics.

I aspire to work at an organisation where I will get to learn and work on challenging problems pertaining to Predictive Analytics and Big Data across several verticals (especially Renewable Energy & Healthcare).

Work experience

Technical Contributor to The District Informatics Officer
National Informatics Center, February 2014 to February 2015

I was involved in executing a project called "Virtualization using Virtual Machines for effective and concrete E-Governance framework."

The goal of this project was to design a framework to deliver the maximum throughput using technical resource optimization, with improved and cost-effective e-governance services.

This was done by using technologies such as virtualization, consolidation and cloud computing, and by adopting free and open source software in designing and deploying e-governance services, which led to resource optimization and a reduction in the total cost of both hardware and software. Ultimately, the financial burden borne by the state and central governments was greatly reduced.



This project was the recipient of multiple awards, some of them being:

1] e-Maharashtra 2013 Excellence Award for best G2G initiative for the project Virtual Machine.

2] Recognized as "eGovernance Champion for year 2013-14" on the occasion of UN Public Service Day, 23rd June 2013, in the presence of Ms. Lise Grande, UN Resident Coordinator, India.

3] Winner of Manthan South West India Awards 2014 in the e-Governance category.

Junior Research Fellow
National Centre For Cell Science, February 2013 to February 2014

At The National Centre For Cell Science (www.nccs.res.in/), which is an autonomous institute of the Dept. of Biotechnology, Govt. of India, I was based at The Microbial Culture Collection unit.

I was involved in a project titled "Studying Gene integration of bacteria in the Mosquito Genome" (see projects).

Concurrently, I was also involved in metagenomic data analysis of the sequencing experiments undertaken at The Microbial Culture Collection unit, carried out by Sanger sequencing and the Ion Torrent Personal Genome Machine.

Was involved in setting up pipelines and conducting next-gen sequence data analysis and RNA sequence analysis using R and Bioconductor.

Lead Faculty in Bioinformatics and Molecular Biology
ATG LAB, August 2012 to January 2013

I joined ATG Lab as a Bioinformatics/Molecular Biology faculty member for the training programs and workshop services of ATG LAB.

Trained undergraduate students in various in silico experiments, such as homology modeling and structure prediction of proteins from amino acid sequence data, and routine bioinformatics analysis of chromatogram-generated sequences with the help of online tools and software.

Trained students in the following techniques: DNA extraction, PCR, agarose gel electrophoresis, sample preparation for DNA sequencing, NCBI sequence search analysis based on nucleotide identity, phylogenetic analysis with MEGA4.1, and protein model visualization with Swiss-Model, Chimera, Discovery Studio etc.

Conducted and led hands-on training workshops ranging from PCR to sequencing data analysis across various colleges in India.

Research Intern
IBI Biosolutions, June 2010 to July 2010

Drug design is a process used in the biopharmaceutical industry to discover and develop new drug compounds.

We used a variety of computational methods (structure-based drug design, ligand-based drug design, de novo design and homology modeling) to identify novel compounds and to design compounds for selectivity, efficacy and safety, which would be further used for designing a drug for asthma.



Certifications

Completed the 22-week program "Certificate In Engineering Excellence in Big Data Analytics And Optimization" at The International School Of Engineering, certified by "The Language Technologies Institute Of Carnegie Mellon University".

Some advanced subjects covered in the curriculum:

Essential Business Skills for a Data Scientist, Essential Engineering Skills in Big Data Analytics
Statistical Modeling for Predictive Analytics, Engineering Big Data with R and Hadoop Ecosystem
Text Mining, Social Network Analysis and Natural Language Processing
Methods and Algorithms in Machine Learning, Optimization and Decision Analysis
Communication, Ethical and IP Challenges for Analytics Professionals

Also completed two real-life analytics capstone projects across the Healthcare and Operations Research verticals (see projects).

Projects

Data Science / Predictive analytics :

1] "Big Data Analytics of Email Marketing Campaign" : (Ongoing)

Project in brief: The client is a magazine publisher whose e-mail marketing campaigns were analyzed to extract actionable insights about its customers. Calculated and analyzed click-to-open rates (CTOR) to give information on several customer parameters (CTOR by gender, income, demographics etc.). The future plan is to create an automated workflow for the data flow from source to final reports/dashboards.

Data flow in brief:
a] Copy files into HDFS
b] Process data by applying transformations & calculations using Map-Reduce/Pig
c] Query to load data for reports using Hive
d] Build reports/dashboards using Tableau
e] Automate the entire flow using Oozie.

2] "DEVELOPING ROBUST PATIENT ADHERENCE MODELS AND FRAMEWORKS."

Project in brief: A leading pharmacy retailer had collected a large volume of patient transaction data (fills of their medicines each month). The client believed that engaging patients is one of the best ways to improve medication adherence, and plans to proactively target those members who are at risk. Created a model for the client that accounts for the risk of non-adherence per patient and predicts whether the patient is likely to be non-adherent.

Techniques used: Linear Regression, Decision Trees, Random Forests.
Tools: R, Weka, Tableau.



3] " Vehicle Fleet Optimization."

Project in brief: An environment and waste management enterprise collected its trip-level transaction data and GPS vehicle tracking data. I was involved in extracting patterns, anomalies and actionable insights from this data to work out an optimized way to use their resources and to improve their business.

Techniques used: Linear Regression, Decision Trees, Naive Bayes, Random Forests, Longitudinal Models, Support Vector Machine.
Tools: R, Weka, Tableau.

Bioinformatics /Genomics /Molecular Biology :

1] " Study on Bacterial Gene Integration in the mosquito genome."

Place of Research: Microbial Culture Collection Unit at The National Centre for Cell Science, Pune

In this project we examined the relationships of different mosquito species (Aedes aegypti, Anopheles stephensi, Culex quinquefasciatus, Anopheles gambiae) with endosymbionts, specifically Wolbachia. Through genome analysis of both the mosquito genomes and the Wolbachia genome, an investigation was performed to examine the possible gene integration of endosymbionts into the mosquito genome.

2] Studies on Bacillus subspecies group bacterium isolated from Indian currency note by RNA dependent RNA polymerase Beta subunit, Nucleotide and Amino acid sequence analysis for molecular differentiation

Place of Research: ATG LAB, Pune

The rpoB gene, encoding the beta-subunit of RNA polymerase, has emerged as a core gene candidate for phylogenetic analyses and identification of bacteria, especially when studying closely related isolates of the Bacillus cereus group of bacteria. Together with the 16S rRNA gene, rpoB has helped to delineate new bacterial species and refine bacterial community analysis. The B. anthracis isolate identified from the Indian currency note was subsequently analyzed using rpoB as a molecular marker.

DNA sequence analysis of the 988 bp gene segment obtained with the primer set 1712F and 2700R, and of the corresponding amino acid sequence, using BLAST suggests maximum similarity with B. anthracis. DNA sequence analysis by NCBI BLAST, multiple sequence alignment by ClustalW and phylogenetic analysis by MEGA5 of the nucleotide sequence suggest that the studied isolate of B. anthracis group bacteria very closely matches Bacillus anthracis.

In addition to this, the present study also includes comparative structural analysis of the 5'-3' mid region of the rpoB protein, based on the available and established structure of RNA dependent RNA polymerase protein. The present study covers a total of five mutations at the gene level compared to the reference strain B. anthracis Ames Ancestor, which was used as the reference strain for primer design. Although the Bacillus cereus group covers four species, i.e. Bacillus cereus, Bacillus anthracis, Bacillus thuringiensis and Bacillus mycoides, the isolated bacterium, i.e. Bacillus anthracis group species DL7 strain, is comparatively very close to Bacillus anthracis.

Nucleotide NCBI published sequences: http://www.ncbi.nlm.nih.gov/nuccore/?term=gawande+s+lingojwar



Education

Bachelors of Technology, August 2008 to August 2012
Dr. D. Y. Patil Biotechnology and Bioinformatics Institute, Pune
Cumulative GPA: 7.26/10
Ranked in the top 10% in the last years
Ranked in top 5 in the last semester.

Relevant Coursework: PERL Programming, JAVA, Advanced JAVA, C, C++, Biomathematics, Protein Modelling & Drug Designing, SQL, Chemoinformatics, Molecular Biology, Structural Biology, Molecular Modelling, Genomics and Proteomics.

Technical Skills

A] Data Science & Predictive Analytics: Machine Learning, Data Mining, Statistical Modelling, Text Analytics, Linear Optimization, Time Series Forecasting.

B] Big Data: Working knowledge of Hadoop ecosystem (HDFS, Map-Reduce, Sqoop & Flume, Pig, Hive, HBase)

C] Visualization Tools: Tableau, Shiny

D] Programming: C, R, JAVA

E] Operating Systems: Windows, Linux

F] Genomics/Bioinformatics: Next Generation Sequencing data analysis (using R, Bioconductor, BOWTIE, ABYSS, Galaxy workbench, SeqGene), Microarray data analysis, In silico drug designing

G] Computing : Full virtualization, Cloud Computing, Distributed computing, Server consolidation

References

References available upon request.

Interests

Internet of Things, Big Data, Personalized Medicine, Predictive Analytics



Volunteer Work

A] Cytocomp:

Cytocomp is a startup which focuses on applying synthetic biology and nano bioelectronics to engineer solutions to monitor and control biological systems. Cytocomp is involved in developing a general purpose biological microprocessor. This microprocessor takes both biological and electrical signals as input and output, so users can track input and output in the microprocessor via a smartphone interface.

CytoComp's computer-assisted design (CAD) software makes it possible to custom-design a biological microprocessor for specific purposes. CytoComp builds the microprocessor as per the specifications made in the CAD software. This microprocessor is able to take biological signals as input and will have a biological regulator as output. As the biological microprocessor can communicate with a smartphone, many potential applications can be developed.

I have been granted an exclusive developer's license, which enables me to develop my own biological microprocessor through Cytocomp's CAD software. Cytocomp was also the leader of Stanford's Startup Engineering program in 2013 (http://startupmooc.org/).

B] Leukippos:

I am also involved in collaborative problem solving at the open web platform 'Leukippos', which aims at applying web technology to in silico synthetic biology projects. The group currently focuses on the following projects: the synthetic biology software search engine 'SynBioAppSelector' and the game 'SynBrick', which aims to educate people about synthetic biology through an interactive game.
