Ameen Eetemadi - ameenetemady.github.io · Ameen Eetemadi Title: Ph.D. Candidate Citizenship: U.S....

6
Ameen Eetemadi Title: Ph.D. Candidate Email: [email protected] Research Area: Biomedical Informatics and Machine Learning UC Davis Genome Center and Department of Computer Science, Kemper Hall 2063, University of California, Davis To Develop Diet Recommendation Systems for Human Health and Diseases. Objective Eetemadi, A. and Tagkopoulos, I., 2021. Methane and fatty acid metabolism pathways are Selected Publications predictive of Low-FODMAP diet efficacy for patients with Irritable Bowel Syndrome. Clinical Nutrition Wang X, Rai N.,Pereira B., Eetemadi, A. and Tagkopoulos, I., 2020. Accelerated knowledge discovery from omics data by optimal experimental design. Nature Communications Eetemadi, A., Rai N., Pereira B., Kim M., Schmitz H. and Tagkopoulos, I., 2020, The Com- putational Diet: A Review of Computational Methods Across Diet, Microbiome, and Health. Frontiers in Microbiology Eetemadi, A. and Tagkopoulos, I., 2019. Genetic Neural Networks: an artificial neural network architecture for capturing gene expression relationships. Bioinformatics. Kim, M.*, Eetemadi, A.* and Tagkopoulos, I., 2017. DeepPep: Deep proteome inference from peptide profiles. PLoS computational biology, 13(9), p.e1005661. (*contributed equally) Eetemadi, A., 2012. Medical data analysis method for epilepsy, Master’s dissertation, Wayne State University. Eetemadi, A., Siadat, M.R., Soltanian-Zadeh, H., Fotouhi, F. and Elisevich, K., 2007, “Content- Based Support Environment (C-BASE): Data Preparation and Similarity Measurement.”,Proceedings of the Seventh IEEE International Conference on Data Mining (ICDM’07), pp. 145-150, Omaha, NE, USA, October 28-31, University of California, Davis, CA (2014 - now) Education Ph.D. Candidate in Computer Science (Machine Learning & Biofinformatics) Wayne State University, Detroit MI (graduated 2012) M.Sc in Computer Science (Midical Data Mining) Sharif University of Technology, Tehran, Iran (graduated 2005) B.Sc in Computer Engineering (Software) University of California, Davis, CA (2014 - now) Work experience Graduate Research Assistant, Department of Computer Science and Genome Center Microsoft, Redmond, WA (2008 - 2014) Software Development Engineer, Microsoft Office Henry Ford Health Systems, Detroit, MI (2005 - 2008) Graduate Research Assistant, Health Informatics University of California, Davis, CA (2014 - 2020) Teaching experience Instructor: ECS 124 Bioinformatics Teaching Assistant: ECS 171 Machine Learning, ECS 124 Bioinformatics, ECS 120 Theory of Computations, ECS 36C Data Structures and Algorithms, ECS 30 Programming&Prob Solving. Software Technologies Skills Sequence analysis (DNA, RNA and Metagenomics), Supervised and Unsupervised Machine Learning (Torch7, TensorFlow and scikit-learn), TCP/IP and RESTful web services. Programming Technologies Python, R, MATLAB, C++, C#, Java, HTML JavaScript, AngularJS, Node.js, Perl, ASP.net, PHP, HPC (Slurm, TORQUE) Database Systems MongoDB, PostgreSQL, Oracle (+PL/SQL), MSSQL

Transcript of Ameen Eetemadi - ameenetemady.github.io · Ameen Eetemadi Title: Ph.D. Candidate Citizenship: U.S....

  • Ameen Eetemadi

    Title: Ph.D. CandidateEmail: [email protected] Area: Biomedical Informatics and

    Machine Learning

    UC Davis Genome Center andDepartment of Computer Science,Kemper Hall 2063,University of California, Davis

    To Develop Diet Recommendation Systems for Human Health and Diseases.Objective

    � Eetemadi, A. and Tagkopoulos, I., 2021. Methane and fatty acid metabolism pathways areSelectedPublications predictive of Low-FODMAP diet efficacy for patients with Irritable Bowel Syndrome. Clinical

    Nutrition

    � Wang X, Rai N.,Pereira B., Eetemadi, A. and Tagkopoulos, I., 2020. Accelerated knowledgediscovery from omics data by optimal experimental design. Nature Communications

    � Eetemadi, A., Rai N., Pereira B., Kim M., Schmitz H. and Tagkopoulos, I., 2020, The Com-putational Diet: A Review of Computational Methods Across Diet, Microbiome, and Health.Frontiers in Microbiology

    � Eetemadi, A. and Tagkopoulos, I., 2019. Genetic Neural Networks: an artificial neuralnetwork architecture for capturing gene expression relationships. Bioinformatics.

    � Kim, M.*, Eetemadi, A.* and Tagkopoulos, I., 2017. DeepPep: Deep proteome inferencefrom peptide profiles. PLoS computational biology, 13(9), p.e1005661. (*contributed equally)

    � Eetemadi, A., 2012. Medical data analysis method for epilepsy, Master’s dissertation, Wayne StateUniversity.

    � Eetemadi, A., Siadat, M.R., Soltanian-Zadeh, H., Fotouhi, F. and Elisevich, K., 2007, “Content-Based Support Environment (C-BASE): Data Preparation and Similarity Measurement.”,Proceedingsof the Seventh IEEE International Conference on Data Mining (ICDM’07), pp. 145-150, Omaha, NE,USA, October 28-31,

    � University of California, Davis, CA (2014 - now)EducationPh.D. Candidate in Computer Science (Machine Learning & Biofinformatics)

    � Wayne State University, Detroit MI (graduated 2012)M.Sc in Computer Science (Midical Data Mining)

    � Sharif University of Technology, Tehran, Iran (graduated 2005)B.Sc in Computer Engineering (Software)

    � University of California, Davis, CA (2014 - now)Workexperience Graduate Research Assistant, Department of Computer Science and Genome Center

    � Microsoft, Redmond, WA (2008 - 2014)Software Development Engineer, Microsoft Office

    � Henry Ford Health Systems, Detroit, MI (2005 - 2008)Graduate Research Assistant, Health Informatics

    � University of California, Davis, CA (2014 - 2020)Teachingexperience Instructor: ECS 124 Bioinformatics

    Teaching Assistant: ECS 171 Machine Learning, ECS 124 Bioinformatics, ECS 120 Theory ofComputations, ECS 36C Data Structures and Algorithms, ECS 30 Programming&Prob Solving.

    � Software TechnologiesSkillsSequence analysis (DNA, RNA and Metagenomics), Supervised and Unsupervised MachineLearning (Torch7, TensorFlow and scikit-learn), TCP/IP and RESTful web services.

    � Programming TechnologiesPython, R, MATLAB, C++, C#, Java, HTML JavaScript, AngularJS, Node.js, Perl, ASP.net,PHP, HPC (Slurm, TORQUE)

    � Database SystemsMongoDB, PostgreSQL, Oracle (+PL/SQL), MSSQL

  • Selected Publication Abstracts

    Methane and fatty acid metabolism pathways are predictive of Low-FODMAP dietefficacy for patients with Irritable Bowel Syndrome

    Ameen Eetemadi and Ilias Tagkopoulos.Clinical Nutrition (2021).https://doi.org/10.1016/j.clnu.2020.12.041.

    Abstract: Objective. Identification of microbiota-based biomarkers as predictors of low-FODMAPdiet response and design of a diet recommendation strategy for IBS patients. Design. We createda compendium of gut microbiome and disease severity data before and after a low-FODMAPdiet treatment from published studies followed by unified data processing, statistical analysisand predictive modeling. We employed data-driven methods that solely rely on the compendiumdata, as well as hypothesis-driven methods that focus on methane and short chain fatty acid(SCFA) metabolism pathways that were implicated in the disease etiology. Results. The pa-tient’s response to a low-FODMAP diet was predictable using their pre-diet fecal samples withF1 accuracy scores of 0.750 and 0.875 achieved through data-driven and hypothesis-driven pre-dictors, respectively. The fecal microbiome of patients with high response had higher abundanceof methane and SCFA metabolism pathways compared to patients with no response (p-values< 6 × 10−3). The genera Ruminococcus 1, Ruminococcaceae UCG-002 and Anaerostipes canbe used as predictive biomarkers of diet response. Furthermore, the low-FODMAP diet follow-ers were identifiable given their microbiome data (F1-score of 0.656). Conclusion. Our inte-grated data analysis results argue that there are two types of patients, those with high colonicmethane and SCFA production, who will respond well on a low-FODMAP diet, and all others,who would benefit a dietary supplementation containing butyrate and propionate, as well asprobiotics with SCFA-producing bacteria, such as Lactobacillus. This work demonstrates howdata integration can lead to novel discoveries and paves the way towards personalized dietrecommendations for IBS. The source code for reproducing the article results is available athttps://github.com/IBPA/FODMAPsAndGutMicrobiome.

    Taxa1*

    Taxa2

    1 2AATGCGCACG

    CGGCA

    Pre-dietFecal

    Metagenomes

    IBS

    Patie

    nts

    AATGCGCACG

    TGGCA

    ResponseHighLowNo

    Five Clinical Trials

    Integrated Microbiome

    Data Processing

    PathwayAbundances

    TaxaAbundances

    Hypothesis-Driven

    Analysis

    Data-Driven

    Analysis

    Resp

    onse

    HighLow

    No

    1Methane Metabolism

    Fatty Acid Metabolism2

    0.0 1.00.51.0**

    1 2

    High/NoResponse

    Model Training

    Sens

    itivity

    1-Specificity

    Reca

    ll

    Precision

    Resp

    onse

    HighNo

    0 10.5*

    Pathway1

    0 10.5*

    Pathway2

    Differential Analysis

    High/NoResponse

    Model Training

    Sens

    itivity

    1-Specificity

    Reca

    ll

    Precision

    UndigestedCarbs

    MethaneMetabolism

    Fatty AcidMetabolism

    CH4

    Short Chain FattyAcids (SCFAs)

    IntestinalMotility

    (inhibits) Constipation/Bloating

    Colon EpitheliumHealth

    ColonFODMAPs

    Hypothesis

    Statistical Validation Diet Response Predictor

    Train Evaluate

    Diet Response Predictor

    EvaluateTrain

    ,

  • .

    Accelerated knowledge discovery from omics data by optimal experimental design

    Xiaokang Wang, Navneet Rai, Beatriz Merchel Piovesan Pereira, Ameen Eetemadi, and IliasTagkopoulos.Nature Communications 11.1 (Oct. 2020), p. 5026.https://doi.org/10.1038/s41467-020-18785-y.

    Abstract: How to design experiments that accelerate knowledge discovery on complex bi-ological landscapes remains a tantalizing question. We present an optimal experimental de-sign method (OPEX) to identify informative omics experiments using machine learning mod-els for both experimental space exploration and model training. OPEX-guided exploration ofEscherichia coli ’s populations exposed to biocide and antibiotic combinations lead to more ac-curate predictive models of gene expression with 44% less data. Analysis of the proposed exper-iments shows that broad exploration of the experimental space followed by fine-tuning emergesas the optimal strategy. Additionally, analysis of the experimental data reveals 29 cases of cross-stress protection and 4 cases of cross-stress vulnerability. Further validation reveals the centralrole of chaperones, stress response proteins and transport pumps in cross-stress exposure. Thiswork demonstrates how active learning can be used to guide omics data collection for trainingpredictive models, making evidence-driven decisions and accelerating knowledge discovery inlife sciences.

    OPEX

    0

    1.0

    0 1.0

    L H

    x1

    x 2

    Predicted GE

    3. Select OptimalConditions

    12

    N

    Loop Until k Conditions

    Selected(k=2)

    Select next highscoring condition

    Remove if similarto selected conditions

    1432

    RepresentationBased

    x1 x2

    UncertaintyScore

    RepresentationScore

    2115

    UncertaintyBased

    x1 x2

    Selec

    ted

    Cond

    itions

    1. Build Predictive Model

    x1 x212

    5Train Gaussian

    Processes(GP Model)

    ,

    1.00.0

    Obse

    rved

    Cond

    itions

    ExperimentalSetting

    GE

    GeneExpression

    GP M

    odel

    New SampledCondition

    Trained Kernel

    Training Data

    2. CalculateUtility Scores

    Disc

    retiz

    ed

    Spac

    e

    Unobserved

    Entro

    py

    Mut

    ual

    Info

    rmat

    ion

    x1

    x 2

    12

    N

    UnobservedSampled

    Conditions

    x1 x2

    CalculateUtility

    Scores

    RepresentationScore

    12

    N

    I(xi,xj)

    UncertaintyScore

    12

    N

    Pr(xi)

    H(xi )

    Observed

    Utility Score

    0

    1.0

    0 1.0L H

    x1

    x 2

    0

    1.0

    0 1.0

    21 15

    3214

    SelectedConditionsx 2

    x1

    L H

    0

    1.0

    0 1.0

    x 2

    Gene Expression (GE)

    ObservedConditions

    23

    45

    1

    x1

    4. PerformExperiments

    Candidate Conditions

    x2

    E. coli

    Log

    Phas

    e

    x1

    E. coli

    Add Biocide

    Add Antibiotic

    RNA-Seq

    RNA-Seq Analysis

    x1 x2

    GE

  • .

    The Computational Diet: A Review of Computational Methods Across Diet, Mi-crobiome, and Health

    Ameen Eetemadi, Navneet Rai, Beatriz Merchel Piovesan Pereira, Minseung Kim, HaroldSchmitz, and Ilias Tagkopoulos.Frontiers in Microbiology 11 (2020), p. 393.https://www.frontiersin.org/article/10.3389/fmicb.2020.00393.

    Abstract: Food and human health are inextricably linked. As such, revolutionary impacts onhealth have been derived from advances in the production and distribution of food relating tofood safety and fortification with micronutrients. During the past two decades, it has becomeapparent that the human microbiome has the potential to modulate health, including in waysthat may be related to diet and the composition of specific foods. Despite the excitement andpotential surrounding this area, the complexity of the gut microbiome, the chemical compositionof food, and their interplay in situ remains a daunting task to fully understand. However,recent advances in high-throughput sequencing, metabolomics profiling, compositional analysisof food, and the emergence of electronic health records provide new sources of data that cancontribute to addressing this challenge. Computational science will play an essential role inthis effort as it will provide the foundation to integrate these data layers and derive insightscapable of revealing and understanding the complex interactions between diet, gut microbiome,and health. Here, we review the current knowledge on diet-health-gut microbiota, relevant datasources, bioinformatics tools, machine learning capabilities, as well as the intellectual propertyand legislative regulatory landscape. We provide guidance on employing machine learning anddata analytics, identify gaps in current methods, and describe new scenarios to be unlocked inthe next few years in the context of current knowledge.

  • .

    Genetic Neural Networks: an artificial neural network architecture for capturinggene expression relationships

    Ameen Eetemadi and Ilias Tagkopoulos.Bioinformatics 35.13 (2018), pp. 2226–2234.https://doi.org/10.1093/bioinformatics/bty945.

    Abstract: Gene expression prediction is one of the grand challenges in computational biology.The availability of transcriptomics data combined with recent advances in artificial neural net-works provide an unprecedented opportunity to create predictive models of gene expression withfar reaching applications. We present the Genetic Neural Network (GNN), an artificial neuralnetwork for predicting genome-wide gene expression given gene knockouts and master regulatorperturbations. In its core, the GNN maps existing gene regulatory information in its architec-ture and it uses cell nodes that have been specifically designed to capture the dependencies andnon-linear dynamics that exist in gene networks. These two key features make the GNN archi-tecture capable to capture complex relationships without the need of large training datasets.As a result, GNNs were 40% more accurate on average than competing architectures (MLP,RNN, BiRNN) when compared on hundreds of curated and inferred transcription modules. Ourresults argue that GNNs can become the architecture of choice when building predictors of geneexpression from exponentially growing corpus of genome-wide transcriptomics data. The sourcecode of GNN is available at https://github.com/IBPA/GNN.

    RNNMLP BiRNN

    Reference Methods

    Pred

    icted

    GE

    Observed GE

    Gene Expression (GE)

    Pear

    son

    Corre

    lation

    Dataset Size

    Predicted vs. ObservedGE Correlation

    Mea

    n Ab

    solut

    e Er

    ror

    Method

    Overall GE Prediction Error

    Genetic Neural Network

    Proposed Method

    Master RegulatorExpression

    KnockoutInformation

    Pred

    icted

    Exp

    ress

    ion

    Stratified Datasets

    Gene

    Kno

    ckou

    t Mat

    rix

    Master Regulators

    {Gene

    Expr

    essio

    nSy

    nthe

    tic a

    nd R

    eal

    Gene

    Reg

    ulato

    ry N

    etwo

    rks

    (Step1)Normalization

    and Stratification

    (Step2)Model Construction

    and Training

    (Step3)Model Evaluation

    Chemotaxis

    fliF

    fliG

    fliK

    fliJfliH

    fliIfliE

    flgN

    csgD

    flgM iHFompR

    rcsAB

    flgG

    fliN

    flgF

    fliM

    flgI

    flgK

    flgE

    fliD

    fliO

    rbsR

    malT

    cpxRmglB

    motB

    tsr cheWcheA

    motA

    flgH

    fliP

    flgD

    flhA

    fliQ

    fliT

    fliA

    fliZ

    nsrRrbsB

    sutRmatA

    fliC

    flhD

    flhC

    hdfR

    flhB

    flgC

    flgB

    fliR

    fliS

    flgL

    flhDCtrg

    aer

    malE

    cRP

    lrhA

    h-NSyjjQ

  • .

    DeepPep: Deep proteome inference from peptide profiles

    Minseung Kim, Ameen Eetemadi, and Ilias Tagkopoulos.PLOS Computational Biology 13.9 (2017), pp. 1–17.https://doi.org/10.1371/journal.pcbi.1005661.

    Abstract: Protein inference, the identification of the protein set that is the origin of a given pep-tide profile, is a fundamental challenge in proteomics. We present DeepPep, a deep-convolutionalneural network framework that predicts the protein set from a proteomics mixture, given thesequence universe of possible proteins and a target peptide profile. In its core, DeepPep quanti-fies the change in probabilistic score of peptide-spectrum matches in the presence or absence ofa specific protein, hence selecting as candidate proteins with the largest impact to the peptideprofile. Application of the method across datasets argues for its competitive predictive ability(AUC of 0.80±0.18, AUPR of 0.84±0.28) in inferring proteins without need of peptide de-tectability on which the most competitive methods rely. We find that the convolutional neuralnetwork architecture outperforms the traditional artificial neural network architectures withoutconvolution layers in protein inference. We expect that similar deep learning architectures thatallow learning nonlinear patterns can be further extended to problems in metagenome profilingand cell type inference. The source code of DeepPep and the benchmark datasets used in thisstudy are available at https://deeppep.github.io/DeepPep/.