Simone MARINI, PhD

(.) Research Investigator, Li Lab, University of Michigan (since Aug 2017)

(.) Scientific Advisor, enGenome (since 2016)

smarini (_at) med (_dot_) umich (dot_) edu

Linkedin, Twitter


Postdoc Fellow, Laboratory for Biomedical Informatics, University of Pavia (2016 - 2017)

Postdoc Fellow, Akutsu Laboratory, University of Kyoto, Japan (2015 - 2016)

Postdoc Fellow, Laboratory for Biomedical Informatics, University of Pavia (2013 - 2015)

Last update: Aug 2017


Who I am

I apply Machine Learning to Bioinformatics. I work with a wide variety of data, such as electronic health records, genomic variants, ontologies, and protein sequences, and with techniques such as support vector machines, random forests, Bayesian networks, and data fusion. My main research interest is designing Machine Learning models for the joint integration of heterogeneous data.

My research projects span Italy, China, Japan, and the USA, involving people working for

I am (proudly) from Voghera, Italy. I have lived in Pavia (Italy), Madrid (Spain), Hong Kong (PRC), Beijing (PRC), and Kyoto (Japan). I currently live in Ann Arbor, USA.


Protein cleavage target prediction

            Technique     Joint matrix factorization
            Technology    Octave, MATLAB
            Data          KEGG, MEROPS, Domine, 3did, Negatome, BioGRID, Interpro, STRING

NGS epilepsy multiaxial association study

            Technique     Random Forest, Burden Methods
            Technology    Perl, Weka
            Data          NGS data, KEGG, Interpro, BioGRID

Cohort simulation of Type 1 and 2 diabetes

            Technique     Dynamic Bayesian Networks, Continuous Time Bayesian Networks
            Technology    MATLAB, R
            Data          EDIC, DCCT, Electronic Health Records

Genomic variant deleteriousness prediction

            Technique     Ensemble Learning, Cost-sensitive Learning
            Technology    Perl, Weka, AJAX, Glassfish
            Data          NGS, HGMD, 1TGP, NHLBI GO Exome Sequencing Project

SNP selection and effects of sample mislabeling on Machine Learning

            Technique     Markov Chain Monte Carlo, Machine Learning
            Technology    Weka, MATLAB
            Data          Genotyping

DNA-, RNA- and protein-protein interaction (or affinity) prediction

            Technique     Ensemble Learning, Support Vector Machines
            Technology    Weka, Perl
            Data          Dscam1, Protein-interactions

- - - - - - - - - - - - - - - - -


2015-2016    Japan Society for the Promotion of Science Postdoctoral Fellowship.

2015         Outstanding contribution in reviewing, Journal of Biomedical Informatics (Elsevier).

2011         Bioengineering Division Graduate Student Research Award, 1st ranked.

2010         HKUST Overseas Research Award for PhD Students.

- - - - - - - - - - - - - - - - -


2017    July 18. miRNA Bioinformatics, sequence analysis and statistical processes. Training school "Omics technologies and bioinformatics application in ME/CFS research", University of Pavia, Pavia, Italy. EU COST Action CA15111 (European Network on Myalgic Encephalomyelitis/Chronic Fatigue Syndrome, EUROMENE).

        January 12. Investigating epileptogenesis with data fusion. University of Michigan, Ann Arbor, USA.

2016    September 8. Mining heterogeneous data sources to enhance association studies. University of Arizona, Tucson, USA.

        June 10. Leveraging on public databases for novel peptidase target discovery. Electrical, University of Pavia, Pavia, Italy.

2011    May 13. Motif search, sequence alignment and Support Vector Regression for Dscam protein self- and hetero-binding affinity prediction. Institute of Biophysics, Chinese Academy of Sciences, Beijing, China.

- - - - - - - - - - - - - - - - -


Kyoto University, Japan.

Supervision of summer internships (2016).

University of Pavia, Italy.

Medical Informatics (2013-2015), Instructor of record, undergraduate.

Automatic Learning in Medicine (2013-2015), Instructor of record, postgraduate. 

Co-supervision of five MSc and one BSc dissertations (2013-2015; 2017-present).

Supervision of summer internships (2014).

The Hong Kong University of Science and Technology, China.

Introduction to Bioengineering (2010), Teaching assistant, postgraduate.

- - - - - - - - - - - - - - - - -


Journal Reviewer       Journal of Biomedical Informatics (since 2014)

                       Briefings in Bioinformatics (since 2015)

                       Computers in Biology and Medicine (since 2016)


Conference Reviewer    Artificial Intelligence in Medicine, AIME (since 2016)

                       American Medical Informatics Association Joint Summits on Translational Science (since 2016)

                       IEEE International Conference on Healthcare Informatics, ICHI (since 2017)

My publons profile.

- - - - - - - - - - - - - - - - -

LANGUAGES    (Reading)         (Speaking)

Italian      Native speaker    Native speaker

English      Fluent            Fluent

Spanish      Fluent            Fluent

Chinese      -                 Survival

- - - - - - - - - - - - - - - - -


2014           Software developer, DCPUK, Bangladesh. VSO Poverty Alleviation, remote services. Developed software to help manage dairy cooperatives.

2006 – 2008    Front desk volunteer, City social services of Pavia, Italy. Helped immigrants interact with local bureaucracy.

- - - - - - - - - - - - - - - - 



2017    Exploring Wound-Healing Genomic Machinery with a Network-Based Approach

        Vitali F, Marini S§, Balli M, Grosemans H, Sampaolesi M, Lussier YA, Cusella De Angelis MG, Bellazzi R. Pharmaceuticals 2017, 10:2

        Dscam1 Web Server: online prediction of Dscam1 self- and hetero-affinity

        Marini S*§, Nazzicari N*, Biscarini F, Wang GZ. Bioinformatics 2017, 33:12

        Machine learning methods to predict Diabetes complications

        Dagliati A, Marini S, Sacchi L, Cogni G, Teliti M, Decata P, Chiovato L, Bellazzi R. Journal of Diabetes Science and Technology 2017, 1932296817706375

2016    A data fusion approach to enhance association study in epilepsy

        Marini S§, Limongelli I, Rizzo E, Errichiello E, Vetro A, Tan D, Zuffardi O, Bellazzi R. PLoS One 2016, 11:12

        "Noisy beets": impact of phenotyping errors on genomic predictions for binary traits in Beta vulgaris

        Biscarini F, Nazzicari N, Broccanello C, Stevanato P, Marini S. Plant Methods 2016, 12:36

2015    A Dynamic Bayesian Network model for long-term simulation of clinical complications in type 1 diabetes

        Marini S, Trifoglio E, Barbarini N, Sambo F, Di Camillo B, Malovini A, Manfrini M, Cobelli C, Bellazzi R. Journal of Biomedical Informatics 2015, 57

        PaPI: pseudo amino acid composition to score human coding variants

        Limongelli I, Marini S, Bellazzi R. BMC Bioinformatics 2015, 16:123

        Developing a parsimonius predictor for binary traits in sugar beet (Beta vulgaris)

        Biscarini F, Marini S, Stevanato P, Broccanello C, Bellazzi R, Nazzicari N. Molecular Breeding 2015, 35:10

2014    Improvement of Dscam homophilic binding affinity throughout Drosophila evolution

        Marini S*, Wang GZ*, Ma X, Yang Q, Zhang X, Zhu Y. BMC Evolutionary Biology 2014, 14:186 (*) equally contributed

2013    The role of SwrA, DegU and P(D3) in fla/che expression in B. subtilis

        Mordini S, Osera C, Marini S, Scavone F, Bellazzi R, Galizzi A, Calvio C. PLoS One 2013, 8:12:e85065

2011    In silico Protein-Protein Interaction prediction with sequence alignment and classifier stacking

        Marini S, Xu Q, Yang Q. Curr Protein Pept Sci. 2011, 12:7



2016    Learning T2D evolving complexity from EMR and administrative data using Continuous Time Bayesian Networks

        Marini S, Dagliati A, Sacchi L, Bellazzi R. 9th International Joint Conference on Biomedical Engineering Systems and Technologies, HEALTHINF 2016

2015    A genomic data fusion framework to exploit rare and common variants for association discovery

        Marini S, Limongelli I, Rizzo E, Da T, Bellazzi R. 15th Conference of Artificial Intelligence in Medicine 2015

        Matrix tri-factorization for miRNA-gene association discovery in acute myeloid leukemia

        De Martini A, Marini S, Vitali F, Bellazzi R. 15th Conference of Artificial Intelligence in Medicine [Workshop] 2015



2016    Data Fusion for cleavage target prediction

        Marini S, Demartini A, Vitali F, Bellazzi R, Akutsu T. Bioinformatics Italian Society National Congress

2015    A continuous time, multivariate model to simulate Type 2 Diabetes patients trajectories

        Marini S, Dagliati A, Bellazzi R. American Medical Informatics Association joint Summits on Translational Science 2015

        Predicting Microvascular Complications from Type 2 Diabetes Retrospective Data

        Sacchi L, Colombo C, Dagliati D, Marini S, Cerra C, Chiovato L, Bellazzi R. 15th Annual Diabetes Technology Meeting

2014    A multivariate data-driven model to investigate the arising of complications in T2D patients

        Marini S, Malavolti M, Dagliati A, Bellazzi R. 14th Annual Diabetes Technology Meeting

        PaPI: the Pseudo Amino acid variant Predictor

        Marini S, Limongelli I, Bellazzi R. Bioinformatics Italian Society National Congress

        A novel algorithm to predict the deleteriousness of genomic coding variants

        Limongelli I, Marini S, Bellazzi R. NGS (ISCB)

        Dynamic Bayesian Networks to simulate type I diabetes patients cohorts

        Barbarini N, Bellazzi R, Cobelli C, Di Camillo B, Manfrini F, Malovini A, Marini S, Sambo F, Trifoglio E. Economics, Modelling and Diabetes: Mount Hood Challenge

        PaPI: using pseudo amino acid composition to predict deleterious coding variants

        Limongelli I, Marini S, Bellazzi R. Italian Bioengineering Group National Congress


2017    Precision oncology: a data similarity challenge

        Zambelli A, Demartini A, Pala D, Vitali F, Marini S, Bellazzi R. In: E-Health e Medicina Digitale, S. Quaglini, M. Cesarelli, M. Giacomini, F. Pinciroli eds, Patron ed.

[*] denotes equal contribution.

[§] denotes corresponding author.

 - - - - - - - - - - - - - - - - -


In my spare time I enjoy (1) traveling; (2) playing nerdy pen-and-paper role-playing games; and (3) learning (or trying to learn) languages, history, and philosophy.

- - - - - - - - - - - - - - - - -


I build prediction models and simulations using a range of Machine Learning techniques. I work with a wide variety of data in both Health Informatics and Bioinformatics, exploiting the hidden relations among heterogeneous data sources.