Nicholson DN, Himmelstein DS, Greene CS. Reusing label functions to extract multiple types of biomedical relationships from biomedical abstracts at scale. bioRxiv. [Manubot Manuscript, GitHub Source]

Clay ME, Hammond JH, Zhong F, Chen X, Kowalski CH, Lee AJ, Porter MS, Greene CS, Pletneva EV, Hogan DA. Pseudomonas aeruginosa lasR mutant fitness in microoxia is supported by an Anr-regulated oxygen-binding hemerythrin. bioRxiv. 10.1101/802934

Way GP, Zietz M, Himmelstein DS, Greene CS. Sequential compression across latent space dimensions enhances gene expression signatures. bioRxiv. 10.1101/573782 [GitHub]

Rokita JL, et al. Genomic landscape of 261 childhood cancer patient-derived xenograft models. bioRxiv. 10.1101/566455 [intermediate data, classifier, processed data, source code for: PDX mouse subtraction, ethnicity inference, oncoprint generation, gene classification, RNA clustering, RNA fusion, copy number / structural variant analysis, correlation analysis, mutational signatures]

Hu Q, Greene CS, Heller E. Specific histone modifications associate with alternative exon selection during mammalian development. bioRxiv. 10.1101/361816 [GitHub]

Taroni JN, Greene CS. Cross-platform normalization enables machine learning model training on microarray and RNA-seq data simultaneously. bioRxiv. doi:10.1101/118349. [GitHub]

Zelaya RA, Wong AK, Frase AT, Ritchie MD, Greene CS. Tribe: The collaborative platform for reproducible web-based analysis of gene sets. bioRxiv. doi:10.1101/055913 [Bitbucket: Tribe Server; Tribe Client]

Recent Publications

Anikeeva P, Boyden E, et al. Voices in method development. Nature Methods.

Liu Y, Huang J, Urbanowicz RJ, Chen K, Manduchi E, Greene CS, Moore JH, Scheet P, Chen Y. Embracing study heterogeneity for finding genetic interactions in large‐scale research consortia. Genetic Epidemiology. [YETI2 Software]

Zhou N, Jiang Y, Bergquist TR, Lee AJ, Kacsoh BZ, Crocker AW, Lewis KA, Georghiou G, Nguyen HN, Hamid MN, Davis L, The Critical Assessment of Function Annotation, Rost B, Brenner SE, Orengo CA, Jeffery CJ, Bosco GD, Hogan DA, Martin MJ, O’Donovan C, Mooney SD, Greene CS, Radivojac P, Friedberg I. The CAFA challenge reports improved protein function prediction and new functional annotations for hundreds of genes through experimental screens. Genome Biology. [Preprint]

Yu-Hsiu TL, Way GP, Barwick BG, Mariano MC, Marcoulis M, Ferguson ID, Driessen C, Boise LH, Greene CS, Wiita AP. Integrated Phosphoproteomics and Transcriptional Classifiers Reveal Hidden RAS Signaling Dynamics in Multiple Myeloma. Blood Advances. [classifier code, preprint]

Way GP, Greene CS. Discovering Pathway and Cell Type Signatures in Transcriptomic Compendia with Machine Learning. Annual Review of Biomedical Data Science.

Beaulieu-Jones BK, Wu ZS, Williams C, Lee R, Bhavnani SP, Byrd JB, Greene CS. Privacy-preserving generative deep neural networks support clinical data sharing. Circulation: Cardiovascular Quality and Outcomes. [GitHub source code, preprint]

Himmelstein DS, Rubinetti V, Slochower DR, Hu D, Malladi VS, Greene CS, Gitter A. Open collaborative writing with Manubot. PLOS Computational Biology. [Manubot, Manubot-formatted Document, Source Repository]

Taylor DM, Aronow BJ, Tan K, Bernt K, Salomonis N, Greene CS, Frolova A., …, White PC. The Pediatric Cell Atlas: defining the growth phase of human development at single-cell resolution. Developmental Cell.

Teichmann S, Kim J, Zhuang X, Zeng H, Boeke J, Ramakrishnan V, Greene CS. Technologies to Watch in 2019. Nature.

Taroni JN, Grayson PC, Hu Q, Eddy S, Kretzler M, Merkel PA, Greene CS. MultiPLIER: a transfer learning framework for transcriptomics reveals systemic features of rare disease. Cell Systems. [GitHub]

Greene CS. Show me the models. Nature Biotech.

Hu Q, Greene CS. Parameter tuning is a key part of dimensionality reduction via deep variational autoencoders for single cell RNA transcriptomics. Pac Symp Biocomput. [GitHub]


Way GP, Greene CS. Bayesian deep learning for single-cell analysis. Nature Methods.

Park Y, Greene CS. Parasite’s perspective on data sharing. GigaScience.

Kacsoh B, Barton S, Jiang Y, Zhou N, Mooney S, Friedberg I, Radivojac P, Greene CS, Bosco G. New Drosophila long-term memory genes revealed by assessing computational function prediction methods. G3: Genes | Genomes | Genetics. [Preprint]

Crowell AM, Greene CS, Loros JJ, Dunlap JC. Learning and Imputation for Mass-spec Bias Reduction (LIMBR). Bioinformatics. [GitHub, Preprint]

Stein-O'Brien GL, Arora R, Culhane AC, Favorov A, Greene CS, Goff LA, Li Y, Ngom A, Ochs MF, Xu Y, Fertig EJ. Enter the matrix: interpreting unsupervised feature learning with matrix decomposition to discover hidden knowledge in high-throughput omics data. Trends in Genetics. [Preprint]

Chen KM, Tan J, Way GP, Doing G, Hogan DA, Greene CS. PathCORE-T: identifying and visualizing globally co-occurring pathways in large transcriptomic compendia. BioData Mining. [GitHub to reproduce results, GitHub PathCORE Software, GitHub PathCORE demo server, GitHub crosstalk correction package, Preprint]

Grayson PC, Eddy S, Taroni JN, Lightfoot YL, Mariani L, Parikh H, Lindenmeyer MT, Ju W, Greene CS, Godfrey B, Cohen CD, Krischer J, Kretzler M, Merkel PA; Vasculitis Clinical Research Consortium, the European Renal cDNA Bank cohort, and the Nephrotic Syndrome Study Network. Metabolic pathways and immunometabolism in rare kidney diseases. Ann Rheum Dis. 2018. Epub ahead of print.

Ching T, Himmelstein DS, Beaulieu-Jones BK, Kalinin AA, Do BT, Way GP, Ferro E, Agapow P, Zietz M, Hoffman MM, Xie W, Rosen GL, Lengerich BJ, Israeli J, Lanchantin J, Woloszynek S, Carpenter AE, Shrikumar A, Xu J, Cofer EM, Lavender CA, Turaga SC, Alexandari AM, Lu Z, Harris DJ, DeCaprio D, Qi Y, Kundaje A, Peng Y, Wiley LK, Segler MHS, Boca SM, Swamidass SJ, Gitter A+, Greene CS+. Opportunities and Obstacles for Deep Learning in Biology and Medicine. J R Soc Interface. 15:20170387. [Preprint, GitHub]

Sanchez-Vega F, Mina M, Armenia J, Chatila WK, Luna A, La KC, Dimitriadoy S, Liu DL, Kantheti HS, Saghafinia S, Chakravarty D, Daian F, Gao Q, Bailey MH, Liang W, Foltz SM, Shmulevich I, Ding L, Heins Z, Ochoa A, Gross B, Gao J, Zhang H, Kundra R, Kandoth C, Bahceci I, Devershi L, Dogrusoz U, Zhou W, Shen H, Laird PW, Way GP, Greene CS, Liang H, Xiao Y, Wang C, Iavarone A, Berger AH, Bivona TG, Lazar AJ, Hammer GD, Giordano T, Kwong LN, McArthur G, Huang C, Tward AD, Frederick MJ, McCormick F, Meyerson M, Cancer Genome Analysis Research Network, Van Allen EM, Cherniack AD, Ciriello G, Sander C, Schultz N. Oncogenic Signaling Pathways in The Cancer Genome Atlas. Cell. 2018. 173(2):321-337.e10.

Knijnenburg TA, Wang L, Zimmermann MT, Chambwe N, Gao GF, Cherniack AD, Fan H, Shen H, Way GP, Greene CS, Liu Y, Akbani R, Feng B, Donehower LA, Miller C, Shen Y, Karimi M, Chen H, Kim P, Jia P, Shinbrot E, Zhang S, Liu J, Hu H, Bailey MH, Yau C, Wolf D, Zhao Z, Weinstein JN, Li L, Ding L, Mills GB, Laird PW, Wheeler DA, Shmulevich I; Cancer Genome Atlas Research Network, Monnat RJ Jr., Xiao Y, Wang C. Genomic and Molecular Landscape of DNA Damage Repair Deficiency across The Cancer Genome Atlas. Cell Rep. 2018 Apr 3;23(1):239-254.e6. PMID: 29617664 [GitHub for ML portion]

Way GP, Sanchez-Vega F, La K, Armenia J, Chatila WK, Luna A, Sander C, Cherniack AD, Mina M, Ciriello G, Schultz N; Cancer Genome Atlas Research Network, Sanchez Y, Greene CS. Machine Learning Detects Pan-cancer Ras Pathway Activation in The Cancer Genome Atlas. Cell Rep. 2018 Apr 3;23(1):172-180.e3. PMID: 29617658 [GitHub]

Weissman GE, Hubbard RA, Ungar LH, Harhay MO, Greene CS, Himes BE, Halpern SD. Inclusion of Unstructured Clinical Text Improves Early Prediction of Death or Prolonged ICU Stay. Crit Care Med.

Himmelstein DS, Romero AR, McLaughlin SR, Greshake B, Greene CS. Sci-Hub provides access to nearly all scholarly literature. eLife. [GitHub for Analyses, GitHub for Manuscript, Current build of manuscript, Preprint]

Dahlstrom KM, Collins AJ, Doing G, Taroni JN, Gauvin TJ, Greene CS, Hogan DA, O'Toole GA. A Multimodal Strategy Used By A Large c-di-GMP Network. J Bacteriol. [Preprint]

Gonzalez-Hernandez G, Sarker A, O'Connor K, Greene C, Liu H. Advances in Text Mining and Visualization for Precision Medicine. Pac Symp Biocomput. 2018;23:559-565.

Way GP, Greene CS. Extracting a Biologically Relevant Latent Space from Cancer Transcriptomes with Variational Autoencoders. Pac Symp Biocomput. 2018;23:80-91. [GitHub, Preprint]

Harrington LX, Way GP, Doherty JA, Greene CS. Functional network community detection can disaggregate and filter multiple underlying pathways in enrichment analyses. Pac Symp Biocomput. 2018;23:157-167. [GitHub source code, Preprint]


Doherty JA, Peres LC, Wang C, Way GP, Greene CS, Schildkraut JM. Challenges and Opportunities in Studying the Epidemiology of Ovarian Cancer Subtypes. Curr Epidemiol Rep. 2017 Sep;4(3):211-220.

Skarke C, Lahens NF, Rhoades SD, Campbell A, Bittinger K, Bailey A, Hoffmann C, Olson RS, Chen L, Yang G, Price TS, Moore JH, Bushman FD, Greene CS, Grant GR, Weljie AM, FitzGerald GA. A pilot characterization of the human chronobiome. Scientific Reports.

Tan J, Huyck M, Hu D, Zelaya RA, Hogan DA, Greene CS. ADAGE signature analysis: differential expression analysis with data-defined gene sets. BMC Bioinformatics. [Preprint][GitHub R Package][GitHub Webserver][Running Webserver]

Kacsoh BZ, Greene CS, Bosco G. Machine learning analysis identifies Drosophila Grunge/Atrophin as an important learning and memory gene required for memory retention and social learning. G3: Genes | Genomes | Genetics. [Preprint]

Way GP, Youngstrom DW, Hankenson KD, Greene CS*, Grant SFA*. Implicating candidate genes at GWAS signals by leveraging topologically associating domains. European Journal of Human Genetics. [preprint, GitHub - Zenodo, Docker - Zenodo]

Byrd JB, Greene CS. Data-Sharing Models. NEJM. 376:2305-2306.

Tan J, Doing G, Lewis KA, Price CE, Chen KM, Cady KC, Perchuk B, Laub MT, Hogan DA, Greene CS. Unsupervised extraction of stable expression signatures from public compendia with an ensemble of neural networks. Cell Systems. [preprint][Bitbucket]

Yao X, Yan J, Liu K, Kim S, Nho K, Risacher SL, Greene CS, Moore JH, Saykin AJ, Shen L. Tissue-specific network-based genome wide study of amygdala imaging phenotypes to identify functional interaction modules. Bioinformatics. In Press.

Greene CS, Garmire LX, Gilbert JA, Ritchie MD, Hunter LE. Celebrating Parasites. Nature Genetics. 2017;49:483-484.

Taroni JN, Greene CS, Martyanov V, Wood TA, Christmann R, Farber HW, Lafyatis RA, Denton CP, Hinchcliff ME, Pioli PA, Mahoney JM, Whitfield ML. A novel multi-network approach reveals tissue-specific cellular modulators of fibrosis in systemic sclerosis. Genome Medicine. 2017;9:27. [preprint]

Beaulieu-Jones BK, Greene CS. Reproducibility of computational workflows is automated using continuous analysis. Nature Biotechnology. [preprint][Github]

Greene CS. Tell me your neighbors, and I will tell you who you are. Science Translational Medicine. 8:370ec203. [Editor's Choice Summary of Khurana et al.]


Way GP, Allaway RA, Bouley SJ, Fadul CE, Sanchez Y, Greene CS. A machine learning classifier trained on cancer transcriptomes detects NF1 inactivation signal in glioblastoma. BMC Genomics. 2016;18:127. [GitHub, Docker, Preprint]

Greene CS. Cheap-Seq. Science Translational Medicine. 8:370ec203. [Editor's Choice Summary of Tatlow et al.]

Moore JH, Jennings SF, Greene CS, Hunter LE, Perkins AD, Williams-Devane C, Wunsch DC, Zhao Z, Huang X. No-boundary Thinking in Bioinformatics. Pac Symp Biocomput. 2016;22:646-648.

Greene CS. How to know what we don't. Science Translational Medicine. 8:364ec179. [Editor's Choice Summary of Zhou et al.]

Beaulieu-Jones BK, Greene CS. Semi-Supervised Learning of the Electronic Health Record for Phenotype Stratification. Journal of Biomedical Informatics. 2016;64:168-178. [preprint][Github - results built via continuous analysis]

Greene CS. A stromal focus reveals tumor immune signatures. Science Translational Medicine. 2016;8:358ec155. [Editor's Choice Summary]

Way GP, Rudd J, Wang C, Hamidi H, Fridley BL, Konecny G, Goode EL, Greene CS*, Doherty JA*. Comprehensive cross-population analysis of high-grade serous ovarian cancer supports no more than three subtypes. G3: Genes | Genomes | Genetics. 6(12) 4097-4103. [preprint][source & data][Docker image]

Krishnan A, Taroni JN, Greene CS. Integrative networks illuminate biological factors underlying gene-disease associations. Curr Genet Med Rep. 2016. [preprint]

Greene CS. Gut Check. Science Translational Medicine. 8:352ec131. [Editor's Choice Summary]

Greene CS. The future is unsupervised. Science Translational Medicine. 8:346ec108. [Editor's Choice Summary]

Jiang Y, Oron TR, Clark WT, Bankapur A, ... [>100 authors]. An expanded evaluation of protein function prediction methods shows an improvement in accuracy. Genome Biology. 2016. 17:184. [preprint]

Greene CS. Nothing but a hound dog. Science Translational Medicine. 8:340ec83. [Editor's Choice Summary]

Greene CS, Voight BF. Pathway and network-based strategies to translate genetic discoveries into effective therapies. Human Molecular Genetics. doi:10.1093/hmg/ddw160. [preprint]

Greene CS. CoINcIDE: All together now. Science Translational Medicine. 8:334ec61. [Editor's Choice Summary]

Greene CS, Himmelstein DS. Genetic Association Guided Analysis of Gene Networks for the Study of Complex Traits. Circulation: Cardiovascular Genetics. 2016 9:179-184. [source & data][free full text]

Song A, Yan J, Kim S, Risacher SL, Wong AK, Saykin AD, Shen L, Greene CS. Network-based analysis of genetic variants associated with hippocampal volume in Alzheimer's disease: a study of ADNI cohorts. BioData Mining. 2016 9:3. doi:10.1186/s13040-016-0082-8. [source & data]

Tan J, Hammond JH, Hogan DA, Greene CS. ADAGE-based integration of publicly available Pseudomonas aeruginosa gene expression data with denoising autoencoders illuminates microbe-host interactions. mSystems. 2016 1(1):e00025-15. doi: 10.1128/mSystems.00025-15. [preprint][source & data]

Thompson JA, Tan J, Greene CS. Cross-platform normalization of microarray and RNA-seq data for machine learning applications. PeerJ. 2016 4:e1621. [preprint][R package][source & data]

Allaway RJ*, Fischer DA*, de Abreu FB, Gardner TB, Gordon SR, Barth RJ, Colacchio TA, Wood M, Kacsoh BZ, Bouley SJ, Cui J, Hamilton J, Choi JA, Lange JT, Peterson JD, Padmanabhan V, Tomlinson CR, Tsongalis GJ, Suriawinata A, Greene CS*, Sanchez Y*, Smith KD. Genomic characterization of patient-derived xenograft models established from fine needle aspirate biopsies of a primary pancreatic ductal adenocarcinoma and from patient-matched metastatic sites. Oncotarget. Feb 25. doi: 10.18632/oncotarget.7718. [Epub ahead of print]. [source & data]

Greene CS, Foster JA, Stanton BA, Hogan DA, Bromberg Y. Computational approaches to study microbes and microbiomes. Pac Symp Biocomput. 2016 21:557-67. [pdf]

2015 and Older

Rudd J, Zelaya RA, Demidenko E, Goode EL, Greene CS, Doherty JA. Leveraging global gene expression patterns to predict expression of unmeasured genes. BMC Genomics. 2015 16:1065. doi:10.1186/s12864-015-2250-5. [source & data]

Qian DC, Byun J, Han Y, Greene CS, Field JK, Hung RJ, Brhane Y, Mclaughlin JR, Fehringer G, Landi MT, Rosenberger A, Bickeböller H, Malhotra J, Risch A, Heinrich J, Hunter DJ, Henderson BE, Haiman CA, Schumacher FR, Eeles RA, Easton DF, Seminara D, Amos CI. Identification of shared and unique susceptibility pathways among cancers of the lung, breast, and prostate from genome-wide association studies and tissue-specific protein interactions. Hum Mol Genet. 2015 Dec 20;24(25):7406-20. doi: 10.1093/hmg/ddv440.

Gonzalez GH, Tahsin T, Goodale BC, Greene AC, Greene CS. Recent Advances and Emerging Applications in Text and Data Mining for Biomedical Discovery. Brief Bioinform. 2015 Sep 29. pii: bbv087. [Epub ahead of print]

Cordell HJ, Han Y, Mells GF, Li Y, Hirschfield GM, Greene CS et al. International genome-wide meta-analysis identifies new primary biliary cirrhosis risk loci and targetable pathogenic pathways. Nat Commun. 2015; PMID:26394269

Gui J, Greene CS, Sullivan C, Taylor W, Moore JH, Kim C. Testing multiple hypotheses through IMP weighted FDR based on a genetic functional network with application to a new zebrafish transcriptome study. BioData Min. 2015; PMID:26097506

Greene CS, Krishnan A, Wong AK, Riccioti E, Zelaya RA, Himmelstein DS, Zhang R, Hartmann BM, Zaslavsky E, Sealfon SC, Chasman DI, FitzGerald GA, Dolinski K, Grosser T, Troyanskaya OG. Understanding multicellular function and disease with human tissue-specific networks. Nat Genetics. 2015; PMID:25915600 [processed networks][raw networks][webserver]

Zhu Q, Wong AK, Krishnan A, Aure MR, Tadych A, Zhang R, Corney DC, Greene CS, Bongo LA, Kristensen VN, Charikar M, Li K, Troyanskaya OG. Targeted exploration and analysis of large cross-platform human transcriptomic compendia. Nat Methods. 2015; PMID:25581801 [webserver]

Greene AC, Giffin KA, Greene CS, Moore JH. Adapting bioinformatics curricula for big data. Brief Bioinform. 2015; PMID:25829469

Tan J, Ung M, Cheng C, Greene CS. Unsupervised feature construction and knowledge extraction from genome-wide assays of breast cancer with denoising autoencoders. Pac Symp Biocomput. 2015; 20:132-43. PMID:25592575 [data & results]

Mahoney JM, Taroni J, Martyanov V, Wood TA, Greene CS, Pioli PA, Hinchcliff ME, Whitfield ML. Systems level analysis of systemic sclerosis shows a network of immune and profibrotic pathways connected with genetic polymorphisms. PLoS Comput Biol. 2015; 11(1):e1004005. PMID:25569146

Zieselman AL, Fisher JM, Hu T, Andrews PC, Greene CS, Shen L, Saykin AJ, Moore JH. Computational genetics analysis of grey matter density in Alzheimer's disease. BioData Min. 2014; 7:17. PMID: 25165488

Penrod NM, Greene CS, Moore JH. Predicting targeted drug combinations based on Pareto optimal patterns of coexpression network connectivity. Genome Med 2014; 6(4):33. PMID: 24944582

Greene CS, Tan J, Ung M, Moore JH, Cheng C. Big Data Bioinformatics. J Cell Physiol 2014 May 6. PMID: 24799088

Tan J, Grant GD, Whitfield ML, Greene CS. Time-Point specific weighting improves coexpression networks from time-course experiments. Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics (EvoBIO) 2013. 11-22.

Ju W, Greene CS, Eichinger F, Nair V, Hodgin JB, Bitzer M, Lee YS, Zhu Q, Kehata M, Li M, Jiang S, Rastaldi MP, Cohen CD, Troyanskaya OG, Kretzler M. Defining cell-type specificity at the transcriptional level in human disease. Genome Res 2013 Nov; 23(11):1862-73. PMID: 23950145 [Highlighted by Nature Reviews Genetics]

Park CY, Wong AK, Greene CS, Rowland J, Guan Y, Bongo LA, Burdine RD, Troyanskaya OG. Functional knowledge transfer for high-accuracy prediction of under-studied biological processes. PLoS Comput Biol 2013; 9(3):e1002957. PMID: 23516347

Greene CS, Troyanskaya OG. Chapter 2: Data-driven view of disease biology. PLoS Comput Biol 2012; 8(12):e1002816. PMID: 23300408

Wong AK, Park CY, Greene CS, Bongo LA, Guan Y, Troyanskaya OG. IMP: a multi-species functional genomics portal for integration, visualization and prediction of protein functions and networks. Nucleic Acids Res 2012 Jul; 40(Web Server issue):W484-90. PMID: 22684505

Greene CS, Troyanskaya OG. Accurate evaluation and analysis of functional genomics data and methods. Ann N Y Acad Sci 2012 Jul; 1260:95-100. PMID: 22268703

See more publications at PubMed or Google Scholar