Golan Yona

Department of Electrical Engineering &
Department of Structural Biology
Stanford University


Publications

Books

  • Introduction to Computational Proteomics
    Golan Yona
    Chapman & Hall/CRC Press
    Release date: Dec 9, 2010
    Book's webpage
  • Biological Data Integration
    Golan Yona
    Cambridge University Press
    TBA: 2011

Book chapters

  • Golan Yona, Shafquat Rahman and William Dirks. (2009). Comparing algorithms for clustering of expression data - how to assess gene clusters. In Computational Systems Biology. Humana Press.
  • Umar Syed and Golan Yona. (2009). Enzyme Function Prediction With Interpretable Models. In Computational Systems Biology. Humana Press.
  • Itai Sharon, Jason Davis and Golan Yona. (2009). Prediction of protein-protein interactions - a study of the co-evolution model. In Computational Systems Biology. Humana Press.
  • Helgi Ingolfsson and Golan Yona. (2007). Protein domain prediction. In Structural Proteomics - High-throughput Methods. Humana Press.
  • Golan Yona and Steven Brenner. (2000). Comparison of protein sequences and practical database searching. In Bioinformatics: Sequence, structure, and databanks. Oxford University Press. chapter

Tutorials

  • Golan Yona. (2002). Protein classification and meta-organization: methods for global organization of the protein universe. In The 10th International Conference on Intelligent Systems for Molecular Biology, Edmonton, Canada. (PowerPoint file)

Reviewed papers

  1. Liviu Popescu and Golan Yona. (2006). An Expectation-Maximization algorithm for simultaneous prediction of multiple cellular pathways. In the proceedings of CSB 2006.
  2. Golan Yona, William Dirks, Shafquat Rahman. (2006). Effective similarity measures for expression profiles. Bioinformatics 22, 1616-1622 paper (pdf)
  3. Aaron Birkland and Golan Yona. (2006). Biozon: a system for unification, management and analysis of heterogeneous biological data. BMC Bioinformatics 7, 70. paper (pdf)
  4. Paul Shafer, Timothy Isganitis and Golan Yona. (2006). Hubs of Knowledge: using the functional link structure in Biozon to mine for biologically significant entities. BMC Bioinformatics 7, 71. paper (pdf)
  5. Paul Shafer, David Lin and Golan Yona. (2006). Mapping EST sequences to proteins. BMC Genomics 7, 41. paper (pdf)
  6. Aaron Birkland and Golan Yona. (2006). The BIOZON Database: a Hub of Heterogeneous Biological Data. Nucleic Acids Research 34 D235-D242. paper (pdf)
  7. Chin-Jen Ku and Golan Yona. (2005). The distance-profile representation and its application to detection of distantly related protein families. BMC Bioinformatics 6, 282. paper (pdf)
  8. Liviu Popescu and Golan Yona. (2005). Automation of gene assignments to metabolic pathways using high-throughput expression data. BMC Bioinformatics 6, 217. paper (pdf)
  9. Itai Sharon, Kuan Chang, Aaron Birkland, Ran El-Yaniv and Golan Yona. (2005). Correcting BLAST and PSI-BLAST e-values for low-complexity sequences. Journal of Computational Biology 12 980-1003. paper (pdf)
  10. Golan Yona and Klara Kedem. (2005). The URMS-RMS hybrid algorithm for fast and sensitive local protein structure alignment. Journal of Computational Biology 12 12-32. paper (pdf)
  11. Ron Begleiter, Ran El-Yaniv and Golan Yona. (2004). On Prediction Using Variable Order Markov Models. Journal of Artificial Intelligence Research 22 385-421. paper (pdf)
  12. Richard Chung and Golan Yona. (2004). Protein family comparison using statistical models and predicted structural information. BMC Bioinformatics 5 183-200. paper
  13. Niranjan Nagarajan and Golan Yona. (2004). Automatic prediction of protein domains from sequence information using a hybrid learning system. Bioinformatics 20, 1335-1360. paper
  14. Michael Quist and Golan Yona. (2004). Distributional scaling: an algorithm for structure-preserving embedding of metric and nonmetric spaces. Journal of Machine Learning Research 5, 399-420. paper
  15. Niranjan Nagarajan and Golan Yona. (2003). A multi-expert system for the automatic detection of protein domains from sequence information. In the proceedings of RECOMB 2003 289-300. paper
  16. Umar Syed and Golan Yona. (2003). Using a mixture of probabilistic decision trees for direct prediction of protein function. In the proceedings of RECOMB 2003 224-234. paper
  17. Shlomo Dubnov, Ran El-Yaniv, Yoram Gdalyahu, Elad Schneidman, Naftali Tishby, Golan Yona. (2002). A new nonparametric pairwise clustering algorithm based on iterative estimation of distance profiles. Machine Learning 47 35-61.
  18. Golan Yona and Michael Levitt. (2002). Within the twilight zone: A sensitive profile-profile comparison tool based on information theory. Journal of Molecular Biology 315 1257-1275. paper, abstract
  19. Gill Bejerano and Golan Yona. (2001). Variations on probabilistic suffix trees: statistical modeling and prediction of protein families. Bioinformatics 17 23-43. paper, abstract
  20. Golan Yona and Michael Levitt. (2000). Towards a complete map of the protein space based on a unified sequence and structure analysis of all known proteins. In the proceedings of ISMB 2000, 395-406, AAAI Press. paper, abstract
  21. Golan Yona and Michael Levitt. (2000). A Unified Sequence-Structure Classification of Protein Sequences: Combining Sequence and Structure in a Map of the Protein Space. In the proceedings of RECOMB 2000, 308-317, ACM Press. paper
  22. Michal Linial and Golan Yona. (2000). Methodologies for target selection in structural genomics. Progress in Biophysical and Molecular Biology 73, 297-320. paper, abstract
  23. Golan Yona, Nathan Linial, Michal Linial. (2000). ProtoMap: Automatic classification of protein sequences and hierarchy of protein families. Nucleic Acids Research 28, 49-55. paper, abstract
  24. Golan Yona, Nathan Linial, Michal Linial. (1999). ProtoMap: Automatic classification of protein sequences, a hierarchy of protein families, and local maps of the protein space. Proteins: Structure, Function and Genetics 37, 360-378. paper, abstract
  25. Gill Bejerano and Golan Yona. (1999). Modeling protein families using probabilistic suffix trees. In the proceedings of RECOMB 1999, 15-24, ACM Press. paper (Best paper by a young scientist award).
  26. Golan Yona, Nathan Linial, Naftali Tishby, Michal Linial. (1998). A map of the protein space - An automatic hierarchical classification of all known proteins. In the proceedings of ISMB 1998, 212-221, AAAI Press. paper, abstract
  27. Michal Linial, Nathan Linial, Naftali Tishby and Golan Yona. (1997). Global self organization of all known protein sequences reveals inherent biological signatures. Journal of Molecular Biology 268, 539-556. paper, abstract

Technical reports

  1. Jason Davis and Golan Yona. (2004). Prediction of protein-protein interactions and the interaction site from sequence information - an extensive study of the co-evolution model. Technical report TR2004-1919, Computing and Information Science, Cornell University.
  2. William Dirks and Golan Yona. (2004). A comprehensive study of the notion of functional link between genes based on microarray data, promoter signals, protein-protein interactions and pathway analysis. Technical report TR2004-1921, Computing and Information Science, Cornell University.

Ph.D. thesis


Teaching

Spring 2006 - Algorithms in Computational Biology, Department of Computer Science, Technion.
Fall 2005 - Computational Proteomics, Department of Computer Science, Technion.
Spring 2005 - Computational Biology: The Machine Learning Approach CS627, Department of Computer Science, Cornell University.
Fall 2004 - Introduction to Computational Biology CS426, Department of Computer Science, Cornell University.
Spring 2004 - Computational Biology: The Machine Learning Approach CS627, Department of Computer Science, Cornell University.
Spring 2003 - Machine Learning CS478, Department of Computer Science, Cornell University.
Fall 2002 - Problems and Perspectives in Computational Molecular Biology CS726, Department of Computer Science, Cornell University.
Spring 2002 - Machine Learning CS478, Department of Computer Science, Cornell University.
Spring 2002 - Problems and Perspectives in Computational Molecular Biology CS726, Department of Computer Science, Cornell University.
Spring 2001 - Machine Learning CS478, Department of Computer Science, Cornell University.
Fall 2001 - Problems and Perspectives in Computational Molecular Biology CS726, Department of Computer Science, Cornell University.


Research projects

The focus areas of my group are Computational Molecular Biology and Machine Learning. We work on large-scale analysis of protein sequences and structures, exploring high-order organization within the protein space. Other research interests include mathematical and statistical models of protein families, algorithms for protein sequence and structure comparison, and structural genomics. For more information on ongoing research projects, see here.

Biozon: The aim of the Biozon project is to construct a unified biological resource and a comprehensive system for the characterization, classification and management of proteins and DNA, one that analyzes biological entities from genes to protein families, biochemical pathways and organisms. Biozon is based on an extensive database schema that integrates information at the macromolecular level as well as at the cellular level, from a variety of resources. Biozon currently stores extensive information about 40,000,000 protein and DNA sequences (integrating sequence, structure, protein-protein interactions, pathways and expression data). This amounts to about 100 million documents from several different databases as well as from in-house computations, and 6.5 billion relations between documents (including explicit relations between objects, and derived or computed relations based on sequence similarities, expression similarities, structural similarities and more). Read more about Biozon
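
The idea of documents connected by explicit and derived relations can be illustrated with a toy in-memory graph. This is only a minimal sketch and not the actual Biozon schema or code: the class names, identifiers and relation names below (Document, DocumentGraph, "has_structure", "sequence_similarity", "member_of_pathway") are all hypothetical and chosen for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """A typed node, e.g. a protein sequence, a structure or a pathway."""
    doc_id: str    # hypothetical identifier, not a real database accession
    doc_type: str  # e.g. "protein", "structure", "pathway"


class DocumentGraph:
    """Toy graph of documents linked by typed, undirected relations."""

    def __init__(self):
        # Maps each document to a list of (other document, relation, derived).
        self._edges = defaultdict(list)

    def relate(self, a, b, relation, derived=False):
        """Link two documents; derived=True marks a computed relation."""
        self._edges[a].append((b, relation, derived))
        self._edges[b].append((a, relation, derived))

    def neighbors(self, doc, relation=None, include_derived=True):
        """Return documents linked to doc, optionally filtered by relation."""
        return [
            other
            for other, rel, derived in self._edges[doc]
            if (relation is None or rel == relation)
            and (include_derived or not derived)
        ]


# Hypothetical example data.
p1 = Document("P1", "protein")
p2 = Document("P2", "protein")
s1 = Document("S1", "structure")
w1 = Document("W1", "pathway")

g = DocumentGraph()
g.relate(p1, s1, "has_structure")                      # explicit relation
g.relate(p1, p2, "sequence_similarity", derived=True)  # computed relation
g.relate(p1, w1, "member_of_pathway")                  # explicit relation

explicit_only = g.neighbors(p1, include_derived=False)
similar = g.neighbors(p1, relation="sequence_similarity")
```

Keeping the explicit/derived distinction on each edge lets a query either follow only relations stated in the source databases or also include relations computed from similarity analyses.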