This study focuses on methods for global organization of all known protein sequences. The main objective of this research is to explore high order organization in the protein sequence space, and to identify clusters of related proteins, by utilizing information about pairwise similarities. By these means we wish to achieve a classification of all protein sequences into groups of shared biological function (protein families), and gain a better understanding of the geometry of the sequence space. The category of this work is ``computational molecular biology''. This is an essentially new discipline which is rooted in two different disciplines: computer science and molecular biology. Being on the border line between the two disciplines, this study is related to fields of intensive research in both. It incorporates several recent developments in the theory of metric embedding and unsupervised learning, efficient graph algorithms, algorithms for the comparison of protein sequences, and the statistical theory of sequence comparison. While a full survey of each field is beyond the scope of this thesis, I have provided sufficient background to evaluate this work. This chapter begins with a very brief introduction to protein sequences, then defines the basic problem on which this study is focused, and introduces the methods that are employed. An elaborated introduction and discussion of each subject appears in the following chapters.