After the protein sequences are embedded in Euclidean space, we analyze the resulting sequence space further and look for statistical regularities in it. Specifically, we construct a statistical clustering model of the sequences. This chapter focuses on methods for clustering statistical data. It starts with a very brief introduction to the notation used in cluster analysis, to classical clustering algorithms, and to validation schemes. It then describes our clustering algorithm. By incorporating a cross-validation scheme, this algorithm is expected to construct a reliable model of the data with high probability. The chapter then describes the statistical tests by which cluster validity is measured.