Single-Domain Parvulins Constitute a Specific Marker for Recently Proposed Deep-Branching Archaeal Subgroups
Peptidyl-prolyl cis/trans isomerases (PPIases) are enzymes that assist protein folding and protein quality control in organisms of all kingdoms of life. In contrast to the other subclasses of PPIases, the cyclophilins and the FK-506 binding proteins, little was previously known about the parvulin type of PPIase in Archaea. Recently, the first solution structure of an archaeal parvulin, the PinA protein from Cenarchaeum symbiosum, was reported. An investigation of the occurrence and frequency of PPIase sequences in numerous archaeal genomes revealed a strong tendency toward a reduced number of PPIases in thermophilic microorganisms. Single-domain parvulins were found mostly in the genomes of recently proposed deep-branching archaeal subgroups, the Thaumarchaeota and the ARMANs (archaeal Richmond Mine acidophilic nanoorganisms). Hence, we used the parvulin sequence to reclassify available archaeal metagenomic contigs, thereby adding new members to these subgroups. A combination of genomic background analysis and phylogenetic analysis of parvulin sequences suggested that the assigned sequences belong to at least two distinct groups of Thaumarchaeota. Finally, machine learning approaches were applied to identify amino acid residues that separate archaeal and bacterial parvulin proteins from each other. When mapped onto the recent PinA solution structure, most of these positions form a cluster at one site of the protein, possibly indicating a difference in function between the two groups of parvulin proteins.