Mix-and-match association

Protein families that perform similar classes of work, whether or not they are related by homology, form a guild of protein families. In a mix-and-match association, a key marker protein for some type of process can associate flexibly with any of a number of unrelated families from the same guild. CRISPR-associated proteins form a guild (PMID: 16292354) in which member proteins cluster in contiguous genomic regions. New CRISPR-associated proteins can be discovered if they reliably occur only in contexts with other guild members.
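
The discovery criterion in the last sentence can be illustrated with a simple neighborhood score. The sketch below is only an illustration under stated assumptions, not the method of the cited work: each genome is modeled as an ordered list of protein family labels, and the function name `guild_context_score`, the `window` parameter, and all family labels in the toy data are hypothetical.

```python
from collections import defaultdict

def guild_context_score(genomes, guild, window=1):
    """For each non-guild family, return the fraction of its occurrences
    that lie within `window` genes of a known guild member."""
    near = defaultdict(int)   # occurrences next to a guild member
    total = defaultdict(int)  # all occurrences
    for genes in genomes:
        for i, fam in enumerate(genes):
            if fam in guild:
                continue
            total[fam] += 1
            lo = max(0, i - window)
            hi = min(len(genes), i + window + 1)
            neighbors = genes[lo:i] + genes[i + 1:hi]
            if any(n in guild for n in neighbors):
                near[fam] += 1
    return {fam: near[fam] / total[fam] for fam in total}

# Toy data (invented): "cas*" stand for known guild members,
# "hk*" for unrelated families, "famX" for an uncharacterized family.
genomes = [
    ["hk1", "hk2", "cas1", "cas2", "famX"],
    ["famX", "cas3", "hk3", "hk1", "hk2"],
    ["hk3", "hk2", "hk1", "cas1", "famX"],
]
scores = guild_context_score(genomes, guild={"cas1", "cas2", "cas3"})
```

In this toy data only `famX` is found next to a guild member every time it occurs, so it is the one family that would be flagged as a candidate new member. A practical version would presumably also require a minimum number of occurrences before trusting a high score.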

Similarly, addiction module toxin families form one guild and antitoxin families form another. Toxins and antitoxins occasionally partner in unexpected ways through mix-and-match association.

Peptide maturation enzymes, found in bacteriocin production systems, form a guild that includes lantibiotic synthetases, cyclodehydratases, PqqE-like radical SAM enzymes, and others. The targets of peptide modification form another guild, the ribosomally produced natural product precursor families, that is far from fully described. Interestingly, members of the nitrile hydratase-related leader peptide (NHLP) and nif11-related leader peptide families (PMID: 20500830) clearly pair promiscuously with different classes of maturases, just as the maturases work on different families of targets. This mix-and-match association is easily exploited to discover new Genome Properties and Subsystems. "Orphan" maturation clusters can point to new target peptide families (e.g. PMID: 19561184), while the new target families in turn point to new maturation proteins. Discovery that exploits mix-and-match association can continue iteratively, a process called annotation walking.
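
The iterative character of annotation walking can be sketched as a fixed-point loop over gene clusters. This is a minimal illustration, not a published algorithm: it assumes each cluster has already been split into candidate enzyme families and candidate precursor peptide families (how that split is made is outside the sketch), and the function name `annotation_walk` and all family labels are hypothetical.

```python
def annotation_walk(clusters, seed_maturases, seed_targets=()):
    """Grow the known maturase and target guilds until nothing changes.

    `clusters` is a list of (enzyme_families, peptide_families) pairs of sets.
    """
    maturases = set(seed_maturases)
    targets = set(seed_targets)
    changed = True
    while changed:
        changed = False
        for enzymes, peptides in clusters:
            # A known maturase in the cluster points to new target families.
            if maturases & enzymes and not peptides <= targets:
                targets |= peptides
                changed = True
            # A known target in the cluster points to new maturase families.
            if targets & peptides and not enzymes <= maturases:
                maturases |= enzymes
                changed = True
    return maturases, targets

# Toy clusters (invented labels): the walk starts from one known maturase.
clusters = [
    ({"maturase_A"}, {"target_1"}),
    ({"maturase_B"}, {"target_1"}),
    ({"maturase_B"}, {"target_2"}),
    ({"maturase_C"}, {"target_2"}),
    ({"maturase_D"}, {"target_9"}),  # never linked to the seed
]
found_maturases, found_targets = annotation_walk(clusters, {"maturase_A"})
```

Starting from `maturase_A` alone, the walk reaches `target_1`, then `maturase_B`, then `target_2`, then `maturase_C`, and stops. The unconnected cluster is left untouched. In real use each newly proposed family would need review before it seeds the next round, since a single spurious link would otherwise propagate.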