Composition bias and the origin of ORFan genes.

Title: Composition bias and the origin of ORFan genes.
Publication Type: Journal Article
Year of Publication: 2010
Authors: Yomtovian, I, Teerakulkittipong, N, Lee, B, Moult, J, Unger, R
Journal: Bioinformatics
Volume: 26
Issue: 8
Pagination: 996-9
Date Published: 2010 Apr 15
ISSN: 1367-4811
Keywords: Evolution, Molecular, Genome, Genomics, Open Reading Frames, Prokaryotic Cells
Abstract

MOTIVATION: Intriguingly, sequence analysis of genomes reveals that a large number of genes are unique to each organism. The origin of these genes, termed ORFans, is not known. Here, we explore the origin of ORFan genes by defining a simple measure called 'composition bias', based on the deviation of the amino acid composition of a given sequence from the average composition of all proteins of a given genome.
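The abstract does not give the exact formula for 'composition bias', so the following is only a minimal sketch of one plausible reading. It assumes the bias is the Euclidean distance between a sequence's amino acid frequency vector and the genome-wide frequency vector, and that the genome-wide average is pooled over all residues. The function names (`composition`, `genome_composition`, `composition_bias`) are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the relative frequency of each standard amino acid in seq."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def genome_composition(proteins):
    """Average composition of a genome's proteins.

    Assumption: residues are pooled across all proteins, so longer
    proteins weigh more. The paper may average differently.
    """
    return composition("".join(proteins))


def composition_bias(seq, reference):
    """Deviation of seq's composition from the reference composition.

    Assumption: Euclidean distance between the two frequency vectors.
    """
    comp = composition(seq)
    return math.sqrt(sum((comp[aa] - reference[aa]) ** 2 for aa in AMINO_ACIDS))
```

Under these assumptions, a sequence whose composition matches the genome average scores 0, and larger values indicate a more atypical composition.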

RESULTS: For a set of 47 prokaryotic genomes, we show that the amino acid composition bias of real proteins, random 'proteins' (created by using the nucleotide frequencies of each genome) and 'proteins' translated from intergenic regions are distinct. For ORFans, we observed a correlation between their composition bias and their relative evolutionary age. Recent ORFan proteins have compositions more similar to those of random 'proteins', while the compositions of more ancient ORFan proteins are more similar to those of the set of all proteins of the organism. This observation is consistent with an evolutionary scenario wherein ORFan genes emerged and underwent a large number of random mutations and selection, eventually adapting to the composition preference of their organism over time.
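The random 'proteins' mentioned in the results are described only as being created from the nucleotide frequencies of each genome. One way this could be sketched, under stated assumptions: draw each codon position independently from the genome's single-nucleotide frequencies, translate with the standard genetic code, and skip stop codons until the requested length is reached. The independence of positions, the handling of stop codons, and the names `nucleotide_frequencies` and `random_protein` are all assumptions for illustration, not the authors' documented procedure.

```python
import random
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AAS)}


def nucleotide_frequencies(genome):
    """Return the relative frequency of T, C, A and G in a DNA string."""
    genome = genome.upper()
    counts = {b: genome.count(b) for b in BASES}
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: n / total for b, n in counts.items()}


def random_protein(freqs, length, rng=None):
    """Build a random 'protein' of the given length.

    Assumptions: each base is drawn independently from freqs, and
    stop codons are discarded rather than ending the sequence.
    """
    rng = rng or random.Random()
    weights = [freqs[b] for b in BASES]
    residues = []
    while len(residues) < length:
        codon = "".join(rng.choices(BASES, weights=weights, k=3))
        aa = CODON_TABLE[codon]
        if aa != "*":  # skip stop codons
            residues.append(aa)
    return "".join(residues)
```

Passing a seeded `random.Random` instance as `rng` makes the generated sequences reproducible, which helps when comparing composition distributions across runs.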

DOI: 10.1093/bioinformatics/btq093
Alternate Journal: Bioinformatics
PubMed ID: 20231229
PubMed Central ID: PMC2853687
Grant List: R01 GM081511-03 / GM / NIGMS NIH HHS / United States
R01GM081511 / GM / NIGMS NIH HHS / United States