Ensemble variant interpretation methods to predict enzyme activity and assign pathogenicity in the CAGI4 NAGLU (Human N-acetyl-glucosaminidase) and UBE2I (Human SUMO-ligase) challenges.

Title: Ensemble variant interpretation methods to predict enzyme activity and assign pathogenicity in the CAGI4 NAGLU (Human N-acetyl-glucosaminidase) and UBE2I (Human SUMO-ligase) challenges.
Publication Type: Journal Article
Year of Publication: 2017
Authors: Yin, Y, Kundu, K, Pal, LR, Moult, J
Journal: Hum Mutat
Date Published: 2017 May 24
ISSN: 1098-1004
Abstract

CAGI (Critical Assessment of Genome Interpretation) conducts community experiments to determine the state of the art in relating genotype to phenotype. Here we report results obtained using newly developed ensemble methods to address two CAGI4 challenges: enzyme activity for population missense variants found in NAGLU (Human N-acetyl-glucosaminidase) and random missense mutations in Human UBE2I (Human SUMO E2 ligase), assayed in a high-throughput competitive yeast complementation procedure. The ensemble methods are effective, ranked 2nd for SUMO-ligase and 3rd for NAGLU, according to the CAGI independent assessors. However, in common with other methods used in CAGI, there are large discrepancies between predicted and experimental activities for a subset of variants. Analysis of the structural context provides some insight into these. Post-challenge analysis shows the ensemble methods are also effective at assigning pathogenicity for the NAGLU variants. In the clinic, providing an estimate of the reliability of pathogenic assignments is key. We have also used the NAGLU dataset to show that ensemble methods have considerable potential for this task, and are already reliable enough for use with a subset of mutations.

DOI: 10.1002/humu.23267
Alternate Journal: Hum. Mutat.
PubMed ID: 28544272
Grant List: R13 HG006650 / HG / NHGRI NIH HHS / United States
U41 HG007346 / HG / NHGRI NIH HHS / United States