Aug 2023

Unveiling Novel Gene Associations and the Future of Genome Reference Builds

Author: Edgard Verdura, PhD

Welcome to this novel blog post!

The European Society of Human Genetics (ESHG) conference, held in Glasgow from June 10th to 13th, provided an inspiring backdrop for genetic exploration. Amidst this gathering of like-minded professionals, we came away with exciting insights into novel gene associations and the evolving landscape of genome reference builds. In this post, we dive into the breakthroughs and latest trends that emerged.

Novel associations to genetic diseases

The scientific sessions at ESHG 2023 were full of knowledge sharing, particularly descriptions of novel genes associated with Mendelian diseases spanning 2022-2023. Such information is crucial during variant data analysis, as many new diagnoses become possible once newly described gene-disease associations are added to the analysis. Studies have shown that up to 30-42% of new causative variants found during exome data reanalysis are due to new disease-gene associations [1, 2]. This finding underlines the importance of reanalysing cases every 1-2 years.
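
To make the reanalysis idea concrete, here is a minimal Python sketch, not our production pipeline. It assumes a hypothetical `Variant` record and two snapshots of a gene-disease catalogue represented as plain sets of gene symbols; the case IDs, the baseline gene list and the helper name `variants_in_new_genes` are illustrative only. It flags variants from earlier analyses that fall in genes added to the catalogue since the previous run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical minimal record for a variant seen in an unsolved case."""
    case_id: str
    gene: str
    description: str  # free-text placeholder, not a real variant call


def variants_in_new_genes(variants, old_catalogue, new_catalogue):
    """Return variants whose gene was added to the catalogue since the last run."""
    newly_added = set(new_catalogue) - set(old_catalogue)
    return [v for v in variants if v.gene in newly_added]


# Illustrative catalogue snapshots (assumed data, not a real database export).
old_catalogue = {"BRCA1", "CFTR"}
new_catalogue = old_catalogue | {"RPH3A", "SCNM1"}

# Illustrative variants left over from previously unsolved cases.
unsolved = [
    Variant("case-001", "RPH3A", "placeholder variant A"),
    Variant("case-002", "TTN", "placeholder variant B"),
    Variant("case-003", "CFTR", "placeholder variant C"),
]

flagged = variants_in_new_genes(unsolved, old_catalogue, new_catalogue)
for v in flagged:
    print(v.case_id, v.gene)
```

Only case-001 is flagged here, because RPH3A is absent from the older snapshot. A real workflow would still need full variant interpretation before any diagnosis is made.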

During this conference, the Nostos Genomics Science team collected details on over 40 novel gene-disease associations that were added to our catalogues. These included RPH3A linked to neurodevelopmental disorders [3], SCNM1 associated with orofaciodigital syndrome [4], SNAPIN related to foetal neuroanatomical abnormalities [5], and PHF12 connected to developmental disorders [6]. A high proportion of those novel associations are powered by initiatives such as GeneMatcher or Matchmaker Exchange, which let researchers share genetic and phenotypic data, find patients with similar features, and pool enough evidence to associate a gene with a rare disease. By presenting these results at ESHG, researchers can recruit additional patients to complement ongoing gene identification projects, helping more patients worldwide reach a diagnosis.

Importantly, novel gene-disease information is not automatically included in standard databases such as OMIM or Orphanet, nor is it always properly curated by its reporters. Given that information sometimes takes weeks to be included in databases, analysing NGS data before this information is online hampers identification of variants in these genes and delays diagnoses for rare disease patients. This illustrates the importance of integrating biomedical information into public databases, including not only gene-disease associations but also individual variants submitted to resources such as ClinVar. Thanks to our proactive, semi-automated approach to including novel gene-disease associations in our databases, we are able to detect more variants in this space than other platforms, and we ensure that our knowledge base is extensive enough to detect variants in these loci as soon as they are first described.

Through several variant assessment projects, including validAION, one of our in-house benchmarking pipelines, we at Nostos have learnt the importance of updating our databases as quickly as possible. We have therefore implemented constant semi-automated surveillance of novel gene-disease associations. In the last 2 years, we have already seen real cases where our regularly updated knowledge base enabled identification of novel variants in previously undescribed genes, such as SLC35B2 or GNAI1.

Committed to staying at the forefront of our industry, we also joined the “Gene-disease relationships” session to learn more about how gene-disease catalogues are curated. In the first talk, “Lumping vs splitting”, we learnt how the ClinGen consortium is establishing guidelines to better curate clinical entities, differentiating or merging diseases based on criteria such as molecular mechanism, mode of inheritance or clinical variability. In the second talk, EMBL-EBI representatives gave an update on the resources and tools offered by the institute, including DECIPHER, the Ensembl Variant Effect Predictor, the Gene2Phenotype and GenCC catalogues, the MANE project to unify Ensembl and RefSeq reference transcripts, and the new PARADIGM project to solve gene-elusive rare diseases.

Novel and not-so-novel genome builds

We also observed the industry's gradual move from the hg19/GRCh37 reference genome build to hg38/GRCh38. The hg19/GRCh37 build was released in February 2009, while the hg38/GRCh38 build was released in December 2013. Given this trend, Nostos has recently improved AION’s pipeline to annotate, in an optimized way, variants from VCF files generated against the hg38 reference.
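
Because coordinates differ between builds, an annotation pipeline first needs to know which reference a VCF file was called against. The sketch below is one possible heuristic, not a description of AION: it reads the `##contig` header lines of a VCF and compares the declared length of chromosome 1 with the known chr1 lengths of GRCh37 and GRCh38. The function name `guess_build` and the example headers are our own illustration, and the approach assumes the header declares contig lengths at all, which not every VCF does.

```python
import re

# Length of chromosome 1 in each assembly; it differs between the two builds.
CHR1_LENGTH_TO_BUILD = {
    249250621: "GRCh37/hg19",
    248956422: "GRCh38/hg38",
}

# Matches header lines such as: ##contig=<ID=chr1,length=248956422>
CONTIG_RE = re.compile(r"^##contig=<(.*)>$")


def guess_build(header_lines):
    """Guess the reference build from VCF header lines, or return 'unknown'."""
    for line in header_lines:
        match = CONTIG_RE.match(line.strip())
        if not match:
            continue
        # Naive key=value parsing; assumes no commas inside quoted values.
        fields = dict(
            part.split("=", 1) for part in match.group(1).split(",") if "=" in part
        )
        if fields.get("ID") in ("1", "chr1") and fields.get("length", "").isdigit():
            return CHR1_LENGTH_TO_BUILD.get(int(fields["length"]), "unknown")
    return "unknown"


# Hypothetical header excerpts for illustration.
hg38_header = ["##fileformat=VCFv4.2", "##contig=<ID=chr1,length=248956422>"]
b37_header = ["##fileformat=VCFv4.2", "##contig=<ID=1,length=249250621,assembly=b37>"]

print(guess_build(hg38_header))
print(guess_build(b37_header))
```

When the header carries no contig lengths, the function returns "unknown" rather than guessing, so the build can be confirmed by other means before annotation.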

We also participated in the session “Complete genomes”, where we had the opportunity to learn more about the next reference genomes and the methods behind them. In the first talk, researchers from the T2T (Telomere-to-Telomere) consortium presented their efforts to finish sequencing the human genome. Though the human genome was reported to be fully sequenced in 2003, about 8% of it remained unsequenced, mainly due to its high content of repeated sequences, and up to 341 gaps were left to be read and analysed. By combining different long-read sequencing technologies, the T2T consortium achieved “true” full genome sequencing in 2022 [7], nearly 20 years after publication of the first human genome draft sequence. Furthermore, these improved sequencing methods and the T2T reference provide better sensitivity and accuracy than the hg38 reference genome, not only at the single-base level but especially for CNV and inversion detection.

One of the most recent improvements is the development of a method to assemble up to 20 of the 46 chromosomes of a single human genome as complete T2T haplotypes [8], which facilitates the use of long-read sequencing technologies to differentiate maternal and paternal alleles without segregation studies. In addition, the Human Pangenome Reference Consortium (HPRC) has released a draft pangenome built from phased assemblies of 47 different individuals [9], which aims to provide a much better representation of human variation in the genome reference.

In the next 5-10 years, novel long-read sequencing methods will enable not only more complete, high-quality coverage of the genome, but also much more representative reference genomes that avoid several types of bias and boost genetic disease studies.

In this journey of discovery, collaboration, and innovation, Nostos Genomics is committed to expanding diagnostic capabilities for a greater number of patients. Stay tuned for more updates as we continue to shape the future of genetic exploration!

References:

[1] Tan, N. B., Stapleton, R., Stark, Z., Delatycki, M. B., Yeung, A., Hunter, M. F., Amor, D. J., Brown, N. J., Stutterd, C. A., McGillivray, G., Yap, P., Regan, M., Chong, B., Fanjul Fernandez, M., Marum, J., Phelan, D., Pais, L. S., White, S. M., Lunke, S., & Tan, T. Y. (2020). Evaluating systematic reanalysis of clinical genomic data in rare disease from single center experience and literature review. Molecular Genetics & Genomic Medicine, 8(11). https://doi.org/10.1002/mgg3.1508
[2] Schobers, G., Schieving, J. H., Yntema, H. G., Pennings, M., Pfundt, R., Derks, R., Hofste, T., de Wijs, I., Wieskamp, N., van den Heuvel, S., Galbany, J. C., Gilissen, C., Nelen, M., Brunner, H. G., Kleefstra, T., Kamsteeg, E.-J., Willemsen, M. A. A. P., & Vissers, L. E. L. M. (2022). Reanalysis of exome negative patients with rare disease: a pragmatic workflow for diagnostic applications. Genome Medicine, 14(1). https://doi.org/10.1186/s13073-022-01069-z
[3] Pavinato, L., Stanic, J., Barzasi, M., Gurgone, A., Chiantia, G., Cipriani, V., Ivano Eberini, Palazzolo, L., Monica Di Luca, Costa, A., Marcantoni, A., Biamino, E., Spada, M., Hiatt, S. M., Kelley, W. V., Letizia Vestito, Sisodiya, S. M., Efthymiou, S., Chand, P., & Rauan Kaiyrzhanov. (2023). Missense variants in RPH3A cause defects in excitatory synaptic function and are associated with a clinically variable neurodevelopmental disorder. Genetics in Medicine, 25(11), 100922–100922. https://doi.org/10.1016/j.gim.2023.100922
[4] Iturrate, A., Rivera-Barahona, A., Flores, C.-L., Otaify, G. A., Elhossini, R., Perez-Sanz, M. L., Nevado, J., Tenorio-Castano, J., Triviño, J. C., Garcia-Gonzalo, F. R., Piceci-Sparascio, F., De Luca, A., Martínez, L., Kalaycı, T., Lapunzina, P., Altunoglu, U., Aglan, M., Abdalla, E., & Ruiz-Perez, V. L. (2022). Mutations in SCNM1 cause orofaciodigital syndrome due to minor intron splicing defects affecting primary cilia. American Journal of Human Genetics, 109(10), 1828–1849. https://doi.org/10.1016/j.ajhg.2022.08.009
[5] De Koning, M. (2023, June 10-13). Biallellic variants in SNAPIN are associated with a novel foetal neuroanatomical phenotype. [Conference presentation abstract]. European Human Genetics Conference 2023, Glasgow, Scotland, United Kingdom.
[6] López, J., Marcé-Grau, A., Gómez, C., Ferrer-Aparicio, S., Hernando, I., García, N., Vidal, S., Ferrer, R., Cervera, M., Galán-Chilet, I., Díaz, J., Castellvi, M., Pinasa, X., Royo, I., Camprubi, C., Torrents, A., Torrents, J. (2023, June 10-13). The importance of analyzing genes with weak phenotypic association: PHF12 as a potential cause of developmental disorder phenotype. [Conference presentation abstract]. European Human Genetics Conference 2023, Glasgow, Scotland, United Kingdom.
[7] Nurk, S., Koren, S., Rhie, A., Rautiainen, M., Bzikadze, A. V., Mikheenko, A., Vollger, M. R., Altemose, N., Uralsky, L., Gershman, A., Aganezov, S., Hoyt, S. J., Diekhans, M., Logsdon, G. A., Alonge, M., Antonarakis, S. E., Borchers, M., Bouffard, G. G., Brooks, S. Y., & Caldas, G. V. (2022). The complete sequence of a human genome. Science, 376(6588), 44–53. https://doi.org/10.1126/science.abj6987
[8] Rautiainen, M., Nurk, S., Walenz, B. P., Logsdon, G. A., Porubsky, D., Rhie, A., Eichler, E. E., Phillippy, A. M., & Koren, S. (2023). Telomere-to-telomere assembly of diploid chromosomes with Verkko. Nature Biotechnology. https://doi.org/10.1038/s41587-023-01662-6
[9] Liao, W.-W., Asri, M., Ebler, J., Doerr, D., Haukness, M., Hickey, G., Lu, S., Lucas, J. K., Monlong, J., Abel, H. J., Buonaiuto, S., Chang, X. H., Cheng, H., Chu, J., Colonna, V., Eizenga, J. M., Feng, X., Fischer, C., Fulton, R. S., & Garg, S. (2023). A draft human pangenome reference. Nature, 617(7960), 312–324. https://doi.org/10.1038/s41586-023-05896-x
