Welcome to another blog post!
Reference genomes are essential benchmarks of a species' genome that facilitate the accurate comparison of individual genomes and are crucial tools for identifying genetic variants and diagnosing rare diseases. Here, we will explore the evolution of the human reference genome, focusing on the transition from HG19 to HG38, its advantages and limitations, and the implications of migrating for clinical diagnostics and genomic research. Additionally, we will explore how innovative tools like AION are facilitating the transition from HG19 to HG38 for improved diagnostic accuracy.
Understanding Reference Genomes and Their Significance for Genomics Research and Clinical Diagnostics
A reference genome is a nucleic acid sequence database assembled as an accepted representative example of the most up-to-date information on a species' complete genome. Reference genomes are essential in genomics research, where they are used for studying genetic traits and diversity, and in clinical diagnostics, where they act as a universal standard for comparing genomes and diagnosing genetic diseases [1]. Consequently, it is imperative that reference genomes are as complete and representative as possible; however, most eukaryotic reference genomes do not have complete end-to-end sequence information [2]. In fact, the latest official Genome Reference Consortium (GRC)-released version of the human reference genome has hundreds of gaps, may not accurately reflect genetic diversity, and focuses on the euchromatic fraction of the genome [1,2]. Nevertheless, it is still accepted and widely used while ongoing efforts continue to construct, validate, and analyse a truly complete human reference genome [3].
The first human genome was drafted following the huge international effort of the Human Genome Project and has been continually improved in line with ongoing research efforts over the past two decades [3,4]. The human reference genome is curated by the GRC, the organisation responsible for ensuring the human reference genome is revised regularly in line with new findings [5]. The GRC releases a major update (a new version) every few years in addition to more regular, minor updates known as patches. The latest GRC-released version of the human genome is known as Genome Reference Consortium Human Build 38 (GRCh38), commonly referred to as HG38 or Build 38 [6]. HG38 is an upgraded version of the previous build, GRCh37, commonly known as HG19. In 2022, a more up-to-date version of the human reference genome was released by the Telomere-to-Telomere (T2T) Consortium [3]. This reference genome, T2T-CHM13, is the first gapless version of the human genome. Although T2T-CHM13 has not yet been formally adopted as the primary human reference genome, it represents a significant contribution to genomics and is expected to substantially improve the current primary reference genome.
The Transition from HG19 to HG38: Advancements and Limitations
HG19, released in 2009, is an earlier human reference genome version. This version has been widely used in genomics projects, including long-term, global studies like the 1000 Genomes Project [7], as well as in clinical diagnostics. HG38 is a significantly upgraded version released in 2013, assembled using Sanger sequencing data from many donors [8]. HG38 has been employed for major genomics projects such as the 100,000 Genomes Project [9,10], owing to several improvements over HG19 [11,12].
The transition from HG19 to HG38 has significantly impacted genomic research, offering a more robust framework for understanding human genetics and its variations, which is likely to be improved even further with the adoption of T2T-CHM13. Moreover, the enhanced accuracy and representation of diversity provided with HG38 have considerable implications for rare disease diagnostics and precision medicine [19,20].
While HG38 demonstrates significant improvements in representation and accuracy, many scientists across research and clinical laboratories continue to use HG19 as their reference genome of choice [11]. Despite the increased risk of incorrect variant interpretation, which can be especially detrimental in diagnostic settings, staying on HG19 allows labs to maintain consistency throughout long-term projects and avoids altering existing analysis pipelines, since labs should avoid using two reference genomes at the same time [8]. Moreover, much of the existing literature reports HG19 coordinates, and initially, the lack of annotation tools for HG38 represented a considerable bottleneck [8]. Migrating to HG38 also presents complications regarding the realignment and conversion of previously generated data, which is technically complex and time- and resource-intensive [19].
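Because mixing builds within one analysis is a common source of error, a simple guard can help. The following is a minimal sketch, not part of any specific tool: it assumes the VCF header carries `##contig` lines with a `length` field (not all files do) and infers the build from the length of chromosome 1, which is 249,250,621 bp in GRCh37 and 248,956,422 bp in GRCh38. The function names are hypothetical and chosen for illustration.

```python
# Chromosome 1 lengths for the two builds discussed in this post.
CHR1_LENGTHS = {
    249_250_621: "HG19 (GRCh37)",
    248_956_422: "HG38 (GRCh38)",
}


def parse_contig_line(line):
    """Return (contig_id, length) from a '##contig=<...>' header line, or None."""
    prefix = "##contig=<"
    if not line.startswith(prefix):
        return None
    body = line.strip()[len(prefix):].rstrip(">")
    # Split 'ID=chr1,length=...' into a dict; parts without '=' are ignored.
    fields = dict(part.split("=", 1) for part in body.split(",") if "=" in part)
    if "ID" not in fields or "length" not in fields:
        return None
    return fields["ID"], int(fields["length"])


def infer_build(header_lines):
    """Guess the reference build from the chromosome 1 contig length."""
    for line in header_lines:
        parsed = parse_contig_line(line)
        if parsed is None:
            continue
        contig_id, length = parsed
        if contig_id in ("1", "chr1"):
            return CHR1_LENGTHS.get(length, "unknown")
    return "unknown"


def check_same_build(*headers):
    """Raise ValueError unless every header resolves to the same known build."""
    builds = {infer_build(h) for h in headers}
    if len(builds) != 1 or "unknown" in builds:
        raise ValueError(f"Inconsistent or unknown builds: {sorted(builds)}")
    return builds.pop()


if __name__ == "__main__":
    old = ["##fileformat=VCFv4.2", "##contig=<ID=1,length=249250621>"]
    new = ["##fileformat=VCFv4.2", "##contig=<ID=chr1,length=248956422>"]
    print(infer_build(old))
    print(infer_build(new))
```

Running such a check before merging call sets from different projects catches the case where one file was aligned to HG19 and another to HG38. Files without contig lengths in the header would need a different signal, such as the reference path recorded by the aligner.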
It is likely that, with the increasing availability of advanced, accurate annotation and variant interpretation tools supporting HG38 and streamlining the transition, more researchers will be willing and able to make the switch. This should not only yield more accurate genomic data and faster diagnosis and treatment but also improve consistency, avoid confusion, and reduce errors.
Choosing Between HG19 and HG38 With AION
Researchers should carefully consider the version of the reference genome used in their work, as this can significantly impact the interpretation of genomic data, particularly in rare disease clinical diagnostics. Many bioinformatics tools support one or the other, but the flexibility to switch between the HG19 and HG38 reference genomes should not be underestimated, given the technical and practical benefits of each.
AION is an AI-driven rare disease variant interpretation platform that supports the accelerated diagnosis of rare diseases using a machine-learning model trained on high-quality genetic variant data points. AION has been clinically validated on the Genomics England 100,000 Genomes Project, showing high sensitivity in identifying causative variants. AION has the advantage of allowing users the flexibility to choose between the HG19 and HG38 reference genomes simply by selecting one or the other when running a case. This capacity accommodates the nuances between different versions of the human reference genome, ensuring accurate variant interpretation and genomic analysis across different builds for each user’s unique project requirements.
From a practical perspective, transitioning from HG19 to HG38 requires a lab to modify and validate its existing workflows for secondary and tertiary analysis, which can be burdensome in terms of time and costs. Moreover, lift-over, while possible, is usually imperfect, and variants may be detected in one genome but not the other. A 2021 study reported that only 7% of surveyed laboratories (including academic and clinical diagnostic labs) had transitioned to HG38 [21]. Those still using HG19 cited time and monetary costs and lack of staff to support the migration as the main reasons for not yet migrating. AION is helping labs overcome these hurdles, allowing them to efficiently migrate to HG38 for more accurate, reliable clinical diagnostics. The lab just needs to adopt a secondary analysis strategy for HG38, and AION has the rest covered. In addition, if you need technical support transitioning to HG38, our expert team is here to help.
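To illustrate why lift-over is imperfect, here is a toy sketch of coordinate conversion. It assumes a simplified model in which the old build maps to the new build through a list of aligned blocks; the block coordinates below are invented for illustration and are not real HG19-to-HG38 chain data. Real chain files also handle strand changes and competing alignments, which this sketch leaves out.

```python
from typing import List, NamedTuple, Optional, Tuple


class Block(NamedTuple):
    """One aligned stretch shared by the old and new builds (hypothetical)."""
    chrom: str
    old_start: int  # 0-based, inclusive
    old_end: int    # exclusive
    new_start: int  # where the block begins in the new build


# Invented example: old positions 1000-1199 have no counterpart in the new
# build, and everything after that gap is shifted by 50 bases.
TOY_BLOCKS = [
    Block("chr1", 0, 1000, 0),
    Block("chr1", 1200, 2000, 1150),
]


def lift(chrom: str, pos: int, blocks: List[Block]) -> Optional[int]:
    """Map one old-build position to the new build, or None if unmapped."""
    for b in blocks:
        if b.chrom == chrom and b.old_start <= pos < b.old_end:
            return b.new_start + (pos - b.old_start)
    return None


def lift_variants(
    variants: List[Tuple[str, int]], blocks: List[Block]
) -> Tuple[List[Tuple[str, int]], List[Tuple[str, int]]]:
    """Split variants into (lifted to new coordinates, failed to lift)."""
    lifted, failed = [], []
    for chrom, pos in variants:
        new_pos = lift(chrom, pos, blocks)
        if new_pos is None:
            failed.append((chrom, pos))
        else:
            lifted.append((chrom, new_pos))
    return lifted, failed


if __name__ == "__main__":
    calls = [("chr1", 500), ("chr1", 1100), ("chr1", 1300)]
    lifted, failed = lift_variants(calls, TOY_BLOCKS)
    print("lifted:", lifted)
    print("failed:", failed)
```

In this toy model, the variant at old position 1100 falls in a region with no counterpart in the new build, so it cannot be converted and would be silently lost unless the failures are tracked. This is one reason realigning raw reads to HG38 is often preferred over converting coordinates.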
Conclusions and Future Perspectives
In conclusion, the evolution of the human reference genome from HG19 to HG38 marks a significant advance in genomics, offering enhanced accuracy, better representation of genetic diversity, and incorporation of decoy sequences. While HG38 is the more up-to-date and comprehensive version, HG19 continues to be heavily used due to barriers to transitioning, and the same barriers are likely to slow the adoption of future reference genomes. Advanced tools like AION are working to overcome these barriers by facilitating the transition from HG19 to HG38 and providing expert support for labs looking to migrate.
Moving forward, advanced long-read sequencing techniques will likely continue to facilitate the development of highly accurate and complete genome sequencing capabilities [22], leading to more accurate, representative reference genomes [3] and representing a huge step towards the more inclusive diagnosis of rare diseases.
To learn more about how Nostos Genomics and our AI-driven variant interpretation platform, AION, can support your lab’s transition to HG38 for more accurate research and clinical diagnostics, book a free demo with one of our genomics experts.
References
5. Genome Reference Consortium. Accessed January 24, 2024. https://www.ncbi.nlm.nih.gov/grc