Tuesday, December 24, 2013

Announcing GRCh38

The GRC announces the public release of GRCh38, the latest version of the human reference genome assembly. This represents the first major assembly update since 2009, and introduces changes to chromosome coordinates. The GRC would like to thank the many individuals and groups that have provided helpful feedback and shared data, often ahead of publication, in efforts to improve the reference assembly. Such interactions help ensure the reference assembly is truly a community resource.

Users can download the latest version of the assembly from the GenBank FTP site: ftp://ftp.ncbi.nlm.nih.gov/genbank/genomes/Eukaryotes/vertebrates_mammals/Homo_sapiens/GRCh38/

The GRC does not provide annotation for the assembly. The assembly will be picked up from this FTP site for annotation by the major browsers (UCSC, Ensembl and NCBI), which will make it available on their websites in the coming weeks and months.

GRCh38 highlights

Mitochondrial genome

MITOMAP, the organization responsible for the management of human mitochondrial sequences, has kindly allowed the GRC to include the mitochondrial reference sequence with GRCh38. As in GRCh37, the current MT reference sequence is the Revised Cambridge Reference Sequence (rCRS), represented by GenBank accession number J01415.2 and RefSeq accession number NC_012920.1.

Sequence representation for centromeres

In previous reference assembly versions, the centromeres were represented by large, megabase-sized gaps (N's in the assembly sequence). In GRCh38, these gaps are replaced by sequences derived from the reads generated during the sequencing of the HuRef genome. These reads were used to create centromere models, as described in Miga et al., 2013, that provide the approximate repeat number and order for each centromere in the genome. These model centromere sequences are anticipated to be useful for read mapping and variation studies. Be on the lookout for upcoming GRC blogs with more information about these centromeres.
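As a rough illustration of what "gaps (N's in the assembly sequence)" means in practice, the Python sketch below locates runs of N in a sequence string. The function name `find_gaps` and the `min_len` parameter are hypothetical, chosen for this example only; they are not part of any GRC tooling, and the input is a toy string rather than real assembly data.

```python
import re

def find_gaps(sequence, min_len=1):
    """Return (start, end) pairs for runs of N/n at least min_len long.

    Coordinates are 0-based and half-open, so end - start is the gap length.
    """
    pattern = re.compile(r"[Nn]+")
    return [
        (m.start(), m.end())
        for m in pattern.finditer(sequence)
        if m.end() - m.start() >= min_len
    ]

# Toy sequence (not real assembly data): one 10-base gap and one 2-base gap.
toy = "ACGT" + "N" * 10 + "GGCC" + "NN" + "TT"
print(find_gaps(toy, min_len=5))  # [(4, 14)]
```

Running the same kind of scan over a centromeric region of GRCh37 and of GRCh38 would be one simple way to see where gap sequence has been replaced by modeled sequence.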

General assembly updates

Large-scale studies of human variation, such as the 1000 Genomes Project, identified a number of bases and indels in GRCh37 that were never seen in any individual, suggesting they may represent errors in the assembly. Several thousand individual bases were updated in GRCh38, many of which corrected errors in coding sequence. In addition, a number of assembly regions that were misassembled in GRCh37, such as 1q21, 10q11 and the chr. 9 peri-centromeric regions, have been retiled. Several highly variant genomic regions, such as the IGH locus, have been retiled with components derived from a single haplotype resource to ensure the reference assembly provides a valid haplotypic representation. More than 100 assembly gaps have also been updated; these are either closed or reduced, in many cases with publicly available WGS sequences from other genome sequencing projects.


Like GRCh37, the updated reference assembly provides alternate sequence representation for variant regions in the form of alternate loci (alt loci) scaffolds. The alt loci are stand-alone, accessioned sequences for which chromosomal context is provided via alignment to the reference chromosomes. All alternate loci include at least one anchor sequence, a component also found on the reference chromosomes, to ensure these alignments are of high quality. Alt loci belong to alternate loci assembly units: the assembly unit ALT_REF_LOCI_1 contains the first alternate sequence representation for any genomic locus, ALT_REF_LOCI_2 contains the second alternate sequence representation and so forth. GRCh38 contains 261 alt loci scaffolds, in 35 alternate assembly units. 72 of these alternate loci were previously available as NOVEL patches to GRCh37. The LRC/KIR complex on chr. 19 has the largest number of alternate sequence representations (35), followed by the MHC on chr. 6 (7).
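The assembly-unit naming rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in this post: the function name `assign_assembly_units`, the input layout, and the scaffold names are all hypothetical, and the GRC's actual assignment process may differ.

```python
from collections import defaultdict

def assign_assembly_units(alt_scaffolds):
    """Map each alt scaffold to an ALT_REF_LOCI_n assembly unit name.

    alt_scaffolds is an ordered list of (locus, scaffold) pairs. The n-th
    alternate scaffold encountered for a locus is placed in ALT_REF_LOCI_n.
    """
    seen = defaultdict(int)  # alternates counted so far, per locus
    units = {}
    for locus, scaffold in alt_scaffolds:
        seen[locus] += 1
        units[scaffold] = f"ALT_REF_LOCI_{seen[locus]}"
    return units

# Hypothetical scaffold names, for illustration only.
toy = [
    ("MHC", "mhc_alt_a"),
    ("LRC/KIR", "kir_alt_a"),
    ("MHC", "mhc_alt_b"),
]
units = assign_assembly_units(toy)
print(units["mhc_alt_b"])  # ALT_REF_LOCI_2
```

Under this scheme, the number of assembly units equals the largest number of alternate representations at any single locus, which is consistent with the 35 units and the 35 LRC/KIR representations reported above.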


  1. Dear Human Genome Reference Team,

    I know that 70% of GRCh37 was based on a single individual, via the RP11 BAC clone library.

    Is GRCh38 also primarily based on the same RP11 library?

