Find an NCBI GenBank mtDNA full sequence on the complete human mitochondrial tree
Last update May 21, 2007. Installed 3,512 times.
Script Summary:
Create a human mitochondrial DNA phylogenetic tree from an NCBI GenBank mtDNA full sequence, with a green line from the root of the tree pointing to its position. For a demo of the result: http://members.cox.net/tkandell/mtDNA/tree.html
Many people are beginning to order tests for their full mitochondrial DNA sequences. However, once they have their results, they have no real way of comparing their haplotypes with anyone else's to see who their closest matches are, to work out their more distant matrilineal origins, or to find which regional and ethnic groups they most closely match. This script attempts to address that problem. If they upload their full mtDNA sequence to NCBI GenBank, they will be able to compare their haplotype with the 2,000+ (and counting) other sequences already there. These are all the sequences cited in academic studies of mitochondrial DNA, as well as other sequences that have been uploaded personally by people who’ve ordered their own mtDNA tests.
This script adds a button to the GenBank page of any complete human mtDNA sequence (it must be an actual full sequence or the button won’t appear). The button runs an NCBI Blast query (NCBI Blast is an online tool to compare genetic sequences) and generates an on-the-fly phylogenetic tree.
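The script's actual test for a "full sequence" is not described here. As a rough illustration only, the check below assumes two criteria: that the record title names Homo sapiens and a complete mitochondrial genome, and that the sequence length is close to the 16,569 bp of the rCRS. The function name, the tolerance value, and both criteria are assumptions made for this sketch, not the script's own logic.

```python
RCRS_LENGTH = 16569  # length of the revised Cambridge Reference Sequence, in base pairs


def looks_like_full_human_mtdna(title: str, length: int, tolerance: int = 100) -> bool:
    """Hypothetical heuristic for deciding whether to show the button.

    Assumes the GenBank title mentions the species and a complete
    mitochondrial genome, and that the length is near the rCRS length.
    """
    text = title.lower()
    has_species = "homo sapiens" in text
    has_complete_genome = "mitochondrion" in text and "complete genome" in text
    near_rcrs_length = abs(length - RCRS_LENGTH) <= tolerance
    return has_species and has_complete_genome and near_rcrs_length
```

With these assumptions, a title such as "Homo sapiens haplotype HV* mitochondrion, complete genome." with a length of 16,569 would pass, while a partial control-region record would not.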
See here for the latest examples of an outline of the human mtDNA tree from a recent academic paper [Kivisild et al. 2006]:
(Click on the thumbnails to see the full-size images)

The African root of the tree (“Haplogroup L”)

The two Out of Africa migrations, macro-haplogroups M and N
This query uses a reconstructed common ancestral human mtDNA haplotype to generate a balanced phylogenetic tree of all complete human mtDNA sequences in NCBI GenBank. (This reconstruction was based on the data from academic papers as well as an alignment of gorilla, chimpanzee, bonobo, and Neanderthal control-region sequences, alongside sequences from haplogroups L0d, L0k, L0f, L1b, L1c, L5, and L3e, all aligned with the rCRS.)
Then, it searches for the selected GenBank sequence, and uses NCBI’s Blast TreeView distance tree generator to draw a green line showing the path from the root of the tree (the ancestral haplotype) to that sequence. The tree that’s generated, which contains over 2000 haplotypes, is very small and densely packed, making it hard to see. However, by rolling the mouse over any branching point and selecting “Show Subtree”, the user can magnify any part of the tree. Also, by selecting "Show Alignment", the user can compare any group of sequences, with each other and with the ancestral haplotype. Another option is just clicking on an individual sequence on the tree. (Even though the titles of the sequences don’t appear on the unmagnified versions of the tree, they do show up in popup text when the mouse is rolled over the ends of the lines on the right-hand side.)
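The green line is simply the path from the root to one leaf of the tree. A minimal sketch of that idea, assuming the tree is stored as a child-to-parent map; the toy tree, its node labels, and the function name are illustrative assumptions, with intermediate branches omitted, and this is not how NCBI's TreeView represents its data.

```python
def path_from_root(parent: dict, leaf: str) -> list:
    """Walk up from a leaf to the root, then reverse to get root-to-leaf order."""
    path = [leaf]
    while path[-1] in parent:  # the root is the only node with no parent entry
        path.append(parent[path[-1]])
    path.reverse()
    return path


# Heavily simplified toy tree (child -> parent); real trees have thousands of nodes.
toy_parent = {
    "L3": "root",
    "M": "L3",
    "N": "L3",
    "R": "N",
    "HV": "R",
    "DQ377992": "HV",
}

highlighted = path_from_root(toy_parent, "DQ377992")
print(" -> ".join(highlighted))
```

Every node on the returned list would be drawn in green; all other branches keep the default color.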
Distance trees are critical to determining the actual relationships between haplotypes. Single pairwise comparisons with “matches” often fail to identify the true closest matches, whereas comparing a whole set of haplotypes at once makes the relationships clear.
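To make the idea of comparing a whole set concrete, here is a small sketch that computes every pairwise distance among a few aligned sequences and then picks the nearest neighbor of each. It assumes a plain count of differing positions (Hamming distance) on made-up ten-base sequences; the distances NCBI Blast actually uses to build its trees are computed differently, and all names below are hypothetical.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(seqs: dict) -> dict:
    """Distance for every unordered pair of sequence names."""
    return {
        frozenset((m, n)): hamming(seqs[m], seqs[n])
        for m, n in combinations(seqs, 2)
    }


def closest_match(name: str, seqs: dict) -> str:
    """Name of the sequence with the smallest distance to `name`."""
    dist = pairwise_distances(seqs)
    others = [n for n in seqs if n != name]
    return min(others, key=lambda n: dist[frozenset((name, n))])


# Made-up toy alignment, not real mtDNA data.
toy = {
    "a": "ACGTACGTAT",
    "b": "ACGAACGTAT",
    "c": "TCGTACGGAC",
}

for name in toy:
    print(name, "is closest to", closest_match(name, toy))
```

A tree-building method such as neighbor joining starts from exactly this kind of full distance matrix, which is why it can reveal groupings that a single one-to-one comparison would miss.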
In the alignment window, each sequence accession number is a link to the original sequence page in GenBank, which in turn has links to the academic papers (if any) where it was cited. Even though it often isn’t clear from the sequence title alone which haplogroup a sequence belongs to, by looking at the sequence page and the academic papers one can determine not only the haplogroup, but also the geographic and ethnic background of the haplotype.
Of course, the more sequences are uploaded to GenBank by people who have ordered full-sequence mtDNA tests, the more haplotypes will be available for comparison for all participants, and the more detailed and accurate the tree will become.
The phylogenetic tree also documents the history of mankind, because the distances shown reflect the timescale of human migrations and when these splits took place. Just looking at the tree, one can clearly see the two movements out of Africa - macro-haplogroups M and N - and the lower-level clades show the history of settlement of various regions of the world.
Here are a few screenshots to show what this script does and how it works (click on the thumbnails to see the larger images):

NCBI Sequence Viewer page showing the added Blast query button
NCBI Blast TreeView showing a green line from the root of the tree to the haplotype that was queried
NCBI TreeView subtree highlighting the haplotype and showing its closest matches
NCBI TreeView alignment of a group of sequences, with links to the sequence pages at the left
To see how this script works, install it and then click on the following link to my own GenBank mtDNA haplotype sequence page, then click on the added Blast query button in the blue strip near the top of the page:
DQ377992 - Homo sapiens haplotype HV* mitochondrion, complete genome.
For any questions, suggestions, and bug reports, please contact me: email Ted Kandell