GVS: Genome Variation Server 144
SNP Summary Table Values

SNP base:

location on the chromosome (hg38), 1-based

SNP rs ID:

dbSNP reference SNP identifier

Alleles:

the alternative bases, in order of increasing frequency

Minor Allele:

the allele with the lowest frequency

Minor-Allele Frequency (%):

the minor-allele frequency in percent
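
As an illustration of the three values above (Alleles, Minor Allele, Minor-Allele Frequency), here is a minimal Python sketch. The function name and the dictionary of allele counts are hypothetical, and the alphabetical tie-break for equal counts is an assumption; the GVS code itself is not shown here.

```python
def summarize_alleles(allele_counts):
    """Return (alleles by increasing frequency, minor allele, MAF in percent).

    allele_counts is a hypothetical input: a dict mapping each observed
    base to the number of times it was seen, e.g. {"A": 30, "G": 10}.
    """
    total = sum(allele_counts.values())
    if total == 0:
        raise ValueError("no alleles observed")
    # Order alleles from least to most frequent; ties are broken
    # alphabetically (an assumption, to keep the result deterministic).
    ordered = sorted(allele_counts, key=lambda a: (allele_counts[a], a))
    minor = ordered[0]
    maf_percent = 100.0 * allele_counts[minor] / total
    return ordered, minor, maf_percent


alleles, minor, maf = summarize_alleles({"A": 30, "G": 10})
print(alleles, minor, maf)  # ['G', 'A'] G 25.0
```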

Heterozygosity:

the expected fraction of heterozygotes if the population is in Hardy-Weinberg equilibrium, calculated from the minor allele frequency q: 2q(1-q)
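
The formula 2q(1-q) can be sketched in Python as follows. The function name is hypothetical, and q is assumed to be a fraction between 0 and 0.5 rather than the percentage shown in the Minor-Allele Frequency column.

```python
def expected_heterozygosity(q):
    """Expected heterozygote fraction under Hardy-Weinberg: 2q(1-q).

    q is the minor allele frequency as a fraction (0 to 0.5), not a percent.
    """
    if not 0.0 <= q <= 0.5:
        raise ValueError("minor allele frequency must be between 0 and 0.5")
    return 2.0 * q * (1.0 - q)


# A minor allele frequency of 25% gives 2 * 0.25 * 0.75 = 0.375.
print(expected_heterozygosity(0.25))  # 0.375
```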

Hardy-Weinberg Chi-Square:

obtained by summing 3 terms (common homozygous, heterozygous, and rare homozygous), where each term is calculated from the number of individuals in one of the three classes:
(observed number - expected number)^2 / expected number
where the observed numbers are just the genotype counts, and the expected numbers are the Hardy-Weinberg values p^2 N (common homozygotes), 2pqN (heterozygotes), and q^2 N (rare homozygotes), where p is the major allele frequency, q is the minor allele frequency, and N is the number of individuals; p+q=1
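
The calculation described above can be sketched in Python as follows. The function name is hypothetical. Two details are assumptions rather than statements from this page: q is estimated from the genotype counts themselves, and a class whose expected number is zero (a monomorphic site) is skipped to avoid dividing by zero.

```python
def hardy_weinberg_chi_square(n_common_hom, n_het, n_rare_hom):
    """Sum (observed - expected)^2 / expected over the three genotype classes."""
    n = n_common_hom + n_het + n_rare_hom
    if n == 0:
        raise ValueError("no genotyped individuals")
    # Assumption: the minor allele frequency is estimated from the counts.
    q = (2 * n_rare_hom + n_het) / (2 * n)
    p = 1.0 - q
    observed = (n_common_hom, n_het, n_rare_hom)
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    chi_square = 0.0
    for obs, exp in zip(observed, expected):
        if exp > 0:  # assumption: skip classes with an expected number of zero
            chi_square += (obs - exp) ** 2 / exp
    return chi_square


# 30 / 40 / 30 individuals: q = 0.5, expected 25 / 50 / 25,
# so the sum is 25/25 + 100/50 + 25/25 = 4.
print(hardy_weinberg_chi_square(30, 40, 30))  # 4.0
```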

Genes:

one or more genes for which the SNP is in the transcribed region

Function:

If the SNP has been given a function by dbSNP, that classification is used and "(dbSNP)" is added to the text:
stop-gained or stop-lost (within an exon and translated, non-stop codon changed to stop codon or stop codon changed to non-stop codon)
frameshift-variant (within an exon and translated, insertion or deletion interrupts the reading frame)
cds-indel (within an exon and translated, insertion or deletion keeps the reading frame)
missense (within an exon and translated, protein amino acid change, but not nonsense or frameshift)
splice-donor-variant (two locations at the 5' end of an intron)
splice-acceptor-variant (two locations at the 3' end of an intron)
synonymous-codon (within an exon and translated, no protein amino acid change)
utr-variant-5-prime (within an exon, but not translated, 5' end of the gene)
utr-variant-3-prime (within an exon, but not translated, 3' end of the gene)
upstream-variant-2KB (upstream of the gene)
downstream-variant-500B (downstream of the gene)
intron-variant (between exons)
nc-transcript-variant (transcript variant of a non-coding RNA gene)

If the SNP has not been given a function by dbSNP, the SNP is classified according to the location of the gene and its transcription and coding boundaries (see the list under "GVS Function" below), and "(GVS)" is added to the text.

In both cases, a given SNP can have more than one function if two or more genes overlap or if there is alternative splicing; only one function is reported, the one that ranks highest in the relevant list. This dbSNP function (or its GVS substitute, if no dbSNP function is available) is the one used to color-code the SNPs in various places.
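
The "highest in the list" rule can be sketched in Python as follows, using the dbSNP function names listed above. The function and constant names are hypothetical. Placing stop-gained ahead of stop-lost is an assumption, since the two share one line in the list.

```python
# Order copied from the dbSNP function list above, highest rank first.
# Assumption: stop-gained ranks just ahead of stop-lost.
DBSNP_FUNCTION_ORDER = [
    "stop-gained", "stop-lost", "frameshift-variant", "cds-indel",
    "missense", "splice-donor-variant", "splice-acceptor-variant",
    "synonymous-codon", "utr-variant-5-prime", "utr-variant-3-prime",
    "upstream-variant-2KB", "downstream-variant-500B",
    "intron-variant", "nc-transcript-variant",
]


def highest_ranked_function(functions, order=DBSNP_FUNCTION_ORDER):
    """Pick the one function to report when a SNP has several."""
    rank = {name: i for i, name in enumerate(order)}
    known = [f for f in functions if f in rank]
    if not known:
        return None  # none of the names appear in the list
    return min(known, key=rank.__getitem__)


# A SNP that is intronic in one transcript and missense in another:
print(highest_ranked_function(["intron-variant", "missense"]))  # missense
```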

Conservation Score GERP:

the rejected-substitution score from the program GERP, a number between -12.3 and 6.17 that describes the degree of sequence conservation among many mammalian species, with 6.17 being the most conserved (see this manuscript and this website)

Chimp Allele:

Chimp alleles are acquired from the UCSC human/chimp alignment file hg38.panTro4.net.axt. If the variation does not fall within an alignment block, or if it is an indel, the chimp allele is listed as "unknown". If the variation falls within a gap in the alignment, it is listed as "-". (Note that we do not use the chimp alleles from dbSNP, though ours are the same in most cases.)

Submitter IDs (only available if "Text" or "Custom-Text" is selected):

one or more SNP identifiers, as assigned by the submitters to dbSNP (comma separated); for now, the list includes all submissions to dbSNP, not just those of the population/submitter combination chosen in the search

Genotyping Chip Or Assay Availability (case of "Table/Image"):

In limited cases, dbSNP assigned multiple rs IDs to the same SNPs. If there is an "alternate id" listed for a chip in the Genotyping Chip Availability column, it is an ID representing the same SNP as the rs ID listed in the "SNP rs ID" column. This alternate ID should be used to access the chip information from the corresponding company.

Whether the variation is on one or more whole-genome genotyping chips; the chips are as follows:
Affymetrix Genome-Wide Human SNP Array 6.0
Illumina Human610-Quad BeadChip
Illumina Human1M BeadChip (1 million)
Illumina OmniExpress

GenotypingChipIDs (case of "Text" or "Custom-Text"):

In limited cases, dbSNP assigned multiple rs IDs to the same SNPs. If there is an "alternate-id" attached to a chip in the GenotypingChipIDs column, it is an ID representing the same SNP as the rs ID listed in the "rsID" column. This alternate ID should be used to access the chip information from the corresponding company.

Identifiers of SNPs that are on one or more whole-genome genotyping chips (comma separated); the chips are as follows:
GVS identifier    chip
A9                Affymetrix Genome-Wide Human SNP Array 6.0
I6Q               Illumina Human610-Quad BeadChip
I10               Illumina Human1M BeadChip (1 million)
I7                Illumina OmniExpress

RepeatMasker:

whether the SNP is in a repeat region; the regions, as identified by the RepeatMasker program, were downloaded in the file hg38.fa.out from the UCSC Genome site.

Tandem Repeats Finder:

whether the SNP is in a repeat region; the regions, as identified by the Tandem Repeats Finder program and filtered to keep repeats with a period of 12 or less, were downloaded in the file hg38.trf.bed from the UCSC Genome site.
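
A repeat-region check of this kind can be sketched in Python as follows. The function name and the (chromosome, start, end, period) tuple layout are hypothetical and do not reflect the column layout of hg38.trf.bed. The sketch assumes the usual BED convention of 0-based, half-open intervals, while SNP positions are 1-based as stated under "SNP base".

```python
def in_tandem_repeat(chrom, pos_1based, repeats, max_period=12):
    """Return True if a 1-based position lies inside a kept repeat.

    repeats is a hypothetical list of (chrom, start0, end0, period) tuples,
    with start0/end0 following the BED convention (0-based, end excluded).
    """
    for r_chrom, start0, end0, period in repeats:
        if period > max_period:
            continue  # filtered out: period longer than 12
        # 1-based position p is 0-based p - 1, so it is inside
        # [start0, end0) exactly when start0 < p <= end0.
        if r_chrom == chrom and start0 < pos_1based <= end0:
            return True
    return False


repeats = [("chr1", 100, 110, 2), ("chr1", 200, 260, 30)]
print(in_tandem_repeat("chr1", 101, repeats))  # True
print(in_tandem_repeat("chr1", 205, repeats))  # False (period 30 is filtered)
```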

GVS Function (only available if "Table/Image" or "Custom-Text" is selected):

similar to Function above, but these functions are calculated locally; in general the two will agree; the GVS functions are calculated in advance and stored in the database; they are based on the alleles for all populations and individuals

the SNP is classified according to the location of the gene and its transcription and coding boundaries, and the bases in the coding region are divided into codons (if the number of coding bases is a multiple of 3):
stop-gained or stop-lost (within an exon and translated, codon change to or from a stop codon)
coding-indel (within an exon and translated, variation is an indel, and no attempt is made to choose frameshift or not)
missense (within an exon and translated, protein amino acid change)
splice-5 or splice-3 (in first two bases or last two bases of an intron)
coding-synonymous (within an exon and translated, no protein amino acid change)
coding-notMod3 (within an exon and translated, number of coding bases is not a multiple of 3, and no attempt is made to rate as synonymous or not)
coding-monomorphic (within an exon and translated, all genotypes in the database are the same, and no attempt is made to rate as synonymous or not)
non-coding-exon (SNV or indel in an exon of a non-coding gene, accession beginning with NR_)
utr-5 or utr-3 (within an exon, but not translated)
near-gene-5 or near-gene-3 (within 2000 bases of an exon, upstream or downstream of a gene)
intron (between exons)
intergenic (between genes)

a given SNP can have more than one function if two or more genes overlap or if there is alternative splicing; only one function is reported, the one that ranks highest in the list above

any of the SNP coding functions may be augmented by "-near-splice" if the SNP is in the first two or last two positions of an exon
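
The "-near-splice" rule can be sketched in Python as follows. The function name and the 1-based, inclusive exon coordinates are assumptions made for illustration, and the caller is expected to pass only a coding function name.

```python
def add_near_splice(function_name, pos, exon_start, exon_end):
    """Append "-near-splice" if pos is in the first or last two exon positions.

    exon_start and exon_end are assumed to be 1-based and inclusive.
    """
    in_exon = exon_start <= pos <= exon_end
    near_edge = in_exon and (pos - exon_start < 2 or exon_end - pos < 2)
    return function_name + "-near-splice" if near_edge else function_name


# Exon spanning positions 100-200: 100, 101, 199, and 200 are near a splice site.
print(add_near_splice("missense", 101, 100, 200))  # missense-near-splice
print(add_near_splice("missense", 102, 100, 200))  # missense
```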

Upstream Flank and Downstream Flank:

Sequence upstream and downstream of a variation (not including the variation), taken from the UCSC hg38 genome sequence; upstream and downstream are relative to the genome assembly, not to the strand of any gene present; no flanks are available for indels at this time (they are listed as "NA"); if "Table/Image" is selected, 25 bases on each side are listed; if "Text" is selected, 100 bases.
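
Flank extraction for a single-base variation can be sketched in Python as follows. The function name is hypothetical, the chromosome sequence is assumed to be available as an in-memory string, and flanks are simply truncated at the ends of the sequence (an assumption).

```python
def flanks(sequence, pos_1based, flank_len=25):
    """Return (upstream, downstream) flanks, excluding the variant base.

    Upstream and downstream are relative to the assembly, so upstream is
    simply the bases at lower coordinates. Use flank_len=25 for the
    "Table/Image" case and flank_len=100 for "Text".
    """
    idx = pos_1based - 1  # convert the 1-based position to a string index
    if not 0 <= idx < len(sequence):
        raise ValueError("position outside the sequence")
    upstream = sequence[max(0, idx - flank_len):idx]
    downstream = sequence[idx + 1:idx + 1 + flank_len]
    return upstream, downstream


# Position 5 of ACGTACGTAC is the second "A"; take 2 bases on each side.
print(flanks("ACGTACGTAC", 5, flank_len=2))  # ('GT', 'CG')
```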

NumberAlleles, NumberMajorAlleles, NumberMinorAlleles: (only available if "Custom-Text" is selected)

the number of alleles measured for the individuals queried (not counting missing data)
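
These three counts can be sketched in Python as follows. The function name and the genotype representation (a pair of bases per individual, with None for missing data) are hypothetical, and reading NumberMajorAlleles and NumberMinorAlleles as the counts of the most and least frequent allele is an assumption.

```python
from collections import Counter


def allele_number_summary(genotypes):
    """Return (NumberAlleles, NumberMajorAlleles, NumberMinorAlleles).

    genotypes is a hypothetical list with one entry per individual:
    a pair of bases such as ("A", "G"), or None for missing data.
    """
    counts = Counter()
    for genotype in genotypes:
        if genotype is None:
            continue  # missing data is not counted
        counts.update(genotype)
    if not counts:
        return 0, 0, 0
    ranked = counts.most_common()  # most frequent allele first
    number_major = ranked[0][1]
    number_minor = ranked[-1][1] if len(ranked) > 1 else 0
    return sum(counts.values()), number_major, number_minor


genotypes = [("A", "A"), ("A", "G"), None, ("G", "G"), ("A", "A")]
print(allele_number_summary(genotypes))  # (8, 5, 3)
```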
 