Batch Genome Variation Server 144

Batch GVS Build Notes

The current GVS Batch version is 11.00, November 1, 2015.

The variation locations are again mapped to the human genome reference sequence of December 2013 (UCSC hg38, NCBI build 38). The dbSNP build is 144.

Build notes for 10.00, January 30, 2015.

The dbSNP build is now 141, and the variation locations are mapped to the human genome reference sequence of December 2013 (UCSC hg38, NCBI build 38).

Two annotations have been discontinued: the UCSC phastCons conservation scores and regions of copy number variation.

Build notes for 9.00, December 24, 2013.

The dbSNP build is now 138. The gene model, copy number variations, and chimp alleles have been updated. The variation locations are still mapped to the human genome reference sequence of February 2009 (UCSC hg19, NCBI build 37).

Build notes for 8.00, December 13, 2012.

The dbSNP build is now 137. This build has an improved set of functions. Our GVS functions are unchanged. The gene model has been updated. The variation locations are still mapped to the human genome reference sequence of February 2009 (UCSC hg19, NCBI build 37).

Build notes for 7.00, December 16, 2011.

The dbSNP build is now 134. The gene models, GERP scores, chimp alleles, and copy number variations have been updated. The variation locations are still mapped to the human genome reference sequence of February 2009 (UCSC hg19, NCBI build 37).

The list of GVS functions has been augmented to add the string "-near-splice" if the variation is in the first two or last two positions in an exon. If you are parsing text files, and want to pick out missense SNPs, for example, it will be necessary to use "contains" instead of "equals" for the strings. The "nonsense" classification has been replaced by "stop-gained" and "stop-lost".
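As a rough illustration of the "contains" advice above, a parser can match function strings by substring rather than by equality. The sketch below is in Python; the helper name has_function and the sample strings are assumptions made for illustration, and the exact form of the suffixed strings in your own result files may differ.

```python
def has_function(gvs_function: str, wanted: str) -> bool:
    """Return True if the wanted term appears anywhere in the function string."""
    return wanted in gvs_function


# Hypothetical sample values; "-near-splice" is the suffix described above.
samples = ["missense", "missense-near-splice", "intron", "stop-gained"]

# An equality test would miss "missense-near-splice"; a substring test keeps it.
missense = [s for s in samples if has_function(s, "missense")]
print(missense)
```

A substring test will also match any longer string that happens to contain the term, so it is worth checking the distinct function values in a file before relying on it.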

The list of genotyping chips has not yet been updated (though some older chips were deleted).

For more details, see the How-to-Use page and the links at the bottom of that page.

Build notes for 6.00, June 14, 2010.

Data for dbSNP build 131 are now served out. The variation locations are now mapped to the human genome reference sequence of February 2009 (UCSC hg19, NCBI build 37).

The default for the "includeHapMap3" parameter is now true. HapMap 3 data will be included unless this parameter is set to false. To compensate for uneven coverage of SNPs between HapMap 1/2 and HapMap3, the default for "coverageTagSNPs" has been reduced to 14, and that for "coverageClustering" has been reduced to 12. If you are selecting tagSNPs, but not using a mixture of HapMap 1/2 and HapMap3, you may want to set these values to the previous 85 and 70. (These thresholds determine when a SNP with missing data will be put into a separate tag SNP bin.)
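For users who want the previous thresholds, the parameter lines could be generated as in the Python sketch below. It assumes that coverageTagSNPs and coverageClustering are set with the same "# parameter value" line format that these notes show for includeHapMap3; check the How-to-Use page before relying on that. The helper parameter_line is hypothetical.

```python
def parameter_line(name: str, value) -> str:
    """Format one parameter line as '# name value' (format assumed, see lead-in)."""
    if isinstance(value, bool):
        value = "true" if value else "false"
    return f"# {name} {value}"


# Restore the earlier thresholds when HapMap 1/2 and HapMap 3 are not mixed.
lines = [
    parameter_line("includeHapMap3", False),
    parameter_line("coverageTagSNPs", 85),
    parameter_line("coverageClustering", 70),
]
print("\n".join(lines))
```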

Non-synonymous SNPs are now labeled "missense".

dbSNP build 131 has no splice-site function calls. If we detect a splice-site, we annotate it as splice-5 or splice-3, overriding the dbSNP intron call.

The UCSC phastCons conservation scores are now those for 46 placental mammals.

Build notes for 5.15, March 15, 2010.

Conservation scores from the program GERP are now available. If you don't specify an annotation list, there will be an additional annotation column in the output file. The original UCSC conservation scores are still requested with "ConservationScore" in the submitted file, but the returned file will have that column labeled "ConservationScorePhast". See the GVS site for documentation on the ConservationScoreGERP column.

Build notes for 5.14, January 21, 2010.

A decimal place has been added to minor allele frequencies (e.g. 5 may become 4.8, in percent). The searchType chipID is working again. Filenames for the returned files are back to using time stamps, though protections are in place to avoid duplicates. Internally, there has been a major upgrade to the application server software (to JBoss 5).

Build notes for 5.13, October 6, 2009.

The output for the searchType snpListForLD has been improved for submission of SNPs on more than one chromosome. The SNP location columns now display chromosome as well as position within the chromosome.

Build notes for 5.12, September 21, 2009.

A list of SNPs can be submitted, and r2 can be calculated for every pair in the list (see snpListForLD on the How-to-Use page).

There is now a way to automate the exchange of submitted file and result file with a screen-scraper program (see autoFile on the How-to-Use page).

Build notes for 5.11, September 8, 2009.

HapMap frequencies for each HapMap SNP are now in our database, so that searchType chipID requests are several times faster.

Build notes for 5.10, April 22, 2009.

By analogy with the interactive GVS site, merge mode B has been added. Previously, if more than one population was requested, the merge mode was always C. There is now a "merge" parameter that can be set to A, B, or C (C being the default). A is common-individuals-with-combined-variations, B is combined-individuals-with-common-variations, and C is combined-individuals-with-combined-variations.
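The three merge modes can be pictured with set operations. The Python sketch below is a toy model only: it treats each requested population as a pair of sets (individuals, variations), which is an assumption made for illustration, and it does not reproduce the server's own merge logic.

```python
def merge(datasets, mode="C"):
    """Merge (individuals, variations) pairs of sets according to mode A, B, or C."""
    individuals = [d[0] for d in datasets]
    variations = [d[1] for d in datasets]

    def combined(sets):
        return set().union(*sets)

    def common(sets):
        return set.intersection(*sets)

    if mode == "A":  # common individuals, combined variations
        return common(individuals), combined(variations)
    if mode == "B":  # combined individuals, common variations
        return combined(individuals), common(variations)
    if mode == "C":  # combined individuals, combined variations (the default)
        return combined(individuals), combined(variations)
    raise ValueError(f"unknown merge mode: {mode!r}")


# Hypothetical data: two populations sharing one individual and one variation.
pop1 = ({"ind1", "ind2"}, {"rs1", "rs2"})
pop2 = ({"ind2", "ind3"}, {"rs2", "rs3"})
for m in "ABC":
    print(m, merge([pop1, pop2], m))
```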

For the "includeHapMap3" parameter, there is now a third choice: "only". Setting "includeHapMap3" to "only" results in selection of SNPs only if they are covered in HapMap phase 3.

Build notes for 5.09, April 21, 2009.

Conflict genotypes have been changed to "?" (unknown) for fastPHASE calculations.

Build notes for 5.08, April 20, 2009.

Emails are now being sent by user snpserve. Please contact us if you have trouble getting an email with a download link.

Build notes for 5.07, March 6, 2009.

The cancel-job link works for chip annotation requests now.

Build notes for 5.06, March 2, 2009.

The default setting for the "includeHapMap3" parameter is now false. To include HapMap 3 data, it is now necessary to set this parameter to true (add the line "# includeHapMap3 true" to your file). If you submit a list of populations that includes HapMap-3-only populations, it is still necessary to add this line.

Build notes for 5.05, February 23, 2009.

For ensuring that result files have unique names, the identifier has been changed to a Java message identifier plus a short time stamp.

Build notes for 5.04, February 2, 2009.

There have been a few unexplained failures to send emails. The server will now try 5 times at 1-minute intervals before giving up. The documentation lists of individuals and populations were updated to include HapMap phase 3 entries.
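The email retry behavior described above (up to 5 tries, 1 minute apart) follows a common pattern, sketched here in Python. This is not the server's code; the function send_with_retries and its arguments are hypothetical, and the sleep function is passed in only so the wait can be skipped in tests.

```python
import time


def send_with_retries(send, attempts=5, delay_seconds=60, sleep=time.sleep):
    """Call send() until it succeeds or the attempts run out.

    Waits delay_seconds between tries; returns True on success, False otherwise.
    """
    for attempt in range(1, attempts + 1):
        try:
            send()
            return True
        except Exception:
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                sleep(delay_seconds)
    return False
```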

Build notes for 5.03, December 8, 2008.

For tag SNP calculations, there is a new parameter bracketLowCoverageSNPs for requesting that brackets be placed around the rs ID for SNPs that have coverage below the coverageTagSNPs value.

Build notes for 5.02, November 24, 2008.

Problems in identifying unrelated individuals for HapMap 3 data have been fixed.

Build notes for 5.01, November 14, 2008.

HapMap phase 3 genotypes have been available since about Nov. 10. See the GVS data sources documentation. For populations CEU, CHB, JPT, and YRI the results have been merged with earlier HapMap data. If you wish to suppress HapMap 3 genotypes, add a line "# includeHapMap3 false" to your file.

Further rare cases were identified for the situation described in the build note for September 11, 2008 below, and the code was corrected to accommodate them. In addition, the binning process to extract tag SNPs (ldSelect) was improved slightly to make a better choice when the search for the largest bin resulted in a tie. The results should now be independent of the order of SNPs presented to the algorithm.

Build notes for 5.00, September 17, 2008.

The database has been upgraded to dbSNP build 129 (June 2008). The dbSNP function list is now nonsense, frameshift, missense, splice-5, splice-3, coding-synonymous, intron, utr-5, utr-3, near-gene-5, and near-gene-3. We retain the term coding-nonsynonymous for missense, though the set of coding-nonsynonymous SNPs no longer contains nonsense and frameshift SNPs.

Only version 2 of the Illumina HumanHap300 BeadChip is in the chip set now.

Previous Build Notes

For the September 11, 2008 build, a bug that affected the linkage disequilibrium calculation of r2 was fixed. This bug affected the rare instance when the minor-allele frequencies of two SNPs were each very close to 50%, there were no individuals heterozygous for one SNP but not for the other (red-blue or red-yellow in a visual genotype graph), and no individuals homozygous-common (blue-blue) or homozygous-rare (yellow-yellow) for both SNPs. In these cases, the r2 value is now 1.0 rather than zero.
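To see why 1.0 is the sensible answer in that case, the Python sketch below computes r2 as the squared Pearson correlation of minor-allele counts per individual. This is a common genotype-based approximation chosen for illustration; these notes do not say how GVS itself computes r2, so treat the method and the sample data as assumptions.

```python
def r2_from_dosages(x, y):
    """Squared Pearson correlation of two lists of minor-allele counts (0, 1, 2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    if var_x == 0 or var_y == 0:
        return None  # undefined when either SNP is monomorphic
    return cov * cov / (var_x * var_y)


# Hypothetical individuals matching the case above: both SNPs at 50% frequency,
# every heterozygote is heterozygous at both SNPs, and every homozygote is
# common at one SNP and rare at the other.
snp1 = [2, 0, 1, 1]
snp2 = [0, 2, 1, 1]
print(r2_from_dosages(snp1, snp2))
```

The two SNPs are perfectly (negatively) correlated here, so a value of zero would misreport complete linkage disequilibrium.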

For the May 9, 2008 build, a new database with dbSNP build 128 is in place.

With the May 5, 2008 build, it is now possible to do chromosome one-base queries with chr*:base.

fastPHASE calculations were made available in the March 28, 2008 build.

For the December 24, 2007 build, the annotation parameter, if present, is recognized for displayType snpSummary, as well as tagSNPs and r2LD. Three new annotation columns have been added to display the raw number of alleles in the set: NumberAlleles, NumberMajorAlleles, NumberMinorAlleles. There has been a change in the assignment of the minor allele for the rare triallelic case: the minor allele is no longer the least frequent allele, but the second most frequent allele (that is, the more frequent of the two minor alleles).
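The minor-allele rule for the three-allele case can be sketched in Python as below. The function major_minor and the sample allele counts are hypothetical, and tie-breaking between equally frequent alleles is not specified in these notes; here ties simply fall back to the order in which alleles first appear.

```python
from collections import Counter


def major_minor(alleles):
    """Return (major, minor), where minor is the second most frequent allele."""
    ranked = Counter(alleles).most_common()
    major = ranked[0][0]
    minor = ranked[1][0] if len(ranked) > 1 else None  # None if monomorphic
    return major, minor


# Hypothetical three-allele SNP: A x10, G x5, T x1.
alleles = ["A"] * 10 + ["G"] * 5 + ["T"]
print(major_minor(alleles))
```

Under the earlier rule the least frequent allele (T in this sample) would have been reported as the minor allele.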

For the December 14, 2007 build, there is a new parameter returnTarballWithEmailMessage for downloading a tarball that includes the email message.

For the December 6, 2007 build, the chimp allele has been added for snpSummary data; this puts an extra column in the middle of the output.

For the December 5, 2007 build, the per-file limits for large tagSNPs jobs have been reduced from 100,000 to 60,000 without multipop and 30,000 with multipop.

For the October 16, 2007 build, copy number variations have been added to the SNP annotation list. A bug was fixed: previously, when tag SNPs were requested, and the frequency cutoff was not specified, no results were returned; now the frequency cutoff will default to 0 when not specified, and there will be a result file. There is a new searchType: chipID.

In the September 27, 2007 build, CopyNumberVariation has been added to the annotation list. There is now an additional copy-number column in the SNP summary output, just before the flank columns.

As of the September 11, 2007 build, the chipFilter parameter can be used for the snpSummary displayType.

For the August 7, 2007 build, the speed and memory footprint of the r2LD display have been improved. This does not affect usage. The Illumina HumanHap300 version 2 chip was also added.

Note prior to August 7, 2007: There is a new noMonomorphic parameter. The default is "true", so that monomorphic (single-allele) SNPs will be excluded unless this parameter is set to false. The SNP positions now include the chromosome number separated from the base by a colon. The freqCutoff parameter is now used by all display types. The frequency and monomorphic filtering has been changed to match the recently improved handling on the GVS site (filters applied after the genotypes have been sorted into population groups). As a result, the multipop parameter now has a default value of true, and multiple populations can be submitted with this parameter either true or false.
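If you parse result files, the colon-separated SNP position described above can be split as in this Python sketch. The function parse_position is hypothetical, and the tolerance for a leading "chr" is an assumption; these notes say only that the chromosome and base are separated by a colon.

```python
def parse_position(text):
    """Split 'chromosome:base' into (chromosome, base as an integer)."""
    chrom, base = text.strip().split(":", 1)
    # Assumption: accept an optional "chr" prefix and drop it.
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return chrom, int(base)


# Hypothetical example values.
print(parse_position("7:117559590"))
print(parse_position("chrX:100"))
```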
