In the initial result table, it is possible to select multiple data sets from different populations or submitters. If more than one set is selected, the outcome of merging those data sets will depend on how the parameter "Merge Samples and Variations" is set.
A - Common Samples with Combined Variations
In this mode, genotype results will be displayed only if there is a set of individuals who are common to all population-submitter data sets selected. It is thus necessary to determine in advance whether such an overlap exists. One way to do this is to use the "Population" links in the "Select Population(s)" table.
Whether or not there is overlap also depends on whether any "Check to Select Only Unrelated Individuals" boxes are checked. When only unrelated individuals are selected, fewer individuals are examined for commonality. Clearing a checked box may therefore create an overlap.
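The overlap check described above can be sketched as a set intersection. This is only an illustration: the function name, the dictionary layout (individual ID mapped to an "unrelated" flag) and the sample IDs are assumptions, not part of the application.

```python
def common_individuals(datasets, unrelated_only):
    """Return the individuals present in every selected data set.

    datasets: list of dicts mapping individual ID -> True if that
        individual is flagged as unrelated (hypothetical layout).
    unrelated_only: list of booleans, one per data set, standing in for
        the "Check to Select Only Unrelated Individuals" boxes.
    """
    selected = []
    for individuals, only_unrelated in zip(datasets, unrelated_only):
        if only_unrelated:
            # Keep only the individuals flagged as unrelated.
            selected.append({ind for ind, flag in individuals.items() if flag})
        else:
            selected.append(set(individuals))
    return set.intersection(*selected) if selected else set()


# Made-up example data.
set_a = {"NA1": True, "NA2": False, "NA3": True}
set_b = {"NA2": True, "NA4": True}

both_checked = common_individuals([set_a, set_b], [True, True])
one_cleared = common_individuals([set_a, set_b], [False, True])
```

With both boxes checked the two example sets share no individuals; clearing the box for the first set brings NA2 back and creates an overlap.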
Our application looks at the sets of data and makes a list of SNPs that is the union of those in the sets. Each SNP is then examined for individuals in common (the intersection of those in all the sets). If there are none, the SNP is excluded from the results. The number of SNPs in the merged result will be no greater than the sum of the SNPs for all data sets, and could be as large as the number in the smallest set.
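A minimal sketch of mode A follows. It assumes each data set is a dictionary of the form SNP -> {sample: genotype}, and it reads "individuals in common" as the individuals that appear in every selected data set; both are assumptions for illustration, and the handling of disagreeing genotypes is left out.

```python
def samples_of(dataset):
    """All samples that have at least one genotype in the data set."""
    return set().union(*(calls.keys() for calls in dataset.values()))


def merge_mode_a(datasets):
    """Union of SNPs, restricted to samples common to all data sets."""
    if not datasets:
        return {}
    common = set.intersection(*(samples_of(ds) for ds in datasets))
    all_snps = set().union(*(ds.keys() for ds in datasets))
    merged = {}
    for snp in sorted(all_snps):
        calls = {}
        for ds in datasets:
            for sample, genotype in ds.get(snp, {}).items():
                if sample in common:
                    # First genotype seen wins; conflicts are ignored here.
                    calls.setdefault(sample, genotype)
        if calls:  # SNPs with no common individuals are excluded
            merged[snp] = calls
    return merged


# Made-up example data: only sample s2 is in both data sets.
ds1 = {"rs1": {"s1": "AA", "s2": "AG"}, "rs2": {"s1": "CC"}}
ds2 = {"rs1": {"s2": "AG", "s3": "GG"}, "rs3": {"s3": "TT"}}
result_a = merge_mode_a([ds1, ds2])
```

In this example rs2 and rs3 drop out because no common individual has a genotype for them, leaving only rs1 with sample s2.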
B - Combined Samples with Common Variations
In this mode, genotype results will be displayed only if there is a set of variations that are common to all population-submitter data sets selected. Our application looks at the sets of data and makes a list of samples that is the union of those in the sets. If a sample is shared by more than one submitter, the data will be merged.
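Mode B can be sketched in the same spirit: intersect the variations, then pool the samples. The data layout (SNP -> {sample: genotype}) and the function name are assumptions, and disagreeing genotypes are not handled in this sketch.

```python
def merge_mode_b(datasets):
    """Variations common to all data sets, with samples combined."""
    if not datasets:
        return {}
    common_snps = set.intersection(*(set(ds) for ds in datasets))
    merged = {}
    for snp in sorted(common_snps):
        calls = {}
        for ds in datasets:
            for sample, genotype in ds[snp].items():
                # A sample shared by several submitters is merged into
                # one entry; the first genotype seen is kept here.
                calls.setdefault(sample, genotype)
        merged[snp] = calls
    return merged


# Made-up example data: only rs1 is present in both data sets.
ds1 = {"rs1": {"s1": "AA"}, "rs2": {"s1": "CC"}}
ds2 = {"rs1": {"s2": "AG"}, "rs3": {"s2": "TT"}}
result_b = merge_mode_b([ds1, ds2])
```

Only rs1 survives, and it carries the samples from both data sets.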
C - Combined Samples with Combined Variations
In this mode, genotype results will be displayed for the union of variations and the union of samples from all selected data sets. If a genotype is common to more than one data set, the data will be merged.
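For mode C, a sketch might build a full variation-by-sample grid over both unions. The data layout (SNP -> {sample: genotype}) and the use of None for a genotype that no data set provides are assumptions made for illustration.

```python
def merge_mode_c(datasets):
    """Union of variations and union of samples, as a filled grid."""
    snps = sorted(set().union(*(ds.keys() for ds in datasets)))
    samples = sorted(
        set().union(*(calls.keys() for ds in datasets for calls in ds.values()))
    )
    # Start with an empty cell for every (variation, sample) pair.
    table = {snp: {sample: None for sample in samples} for snp in snps}
    for ds in datasets:
        for snp, calls in ds.items():
            for sample, genotype in calls.items():
                if table[snp][sample] is None:
                    table[snp][sample] = genotype
    return table


# Made-up example data with no shared variations or samples.
ds1 = {"rs1": {"s1": "AA"}}
ds2 = {"rs2": {"s2": "CT"}}
result_c = merge_mode_c([ds1, ds2])
```

Even with nothing shared between the two example sets, both variations and both samples appear in the result; the cells that no data set covers stay empty.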
In any of the "Merge Samples and Variations" modes listed above, if the genotypes from two different submissions disagree during the merge, the genotype is reported as xx and shown as a black square in the graphical output.
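The disagreement rule can be illustrated with a small helper. The function name is hypothetical, and comparing genotypes as exact strings is an assumption; the application may normalize genotypes differently before comparing them.

```python
CONFLICT = "xx"  # value reported when submissions disagree


def merge_genotype(calls):
    """Merge the genotypes reported for one sample at one variation.

    calls: list of genotype strings, one per submission.
    Returns the shared genotype, "xx" on disagreement, or None if
    no submission reported a genotype.
    """
    distinct = set(calls)
    if not distinct:
        return None
    if len(distinct) == 1:
        return distinct.pop()
    return CONFLICT


agree = merge_genotype(["AG", "AG"])
disagree = merge_genotype(["AG", "GG"])
```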
The frequency cutoff (selected in the "Allele Frequency Cutoff" set-parameters table cell) is applied after the genotypes are merged. Lowering this value may help to avoid a null set of genotypes.
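The order of operations matters here: allele frequencies are computed on the merged genotypes, not on each data set separately. The sketch below assumes two-letter genotype strings, biallelic variations, and a cutoff expressed as a minimum minor allele frequency given as a fraction; the function names are hypothetical.

```python
from collections import Counter


def minor_allele_frequency(calls):
    """Minor allele frequency for one variation after merging.

    calls: dict sample -> genotype string such as "AG"; missing (None)
    and conflicting ("xx") genotypes are skipped.
    """
    counts = Counter()
    for genotype in calls.values():
        if genotype is None or genotype == "xx":
            continue
        counts.update(genotype)  # counts each allele letter
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # no data, or only one allele observed
    return min(counts.values()) / total


def apply_cutoff(merged, cutoff):
    """Keep the variations whose minor allele frequency meets the cutoff."""
    return {
        snp: calls
        for snp, calls in merged.items()
        if minor_allele_frequency(calls) >= cutoff
    }


# Made-up merged genotypes.
merged = {
    "rs1": {"s1": "AA", "s2": "AG", "s3": "GG"},
    "rs2": {"s1": "CC", "s2": "CC", "s3": "xx"},
}
strict = apply_cutoff(merged, 0.05)
relaxed = apply_cutoff(merged, 0.0)
```

In this example rs2 shows only one allele after merging, so it passes only when the cutoff is lowered to zero.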
For tag SNP calculations, if B or C is selected, and multiple populations having individuals from different population groups are selected, GVS will automatically run the MultiPop-TagSelect algorithm, and there will be additional sections in the tag SNP result page.