1. Data inputs
This document describes the input format required by SVA version 1.1 and onwards. For older versions of SVA, where the pileup format was supported, see here.
SVA users need to prepare three (3) types of input files for an SVA project.
In addition, there is an optional pedinf file for an SVA project. This file lists the subjects in a linkage format. This file is not necessary for SVA annotation tasks, but is necessary for some SVA analysis and exporting functions.
I will assume that SVA users are already familiar with next-generation sequencing data pipelines, particularly those using BWA/SAMtools. The file name extensions in the above box are only there so that SVA can conveniently recognize the respective formats. Although we use BWA/SAMtools ourselves, the file extensions do not mean that SVA only takes output from SAMtools. SVA does not distinguish which software generated the alignment results, as long as the format is in accordance with the description below.
There is another important note:
The basic data generation flow described below is based on our own experience and is provided for your reference. You may choose to use different parameter settings.
Step 1. Generating mpileup file
We used SAMtools to generate the mpileup file:
There is an important note regarding the chromosome designations, which will affect the data generation steps that follow.
Step 2. Generating variant file in vcf format
We used SAMtools/bcftools to generate the variant file (please note that this is a basic example; your actual parameters may vary):
(Optional) Step 3. Generate SV file .events
We used a separate program (ERDS) to generate the SV file. Please refer to its web pages for the user guide.
Here is an example of the generated .events file:
The columns are: chromosome name, start coordinate, end coordinate, SV status (diploid=2), LOD score.
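As a rough illustration of the five-column layout described above, the following Python sketch reads one line of an .events file into a named record. It is not part of SVA or ERDS. The names `SvEvent` and `parse_events_line` are hypothetical, the columns are assumed to be separated by whitespace, and the SV status is kept as text because its exact encoding is not specified here.

```python
from typing import NamedTuple


class SvEvent(NamedTuple):
    """One record of an .events file (hypothetical helper type)."""
    chrom: str    # chromosome name
    start: int    # start coordinate
    end: int      # end coordinate
    status: str   # SV status, kept as text (diploid is written as 2)
    lod: float    # LOD score


def parse_events_line(line: str) -> SvEvent:
    """Split one whitespace-separated line into its five columns.

    Assumption: columns are separated by tabs or spaces.
    """
    fields = line.split()
    if len(fields) != 5:
        raise ValueError(f"expected 5 columns, found {len(fields)}: {line!r}")
    chrom, start, end, status, lod = fields
    return SvEvent(chrom, int(start), int(end), status, float(lod))
```

A check like this can be run over a generated .events file before it is added to a project, so that malformed lines are caught early.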
Step 4. Generate coverage and quality score file .bco
We used a simple Java program, vcf2bco.jar (download it here), to generate the chromosome-wise .bco file from the base-wise vcf file generated using SAMtools/bcftools.
Note: This small Java program (vcf2bco.jar) accepts a pileup file whose chromosome designations (column 1) are integers from 1 to 22, or X, Y, or M. For example, vcf2bco accepts "16" but not "chr16".
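If your alignment pipeline writes names such as "chr16", the designations need to be converted before running vcf2bco. The Python sketch below shows one way to check and convert a single name. It is only an illustration and not part of SVA; the function name `normalize_chrom` is hypothetical, and the rule of simply stripping a leading "chr" is an assumption about how your reference names are written.

```python
# Designations accepted by vcf2bco according to the note above.
VALID_CHROMS = {str(n) for n in range(1, 23)} | {"X", "Y", "M"}


def normalize_chrom(name: str) -> str:
    """Return the designation without a leading 'chr', e.g. 'chr16' -> '16'.

    Raises ValueError for names outside 1-22, X, Y, M
    (for example unplaced contigs).
    """
    stripped = name[3:] if name.lower().startswith("chr") else name
    stripped = stripped.upper()
    if stripped not in VALID_CHROMS:
        raise ValueError(f"unsupported chromosome designation: {name!r}")
    return stripped
```

Applying the function to column 1 of each line, and skipping or reporting the lines that raise an error, gives input in the form the note describes.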
After you generate these four types of files (with step 3 as optional), you may proceed to create your project.
© 2011 Dongliang Ge, PhD.