Getting started with VAP is easy. If you are working on a Linux, Mac, or Windows computer, you only need to download the complete VAP package, which already contains everything required. The package also includes test files so you can try VAP right away (see below for instructions). The individual modules vap_core and vap_interface can optionally be downloaded separately.
What is VAP?
VAP stands for Versatile Aggregate Profiler. It is an intuitive stand-alone tool designed to analyze very high volumes of genomic experimental data on laptop computers and to generate aggregate or individual profiles over genomic regions of interest.
What is required to run VAP?
Java JRE 7.0 update 9 or higher is required to run the interface.
No dependencies are required to run the core module (compiled statically).
How do I download VAP?
How do I install VAP?
You don't have to install anything. The complete VAP tool should run directly on any supported OS (Linux, Mac, and Windows) where Java JRE 7.0 is present.
How do I start the interface?
Double-clicking on the jar file should usually open the interface. Alternatively, type in a terminal or command prompt:
java -jar vap-1.0.0-all.jar
How can I test VAP?
- Click on the "Test VAP" menu of the interface, then "Load test files".
- Select the output folder in the "Output selection" section.
- Click on the "Run" button at the bottom left of the interface.
- The results of the analysis will be generated in the output folder, including a file called "test_graph_all.png" showing the aggregate profiles.
See "Test instructions" under the "Test VAP" menu of the interface for more details.
How does VAP work?
When you press "Run", the interface validates the parameters, creates the file "VAP_parameters.txt" in the output folder, copies and runs the appropriate packaged binary to generate the aggregate/individual profiles of the datasets over the reference groups, and finally creates the png graphs. The parameters file can be opened and modified using any text editor.
How can I run the core module?
Usually the interface runs the vap_core module for you. Alternatively, type in a terminal or command prompt:
./vap_core -p vap_param.txt
When I run vap_core the images are not created, why?
The images are created by the interface, not by vap_core. They can be created separately using the "Only create graphs" tab of the interface or by typing in a terminal:
java -jar vap_vapinterface create_graphs
What is vap_native?
To allow users to benefit from the interface even when they have to compile the core code on their own computer, the interface first looks for a binary named “vap_native” (“vap_native.exe” on Windows) in the output directory. If this binary is present, the interface uses it instead of one of the executables from the package.
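The lookup described above can be sketched as follows. This is only an illustration of the fallback logic, not the interface's actual code: the function name select_core_binary, its parameters, and the example paths are hypothetical.

```python
import os
import platform


def select_core_binary(output_dir, packaged_binary, system=None):
    """Return the core binary to run (illustrative sketch only).

    output_dir      -- folder selected as the output folder
    packaged_binary -- path to the executable shipped in the package (assumed)
    system          -- OS name; defaults to the current platform
    """
    system = system or platform.system()
    # The source states the file is named "vap_native.exe" on Windows.
    name = "vap_native.exe" if system == "Windows" else "vap_native"
    candidate = os.path.join(output_dir, name)
    # A user-compiled binary in the output directory takes precedence.
    if os.path.isfile(candidate):
        return candidate
    return packaged_binary
```

In other words, dropping your own compiled binary, renamed to vap_native, into the output folder is enough for the interface to pick it up.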
How do I compile VAP?
The complete VAP package is composed of the vap_interface and vap_core modules already compiled. You usually don't need to compile anything, but see below if it is required for you.
What is required to compile VAP?
JDK 7.0 update 9 or higher and Maven 3.0 or higher are required to compile the interface. A JAVA_HOME environment variable must point to the folder where JDK 7.0 is installed.
G++ 4.2 or higher is required to compile the core module.
How do I compile vap_core?
See the INSTALL file in the vap_core repository.
How do I compile vap_interface?
In the vap_interface folder, run:
The compiled binaries should be in the target folder of vap_interface.
Is VAP portable?
VAP is in theory portable to any of the three supported operating systems (Linux, Mac OS X, and Windows XP and above) where JRE 7.0 update 9 or higher is installed.
Why are there three Bitbucket repositories for VAP?
What do the VAP version numbers mean?
The scheme is x.y.z, where x is incremented only for important changes (major version), y is incremented whenever a new feature is added (minor version), and z is incremented for maintenance releases and bug fixes. y and z are reset to 0 if the value to the left changes.
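As an illustration of the reset rule, here is a minimal sketch in Python. The function bump_version and the part names "major", "minor" and "maintenance" are hypothetical and are not part of VAP.

```python
def bump_version(version, part):
    """Increment one component of an x.y.z version string.

    Components to the right of the incremented one are reset to 0.
    """
    x, y, z = (int(n) for n in version.split("."))
    if part == "major":        # important changes
        return f"{x + 1}.0.0"
    if part == "minor":        # new feature
        return f"{x}.{y + 1}.0"
    if part == "maintenance":  # maintenance release or bug fix
        return f"{x}.{y}.{z + 1}"
    raise ValueError(f"unknown part: {part!r}")


print(bump_version("1.0.0", "maintenance"))  # 1.0.1
print(bump_version("1.2.3", "minor"))        # 1.3.0
print(bump_version("1.2.3", "major"))        # 2.0.0
```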
What is the license of VAP?
VAP is released under the GPL license. For further details, see the COPYING files in the respective repositories.
Can I run VAP through X-Windows?
Yes. As long as your server is configured to support X-Windows, the interface should open on your side.
Who is working on VAP?
How do I contact the authors?
All questions and comments should be addressed to firstname.lastname@example.org.
QUESTIONS ABOUT THE CONCEPTS
What is a reference feature?
A reference feature is a generalized term corresponding to a gene, another genetic annotation or a region of interest.
What is a reference group?
A reference group contains the set of reference features of interest from which the aggregate and/or individual profiles will be calculated.
What is a reference point?
A reference point is like an anchor on which genomic regions are aligned. VAP proposes to use up to six reference points to delimit the regions of interest and avoid signal contamination from the adjacent annotations.
What is a block?
There is always one more block than there are reference points. For example, two reference points generate three blocks: the middle block isolates the reference feature and is flanked by independent blocks for the upstream and downstream regions.
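The relation between reference points and blocks can be sketched as follows. This is a simplified illustration and not VAP's implementation: the function make_blocks and the fixed flank length used for the outer blocks are assumptions, and strand handling is ignored.

```python
def make_blocks(reference_points, flank=500):
    """Split the region around the reference points into blocks.

    reference_points -- 1 to 6 coordinates (the limit given in this FAQ)
    flank            -- hypothetical length of the outermost blocks
    Returns a list of (start, end) pairs, one more than the number of points.
    """
    if not 1 <= len(reference_points) <= 6:
        raise ValueError("between 1 and 6 reference points are expected")
    points = sorted(reference_points)
    # Outer edges are an assumption; the inner edges are the reference points.
    edges = [points[0] - flank] + points + [points[-1] + flank]
    return list(zip(edges[:-1], edges[1:]))


# Two reference points (start and end of a feature) give three blocks.
print(make_blocks([1200, 1500]))
# [(700, 1200), (1200, 1500), (1500, 2000)]
```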
What is a window?
Each block is subdivided into a given number of windows of constant size in the absolute analysis mode, or into a constant number of windows of variable size in the relative analysis mode.
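The difference between the two modes can be sketched as follows. The function names are hypothetical, and details such as how VAP treats a last, incomplete window or rounds window edges are assumptions made for this illustration.

```python
def absolute_windows(start, end, window_size):
    """Absolute mode: every window has the same size (last one may be cut)."""
    return [(s, min(s + window_size, end)) for s in range(start, end, window_size)]


def relative_windows(start, end, n_windows):
    """Relative mode: every block gets the same number of windows."""
    length = end - start
    # Integer edges spread evenly over the block; sizes vary with block length.
    edges = [start + (length * i) // n_windows for i in range(n_windows + 1)]
    return list(zip(edges[:-1], edges[1:]))


# A 300 bp block and a 1500 bp block
print(len(absolute_windows(1200, 1500, 50)))  # 6 windows of 50 bp
print(len(absolute_windows(1200, 2700, 50)))  # 30 windows of 50 bp
print(relative_windows(1200, 1500, 4))        # 4 windows of 75 bp
print(relative_windows(1200, 2700, 4))        # 4 windows of 375 bp
```

With constant-size windows, longer features simply get more windows. With a constant number of windows, features of different lengths are rescaled to the same number of points.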
QUESTIONS ABOUT THE INTERFACE
How do I create graphs?
What are the orientations (AA, TC, DC, etc.)?
What do the orientations mean?
QUESTIONS ABOUT THE INPUT FILES
What is the format of the reference group?
It depends on the analysis mode. For the “annotation” and “exon” modes, a reference group simply contains a list of one annotation name per line (taken from the first column of the genome annotations file). These names are used to extract the coordinates of the reference features from the genome annotations file.
For the “coordinate” mode, the reference group must directly contain the coordinates in a special format. The first line must start with a "#" and contain at least the tag "type=" followed by one of the 6 "coordX" values, where X is the number of reference points contained in the file (e.g. “type=coord2”). The full description of the first line is: # type=["]file_type["] name=["]file_alias["] desc=["]file_description["] where the three tags can appear in any order, and a type/name/desc value must be flanked by double quotes if it contains spaces.
The other lines are tab-delimited. Because a file can contain up to 6 reference points, the first column must contain the chromosome, the second column the strand ('+' or '-'; any other character is interpreted as '+'), followed by X columns containing the coordinates. An additional column can optionally contain the name of the region, which will be reported in the "all_" output file. For example, to analyze regions of interest identified by start and end coordinates: "chr1 - 1200 1500 region1_score=14.6". As an exception, BED format files with 3 to 6 columns are supported when there are exactly 2 reference points.
When the first line of a reference group file contains a "name=" tag, this string is used as group_name; otherwise the file name is used as group_name in the output files. If a group contains the same reference feature more than once, all instances are kept (and linked to the same genome annotation where applicable). Any other line starting with "#" is considered a comment.
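To make the “coordinate” format more concrete, here is a minimal parsing sketch in Python. It is not VAP's parser: the function names are hypothetical, the header is split with the standard shlex module as a convenient way to honor the double quotes, and whether VAP keeps the file extension when it falls back to the file name is an assumption.

```python
import os
import shlex


def parse_header(line):
    """Parse the first line: # type=... name=... desc=... (any order)."""
    if not line.startswith("#"):
        raise ValueError('the first line must start with "#"')
    tags = {}
    # shlex keeps a double-quoted value containing spaces as one token.
    for token in shlex.split(line[1:]):
        key, sep, value = token.partition("=")
        if sep and key in ("type", "name", "desc"):
            tags[key] = value
    if "type" not in tags:
        raise ValueError('the "type=" tag is mandatory')
    return tags


def parse_coord_row(line, n_points):
    """Parse one tab-delimited row: chromosome, strand, coordinates, [name]."""
    fields = line.rstrip("\n").split("\t")
    chrom = fields[0]
    strand = "-" if fields[1] == "-" else "+"  # anything else counts as '+'
    coords = [int(v) for v in fields[2:2 + n_points]]
    if len(coords) != n_points:
        raise ValueError(f"expected {n_points} coordinates")
    name = fields[2 + n_points] if len(fields) > 2 + n_points else None
    return chrom, strand, coords, name


def group_name(tags, path):
    """Use the name= tag when present, otherwise the file name (assumed as-is)."""
    return tags.get("name") or os.path.basename(path)


tags = parse_header('# type=coord2 name="my regions" desc="start and end"')
n_points = int(tags["type"][len("coord"):])  # "coord2" -> 2
print(parse_coord_row("chr1\t-\t1200\t1500\tregion1_score=14.6", n_points))
# ('chr1', '-', [1200, 1500], 'region1_score=14.6')
```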
QUESTIONS ABOUT THE OUTPUT FILES
How do I interpret graphs?
What is an "agg" file?
What is an "ind" file?
What is the "map_graphs_datafiles.txt" file?
What is the "vap_parameters.txt" file?
Why is the proportion in the second graph for the test files mostly ~40%?
The test datasets are ChIP-chip files of the yeast genome with a resolution of ~1 probe of 60 bp every 250 bp (~4 probes per kb), and the window_size parameter is set to 50 bp in the default parameters file. As a consequence, in every 5 consecutive windows (covering 250 bp), 2 windows contain values, so 40% of the windows contribute to the aggregate profiles. When window_size is changed to 500 bp, the average proportion increases to ~95% in the upstream and downstream regions, but drops rapidly to 0% in the other regions because the average gene length is 1.5 kb (therefore only 3 windows of 500 bp).
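The 40% figure can be checked with a small idealized calculation. The sketch below assumes perfectly regular probes that start on a window boundary and counts a window as contributing when it overlaps a probe. These are simplifying assumptions, and the function covered_fraction is hypothetical.

```python
def covered_fraction(window_size, probe_len=60, probe_spacing=250, region_len=1000):
    """Fraction of windows that overlap at least one probe (idealized)."""
    n_windows = region_len // window_size
    covered = set()
    for probe_start in range(0, region_len, probe_spacing):
        probe_end = min(probe_start + probe_len, region_len)  # exclusive end
        first = probe_start // window_size
        last = min((probe_end - 1) // window_size, n_windows - 1)
        covered.update(range(first, last + 1))
    return len(covered) / n_windows


print(covered_fraction(50))   # 0.4
print(covered_fraction(500))  # 1.0
```

With 50 bp windows, each 60 bp probe touches 2 of the 5 windows in its 250 bp interval, hence 40%. With 500 bp windows the idealized model gives 100%; the ~95% quoted above comes from the real test data, where probe coverage is not perfectly regular.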