Installing and running
The pipeline is written in Groovy (a scripting language for the Java platform) and distributed as an executable JAR. To install it, get the latest JRE and download the executable from the releases section.
To run a specific script from the pipeline, say Checkout, execute:
java -jar MIGEC-$VERSION.jar Checkout [arguments]
Here $VERSION stands for the pipeline version (e.g. 1.2.1); this notation is
omitted in the MIGEC routine documentation.
To view the list of available scripts execute:
java -jar MIGEC-$VERSION.jar -h
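As a convenience, the version can be set once in a shell variable and reused for every call. The sketch below is only an illustration: the names MIGEC_VERSION, MIGEC_JAR and migec_cmd are hypothetical and not part of MIGEC, and the helper prints the command line instead of running it, so it can be checked before use.

```shell
# Hypothetical helper (not part of MIGEC): keeps the JAR name in one place.
# 1.2.1 is the example version from the text; adjust it to the release you downloaded.
MIGEC_VERSION="${MIGEC_VERSION:-1.2.1}"
MIGEC_JAR="${MIGEC_JAR:-MIGEC-${MIGEC_VERSION}.jar}"

migec_cmd() {
    # Print the full command line; pipe the output to sh to actually run it.
    printf 'java -jar %s' "$MIGEC_JAR"
    for arg in "$@"; do
        printf ' %s' "$arg"
    done
    printf '\n'
}

migec_cmd Checkout   # command line for the Checkout script
migec_cmd -h         # command line that lists the available scripts
```

Printing rather than executing is a deliberate choice here: it makes the sketch safe to try before the JAR has been downloaded.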
Alternatively, you can download the repository and compile it from source using Maven (requires Maven version 3.0):
git clone https://github.com/mikessh/MIGEC.git
cd MIGEC/
mvn clean install
java -jar target/MIGEC-$VERSION.jar
This should show you the list of available MIGEC routines.
Note
Data from the 454 platform should be used with caution, as it contains
homopolymer errors which (in the present framework) result in reads being
dropped during consensus assembly. The 454 platform has a relatively low read
yield, so additional read dropping could push the over-sequencing level
below the required threshold. If you still wish to give it a try, we
recommend filtering out all short reads and repairing indels with
Coral, which should be run with the options -mr 2 -mm 1000 -g 3.
Warning
The NCBI-BLAST+ package is required. On Linux it can be installed with a command like sudo apt-get install ncbi-blast+, or downloaded and installed from ftp://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/LATEST/
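Before the first run it may help to confirm that both prerequisites are reachable from the shell. The following is a minimal sketch, assuming that the JRE provides a java executable and that BLAST+ provides blastn on the PATH; the function name missing_tools is hypothetical.

```shell
# Hypothetical helper: print the name of every given tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

missing="$(missing_tools java blastn)"
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so that the names are joined by spaces.
    echo "Not found on PATH:" $missing
else
    echo "java and blastn are both available"
fi
```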
Warning
Consider providing sufficient memory for the pipeline, e.g. 8 GB for a
MiSeq or 36 GB for a HiSeq sample, depending on sample sequence diversity
and the script being run (CdrBlast has the highest memory
requirements). To do so, execute the script with the -Xmx
argument:
java -Xmx8G -jar MIGEC-$VERSION.jar CdrBlast [arguments]
If insufficient memory is allocated, the Java Virtual Machine
may terminate with an out-of-memory error (java.lang.OutOfMemoryError: Java heap space).
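The memory guidance in this warning can be captured in a small lookup, sketched below. The function name migec_heap is hypothetical, and the values are simply the 8 GB (MiSeq) and 36 GB (HiSeq) figures quoted above; samples with high sequence diversity may need more.

```shell
# Hypothetical helper: map a sequencing platform to the heap size suggested above.
migec_heap() {
    case "$1" in
        miseq|MiSeq) echo "8G" ;;
        hiseq|HiSeq) echo "36G" ;;
        *) echo "unknown platform: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the resulting CdrBlast command line for a MiSeq sample.
heap="$(migec_heap MiSeq)" && echo "java -Xmx${heap} -jar MIGEC-\$VERSION.jar CdrBlast [arguments]"
```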