Meaning of the output files of BWA index (Source)
.amb is a text file that records the occurrences of N (or other non-ATGC characters) in the reference FASTA.
.ann is a text file that records the reference sequences: name, length, etc.
.bwt is binary: the Burrows-Wheeler transformed sequence.
.pac is binary: the packed sequence (four bases are encoded in one byte).
.sa is binary: the suffix array index.
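To make the "four bases in one byte" idea concrete, here is a minimal sketch of 2-bit packing. The code table (A=0, C=1, G=2, T=3), the bit order, and the function names are assumptions chosen for illustration; this is not a reader or writer for real .pac files.

```python
# Assumed 2-bit codes, for illustration only.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
LETTERS = "ACGT"


def pack_bases(seq: str) -> bytes:
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq.upper()):
        shift = (3 - i % 4) * 2  # first base of each group goes in the highest two bits
        out[i // 4] |= CODES[base] << shift
    return bytes(out)


def unpack_bases(packed: bytes, length: int) -> str:
    """Recover the first `length` bases from a packed byte string."""
    return "".join(
        LETTERS[(packed[i // 4] >> ((3 - i % 4) * 2)) & 3] for i in range(length)
    )
```

Because only four symbols fit in two bits, N and other ambiguous characters cannot be stored this way, which is why their positions have to be kept in a separate file.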
Is the -a bwtsw flag compulsory for indexing the human genome with BWA?
Interpreting the BWA MEM screen output
- What are FF, FR, RF, RR?
FR: ---------> F   <--------- R
RF: <--------- R   ---------> F
FF: ---------> F   ---------> F
RR: <--------- R   <--------- R
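The four labels in the diagrams can be derived from the strands of the two mates once they are ordered by position. The following is a simplified sketch, assuming each mate is described by its leftmost coordinate and a reverse-strand flag; the function name and arguments are hypothetical, and this is not BWA's internal calculation.

```python
def pair_orientation(pos1: int, reverse1: bool, pos2: int, reverse2: bool) -> str:
    """Return FR, RF, FF or RR for a read pair.

    The mate with the smaller leftmost coordinate is listed first;
    F means forward strand, R means reverse strand.
    """
    # Order the mates by leftmost coordinate so the label reads left to right.
    (_, left_rev), (_, right_rev) = sorted([(pos1, reverse1), (pos2, reverse2)])
    return ("R" if left_rev else "F") + ("R" if right_rev else "F")
```

For example, a forward read at position 100 paired with a reverse read at position 300 gives FR, the usual orientation for standard paired-end libraries.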
PNEXT in SAM file: "BAM Format: What's the Purpose of RNEXT and PNEXT?" See here
Install Sambamba as follows using static image.
Refer to GATK Blog
e.g. @RG\tID:HNKYTDSXX:1\tSM:12DIa_S6\tPL:illumina
At least ID, SM, and PL must be specified.
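A small helper can build the read group string in the form shown above and refuse to continue when ID, SM, or PL is missing. This is only a sketch; the function name is hypothetical, and the separator is written as a literal backslash followed by t, matching the example line.

```python
REQUIRED_TAGS = ("ID", "SM", "PL")


def read_group_line(**tags: str) -> str:
    """Build an @RG string with literal \\t separators.

    Raises ValueError if any of ID, SM or PL is missing or empty.
    """
    missing = [tag for tag in REQUIRED_TAGS if not tags.get(tag)]
    if missing:
        raise ValueError("missing read group tags: " + ", ".join(missing))
    # Required tags first, in a fixed order, then any extra tags as given.
    ordered = [(tag, tags[tag]) for tag in REQUIRED_TAGS]
    ordered += [(k, v) for k, v in tags.items() if k not in REQUIRED_TAGS]
    return "@RG" + "".join(f"\\t{key}:{value}" for key, value in ordered)


if __name__ == "__main__":
    print(read_group_line(ID="HNKYTDSXX:1", SM="12DIa_S6", PL="illumina"))
```

Extra tags such as LB or PU can be passed as additional keyword arguments and are appended after the three required ones.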