3_Alignment.md

Meaning of the index files produced by bwa index and used by BWA MEM (Source)

.amb is a text file that records where N (or other non-ACGT) characters appear in the reference FASTA.
.ann is a text file that records the reference sequences: name, length, etc.
.bwt is binary: the Burrows-Wheeler transformed sequence.
.pac is binary: the packed sequence (four bases are encoded in one byte).
.sa is binary: the suffix array index.
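
To make the "four bases per byte" idea behind .pac concrete, here is a minimal sketch of 2-bit packing. The function name `pack_bases`, the code table (A=0, C=1, G=2, T=3) and the bit order (first base in the highest two bits) are assumptions chosen for illustration; this is not a reader or writer for real .pac files.

```python
# Hypothetical 2-bit code table, assumed for illustration.
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack_bases(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte.

    The first base of each group of four goes into the two highest bits.
    A final partial group is padded with zero bits.
    Non-ACGT characters are not handled in this sketch (they raise KeyError).
    """
    packed = bytearray((len(seq) + 3) // 4)
    for i, base in enumerate(seq.upper()):
        shift = (3 - i % 4) * 2  # 6, 4, 2, 0 for the four slots in a byte
        packed[i // 4] |= CODES[base] << shift
    return bytes(packed)
```

With this layout, a reference of n bases needs roughly n / 4 bytes, which is why the packed file is much smaller than the FASTA it was built from.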

Is the -a bwtsw flag compulsory for indexing the human genome with BWA?

Interpreting the BWA MEM screen output

  1. What are FF, FR, RF and RR?
    FR: ---------> F   <--------- R
    RF: <--------- R   ---------> F
    FF: ---------> F   ---------> F
    RR: <--------- R   <--------- R
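
The diagrams above can be read as "strand of the leftmost read, then strand of the rightmost read". The sketch below encodes exactly that reading. The function name `classify_orientation` and its arguments are hypothetical, and this is a simplification based on the diagrams, not necessarily the rule BWA applies internally.

```python
def classify_orientation(pos1: int, reverse1: bool,
                         pos2: int, reverse2: bool) -> str:
    """Return 'FF', 'FR', 'RF' or 'RR' for a read pair.

    pos1/pos2 are leftmost mapping positions of the two mates;
    reverse1/reverse2 are True when the mate maps to the reverse strand.
    The label is built from the leftmost mate first, as in the diagrams.
    """
    mates = sorted([(pos1, reverse1), (pos2, reverse2)], key=lambda m: m[0])
    return "".join("R" if is_reverse else "F" for _, is_reverse in mates)
```

For example, a forward mate at position 100 and a reverse mate at position 300 point towards each other and are labelled FR, which is the layout expected for standard Illumina paired-end libraries.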

PNEXT in a SAM file; BAM format: what is the purpose of RNEXT and PNEXT? See here

Install Sambamba from the static binary as follows.

What are read groups and how to construct them?

Refer to GATK Blog

e.g. @RG\tID:HNKYTDSXX:1\tSM:12DIa_S6\tPL:illumina

At least ID, SM and PL must be specified.
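
A small helper can build the read group string and refuse to produce one when ID, SM or PL is missing. This is a minimal sketch: the function name `build_read_group` and its parameters are hypothetical, and the output uses a literal backslash followed by `t` as the separator, matching the example above.

```python
def build_read_group(rg_id: str, sample: str, platform: str, **extra: str) -> str:
    """Build an @RG header string with literal '\\t' separators.

    ID, SM and PL are required; any extra tags (for example LB or PU)
    are appended in the order given.
    """
    required = {"ID": rg_id, "SM": sample, "PL": platform}
    missing = [tag for tag, value in required.items() if not value]
    if missing:
        raise ValueError("missing read group tags: " + ", ".join(missing))
    fields = [f"{tag}:{value}" for tag, value in required.items()]
    fields += [f"{tag}:{value}" for tag, value in extra.items()]
    # Join with a backslash and 't' (two characters), not a real tab.
    return "\\t".join(["@RG"] + fields)


if __name__ == "__main__":
    print(build_read_group("HNKYTDSXX:1", "12DIa_S6", "illumina"))
```

The resulting string can be passed, quoted, to the `-R` option of `bwa mem`.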