I have some bacterial NGS reads as well as assemblies. I used two methods to call variants:
mapping-based: I ran snippy on the clean reads against a reference genome to call variants, then used bcftools merge to combine the variants from multiple samples.
assembly-based: I assembled each sample using shovill, annotated the assemblies using prokka, got pan-genome results using Panaroo, and generated a recombination-free core-gene alignment using ClonalFrameML. Then I extracted the core-genome variants using snp-sites: snp-sites -v -o ours_core_variations.vcf ../PGout_panaroo/core_gene_alignment_filtered.aln
Here are the statistics for the two VCFs:
# merged VCF from the per-sample snippy calls
1. $ bcftools stats ../Ours_vcf_merged.vcf.gz | grep SN
# SN, Summary numbers:
# number of SNPs .. number of rows with a SNP
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
# counter. For example, a row with a SNP and an indel increments both the SNP and
# SN [2]id [3]key [4]value
SN 0 number of samples: 196
SN 0 number of records: 2292
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 2053
SN 0 number of MNPs: 47
SN 0 number of indels: 180
SN 0 number of others: 13
SN 0 number of multiallelic sites: 13
SN 0 number of multiallelic SNP sites: 1
# snp-sites results
2. $ bcftools stats ours_core_variations.vcf | grep SN
# SN, Summary numbers:
# number of SNPs .. number of rows with a SNP
# number of multiallelic SNP sites .. number of rows with multiple alternate alleles, all SNPs
# counter. For example, a row with a SNP and an indel increments both the SNP and
# SN [2]id [3]key [4]value
SN 0 number of samples: 196
SN 0 number of records: 5907
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 5907
SN 0 number of MNPs: 0
SN 0 number of indels: 0
SN 0 number of others: 0
SN 0 number of multiallelic sites: 3408
SN 0 number of multiallelic SNP sites: 34
There is a huge difference between the two methods in both the total number of variants (2292 vs. 5907) and the number of SNPs (2053 vs. 5907). Did I do anything wrong? Could you please help me figure it out?
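To make the comparison easier to read, here is a minimal sketch that parses the SN lines of the two bcftools stats outputs and prints them side by side. The function names parse_sn and compare are hypothetical helpers written for this post, not part of bcftools, and the embedded text is simply the output pasted above. It only tabulates the numbers; it does not explain the difference.

```python
# Sketch: tabulate the "SN" summary lines from two `bcftools stats` outputs.
# parse_sn and compare are hypothetical helper names, not bcftools functions.

SNIPPY_STATS = """\
# SN [2]id [3]key [4]value
SN 0 number of samples: 196
SN 0 number of records: 2292
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 2053
SN 0 number of MNPs: 47
SN 0 number of indels: 180
SN 0 number of others: 13
SN 0 number of multiallelic sites: 13
SN 0 number of multiallelic SNP sites: 1
"""

SNPSITES_STATS = """\
# SN [2]id [3]key [4]value
SN 0 number of samples: 196
SN 0 number of records: 5907
SN 0 number of no-ALTs: 0
SN 0 number of SNPs: 5907
SN 0 number of MNPs: 0
SN 0 number of indels: 0
SN 0 number of others: 0
SN 0 number of multiallelic sites: 3408
SN 0 number of multiallelic SNP sites: 34
"""


def parse_sn(stats_text):
    """Return {key: int} for every line that starts with 'SN'."""
    summary = {}
    for line in stats_text.splitlines():
        if not line.startswith("SN"):
            continue  # skips the '# SN ...' comment lines
        # Works for both tab-separated (real output) and space-separated (pasted) text.
        _, _, rest = line.split(None, 2)
        key, value = rest.rsplit(":", 1)
        summary[key.strip()] = int(value.strip())
    return summary


def compare(first, second):
    """Return (key, value_in_first, value_in_second) rows, keeping key order."""
    keys = list(first) + [k for k in second if k not in first]
    return [(k, first.get(k), second.get(k)) for k in keys]


if __name__ == "__main__":
    rows = compare(parse_sn(SNIPPY_STATS), parse_sn(SNPSITES_STATS))
    print(f"{'key':40s}{'snippy':>10s}{'snp-sites':>12s}")
    for key, a, b in rows:
        print(f"{key:40s}{a!s:>10s}{b!s:>12s}")
```

On real files, the text could come from something like `bcftools stats file.vcf.gz | grep ^SN` saved to disk and read with open(...).read().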