GRAAL for diploid assemblies? #15

Open
dcopetti opened this issue May 4, 2018 · 5 comments

Comments

dcopetti commented May 4, 2018

Hi,

I wonder if GRAAL will fit my genome project.
I have a plant genome assembly with the following features:

  • estimated genome size: 2.6 Gb, diploid organism, no recent WGD;
  • total assembly size: 4.5 Gb, scf N50 3 Mb, scf N80 1 Mb, 1.4% Ns;
  • BUSCO genes: 98% present, >70% in two copies
    It is indeed a diploid assembly.
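
As a quick sanity check on the numbers above, the ratio of assembly size to estimated genome size can be computed directly. This is only a rough sketch: it assumes the 2.6 Gb estimate refers to a single haploid copy, and the variable names are illustrative.

```python
# Figures quoted in the list above.
# Assumption: the 2.6 Gb estimate is for one haploid copy of the genome.
estimated_genome_size_gb = 2.6
assembly_size_gb = 4.5

# A fully collapsed assembly would give a ratio near 1.0,
# a fully separated diploid assembly a ratio near 2.0.
size_ratio = assembly_size_gb / estimated_genome_size_gb
excess_gb = assembly_size_gb - estimated_genome_size_gb

print(f"assembly / estimated genome size: {size_ratio:.2f}x")
print(f"sequence beyond one haploid copy: {excess_gb:.1f} Gb")
```

A ratio well above 1, together with most BUSCO genes found in two copies, is consistent with both haplotypes being present for most of the genome.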

I wonder if GRAAL can use allelic variation to produce phased pseudochromosome sequences.
By collinearity I am able to assign 80% of the sequence to chromosomes (of a closely related species), but I have pairs of scaffolds at each locus. I would like to split the pairs into the two allelic genomes in a phased fashion. Would GRAAL work with this?
Thanks,

Dario

Member

baudrly commented May 4, 2018

Hello,

I have indeed used GRAAL successfully to assemble diploid genomes. How well it works depends on how heterozygous the assembly is. If the two chromosomes of a pair are too similar, too many reads map ambiguously, and 3C-based assemblers are generally unable to tell which member of the pair a read belongs to. But if the chromosomes can be distinguished, I have found that GRAAL is relatively robust to these mapping issues and can separate the two chromosomes of a pair, albeit with a noticeable pattern:

[Image: hetero_pattern]

If that works for you, don't hesitate to try it out and report any issues you run into.

Author

dcopetti commented May 4, 2018

Perfect, that image is what I was looking for! (if the two are allelic chromosomes/scaffolds)
Do you have any recommendations on how to prepare the Hi-C data? Any favorite protocol or method?
Thanks for the feedback,

Member

baudrly commented May 4, 2018

Yes, they are an allelic pair; this is a typical pattern I've found among such chromosomes. As for generating GRAAL-compatible contact maps, you may use HiC-Box (graphical interface) or my own pipeline (command line). Or, if you already have data processed by another Hi-C pipeline, you may convert it manually to the format described in the readme. In any case, the pipelines are by and large equivalent for reassembly purposes, so simply pick whichever is most convenient for you.
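
For the manual-conversion route, the core of a contact map is a count of read pairs per pair of fragments. Below is a minimal sketch of that counting step only. The input layout (tuples of integer fragment indices) and the function name `count_contacts` are hypothetical, and the actual file format GRAAL expects is the one described in the readme, which is not reproduced here.

```python
from collections import Counter


def count_contacts(read_pairs):
    """Count read pairs per unordered pair of fragments.

    read_pairs: iterable of (frag_a, frag_b) integer fragment indices,
    one tuple per mapped read pair (hypothetical input layout).
    """
    counts = Counter()
    for frag_a, frag_b in read_pairs:
        # Store each pair with the smaller index first so that
        # (0, 3) and (3, 0) are counted as the same contact.
        key = (frag_a, frag_b) if frag_a <= frag_b else (frag_b, frag_a)
        counts[key] += 1
    return counts


# Toy input: five read pairs over four fragments.
pairs = [(0, 3), (3, 0), (1, 2), (0, 3), (2, 2)]
contacts = count_contacts(pairs)
for (a, b), n in sorted(contacts.items()):
    print(a, b, n)
```

The resulting sparse counts would still need to be written out in whatever layout the readme specifies.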

Cheers

Author

dcopetti commented May 4, 2018

Great, thanks.
How about the wet lab part? Any particular recommendations?

Member

baudrly commented May 6, 2018

I don't do experiments anymore and I am not familiar with protocol specifics for plants, but here's a very handy and recent reference. I hope that helps.
