Error in executing process pipeline:differential_expression:map_transcriptome #81

Open
alyazeeditalal opened this issue Mar 28, 2024 · 4 comments

Comments

@alyazeeditalal

Operating System

Ubuntu 22.04

Other Linux

No response

Workflow Version

v1.1.1

Workflow Execution

Command line

EPI2ME Version

No response

CLI command run

./nextflow run epi2me-labs/wf-transcriptomes \
    -r v1.1.1 \
    -c config.cfg \
    --threads 64 \
    --fastq /data/neurabin_bulk/data \
    --de_analysis \
    --ref_genome reference/Homo_sapiens.GRCh38.dna.toplevel.fa.gz \
    --ref_annotation reference/Homo_sapiens.GRCh38.111.chr.gff3.gz \
    --sample_sheet sample_sheet.wt_mut.csv \
    --tr_out Talal.wt.mt.out -w Talal.inter.work.dir \
    --minimap2_index_opts '-k 15' \
    --pychopper_opts '-k LSK114'

Workflow Execution - CLI Execution Profile

standard (default)

What happened?

I have run into this issue with both the latest version of the pipeline (v1.1.1) and the earlier version v1.1.0. I am running a differential expression analysis for two conditions using the human genome GRCh38 as a reference. I'd like to highlight that the pipeline ran on this dataset without any errors with the earlier v0.4 versions.

wf-transcriptomes v1.1.1-g999fb4e

Core Nextflow options
revision : v1.1.1
runName : sick_stone
containerEngine : docker
container : [withLabel:isoforms:ontresearch/wf-transcriptomes:shae7c9f184996a384e99be68e790f0612f0c732867, withLabel:wf_common:ontresearch/wf-common:sha1c5febff9f75143710826498b093d9769a5edbb9]
launchDir : /data/neurabin_bulk
workDir : /data/neurabin_bulk/Talal.inter.work.dir
projectDir : /home/prom/.nextflow/assets/epi2me-labs/wf-transcriptomes
userName : prom
profile : standard
configFiles : /home/prom/.nextflow/assets/epi2me-labs/wf-transcriptomes/nextflow.config, /data/neurabin_bulk/config.cfg

Input Options
fastq : /data/neurabin_bulk/data
ref_genome : reference/Homo_sapiens.GRCh38.dna.toplevel.fa.gz
ref_annotation : reference/Homo_sapiens.GRCh38.111.chr.gff3.gz

Sample Options
sample_sheet : sample_sheet.wt_mut.csv

Options for reference-based workflow
minimap2_index_opts: -k 15

Differential Expression Options
de_analysis : true

Advanced Options
threads : 64
pychopper_opts : -k LSK114

Error message:

ERROR ~ Error executing process > 'pipeline:differential_expression:map_transcriptome (4)'

Caused by:
Process pipeline:differential_expression:map_transcriptome (4) terminated with an error exit status (1)

Command executed:

minimap2 -t 64 -ax splice -uf -p 1.0 "genome_index.mmi" "seqs.fastq.gz" | samtools view -Sb > "output.bam"
samtools sort -@ 64 "output.bam" -o "wildtype02_reads_aln_sorted.bam"
Command exit status:
1

Command output:
(empty)

Command error:
[WARNING] Indexing parameters (-k, -w or -H) overridden by parameters used in the prebuilt index.
[M::main::30.544*0.70] loaded/built the index for 241316 target sequence(s)
[M::mm_mapopt_update::44.740*0.57] mid_occ = 2150
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 241316
[M::mm_idx_stat::45.393*0.57] distinct minimizers: 83604527 (9.03% are singletons); average occurrences: 21.286; average spacing: 5.310; total length: 9448981716
[E::sam_hdr_create] Invalid header line: must start with @HD/@SQ/@RG/@PG/@CO
[main_samview] fail to read the header from "-".

Work dir:
/data/neurabin_bulk/Talal.inter.work.dir/a9/b99f6eac32400cf52debf32a1bb314

Tip: you can try to figure out what's wrong by changing to the process work dir and showing the script file named .command.sh

-- Check '.nextflow.log' file for details
WARN: Killing running tasks (1)

Relevant log output

not sure how to access the log file from the command line!
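
For readers with the same question: Nextflow writes `.nextflow.log` into the directory the run was launched from, and each task's work directory holds hidden files such as `.command.sh`, `.command.err` and `.command.log`. The following is a minimal sketch for viewing them, assuming the launch directory and "Work dir" path shown in the error message above. The helper name `show_task_logs` is made up for illustration and is not part of Nextflow.

```shell
#!/usr/bin/env bash
# show_task_logs is a hypothetical helper, not part of Nextflow.
# It prints the tail of the main Nextflow log and the key files of one task.
show_task_logs() {
    local launch_dir="$1"   # directory where `nextflow run` was started
    local task_dir="$2"     # "Work dir" path printed in the error message
    local f

    if [ -f "$launch_dir/.nextflow.log" ]; then
        echo "== $launch_dir/.nextflow.log (last 50 lines) =="
        tail -n 50 "$launch_dir/.nextflow.log"
    else
        echo "no .nextflow.log in $launch_dir" >&2
    fi

    # Hidden per-task files written by Nextflow: the script that was run,
    # its stderr, and its combined output.
    for f in .command.sh .command.err .command.log; do
        if [ -f "$task_dir/$f" ]; then
            echo "== $task_dir/$f =="
            cat "$task_dir/$f"
        fi
    done
}

# Example call with the paths from this report (commented out so that
# sourcing this sketch has no side effects):
# show_task_logs /data/neurabin_bulk \
#     /data/neurabin_bulk/Talal.inter.work.dir/a9/b99f6eac32400cf52debf32a1bb314
```

Because the file names start with a dot, a plain `ls` will not list them; `ls -a` in the task directory shows them.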

Application activity log entry

No response

Were you able to successfully run the latest version of the workflow with the demo data?

yes

Other demo data information

No response

@sarahjeeeze
Contributor

Hi, thanks for the feedback. The process may be running out of memory, as we have now set a memory limit on it. Could you try reducing the number of threads to something like 4, 6 or 8, which should be enough for this process? Alternatively, try increasing the memory with the following config and let me know if it works:

process {
    withName: map_transcriptome{
        memory = 32.GB
    }
}
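
One way to apply an override like this is to put it in the custom config file that the run already passes with `-c`, and to lower `--threads` on the same rerun. The sketch below assumes that setup. The helper name `write_memory_override` is hypothetical, and `32.GB` and 8 threads are simply the values suggested above, not tuned recommendations.

```shell
#!/usr/bin/env bash
# write_memory_override is a hypothetical helper: it appends a memory
# override for the map_transcriptome process to a Nextflow config file.
write_memory_override() {
    local config_file="$1"  # e.g. the file passed to nextflow with -c
    local memory="$2"       # Nextflow memory literal, e.g. 32.GB
    cat >> "$config_file" <<EOF
process {
    withName: map_transcriptome {
        memory = $memory
    }
}
EOF
}

# Example (commented out so that sourcing this sketch has no side effects):
# write_memory_override config.cfg 32.GB
# Then rerun the original command with a lower thread count, for example
# --threads 8, keeping the other options unchanged.
```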

@alyazeeditalal
Author

Thank you for your response.

It worked after increasing the memory to 300 GB:

process {
    withName: map_transcriptome {
        memory = 300.GB
    }
}

On a different note, the differential expression results and the count file for the transcript-level analysis are not output in the latest version. The results_dexseq.tsv file, which contains the transcript-level differential expression results, is missing from the outputs.

@alyazeeditalal
Author

Dear @sarahjeeeze,

Would you please guide me in finding the results_dexseq.tsv file containing the differential expression results at the transcript level? This TSV file was produced by previous versions, but I cannot find it in the outputs of the latest version.

Kind regards,
Talal

@sarahjeeeze
Contributor

Hi, sorry for the delay. The workflow will be updated soon to include the DEXSeq results in the output folder. In the meantime, you should be able to find the count files in the work folder of the de_analysis process: go to the work directory, find the folder labelled with the name of that Nextflow process, and the files should be there.
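
As a rough sketch of that search: the snippet below looks for a named file anywhere under the work directory, so the hashed task folder does not need to be known in advance. It assumes the file is still written under the name `results_dexseq.tsv`, which this thread does not confirm for the latest version. The helper name `find_work_outputs` is hypothetical.

```shell
#!/usr/bin/env bash
# find_work_outputs is a hypothetical helper: it lists regular files with a
# given name anywhere under a Nextflow work directory. Using -type f skips
# the symlinks that Nextflow stages as task inputs, so each hit points at
# the task folder that actually produced the file.
find_work_outputs() {
    local work_dir="$1"    # directory passed to nextflow with -w
    local pattern="$2"     # file name or glob, e.g. 'results_dexseq.tsv'
    find "$work_dir" -type f -name "$pattern"
}

# Example with the work directory from this report (commented out so that
# sourcing this sketch has no side effects):
# find_work_outputs /data/neurabin_bulk/Talal.inter.work.dir 'results_dexseq.tsv'
```

If nothing is found under that exact name, a broader glob such as `'*dexseq*'` can help narrow down which task folder holds the DEXSeq outputs.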
