Description of the bug
Hello,
I am running TaxProfiler on WSL2, with the work and results directories on a NAS mounted via CIFS. The work directory has to live on the NAS because it grows too large to store locally. However, when I run the pipeline (see the command below), it fails at the Bowtie2 align step of host removal. The reason is that the command find -L ./ -name "*.rev.1.bt2" | sed "s/\.rev.1.bt2$//" returns nothing: it does not follow the symlink that is created in the work directory for that process, which points to the Bowtie2 index files built earlier by the pipeline. I suspect that find cannot traverse symlinks on a mounted NAS for some protocols and mount options. I have checked that the symlink exists and that the index files are reachable through it, so the problem is the command, not missing files.
However, if I replace find on that line (line 53 of the bowtie2/align/main.nf module) with something like INDEX=\$(ls -1 bowtie2/*.rev.1.bt2 2>/dev/null | sed 's/\\.rev\\.1\\.bt2\$//'), the symlink is followed, $INDEX is set as expected, and the pipeline proceeds normally. I do not know what implications this has for the rest of the pipeline, but I would suggest replacing the find command with an ls-based alternative to prevent this error. The change would be needed on lines 53 and 54 of modules/nf-core/bowtie2/align/main.nf. I have not checked whether TaxProfiler processes other than the ones I ran are affected; if they are, find would need to be replaced and tested there too.
Otherwise, can you suggest another general solution for cases where find fails to follow the symlink created in the work directory and returns an empty result?
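To help reproduce this, here is a minimal sketch of a check that counts the index files seen by find -L and by plain shell globbing in the same directory. The function name compare_lookups and the assumption that the index symlink sits directly inside the task directory are mine and are not part of the pipeline. On a local filesystem both counts should agree; on the affected mount I would expect the find count to be 0, based on the behaviour described above.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic (not part of the pipeline): compare how many
# index files "find -L" and shell globbing see under a task directory.
# Assumes the index directory, or a symlink to it, sits directly in $1.
compare_lookups() {
    local dir="${1:-.}" pattern="${2:-*.rev.1.bt2}"
    local via_find via_glob=0 f
    # Arithmetic expansion strips the padding some wc builds add.
    via_find=$(( $(find -L "$dir" -name "$pattern" | wc -l) ))
    # $pattern is left unquoted on purpose so the shell expands it.
    for f in "$dir"/*/$pattern; do
        # An unmatched glob stays literal, so confirm the path exists.
        if [ -e "$f" ]; then
            via_glob=$((via_glob + 1))
        fi
    done
    echo "find=$via_find glob=$via_glob"
}
```

Running compare_lookups . from inside the failing task's work directory would show whether the two lookups disagree there.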
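The same idea can be written without ls or sed. The following is only a sketch in plain bash, without the Nextflow \$ escaping: the function name find_bt2_index is hypothetical, and it assumes the index directory (or the symlink to it) sits directly inside the task directory, as bowtie2/ does in my run. It tries the .bt2 suffix first and then .bt2l, mirroring the two find calls in the module, and returns only the first match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: locate a Bowtie2 index prefix with shell globbing
# instead of find. Assumes the index files are in $1 itself or exactly
# one directory level below it (directly or through a symlink).
find_bt2_index() {
    local dir="${1:-.}" suffix f
    # Small-index suffix first, then the large-index one.
    for suffix in .rev.1.bt2 .rev.1.bt2l; do
        for f in "$dir"/*/*"$suffix" "$dir"/*"$suffix"; do
            # An unmatched glob stays literal, so confirm the path exists.
            if [ -e "$f" ]; then
                # Strip the suffix to get the prefix bowtie2 -x expects.
                printf '%s\n' "${f%"$suffix"}"
                return 0
            fi
        done
    done
    return 1
}
```

In the module this could be used as INDEX=$(find_bt2_index .), with each $ escaped as \$ inside the Nextflow script block. I have only tested the ls variant on my setup, not this function.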
The command used:
nextflow run nf-core/taxprofiler -profile docker -resume --input /mnt/NAS/taxprofiler_files/sample_sheet.csv --databases /mnt/NAS/taxprofiler_files/db_file.csv --outdir /mnt/NAS/Taxprofiler_PROJECT_01092025 --run_motus --perform_shortread_qc --perform_runmerging --save_runmerged_reads --save_analysis_ready_fastqs --shortread_qc_mergepairs --perform_shortread_hostremoval --save_hostremoval_bam --save_hostremoval_unmapped --run_profile_standardisation --hostremoval_reference /mnt/NAS/general_resources/T2T-CHM13_v2.0/hs1.fa -work-dir /mnt/NAS/work_03092025
Thank you for your time,
Best regards
Samuel
Command used and terminal output
nextflow run nf-core/taxprofiler -profile docker -resume --input /mnt/NAS/taxprofiler_files/sample_sheet.csv --databases /mnt/NAS/taxprofiler_files/db_file.csv --outdir /mnt/NAS/Taxprofiler_PROJECT_01092025 --run_motus --perform_shortread_qc --perform_runmerging --save_runmerged_reads --save_analysis_ready_fastqs --shortread_qc_mergepairs --perform_shortread_hostremoval --save_hostremoval_bam --save_hostremoval_unmapped --run_profile_standardisation --hostremoval_reference /mnt/NAS/general_resources/T2T-CHM13_v2.0/hs1.fa -work-dir /mnt/NAS/work_03092025
---------------------------------------------------
Execution cancelled -- Finishing pending tasks before exit
-[nf-core/taxprofiler] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_TAXPROFILER:TAXPROFILER:SHORTREAD_HOSTREMOVAL:BOWTIE2_ALIGN (TB007)'
Caused by:
Process `NFCORE_TAXPROFILER:TAXPROFILER:SHORTREAD_HOSTREMOVAL:BOWTIE2_ALIGN (TB007)` terminated with an error exit status (1)
Command executed:
INDEX=`find -L ./ -name "*.rev.1.bt2" | sed "s/\.rev.1.bt2$//"`
[ -z "$INDEX" ] && INDEX=`find -L ./ -name "*.rev.1.bt2l" | sed "s/\.rev.1.bt2l$//"`
[ -z "$INDEX" ] && echo "Bowtie2 index files not found" 1>&2 && exit 1
bowtie2 \
-x $INDEX \
-U TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.merged.fastq.gz \
--threads 12 \
--un-gz TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.gz \
\
2>| >(tee TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.bowtie2.log >&2) \
| samtools sort --threads 12 -o TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.bam -
if [ -f TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.1.gz ]; then
mv TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.1.gz TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped_1.fastq.gz
fi
if [ -f TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.2.gz ]; then
mv TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.2.gz TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped_2.fastq.gz
fi
cat <<-END_VERSIONS > versions.yml
"NFCORE_TAXPROFILER:TAXPROFILER:SHORTREAD_HOSTREMOVAL:BOWTIE2_ALIGN":
bowtie2: $(echo $(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*$//')
samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
pigz: $( pigz --version 2>&1 | sed 's/pigz //g' )
END_VERSIONS
Command exit status:
1
Command output:
(empty)
Command error:
Bowtie2 index files not found
Work dir:
/mnt/NAS/work_03092025/cf/266a7cc6486141a9978677e32eaed9
Container:
community.wave.seqera.io/library/bowtie2_htslib_samtools_pigz:edeb13799090a2a6
Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`
-- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting
-- Check '.nextflow.log' file for details
Relevant files
No response
System information
Nextflow version: 25.04.6
Executor: local
OS: WSL2 running on Windows 11
Version: nf-core/taxprofiler 1.2.4 (revision: 1385e2f [master])