
Bowtie2 "find" process fails to read symlinks when mounted in CIFS #652

@sghuete

Description of the bug

Hello,

I am running TaxProfiler on WSL2, with the work and results directories located on a NAS mounted using CIFS. The work directory must be on the NAS because it grows too large to store locally. However, when I run the pipeline (see the command below), it fails at the Bowtie2 align step of host removal. The reason is that the command find -L ./ -name "*.rev.1.bt2" | sed "s/\.rev.1.bt2$//" returns nothing: it does not read the symlink created in the work directory for that process, which points to the Bowtie2 index files previously built by the pipeline. I believe this happens because find itself can fail to read symlinks on a mounted NAS, depending on the protocol and mount command used. I have already checked that the symlink exists and that the index files can be found through it, so it is the command that fails, not the files being absent.
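
One way to confirm this diagnosis from inside the process work directory is sketched below. It is a minimal check and not part of the pipeline: the function name check_symlink_visibility is hypothetical, and it relies only on the standard find, readlink, and test utilities. It lists each symlink with its stored target and whether the shell can resolve it, then counts what find -L matches, so the two views can be compared. The demo at the end runs on a throwaway local directory, where both views are expected to agree.

```shell
#!/usr/bin/env bash
# Hypothetical diagnostic helper (not part of nf-core/taxprofiler).
check_symlink_visibility() {
    dir="${1:-.}"
    for entry in "$dir"/*; do
        # Only report entries that are symlinks.
        [ -L "$entry" ] || continue
        # -e follows the link, so it tells us whether the target is reachable.
        if [ -e "$entry" ]; then state="resolves"; else state="broken"; fi
        printf '%s -> %s (%s)\n' "$entry" "$(readlink "$entry")" "$state"
    done
    # Count Bowtie2 index files (.bt2 or .bt2l) as seen by find -L.
    count=$(find -L "$dir" -name '*.rev.1.bt2*' | wc -l | tr -d ' ')
    printf 'find -L matches: %s\n' "$count"
}

# Demo on a local temporary directory that mimics the work dir layout.
demo=$(mktemp -d)
mkdir "$demo/index" "$demo/work"
touch "$demo/index/hs1.rev.1.bt2"
ln -s "$demo/index" "$demo/work/bowtie2"
report=$(check_symlink_visibility "$demo/work")
printf '%s\n' "$report"
```

In the failing work directory, a symlink reported as resolving together with a find -L count of 0 would reproduce the mismatch described here.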

However, if I replace find on that line (line 53 of the bowtie2/align/main.nf module) with something like INDEX=\$(ls -1 bowtie2/*.rev.1.bt2 2>/dev/null | sed 's/\\.rev\\.1\\.bt2\$//'), the symlink is read, the index prefix is stored in the variable $INDEX as expected, and the pipeline proceeds as normal. I do not know the implications of this for the overall pipeline, but I would suggest replacing the find command with an ls-based option to prevent this error. The change would be needed in lines 53 and 54 of the script modules/nf-core/bowtie2/align/main.nf. I do not know whether this affects any processes in TaxProfiler other than the ones I tested myself. If it does, find would need to be changed and tested there too.
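
The ls-based idea above can be sketched with plain shell globbing, which avoids both find and parsing ls output. This is only an illustration under stated assumptions: the helper name find_bt2_index is hypothetical, it is not the module's actual code, and it assumes the index sits either in the given directory or one level below it (for example behind a bowtie2 symlink). Whether globbing behaves correctly on a given CIFS mount is inferred from the ls result reported here, not verified independently. Inside a Nextflow script block, each $ would presumably need escaping as \$, as in the line quoted above.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not the nf-core module code): print the Bowtie2 index
# prefix found via shell globbing, trying small (.bt2) then large (.bt2l) indexes.
find_bt2_index() {
    local dir="${1:-.}" suffix f
    for suffix in .rev.1.bt2 .rev.1.bt2l; do
        # Check the directory itself and one level down (e.g. ./bowtie2/).
        for f in "$dir"/*"$suffix" "$dir"/*/*"$suffix"; do
            # An unmatched glob stays literal, so confirm the path exists.
            if [ -e "$f" ]; then
                # Strip the suffix to get the prefix expected by bowtie2 -x.
                printf '%s\n' "${f%"$suffix"}"
                return 0
            fi
        done
    done
    return 1
}

# Demo: an index directory reached through a symlink, as in a Nextflow work dir.
tmp=$(mktemp -d)
mkdir "$tmp/real" "$tmp/work"
touch "$tmp/real/hs1.rev.1.bt2"
ln -s "$tmp/real" "$tmp/work/bowtie2"
INDEX=$(find_bt2_index "$tmp/work")
echo "$INDEX"
```

Unlike the ls variant, this does not hardcode the bowtie2 directory name, and the non-zero return status lets the caller keep the existing "Bowtie2 index files not found" check.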

Otherwise, can you come up with another general solution when find fails to read the symlink created in the work directory and returns empty?

The command used:
nextflow run nf-core/taxprofiler -profile docker -resume --input /mnt/NAS/taxprofiler_files/sample_sheet.csv --databases /mnt/NAS/taxprofiler_files/db_file.csv --outdir /mnt/NAS/Taxprofiler_PROJECT_01092025 --run_motus --perform_shortread_qc --perform_runmerging --save_runmerged_reads --save_analysis_ready_fastqs --shortread_qc_mergepairs --perform_shortread_hostremoval --save_hostremoval_bam --save_hostremoval_unmapped --run_profile_standardisation --hostremoval_reference /mnt/NAS/general_resources/T2T-CHM13_v2.0/hs1.fa -work-dir /mnt/NAS/work_03092025

Thank you for your time,

Best regards

Samuel

Command used and terminal output

nextflow run nf-core/taxprofiler -profile docker -resume --input /mnt/NAS/taxprofiler_files/sample_sheet.csv --databases /mnt/NAS/taxprofiler_files/db_file.csv --outdir /mnt/NAS/Taxprofiler_PROJECT_01092025 --run_motus --perform_shortread_qc --perform_runmerging --save_runmerged_reads --save_analysis_ready_fastqs --shortread_qc_mergepairs --perform_shortread_hostremoval --save_hostremoval_bam --save_hostremoval_unmapped --run_profile_standardisation --hostremoval_reference /mnt/NAS/general_resources/T2T-CHM13_v2.0/hs1.fa -work-dir /mnt/NAS/work_03092025

---------------------------------------------------


Execution cancelled -- Finishing pending tasks before exit
-[nf-core/taxprofiler] Pipeline completed with errors-
ERROR ~ Error executing process > 'NFCORE_TAXPROFILER:TAXPROFILER:SHORTREAD_HOSTREMOVAL:BOWTIE2_ALIGN (TB007)'

Caused by:
  Process `NFCORE_TAXPROFILER:TAXPROFILER:SHORTREAD_HOSTREMOVAL:BOWTIE2_ALIGN (TB007)` terminated with an error exit status (1)


Command executed:

  INDEX=`find -L ./ -name "*.rev.1.bt2" | sed "s/\.rev.1.bt2$//"`
  [ -z "$INDEX" ] && INDEX=`find -L ./ -name "*.rev.1.bt2l" | sed "s/\.rev.1.bt2l$//"`
  [ -z "$INDEX" ] && echo "Bowtie2 index files not found" 1>&2 && exit 1

  bowtie2 \
      -x $INDEX \
      -U TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.merged.fastq.gz \
      --threads 12 \
      --un-gz TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.gz \
       \
      2>| >(tee TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.bowtie2.log >&2) \
      | samtools sort  --threads 12  -o TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.bam -

  if [ -f TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.1.gz ]; then
      mv TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.1.gz TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped_1.fastq.gz
  fi

  if [ -f TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.2.gz ]; then
      mv TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped.fastq.2.gz TB007_TB007_EKDN250015205-1A_22VGFWLT4_L6.unmapped_2.fastq.gz
  fi

  cat <<-END_VERSIONS > versions.yml
  "NFCORE_TAXPROFILER:TAXPROFILER:SHORTREAD_HOSTREMOVAL:BOWTIE2_ALIGN":
      bowtie2: $(echo $(bowtie2 --version 2>&1) | sed 's/^.*bowtie2-align-s version //; s/ .*$//')
      samtools: $(echo $(samtools --version 2>&1) | sed 's/^.*samtools //; s/Using.*$//')
      pigz: $( pigz --version 2>&1 | sed 's/pigz //g' )
  END_VERSIONS

Command exit status:
  1

Command output:
  (empty)

Command error:
  Bowtie2 index files not found

Work dir:
  /mnt/NAS/work_03092025/cf/266a7cc6486141a9978677e32eaed9

Container:
  community.wave.seqera.io/library/bowtie2_htslib_samtools_pigz:edeb13799090a2a6

Tip: you can replicate the issue by changing to the process work dir and entering the command `bash .command.run`

 -- Check '.nextflow.log' file for details
ERROR ~ Pipeline failed. Please refer to troubleshooting docs: https://nf-co.re/docs/usage/troubleshooting

 -- Check '.nextflow.log' file for details

Relevant files

No response

System information

Nextflow version: 25.04.6
Executor: local
OS: WSL2 running on Windows 11
Version: nf-core/taxprofiler 1.2.4 (revision: 1385e2f [master])
