Description
Hello,
I am copying files from a Lustre file system to Ceph storage over InfiniBand via a Slurm job, but I am getting very low throughput.
Below is the Slurm job script.
`#!/bin/bash
#SBATCH --job-name=user_copy
#SBATCH --partition=THK_CPU
#SBATCH --nodelist=mlqhpc-cpu-node15
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --hint=nomultithread
#SBATCH --time=0-00:00:00
#SBATCH --output=user_copy_%j.out
#SBATCH --error=user_copy_%j.err
# Load the mpifileutils module
module load custom/mpifileutils/0.12.1
# Define source and destination
SRC=/mnt/lustre/source_user
DEST=/users/
time srun dcp --progress 600 $SRC $DEST`
Below is the output:
[2025-10-29T16:51:08] Data: 65.783 GiB (70634049927 bytes)
[2025-10-29T16:51:08] Rate: 8.415 MiB/s (70634049927 bytes in 8004.996 seconds)
[2025-10-29T16:51:08] Started: Oct-29-2025,14:37:43
[2025-10-29T16:51:08] Completed: Oct-29-2025,16:51:08
[2025-10-29T16:51:08] Seconds: 8005.003
[2025-10-29T16:51:08] Items: 226505
[2025-10-29T16:51:08] Directories: 19478
[2025-10-29T16:51:08] Files: 200806
[2025-10-29T16:51:08] Links: 6221
[2025-10-29T16:51:08] Data: 65.783 GiB (70634049927 bytes)
[2025-10-29T16:51:08] Rate: 8.415 MiB/s (70634049927 bytes in 8005.003 seconds)
[2025-10-29T16:51:08] Updated 226505 items in 1622.983 secs (139.561 items/sec) done
[2025-10-29T16:51:08] Updated 226505 items in 1622.983 seconds (139.561 items/sec)
[2025-10-29T16:51:08] Syncing directory updates to disk.
[2025-10-29T16:51:08] Sync completed in 0.000 seconds.
[2025-10-29T16:51:08] Started: Oct-29-2025,14:37:43
[2025-10-29T16:51:08] Completed: Oct-29-2025,16:51:08
[2025-10-29T16:51:08] Seconds: 8005.287
[2025-10-29T16:51:08] Items: 226505
[2025-10-29T16:51:08] Directories: 19478
[2025-10-29T16:51:08] Files: 200806
[2025-10-29T16:51:08] Links: 6221
[2025-10-29T16:51:08] Data: 65.783 GiB (70634049927 bytes)
[2025-10-29T16:51:08] Rate: 8.415 MiB/s (70634049927 bytes in 8005.287 seconds)
The same data was copied in just 30 minutes with rsync. I am avoiding rsync because it puts load on the Lustre MDS.
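To put the two runs side by side, here is a rough sketch that recomputes the rates from the figures in the dcp summary. It assumes the rsync run took exactly 1800 seconds (the "30 minutes" is approximate) and that the 1622.983 second update phase is counted inside the total elapsed time; the variable names are only illustrative.

```shell
#!/bin/bash
# Figures copied from the dcp summary above.
TOTAL_BYTES=70634049927
ELAPSED_SECS=8005.003
FILE_COUNT=200806
UPDATE_SECS=1622.983
# Assumption: "30 minutes" for rsync is taken as exactly 1800 seconds.
RSYNC_SECS=1800

summary=$(awk -v b="$TOTAL_BYTES" -v s="$ELAPSED_SECS" -v f="$FILE_COUNT" \
    -v u="$UPDATE_SECS" -v r="$RSYNC_SECS" 'BEGIN {
    mib = 1024 * 1024
    # Overall dcp throughput in MiB/s (should match the reported Rate line).
    printf "dcp_rate_mib_s=%.3f\n", b / s / mib
    # Approximate rsync throughput for the same data set.
    printf "rsync_rate_mib_s=%.1f\n", b / r / mib
    # Average file size in KiB.
    printf "avg_file_kib=%.1f\n", b / f / 1024
    # Share of elapsed time spent in the "Updated ... items" phase
    # (assumes that phase is included in the total).
    printf "update_share_pct=%.1f\n", 100 * u / s
}')
echo "$summary"
```

By this estimate rsync was roughly four to five times faster, the average file is only a few hundred KiB, and about a fifth of the dcp run went to the final item update step, so the workload looks dominated by many small files rather than raw bandwidth.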
I need your help with this matter.
Cheers,
Ihsan