Description
We have been using dsync to migrate data from an old file system to a new file system (GPFS -> CephFS). The data migration works well and is meeting most of our needs.
In our data sets we have many small files (>95% of file count) and a smaller number of large files (>95% storage used). The dsync operations generally work well and perform the tree walks efficiently. The data copy performance is ultimately limited by the file sizes.
Something we have noticed, however, is that our batches can develop a very long tail. Throughput starts strong but then trails off to a trickle for a long time. The next batch then picks up again: throughput starts strong but again trails off into a long tail.
We suspect this is due to a large file in the batch: the rank processing that file hasn't finished, leaving the other ranks idle while they await the next batch. It seems that the file list is shared across ranks, but each file action (copy) is carried out by a single rank.
Is our intuition correct?
If so, is there a way to improve the copy portion of dsync? One solution could be to copy file data in parallel, assigning portions of a single transfer to idle ranks. This would enable all ranks to contribute to completing the transfer of the large file in that batch. Another option might be to allow the idle ranks to start on the next batch, avoiding a stall due to lack of work.
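To illustrate the first idea, here is a minimal sketch (in Python, purely for illustration; this is not dsync code, and all names and the chunk size are hypothetical) of how a large file's byte range could be partitioned into fixed-size work units that idle ranks could each claim and copy independently:

```python
CHUNK_SIZE = 64 * 1024 * 1024  # hypothetical work-unit size: 64 MiB

def partition_file(file_size, chunk_size=CHUNK_SIZE):
    """Split a file's byte range into (offset, length) work units."""
    return [(offset, min(chunk_size, file_size - offset))
            for offset in range(0, file_size, chunk_size)]

def copy_chunk(src_path, dst_path, offset, length):
    """Copy one byte range; each idle rank would execute one of these.

    Assumes the destination file has already been created at full size,
    so ranks can write their slices independently.
    """
    with open(src_path, "rb") as src, open(dst_path, "r+b") as dst:
        src.seek(offset)
        dst.seek(offset)
        remaining = length
        while remaining > 0:
            buf = src.read(min(remaining, 4 * 1024 * 1024))
            if not buf:
                break
            dst.write(buf)
            remaining -= len(buf)

# Example: a 200 MiB file becomes 4 independent work units,
# instead of one rank copying all 200 MiB alone.
units = partition_file(200 * 1024 * 1024)
```

The key point is that once the large file is expressed as many small work units, the same work distribution that already balances the tree walk could balance the data copy, so one large file no longer serializes the tail of a batch.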
We'd be interested in your feedback on this assessment and suggestions for improvement.
Thanks.