I'm glad to hear that your issue has been resolved, and thanks for sharing the information, which might benefit other community members reading this thread. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your response as an answer in case you'd like to accept it. This will help other users with a similar query find the solution more easily.
Query: How to improve speed of data transfer of RO_MOUNT Data asset in Azure ML Job?
Solution: The issue is resolved. I don't know why, but the issue in the previous comment got resolved by setting `pin_memory` in the `DataLoader` to `False`.

So, a TL;DR for this problem is as follows:
- In the `command` function of the v2 SDK, set the `shm_size` parameter to roughly half of the VM's RAM capacity.
- Increase the number of workers in the `DataLoader` to the CPU core count or CPU core count - 1 (see the DataLoader sketch further below).
- Disable pin memory if using large VMs (CPU core count > 4).
- If the data is too large to store on the VM's local disk, set the data asset's `mode` parameter to `ro_mount` and, in the `command` function, set the `environment_variables` parameter to this dictionary:
```python
# parameter in `command` function
environment_variables=dict(
    DATASET_MOUNT_BLOCK_BASED_CACHE_ENABLED=True,   # enable block-based caching
    DATASET_MOUNT_BLOCK_FILE_CACHE_ENABLED=False,   # disable caching on disk
    DATASET_MOUNT_MEMORY_CACHE_SIZE=0,              # disable in-memory caching
)
```
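Putting these pieces together, here is a minimal sketch of what the full job definition might look like with the v2 SDK. The data asset name, environment, compute target, script path, and `shm_size` value are placeholders (the sketch assumes a VM with roughly 64 GB of RAM), so substitute your own:

```python
# Hypothetical Azure ML v2 SDK job configuration; asset, environment, and compute
# names below are placeholders, not values from the original thread.
from azure.ai.ml import command, Input
from azure.ai.ml.constants import AssetTypes, InputOutputModes

job = command(
    code="./src",                                 # placeholder source folder
    command="python train.py --data ${{inputs.training_data}}",
    inputs={
        "training_data": Input(
            type=AssetTypes.URI_FOLDER,
            path="azureml:my_data_asset:1",       # placeholder data asset name/version
            mode=InputOutputModes.RO_MOUNT,       # mount read-only instead of downloading
        )
    },
    environment="azureml:my-training-env:1",      # placeholder environment
    compute="gpu-cluster",                        # placeholder compute target
    shm_size="32g",                               # ~1/2 of the VM's RAM, per the tip above
    environment_variables=dict(
        DATASET_MOUNT_BLOCK_BASED_CACHE_ENABLED=True,   # enable block-based caching
        DATASET_MOUNT_BLOCK_FILE_CACHE_ENABLED=False,   # disable caching on disk
        DATASET_MOUNT_MEMORY_CACHE_SIZE=0,              # disable in-memory caching
    ),
)
```

The job can then be submitted as usual, for example with `ml_client.jobs.create_or_update(job)`.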
This GitHub link is my source for most of these parameter settings: best-practices ViT-Pretrain
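For the DataLoader-side tips above (worker count and pin memory), a minimal sketch might look like the following; the dataset and batch size are placeholders for illustration only:

```python
# Minimal DataLoader sketch; the TensorDataset below is a stand-in for your real dataset.
import os
import torch
from torch.utils.data import DataLoader, TensorDataset

train_dataset = TensorDataset(
    torch.randn(1024, 3, 224, 224),        # placeholder images
    torch.randint(0, 10, (1024,)),         # placeholder labels
)

cpu_count = os.cpu_count() or 1

train_loader = DataLoader(
    train_dataset,
    batch_size=64,                         # placeholder batch size
    num_workers=max(cpu_count - 1, 1),     # roughly CPU core count - 1, per the tip above
    pin_memory=False,                      # disabling pinned memory resolved the slowdown here
    shuffle=True,
)
```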
If you have any further questions or concerns, please don't hesitate to ask. We're always here to help.
Do click Accept Answer and Yes for "Was this answer helpful?".