Fixing "ValueError: Error initializing torch.distributed using env:// rendezvous: environment variable ..."
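The program runs successfully from the command line, but the following error appears when it is run under a debugger:
Original code:
After the fix: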
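import os
import torch.distributed as dist

# The env:// rendezvous reads these variables; a debugger session does not set them.
os.environ['MASTER_ADDR'] = 'localhost'
os.environ['MASTER_PORT'] = '5678'
# Single-process group: rank 0 in a world of size 1.
dist.init_process_group(backend='nccl', init_method='env://', rank=0, world_size=1)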
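
The fix above always overwrites MASTER_ADDR and MASTER_PORT, which could clash with values exported by a launcher when the same script is started from the command line. One possible variant, sketched below, fills the variables in only when they are missing. The helper names set_rendezvous_defaults and init_single_process_group are hypothetical, and the fallback to the gloo backend on machines without CUDA is an assumption that is not part of the original fix.

```python
import os


def set_rendezvous_defaults(addr="localhost", port="5678"):
    """Fill in the env:// rendezvous variables only where they are missing."""
    defaults = {
        "MASTER_ADDR": addr,
        "MASTER_PORT": port,
        "RANK": "0",
        "WORLD_SIZE": "1",
    }
    for name, value in defaults.items():
        # setdefault keeps any value that a launcher has already exported.
        os.environ.setdefault(name, value)
    return {name: os.environ[name] for name in defaults}


def init_single_process_group():
    """Initialize a process group from the environment (hypothetical helper)."""
    # Imported here so the environment helper can be used without PyTorch.
    import torch
    import torch.distributed as dist

    set_rendezvous_defaults()
    # Assumption: NCCL needs a CUDA device, so fall back to gloo on CPU-only machines.
    backend = "nccl" if torch.cuda.is_available() else "gloo"
    if not dist.is_initialized():
        # With env://, rank and world size are read from RANK and WORLD_SIZE.
        dist.init_process_group(backend=backend, init_method="env://")
    return backend
```

Calling init_single_process_group() once at the start of the script should then work both under a debugger, where the defaults apply, and under a launcher, where the exported values are left untouched.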