Here are some example configurations for setting up Horovod in different frameworks:
TensorFlow Configuration
# tensorflow_config.yaml
horovod:
  framework: tensorflow
  backend: mpi
  master: tcp://localhost:2222
  workers:
    - rank: 0
      hostname: worker0
    - rank: 1
      hostname: worker1
PyTorch Configuration
# pytorch_config.yaml
horovod:
  framework: pytorch
  backend: nccl
  master: tcp://127.0.0.1:2222
  workers:
    - rank: 0
      hostname: worker0
    - rank: 1
      hostname: worker1
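A config like the ones above still has to be turned into a launch command. The following is a minimal sketch, assuming the YAML has already been parsed into a Python dict by your own tooling, and assuming one process slot per host. The helper name build_horovodrun_command, the script name train.py, and the slots_per_host parameter are hypothetical. Only the horovodrun options -np, -H, --mpi, and --gloo are used. The nccl value has no launcher flag in this sketch, because horovodrun chooses only between the MPI and Gloo controllers.

```python
import shlex


def build_horovodrun_command(config, script="train.py", slots_per_host=1):
    """Translate the example config dict into a horovodrun argument list.

    Hypothetical helper: the config layout follows the examples in this
    document and is not a schema that horovodrun reads by itself.
    """
    hvd_cfg = config["horovod"]
    # Sort by rank so the host list follows the rank order given in the config.
    workers = sorted(hvd_cfg["workers"], key=lambda w: w["rank"])
    hosts = ",".join(f"{w['hostname']}:{slots_per_host}" for w in workers)
    cmd = ["horovodrun", "-np", str(len(workers) * slots_per_host), "-H", hosts]
    # Only mpi and gloo map to a controller flag; nccl is left without one.
    controller_flags = {"mpi": "--mpi", "gloo": "--gloo"}
    flag = controller_flags.get(hvd_cfg.get("backend"))
    if flag:
        cmd.append(flag)
    cmd += ["python", script]
    return cmd


# The TensorFlow example from this document, written as a parsed dict.
config = {
    "horovod": {
        "framework": "tensorflow",
        "backend": "mpi",
        "master": "tcp://localhost:2222",
        "workers": [
            {"rank": 0, "hostname": "worker0"},
            {"rank": 1, "hostname": "worker1"},
        ],
    }
}

print(shlex.join(build_horovodrun_command(config)))
# prints: horovodrun -np 2 -H worker0:1,worker1:1 --mpi python train.py
```

Returning a list rather than a single string lets the result be passed directly to subprocess.run without shell quoting concerns.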
Key Configuration Parameters
framework: Choose between tensorflow, pytorch, or mxnet
backend: Use mpi, nccl, or gloo, based on your cluster setup
master: Specify the master node address (e.g., tcp://host:port)
workers: List all worker nodes with their ranks and hostnames
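The constraints on these four parameters can be checked before launching a job. The sketch below assumes the config has already been loaded into a dict. The function validate_config and the rule that ranks must run from 0 to n-1 without gaps are assumptions made for illustration, not part of Horovod's API.

```python
from urllib.parse import urlsplit

ALLOWED_FRAMEWORKS = {"tensorflow", "pytorch", "mxnet"}
ALLOWED_BACKENDS = {"mpi", "nccl", "gloo"}


def validate_config(config):
    """Return a list of problems found; an empty list means the config passed."""
    errors = []
    hvd_cfg = config.get("horovod", {})

    if hvd_cfg.get("framework") not in ALLOWED_FRAMEWORKS:
        errors.append(f"unknown framework: {hvd_cfg.get('framework')!r}")
    if hvd_cfg.get("backend") not in ALLOWED_BACKENDS:
        errors.append(f"unknown backend: {hvd_cfg.get('backend')!r}")

    # master is expected to look like tcp://host:port.
    try:
        parts = urlsplit(str(hvd_cfg.get("master", "")))
        master_ok = (
            parts.scheme == "tcp"
            and bool(parts.hostname)
            and parts.port is not None
        )
    except ValueError:  # raised by .port when the port is not a valid number
        master_ok = False
    if not master_ok:
        errors.append("master must look like tcp://host:port")

    # Assumed rule: ranks are exactly 0..n-1, each used once.
    workers = hvd_cfg.get("workers") or []
    ranks = [w.get("rank") for w in workers]
    if not workers or set(ranks) != set(range(len(workers))):
        errors.append("worker ranks must be 0..n-1 with no duplicates")
    if not all(w.get("hostname") for w in workers):
        errors.append("every worker needs a hostname")

    return errors


good = {
    "horovod": {
        "framework": "pytorch",
        "backend": "nccl",
        "master": "tcp://127.0.0.1:2222",
        "workers": [
            {"rank": 0, "hostname": "worker0"},
            {"rank": 1, "hostname": "worker1"},
        ],
    }
}
print(validate_config(good))  # prints: []
```

Collecting all problems in a list, instead of raising on the first one, lets a user fix a config file in a single pass.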
For more details on Horovod configuration options, please visit /Horovod/docs/