Welcome to the getting started guide for Horovod, the distributed training framework for TensorFlow, Keras, and PyTorch. This document will help you get up and running with Horovod quickly.
Prerequisites
Before you start, make sure you have the following:
- Python 3.5 or newer
- TensorFlow, Keras, or PyTorch installed
- Horovod installed (you can install it with pip: `pip install horovod`)
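If you need GPU support, or want to be sure Horovod is built against a specific framework, the build can be steered with environment variables at install time. A minimal sketch, assuming TensorFlow and an NCCL-capable GPU setup (adjust the flags to your environment):

```bash
# Build Horovod with TensorFlow support, using NCCL for GPU allreduce.
# HOROVOD_WITH_TENSORFLOW=1 makes the build fail loudly if TensorFlow
# support cannot be compiled in, instead of silently skipping it.
HOROVOD_GPU_OPERATIONS=NCCL HOROVOD_WITH_TENSORFLOW=1 pip install horovod[tensorflow]
```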
Quick Start
To get started, follow these steps:
- Create a New Project: Create a new directory for your project and navigate into it.
- Initialize a Git Repository: Run `git init` to initialize a new Git repository.
- Write Your First Training Script: Create a new Python file, for example `train.py`, and write your first training script using Horovod.
Here's a simple example using TensorFlow's Keras API with Horovod. Save it as `train.py`:

```python
import tensorflow as tf
import horovod.tensorflow.keras as hvd

# Initialize Horovod
hvd.init()

# Pin each worker process to a single GPU (one process per GPU)
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
if gpus:
    tf.config.experimental.set_visible_devices(gpus[hvd.local_rank()], 'GPU')

# Create a simple model
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Scale the learning rate by the number of workers, and wrap the optimizer
# so gradients are averaged across workers with allreduce
optimizer = hvd.DistributedOptimizer(tf.keras.optimizers.SGD(0.01 * hvd.size()))

# Compile the model (labels are integers, so use the sparse loss)
model.compile(optimizer=optimizer,
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Load and normalize the data
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

# Broadcast initial weights from rank 0 so all workers start in sync
callbacks = [hvd.callbacks.BroadcastGlobalVariablesCallback(0)]

# Train the model
model.fit(x_train, y_train, epochs=5, batch_size=32, callbacks=callbacks)
```
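In a real run you will usually want only one worker to write checkpoints and print progress, so the workers don't clobber each other's files. Extending the script above, a minimal sketch of that pattern (the checkpoint filename is just an illustration):

```python
# Save checkpoints and log verbosely only on rank 0; other workers stay quiet
if hvd.rank() == 0:
    callbacks.append(tf.keras.callbacks.ModelCheckpoint('./checkpoint-{epoch}.h5'))

model.fit(x_train, y_train, epochs=5, batch_size=32,
          callbacks=callbacks, verbose=1 if hvd.rank() == 0 else 0)
```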
- Run the Training Script: Launch the script with `horovodrun`. The `-np` flag sets the number of worker processes to start; for GPU training, launch one process per GPU. To scale across machines, see the example after this list.

```bash
horovodrun -np 2 python train.py
```
- Monitor the Training: You can monitor training by watching the output in the terminal; each worker process prints its own progress.
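To scale past a single machine, `horovodrun` also accepts a host list with a slot count per host. The hostnames below are placeholders; substitute your own:

```bash
# 4 processes total: 2 on server1 and 2 on server2
horovodrun -np 4 -H server1:2,server2:2 python train.py
```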
Next Steps
- Learn more about Horovod's architecture
- Explore advanced features of Horovod
- Check out Horovod's examples
Conclusion
Congratulations! You've successfully started using Horovod for distributed training. Keep exploring and expanding your knowledge about Horovod and distributed computing. Happy training! 🎉