Torchvision is part of the PyTorch ecosystem. It provides pre-trained models, dataset loaders, and image transforms for computer vision research and application development.
Features
- Pre-trained Models: torchvision ships pre-trained models for tasks such as image classification, object detection, and segmentation, which can be used as a starting point for your projects.
- Datasets: It provides loaders for popular datasets such as CIFAR-10, ImageNet, and COCO.
- Transforms: torchvision provides a set of transforms that can be applied to images to prepare them for training or inference.
Getting Started
To get started with torchvision, you can install it using pip:
pip install torchvision
Quick Start
Here's a simple example of loading an image and applying a pre-trained model using torchvision:
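import torch
import torchvision.transforms as transforms
from PIL import Image
from torchvision import models
# Load an image and force three RGB channels
image = Image.open('path/to/image.jpg').convert('RGB')
# Define a transform that resizes, crops, and normalizes the image with ImageNet statistics
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
# Apply the transform and add a batch dimension
image = transform(image).unsqueeze(0)
# Load a pre-trained model (in torchvision 0.13+, `weights` replaces the deprecated `pretrained=True`)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
# Switch to evaluation mode so batch normalization uses its stored statistics
model.eval()
# Perform inference without tracking gradients
with torch.no_grad():
    output = model(image)
# Print the index of the predicted ImageNet class
print(output.argmax(1))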
Further Reading
For more information on torchvision, please refer to the official documentation.