Torchvision is part of the PyTorch ecosystem. It provides pre-trained models, dataset loaders, and image transforms that support computer vision research and application development.

Features

  • Pre-trained Models: torchvision comes with a variety of pre-trained models that can be used as a starting point for your projects.
  • Datasets: It provides access to popular datasets such as CIFAR-10, ImageNet, and COCO.
  • Transforms: torchvision provides a set of transforms that can be applied to images to prepare them for training or inference.

Getting Started

To get started with torchvision, you can install it using pip:

pip install torchvision

Quick Start

Here's a simple example of loading an image and applying a pre-trained model using torchvision:

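import torch
from PIL import Image
import torchvision.transforms as transforms
from torchvision import models

# Load an image and force three RGB channels
image = Image.open('path/to/image.jpg').convert('RGB')

# Define a transform that resizes, crops, and normalizes the image
# using the ImageNet channel means and standard deviations
transform = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

# Apply the transform and add a batch dimension
image = transform(image).unsqueeze(0)

# Load a pre-trained model (in torchvision 0.13 and later, the weights
# argument replaces the deprecated pretrained=True)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()  # put batch norm layers in inference mode

# Perform inference without tracking gradients
with torch.no_grad():
    output = model(image)

# Print the index of the highest-scoring ImageNet class
print(output.argmax(1))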

Further Reading

For more information on torchvision, please refer to the official documentation.
