Welcome to the Pig documentation page! Apache Pig is a high-level platform for writing data-processing programs that run on Hadoop. Scripts are written in the Pig Latin language and compiled into MapReduce jobs. This guide helps you get started with Pig and its features.

Quick Start

Here's a simple example of a Pig script:

-- Load the data
data = LOAD '/path/to/data' USING PigStorage(',');

-- Name the first two fields (referenced by position as $0 and $1)
data_schema = FOREACH data GENERATE $0 AS id, $1 AS name;

-- Store the result
STORE data_schema INTO '/path/to/output' USING PigStorage(',');

Features

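  • High-Level Abstractions: Pig Latin expresses operations such as loading, filtering, grouping, and joining as short statements, which Pig translates into MapReduce jobs for you.
  • Schema on Read: You can declare field names and types when you load data, or omit them and refer to fields by position, so the data does not need a fixed schema before it is stored.
  • Rich Ecosystem: Pig ships with built-in functions and supports UDFs (User-Defined Functions) and third-party libraries.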
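
To make the schema-on-read idea concrete, here is a minimal sketch. It assumes a comma-separated input whose first two fields are an integer id and a text name; the relation names (users, named, upper) and the field types are illustrative choices, not part of the original example. UPPER is one of Pig's built-in functions.

```pig
-- Assumption: each input line looks like "42,alice" (id, then name).
-- Declare field names and types at load time with an AS clause.
users = LOAD '/path/to/data' USING PigStorage(',') AS (id:int, name:chararray);

-- Fields can now be referenced by name instead of by position ($0, $1).
named = FILTER users BY name IS NOT NULL;

-- Apply a built-in function to a named field.
upper = FOREACH named GENERATE id, UPPER(name) AS name_upper;

-- Print the schema Pig has inferred for the result.
DESCRIBE upper;
```

Declaring the schema in LOAD means later statements can use id and name directly. A user-defined function would be called the same way as UPPER once its jar or script has been made available with REGISTER.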

Resources

For more information, please visit the following resources:

Image

Here's a picture of a Pig mascot to celebrate Pig's features:

[Image: Pig Mascot]