Welcome to the Pig documentation page! Apache Pig is a high-level platform for writing data-processing programs that run on Hadoop. Scripts are written in Pig Latin, which Pig compiles into MapReduce jobs. This guide will help you get started with Pig and its features.
Quick Start
Here's a simple example of a Pig script:
-- Load the data
data = LOAD '/path/to/data' USING PigStorage(',');
-- Name the first two fields, referenced by position as $0 and $1
data_schema = FOREACH data GENERATE $0 AS id, $1 AS name;
-- Store the result
STORE data_schema INTO '/path/to/output' USING PigStorage(',');
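The script above names fields after loading, by position. As a minimal sketch of an alternative, the schema can be declared in the LOAD statement itself with an AS clause. The relation names `users` and `valid` and the field types are assumptions chosen for illustration; the paths are the same placeholders used above.

```pig
-- Load comma-separated records and declare field names and types up front
-- (the types int and chararray are assumed for this example)
users = LOAD '/path/to/data' USING PigStorage(',') AS (id:int, name:chararray);
-- Values that cannot be converted to the declared type load as null,
-- so this keeps only the rows whose id parsed as an integer
valid = FILTER users BY id IS NOT NULL;
-- Store the result
STORE valid INTO '/path/to/output' USING PigStorage(',');
```

Saved to a file (for example a hypothetical `quickstart.pig`), a script like this can be tried without a cluster by running `pig -x local quickstart.pig`.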
Features
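- High-Level Abstractions: Pig Latin expresses operations such as load, filter, group, and join as single statements, so you do not have to write MapReduce code by hand.
- Schema on Read: Pig applies a schema when data is loaded rather than when it is written. Declaring field names and types is optional, but it makes the data easier to work with.
- Rich Ecosystem: Pig ships with built-in functions and can be extended with UDFs (User-Defined Functions) and third-party libraries.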
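To make these features concrete, here is a small sketch that combines a schema declared at load time, a built-in function, and a grouped aggregation. The input layout (an integer id followed by a name) and all relation names are assumptions for illustration, not part of any particular dataset.

```pig
-- Schema on read: field names and types are declared when the data is loaded
users = LOAD '/path/to/data' USING PigStorage(',') AS (id:int, name:chararray);
-- Built-in function: normalize names to upper case
normalized = FOREACH users GENERATE id, UPPER(name) AS name;
-- High-level abstraction: group and count without writing MapReduce code
by_name = GROUP normalized BY name;
name_counts = FOREACH by_name GENERATE group AS name, COUNT(normalized) AS occurrences;
-- Print the result to the console
DUMP name_counts;
```

Each statement defines a new relation from an earlier one. Pig plans and runs the underlying jobs only when it reaches an output statement such as DUMP or STORE.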
Resources
For more information, please visit the following resources: