Framework for a training project
- Fork the project and clone it to your development environment
- Install a Hadoop cluster on your instance
- Select data sources and download the data. Optionally, generate more data with Pig commands
- Document your steps as you go, in data/data-notes.md or doc/doc-notes.md
- Set up Flume to import the data from a local directory. Optionally, import it from an FTP site or read it from a network source
- Create Hive queries to analyze the data
- Outline future steps
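
For the Flume step, the local-directory import can be sketched as a small Python helper that renders an agent configuration: a spooling-directory source feeding a memory channel and an HDFS sink. This is only a starting point under stated assumptions. The function name, the component names (`src1`, `ch1`, `sink1`), the agent name, and both paths are hypothetical placeholders; replace them with the values for your own instance.

```python
def flume_spooldir_config(agent, spool_dir, hdfs_path):
    """Render a Flume agent config as text.

    Layout: spooling-directory source -> memory channel -> HDFS sink.
    The component names src1, ch1 and sink1 are illustrative placeholders.
    """
    props = {
        # Declare the components that belong to this agent.
        f"{agent}.sources": "src1",
        f"{agent}.channels": "ch1",
        f"{agent}.sinks": "sink1",
        # Source: watch a local directory for new files.
        f"{agent}.sources.src1.type": "spooldir",
        f"{agent}.sources.src1.spoolDir": spool_dir,
        f"{agent}.sources.src1.channels": "ch1",
        # Channel: buffer events in memory between source and sink.
        f"{agent}.channels.ch1.type": "memory",
        # Sink: write events to HDFS as plain data files.
        f"{agent}.sinks.sink1.type": "hdfs",
        f"{agent}.sinks.sink1.hdfs.path": hdfs_path,
        f"{agent}.sinks.sink1.hdfs.fileType": "DataStream",
        f"{agent}.sinks.sink1.channel": "ch1",
    }
    return "\n".join(f"{key} = {value}" for key, value in props.items()) + "\n"


if __name__ == "__main__":
    # Assumed example values; adjust to your environment.
    print(flume_spooldir_config(
        agent="agent1",
        spool_dir="/tmp/flume-spool",
        hdfs_path="hdfs://localhost:8020/user/training/raw",
    ))
```

Saving the printed output to a file and passing it to `flume-ng agent` with `--name agent1` and `--conf-file` should start the import, assuming Flume is installed and HDFS is reachable at the given path. Record the values you chose in data/data-notes.md or doc/doc-notes.md.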