All necessary libraries are listed at the beginning of the notebook. The code should run without issues on Python 3.x.
This project is part of an assignment for the Udacity Data Scientist nanodegree course. In it, students practice their data science (CRISP-DM) skills on a real-world problem and communicate their findings in a blog post. For my assignment, I chose to analyze data from Untappd, a geosocial networking app that lets users check in beers and share reviews with their friends and other users. I'll try to answer three questions:
- What contributes to the way people rate beer?
- Is there a way to predict if you'll like a certain beer?
- How could data further enhance the Untappd experience?
The files include two Jupyter notebooks: the main one analyzes an Untappd dataset that a user published on Kaggle, and the second repeats the analysis with my own Untappd data.
I wrote a blog post about my findings on Medium.
I'd like to thank Theun van Vliet for providing feedback and inspiration. Thanks also to Nikita Gruntov, who made his Untappd data public on Kaggle, and to the fellow authors who analyze beer data and write about it. Feel free to use the code provided here to analyze your own Untappd data!