In Module 4, we covered unsupervised learning, time series, and deep learning. We'll have a presentation in class where you will present your project to your fellow Data Scientists!
Your project must be focused on one of the following topics:
- Time Series
- Recommendation Systems
- Clustering
- Deep Neural Networks
- Convolutional Neural Networks
Your project should contain:
- Problem Statement
  - A description of the problem at hand. What problem are you trying to solve? Why is it a problem?
- Evaluation metrics of your models and an explanation of which model is best for your given domain
- A description/visual representation of your data preparation and modeling pipeline
- At least 3 visualizations
- Conclusion
  - So what? Given your analysis, what do we now know about our initial problem statement?
Your final product can either be an analysis (in a Jupyter notebook, dashboard, or slideshow) or a web application (with Flask or Dash).
Data can be sourced from existing online databases or scraped from the web. If your project is a classification task, you should have at least 1000 observations belonging to each class.
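One way to check the 1000-observations-per-class guideline before modeling is to count labels directly. The snippet below is a minimal sketch using only the Python standard library; the function name `underrepresented_classes` and the example labels are hypothetical and only for illustration.

```python
from collections import Counter

# Threshold taken from the guideline above.
MIN_PER_CLASS = 1000


def underrepresented_classes(labels, min_per_class=MIN_PER_CLASS):
    """Return {class: count} for every class with fewer than min_per_class observations."""
    counts = Counter(labels)
    return {cls: n for cls, n in counts.items() if n < min_per_class}


# Hypothetical labels, for illustration only.
labels = ["cat"] * 1200 + ["dog"] * 950
print(underrepresented_classes(labels))
```

An empty dictionary means every class meets the threshold. Any class that is returned is a candidate for collecting more data or for reconsidering how the classes are defined.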
Some useful sources of data for Deep Learning are:
- http://www.image-net.org/
- http://deeplearning.net/datasets/
- https://skymind.ai/wiki/open-datasets
- https://www.analyticsvidhya.com/blog/2018/03/comprehensive-collection-deep-learning-datasets/
- https://toolbox.google.com/datasetsearch
- http://cs229.stanford.edu/projects.html
- https://ml-showcase.com/
- http://www.yaronhadad.com/deep-learning-most-amazing-applications/
- Another option is to try and recreate results from a research paper