Kaggle Starter Codes for common ML tasks

A repository containing some of my Kaggle notebooks, which may be helpful as starter code for beginners.

My Kaggle Profile: https://www.kaggle.com/parulpandey

Time Series Analysis

Time series data is a sequence of data points in chronological order that businesses use to analyze past data and make future predictions. In this notebook, I introduce some common techniques used in time-series analysis and walk through the iterative steps required to manipulate and visualize time-series data.
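As a rough illustration of one common manipulation step, here is a minimal sketch of a trailing moving average, which smooths a chronologically ordered series. It uses only the standard library; the prices, dates, and the `rolling_mean` helper are hypothetical and are not taken from the notebook.

```python
from collections import deque
from datetime import date, timedelta


def rolling_mean(values, window):
    """Return the trailing moving average; None until the window is full."""
    if window <= 0:
        raise ValueError("window must be positive")
    buf = deque(maxlen=window)
    running = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            running -= buf[0]  # value that is about to leave the window
        buf.append(v)
        running += v
        out.append(running / window if len(buf) == window else None)
    return out


# Hypothetical daily closing prices, in chronological order.
start = date(2020, 1, 1)
closes = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]
dates = [start + timedelta(days=i) for i in range(len(closes))]

smoothed = rolling_mean(closes, window=3)
for d, c, s in zip(dates, closes, smoothed):
    print(d.isoformat(), c, s)
```

In practice a library such as pandas would usually handle this; the hand-written version is only meant to show what the smoothing does.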

The objective of this notebook is to explore the given NIFTY-50 data, along with the sectoral indices, and visualise them to obtain important information. However, if you want to understand the nuances of time series data and how to get started with it, there is another notebook which caters to that.


NLP - Getting started

This notebook explains the concepts of NLP with respect to this current competition. NLP is the field of study that focuses on the interactions between human language and computers. NLP sits at the intersection of computer science, artificial intelligence, and computational linguistics [source]. NLP is a way for computers to analyze, understand, and derive meaning from human language in a smart and useful way. This Kaggle notebook with basic codes t

This notebook is the second part of the Getting started with NLP notebooks. In this notebook, we study the various ways of vectorizing text data.
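One of the simplest ways to vectorize text is a bag-of-words count matrix. The sketch below is a standard-library illustration of that idea, not the notebook's code; the helper names (`tokenize`, `fit_vocabulary`, `transform`) and the two sample sentences are assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def fit_vocabulary(documents):
    """Map each distinct token to a column index, in sorted order."""
    tokens = {tok for doc in documents for tok in tokenize(doc)}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def transform(documents, vocabulary):
    """Turn each document into a list of token counts, one column per token."""
    rows = []
    for doc in documents:
        counts = Counter(tokenize(doc))
        # Tokens outside the vocabulary are simply ignored.
        rows.append([counts.get(tok, 0) for tok in vocabulary])
    return rows


docs = ["The cat sat", "The cat saw the dog"]
vocab = fit_vocabulary(docs)
matrix = transform(docs, vocab)
print(vocab)   # {'cat': 0, 'dog': 1, 'sat': 2, 'saw': 3, 'the': 4}
print(matrix)  # [[1, 0, 1, 0, 1], [1, 1, 0, 1, 2]]
```

Each row is one document and each column one vocabulary word, so word order is discarded and only counts remain.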


Interpreting Machine Learning models

Extracting human-understandable insights from any Machine Learning model. Some techniques explained in this notebook are:

  • Permutation Importance using ELI5 library
  • Partial Dependence Plots
  • SHAP Values
  • Advanced Uses of SHAP Values
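To give a feel for the first technique in the list, here is a minimal sketch of permutation importance written with the standard library only. It does not use ELI5 and is not the notebook's code. The data, the stand-in `predict` function, and the `permutation_importance` helper are all hypothetical. The idea is to shuffle one feature column at a time and measure how much the model's error grows.

```python
import random


def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in error when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mean_squared_error(y, [predict(row) for row in X])
    importances = []
    for col in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            shuffled = [row[col] for row in X]
            rng.shuffle(shuffled)
            # Rebuild the rows with only this one column permuted.
            X_perm = [row[:col] + [v] + row[col + 1:] for row, v in zip(X, shuffled)]
            score = mean_squared_error(y, [predict(row) for row in X_perm])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Hypothetical data: the target depends on feature 0 only.
data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]


def predict(row):
    """Stand-in for a fitted model; here it is the true relationship."""
    return 3.0 * row[0]


importances = permutation_importance(predict, X, y)
print(importances)
```

Shuffling feature 0 should raise the error noticeably, while shuffling feature 1 leaves it unchanged, because the stand-in model never reads that column.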

4 Getting started with Dimensionality Reduction Techniques in Python

A three-part series on dimensionality reduction techniques using the Kannada MNIST dataset. In this series of notebooks, we study three techniques: PCA, t-SNE and UMAP.
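Of the three techniques, PCA is the easiest to sketch from scratch. The example below assumes NumPy is available and uses small synthetic data in place of Kannada MNIST. The `pca` helper is hypothetical and is not the notebook's code. t-SNE and UMAP are not covered here.

```python
import numpy as np


def pca(X, n_components):
    """Project X onto its top principal components using the SVD."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    explained_variance = (S ** 2) / (X.shape[0] - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    return X_centered @ components.T, components, ratio


# Synthetic data: 10 observed features driven by 2 latent factors plus noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.01 * rng.normal(size=(500, 10))

embedding, components, ratio = pca(X, n_components=2)
print(embedding.shape, ratio.sum())
```

Because the synthetic data has only two underlying factors, two components should capture nearly all of the variance. Real image data would typically need more.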


5 Getting started with Geospatial Data in Python

Python offers libraries for a wide range of data visualisation needs. One such library is Folium, which comes in handy for visualising geographic data (Geo data). Geo data science is a subset of data science that deals with location-based data, i.e., the description of objects and their relationships in space.

6 Getting started with H2O libraries in Python