A collection of educational projects demonstrating the use of various methods and technologies in Data Science, from exploratory data analysis to building machine learning models.

masha-ds/yp-projects

My study projects

This repository contains the learning projects I completed during the Yandex Practicum Data Science course (certificate attached). Each project is a practical task aimed at developing key skills in data analysis, machine learning, and working with large datasets.

| Name | Description | Tools |
| --- | --- | --- |
| Toxic Comment Recognition | I trained a model to classify the toxicity of comments, which automates and speeds up moderation. | nltk, tqdm, TfidfVectorizer, LogisticRegression, ComplementNB, LGBMClassifier |
| Customer Age Estimation | I developed a model to determine customers' age from photos, enabling targeted product recommendations and ensuring compliance in alcohol sales. | keras, CV |
| Optimizing Oil Well Drilling: Profit and Risk Analysis Using Machine Learning | I built a model to identify the region with maximum profit and analyzed that profit and the associated risks using the Bootstrap technique. | Bootstrap, Pipeline, ColumnTransformer, LinearRegression, Lasso, Ridge, Scalers, GridSearchCV, RandomizedSearchCV |
| Taxi Orders Forecasting | I developed a model that predicts taxi orders using time-based features and lag variables. | decompose, LGBMRegressor, CatBoostRegressor, LinearRegression, ElasticNet, FunctionTransformer |
| Predicting Employee Satisfaction and Attrition | I predicted employee job satisfaction levels and instances of turnover based on the available data. | Pipeline, ColumnTransformer, Encoders, Scalers, GridSearchCV, RandomizedSearchCV, LinearRegression, LogisticRegression, KNeighborsClassifier, DecisionTreeClassifier, DecisionTreeRegressor, SVC |
| Optimizing Steel Production: Temperature Prediction and Process Stability Enhancement | To optimize the production process and reduce energy consumption, I selected the best model for predicting alloy temperature. | GridSearchCV, PolynomialFeatures, LinearRegression, Lasso, Ridge, ElasticNet, CatBoostRegressor, LGBMRegressor, RandomForestRegressor, SHAP |
| Car Price Prediction | The goal was to develop a model that accurately predicts car prices from historical data, meeting the client’s requirements for high accuracy, fast prediction speed, and minimal training time. | ColumnTransformer, sklearn, LGBMRegressor, CatBoostRegressor, SHAP |
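
As a rough illustration of the Bootstrap technique mentioned for the oil well project, here is a minimal sketch using only the Python standard library. It is not the code from that project: the function name `bootstrap_profit`, its parameters, and the sample data are all hypothetical, and the real analysis may choose wells and compute profit differently. The idea is to resample per-well profits with replacement many times, then read a mean, a 95% percentile interval, and the share of resamples that end in a loss.

```python
import random
import statistics


def bootstrap_profit(profits, sample_size, n_resamples=1000, seed=0):
    """Estimate total-profit statistics by resampling with replacement.

    `profits` is a list of per-well profit values (hypothetical input).
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    totals = []
    for _ in range(n_resamples):
        # Draw `sample_size` wells with replacement and sum their profit.
        sample = rng.choices(profits, k=sample_size)
        totals.append(sum(sample))

    totals.sort()
    # Approximate 2.5% and 97.5% percentiles of the resampled totals.
    lower = totals[int(0.025 * n_resamples)]
    upper = totals[int(0.975 * n_resamples) - 1]
    # Risk of loss: fraction of resamples with negative total profit.
    loss_risk = sum(t < 0 for t in totals) / n_resamples

    return {
        "mean": statistics.fmean(totals),
        "ci95": (lower, upper),
        "loss_risk": loss_risk,
    }


# Synthetic per-well profits, made up for illustration only.
example_profits = [1.5, -0.4, 2.1, 0.3, -1.2, 0.9, 1.1, -0.2]
summary = bootstrap_profit(example_profits, sample_size=5)
```

Comparing `summary["loss_risk"]` and `summary["ci95"]` across regions is one way to rank them by risk as well as by expected profit.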
