This repository contains learning projects that I developed while completing the Yandex Practicum Data Science course (certificate attached). Each project is a practical task aimed at developing key skills in data analysis, machine learning, and working with large datasets.

Name | Description | Tools |
---|---|---|
Toxic Comment Recognition | I trained a model that classifies the toxicity of comments, which automates and speeds up moderation. | nltk, tqdm, TfidfVectorizer, LogisticRegression, ComplementNB, LGBMClassifier |
Customer Age Estimation | I developed a model to determine customers' age from photos, enabling targeted product recommendations and ensuring compliance in alcohol sales. | keras, CV |
Optimizing Oil Well Drilling: Profit and Risk Analysis Using Machine Learning | I built a model to identify the region with maximum profit and analyzed it along with the associated risks using the Bootstrap technique. | Bootstrap, Pipeline, ColumnTransformer, LinearRegression, Lasso, Ridge, Scalers, GridSearchCV, RandomizedSearchCV |
Taxi Orders Forecasting | I developed a model that forecasts taxi orders from time-based features and lag variables, achieving strong performance. | decompose, LGBMRegressor, CatBoostRegressor, LinearRegression, ElasticNet, FunctionTransformer |
Predicting Employee Satisfaction and Attrition | I predicted employee job satisfaction levels and attrition from the available data. | Pipeline, ColumnTransformer, Encoders, Scalers, GridSearchCV, RandomizedSearchCV, LinearRegression, LogisticRegression, KNeighborsClassifier, DecisionTreeClassifier, DecisionTreeRegressor, SVC |
Optimizing Steel Production: Temperature Prediction and Process Stability Enhancement | To optimize the production process and reduce energy consumption, the best model for predicting alloy temperature was selected. | GridSearchCV, PolynomialFeatures, LinearRegression, Lasso, Ridge, ElasticNet, CatBoostRegressor, LGBMRegressor, RandomForestRegressor, SHAP |
Car Price Prediction | The goal of the project was to develop a model that accurately predicts car prices based on historical data, meeting the client’s requirements for high accuracy, fast prediction speed, and minimal training time. | ColumnTransformer, sklearn, LGBMRegressor, CatBoostRegressor, SHAP |
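
The oil well project above relies on the Bootstrap technique to estimate profit and risk. The snippet below is a minimal sketch of that idea, not the code from the project itself. The function name `bootstrap_profit`, its parameters, and all numeric values (sample sizes, revenue per unit, budget) are hypothetical placeholders, and the data is synthetic.

```python
import numpy as np


def bootstrap_profit(predicted, actual, n_samples=1000, sample_size=500,
                     top_k=200, revenue_per_unit=1.0, budget=0.0, seed=0):
    """Estimate the profit distribution for one region by bootstrapping.

    All parameter names and defaults are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    predicted = np.asarray(predicted, dtype=float)
    actual = np.asarray(actual, dtype=float)

    profits = np.empty(n_samples)
    for i in range(n_samples):
        # Draw a random subset of wells with replacement.
        idx = rng.choice(len(predicted), size=sample_size, replace=True)
        # Keep the wells with the highest predicted reserves in this subset.
        best = idx[np.argsort(predicted[idx])[-top_k:]]
        # Profit is computed from the actual reserves of the selected wells.
        profits[i] = actual[best].sum() * revenue_per_unit - budget

    lower, upper = np.quantile(profits, [0.025, 0.975])
    return {
        "mean_profit": float(profits.mean()),
        "ci_95": (float(lower), float(upper)),
        # Share of bootstrap samples that end with a loss.
        "loss_risk": float((profits < 0).mean()),
    }


# Synthetic data standing in for model predictions and true reserves.
demo_rng = np.random.default_rng(42)
demo_actual = demo_rng.normal(100, 20, size=2000)
demo_predicted = demo_actual + demo_rng.normal(0, 10, size=2000)
demo_result = bootstrap_profit(demo_predicted, demo_actual, n_samples=200)
print(demo_result)
```

Running the same function on each region's predictions makes it possible to compare regions by mean profit, 95% interval, and probability of loss.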