Portfolio

Denis Gomonov’s Data Science Portfolio

Goal (2024): Practical Application of NLP/Computer Vision

Led a team of 4 to build a subscription-based political database SaaS platform
Collected data via API calls from multiple government sources, such as congress.gov, pa.gov, and data.gov
Built the platform on a modern stack including Next.js, Tailwind, httpx, and Clerk
Researched data licensing, legal matters, and service costs

snippet from the website + NLP work:

Analyzed the data and created visualizations to explore trends and the distinct features associated with fraudulent claims
Prepared and transformed data acquired from Kaggle, using categorical encoding to fit a Keras model
Built a model with 32 neurons in the input layer, 32 neurons in one hidden layer, and a final dense layer, using binary cross-entropy loss and the Adam optimizer, resulting in loss: 0.9092 - accuracy: 0.9361 on test data
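Loss and accuracy measure different things, which is why both are reported. A minimal sketch in plain Python (not the notebook's Keras code; the labels and probabilities below are made up for illustration) of how binary cross-entropy and thresholded accuracy are computed, and how one confidently wrong prediction can raise the loss while accuracy stays fairly high:

```python
import math

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    """Mean binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

def accuracy(y_true, y_prob, threshold=0.5):
    """Share of predictions that land on the correct side of the threshold."""
    hits = sum(int(p >= threshold) == y for y, p in zip(y_true, y_prob))
    return hits / len(y_true)

# Hypothetical labels (1 = fraudulent) and predicted probabilities.
y_true = [0, 0, 0, 1]
y_prob = [0.1, 0.2, 0.1, 0.001]  # the one fraud case is missed with high confidence

loss = binary_crossentropy(y_true, y_prob)
acc = accuracy(y_true, y_prob)
print(f"loss: {loss:.4f} - accuracy: {acc:.4f}")
```

Three of the four predictions are correct, yet the single confident miss contributes most of the loss.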

snippet from Notebook:

Animation

Analyzed and transformed BTC datasets to exclude bot entries and line up Twitter entries with market ticker prices
Cleaned text data by removing stop words and stemming, then calculated sentiment scores using subjectivity, polarity, and VADER metrics
Created 3 demo hourly visualizations plotting scaled VADER sentiment scores against BTC volume, BTC USD, and BTC close price
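To plot sentiment against a price series, the scores first have to be aggregated per hour and rescaled to the price axis. A minimal sketch of that idea, assuming tweet-level compound scores are already computed; the timestamps, scores, prices, and helper names are hypothetical and this is not the notebook's actual code:

```python
from collections import defaultdict
from datetime import datetime

def hourly_mean(records):
    """Average (timestamp, score) pairs per hour bucket."""
    buckets = defaultdict(list)
    for ts, score in records:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour].append(score)
    return {hour: sum(v) / len(v) for hour, v in sorted(buckets.items())}

def scale_to_range(values, lo, hi):
    """Min-max scale values into [lo, hi]; constant input maps to the midpoint."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        return [(lo + hi) / 2 for _ in values]
    return [lo + (v - vmin) * (hi - lo) / (vmax - vmin) for v in values]

# Hypothetical tweet-level compound scores (VADER compound lies in [-1, 1]).
tweets = [
    (datetime(2021, 1, 1, 9, 5), 0.5),
    (datetime(2021, 1, 1, 9, 40), -0.1),
    (datetime(2021, 1, 1, 10, 15), 0.8),
    (datetime(2021, 1, 1, 11, 30), -0.4),
]
hourly = hourly_mean(tweets)

# Hypothetical hourly close prices, used only to pick the target range.
close = [29000.0, 29400.0, 29200.0]
scaled = scale_to_range(list(hourly.values()), min(close), max(close))
```

Scaling to the price range keeps the shape of the sentiment curve while letting both series share one axis.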

Prepared the NFLX data for visual portfolio analysis by calculating relevant metrics such as SMA, EMA, ROC, RSI, and Bollinger Bands
Created dynamic Candlestick and RSI plots for detailed analysis of stock performance via Plotly
Established a buy/sell signal and prepared a list of classification algorithms with performance comparison via k-fold cross-validation
Determined Random Forest to be the best applicable model, achieving a classification accuracy score of 0.91 after tuning with GridSearchCV
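For reference, a minimal plain-Python sketch of three of the indicators named above. The prices are hypothetical, and the RSI here uses simple averages of gains and losses (one common variant; the notebook may use a different smoothing):

```python
def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(sum(values[i + 1 - window:i + 1]) / window)
    return out

def ema(values, span):
    """Exponential moving average with alpha = 2 / (span + 1), seeded with the first value."""
    alpha = 2 / (span + 1)
    out = [values[0]]
    for v in values[1:]:
        out.append(alpha * v + (1 - alpha) * out[-1])
    return out

def rsi(values, period):
    """RSI from simple averages of gains and losses over the last `period` price changes."""
    changes = [b - a for a, b in zip(values, values[1:])][-period:]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        return 100.0  # no losses in the window
    return 100 - 100 / (1 + avg_gain / avg_loss)

closes = [10.0, 11.0, 12.0, 11.0, 13.0]  # hypothetical closing prices

print(sma(closes, 3))
print(ema(closes, 3))
print(rsi(closes, 4))
```

Indicators like these become the feature columns that a classifier is trained on to predict the buy/sell label.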

snippet from Notebook:

Created an EDA and an external dynamic dashboard breaking down the spread of Covid-19 in the U.S. using Python in Jupyter & Atom
Prepared and transformed data read from USAFacts
Optimized OLS regression for 2-week Covid-19 case and death predictions
Engineered features and built custom plots for metric analysis, geo-mapping, and forecasting via Plotly
Launched the external dynamic dashboard on Heroku
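The 2-week prediction idea can be sketched with a closed-form least-squares fit. This is a simplified illustration that assumes a single predictor (day index) and made-up case counts; the project's actual features and fitting code are not shown here:

```python
def fit_ols(xs, ys):
    """Closed-form least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def forecast(intercept, slope, last_day, horizon=14):
    """Extrapolate the fitted line for the next `horizon` days."""
    return [intercept + slope * (last_day + h) for h in range(1, horizon + 1)]

days = [0, 1, 2, 3, 4]
cases = [100.0, 120.0, 140.0, 160.0, 180.0]  # hypothetical cumulative case counts

intercept, slope = fit_ols(days, cases)
preds = forecast(intercept, slope, last_day=days[-1])
```

A straight-line extrapolation is only reasonable over short horizons, which is one reason to limit the forecast to two weeks.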

Heroku App GIF:

Created an EDA notebook showcasing uncommon visualization techniques for the Iris Species dataset using Python & Jupyter
Explored petal & sepal features with customized Scatter, Coordinates and Categorical-Coordinates plots
Selected features and applied K-Nearest Neighbors classification, tuned for best model fit and performance, via scikit-learn
Visualized the KNN algorithm's output and its decision boundary in 2D with a contour plot
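A minimal plain-Python sketch of the KNN idea and of how a 2D decision boundary is obtained by classifying every point on a grid. The notebook itself uses scikit-learn; the sample measurements, labels, and grid values below are hypothetical:

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, point, k=3):
    """Label `point` by majority vote among its k nearest training points."""
    dists = sorted(
        (math.dist(x, point), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Hypothetical (petal length, petal width) samples in cm.
train_x = [(1.4, 0.2), (1.3, 0.2), (1.5, 0.3), (4.5, 1.5), (4.7, 1.4), (4.0, 1.3)]
train_y = ["setosa", "setosa", "setosa", "versicolor", "versicolor", "versicolor"]

print(knn_predict(train_x, train_y, (1.6, 0.25)))
print(knn_predict(train_x, train_y, (4.4, 1.4)))

# Classifying a coarse grid gives the label surface that a contour plot would draw.
grid = [
    [knn_predict(train_x, train_y, (x, y)) for x in (1.0, 3.0, 5.0)]
    for y in (0.2, 1.0, 1.8)
]
```

A finer grid gives a smoother boundary at the cost of more predictions.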

snippet from Notebook:

Created an EDA notebook exploring how Airbnb is affecting neighborhoods of NYC using Python & Jupyter
Collected data from Inside Airbnb and maintained it on Kaggle, achieving Top-20 Most Voted, 1850+ votes, and a Gold medal
Analyzed and visualized the distribution of prices across New York City boroughs & neighborhoods via Seaborn
Cleaned the listing title text and provided insights on trending terminology in listing titles
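One simple way to surface trending terms in listing titles is a word count after lowercasing and dropping stop words. A minimal sketch, assuming a tiny hand-written stop-word list and made-up titles (not the notebook's actual method or data):

```python
import re
from collections import Counter

# Small hypothetical stop-word list, for illustration only.
STOP_WORDS = {"in", "the", "a", "of", "with", "to", "and", "near"}

def top_terms(titles, n=3):
    """Return the n most frequent non-stop-word terms across all titles."""
    counts = Counter()
    for title in titles:
        tokens = re.findall(r"[a-z]+", title.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Hypothetical listing titles.
titles = [
    "Cozy room in Brooklyn",
    "Cozy studio near the park",
    "Spacious Brooklyn loft",
    "Private room with a view",
]

print(top_terms(titles))
```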

snippet from Kaggle: