↑ Click the title for the GitHub repository!
- A summarization tool tailored to CNN articles that contextualizes its responses to follow-up queries on a previously stated topic
- Chunks and tokenizes article data and passes it to OpenAI GPT-3.5 Turbo so the chatbot can answer questions based on newly supplied data
- Skills Used: LLMs (GPT-3.5 Turbo), LangChain, Hugging Face (Embeddings, Bias Classifier)
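
As a rough illustration of the chunking step described above, the sketch below splits article text into overlapping word-based chunks before they would be passed to a model. The function name `chunk_text` and the `chunk_size` and `overlap` values are assumptions made for this example; the project lists LangChain among its tools, so its actual splitter may work differently.

```python
def chunk_text(text, chunk_size=50, overlap=10):
    """Split text into chunks of at most `chunk_size` words.

    Consecutive chunks share `overlap` words so that context is not
    lost at chunk boundaries. (Hypothetical helper, not from the repo.)
    """
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once the current chunk already reaches the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


# Made-up stand-in for an article body: 120 placeholder words.
article = " ".join(f"w{i}" for i in range(120))
for i, chunk in enumerate(chunk_text(article)):
    print(i, len(chunk.split()), "words")
```

The overlap is a design choice: a question whose answer straddles a chunk boundary can still be matched by at least one chunk.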
↑ Click the title for the GitHub repository!
- Built an end-to-end data science project as an ONGB intern, studying the features that correlate most strongly with student
absenteeism in Oakland public schools so the non-profit can create programs that support at-risk students
- Using artificial neural networks, built a Parkinson’s disease diagnosis model with >80% accuracy and compared it against XGBoost
- Skills Used: TensorFlow/Keras Neural Networks, Machine Learning (XGBoost)
↑ Click the title for the GitHub repository!
- Using artificial neural networks, built a Parkinson’s disease diagnosis model with >90% accuracy and compared it against XGBoost
- Includes visualizations and a correlation analysis of the data
- Skills Used: TensorFlow/Keras Deep Learning Neural Networks, Sklearn, XGBoost Classifier
↑ Click the title for the GitHub repository!
- Uses two Kaggle datasets (Glassdoor) with features such as education and experience to predict annual salary
- After data preprocessing, EDA, and visualizations to understand the dataset, tested several regression algorithms to find the best performer
- Two files compare the accuracy of an NLP Count Vectorizer against simple one-hot encoding
- Achieved >95% accuracy using NLP with Count Vectorizer, tuned with GridSearchCV
- Skills Used: NLP (Count Vectorizer), ML (Decision Tree Regressor), Preprocessing, EDA
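
To make the Count Vectorizer versus one-hot comparison above concrete, here is a dependency-free sketch of what each encoding produces for a text column. The job titles are made-up sample data, and `one_hot_encode` and `count_vectorize` are hypothetical helpers written for this example; they are not scikit-learn's `CountVectorizer` or the code in the repository, only a simplified picture of the idea.

```python
from collections import Counter


def one_hot_encode(values):
    """Treat each full string as one category: one column per distinct value."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return categories, rows


def count_vectorize(values):
    """Split each string into lowercase words: one column per distinct word."""
    tokenized = [v.lower().split() for v in values]
    vocabulary = sorted({tok for toks in tokenized for tok in toks})
    rows = []
    for toks in tokenized:
        counts = Counter(toks)
        rows.append([counts[tok] for tok in vocabulary])
    return vocabulary, rows


# Made-up job titles, used only to show the difference in shape.
titles = ["Senior Data Scientist", "Data Analyst", "Senior Data Analyst"]

categories, one_hot_rows = one_hot_encode(titles)
vocabulary, count_rows = count_vectorize(titles)

print(categories, one_hot_rows)
print(vocabulary, count_rows)
```

With one-hot encoding, "Senior Data Analyst" and "Data Analyst" are unrelated columns; with word counts they share the `data` and `analyst` columns, which gives a regressor a chance to learn from the shared words.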
↑ Click the title to see the Deepnote notebook!
- Ran hypothesis tests on the relationship between excess noise and population density, using correlations and histograms
- Using a database of major US cities and their noise pollution, created visualizations to predict excess noise from the other variables in the data
- Skills Used: Machine Learning w/ Sklearn, Hypothesis Testing, EDA
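
One way to test a correlation like the one described above is a permutation test: shuffle one variable many times and see how often a correlation at least as strong appears by chance. The sketch below is an assumption about how such a test could look, not the notebook's actual method; the density and noise numbers are invented sample values, and `pearson_r` and `permutation_p_value` are hypothetical helper names.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def permutation_p_value(xs, ys, n_permutations=2000, seed=0):
    """Two-sided p-value for the null hypothesis of no correlation."""
    rng = random.Random(seed)
    observed = abs(pearson_r(xs, ys))
    shuffled = list(ys)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if abs(pearson_r(xs, shuffled)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


# Invented sample values: population density vs. an excess-noise score.
density = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
noise = [2.1, 2.9, 3.2, 4.8, 5.1, 5.9, 7.2, 7.8, 9.1, 9.9]

print("r =", pearson_r(density, noise))
print("p =", permutation_p_value(density, noise))
```

A small p-value here would mean a correlation this strong rarely arises when the pairing between density and noise is random.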