Job Duties
Student Data Scientist @WW
Jan 2021 - May 2021
Skills: Python (TensorFlow, Keras, Hugging Face Hub, Whisper, DALL-E), SQL, UI Design (Python Streamlit), Deep Learning, NLP, Data Structures
During this internship, I helped develop the beta version of an audio transcription model that was embedded as a feature in the user input section. To improve the accuracy of the food-entity-detection task, I generated 100,000 pseudo-data samples containing text and POS tags, trained an open-source DeBERTa_V3 model on them, and tuned its hyperparameters to reach 92% accuracy. I also used cosine similarity over vector embeddings, combined with a FAISS index, to map detected foods to entries in the database. To give the product team a user-friendly interface, I designed a beta user interface as a Streamlit app and deployed it on the internal system for evaluation. Additionally, I built a data warehouse in PostgreSQL to record beta user input and results, and used Python (pandas) and SQL to evaluate the model's performance on the pseudo-audio input and how that performance varied across foods and sentence structures.
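
As a rough illustration of the embedding-based mapping step described above, the sketch below matches detected food embeddings to database entries by cosine similarity. It is not the internship code: the function names, the food names, and the three-dimensional vectors are hypothetical, and plain NumPy stands in for FAISS so the example runs without extra dependencies. A FAISS inner-product index over L2-normalized vectors would be expected to give the same ranking at larger scale.

```python
import numpy as np

def normalize(vectors):
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    return vectors / np.clip(norms, 1e-12, None)

def map_to_database(query_vecs, db_vecs, db_names):
    """Return the best-matching database name and its cosine score per query."""
    sims = normalize(query_vecs) @ normalize(db_vecs).T  # shape: (n_queries, n_db)
    best = sims.argmax(axis=1)
    return [(db_names[i], float(sims[row, i])) for row, i in enumerate(best)]

# Hypothetical toy embeddings; real ones would come from a text-embedding model.
db_names = ["apple", "banana", "chicken breast"]
db_vecs = np.array([
    [1.0, 0.1, 0.0],
    [0.1, 1.0, 0.0],
    [0.0, 0.1, 1.0],
])
detected = np.array([
    [0.9, 0.2, 0.0],  # embedding of a detected food mention close to "apple"
    [0.0, 0.0, 2.0],  # vector length is ignored; only direction matters
])

matches = map_to_database(detected, db_vecs, db_names)
for name, score in matches:
    print(f"{name}: {score:.3f}")
```

Normalizing both sides first means the match depends only on the direction of the embedding vectors, not on their magnitude.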