
Academic CV

Experience

Jan 2023 - May 2023

Applied Research Intern

Delta Air Lines

Atlanta, GA

I developed supervised learning algorithms, including logistic regression and support vector machines, achieving approximately 80% accuracy. These algorithms were used to train a Google BERT model that classified documents by their relevance to user queries. The system let users ask natural-language questions and receive direct answers along with the most pertinent documents.

Additionally, I conducted quantitative analysis and created data visualizations in Python, using Jupyter Notebook with libraries such as NumPy and Pandas. I also used Tableau dashboards to analyze data and provide recommendations in the e-commerce and Internet entertainment/media space. I anticipate that implementing the proposed policy changes will yield a 20% increase in customer satisfaction scores by the fourth quarter of 2023.

May 2022 - Aug 2022

Business Intelligence Engineer Intern

Amazon

Seattle, WA

I retrieved and filtered a dataset of 120 million data points, comprising 30 attributes of customer behavior and purchase history, from Amazon Go and Fresh "Just Walk Out" technology using SQL. To streamline data processing, I developed an automated ETL pipeline that runs on a weekly schedule, using AWS Redshift, AWS S3, and Athena to extract and store metrics from the Amazon Grocery API.

Additionally, I created two Tableau dashboards that refresh automatically every week, enabling regular monitoring through a Tableau Data Extract pipeline on the cloud server. I anticipate that these data-driven solutions will contribute to a 15% increase in key performance indicators (KPIs) related to customer behavior at Amazon Grocery.

Sep 2021 - Apr 2022

IT Operations Intern

Tesla

Fremont, CA

I analyzed IT asset inventory data for the IT Asset Management team, generating statistical reports and dashboards with SQL. This made the team's reporting approximately 10% more efficient.

Additionally, I supported renewable energy investment ideas. I also built backend data filtering and analytics for an IT finance data pipeline, dashboard, and website using a REST API and Python, specifically Django, Jinja2, and Pyecharts. I produced animated visualizations on the website that display the IT asset inventory.

May 2021 - Aug 2021

Serialization Analyst (Data Science) Intern

Pfizer

Pearl River, NY

I used Aginity Workbench to write SQL queries and retrieve supply chain reports from shipping sites across the enterprise. I also cleaned and transformed datasets for the Global Supply team, handling over 500,000 data entries with 20 attributes.

Additionally, I used the Splunk Machine Learning Toolkit to build machine learning models, including KNN, logistic regression, linear regression, and decision trees. After careful evaluation, I selected a model with 90% accuracy to update our weekly data analysis and predictions. This backend development work is expected to contribute to a 10% increase in package shipping speed by the fourth quarter of 2021.

Education

Aug 2023 - Dec 2024

MA in

Information and Data Science

University of California, Berkeley

In my Master's program, I am developing expertise in machine learning, statistical analysis, and data visualization.

Aug 2018 - May 2023

BA in

Data Science (Business and Industrial Analysis)

University of California, Berkeley

During my undergraduate program, I gained a strong foundation in computer programming, algorithms, and data structures. I also took courses in artificial intelligence and natural language processing.

Professional skillset

Data Analysis

Machine Learning (Jupyter Notebook, Splunk)

Natural Language Processing

Data Visualization (Tableau, Power BI)

Programming Languages

Python (NumPy, Pandas, SciPy)

R

SQL

Java

JavaScript

C++

Languages

English (native)

Mandarin (native)
