Zhiyu Lin

Institution/Organization: Georgetown University

Department: Data Science and Analytics

Academic Status: Graduate Student

What conference theme areas are you interested in:

Adaptive control, optimal control, and estimator design;
Artificial Intelligence (AI) and Machine Learning (ML) for science and engineering;
Applications in science, engineering, and industry;
Computation with discrete structures and graphs;
Data assimilation, challenges in data science, math of AI and ML;
High-order methods, novel discretizations, and scalable solvers;
Inverse problems, optimization, and uncertainty quantification;
Model and dimensionality reduction


I am interested in machine learning, data science, numerical theory, and applying statistical theory to computation. Currently, I am pursuing a master’s degree in Data Science and Analytics at Georgetown University and working as a Data Scientist Intern at Novelis, an aluminum manufacturing company. There, I build feature selection and unsupervised learning methods tailored to IoT sensor data as part of a machine learning pipeline, with the goal of improving machine safety and forecasting potential risks at the factories.

From 2015 to 2019, I studied applied mathematics and statistics at Macalester College. During this time, I was exposed to a variety of data analysis techniques and mathematical models in classes such as Intro to Data Science, Machine Learning with R, and Computational Linear Algebra. I was fascinated by how these theories can be used to tell a story and inform decision making. In the summer of 2018, I completed an internship with a legal-tech startup based in Beijing, China. I worked with the lead data scientist there on an NLP project and built a system that scrapes user comments from e-commerce websites and detects fraudulent listings with random forest and boosting models. Through this internship I not only started working with text data, but also learned how to apply the models I had learned in class to real-world data. In 2019, I finished my graduation capstone, which used survival analysis to predict when and why the U.S. Supreme Court makes and overrules decisions the way it does. I learned data preprocessing, applied statistical tools such as Kaplan-Meier curves to understand the current political environment, and found several striking results. For example, court decisions that are more liberal are twice as likely to be overruled as decisions that lean conservative.

These experiences prompted me to study data science in a master’s program. At Georgetown, I not only took the most challenging electives, such as Optimization and High Dimensional Data Streaming, but also completed two internships last year. At Hindsight Technology, I worked as a Data Scientist Intern with four other graduate students on designing an entity ranking algorithm that works across heterogeneous text data. This project allowed me to apply topic modeling, graphical lasso, DBSCAN clustering, and Word2Vec to streaming user activity data. I also gained substantial experience in feature engineering, building machine learning pipelines to deploy models in production, and scaling performance by incorporating AWS into the data processing routine (one of the largest datasets I have worked with has 170 million records). At the same time, I worked for Georgetown University’s Business Design and Optimization Group as an Analytics Consulting Intern. I was tasked with a variety of business operations optimization projects, such as building interactive maps in R Leaflet, interpreting and editing JavaScript on Google Apps to meet clients’ needs, and creating automated reporting dashboards. Through these experiences, I have built a comprehensive data science skill set that ranges from data engineering to statistical modeling to interpreting results and providing business solutions. I have also gained experience working with a wide range of data and tools.

Non-Work Related Activities/Interests:

I have volunteered for my neighborhood’s pandemic outreach program, where we shop for groceries for people with underlying health conditions to reduce their risk of exposure to the virus. In my free time, I play piano (classically trained for 10 years), draw, and do Pilates.