
Name: Duan Tu
Pronouns: she/her/hers
Institution: University of Illinois Chicago
Department: Mathematics, Statistics, and Computer Science
Biography:
Duan Tu is a 4th-year PhD candidate studying mathematical computer science at the University of Illinois Chicago. Her research interests focus on machine learning theory, and she frequently employs tools from probability theory and combinatorics to study ML problems from a discrete math perspective. Before attending graduate school, she earned her undergraduate degree in mathematics at the University of Florida. Duan has gained experience working in interdisciplinary teams. During college, she collaborated with neuroscientists at the Khoshbouei Lab at the University of Florida, where she created mathematical models to project changes in mouse neurons during aging. In the summer of 2022, Duan interned at AbbVie Pharmaceuticals, where she programmed a graphical user interface (GUI) in MATLAB to model drug concentration in the bloodstream. She later deployed the GUI to an internal server, and the application now serves approximately 500 scientists in the company’s discovery department. Despite her limited experience in sustainability-related projects, Duan is deeply passionate about the field and is eager to explore new opportunities and make meaningful connections through SRP.
Academic Status: PhD Student
Year in program: 4th
Research Area/Department: Applied Mathematics; Computer Science; Data Science; Machine Learning/AI; Mathematics
Other, specify:
Major/Specialty: Mathematical Computer Science
Degrees Earned or in Progress: B.S. in Mathematics, May 2020; M.S. in Mathematics, December 2022; Ph.D. in Mathematics, expected May 2026
What courses or academic preparation have you completed to prepare for a summer internship experience?
I have taken multiple graduate-level courses in data science and machine learning, including Theory of Machine Learning, Theory of Data Science, Combinatorial Optimization, High Dimensional Probability, and Statistical Theory. I am proficient in MATLAB and Python.
Have you published any research or worked on research/technical projects? Yes
Where has your research been published or where have you conducted research/technical projects? (1) Shaerzadeh, Fatemeh et al. “Microglia senescence occurs in both substantia nigra and ventral tegmental area.” Glia vol. 68,11 (2020): 2228-2245. doi:10.1002/glia.23834 This was joint work with the Khoshbouei Lab at the University of Florida, where I finished my undergraduate degree. I conducted statistical modeling for the neuroscientists to study the aging process of the mouse brain. My advisor for this project was Dr. Maia Martcheva at the University of Florida Department of Mathematics. (2) My Ph.D. advisor is Dr. Lev Reyzin at the University of Illinois Chicago. My current project studies the sample complexity of the k-wise statistical learning model. I use techniques from adaptive data analysis and differential privacy to analyze sample complexity tradeoffs between different methods of reusing data.
Please describe your research/academic interests:
My research focuses on the theoretical analysis of computer science problems. In particular, I am interested in the theory of machine learning: which types of problems can be learned, how many samples are needed to learn them, and how accurately an algorithm can learn. Coming from an applied math background, I mostly rely on techniques from probability theory, combinatorics, and discrete math in general.
Computational and Data Science Areas:
Applied Mathematics; Computational Science Applications, i.e., Bioscience, Cosmology, Chemistry, Environmental Science, Nanotechnology, Climate, etc.; Computer Science; Machine Learning and AI
Research Synergy:
I believe the world can be understood from various perspectives. For instance, when observing the first snowfall of the year, a physicist may explain it as a phase transition of moisture in the atmosphere, an economist may analyze its impact on consumer behavior, and a psychologist may study how it affects human emotions. While all these perspectives are valuable, I tend to be more convinced by those backed by solid quantitative support. To me, mathematical computer science provides a reliable theoretical basis for many scientific and engineering problems crucial to our society. I chose the field of mathematical computer science because I find theoretical work fulfilling, especially when it intersects with practical solutions to real-world problems. Many projects at the DOE labs benefit significantly from machine learning and data science tools. For example, machine learning methods can accelerate computations, extract insights from massive datasets, and facilitate data visualization. Given my theoretical background, I feel confident in researching novel mathematical and computational approaches for DOE lab projects. My previous internships and academic projects have also equipped me with the skills necessary to implement mathematical tools for solving real-world problems and to collaborate effectively within a team.
Motivation:
I am passionate about sustainability and have long sought opportunities to apply my technical skills in this field. Nowadays, many innovations in sustainability technology incorporate mathematical computer science methods. For instance, the optimization of smart grids heavily relies on combinatorics and network learning. While I am equipped with technical skills, I lack a clear vision of how to apply them to concrete sustainability challenges. The Sustainable Research Pathways program offers an ideal platform for me to gain valuable insights into the field and connect with like-minded scientists who share my aspirations.
Lightning Talk Title: Machine Learning Theory