Name: Qiulan Huang
Pronouns: she/her/hers
Biography:
I hold a Diploma in Computer Science and earned a PhD in Applied Computer Technology, specializing in large-scale mass data storage systems, from the University of Chinese Academy of Sciences in 2014. Throughout my career, I have served as a senior researcher and principal architect in scientific computing systems at the Computing Center, where I have contributed to experiments in high energy physics, neutrino physics, and photon sources. Prior to joining BNL, I spent more than 12 years at the Institute of High Energy Physics (IHEP), Chinese Academy of Sciences. During my tenure, I held the position of associate professor and served as Principal Investigator (PI) for two Chinese NSF research grants. My experience encompasses research and development, as well as administration, of various distributed computing and storage systems. Notably, I served as the site administrator for the CMS experiment at the Beijing Tier-2 facility. My commitment to international collaboration is reflected in my work at CERN, the University of Catania (Italy), and Fermilab, where I established valuable international connections.
Institution/Lab: Brookhaven National Laboratory
Website: https://www.bnl.gov/staff/qhuang
SRP Collaboration Topic/Title: Data Popularity and Data Placement Optimization for Big Data Analysis
Field or research area: Large-scale storage systems, storage optimization, and data analytics
Please select all the topical areas that apply to your project:
Data Science (i.e., data analytics, data management & storage systems, visualization); Machine Learning and AI
Brief Abstract:
Scientific experiments and computations, especially those in nuclear physics (NP) and high energy physics (HEP) programs, are generating and accumulating data at an unprecedented rate. Big data provides opportunities for new scientific discoveries. Nevertheless, for the Scientific Data and Computing Center, managing this vast amount of data cost-effectively while enabling efficient data analysis in a large-scale, multi-tiered storage architecture is a real challenge. This topic explores and develops techniques for understanding data popularity and optimizing data access. By studying data access patterns and applying data analytics, we can identify frequently accessed datasets, prioritize their availability, and subsequently design a policy engine to improve resource allocation for analytical tasks. This research is important in today’s data-driven world because it has the potential to significantly improve data analysis efficiency, enhance decision-making, and fuel innovation across a diverse range of domains.
Desired relevant skills, background, or interests:
High-performance storage systems (dCache, Lustre), big data analytics, AI/predictive modeling, monitoring tools such as ELK, Python, C/C++, Java, Linux
Other comments:
Do any special requirements apply? Minimum GPA (specify what GPA in comments below); International OK
Other, specify:
Keywords:
Data popularity; AI/machine learning; Data storage; Data analytics
Lightning Talk Title: AI/ML for Storage Optimization