Johannes Blaschke

Name: Johannes Blaschke
Pronouns: he/him/his

Biography:
Johannes Blaschke has worked in the Data Science Engagement group at NERSC since 2019, leading the NESAP for Data project and the workflow readiness activities for the next-generation NERSC systems. NESAP for Data addresses data-intensive pipelines that process massive datasets from experimental and observational science facilities such as synchrotron light sources, telescopes, microscopes, particle accelerators, and genome sequencers. His work includes researching new high-productivity HPC programming models and techniques, developing cross-facility workflows, and applying novel simulation techniques to a broad range of complex systems.

Institution/Lab: Lawrence Berkeley National Laboratory
Website: https://www.nersc.gov/about/nersc-staff/data-science-engagement-group/johannes-blaschke/

SRP Collaboration Topic/Title: Workflow enablement on HPC

Field or research area: Computing

Please select all the topical areas that apply to your project:
Computer Science (i.e., architectures, compilers/languages, networks, workflow/edge, experiment automation, containers, neuromorphic computing, programming models, operating systems, sustainable software); Data Science (i.e., data analytics, data management & storage systems, visualization); High-Performance Computing

Brief Abstract:
This project explores various workflow technologies and how they might be used on a supercomputer at NERSC. Future supercomputers at NERSC will offer a broad range of capabilities. Examples relevant to scientific workflows include: simulation performance; data movement and management; application of machine learning; organization and composition of complex computational tasks; and interaction with external resources beyond the NERSC data center (e.g., edge, cloud, and cross-facility). We will develop benchmarks and test cases for a range of workflow technologies, depending on the applicant’s field of study and personal interest. Some example focus areas are:
1. Exploring use cases for workflow managers, e.g., FireWorks, Snakemake, Parsl
2. Exploring non-MPI communication libraries, e.g., Distributed.jl, Legion
3. Exploring new storage technologies and tools

Desired relevant skills, background, or interests:
Experience with:
1. Linux / Git
2. Compiling software (with Make and CMake) and managing shell environments
3. Basic programming in C, Rust, Julia, and/or Python (a must)
Depending on the focus area, applicants should have a strong interest in scientific software workflows for HPC systems.

Other comments:

Do any special requirements apply? Permanent Resident OK; International OK
Other, specify:

Keywords:
workflows; advanced scheduling; workflow managers; storage technologies; high-performance computing; Julia language; Python; REST

Lightning Talk Title: Enabling Advanced Workflows on a Supercomputer