Open Source Data Collection and Visualization

Friday, March 3, 2017

11:20 am – 1:00 pm

Location: Room 207

**Preregistration is REQUIRED for this event**

As computational systems grow, so does the desire to increase their efficiency.  Gone are the days of simply adding a few computers to a pool of resources and doing science; these pools now consume multiple megawatts of energy, require their own clusters for storage, and occupy a large physical footprint.  What were once small savings in cooling or power can now add up to megawatts of savings.  This growth demands changes to data collection and monitoring, and NERSC has addressed the issue with an array of open source projects that together create a robust and scalable data collection setup.  In this tutorial we will discuss the collection setup, scaling issues, and early visualization methods that allow us to view millions of data points in a few graphs.  Participants will also install and run the setup against a subset of data, creating their own visualizations of this data.  Upon leaving the tutorial, participants will be able to set up a small data collection system and understand methods to help visualize and scale it.

Bring your laptop to this session.

Presented by:

Cary Whitney – NERSC

Cary Whitney is a computer scientist at NERSC who has always been involved with data gathering and presentation. After coming to LBL in 1999, he was on a team that managed the High Energy Physics cluster (PDSF) and eventually moved on to the High Performance Computing (HPC) system. In both groups, he monitored the systems and collected data.  While reinventing the wheel yet again, he figured there had to be a better way to manage all the data being collected.  He transitioned to the Operations Technology Group, learned that this group was also interested in collecting environmental data from the computational facility, and figured he could build one big data collection infrastructure instead of two separate ones.  This is how the data collection project at NERSC was born.

Elizabeth Bautista – NERSC

Elizabeth Bautista is Group Lead for the Operations Technology Group, the team that ensures the management and operation of the National Energy Research Scientific Computing Center (NERSC), the primary computation facility for the Department of Energy Office of Science. The group ensures continuous resource availability for users and is the central location for monitoring, problem reporting, triage and resolution, data collection, and emergency response. They also serve as the Network Operations Center for the Energy Sciences Network. Since joining NERSC in 1999, she has worked in the Computer Operations & Network Support Group, in Workstation Support as a team lead, and in the Computational Systems Group as a member of the PDSF team.  Having worked in both center operations and system administration, she brings a unique perspective as the lead in executing tasks, developing employees and energizing change. As a member of the Lab’s Diversity and Inclusion Council and the Computing Science Diversity, she coordinates with Human Resources to develop recruitment and retention programs that meet the strategic mission for diversity. As such, she has been involved with Broader Engagement at SC, the Grace Hopper Conference and the Richard Tapia Celebration of Diversity in Computing to broaden participation for women and underrepresented minorities in STEM. Elizabeth manages the student internship program at NERSC. She continues to explore programs that broaden the student pool and create a pipeline for recruitment and workforce development. She has a B.S. in Computer Information Systems and an MBA in Information Technology Management, both from Golden Gate University in San Francisco.
