The Middleware and Grid Interagency Coordination Team (MAGIC), a public-private team under NITRD’s Large Scale Networking Interagency Working Group, meets monthly to discuss current and future technologies and collaboration activities in distributed computing. See https://www.nitrd.gov/coordination-areas/lsn/magic/ for more information on MAGIC.
MAGIC is composed of experts from academia, industry, and government and is led by two federal co-chairs: Richard Carlson, Program Director, Advanced Scientific Computing Research, Office of Science, Department of Energy; and Vipin Chaudhary, Program Director, Office of Advanced Cyberinfrastructure, National Science Foundation.
MAGIC is currently conducting a series of public meetings to examine different aspects and interconnections of the scientific data life cycle (e.g., gathering, triaging, analyzing, archiving, and reusing data). This multi-session series is examining issues such as provenance, verification and quality assurance, and the tools and services involved in managing complex scientific infrastructure. The issues involved in the data life cycle cut across scientific domains and need to be addressed to enable cutting-edge R&D efforts.
On February 6, 2019, MAGIC began the series with talks that presented a general overview of the issues and challenges involved in the scientific data life cycle.
- Deduce: Distributed Dynamic Data Analytics infrastructure for Collaborative Environments – Deb Agarwal (PI), Department Head and Senior Scientist, and Lavanya Ramakrishnan, Staff Scientist, Lawrence Berkeley National Laboratory.
- Scaling Data-Driven Scientific Discovery – A Data Lifecycle View – Suhas Somnath, Computer Scientist, Advanced Data and Workflows Group, National Center for Computational Sciences, Oak Ridge National Laboratory.
- Addressing the Data Challenges at the Advanced Light Source – Alexander Hexemer, Senior Staff Scientist, Program Lead for Computing, Advanced Light Source, Lawrence Berkeley National Laboratory.
On March 6, 2019, MAGIC explored scientific use cases, as described below.
- Multi-Messenger Astronomy and the Discovery of a Neutron Star-Neutron Star Merger – Peter Nugent, Senior Staff Scientist and Department Head for Computational Science, Lawrence Berkeley National Laboratory.
- Using Computation and New Sources of Data to Understand Cities – Charlie Catlett, Senior Computer Scientist, Argonne National Laboratory and Senior Fellow at the Mansueto Institute for Urban Innovation and Harris School of Public Policy, University of Chicago.
- Scientific Data Lifecycle: Perspectives from an LHC Physicist – Shawn McKee, Research Scientist, University of Michigan Physics Department, Director of the ATLAS Great Lakes Tier-2 Center, Director of the Center for Network and Storage-Enabled Collaborative Computational Science.
On April 3, 2019, MAGIC hosted presentations on data use/re-use and data provenance and integrity.
- Continuous Learning About Data: Experience from the Dark Energy Survey and NCSA – Margaret Johnson, Assistant Director, National Center for Supercomputing Applications (NCSA), and Don Petravick, Senior Project Manager, NCSA.
- Empowering Data-driven Discovery with a Lightweight Provenance Service for High Performance Computing – Yong Chen, Associate Professor, Computer Science, Director, Data-Intensive Scalable Computing Lab, Texas Tech University, Site Director, NSF Cloud and Autonomic Computing.
On May 1, 2019, MAGIC focused on the data triage process.
- Ilkay Altintas, Chief Data Science Officer, San Diego Supercomputer Center, Division Director, Cyberinfrastructure Research, Education, and Development.
- A Data Ecosystem to Support Machine Learning in Materials Science – Ben Blaiszik, Research Scientist at University of Chicago, Globus and Argonne National Laboratory, Data Science and Learning Division.
- Simplifying data management through storage system design – Glenn Lockwood, HPC Performance Engineer, Advanced Technologies Group, National Energy Research Scientific Computing Center (NERSC) – large-scale data storage.
On June 5, 2019, MAGIC delved into tools used in the data analysis phase of the data life cycle.
- Open OnDemand Overview – Alan Chalker, Director of Strategic Programs, Ohio Supercomputer Center.
- Jupyter – An Interactive Platform for Scientific Computing and Data Analysis – Shreyas Cholia, Group Leader, Usable Software Systems, Lawrence Berkeley National Laboratory.
On July 3, 2019, MAGIC further explored data triage and analysis, including the organizational challenges of data stewardship and preservation.
- Organizational Challenges to Promoting Data Sharing, Stewardship and Preservation – Francine Berman, Hamilton Distinguished Professor of Computer Science, RPI, Co-founder, Research Data Alliance.
- Online Data Analysis of Molecular Dynamics Simulations for Exascale Computing Platform – Hubertus van Dam, Application Architect, Brookhaven National Laboratory.
MAGIC is currently exploring connections with other NITRD Interagency Working Groups to discuss relevant activities in the lifecycle of data, from its creation to archival storage.
MAGIC holds public meetings on the first Wednesday of each month (12-2 pm ET). If you would like more information on MAGIC’s activities, please contact NCO at nco@nitrd.gov.