Big Data and the Changing Landscape of Science

Alex Szalay, Bloomberg Distinguished Professor, Department of Computer Science, Department of Physics and Astronomy; Director, Institute for Data Intensive Science, Johns Hopkins University

The talk will present the impact of emerging big data on science. Data-driven discoveries are becoming increasingly relevant in all areas of science. Having data at our fingertips is changing the nature of scientific collaborations. Extremely large instruments, and the collaborations operating them, are generating tens of petabytes of data, and their archives are typically becoming an open resource for wide communities, enabling individuals or even members of the public to analyze them. Beyond these very large projects, often costing hundreds of millions of dollars, there is a new trend emerging: mid-scale experiments centered on a unique instrument, such as a novel electron microscope, a new telescope, or a new genomic sequencer. These mid-scale collaborations are typically run by relatively young PIs with a razor-sharp focus on their science goals, and probably represent the “sweet spot” for science today. Even these mid-scale instruments are capable of generating several petabytes of data, but they cannot afford to build their own separate, vertical computing infrastructure, and thus represent a new challenge for the nation’s cyberinfrastructure. Given their scale of $10-$50M, even these experiments are too expensive to be easily replicated, and their data is here to stay for decades. The preservation of these high-value datasets with their long lifetimes creates another new challenge for the US science enterprise. The talk will touch upon the different aspects of these complex and dynamically evolving challenges.

Our Human Planet: Why Population Data Are Vital for Science and Sustainability

Robert S. Chen, Director, Center for International Earth Science Information Network, Columbia University

We all have a vested interest in the sustainability of the Earth and the ability of science to improve life for all. With world population approaching 8 billion, understanding both human drivers of environmental change around the globe and the implications of climate change and other challenges on people—from local to global scales and from the short to long term—is essential. Emerging data science approaches, integrating a diversity of georeferenced natural, social, health, and engineering data, offer new ways to discover and analyze complex human-environment interactions and improve modeling and prediction. These efforts are not only important for science, but also for real-world applications in resource management, disaster risk reduction, mitigation and adaptation to climate change, and many other areas of sustainable development. But the proliferation of data and methods, and their application beyond traditional scientific research, also pose challenges with regard to data fitness for use, accessibility, transparency, completeness, validation, responsibility, usability, and trust. New approaches—such as the formation of “data collaboratives” involving data producers, users, funders, and other stakeholders and the development of metrics for trustworthiness of data—are needed.

Physics Guided Machine Learning: A New Framework for Accelerating Scientific Discovery

Vipin Kumar, Regents Professor and William Norris Endowed Chair, Computer Science and Engineering, University of Minnesota

Physics-based models of dynamical systems are often used to study engineering and environmental systems. Despite their extensive use, these models have several well-known limitations due to incomplete or inaccurate representations of the physical processes being modeled. Given rapid data growth due to advances in sensor technologies, there is a tremendous opportunity to systematically advance modeling in these domains by using machine learning (ML) methods. However, capturing this opportunity is contingent on a paradigm shift in data-intensive scientific discovery, since the “black box” use of ML often leads to serious false discoveries in scientific applications. Because the hypothesis space of scientific applications is often complex and exponentially large, an uninformed data-driven search can easily select a highly complex model that is neither generalizable nor physically interpretable, resulting in the discovery of spurious relationships, predictors, and patterns. This problem becomes worse when there is a scarcity of labeled samples, which is quite common in science and engineering domains. This talk makes a case that in real-world systems that are governed by physical processes, there is an opportunity to take advantage of fundamental physical principles to inform the search for a physically meaningful and accurate ML model. Even though this will be illustrated for a few problems in the domain of aquatic sciences and hydrology, the paradigm has the potential to greatly advance the pace of discovery in a number of scientific and engineering disciplines where physics-based models are used, e.g., power engineering, climate science, weather forecasting, materials science, and biomedicine.

The Nexus of Data Science and Materials Processing - A New Vista for MSE

Diran Apelian, Distinguished Professor of Materials Science and Engineering; Director, Advanced Casting Research Center (ACRC), University of California, Irvine; Provost Emeritus, Worcester Polytechnic Institute

The 21st century is the era of the 4th Industrial Revolution, which has been termed “Industry 4.0 and the Future of Work”. However, it really is not just about the future of work, but also about the worker. Accordingly, we have trademarked Industry 4.2™ to focus on the Future of Work and the Worker. The revolution in data science, together with the breakthroughs we are enjoying in high-performance computing, has enabled us to make advances in materials processing that were not imagined a decade or two ago. We are entering a new period in materials science and engineering (MSE) that will transform manufacturing as we know it, from processing, to alloy development, to materials recovery and recycling. This presentation covers the opportunities and challenges for MSE during the 4th Industrial Revolution, along with a few case studies at the nexus of data science and engineering. Lastly, the educational and curricular needs to prepare students for the 4th Industrial Revolution will also be introduced.

Creating an Organized, Searchable Data Repository On Next-generation Materials

L. Catherine Brinson, Sharon C. and Harold L. Yoh, III Distinguished Professor and Chair, Mechanical Engineering and Materials Science Department, Duke University

In this presentation, we explore the science of building databases on polymer nanocomposites and structural metamaterials to accelerate the design and discovery of new materials by using machine learning to understand the fundamental properties of materials. Because of the complex mechanisms involved in nanocomposite formation and response, and the isolation of data sets from each other, both the fundamental understanding and the discovery of new nanocomposites are Edisonian and excruciatingly slow. We address this issue by creating a living, open-source data resource for nanocomposites.

Now You See Me, Now You Don’t: Understanding Chemical Tricks That Impact Environmental Regulations

Diana Aga, Henry M. Woodburn Chair Professor of Chemistry, University at Buffalo

This presentation will provide a brief overview of: (1) how problematic chemicals that have been previously banned are modified, but end up presenting similar or even worse environmental hazards (e.g., new perfluoroalkyl substances that have been labelled “regrettable substitutes”), (2) how some environmental pollutants escape analytical detection in the environment, potentially misleading environmental regulations (e.g., chiral compounds, highly water-soluble compounds), and (3) examples of environmental pollutants that have no commercial or domestic use, but have been introduced intentionally or accidentally into the environment, causing deleterious effects. Examples of the latter are dioxins, which are highly toxic; these contaminants are formed during the production of some organic compounds, including a few herbicides such as Silvex™. Many investigations on the fate, transport, and effects of environmental pollutants have paid little attention to the identification of transformation products formed during biodegradation and treatment. However, for a complete risk assessment, it is important to include the appearance of persistent transformation products because these compounds may have their own biological activity that can contribute to the ecological and health effects of these chemical pollutants. This presentation will discuss examples to demonstrate that the disappearance of the parent compounds does not necessarily mean that the contaminants have been “treated” and rendered non-toxic.

Navigating Disinformation and the Mobile Truth in Science Communication: A Cervantine Approach

David Castillo, Professor of Spanish, Department of Romance Languages; Director, Humanities Institute, University at Buffalo

This talk will discuss the problem of the “mobile truth” in our market society, the disinformation challenge (including in the context of climate change science communication), and our resistance to facts in our deeply siloed media environment. The presentation will discuss how the inventor of modern fiction (Miguel de Cervantes, 1547-1616) envisioned his own literary craft as a series of experiments in the arts of illusion and made it his mission as an illusion professional to educate people on the mechanics of deception, especially mass deception. His literary “dis-illusions” or “un-deceptions” (desengaños in the original Spanish) have as much currency today as they did in his own age of inflationary media.

Bridging the Climate Chasm: Culture as Strategy in the Climate Crisis

John Fiege, Assistant Professor, Department of Media Study, University at Buffalo

How can we bridge the chasm between scientific consensus and societal action with regard to the climate crisis? As a filmmaker, photographer, and soon-to-be podcaster, I have spent my career contemplating this question and searching for images and stories that engage a wide range of audiences in the conversation about ecological sustainability. In this talk, I will discuss several of my projects (particularly my new environmental podcast, Chrysalis) and why I see culture as an essential, but underutilized, strategy in confronting the climate crisis.

Bridging the Climate Chasm: The Aesthetics of Accumulation and Waste

Matt Kenyon, Associate Professor, Department of Art, University at Buffalo

My studio, S.W.A.M.P. (Studies of Work Atmospheres and Mass Production), focuses on critical themes addressing the effects of global corporate operations, mass media and communication, military-industrial complexes, and general meditations on the liminal area between life and artificial life. I like to work with media and technology because, on the one hand, I am familiar with them and like the power they hold, and, on the other hand, I want to use art and design to critically examine and disrupt that power. My interdisciplinary art practice combines materials research, biological art, chemistry and phenomena, performance art, and physical computing.