Data analysis and data processing are increasingly prevalent in everyday life, with applications in research, industry, commerce, and public administration. However, they also have a significant environmental impact, both direct and indirect.
According to a report by the International Energy Agency (IEA), the collection, storage, processing, and analysis of data account for an estimated 1 to 1.5 percent of global energy consumption. The same report indicates that data centers and data transmission networks are responsible for approximately 1 percent of greenhouse gas emissions related to electricity production and consumption, contributing to global warming. The use of non-renewable resources, such as fossil fuels and rare metals, in hardware production should not be overlooked either. Data centers also consume substantial amounts of water, because they must be kept at a constant temperature and humidity to operate optimally. Many of them use liquid cooling systems, which allow water to be recycled but also increase electricity consumption. Finally, while the continuous evolution of data science technologies drives progress, it also leads to rapid obsolescence of hardware and software. This inevitably generates electronic waste, which may contain toxic additives and hazardous substances and becomes a danger if it is not disposed of properly. According to the United Nations (UN), the world generated 53.6 million metric tonnes (Mt) of electronic waste in 2019, and estimates predict that this figure will rise to 74.7 Mt by 2030.
In this context, the work of Graph Massivizer is of utmost importance. Focusing on the four areas investigated in the project (“Sustainable Green Finance,” “Global Environment Protection Foresight,” “Green Artificial Intelligence for the Sustainable Automotive Industry,” and “Data Centre Digital Twin for Exascale Computing”), Graph Massivizer aims to improve data analysis efficiency by 70 percent and reduce the energy impact of extract-transform-load operations on data by 30 percent. Moreover, it is expected to double data center energy efficiency and reduce greenhouse gas emissions associated with operations on graph-organized databases by over 25 percent. The “Data Centre Digital Twin” use case, involving CINECA and the Alma Mater Studiorum University of Bologna, is crucial. It revolves around creating a virtual representation, in the form of a graph, of LEONARDO, the world’s fourth-fastest supercomputer. This representation is fundamental for studying and understanding how the machine operates, because it gives a clear and concise picture of the relationships within a structure as complex as a data center. The study and analysis of these relationships lay the foundation for optimizing the efficiency and sustainability of the next generation of supercomputers, known as exascale supercomputers.
While data science and data processing carry an environmental cost of their own, they also have the potential to be indispensable tools for environmental sustainability and for the fight against climate change. The Graph Massivizer project serves as an example of how technology can drive progress, helping to reverse the course of climate change and to reduce the environmental impact of groundbreaking discoveries.
CINECA, December 2023