Which Is Cleaner For The Environment: Training An AI Model Or Five Cars? - Alternative View

The field of artificial intelligence is often compared to the oil industry: once extracted and refined, data, like oil, can become a highly profitable commodity. It is now becoming clear that the metaphor goes further. Like fossil fuels, deep learning has a substantial impact on the environment. In a new study, scientists at the University of Massachusetts Amherst assessed the training lifecycle of several common large AI models.

They found that this process can generate over 626,000 pounds (about 284,000 kg) of carbon dioxide equivalent, nearly five times the lifetime emissions of an average car (including the manufacture of the car itself).

How AI models are trained

This is a stunning quantification of what AI researchers have long suspected.

Natural Language Processing Carbon Footprint

The paper specifically addresses the process of training a model for natural language processing (NLP), the subfield of AI that teaches machines to work with human language. Over the past two years, the NLP community has reached several important milestones in machine translation, sentence completion, and other standard benchmark tasks. OpenAI's notorious GPT-2 model, for example, proved capable of writing convincing fake news stories.

But such advances have required training ever larger models on sprawling datasets of sentences scraped from the Internet. This approach is computationally expensive and highly energy intensive.

The researchers looked at four models in the field responsible for the biggest leaps in performance: the Transformer, ELMo, BERT, and GPT-2. They trained each of them on a single GPU for a day to measure its power consumption.

They then took the total number of training hours reported in each model's original paper to estimate the energy consumed over the entire training process. That amount was converted into pounds of carbon dioxide equivalent, based on an energy mix consistent with that of Amazon's AWS, the world's largest cloud provider.
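The arithmetic behind this kind of estimate is straightforward. The Python sketch below illustrates the general approach; every number in it (power draw, overhead factor, training hours, carbon intensity) is a placeholder assumption for illustration, not a figure from the study.

```python
# A minimal sketch of the estimation approach described above.
# All numbers here are illustrative placeholders, not figures from the paper.

# Step 1: measure average power draw (watts) while training the model
# on a single GPU for a day.
measured_power_watts = 250.0          # hypothetical GPU power draw
overhead_factor = 1.58                # hypothetical CPU/DRAM/cooling overhead

# Step 2: take the total training time reported in the model's original paper.
reported_training_hours = 72.0        # hypothetical value

# Step 3: total energy in kilowatt-hours.
energy_kwh = (measured_power_watts * overhead_factor
              * reported_training_hours) / 1000.0

# Step 4: convert energy to CO2-equivalent emissions using an average
# carbon intensity (kg CO2e per kWh). The value below is an assumption
# standing in for the cloud-provider energy mix used in the study.
carbon_intensity_kg_per_kwh = 0.43

co2e_kg = energy_kwh * carbon_intensity_kg_per_kwh
co2e_lbs = co2e_kg * 2.20462

print(f"Estimated energy: {energy_kwh:.1f} kWh")
print(f"Estimated footprint: {co2e_kg:.1f} kg CO2e ({co2e_lbs:.1f} lbs)")
```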

They found that the computational and environmental costs of training grew in proportion to model size, and then exploded when additional tuning was applied to squeeze out the final gains in accuracy. In particular, neural architecture search, which attempts to optimize a model by incrementally changing the neural network's structure through trial and error, incurs extremely high costs for little performance gain. Without it, the most expensive model, BERT, had a carbon footprint of about 1,400 pounds (635 kg), close to a round-trip flight across America.
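To see why architecture search multiplies the cost, consider a minimal trial-and-error loop like the toy sketch below. The candidate space, scoring function, and trial count are invented for illustration; the point is simply that every candidate architecture requires its own full training run, so the total energy bill scales with the number of trials even when the best score improves only marginally.

```python
import random

# Toy illustration of trial-and-error architecture search. In practice,
# train_and_evaluate() would be a complete, expensive training run.

def train_and_evaluate(num_layers: int, hidden_size: int) -> float:
    """Stand-in for a full training run; returns a fake validation score."""
    return 0.80 + 0.01 * num_layers - 0.000001 * (hidden_size - 512) ** 2

random.seed(0)
best_score, best_arch = float("-inf"), None
num_trials = 20  # each trial corresponds to one full training run

for _ in range(num_trials):
    arch = (random.randint(2, 12), random.choice([256, 512, 768, 1024]))
    score = train_and_evaluate(*arch)
    if score > best_score:
        best_score, best_arch = score, arch

print(f"Trained {num_trials} candidate models")
print(f"Best architecture: layers={best_arch[0]}, "
      f"hidden={best_arch[1]}, score={best_score:.3f}")
```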

Moreover, these figures should only be considered as baselines.

In all, the scientists estimate that the process of building and testing a final, publication-worthy model required training 4,789 models over six months. In terms of CO2 equivalent, that is about 35,000 kg.

The significance of these numbers is enormous, especially given current trends in AI research. Broadly speaking, AI research neglects efficiency: large neural networks have proved useful across a wide range of tasks, and companies with effectively unlimited computing resources will use them to gain a competitive advantage.

Ilya Khel