Scientists Use Less Than 1% Of The Data Collected By The Large Hadron Collider - Alternative View


It is easy to understand that obtaining scientific data can be a problem. It turns out that storing and processing it can be a problem too.

All of the high-profile discoveries made with the collider were based on the analysis of less than one percent of the data it generated.

The rest of the data is lost irretrievably.

The 26.7-kilometer collider tunnel is used to accelerate particles to nearly the speed of light. Two beams of particles moving in opposite directions collide at points monitored by sensitive sensors. Even at the lowest beam density, with bunches of about 120 billion protons each, there are 30 million collisions per second.

According to the website of CERN, the European Organization for Nuclear Research, one billion collisions per second produce a data stream of 1 petabyte per second. This is currently the biggest problem: a stream that fast is simply impossible to store, let alone process properly. “At a minimum of 30 million collisions, we need 2,000 petabytes to store the results of a typical 12-hour collider run. At 150 collider runs per year, it would take 400,000 petabytes, or 400 exabytes, to store all the data, a huge volume that we cannot store at present,” says Andreas Hoecker, a scientist at CERN.
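The figures in the quote can be checked with simple arithmetic. The sketch below is only an illustration. It uses the two numbers from the quote (2,000 petabytes per 12-hour run, 150 runs per year) and also derives the size per collision implied by the figure of 1 petabyte per second at one billion collisions per second. The function and constant names are made up for this example.

```python
# Back-of-the-envelope check of the storage figures quoted above.
# All inputs are the article's round numbers, not official CERN parameters.

PB_PER_RUN = 2_000      # petabytes for one 12-hour run (from the quote)
RUNS_PER_YEAR = 150     # runs per year (from the quote)
PB_PER_EB = 1_000       # 1 exabyte = 1,000 petabytes (decimal units)


def yearly_storage_pb(pb_per_run: int, runs_per_year: int) -> int:
    """Petabytes needed to keep every run of a year in full."""
    return pb_per_run * runs_per_year


def bytes_per_collision(stream_bytes_per_s: float, collisions_per_s: float) -> float:
    """Average data size per collision implied by a stream rate."""
    return stream_bytes_per_s / collisions_per_s


total_pb = yearly_storage_pb(PB_PER_RUN, RUNS_PER_YEAR)
print(f"{total_pb:,} PB per year = {total_pb / PB_PER_EB:,.0f} EB")

# 1 PB/s at one billion collisions per second implies about 1 MB per collision.
size = bytes_per_collision(1e15, 1e9)
print(f"{size / 1e6:.0f} MB per collision")
```

With exactly 150 runs the product is 300,000 petabytes, so the 400,000 petabytes in the quote presumably reflects rounding or a somewhat larger number of runs (about 200 at 2,000 petabytes each). Either way, the total is in the hundreds of exabytes.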

The solution to the problem of too much data is, of course, a drastic reduction in its volume. This is not achieved with compression algorithms: the combined power of all the processors in existing supercomputers would not be enough for that. The computing equipment available at CERN can save the results of only 1,200 collisions out of every 30 million. This is 0.004 percent of the total volume, and the remaining 99.996 percent, as mentioned above, is lost forever.
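The percentages above follow directly from the two counts. A minimal sketch of the calculation, using only the article's numbers (the function name is made up for this example):

```python
def kept_fraction(saved: int, total: int) -> float:
    """Fraction of collisions whose results are stored."""
    if total <= 0:
        raise ValueError("total must be positive")
    return saved / total


fraction = kept_fraction(1_200, 30_000_000)
print(f"kept: {fraction * 100:.3f}%")        # share that is stored
print(f"lost: {(1 - fraction) * 100:.3f}%")  # share that is discarded
```

Running it reproduces the 0.004 percent and 99.996 percent given in the text.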


This state of affairs seems like a terrible waste, but it is not as bad as it sounds. The phenomena that are of real interest to scientists do not occur at anything like this rate. For example, a Higgs boson appears about once per second, while other events of interest occur tens or hundreds of times per second. To pick out the most interesting events from the entire data stream, special "triggers" are used: devices that pre-filter the data, mainly at the hardware level. These triggers are developed for each specific case and tuned to the properties of the particles being sought, such as the Higgs boson, the top quark, and the W and Z bosons.
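The idea of a trigger can be illustrated with a toy software filter. The sketch below is not how the collider's triggers are built (those work mainly at the hardware level, as noted above). The event fields, the threshold, and all the names are hypothetical. The code only shows the principle of keeping the few events that match a selection rule and discarding the rest.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator


@dataclass(frozen=True)
class Event:
    """A toy collision record; both fields are invented for this example."""
    event_id: int
    energy_gev: float  # hypothetical measured energy deposit


def make_energy_trigger(threshold_gev: float) -> Callable[[Event], bool]:
    """Build a selection rule that accepts events at or above a threshold."""
    def trigger(event: Event) -> bool:
        return event.energy_gev >= threshold_gev
    return trigger


def filter_events(events: Iterable[Event],
                  trigger: Callable[[Event], bool]) -> Iterator[Event]:
    """Yield only the events accepted by the trigger; the rest are dropped."""
    for event in events:
        if trigger(event):
            yield event


# Four made-up events, only one of which passes a 100 GeV threshold.
stream = [Event(1, 12.5), Event(2, 140.0), Event(3, 8.1), Event(4, 99.9)]
kept = list(filter_events(stream, make_energy_trigger(100.0)))
print([e.event_id for e in kept])
```

Because the filter is a generator, rejected events are never stored anywhere, which mirrors the point that discarded collision data cannot be recovered later.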


Of course, with this kind of preliminary filtering, some interesting data is lost along with the mountain of unnecessary and uninteresting "garbage". But what remains is mostly significant data, and its relatively modest volume allows fairly deep processing, even in real time.

Finally, it should be noted that the solution to the problem described above is not to make it possible to store mostly useless data. The solution is to build new sensors for the collider that use the latest technology and can probe currently unexplored areas of physics. Some of these sensors will be installed during the collider's next upgrade, which is under way right now. The launch of the upgraded collider is scheduled for 2025.