
Science in the age of supercomputers: Big data, big theory - which path leads to knowledge?

"In almost every detective novel, since the admirable stories of Arthur Conan Doyle, there comes a moment when the great detective has gathered all the factual material that he needs - at least for a certain phase of his investigation. However, the detective immediately realizes that he can also bring about a meaningful order in the collected factual material by mere reflection. " (Albert Einstein and Leopold Infeld, "The Evolution of Physics", 1938)

"We connect telescopes all over the world. We then store the radio light on hard drives and then process it in the computers. We store three or four petabytes of data. We then send that into large computer clusters. This material often looks very confused , incoherent and unrelated. For example particle physics!

Gigantic amounts of data accumulate at CERN - which are then searched by supercomputers for significant patterns (picture alliance / Christine Palasz)

At CERN there are particle accelerators that look for certain collision events. That is an incredible amount of data. You can no longer do that with the naked eye; you have to automate it. But then you know that you are throwing certain information away completely. And if there were something very exotic that only pops up every now and then, it falls somewhere into the nirvana of the computer. It is not even noticed. That is where human beings, with their creative nature and curiosity, can of course discover things that were not planned." (Heino Falcke, astronomer)

"Data doubles in science every 18 months, that is the global data situation in science in every field. This is also a trend that has continued for over ten years. Without big data, we can actually no longer operate complex biology . Even small laboratories generate data on a large scale and need computers these days. " (Jan Korbel, molecular biologist and bioinformatician)

The British physicist Michael Faraday was born in 1791. The Faraday cage is by no means his only groundbreaking invention. (picture alliance / dpa / Bifab)

"The usefulness of useless knowledge"

"My favorite example is electricity. In the 1830s, Michael Faraday played with electricity. He did funny things, made hair stand on end and things like that."

Audimax of Bielefeld University: Robbert Dijkgraaf - theoretical physicist and director of the Institute for Advanced Study in Princeton - talks about "the usefulness of useless knowledge".

"And once the Treasury Secretary went to Faraday's laboratory and asked, What is the practical use of electricity? And Faraday said, I don't know. But one day you will tax it!"

It is about the ability to think beyond what is visible, existing and known. The Dutch chemist Jacobus van‘t Hoff - the first Nobel Prize winner for chemistry - was such a thought leader. At the end of the 19th century he postulated, among other things, the spatial arrangement of carbon atoms in molecules.

Jacobus Henricus van 't Hoff was the first Nobel laureate in Chemistry (imago stock & people / United Archives)

The professional world initially ignored van 't Hoff's work, and colleagues mocked it. As the "Journal for Practical Chemistry" put it in 1877, a certain Dr. van 't Hoff "finds, it seems, no taste for exact chemical research" and apparently got his ideas "on the chemical Parnassus, climbed by daring flight".

"Van't Hoff was so hurt by this that he was criticized for his ingenuity! For his inaugural address at the University of Amsterdam later, he had therefore looked at the biographies of the 200 most famous scientists of the past centuries - Galileo, Newton and so on. He subdivided they fall into three categories: first, those with no sense of fantasy, then those with fantasy and an interest in art, music and literature, and third, those with an overflowing imagination who are also interested in strange things like astrology or mysticism. Van't Hoffs In their opinion, 15 of the 200 most important researchers belonged to this third category. Including Descartes, Leibniz, Newton. So - I would say the top 15. "

Married couple Albert Einstein and Mileva Maric-Einstein, taken in 1912 (picture alliance / akg-images)

Purely abstract thinking with Albert Einstein

Albert Einstein, too, initially developed the framework for his theory of relativity outside the academic world: as an employee at the Bern patent office, after work and on Sundays, in exchange with a few friends and his then wife Mileva. His thoughts also seemed absurd to many colleagues: that there is neither absolute space nor absolute time, but that lengths and times depend on the state of motion of the observer; that mass and energy are equivalent; that gravity can be described as a property of a curved space-time. Robbert Dijkgraaf:

"This is such an amazing human achievement! Especially because Einstein was able to develop it with almost no data. It was purely abstract thinking! Machines analyze how things work very well, but only humans ask: 'Why'? The theory of gravity, for example - according to Newton, there had been a lot of careful experiments with heavy bodies for a few hundred years and so on. But the question remained: Why does gravity work? To give an answer - that requires a whole new perspective. Einstein had that . And that's what people can do: take a different perspective and ask questions that you can only ask if you can move outside of a system. "

But how do you succeed in placing yourself outside a system? And why should only humans be able to do that? New theories often take decades to establish themselves. And once edifices of thought have established themselves, they are not brought down so quickly. Isn't it entirely plausible to expect an unbiased view of the phenomena at hand precisely from machines - which never have to apply for positions, need no letters of recommendation, and have no vested interests?

The first picture of a black hole

In April 2019 the "Event Horizon Telescope" project presented the image of a black hole. It immediately went around the world - even if the details of its interpretation are quite controversial. Heino Falcke, astrophysicist at Radboud University Nijmegen:

"This is the core of the galaxy M87 - and this is the first image of a black hole. An orange-red ring with a dark spot in the center. We see how light disappears into the darkness, in the shadow of the black hole. We see how light turns, flies in a circle, so to speak, forms this almost perfect ring. And that's why black holes have suddenly become reality after decades of abstraction. "

Black holes are one of the predictions of Einstein's theory of relativity. Andrea Ghez and Reinhard Genzel, who have just received the Nobel Prize for their observations of the center of our Milky Way, were the first to show that they actually exist. In 2015, gravitational waves caused by the collision of two black holes were detected. Now, the picture of one of these gravitational monsters. All within a few years - and 100 years after Einstein published his equations. Years it took to discuss the theory; to develop the experiments to test it and build the necessary devices; and to learn to operate those devices and evaluate the results.

First image of a black hole (Event Horizon Telescope (EHT) / dpa)

Data material battle with computer clusters

"We needed a telescope as big as the world."

A 350-strong international team of researchers worked on mapping the black hole for around 20 years. Heino Falcke describes the collaboration in his book "Licht im Dunkeln":

"We connect telescopes all over the world, we then store the radio light on hard drives; three or four petabytes of data, raw radio light. We then send that through large computer clusters, which synthesize it. Image processing methods and calibration algorithms are required So it is computationally intensive, a lot of material and effort in every respect, a real material battle where all the techniques are used. "

It is a long process of maturation - from theory to observable facts that can confirm or refute it. Often, the methods and technology required for such a test are only developed along the way.

New instruments, new insights in astrophysics

"Okay, let's go to another building." Heidelberg-Königstuhl, observatory of the University of Heidelberg, instrumentation department. Astronomical analysis instruments are built here under the direction of Walter Seifert.

"In here?" "Yes, in here. Near-infrared spectrographs, optical spectrographs, imaging systems - everything that is attached to the telescope that functions as a light collector. In the sense of an instrument that extracts astrophysical information from the light - such as, for example here, 4MOST; this will be a spectrograph for the ESO-VISTA telescope. "

One of the most recent projects at the state observatory: the confirmation of the redshift of light in the sun's gravitational field - like black holes, a consequence of the theory of relativity.

At the observatory, the astronomer Hans-Günter Ludwig develops simulations of flow processes on the sun that can disturb redshift measurements: "Of course I need measurement data! And without people like Walter, who builds these things, I would soon be at a dead end. Much of what happens in astronomy today is actually instrumentation-driven. Without Gravity, Reinhard Genzel probably would not have received the Nobel Prize; and without these technical innovations, Heino Falcke and co. would not have their image of the black hole either."

And Walter Seifert adds: "Or also at CERN and so on - with all these major research projects it is the same: of course - new instrumentation, new analysis instruments. You can do new science with them."

Circular reasoning danger in instrument design

Gravity - the technology that makes it possible to interconnect several telescopes into one large virtual telescope - and the particle accelerators at CERN are just two examples of advanced engineering that generates data meant to answer questions arising within a theoretical framework.

"Then, of course, there comes exactly this phase where it has to be precisely defined: What should the instrument look like so that it meets all the scientific requirements?

"If you make a mistake here, can it significantly falsify a research result?" "It could. That is why it is so - now in the laboratory is exactly the phase where we do tests. Simulations. To see whether the instrument does exactly what it should do. Without it, no instrument works - it will not released for scientific observation. "

Hans-Günter Ludwig: "We said we were confirming the general theory of relativity, but it is more that we would have doubted our measurements if it had not come out that way. We do not doubt the theory of relativity so much as the measurement method."

A clear theoretical framework produces precise analysis tools. But is there not a danger of circular reasoning? You want to find something you believe exists, you develop tools and methods to find it - and if you find nothing, you suspect that you measured incorrectly, and that more data and even finer tools are needed?

"Well - we expect surprises, of course. Of course, we have a pretty concrete idea in the known context of what we expect. But we hope that these measurements give us something beyond the known context, so that we experience something new there."

Existing theory can make you blind

The science historian Thomas Kuhn wrote in 1962 in "The Structure of Scientific Revolutions": a dominant theoretical structure is elaborated until more and more questions arise that can no longer be answered satisfactorily within the paradigm. Then it is replaced by a new paradigm - but only after a phase of searching. Because just as a theory enables you to see things no one has seen before, it can also blind you to things that exist but do not fit into the theoretical structure. Heino Falcke:

"Then you realize ten years later: If I had asked the right question back then, I could have discovered it back then. And for this it is important to save data and work through it again and again."

According to statistics, the amount of data generated, captured, copied and processed worldwide by December 2020 came to 59 zettabytes. That is about 60 million times the amount of data a human brain can store - and it doubles every two to three years. Chances are good that it also contains hints of connections that go far beyond existing knowledge.
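The arithmetic behind those figures can be checked in a few lines. The sketch below is a rough back-of-the-envelope illustration; the brain-capacity figure of about one petabyte is an assumption taken from popular estimates, not from the source.

```python
# Back-of-the-envelope check of the figures above.
# Assumption: a human brain stores on the order of 1 petabyte
# (a popular estimate; real figures are debated).

ZETTABYTE = 10**21  # bytes
PETABYTE = 10**15   # bytes

global_data_2020 = 59 * ZETTABYTE   # worldwide data volume, Dec 2020
brain_capacity = 1 * PETABYTE       # assumed per-brain storage

brains = global_data_2020 / brain_capacity
print(f"{brains / 1e6:.0f} million brain-equivalents")  # prints "59 million brain-equivalents"

def projected_volume(start_zb: float, years: float, doubling_years: float = 2.5) -> float:
    """Project data volume (in zettabytes), assuming doubling every 2-3 years."""
    return start_zb * 2 ** (years / doubling_years)

# Two doubling periods after 2020: 59 ZB -> 236 ZB
print(f"after 5 years: {projected_volume(59, 5):.0f} ZB")
```

Under these assumptions, the "about 60 million brains" comparison and the doubling claim are mutually consistent.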

DNA sequencer in 2009 - since then the technology and the analysis capacity have advanced again by orders of magnitude (dpa picture alliance / Jan Woitas)

Big data and supercomputers are essential for genome analysis

"Now we come to the area of ​​our sequencing machines ..." European Molecular Biology Laboratory in Heidelberg. Jan Korbel, geneticist and bioinformatician, leads through the laboratories of his work group.

"We are already working with thousands of human genomes, from the genetic make-up of cancer patients, and comparing them. In order to find out more about it: How does cancer develop? How can an environmental influence and a hereditary factor, how can they interact So that tumors develop later? And programs even have to do this because the genetic material - there are as many DNA building blocks as there are people! Six billion building blocks! In every cell! And that's why we need computer programs to create this To carry out comparison. "

Biology and the life sciences have been changed particularly profoundly in recent years by big data. "Next-generation sequencing" around 15 years ago and genome editing around eight years ago were milestones - and machine learning is now indispensable as well. Jan Korbel and his team develop algorithms that compare the genome of tumor cells with that of healthy cells. This work is tightly interlinked with "classic" laboratory work, for example on tissue samples.
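The core idea of such a tumor-normal comparison can be illustrated in miniature. The following is a toy sketch, not the group's actual pipeline: the sequences and the helper name are invented for the example. It scans two aligned sequences position by position and reports single-base differences as candidate mutations.

```python
# Toy illustration (not a real variant-calling pipeline): compare a tumor
# sequence against a matched normal sequence, position by position, and
# report every single-base difference as a candidate somatic mutation.

def find_mutations(normal: str, tumor: str):
    """Return (position, normal_base, tumor_base) for every mismatch."""
    assert len(normal) == len(tumor), "sequences must be aligned to equal length"
    return [
        (pos, n, t)
        for pos, (n, t) in enumerate(zip(normal, tumor))
        if n != t
    ]

normal = "ACGTACGTAC"
tumor  = "ACGTTCGTAA"
print(find_mutations(normal, tumor))  # prints [(4, 'A', 'T'), (9, 'C', 'A')]
```

Real pipelines work on billions of aligned reads with quality scores and statistical error models, but the underlying operation - a genome-wide comparison that no human could do by eye - is the one sketched here.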

"We have developed a new method - the first method worldwide - that enables us to look for mutations in individual cells, genome-wide. In order to then better understand cancer mutagenesis. And now we can combine that with microscopy and Then create the end between what has been seen in the microscope for over 100 years - here is a cell division that looks kind of strange - and a mutation process that can be identified in the DNA. And we can now build this bridge with what we propose was unthinkable for a few years. "

DNA sequences decode cell changes found under the microscope - almost in real time, across thousands of genomes with six billion base pairs each. If you know the mutations, you can adapt therapies. That can save lives. A huge step forward. Machines do not need to do more than that. Or do they? And if so: could they?

How can an "Artificial Intelligence" generalize?

Berlin Science Week, beginning of November 2020. A virtual panel of the Cluster of Excellence "Science of Intelligence" at the Technical University and Humboldt University of Berlin. Marc Toussaint, Professor of Intelligent Systems, talks about different types of machine learning, contrasting two ways of drawing conclusions from information. Many deep learning methods follow the first: neural networks search data and weight it according to specific objectives. They learn features by heart, so to speak, and can then supply a suitable answer on demand - such as: this, too, is a car. The other way tries to understand the principle that might generally underlie vehicles. The latter is usually called "thinking". The interesting question is: what is "thinking" good for? And how do machines and people differ in this ability?

"The term generalization is very important in AI and also in machine learning in general. Imagine a system or a person seeing an apple fall from a tree. And what do we get from it? That is the big question. If I do now If some data, perhaps videos, were from apples falling from the tree and entering these videos into a neural network and practically training the neural network to predict in the future whether an apple will fall or not, then the neural network can safely learn and in the future it will say: Yes, there is a red apple on the tree here, it could fall.If I take as a counterexample that there is a person standing there now - and he sees the apple fall and what he draws from it, is a theory, a model. And this theory and model is so abstract and so wonderful that it applies to so much and can predict so much more than just apples falling. Because it is very, very much more generalized. "

In science, a theory is the stronger, the more comprehensive the predictions that can be made on its basis. It arises from the ability to derive rules from observations - rules which in turn can be applied in other areas. Toussaint:

"So, if I have now understood gravity, how do I come to know how to build a house so that it doesn't collapse? Closing this gap is highly complex. And we do it often and do it we with what we call thinking. And in this sense the scientific process is actually the greatest form of learning that neural networks actually don't have. "

"But where should you go?" "Um - yes. Exactly. Where to go."

Protein folding analysis is extremely computationally intensive - one approach was therefore "distributed computing" over the Internet (Center for Game Science at University of Washington)

AI Breakthrough in Protein Folding Analysis?

End of November 2020. The AI company DeepMind - like Google, part of the Alphabet holding company - celebrates its newly developed program "AlphaFold" at a press conference:

"As you probably know - our group did exceptionally well in the CASP14 competition; both in comparison to the other groups and in the accuracy of our model as such. It is really outstanding, a big step forward; downright incredible."

It's about the problem of protein folding - one of the greatest challenges in biology for at least 50 years. Bioinformatician Jan Korbel:

"I learned in my university career that it cannot be solved. And now DeepMind has apparently solved it. Of course, we have to wait for the publication to see whether what you have done is really that mind-blowing." So - a protein is a chain of amino acids that forms a tangle. And the tangle ultimately does the function that determines whether the protein in the nose can develop an olfactory receptor or the effect that we can see: they are all proteins. How this folding comes about - that is an insanely difficult problem, because simply, you have a long chain of the amino acids with a lot of possibilities how each amino acid can interact with the other. What that looks like in three-dimensional space it has not been possible to predict reliably so far. "

The shape of a protein determines its function. Anyone who can predict from the sequence of amino acids how the finished protein will look and function holds a key to the basic mechanisms of life. They can use it to prevent undesired functions, or to create desired ones in the laboratory; they can understand diseases and develop drugs.

"Humans can't do that. Nobody can mentally fold a protein, but software now seems to be able to do it."

AI goal "Developing Understanding" has gotten out of focus

AlphaFold works with attention-based neural networks - a deep learning variant that uses only the information that is currently relevant. As with a jigsaw puzzle, parts whose connections are clear are assembled first; only gradually do they come together into a whole. AlphaFold does not know what a protein is, how it interacts with other proteins, or what role it plays in an organism. AlphaFold knows the amino acid sequences and structures of 170,000 known proteins, and on this basis it estimates with great accuracy which structures should belong to unknown sequences. That is still not "thinking". Because, according to Marc Toussaint ...
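The attention idea itself can be sketched compactly. The following is a generic scaled dot-product attention toy - an illustration of the mechanism in general, not AlphaFold's actual architecture: each query is compared against all keys, and the resulting relevance weights decide which values contribute to the output.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax: turn scores into a probability distribution."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    """Scaled dot-product attention: mix values V, weighted by how
    relevant each key in K is to each query in Q."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise query-key relevance
    weights = softmax(scores)        # each row sums to 1
    return weights @ V               # relevance-weighted combination

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))  # 4 queries, dimension 8
K = rng.standard_normal((4, 8))  # 4 keys
V = rng.standard_normal((4, 8))  # 4 values
out = attention(Q, K, V)
print(out.shape)  # prints (4, 8)
```

"Uses only the information that is currently relevant" corresponds to the softmax weights: inputs whose keys match the query poorly receive weights near zero and barely influence the result.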

"... the really great, highly generalized learning is not just about learning data by heart, but actually developing theories from data that are applicable to many things."

"That is possible?" "I do think that this is possible. Basically, that has always actually been the goal: that machine learning should deal with learning as highly abstracting models as possible from as little data as possible. Because that was the time when you did little Only because of the changing times, that you suddenly had so much data now, exactly this claim has actually been somewhat lost. "

"You could actually say that this immense flood of data, which has arisen in the last few decades, has, so to speak, made machines stop thinking?" "Yes - you could say that. Well - it is also true that these very data-driven methods are incredibly successful. Really incredibly successful. But, from a scientific point of view, if you are an AI researcher now, it is actually the case it's almost a bit disappointing. Because you have so unbelievable amounts of data, it is no longer so important that the systems really develop theories or compact representations. It is almost enough if they just memorize these billions of data and generalize something between these. And that also worked very well, but does not equal this scientific claim, yes - to develop understanding. "

Robbert Dijkgraaf is also one of those scientists with imagination - besides physics and mathematics, he also studied painting (imago stock & people / Guus Schoonewille)

For the time being, thought experiments are reserved for humans

Machines provide answers with a speed and precision that humans cannot match. They are indispensable when it comes to developing solutions for specific problems - in the gigantic dimensions of space as well as in microscopic cells and particles. They are driving progress on an unprecedented scale. And yet, for the time being, they only go where a human has already been. Robbert Dijkgraaf:

"My data sets are mathematical equations. To me they are like toys - in a way, my daily work consists of games. With these equations. Some of them, we think, know more than we do. We do thought experiments. And that's the wonderful thing: In our minds we can penetrate areas into which we have not yet come with experiments. "