The research center for intelligence as an emergent phenomenon (short: Emergent Algorithmic Intelligence, EAI) was founded in April 2019. Funding has been generously provided by the Carl-Zeiss-Foundation, with complementary contributions from JGU Mainz.
The research mission is to further our understanding of how and why intelligent systems, both natural and artificial, work. We are interested in a conceptual understanding of what is at the core of learning and intelligent systems as such, as well as in method development towards "better" algorithmic implementations of intelligence through statistical machine learning.
In the following, we discuss our main research hypothesis and how this is reflected in the structure of our center and the research performed.
Emergence and Statistical Learning
Starting Point: The ML-Revolution
The establishment of the center was motivated by the observation that technical progress might have opened up new perspectives on questions of a fundamental nature: In the past few years, roughly since the early 2010s, we have seen a revival of artificial neural networks, now trained on large data sets with large compute resources, usually provided by accelerator hardware such as GPUs. Given the vastly improved resources in terms of data and computational means (petaflops of compute rather than megaflops in the 1990s, and internet-scale data rather than small curated benchmarks that fit on floppy disks), we have been observing an unprecedented growth in pattern recognition and modeling capabilities, unmatched by any previous artificial method.
While the success of the methods is truly impressive (ranging from, for example, human-like image recognition at the time the center was founded in 2018/19 to, maybe, signs of even more general reasoning abilities in contemporary large foundation models in 2023), our understanding of why these methods work has been, and still remains (as of early 2023), very limited. We know how to do it, but we do not know why it works so well.
In that regard, research on artificial neural networks shares characteristics with neuroscience, where we have existing systems but do not understand why they work. Unlike with the biological counterpart, though, we fully understand how artificial networks operate mechanically, but nonetheless struggle to see why they are successful.
Why is this Surprising? Why is Learning Hard?
The mathematical approach and theory behind the current successes is that of statistical machine learning: Given a set of examples, a system (computer or natural) faces the task of generalizing, i.e., finding general rules for how to transfer the knowledge encoded in the examples to the general case.
For complex, high-dimensional data (such as photos with more than just a handful of pixels), this is a hard problem: It is actually quite easy to see (even formally) that generalization requires prior knowledge; without explicit or implicit assumptions on how the phenomenon behaves in between the data points, generalization is not possible (mathematically, we would only get random answers):
Example: Classifying photos of fruits by color. Without assumptions, random models would appear with the same likelihood as structured ones. Here, a human, due to their own prior knowledge, can see an obvious pattern, but in a 30,000-dimensional space of 100x100-pixel RGB images (3x100x100 degrees of freedom), finding patterns at a glance is impossible. So how do we teach this "human" knowledge to a machine?
Worse than that, simple arguments (for example via the bias-variance trade-off in statistical learning theory) show that the gap is exponential: We need to have much more prior knowledge than what we can (realistically) extract from training data (even when using the whole internet as a data source). This means:
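The claim that assumption-free generalization yields only random answers can be checked by brute force on a toy problem. The following is a minimal sketch, not part of the center's work: inputs are 3-bit vectors instead of images, and the training set, the query point, and the "label equals one input bit" prior are all hypothetical choices made for illustration.

```python
from itertools import product

def consistent_functions(n_bits, train):
    """Yield every Boolean function on n_bits inputs that fits the training set."""
    inputs = list(product([0, 1], repeat=n_bits))
    for labels in product([0, 1], repeat=len(inputs)):
        f = dict(zip(inputs, labels))
        if all(f[x] == y for x, y in train.items()):
            yield f

def single_bit_functions(n_bits):
    """Hypothetical prior: the label is a copy of exactly one input bit."""
    inputs = list(product([0, 1], repeat=n_bits))
    for i in range(n_bits):
        yield {x: x[i] for x in inputs}

def votes(functions, query):
    """Count how many of the given functions predict 0 or 1 at the query point."""
    counts = {0: 0, 1: 0}
    for f in functions:
        counts[f[query]] += 1
    return counts

# Five labeled examples out of eight possible inputs (illustrative data).
train = {
    (0, 0, 0): 0,
    (0, 0, 1): 0,
    (0, 1, 0): 0,
    (1, 0, 0): 1,
    (1, 0, 1): 1,
}
query = (1, 1, 0)  # not contained in the training set

# Without assumptions: all functions that fit the data get an equal vote.
unrestricted = votes(consistent_functions(3, train), query)

# With the prior: only single-bit functions that fit the data get a vote.
restricted = votes(
    (f for f in single_bit_functions(3)
     if all(f[x] == y for x, y in train.items())),
    query,
)

print(unrestricted)  # {0: 4, 1: 4}
print(restricted)    # {0: 0, 1: 1}
```

Without assumptions, the functions that fit the data split evenly on the unseen input, so the data alone says nothing about it. With the (assumed) prior, a single hypothesis survives and the prediction becomes definite.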
- True universal learning (intelligent) systems in a mathematical sense are inherently impossible.
- Systems that work on "all" or at least "most" patterns in the natural world are only possible if all of these phenomena are restricted to a small subset of the mathematically possible patterns. Implicitly, that means that they can be compactly encoded once we have captured their structure (i.e., exponentially less information is needed to describe which of the special patterns we are facing than an encoding of an arbitrary mathematical pattern would require).
Given that we do have intelligent systems that are very successful at understanding the natural world (e.g., humans, but we do not understand our brain very well) and are getting closer to universally applicable (to "natural" data) algorithmic methods (deep artificial neural networks, which we build ourselves), this raises the question of what the nature of the required prior knowledge is.
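The size of the gap can be made concrete with a simple counting argument. This is a sketch of one way to see it, not the bias-variance argument mentioned above; it assumes 8 bits per color channel and a binary labeling task, and the symbols N, m, k, F and H are introduced here for illustration only.

```latex
\begin{align*}
N &= 256^{30000} = 2^{240000}
  && \text{(distinct $100 \times 100$ RGB images)} \\
|\mathcal{F}| &= 2^{N}
  && \text{(binary labelings of all images)} \\
\log_2 |\mathcal{F}| &= N = 2^{240000}
  && \text{(bits needed to specify an arbitrary labeling)} \\
\bigl|\{ f \in \mathcal{F} : f(x_i) = y_i,\ i = 1, \dots, m \}\bigr| &= 2^{N - m}
  && \text{(labelings still consistent with $m$ distinct examples)} \\
\log_2 |\mathcal{H}| &= k \ll N
  && \text{(bits needed within a restricted class $\mathcal{H}$, $|\mathcal{H}| = 2^{k}$)}
\end{align*}
```

Each labeled example removes at most one bit of uncertainty about an arbitrary labeling, so any realistic m is negligible compared to N. Learning from data only becomes feasible if the patterns of interest lie in a class H whose description length k is comparable to the information available.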
So, what is it that makes our universe so simple that we (and maybe our artificial networks) can learn to understand it? What is common between natural language, photographs, music, playing games, and quantum mechanical simulations (all of which are phenomena well-modeled by artificial neural networks)?
At this point, in our perception, the question of understanding artificial intelligence systems and statistical machine learning turns from a research question in math and computer science into a broader question of the empirical sciences: We can ask, from the perspective of (statistical) physics, what we know about the rules our universe seems to follow (roughly, ignoring the unsolved quantum-gravity riddle). And how does this give rise to coarse-scale effective theories that can, again, be understood in a compact and learnable form? Are there more general principles hidden behind this that could explain "how to" build general statistical learning systems? Are deep networks already implementing (some of) these principles?
Similarly, life sciences like biology have already gained insights into how living systems organize themselves through biological evolution and adaptation. Neuroscience, in particular, has studied the properties of biological neural networks, such as their evolution over millennia and individual adaptation to outside signals.
The question of how to build a universally useful statistical learning system (or to understand why current systems are getting closer to this goal, or to understand how the natural systems do the trick) appears to be a fundamental and, likely, hard one: If we want to understand how statistical learning works for the world we live in, we have to understand what the basic statistical structure of this world is.
The crucial observation that makes this question actionable (rather than fundamentally elusive) is that very simple artificial systems, such as the now-popular artificial neural networks (which, in essence, can be as simple as stacks of linear transformations and simple switches), appear to capture at least some crucial aspects of this statistical structure; otherwise, their quite general success would be inexplicable. It is for this reason that we believe it is worth revisiting this seemingly metaphysical question and trying to find concrete links to structural models observed in the empirical sciences.
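To illustrate how little machinery "stacks of linear transformations and simple switches" requires, here is a minimal sketch in plain Python. It assumes the switch is a ReLU (pass positive values, block negative ones); the function names and the hand-picked weights are hypothetical and chosen so that the two-layer stack computes XOR, a function that no single linear transformation can represent.

```python
def linear(weights, bias, x):
    """Affine map: one output per row of `weights`."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def relu(v):
    """The 'switch': pass positive values through, block negative ones."""
    return [max(0.0, vi) for vi in v]

def network(layers, x):
    """Apply linear -> switch for every layer except the last, which stays linear."""
    for weights, bias in layers[:-1]:
        x = relu(linear(weights, bias, x))
    weights, bias = layers[-1]
    return linear(weights, bias, x)

# Hand-picked (not learned) weights, assumed for illustration only:
#   h1 = relu(a + b), h2 = relu(a + b - 1), y = h1 - 2 * h2
XOR_LAYERS = [
    ([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0]),  # hidden layer with two units
    ([[1.0, -2.0]], [0.0]),                   # linear output layer
]

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print((a, b), network(XOR_LAYERS, [a, b])[0])
```

In practice the weights are not set by hand but fitted to data, and networks are far larger; the building blocks, however, remain of this kind.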
The key objectives of the research center for Emergent Algorithmic Intelligence are thus:
- Bring together researchers from disparate disciplines in empirical sciences and mathematical and algorithmic modeling to address these questions jointly.
- Further our understanding of natural intelligence as well as of artificial statistical learning methods, including research into new methods and their implementation, among them unconventional options based on self-organizing physical systems.
It should be emphasized that we perceive the core research question of our center as "one of the big ones", which might be very hard to solve, if it can be solved at all. Concrete research projects are therefore designed as a healthy mix of concrete near-term goals and more forward-looking endeavors. Nonetheless, we believe that it is worth not shying away from asking the fundamental questions, in particular in times of turbulent developments like ours.
Structure of EAI
The research center was built from three "startup projects", representing three research pillars:
- Statistical physics (going bottom-up, with emergence of large structures from smaller components)
- Biological evolution and adaptation (looking top-down, starting at the most complex structures and systems we know about)
- Self-organizing systems and computational matter: A middle ground, where we try to build learning systems in-matter by exploiting physical self-organization principles.
Over the course of the center's running time, eight additional projects have been added (through a competitive call for proposals among researchers on campus) that expand on these topics, bringing the total to 11 projects within the center.
The process of bringing in additional colleagues and ideas has helped to shape the structure and direction of the center and to better understand potential contributions from other disciplines. One of the most important outcomes in that regard was probably the addition of several projects and researchers working in neuroscience. They not only ask similar questions and share important complementary knowledge and expertise on what we do and do not know about the answers, but also benefit from the technical expertise contributed by colleagues in disciplines dealing with data science and formal system modeling.
Impact and Responsibility in the Age of Emerging AI
One motivation, and not the least, for establishing a new research center on the topics discussed here was the likely and foreseeable impact of practical machine learning systems on our university, on empirical research in a broader sense, and on our society as a whole. Looking back at 2018/19 from the perspective of early 2023, it seems that development has been at least as rapid as expected, if not even more surprising in scope and potential impact.
The emergence of strong algorithmic intelligence comes with risks, ranging from economic disruptions to intentional or unintentional abuse and misuse, or even fears of existential risks. The role of this center is to help understand the foundational background, which might be one prerequisite to understanding some of the inherent risks. At the same time, it is part of the mission of EAI to spread existing knowledge to researchers, educators and students on campus. For example, making them aware of the nature of generalization errors in statistical learning or the nature of bias in learning systems can help reduce the risk of unintentional misuse, and creating awareness of current developments brings the broader community on campus into the debate. In addition to the indirect effect of collaborative research, members of EAI have also contributed to several educational events, such as various summer schools with local and international participants, and to other outreach activities (e.g., 2018, 2019, 2020, 2021).
Funding and Future
The EAI center was established in April 2019 and has been funded for five years by the funding parties, i.e., this initial funding period will run out in March 2024. This will, however, not be the end of EAI. The expertise, the personal connections, and the emerging new collaborations and follow-up projects will persist and strengthen the research expertise of JGU in this crucial area. In addition, the task of providing a platform, a forum, and concrete support for research on emergent structures in complex systems, including their links to statistical models and machine learning, will be continued as an area of the broader M³ODEL research initiative, one of the five key "profile" research areas at JGU, with funding from the state of Rheinland-Pfalz through its "Forschungsinitiative".
At this point, we have not yet figured out what constitutes the statistical nature of intelligence; however, the center has already made numerous research contributions that might add small pieces to the overall puzzle. We encourage you to read about the details on our publication page.