In my career I've encountered researchers in several fields who try to address the (artificial) intelligence problem. What I found, though, is that researchers working within those fields had only a vague idea of all the others trying to answer the same question from a different perspective (in fact I had a very faint idea myself initially as well). In addition, in the best tradition of Sayre's law, there is often tension and competition between the researchers occupying their niches, resulting in violent arguments. I've had the chance to interact with researchers representing pretty much all of the disciplines I'll mention here, and since many of the readers of this blog may be involved in research in one or a few of them, I decided it might be worthwhile to introduce them to each other. For each community I'll try to explain (at least from my shallow perspective) the core assumption, the prevalent methodology, and the possible benefits and drawbacks of the approach, as well as a few representative pieces of literature/examples (a purely subjective choice). My personal view is that the answer to the big AI question cannot be obtained within any one of these disciplines, but will eventually be found somewhere between them, and likely bits of the answer lie in each of those fields. In the summary I'll explain how the Predictive Vision Model (PVM) draws from ideas in these disciplines and how it may begin to bridge some of the gaps.
-
Neuroscience
-
Assumptions/reasons
Neuroscience attempts to understand the brain and related organs in living animals. This endeavor is motivated by the fact that only biological beings show intelligence and advanced behavior and therefore the key to understanding intelligence lies inside the brain tissue. In addition, understanding biological mechanisms of intelligence could allow for better treatment of cognitive and psychiatric problems.
-
Methodology
Neuroscience is a primarily experimental discipline; there are several main ways in which the research is conducted, e.g.:
- Recording cells from a live animal. The animal is head-fixed in a recording apparatus and surgery is performed to open the cranium. Electrodes are inserted and pushed until they penetrate a neural cell. Once a cell is found, its activity is recorded until the cell is lost (it either dies or slips off the electrode). The activity is then reverse correlated with the stimuli the animal was exposed to (a minimal sketch of this reverse-correlation step follows this list). The original experiments by Hubel and Wiesel, which led to the discovery of simple and complex cells in the primary visual cortex, were performed this way.
- Connectomics - brain tissue from a deceased animal is sliced and scanned. Next, a 3d connectivity diagram is recovered from the slices.
- Two-photon microscopy with genetic modifications to cells. Certain populations of cells can be altered (either by genetic selection or by infection with a special virus) to express fluorescent proteins. Next the tissue is illuminated with a laser and recorded with a microscope.
- Growing tissue samples in a dish.
- Intracranial stimulation. Electrodes are implanted into a region of the brain and fixed in place. Next, the brain is stimulated and the animal's behavior is recorded.
- Recording of mean field potentials (averaged neural activity in the area)
- Various larger scale recording/scanning techniques such as MRI, fMRI, DTI, PET to identify the regions of the brain responsible for given cognitive functions.
- Obtaining electrophysiological properties of neurons, as well as the mechanisms underlying synaptic plasticity, and creating and simulating detailed (possibly large) models.
- Various combinations of the above methods.
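To make the reverse-correlation step mentioned in the first bullet a bit more concrete, here is a minimal sketch of spike-triggered averaging in Python. The arrays `stimuli` and `spike_counts` are hypothetical stand-ins for the recorded data, not part of any particular lab's pipeline.

```python
import numpy as np

def spike_triggered_average(stimuli, spike_counts):
    """Rough estimate of a cell's linear receptive field: average the stimulus
    frames weighted by how many spikes each frame evoked.

    stimuli      -- hypothetical array of shape (n_frames, height, width)
    spike_counts -- hypothetical array of shape (n_frames,)
    """
    weights = spike_counts.astype(float)
    if weights.sum() == 0:
        raise ValueError("no spikes recorded for this cell")
    # Weighted average of the stimulus frames; bright/dark regions of the result
    # approximate the ON/OFF structure of the receptive field.
    sta = np.tensordot(weights, stimuli, axes=(0, 0)) / weights.sum()
    return sta - stimuli.mean(axis=0)  # subtract the mean stimulus as a baseline
```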
-
Issues
The primary problem neuroscience faces is the enormous complexity of the biological brain. There are multiple levels of interaction, ranging from the molecular level all the way up to entire networks. Potent feedback loops at multiple scales and different kinds of neurons (e.g. inhibitory cells) make system identification extremely hard, particularly further away from the perceptual front end. As a result, only the early stages of visual processing can be somewhat characterized, and even in the primary visual cortex this characterization is far from complete. Connectomics, aside from purely technical problems in reassembling microscope slices into a 3d structure, faces the problem of estimating the strength of synapses and the dynamical properties of neurons (neither of which is available from dead tissue samples). Aside from that, the complete connectome of even a small animal such as a mouse would take petabytes of space and is extremely hard to analyze. Above all, the omnipresent feedback makes biological systems nearly impossible to decompose into clear functional units, therefore we will likely need to understand everything before we can understand anything. Given this lack of decomposability, all computer simulations are doomed to be incomplete in some aspect, unless they become extremely big, in which case the number of free parameters and the chaotic character of nonlinear feedback interactions renders them nearly impossible to control.
-
Examples/major efforts
Hubel and Wiesel - seminal work on the cat primary visual cortex, which led to the discovery of simple and complex cells and later inspired the neocognitron, which in turn inspired convolutional neural nets
Human Brain Project - an ambitious European project attempting to gather data and simulate the entire human brain in a computer at a high level of detail
BRAIN Initiative - a US effort to improve data gathering and microscopy to understand larger networks
-
-
Machine learning, particularly connectionism
-
Assumptions/reasons
The key observation is that the brain is composed of multiple similar units (neurons) and, although the biological neuron is very complex, some simple operations can be abstracted out and studied in a simplified computer model. For example, the perceptron assumes that a neuron performs a dot product between its inputs and its input weights and passes the result through an (often nonlinear) activation function. Larger networks can be composed of such units into so-called connectionist models.
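As a minimal illustration of this abstraction (not any particular library's API), a single perceptron-style unit and a layer of such units might look like this:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def perceptron(inputs, weights, bias):
    """One neuron-like unit: dot product of inputs and weights, passed through
    a nonlinear activation function (here a sigmoid)."""
    return sigmoid(np.dot(inputs, weights) + bias)

def layer(inputs, weight_matrix, biases):
    """A connectionist model is built by wiring many such units together;
    a single layer is just all the dot products done at once."""
    return sigmoid(weight_matrix @ inputs + biases)
```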
-
Methodology
The field originated as a search for optimization techniques that would allow tuning the parameters of large networks of simple neuron-like units such that the emergent systems would capture some aspects of cognition. Since the invention of the backpropagation algorithm in the early 1980s, the methodology typically involves creating a dataset that represents the input/output mapping to be learned. Next, models (often called artificial neural nets) are created and trained: the weights between neurons are adjusted using some form of the gradient descent algorithm until the model achieves satisfactory results.
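A minimal sketch of that workflow, assuming a toy dataset and a tiny two-layer network (all names and sizes here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 4))                  # hypothetical inputs
y = (X.sum(axis=1, keepdims=True) > 0) * 1.0   # hypothetical target mapping

# A tiny two-layer "artificial neural net".
W1, b1 = rng.normal(scale=0.1, size=(4, 8)), np.zeros(8)
W2, b2 = rng.normal(scale=0.1, size=(8, 1)), np.zeros(1)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5

for step in range(2000):
    # Forward pass: compute the network's current guess.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass (backpropagation): gradient of the mean squared error.
    d_out = (out - y) * out * (1 - out) / len(X)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient descent: nudge every weight against its gradient.
    W2 -= lr * (h.T @ d_out);  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);    b1 -= lr * d_h.sum(axis=0)
```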
-
Issues
Although many ideas in machine learning did originate in neuroscience (e.g. the perceptron itself, the neocognitron - aka the convolutional neural net, the Kohonen self-organizing map), the field, particularly recently, has grown substantially in more practical directions. Good results on image classification and other tasks where large labeled datasets are available have led researchers to explore these architectures, often disregarding the biological origins of these algorithms. However, we know for a fact that there are numerous aspects of biology that these models currently do not incorporate, or incorporate only minimally. One such property is the overwhelming amount of feedback connectivity in seemingly feedforward sensory systems (such as e.g. vision). Another, related one, is the apparent near-critical state of the cortex manifested by the scale-free structure of neural avalanches. Also, connectionist models are typically optimized as statistical black boxes without any regard for the temporal/dynamical aspect, which is essential for real-world behavior. Last but not least, the majority of efforts in machine learning are focused on supervised and reinforcement learning. These paradigms, though useful in restricted applications, are not the primary way biological entities learn. The majority of human/animal learning is unsupervised, or in some way self-supervised by interaction with physical reality. Hence the models created in this field often suffer from a lack of "common sense": the training set is sufficient to roughly isolate the classes, but the errors made by the systems indicate that there is no deeper "understanding" of the problem.
-
Examples
Most notably deep learning in its various incarnations.
-
-
Neuromorphic engineering
-
Assumptions/reasons
The human brain with its billions of neurons and trillions of synapses dissipates roughly 20W of power (approximately 20% of the total energy budget of a human being). Depending on the estimate of how neurons map to transistors, this is still many orders of magnitude less power per unit of computation than any technique we currently use in digital computers. Notably, operating at high frequencies necessitates the use of higher switching voltages, which in turn consumes a lot of energy (and is the reason why nearly every CPU these days has a large fan). In fact, heat dissipation is one of the major limiting factors in computing. Contrary to digital computers, the brain operates relatively slowly (on the order of a few kHz, not GHz) but is massively parallel. The aim of neuromorphic engineering is to mimic these properties in digital or analog fabric and to build much more energy-efficient chips. This is very important, since unreasonable energy consumption of the "cognitive module" is a show stopper for many autonomous robotic devices, and may be an issue even for large mobile robots such as autonomous cars with a fairly forgiving energy budget. Simply put, autonomous robots may need to have datacenter processing capacity onboard, but cannot afford a datacenter's energy requirements.
-
Methodology
The prevailing methodology in this field is to build a chip, make sure it consumes little power and offers a huge amount of parallel computing, and then figure out what it is good for.
-
Issues
The main issue of this field is the methodology itself, since the entire approach is somewhat backwards. Traditionally, once an application is established, processing hardware can be built for it, and that design stage is considered one of the final stages of optimization (e.g. hardware support for mpeg compression etc.). Here we try to build hardware whose only guidelines are that it needs to consume little power, be very parallel and, to various degrees, "biological" - the application is largely missing. In the end, without a killer application, many of these (often amazing!) chips end up in a museum before they can do anything useful.
-
Examples
IBM TrueNorth - large effort from IBM, result of the DARPA Synapse program
SpiNNaker - a very interesting effort to connect many small ARM cores with a switched network, run by Steve Furber in Manchester
BrainScales - part of the Human Brain Project
KnuEdge - a company in San Diego working on something roughly similar to SpiNNaker.
-
-
Robotics
-
Assumptions/reasons
Roboticists obviously want to build useful and capable robots. For that they need to solve many mechanical and control problems and last but not least develop some level of cognition. Robot builders are typically engineers and very practically oriented people - they like solutions that are simple, robust and that work. Consequently they don't like to hinge their products on big scientific unknowns such as solving AI.
-
Methodology
Given the practical approach of this discipline, roboticists will often solve difficult problems of perception and planning by incorporating expensive but very accurate sensors such as lidars, or precise actuators with feedback. In addition they will try to limit the scope of what the robot can do and control in its environment. When the environment is controlled, numerous priors about it (and the robotic body itself) can be coded into the control mechanism and used to generate optimal actions. Many advanced robots will therefore map their environment with sophisticated devices, combine this data with assumptions about the environment, and perform pre-programmed actions.
-
Issues
Since the robots have very limited brains, they can only safely operate in restricted environments. In addition, advanced sensors and actuators are very expensive, limiting the adoption of robotics to wealthy industrial applications (some in healthcare), where it does not matter that much if a robot costs north of a million bucks. Applications of robots in unrestricted environments are limited to several DARPA/NASA challenges and, recently, tests of self-driving/driverless cars by several big companies. However, the inherent complexity of reality often leads to unrecoverable situations where the pre-programmed priors are insufficient to generate the right behavior. Consequently there is some drive to apply more machine learning in robotics to cover these corner cases. That, however, faces issues, since machine learning, focused primarily on supervised/reinforcement learning, needs large amounts of data, which roboticists often cannot provide (the robot may end its mechanical life before it can even physically generate enough data). Another issue is the fundamental question of whether it is even possible to memorize all the corner cases, or whether some new approach should be used to allow the device to reason in the field and adapt to completely new situations.
-
Examples
Carnegie Mellon Robotics Institute
iRobot
Rethink Robotics
Boston Dynamics
Google Self Driving Car (Waymo)
-
-
Cognitive science
-
Assumptions/reasons
Cognitive scientists are often philosophers who try to view the problem of cognition more broadly. They merge the findings of psychology, psychophysics, neuroscience and machine learning and try to build more general and coherent theories of what is going on when we generate behavior.
-
Methodology
Since this discipline sits at the edge of the scientific approach, it does not have a single established methodology, and many arguments are speculative. This, however limiting in some respects, allows for greater freedom in others.
-
Issues
Since there is no good methodology, it is often hard to evaluate the results. That being said, this field is very important, as it allows us to formulate the questions before we know how to systematically answer them. The idea of the predictive brain, as summarized by Andy Clark, is one of the basic assumptions behind the PVM. However, cognitive science departments typically don't have the resources to hire programmers and test their ideas in simulation, and are to a large degree ignored by the application-oriented machine learning people. Consequently many great ideas remain only on paper.
-
Examples
Andy Clark's BBS paper and references therein.
-
-
Physicists (thermodynamics)
-
Assumptions/reasons
Recently the problem of intelligence has been formulated in the framework of thermodynamics. The main observation is that a population of intelligent agents is a thermodynamic system like any other, and therefore the existing formalisms can be used to analyze their behavior.
-
Methodology
The broad mathematical formalism of thermodynamics: estimation of the free energy available for extraction, and formulation of the cognitive tools that allow agents to extract the maximal amount of energy from the environment.
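As a toy illustration of the flavor of this methodology (loosely inspired by the causal-entropy idea, not a faithful reproduction of any published algorithm), one can score each candidate action by how wide a spread of future states it leaves reachable; `step` and `actions` below are hypothetical:

```python
import numpy as np

def most_future_preserving_action(state, step, actions, horizon=20,
                                  n_rollouts=100, rng=None):
    """Pick the action whose random future rollouts reach the widest spread of
    states - a crude stand-in for maximizing the entropy of future paths."""
    rng = rng or np.random.default_rng()
    scores = {}
    for a in actions:
        finals = []
        for _ in range(n_rollouts):
            s = step(state, a)                    # take the candidate action
            for _ in range(horizon - 1):
                s = step(s, rng.choice(actions))  # then explore the future at random
            finals.append(s)
        finals = np.asarray(finals, dtype=float)
        # Log-variance of the reached states as a rough entropy proxy.
        scores[a] = float(np.log(finals.var(axis=0) + 1e-9).sum())
    return max(scores, key=scores.get)
```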
-
Issues
As exciting as this approach is, the formalism will often just say that some things need to be done without telling us how. So to some degree this approach is great at telling us what intelligence is and why exactly we need it, but not how to construct it. This, however limiting, is still extremely useful, since the field of Artificial Intelligence has survived many years without any satisfactory definition of intelligence, which is in my honest opinion quite embarrassing. Moreover, many researchers diving deep into machine learning don't even perceive the need for such a definition and instead prefer to use the term AI arbitrarily, whenever convenient (which is even more embarrassing and has led to the phenomenon of an AI winter several times before).
-
Examples
Alex Wissner-Gross's paper on causal entropy
Susanne Still's work on the thermodynamics of prediction and iterative learning, and a few references therein.
-
How PVM fits all of this
The Predictive Vision Model is an attempt to bridge several of these approaches. It starts from the "predictive brain" idea summarized by Andy Clark. This idea is further reinforced by the findings of the physicists, where intelligence is formulated as an optimization of causal entropy (future choices) and hence requires the ability to predict the state of the world. The same is true for robotics, where having a good forward model is crucial for planning and behavior execution - here we note that the forward model should include the external environment and be trained rather than pre-programmed. PVM then draws from neuroscience to incorporate, in a sensible way, the overwhelming feedback connectivity. The feedback is allowed to grow strong, since the predictive constraint prevents it from "blowing up", resulting in a near-critical system. Since the component PVM needs is an associative memory unit, it takes these building blocks (currently MLPs) from machine learning, where they have been developed and studied. That said, it leaves the path open for various neuromorphic implementations, since PVM does not crucially depend on the features of particular machine learning algorithms, but rather on the sole ability to associate and compress (which can be achieved in a variety of ways). The open path to neuromorphic implementations will in my opinion be crucial for future applications (power budget) as well as for scaling this up beyond our current, primarily von Neumann computing architectures.
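To make this a little more concrete, here is a highly simplified sketch of the kind of building block described above: an associative-memory unit (here a tiny MLP) trained to predict its next input from its current input plus a feedback/context signal, returning a compressed hidden code that would be passed to other units. The real PVM wires many such units into a hierarchy with lateral and feedback connections and differs in many details (see the paper); everything named here is illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class PredictiveUnit:
    """A toy predictive associative-memory unit: a small MLP trained online to
    predict its next input from (current input, context)."""

    def __init__(self, n_in, n_context, n_hidden, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(scale=0.1, size=(n_in + n_context, n_hidden))
        self.W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))
        self.lr = lr

    def step(self, x_now, context, x_next):
        """Predict the next input, learn from the prediction error, and return
        the hidden code (the compressed signal sent up/laterally) and the prediction."""
        z = np.concatenate([x_now, context])
        h = sigmoid(z @ self.W1)
        pred = sigmoid(h @ self.W2)
        err = pred - x_next                      # prediction error drives learning
        d_pred = err * pred * (1 - pred)
        d_h = (d_pred @ self.W2.T) * h * (1 - h)
        self.W2 -= self.lr * np.outer(h, d_pred)
        self.W1 -= self.lr * np.outer(z, d_h)
        return h, pred
```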
It turned out, after putting this whole thing together, that it worked right out of the box and allowed us to build a very nice visual object tracker (details in the paper). But more importantly, the model integrated many seemingly distant features in a sensible way. The work on PVM is just the beginning and much more needs to be done, starting with an efficient GPU implementation, but I think there will be many exciting developments in the not too distant future.