Welcome back. First of all, apologies for not posting as frequently as I used to. As you might imagine, blogging is not my full-time job and I'm currently extremely involved in a very exciting startup (something I'm going to write about soon). On weekends and evenings I'm busy helping care for a 7-month-old infant, and altogether that leaves me with very little time. But I'll try to do better soon, since a lot is going on in the AI space and signs of cooling are now visible all over the place.
In this post I'd like to focus on the recent book by Gary Marcus and Ernest Davis, Rebooting AI. Let's jump in.
If you are a person who is not necessarily deeply involved in recent (the last 10 years or so) developments in AI and instead you've been building your image of the field based on flashy PR statements by various big companies (including Google, Facebook, Intel, IBM and numerous smaller players) - this is a book for you. The first part of the book goes thoroughly through various press releases and "revolutionary" products and tracks how these projects failed, either spectacularly or quietly.
Reading the first several chapters you get the feeling of a breeze slowly blowing away the fog of corporate propaganda and shining light on a picture very different from the one big-corp PR tries to maintain. And that picture is not that there has been no progress at all, but that the systems widely hailed as "pre-AGI" (those in the know can giggle here for a bit) - systems that supposedly solved language translation, speech recognition, image recognition and so on - are littered with limitations and are extremely fragile and brittle. This is not unlike something you could read on this blog every now and then, but in the book these examples are condensed and distilled, and the whole narrative sounds almost like a round fired from a machine gun, one hit after another.
The main premise of the book could be summarized in several points:
- The capabilities of contemporary AI systems (a field mostly dominated by deep learning solutions of one kind or another) are much less impressive than one is led to believe.
- Even though the leaders of the deep learning movement suggest that systems learned end-to-end are superior to hybrids (combining pattern matching and learning with other, more symbolic methods), the actual top-performing examples prove the contrary, with AlphaGo clearly being a hybrid (utilizing tree search etc.); a sketch of that hybrid idea follows this list.
- Current deep learning models are black boxes and have surprising failure modes, hence cannot be trusted in important applications.
- The authors note that the vast crowds of people who currently call themselves AI researchers have knowledge of psychology and neuroscience far too inadequate to even frame the actual magnitude of the problems they claim to be solving.
- The book argues for more hybrid approaches that leverage the best of both worlds: symbolic good old-fashioned AI (GOFAI) combined with the new wave of deep learning AI.
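To make the "hybrid" point from the AlphaGo bullet concrete, here is a minimal sketch - my illustration, not the book's and certainly not AlphaGo itself - of a hand-coded game-tree search whose leaf evaluations come from a learned model. The toy subtraction game and the stand-in value function are assumptions made purely for illustration.

```python
from typing import Callable, List

def legal_moves(stones: int) -> List[int]:
    # Toy subtraction game: remove 1, 2 or 3 stones; whoever takes the last stone wins.
    return [m for m in (1, 2, 3) if m <= stones]

def search(stones: int, depth: int, value_fn: Callable[[int], float]) -> float:
    """Negamax search: symbolic tree expansion, learned evaluation at the leaves."""
    if stones == 0:
        return -1.0                 # the player to move has already lost
    if depth == 0:
        return value_fn(stones)     # the learned pattern matcher takes over here
    return max(-search(stones - m, depth - 1, value_fn) for m in legal_moves(stones))

# Placeholder "network": in a real hybrid this would be a trained value net.
def toy_value_fn(stones: int) -> float:
    return 1.0 if stones % 4 != 0 else -1.0

best_move = max(legal_moves(10), key=lambda m: -search(10 - m, 3, toy_value_fn))
print(best_move)  # 2 - leaves the opponent a losing position (a multiple of 4)
```

The point is purely structural: the search scaffolding is engineered, the leaf evaluation is a stand-in for something learned, and neither half alone constitutes the system - which is the pattern the authors point to in AlphaGo-style hybrids.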
I agree with most of these observations, except the last point, which gives me somewhat mixed feelings; whether I agree with it or not depends strictly on the context.
If we are talking about building a system to solve a particular problem - one that is relatively well defined and narrow, such as the problems entrepreneurs typically want to solve for somebody in order to make a profit - the hybrid approach seems totally reasonable. In fact, if you have a well-defined problem, ANY approach that allows you to solve it to a satisfactory level is good. In some sense, perhaps naively, I'd like to believe that entrepreneurs don't build their companies in order to participate in the somewhat academic argument over whether everything should be learned, or whether a deep learning system can have a front- or back-end that e.g. runs on symbolic logic or engineered features. In the end what matters is whether the problem at hand is solved, not how exactly it is solved, and if that is not the objective of a company - if the objective is instead some semi-religious statement about the current fashion in AI - that company is doomed to fail.
On the other hand, mixing up current deep learning stuff with symbolic methods does not seem to me personally like a road that would get us to actual AI, as in AI that is actually "intelligent". The authors in the second half of the book focus on language (which is their speciality), go over various projects from good old-fashioned AI that attempted to bring so-called "common sense" to the field, and soberly conclude that they pretty much all failed rather miserably. They observe something I've been explaining on this blog since I started it - nobody really knows what common sense is. We only realize - namely verbalize - some common sense statement once something in the world (e.g. a hopeless computer program or a robot) points out to us that the statement in question is indeed not obvious. Common sense is like our own blind spot: each of us has one, but unless explicitly instructed we can't see it.

My hunch is that there is a mountain of "common sense" stuff which sits at a much more primitive level than whatever can be easily expressed using language. I'm mostly thinking of vision (since that is my primary area of expertise). There are myriad features of visual scenes which are "obvious" to our low-level visual systems, and any violation of such low-level rules triggers a behavioral response: the way light refracts and reflects from surfaces, the way solids deform, the way things vibrate and swing in the wind, all that stuff. It is what allows us to tell apart a scene rendered on a computer from a real image (which arguably gets harder as we program more and more of these details into 3d rendering software - which details? exactly those which let us distinguish real from rendered). This is true for vision and likely all other senses, including higher-level fusion of senses: a myriad of things can be considered common sense about audio-visual aspects of scenes, audio-tactile, visual-tactile, audio-olfactory, visual-olfactory and so on. Anyway, the idea of expressing all that stuff as sentences in a language seems infeasible, and the history of all these common sense databases seems to be the best evidence for that.
The stuff I'm talking about is in a way symbolic, but these symbols have such primitive meaning compared to the concepts present in language that it would likely take an enormous number of words to carve out the limited aspect of reality these symbols encompass, and even more words to express their semantic relationships. Some researchers call that stuff pre-symbolic, but I don't like that term.
That said, Marcus and Davis are obviously correct that current deep learning models don't learn these kinds of symbols either. In my opinion this is not because they couldn't do it in principle, but because the current training methodologies and an obsession with narrow benchmark results explicitly prevent them from learning any such things. A convolutional neural net based on the Neocognitron is not built to learn e.g. the low-level temporal dynamics of visual scenes, first by design (the net itself does not have the expressive power to represent such symbols) and second by training (the net is bombarded with millions of randomly flashed images with labels rather than coherent temporal sequences).
I've been arguing a lot on this blog that the way for machines to acquire common sense is to let them sense reality by themselves, directly, without human high-level categories and labels. For that to happen we'd have to have a system which could digest a stream of sensory information and learn as much as possible from it. And for such a system to make sense, we need a training paradigm and an objective function which fit this description. I've been arguing for temporal input prediction as a promising candidate for such an objective, expressed so far in the Predictive Vision Model. There are likely subtle variants of it which will yield better results than what I currently have, but certainly it will be something in that spirit.
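For concreteness, here is a minimal sketch of what such a training loop could look like, assuming PyTorch: no labels, just consecutive frames of a sensory stream, with next-frame prediction error as the objective. The tiny convolutional predictor and the random stand-in "video" tensor are illustrative assumptions only - the Predictive Vision Model itself is organized rather differently.

```python
import torch
import torch.nn as nn

class NextFramePredictor(nn.Module):
    """Takes the current frame and predicts the next one (same spatial size)."""
    def __init__(self, channels: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, kernel_size=3, padding=1),
        )

    def forward(self, frame: torch.Tensor) -> torch.Tensor:
        return self.net(frame)

model = NextFramePredictor()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in for a coherent sensory stream: 8 clips of 16 frames, each 3x64x64.
video = torch.randn(8, 16, 3, 64, 64)

for t in range(video.shape[1] - 1):
    current, target = video[:, t], video[:, t + 1]
    prediction = model(current)
    loss = nn.functional.mse_loss(prediction, target)  # prediction error is the only signal
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

Note what is absent: no labels, no categories, no shuffling that destroys temporal order - the supervisory signal comes entirely from the stream itself.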
Now obviously that is an ambitious research project, and there may not be flashy press releases and benchmark wins for many years to come, but I personally strongly believe that unless we commit to this particular adventure we have no hope of solving AI whatsoever. But I digress.
Anyway, the authors defend the idea that at least part of our intelligence is innate, while the connectionists argue that models should learn everything from scratch. I think the innateness argument is very strong - there is no shortage of animals on Earth with brains bigger than humans' (in volume, number of neurons, etc.), equipped with similar anatomical structures, yet clearly humans are exceptional in their ability to develop language and complex abstract reasoning. So clearly there is some clever innate wiring in our brains which enables that; just having more neurons is of no help here.
On the other hand, I think we have not even scratched the surface of what a blank-slate model could possibly learn by directly interacting with the world, leveraging training paradigms such as sensory input prediction. Animals, even without high-level skills such as language and abstraction, tend to do very well at surviving in complex environments - stuff our best robots are pathetically hopeless at.
So overall I have the sense that the argument is somewhat vacuous on both sides. It is a bit disappointing that almost nobody sees that there is a third way, something between what I'd call "primitive connectionism" (which dominates the deep learning crowd) and GOFAI: instead of slapping GOFAI on top of connectionist models, make connectionist models expressive enough to discover and represent aspects of GOFAI in their internal structures, as attractors of their internal dynamics. Let's call that "dynamic connectionism" for lack of a better term, though Nouvelle AI 2.0 could do fine as well. I bet once we figure out how to build something which resembles the neocortex, we will find there are numerous clever "innate" tricks we can employ to make such a system better. But that said, we don't have anything even close to cortex anywhere in the connectionist world, and I think the ball is in the connectionists' court to construct it. Sadly, almost nobody there wants to pick up that challenge; they prefer to pose as if they could solve AGI using a Neocognitron on steroids, which I think is pathetic and very arrogant.
So going back to the book: it is a much needed commentary on the current hyped-up AI scene. It exposes some of the naivety of the connectionist movement and shows that, unlike the GOFAI people, connectionists (mostly young, emboldened by success, and arrogant) have not really spent much time thinking their stuff through in a broader context. On the other hand, connectionist stuff, though naive (sometimes even silly), for the most part works better than GOFAI in a fair number of applications. The book illustrates this clash of philosophies very well. Unfortunately, it does not really propose any executable agenda other than encouraging connectionists to be less naive about the limitations of their contraptions. I concur - they should be less naive - but I'm sure they won't listen, since they are drunk with their current success. They will only wake up to any sort of criticism once it is obvious they are stuck. And there is plenty of evidence for that, with a plethora of recent results using ever more obscene amounts of computing resources to deliver progressively less exciting results - essentially diminishing returns across the board (I will write a separate post on that soon, but here I want to focus mainly on the book).
The book concludes with several considerations about how to embed ethics into future AI, a discussion which by itself can be exciting (see the movie Ex Machina and similar works of sci-fi on the subject) but which, in down-to-earth reality, is largely premature. For now the discussion should be about human bad actors potentially using the clueless technology we tend to grandiosely call AI against other humans - that is a real threat.
Anyway, constructive or not, it is a must read. Criticism is the first step towards finding ways to improve things, and in an era in which the scientific press and media are full of hubris and self-congratulatory puff, such a critical take is a breath of fresh air.