This post is an extension of my previous post on Statistics vs Dynamics in machine learning. I'll try to expand here on what I think is the key missing ingredient (possibly not the only one) for efforts such as a self driving car or other robotic projects that are aimed at unrestricted environments.
Today the problem of control in machine learning is typically approached by end-to-end training of motor commands from sensory input (see e.g. here). The authors argue that the optimisation algorithm will do a better job than explicitly breaking down the task into perceptual/planning submodules because it can do everything at once. This logic is influenced by behaviourism and the observation that humans essentially appear to do the same thing: map sensory input onto motor commands.
This approach is flawed, as I will try to explain in the paragraphs that follow.
What looks like direct sensory to motor mapping may be a lot more complex
Looking naively at a human performing a task such as driving a car, one may think that the human is simply mapping what he currently sees onto a motor command. It certainly looks like that, … Read more...
There are two very important branches of mathematics relevant for building intelligent systems: statistics and dynamics. The rationale is the following:
- data has regularities and patterns that repeat, therefore an intelligent system should analyse them statistically
- things in the world are in motion and that motion has regularities, therefore the intelligent system should build models of that dynamics
Although these approaches seem very compatible, it is important to understand the different modes of thinking. Statistics tries to find a pattern from many samples, given that we know nothing else about the system (often assuming that things come from a known distribution). Dynamics tries to write down the equations of motion of the system given very few samples. Statistics wants to estimate the expected value and variance of things. Dynamics wants to predict the exact value of something with a strict error estimate.
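To make the contrast concrete, here is a minimal sketch in plain Python. Everything in it is an assumption chosen for illustration: a ball thrown upward with a made-up initial velocity `V0`, sampled every `DT` seconds. The statistical view summarises all the samples with a mean and a variance; the dynamical view uses only the first three samples to recover the acceleration and then predicts a later sample from the equation of motion.

```python
import statistics

# Hypothetical setup: a ball thrown upward, height sampled every DT seconds.
G = 9.81    # gravitational acceleration, m/s^2
V0 = 12.0   # assumed initial velocity, m/s
DT = 0.1    # sampling interval, s

def height(t):
    """Ground-truth trajectory h(t) = V0*t - G*t^2/2."""
    return V0 * t - 0.5 * G * t * t

samples = [height(i * DT) for i in range(20)]

# Statistical view: summarise many samples, ignoring their order in time.
mean_h = statistics.mean(samples)
var_h = statistics.variance(samples)

# Dynamical view: three consecutive samples determine a constant acceleration,
# because h[k+1] = 2*h[k] - h[k-1] + a*DT^2 holds for any quadratic trajectory.
h0, h1, h2 = samples[:3]
accel = (h2 - 2 * h1 + h0) / DT**2

def predict(k):
    """Roll the second-order recurrence forward from h0, h1 to step k (k >= 1)."""
    prev, cur = h0, h1
    for _ in range(k - 1):
        prev, cur = cur, 2 * cur - prev + accel * DT**2
    return cur

print(f"mean={mean_h:.2f} m, variance={var_h:.2f} m^2")
print(f"estimated acceleration={accel:.2f} m/s^2")
print(f"step 19: predicted={predict(19):.3f} m, actual={samples[19]:.3f} m")
```

The mean and variance say little about where the ball will be at any particular moment, whereas the three-sample model predicts each later sample up to floating-point error. Real data is noisy, of course, so this toy case flatters the dynamical view.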
Current machine learning is heavily biased towards statistics. Although some priors are inserted into the models, the general approach is to throw more data and compute power at a system and expect miracles, rather than building a system that could intelligently infer based on the dynamics (see e.g. the ImageNet and similar purely statistical approaches to understanding images … Read more...
Everybody is trying to build a self driving car today. Google has been testing their solution for the past ten years or so, Tesla just announced they'd be putting the "self driving hardware" onto their newly manufactured cars, Uber has a big effort with Volvo in Pittsburgh, comma.ai is trying to ship a box for outfitting certain cars with a self driving mode etc. Obviously the car manufacturers are following, with Ford making announcements recently, BMW working silently and so on. Some of these efforts are explicitly cautious about what they promise (driver assist technology rather than full autonomy, e.g. Toyota), but many voices, particularly the VCs from the Bay Area, are hyperactively announcing how great life will be and how the self driving car (in the sense of full autonomy) is a done deal.
Well I would not be a sceptic if I did not put all those hyper-optimistic statements to doubt. Let me go through a few claims about self driving cars one by one and put my sceptical comment next to each statement. To be frank: I'm not against the technology, I'm against the hype.
- Self driving cars will be
… Read more...
Here is something completely different. Nothing today about AI or deep learning.
I'm a big fan of Star Trek and generally like the utopian version of the future that Gene Roddenberry gave us. But obviously this is just a vision and a TV show, so it's full of stuff that makes people watch it. Inspired by that vision though, I've been daydreaming about what it would be like if we actually had the 24th century technology.
This will just be a daydreaming exercise, so let us not bother for now about whether faster than light travel is feasible. Clearly with our current understanding of physics it does seem like a very fundamental limitation. But there may be some new physics lurking, perhaps looking crazy, but quantum mechanics did look crazy in the beginning (and it still does) and yet has proven to be extremely good at describing nature.
Here are my assumptions:
- faster than light travel is possible at a rate of say 1 light year per hour. For now let's just assume that the "warp" drive takes the ship into a thin wormhole like tube, so when the ship is in warp mode it cannot interact with matter and
… Read more...
There is an ancient argument in the field of AI called the Chinese room experiment. The thought experiment proposed by John Searle in the early eighties goes as follows:
- You put somebody who does not know Chinese in a room
- You give them a lengthy instruction (a program) on how to respond to given Chinese symbols
- Finally you run the experiment by feeding in Chinese sentences at the input and getting sentences at the output. The Chinese fellows are convinced they are having a conversation with a sentient being, but the poor guy inside just shuffles symbols and has no idea what he is conversing about
The conclusion is that even though the external observers assume (by Turing test) that they are observing intelligence, the guy inside is clearly unaware of what is going on, and therefore the intelligence is somehow unreal.
Personally I have several issues with that experiment. First of all it is a thought experiment and it assumes we can have externally recognised intelligence implemented by a guy with a book of symbol transformations. Although a computational in/out relation like that should be implementable by a "computer", the size of the necessary derivations could be enormous. In … Read more...
In the previous posts I've been investigating the current state of the art deep nets for casual vision application - telling what is in the image taken in an average office and average boring street. I've also played a bit with adversarial examples to show how the deep nets can be fooled. These failure modes tell us something important about the level of perception we are dealing with - very basic level. In this post I will discuss why I think perception is such an elusive problem. Let's begin with vision.
Each of us is born with a blind spot in our visual field - the place where nerve fibres from the retina exit the eyeball. However, unless somebody tells us how to discover it, we are completely ignorant of its existence. In some sense it could be qualified as an example of anosognosia - a condition in which humans are not aware of a defect in their perception. A more extreme case of this is known as Anton-Babinski syndrome, typically occurring after brain damage, in which the patient claims to see even though he is technically blind! As much as this seems unbelievable, patients will confabulate … Read more...
In the previous post I applied an off the shelf deep net to get an idea of how it performs on average street/office video. The purpose of this exercise was to critically examine and reveal what these award winning models are actually like. The results were a mixed bag. The network was able to capture the gist of the scene, but made serious mistakes every once in a while. Granted, the model I used for that experiment was trained on ImageNet, which has a few biases and is probably not the best set to test "visual capabilities in the real world". In the current post I will discuss another problem which is plaguing deep learning models - adversarial stimuli.
Deep nets can be made to fail on purpose. It's been first shown in  and there have been quite a few papers since then with different methods to construct stimuli that fool deep models. In the simplest case one can directly derive these stimuli from the network itself. Since ConvNets are purely feedforward systems (most of them at least), we can trace back the gradients. Typically gradients are used to modify the weights such that they better fit the given … Read more...
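As a rough illustration of deriving a stimulus from the network's own gradients, here is a minimal sketch in plain Python. It is not the method from any particular paper: a fixed logistic classifier with made-up weights `W` and bias `B` stands in for a ConvNet, and the perturbation is a simple sign-of-gradient step of assumed size `eps`. The point is only that the gradient is taken with respect to the input rather than the weights.

```python
import math

# Hypothetical toy "network": a fixed logistic classifier p = sigmoid(w.x + b).
W = [1.5, -2.0, 0.5]
B = 0.1

def predict(x):
    """Probability that x belongs to class 1."""
    z = sum(wi * xi for wi, xi in zip(W, x)) + B
    return 1.0 / (1.0 + math.exp(-z))

def input_gradient(x, y):
    """Gradient of the cross-entropy loss with respect to the input x."""
    p = predict(x)
    return [(p - y) * wi for wi in W]

def adversarial(x, y, eps):
    """Shift each input component by eps in the direction that increases the loss."""
    grad = input_gradient(x, y)
    return [xi + eps * math.copysign(1.0, gi) for xi, gi in zip(x, grad)]

x = [1.0, 0.2, 0.5]                      # made-up input, true label 1
x_adv = adversarial(x, y=1.0, eps=0.4)   # perturbed copy of the same input

print(f"original:  p(class 1) = {predict(x):.3f}")
print(f"perturbed: p(class 1) = {predict(x_adv):.3f}")
```

In this toy case the perturbed input lands on the other side of the decision boundary. With only three input dimensions the step has to be large; in image models the same trick can work with changes that are hard to see, because the effect adds up over thousands of pixels.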
There is a lot of hype today about deep learning, a class of multilayer perceptrons with some 5-20 layers featuring convolutional and pooling layers. Many blogs [1,2,3] discuss the structure of these networks and there is plenty of code published, so I won't get into much detail here. Several tech companies have invested a lot of money into this research and everyone has very high expectations of the performance of these models. Indeed they've been winning image classification competitions for several years now and the media report superhuman performance on some visual classification tasks once in a while.
Now just looking at the numbers from the ImageNet competition does not really tell us much about how good these models really are; we can only maybe confirm that they are much better than whatever came before them (for that benchmark at least). With media reporting superhuman abilities and high ImageNet numbers, and big CEOs pumping hype and showing sexy movies of a car tracking other cars on the road (a 2min video looped X times, which seems a bit suspicious), one can get the impression that vision is a solved problem.
In this blog post (and a few others coming … Read more...
So we are trying to build Artificial Intelligence. But what is it? Is a program playing chess or go intelligent? After some thought I think most people would agree that not really. It's just a computer program that managed to master a game. Is a large neural network, optimised with gradient descent to approximate a dataset, intelligent? Well, it is just a function approximator so technically I would say no. All these exercises do capture some aspect of what we would call intelligence, but the core of this idea seems elusive.
So why all the fuss about Artificial Intelligence?
A bit of history
The term "Artificial Intelligence" was coined by Prof. John McCarthy for the famous Dartmouth Conference in 1956. In his own words, he had to invent something to get the funding. Since its very origin this term has caused controversies and boom-bust iterations known as AI winters, among which the better documented ones are the Lighthill report in 1973, the Minsky and Papert book Perceptrons in 1969 (which busted the connectionist studies for quite a while), the 1987 collapse of expert systems (predicted by Minsky and Schank), and a more recent, smaller crisis in Backpropagation powered neural networks … Read more...
Apparently we live in a world where the singularity is about to happen and artificial intelligence (AI) will cover every aspect of our lives. But the field of AI has always been inflated by bubbles and busts known as AI winters. Why is it so and is this time different?
There are several weaknesses of human psychology that make us very susceptible to hype in AI. First of all, we should note that humans have amazing perception, particularly visual perception. The problem is that the great majority of our marvellous vision develops by the age of 2, and so none of us remembers what it's like to not perceive the world correctly. By the time we begin to verbalise (and remember anything), all the low and mid level perceptual machinery is up and running. So our psyche wakes up in a world where everything already makes sense and what needs to be learned and achieved are the higher cognitive tasks.
This phenomenon is reflected in our approach to AI. We tend to believe that artificial intelligence is about playing chess or go (or atari) because that is the kind of higher cognitive task that we are excited about by the … Read more...
So finally after many months we can share our progress. Predictive Vision Model (PVM) is a new recurrent learning architecture we've been exploring for a while now. The paper showing initial results is available here https://arxiv.org/abs/1607.06854 and the corresponding code is https://github.com/braincorp/PVM .
So what is PVM? It is a new approach to learning the foundations of perception in an unsupervised way. We exploit the idea of multi-scale and multi-level stacked predictive encoders (similar to an autoencoder, but one that tries to predict the next frame in a sequence of inputs). We then find that if we train this architecture online, we can liberally wire it with feedback and lateral connectivity and nothing breaks! So we end up with a scalable, unsupervised architecture that naturally operates in time and is able to exploit all the regularities which are so obvious to us humans, as highly visual animals, that we don't even notice them consciously until we are faced with an optical illusion.
This is really just the beginning of the work. We experimented a lot, therefore we decided not to invest in a GPU implementation, but now this certainly is a good avenue to pursue. Recurrent feedback and online operation make it difficult … Read more...
Deep Nets are the hot thing these days in machine learning research. So hot that institutes are being established to study the social consequences of AI overtaking humanity and the White House has concerns regarding AI. Now every self-respecting sceptic should ask a question: is humanity really that close to solving the secret of intelligence? Or maybe this is just hype like in the '50s and '80s?
This is a long discussion. I will hopefully post many articles on that in the future. Here let's dissect a few popular myths:
- Convolutional deep nets solve perception. It is true that these systems have won ImageNet by a substantial margin and often can classify the content of the image accurately. It is also known that they get fooled by stuff that certainly would not fool a human. So that indicates that there is something missing. I think that we have a somewhat shallow understanding of what perception really is. Vision is not just about categorising what we see. In fact we more often than not ignore the class of what we see. Humans or animals are more interested in affordances, namely "can I perform an action on what I
… Read more...
I've recently moved from a managed hosting company to a pure, raw Amazon instance and I have to say it's fun. I've set up the whole LAMP stack, email server, ssl/certs and a few other services and it feels good. Of course within the first maybe 2 months my machine got attacked by a DDoS targeting WordPress installations and essentially went down.
It took me a while, one evening and one morning of stressful rebooting and fiddling, to finally figure out what was going on. In the meantime I've mastered every aspect of running an AWS instance: detaching/attaching volumes, changing volume type, changing instance type, reassigning elastic IPs, you name it. The thing manifested itself as a substantial (like 99%) IOWAIT time on the machine CPU (top -> wa) and a resulting sluggishness of response. So if you are apparently running out of IO on your server, it is likely not without some shady reason - Amazon's limits for IO are well within what should be enough for a small server with a few websites.
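For reference, the "wa" figure that top reports can be reproduced from the aggregate `cpu` line of `/proc/stat` on Linux. The sketch below is a minimal illustration in plain Python; the two snapshot strings are made up, and the helper names `parse_cpu_line` and `iowait_percent` are hypothetical. It takes the difference between two snapshots and reports the share of time spent in the iowait column.

```python
def parse_cpu_line(line):
    """Return the first eight counters of the aggregate 'cpu' line of /proc/stat.

    Order: user, nice, system, idle, iowait, irq, softirq, steal.
    The trailing guest fields are skipped because they are already counted in user/nice.
    """
    fields = line.split()
    if fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    return [int(value) for value in fields[1:9]]

def iowait_percent(before, after):
    """Percentage of CPU time spent waiting on IO between two snapshots."""
    b, a = parse_cpu_line(before), parse_cpu_line(after)
    deltas = [later - earlier for earlier, later in zip(b, a)]
    total = sum(deltas)
    return 100.0 * deltas[4] / total if total else 0.0

# Made-up snapshots, as if the first line of /proc/stat was read a few seconds apart.
before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
after = "cpu  1005 0 503 8002 1490 0 0 0 0 0"

print(f"iowait: {iowait_percent(before, after):.1f}%")
```

On a real machine the two strings would come from reading the first line of `/proc/stat` twice with a short sleep in between.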
Finally I looked up my apache logs (which I should have done in the first place) to see that I'm being bombarded by large post requests … Read more...
Recently the world has been thrilled by the game of go played between the world's top player and a computer program. The program eventually won 4 of the 5 games, marking the historic moment in which go had finally been solved. This is almost twenty years after another important game, chess, had suffered a similar defeat. Why did it take almost 20 years?
You will hear that go is apparently a lot more difficult than chess and therefore the search space is much larger and bla bla bla. Well, did we know in advance it was so much harder? Probably not until we started trying to solve it. Do we frankly even now have any reasonable intuition as to why go is so much harder? I doubt it.
OK, let's look at something simpler - graph problems. Some of them are easy, let's say minimum-spanning-tree or even all-pair-shortest-paths. Some of them are extremely hard, e.g. the traveling salesman problem. These problems sound very similar: each is about some minimal path or tree in a graph, so it would seem they should be similarly hard, yet their solutions vary greatly in complexity.
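A small sketch in plain Python shows how differently the two problems behave on the same input. The distance matrix `D` is made up for illustration. The minimum spanning tree is found with Kruskal's greedy algorithm, which only needs to sort the edges; the traveling salesman tour is found here by brute force over all (n-1)! orderings, which is only feasible because the graph has five nodes. Better exact TSP algorithms exist, but all known ones still take exponential time in the worst case.

```python
from itertools import permutations

# Hypothetical symmetric distance matrix for 5 nodes.
D = [
    [0, 2, 9, 10, 7],
    [2, 0, 6, 4, 3],
    [9, 6, 0, 8, 5],
    [10, 4, 8, 0, 6],
    [7, 3, 5, 6, 0],
]

def mst_weight(d):
    """Kruskal: sort the edges, greedily keep those joining different components."""
    n = len(d)
    parent = list(range(n))

    def find(i):
        # Follow parent links to the component root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted((d[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    total = 0
    for weight, i, j in edges:
        root_i, root_j = find(i), find(j)
        if root_i != root_j:
            parent[root_i] = root_j
            total += weight
    return total

def tsp_length(d):
    """Brute force: try every ordering of the remaining nodes, (n-1)! tours."""
    n = len(d)
    best = None
    for perm in permutations(range(1, n)):
        tour = (0,) + perm + (0,)
        length = sum(d[a][b] for a, b in zip(tour, tour[1:]))
        if best is None or length < best:
            best = length
    return best

print("MST weight:", mst_weight(D))
print("Shortest tour:", tsp_length(D))
```

Going from 5 nodes to 15 barely changes the running time of `mst_weight`, while `tsp_length` would have to examine 14! (roughly 87 billion) tours.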
Figure 1. Checkerboard pattern is solved by this net in … Read more...
Let me start this blog with a short introduction. My name is Filip Piekniewski and I'm a researcher working on topics of artificial intelligence, machine learning, perception etc. (check my website for more info). For the past six+ years I've been working at Brain Corporation in San Diego. The company has the ambitious goal of building brains for robots and the work we've been doing is quite unique. I'd like to use this blog to share some of my thoughts on Machine Learning, from a slightly different perspective than a lot of the mainstream, namely from the perspective of actually applying these techniques to a physical device existing in physical reality. As we've learned, this is a whole different ballpark than running your algorithm on a dataset in a sterile, digital world. I hope you will find this read entertaining.
… Read more...