As may be apparent from the content of this blog, I'm interested in AI applied to physical reality in the form of robotics and autonomous behavior. Here is a brief biographical note on where that interest came from and what I have done so far. It is meant to help readers understand where I'm coming from and why I believe what I believe.
My adventure with AI started in the late 1990s. While still in high school I started attending some lectures in my hometown Torun (Poland). These lectures were given by Wlodek Duch - a very interesting and extraordinary person. Prof. Duch is actually quite a well-known figure in the European neural network scene and is very passionate about the brain and cognitive science.
I started my university work at Nicolaus Copernicus University in Torun, where I studied computer science in the math department. Soon I revived my connections with the physics department, where Prof. Duch and his team were based, to include some "AI" related material in my program. Back in the early 2000s, there was no such thing as deep learning, and even training a perceptron was perceived as a peculiarity (everyone knew an SVM would be better anyway). There was hardly any software, and it was difficult to get any meaningful data. At least those were my impressions as an early-stage student.
Nevertheless, one thing always kept me interested: applying these learning networks to some dynamical problem. At the time I was also fascinated by fractals, complexity and dynamical systems. Though back then these things seemed disconnected, the more I moved into robotics the clearer it became that there is a subtle and as yet uncharted connection between these two fields. More on that later.
I also happened to work with another extraordinary gentleman at the time, Tomasz Schreiber, a young and very talented professor at the math department who not only understood the math very well, but was also able to see beyond his sandbox and tried connecting ideas in physics, AI and math. We worked on recurrent networks (Hopfield style) and their intersection with phase transitions and thermodynamics. Again, at the time it seemed to me like something completely disconnected from anything else; today I see how all these things are part of a bigger whole.
By the mid 2000s I had been introduced to self-organized criticality and complex networks, preferential attachment, small world phenomena, power laws, expanders and all these weird things. In 2006 I was accepted to the Okinawa Neuroscience Course - I really wanted to understand what is going on in the brain, so I started working on spiking neural networks and more "biologically plausible" models. This led me to Eugene Izhikevich, a mathematician and neuroscientist then working at the Neurosciences Institute in San Diego. I planned to apply for a post-doc, but it so happened that Eugene was starting Brain Corporation at that time, so soon after my PhD defense (at Warsaw University) I joined as a scientist in early 2010. Sadly, this was also the time when my dear friend and mentor Tomasz Schreiber died very prematurely from cancer.
Among the few very interesting things we did at Brain, I had the opportunity to read (in fact I was forced to read) a boatload of neuroscience literature, particularly the neuroscience of early vision. This was as exciting as it was frustrating. I was used to solid mathematical results, so plowing through tens of often contradictory biological results was something completely new. Rather quickly I realized that the brain is a complex system, likely critical (as in self-organized criticality), and that the process of driving it to criticality is the same process that allows it to learn about the environment. It quickly became clear that any attempt to disentangle the brain as if it were a separable collection of parts - which is what many neuroscientists try to do - will most likely fail. Everything is connected to everything else, and potent feedback loops make it impossible to assign a clear function to any neuron.
Another eye-opening exercise we did at Brain was robotics and the application of machine learning to it. It was quite shocking how difficult it is to make a robot "know" anything relevant about reality. It soon became apparent that classical end-to-end supervised training was not going to cut it (though as of 2017 some groups are still desperately trying to apply this paradigm to self-driving cars), that reinforcement learning would not be practical beyond simulations, and that the classical approach of "programming everything upfront" could not scale to deal with physical complexity. So together with Todd Hylton - an ex-DARPA PM who led several relevant programs in the past, such as SyNAPSE and Physical Intelligence (currently a director at the UCSD Contextual Robotics Institute) - we started putting together some whitepapers on what the core of the problem is and what a solution might look like. Several other people were involved in this: Patryk Laurent, Csaba Petre (currently at Cleverpet), Micah Richert (currently at Brain Corporation), Dimitry Fisher (currently at Analytics Ventures) and Moslem Kazemi (currently at Uber).
At that time deep learning was already blooming, but it was abundantly clear to us that, much like other traditional machine learning methods, it would not solve the robotics problem. Not because the methods are not useful, but because they solve the wrong problem. We applied for a DARPA grant in 2014 and got $1M to pursue our ideas in 2015 (AFRL and DARPA Cortical Processor seedling contract FA8750-15-C-1078).
That is about the time when many things came together to form the Predictive Vision Model (PVM). Inspired by neuroscience and cognitive science (mostly the amazing "Predictive Brains" BBS paper by Andy Clark), we put together a uniform, scalable machine learning structure which, trained by prediction, could acquire deep hierarchies of features online, incorporate ubiquitous feedback and offer several other interesting properties. It also worked very well for visual object tracking, which was our benchmark of choice. And it worked right off the bat, without any of the outrageous hyper-parameter tuning that is done today with many machine learning models.
PVM, like the brain, is massively recurrent and not decomposable. With the very general objective of online input prediction it has to solve many problems of perception at once (most of these problems have likely not even been verbalized yet in machine vision) - in that sense it also treats perception as a non-decomposable problem. This is in agreement with a vast number of psychophysical observations which show how deeply spatial, temporal, cultural, situational and other kinds of context affect perception.
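To make the phrase "online input prediction" a little more concrete, here is a deliberately tiny sketch. It is not the PVM and shares none of its architecture (no hierarchy, no feedback, no vision); it only illustrates the training signal: a unit that guesses its next input from recent inputs and corrects itself after every sample. The function name, the parameters and the sine-wave input are all made up for this illustration.

```python
import math


def online_predict(signal, context=2, lr=0.1):
    """Toy online predictor (hypothetical, NOT the PVM).

    Predicts each sample as a weighted sum of the previous `context`
    samples and nudges the weights after every step (an LMS update).
    Returns the absolute prediction error at each step.
    """
    weights = [0.0] * context
    errors = []
    for t in range(context, len(signal)):
        window = signal[t - context:t]
        prediction = sum(w * x for w, x in zip(weights, window))
        error = signal[t] - prediction
        errors.append(abs(error))
        # Learn immediately from this one sample; no separate training phase.
        weights = [w + lr * error * x for w, x in zip(weights, window)]
    return errors


# Assumed test input: a plain sine wave standing in for a sensory stream.
signal = [math.sin(0.5 * t) for t in range(3000)]
errors = online_predict(signal)
early = sum(errors[:100]) / 100
late = sum(errors[-100:]) / 100
print(f"mean error, first 100 steps: {early:.4f}")
print(f"mean error, last 100 steps:  {late:.6f}")
```

The only supervision here is the next input itself, so no labels are needed and learning never stops, which is the property the paragraph above relies on.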
For me the PVM is just the beginning, but it is also very funny how pretty much all my education has come full circle. The world is a complex dynamical system, a fractal of sorts. In order to act within it, a cognitive system needs to capture as much of the dynamics as it can, and it can only do so via prediction. But to do it well, the substrate (the machine learning fabric) has to allow for complexity and sensitivity, hence overwhelming feedback and a near-critical, sensitive operating regime. Prediction is also compatible with thermodynamical views on intelligence.
That being said, I think this is just the tip of the iceberg and there are lots of challenges ahead. Once the DARPA money ran out, our company changed its direction and our group dispersed. But the idea remains and is being continued (quite intensely) as an after-hours project (financed privately) by a few of us. I use this blog to share some of the results and my views on the current AI scene.