California DMV disengagement reports are out for 2019, and it is time to plot some data.
As usual, these numbers do not reliably measure the safety of AVs, and there are plenty of ways to game them or to overreport. Please refer to last year's post for a deeper discussion (and 2017 post here, 2018 post here) on why these numbers are essentially flawed. Nevertheless these are the only official numbers we get, the only glimpse of transparency into this giant corporate endeavor called the "self driving car".
First the disclaimer - this data came from
- California DMV disengagement reports for years 2019, 2018, 2017, 2016 and 2015
- Insurance Institute for Highway Safety fatality data.
- RAND driving to safety report.
- Bureau of Transportation Statistics
all of which is easily verifiable. And so here comes the plot everyone is waiting for:
And as usual a quick commentary:
First of all, the only players who really have a number anywhere in the vicinity of interesting are Waymo, Cruise and Baidu. I'll discuss Baidu later, since their sudden jump in performance seems a bit extraordinary. Nevertheless even Waymo and Cruise disengagements are still approximately … Read more...
Earlier last week I posted a poll on twitter asking if my readers would like me to post a GPT generated article. The votes were very evenly distributed:
The remainder of this article is generated using the GPT-2 network (using this site) primed on bits of my other articles to convey some of the style. The images were generated by https://app.generative.photos/ from RosebudAI - a recent hot startup in the AI space. When done reading, please consider future historians analyzing the outburst of AI in 2010-2020 and decide whether they'd be impressed or whether they'd be like "WTF were they thinking back then!?".
The study was done in the summer of 2014, but there have been so many recent news stories about Uber (and similar companies) and the impact it has had on public safety, ”We're very happy” to add to the body of knowledge we've accumulated.
What can we learn about the state of public transportation?
Our findings indicate that if public transportation is to be made safe, “we have to build the systems on a much higher level”, and that this will require substantial change from the traditional public-sector perspective. We've discussed the problems in the above graphic:
In … Read more...
It's been 7 months since my last commentary on the field, and since it has become a regular feature of this blog (and in fact many people apparently enjoy this form and keep asking for it), it is time for another one. For those new to the blog, here we generally strip the fluff out of the AI news coverage and try to get to the substance, often with a fair dose of sarcasm and cynicism. The more pompous and grandiose the PR statement, the more sarcasm and cynicism - just to provide some balance in nature. The field of AI never fails to deliver pompous and grandiose fake news, hence I predict there will be material for this blog for many years to come. Now that the introductory stuff is behind us and you've been warned, let us go straight to what happened in the field since May 2019.
Self driving cars
As time goes on, more and more cracks are showing in the self driving car narrative. In June, one of the prominent startups in the competition - Drive.ai - got acqui-hired by Apple, reportedly days before it would have run out of cash. For those not … Read more...
Welcome back. First of all, apologies for not posting as frequently as I used to. As you might imagine, blogging is not my full time job and I'm currently extremely involved in a very exciting startup (something I'm going to write about soon). On weekends and evenings I'm busy helping care for a 7-month-old infant, and altogether that leaves me with very little time. But I'll try to make it better soon, since a lot is going on in the AI space and signs of cooling are now visible all over the place.
In this post I'd like to focus on the recent book by Gary Marcus and Ernest Davis, Rebooting AI. Let's jump in.
If you are a person who is not necessarily deeply involved in recent (the last 10 years or so) developments in AI and instead you've been building your image of the field based on flashy PR statements by various big companies (including Google, Facebook, Intel, IBM and numerous smaller players) - this is a book for you. The first part of the book goes thoroughly through various press releases and "revolutionary" products and tracks how these projects either spectacularly or quietly failed.
Reading the first … Read more...
This post is not about AI and not about winter. I have a few of those coming, but this one is about something different. I hope you don't mind.
A friend of mine recently gave me a lot to think about by posing the following thought experiment:
Imagine you are taken back in time. To what extent would you be able to advance the civilization of the given era with all the knowledge in your head (no notebooks).
Initially the reaction is obviously that since we all live and breathe the current technical civilization, one should be able to recover almost everything, right? There are so many uncertainties to which we already know the answers, so this should be much easier than getting there without such insight?
When you actually give some thought to it, you will realize that things may not be so easy. First of all, in most cases if somebody was taken back in time but left in the same place, they would end up in the middle of nowhere and would have to first survive to even get into contact with any contemporary humans. Say San Diego 300 years ago was an empty coastal desert, and … Read more...
It's been roughly a year since I posted my viral "AI winter is well on its way" post and, as I promised, I periodically post an update on the general AI landscape. I posted one some 6 months ago and now it is time for another one. There has been a lot of stuff going on lately and none of it has changed my mind - the AI bubble is bursting. And as with every bursting bubble, we are in a blowoff phase in which those who have the most to lose are pulling out the most outrageous confidence pumping pieces they can think of, the ultimate strategy to con some more naive people into giving them money. But let's go over what has been going on.
The serious stuff
Firstly let's go over the non-comical stuff. Three of the founding fathers of deep learning - Geoffrey Hinton, Yoshua Bengio and Yann LeCun - received the Turing Award, the most prestigious award given out in computer science. If you think that I will somehow question this judgement you will be disappointed; I think deep learning is well worth the Turing Award. The one thing that in … Read more...
Many people these days are fascinated by deep learning, as it enabled new capabilities in many areas, particularly in computer vision. Deep nets are however black boxes and most people have no idea how they work (and frankly most of us, scientists trained in the field, can't tell exactly how they work either). But the success of deep learning and a set of its surprising failure modes teach us a valuable lesson about the data we process.
In this post I will present a perspective on what deep learning actually enables, how it relates to classical computer vision (which is far from being dead) and what the potential dangers of relying on DL for critical applications are.
The vision problem
First of all, some things need to be said about the problem of vision/computer vision. In principle it could be formulated as follows: given an image from a camera, allow the computer to answer questions about the contents of that image. Such questions can range from "is there a triangle in the image" or "is there a human face in the image" to more complex instances such as "is there a dog chasing a cat in the image". Although many of … Read more...
Once upon a time, in the 1980s, there was a magical place called Silicon Valley. Wonderful things were about to happen there and many people were about to make a ton of money. These things were all related to the miracle of the computer and how it would revolutionize pretty much everything.
Computers had a ton of applications in front of them: completely overhauling office work, enabling entertainment via computer games and changing the way we communicate, shop and use the banking system. But back then they were clumsy, slow and expensive. And although the hope was there, many of these things wouldn't be accomplished unless computers somehow got orders of magnitude faster and cheaper.
But there was Moore's law - over the decade of the 1970s the number of transistors in an integrated circuit doubled every ~18 months. If this law were to hold, the future would be rosy and beautiful. The applications the markets were waiting for would be unlocked. Money was to be made.
By the mid 1990s it was clear that it worked. Computers were getting faster and software was getting more complex so rapidly that upgrades had to happen on a yearly basis to keep up … Read more...
It has become a tradition that I write a quick update on the state of self driving car development every year when the California DMV releases its disengagement data [2017 post here, 2018 post here]. 2018 was an important year for self driving, as we saw the first fatal accident caused by an autonomous vehicle (the infamous Uber crash in Arizona).
Let me start with a disclaimer: I plot disengagements against human crashes and fatalities not because it is a good comparison, but because this is the only comparison we have. There are many reasons why this is not the best measure and depending on the reason the actual "safety" of AV may be either somewhat better or significantly worse than indicated here. Below are some of my reasons:
- A disengagement is a situation in which a machine cannot be trusted and the human operator takes over to avoid any danger. The precise definition under California law is:
“a deactivation of the autonomous mode when a failure of the autonomous technology is detected or when the safe operation of the vehicle requires that the autonomous vehicle test driver disengage the autonomous mode and take immediate manual
… Read more...
Every rule of thumb in data science has a counterexample. Including this one.
In this post I'd like to explore several simple and low dimensional examples that expose how our typical intuitions about the geometry of data may be fatally flawed. This is generally a practical post, focused on examples, but there is a subtle message I'd like to convey. In essence: be careful. It is easy to draw data-based conclusions that are totally wrong.
Dimensionality reduction is not always a good idea
It is fairly common practice to reduce the input data dimension via some projection, typically principal component analysis (PCA), to get lower-dimensional, more "condensed" data. This often works fine, as the directions along which the data is separable often align with the principal axes. But this does not have to be the case, see a synthetic example below:
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from scipy.stats import ortho_group
from mpl_toolkits.mplot3d import Axes3D
import numpy as np
import matplotlib.pyplot as plt
N = 10 # Dimension of the data
M = 500 # Number of samples
# Random rotation matrix
R = ortho_group.rvs(dim=N)
# Data variances
variances = np.sort(np.random.rand((N)))[::-1]
… Read more...
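Since the listing above is cut off, here is a minimal sketch of the same point, reduced to two dimensions and using only NumPy. It is not the original example: the sample count, the variances and the `separation` helper are assumptions made for illustration. The classes differ only along a low-variance direction, so the first principal component (the high-variance direction) carries almost no class information.

```python
import numpy as np

rng = np.random.default_rng(0)
M = 500  # samples per class (assumed value, for illustration only)

# The high-variance coordinate x is shared by both classes and carries no label
# information; the low-variance coordinate y is what separates the classes.
labels = np.repeat([0, 1], M)
x = rng.normal(0.0, 10.0, size=2 * M)
y = rng.normal(0.0, 0.1, size=2 * M) + np.where(labels == 1, 1.0, -1.0)
data = np.column_stack([x, y])

# PCA via eigendecomposition of the sample covariance matrix.
centered = data - data.mean(axis=0)
cov = np.cov(centered, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
order = np.argsort(eigvals)[::-1]        # reorder to descending variance
components = eigvecs[:, order]           # columns are principal axes


def separation(projection, labels):
    """Hypothetical helper: distance between class means in pooled-std units."""
    a, b = projection[labels == 0], projection[labels == 1]
    pooled_std = np.sqrt(0.5 * (a.var() + b.var()))
    return abs(a.mean() - b.mean()) / pooled_std


# Project onto the first (kept) and the last (discarded) principal component.
sep_first = separation(centered @ components[:, 0], labels)
sep_last = separation(centered @ components[:, 1], labels)

print(f"class separation on 1st principal component: {sep_first:.2f}")
print(f"class separation on 2nd principal component: {sep_last:.2f}")
```

Keeping only the first component here would throw away essentially all of the class structure, even though that component explains almost all of the variance.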
Elon Musk is a polarizing figure. His ideas frequently come up in casual conversations. People are often amused and impressed by his achievements. I must admit, a few years back I thought he was literally the next Steve Jobs, only actually better, since he was onto so many things... I admired SpaceX, thought that Tesla cars had many great solutions in them...
At some point in 2015 or 2016 Elon started talking outrageous stuff in the domain of AI, a domain of my own expertise, which I could tell right away was total bullshit. And then I began looking at all this stuff in detail. Doing some math here and there. Reading various opinions. As a result, my opinion on Musk and many of his ideas has changed somewhat substantially. At this point, I can pretty much say with confidence that 90% of his stuff is utter BS, and the remaining 10% is perhaps impressive but still questionable.
Nevertheless he is quite a character with many fans almost religiously believing everything he says. Any time I meet somebody who is a Musk fan I have to go over these issues so I decided to write this post as a point … Read more...
Almost six months ago (May 28th 2018) I posted the "AI winter is well on its way" post that went viral. The post amassed nearly a quarter million views, got picked up in Bloomberg, Forbes, Politico, Venturebeat, BBC, Datascience Podcast and numerous other smaller media outlets and blogs [1, 2, 3, 4, ...], and triggered heated debate on Hacker News and Reddit. I could not have anticipated this post to be so successful and hence I realized I touched on a very sensitive subject. One can agree with my claims or not, but the sheer popularity of the post almost itself serves as proof that something is going on behind the scenes and people are actually curious and doubtful whether there is anything solid behind the AI hype.
Since the post made a prediction that the AI hype is cracking (particularly in the space of autonomous vehicles) and that as a result we will have another "AI winter" episode, I decided to periodically go over those claims, see what has changed and bring some new evidence.
First of all a bit of clarification: some readers have … Read more...
There are many, many deep learning models out there doing various things. Depending on the exact task they are solving, they may be constructed differently. Some will use convolution followed by pooling. Some will use several convolutional layers before there is any pooling layer. Some will use max-pooling. Some will use mean-pooling. Some will have dropout added. Some will have a batch-norm layer here and there. Some will use sigmoid neurons, some will use half-rectifiers. Some will classify and therefore optimize for cross-entropy. Others will minimize mean-squared error. Some will use unpooling layers. Some will use deconvolutional layers. Some will use stochastic gradient descent with momentum. Some will use Adam. Some will have ResNet layers, some will use Inception. The choices are plentiful (see e.g. here).
Reading any of these particular papers, one is faced with a set of choices the authors had made, followed by the evaluation on the dataset of their choice. The discussion of choices typically refers strongly to papers where given techniques were first introduced, whereas the results section typically discusses in detail the previous state of the art. The shape of the architecture is often broken down into obvious and non obvious decisions. … Read more...
In some recent email exchanges I've realized that when people by some coincidence make it to this blog, they rarely end up visiting my main website, and even if they do, they rarely browse through the teaching materials. This is not really a complaint, I hardly ever visit my website myself, but there are some materials there that I go back to every once in a while (though I have copies on my laptop). These are the lecture notes I made for a lecture on mathematical foundations of neuroscience.
As a bit of background, in 2009, after I defended my PhD and before I joined Brain Corporation, I was briefly an Adjunct Professor at the Faculty of Mathematics and Computer Science of Nicolaus Copernicus University in Toruń. During that time I decided to refresh everything I had gathered about the mathematics of neuroscience and prepare a lecture series complete with exercises, lots of pictures, graphs, and all the necessary theory. And even though 9 years have passed since then, the lectures hold up pretty well, hence why not bring that content to a broader audience?
The lecture consists of 15 main pdf presentations, a number of sample exercises as well … Read more...
Since it is fashionable these days to compare the performance of connectionist models with humans (even though these models, often referred to as deep learning, only stand a chance of competing with humans in extremely narrow contests), there is a popular belief that these models, powered by modern GPUs, somehow approach the computational power of the human brain.
Now the latter is really not defined, since we don't even know how brains work and therefore it is extremely hard to estimate at which level of abstraction to assign the fundamental computation, but we can still play with some numbers just to get some vague idea of where we are.
So let us start with neurons: the average human brain has roughly 80 billion neurons. The popular belief is that neurons are responsible for the function of the brain, but there are plenty of other cells there, called glia, whose function is not yet understood. So it is very likely there are actually orders of magnitude more cells that somehow realize the computational function, but for now let us stick to the "official" 80B figure.
Each of these neurons is an extremely complex cell, with membrane, electrochemical dynamics of action potentials … Read more...