Self-Driving Cars: They're Finally Learning to Drive Like Us (and That's a Little Scary)

Right then, let’s be honest: the promise of self-driving cars has been dangling in front of us like a carrot on a stick for years. We’ve been promised robot chauffeurs, stress-free commutes, and the ability to finally finish that crossword puzzle on the way to work. But the reality has been a tad… bumpy.

Until now, it seems. A new paper is making waves with a surprisingly simple approach: let the cars learn to drive by playing against each other. Yes, you read that correctly. It’s rather like a demolition derby, but with algorithms.

Gigaflow: Where Cars Go to Driving School (and Cause Mayhem)

The secret sauce is a system called “Gigaflow”, a batched simulator capable of synthesising and training on 42 years of subjective driving experience per hour on a single 8-GPU node. Imagine a digital Thunderdome where self-driving cars are spawned into existence, given a basic set of rules (don’t crash, get to the destination), and then set loose to duke it out on virtual roads. They learn by trial and error, constantly adapting to each other’s… let’s call them “unique” driving styles.
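To make the demolition-derby idea a bit more concrete, here’s a deliberately toy sketch of what a batched self-play loop can look like. Everything in it (the array sizes, the stand-in policy, the placeholder simulator) is our own illustration, not Gigaflow’s actual code:

```python
import numpy as np

# Toy illustration of batched self-play: one shared policy drives every
# car in the batch, and each car's "traffic" is simply the other cars.
# All names and sizes here are made up; this is not Gigaflow's code.

NUM_AGENTS = 1024         # cars simulated in parallel on one device
OBS_DIM, ACT_DIM = 64, 2  # observation size; action = (steer, throttle)

rng = np.random.default_rng(0)
weights = rng.normal(scale=0.01, size=(OBS_DIM, ACT_DIM))  # stand-in policy

def policy(obs):
    """The same weights act for every agent: that is the self-play part."""
    return np.tanh(obs @ weights)

def simulate_step(obs, actions):
    """Placeholder world model standing in for the batched simulator."""
    next_obs = (obs + 0.01 * actions.sum(axis=1, keepdims=True)
                + 0.1 * rng.normal(size=obs.shape))
    rewards = rng.normal(size=len(obs))  # a real signal would come from the
    return next_obs, rewards             # minimalist reward described below

obs = rng.normal(size=(NUM_AGENTS, OBS_DIM))
for step in range(100):
    actions = policy(obs)                    # one batched call for all cars
    obs, rewards = simulate_step(obs, actions)
    # ...log (obs, actions, rewards) and periodically update the single
    # shared policy with a PPO-style on-policy RL algorithm...
```

The key trick is that there is only one policy: every car it crashes into, or gets cut off by, is another copy of itself, so the traffic gets harder exactly as fast as the driver gets better.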

Fun Fact: In just 10 days of training, these AI cars drove over 1.6 billion kilometres. That’s more than the distance from Earth to Saturn! Quite the road trip...
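Do those numbers hang together? A quick back-of-envelope check, using only the figures quoted above (the implied average speed is our own inference, not a number from the paper):

```python
# Back-of-envelope check on the headline numbers (figures from the post;
# the implied average speed is our own inference, not the paper's).

HOURS_PER_YEAR = 24 * 365.25

sim_years_per_hour = 42      # simulator throughput quoted above
training_hours = 10 * 24     # "10 days of training"
total_km = 1.6e9             # "over 1.6 billion kilometres"

total_sim_years = sim_years_per_hour * training_hours
total_sim_hours = total_sim_years * HOURS_PER_YEAR

print(f"Simulated experience: {total_sim_years:,.0f} years")
print(f"Implied average speed: {total_km / total_sim_hours:.0f} km/h")
# -> about 10,080 years of driving at roughly 18 km/h, a believable
#    urban-traffic average, so the headline figures are consistent.
```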

The result? A single policy trained entirely via self-play outperforms the prior state of the art on CARLA, nuPlan, and the Waymo Open Motion Dataset.

The “Minimalist Reward Function” – Or, How to Teach a Car to Behave (Sort Of)

Here’s the really interesting bit. The researchers didn’t spoon-feed the AI with terabytes of human driving data. Instead, they used a “minimalist reward function”. Essentially, the cars are rewarded for:

  • Reaching their destination
  • Avoiding collisions
  • Staying in their lane
  • Not running red lights
  • Keeping acceleration reasonable

Think of it like training a puppy. You don’t need to show it hours of videos of well-behaved dogs. You just give it a treat when it sits and tell it off when it chews on your favourite shoes.
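If you squint, that reward is just a handful of signed terms. Here’s a toy sketch in Python; the weights, field names, and thresholds are all made up for illustration, and the paper’s actual terms and coefficients will differ:

```python
from dataclasses import dataclass

# Toy sketch of a minimalist reward. The weights, field names, and the
# acceleration threshold are invented for illustration; the paper's
# actual terms and coefficients differ.

@dataclass
class StepInfo:
    reached_goal: bool   # arrived at the destination this step
    collided: bool       # hit another car or obstacle
    off_lane: bool       # drifted out of the lane
    ran_red_light: bool  # entered a junction on red
    accel: float         # longitudinal acceleration, m/s^2

def minimalist_reward(s: StepInfo, accel_limit: float = 3.0) -> float:
    reward = 0.0
    if s.reached_goal:
        reward += 10.0   # the treat: get where you're going
    if s.collided:
        reward -= 10.0   # the biggest telling-off
    if s.off_lane:
        reward -= 1.0
    if s.ran_red_light:
        reward -= 5.0
    if abs(s.accel) > accel_limit:
        reward -= 0.1    # gentle nudge towards smooth driving
    return reward

# A clean step that reaches the goal scores +10; a messy one gets told off.
print(minimalist_reward(StepInfo(True, False, False, False, 1.2)))  # 10.0
print(minimalist_reward(StepInfo(False, True, True, False, 5.0)))   # -11.1
```

The design choice worth noticing is what’s absent: there’s no term for “drive like a human”, so any human-looking behaviour that emerges does so purely because it’s an effective way to collect the treats.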

The Good, the Bad, and the Downright Hilarious

The good news is that this approach seems to be working. The resulting policy achieves state-of-the-art performance on multiple autonomous driving benchmarks, even outperforming systems trained on real-world human data. The cars are also surprisingly robust, averaging 17.5 years of continuous driving between incidents in simulation.

The bad news? Well, if the cars are learning to drive like us, that means they’re also learning our bad habits. Expect to see self-driving cars cutting each other off, engaging in passive-aggressive lane merges, and maybe even the occasional AI-powered road rage incident.

And the downright hilarious? Imagine a future where self-driving cars are programmed to be overly polite, yielding to every pedestrian and letting everyone merge in front of them. Traffic would grind to a halt as these hyper-courteous cars engage in endless loops of “after you, no, after you.”

The Future is (Hopefully) Less Bumpy

Of course, there’s still a long way to go. As the researchers themselves point out, many of the infractions recorded during testing stemmed from limitations of the benchmarks themselves, such as pedestrians darting into traffic without looking. But the fact that self-driving cars can learn to navigate complex, unpredictable environments through self-play is a major step forward.

So, the next time you see a self-driving car on the road, remember that it’s probably been through more simulated traffic jams and near-misses than you’ve had in your entire life. And if it cuts you off, just remember: it’s probably just learning from the best (or worst) of us.

Editor's Note: No actual cars were harmed in the making of this AI system. Though some virtual ones definitely had a rough day at the office.

Source: Robust Autonomy Emerges from Self-Play