In a milestone that feels both inevitable and straight out of science fiction, an Earth observation satellite has, for the first time, found what it was looking for entirely on its own. The achievement, which occurred in April aboard Loft Orbital’s YAM-9 spacecraft, marks the first reported use of a vision-language model (VLM) in orbit, reducing the satellite’s reliance on human analysts back on Earth for the first pass through its imagery. This isn’t just about a clever algorithm; it’s a fundamental shift in what space-based sensors can do.
The satellite was running Google DeepMind’s Gemma 3 model, an AI specifically designed for “edge” applications where computing power is scarce, such as on a satellite hurtling through space. The demonstration was powered by an NVIDIA Jetson AGX Orin module and managed by a software package from NASA’s Jet Propulsion Laboratory. Instead of the usual process of beaming terabytes of raw imagery to Earth for overworked analysts to sift through, YAM-9 was given natural-language queries, such as “identify infrastructure around railway hubs,” and the onboard AI did the initial triage, flagging only the relevant data.
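To make the triage idea concrete, here is a minimal Python sketch of the general pattern: score each image tile against a query and keep only the tiles that clear a threshold. It is an illustration under stated assumptions, not Loft Orbital's or JPL's actual software. The `Tile` class, the `triage` function, the threshold, and the tile sizes are all hypothetical, and `keyword_scorer` is a toy stand-in that matches words against pre-attached labels where a real system would run a vision-language model on pixels.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Tile:
    """Hypothetical image tile; labels stand in for actual image content."""
    tile_id: str
    size_mb: float
    labels: FrozenSet[str]


def keyword_scorer(query: str, tile: Tile) -> float:
    """Toy stand-in for a VLM: fraction of query words present in the tile's labels."""
    words = set(query.lower().split())
    if not words:
        return 0.0
    return len(words & tile.labels) / len(words)


def triage(
    query: str,
    tiles: List[Tile],
    scorer: Callable[[str, Tile], float],
    threshold: float = 0.5,
) -> List[Tile]:
    """Keep only the tiles whose relevance score meets the threshold."""
    return [tile for tile in tiles if scorer(query, tile) >= threshold]


# Illustrative data only; sizes and labels are made up.
tiles = [
    Tile("t1", 120.0, frozenset({"railway", "hub", "warehouse"})),
    Tile("t2", 120.0, frozenset({"ocean"})),
    Tile("t3", 120.0, frozenset({"forest", "river"})),
]

flagged = triage("railway hub", tiles, keyword_scorer)

# Share of the captured data volume that would not need to be downlinked.
total_mb = sum(tile.size_mb for tile in tiles)
flagged_mb = sum(tile.size_mb for tile in flagged)
downlink_saved = 1 - flagged_mb / total_mb

print([tile.tile_id for tile in flagged], round(downlink_saved, 2))
```

In this made-up example only one of three tiles is flagged, so roughly two-thirds of the captured data never has to leave the spacecraft. The scorer is passed in as a function so that the toy matcher could be swapped for a real model without changing the filtering logic.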
Why is this important?
This demonstration effectively turns satellites from dumb cameras into intelligent, autonomous observers. By processing data at the source, it slashes the monumental amount of information that needs to be sent to the ground, breaking a major bottleneck in satellite operations. More profoundly, it paves the way for what Loft’s Head of AI, Paul Lasserre, calls “always-on, patrol layers in space.” Instead of tasking a satellite to take a picture, operators can give it persistent commands like, “Monitor this border and alert me when you see something suspicious.” It’s the first step toward a future where space infrastructure is not just collecting data, but actively making decisions.

