Forget smart glasses: UW researchers put tiny cameras in earbuds for hands-free AI

Wireless earbuds seem to have come out of nowhere. Popularized by Apple’s AirPods, they were suddenly everywhere – on the train, in the store, in the ears of the person sitting across from you – until somewhere along the way, they became something that almost everyone wore without a second thought.
Could that popularity make earbuds better than AI smart glasses? That’s the bet behind VueBuds, a prototype developed by University of Washington researchers who embedded a camera the size of a grain of rice in each earbud of a standard pair of Sony wireless earbuds. The result is a virtual AI assistant that hides in plain sight: look at a can of food and ask how many calories it has, grab an unfamiliar kitchen tool and get an answer in about a second.
The system processes images on-device and answers through a connected AI model, with no cloud required and no images stored.
The UW team believes it is the first to embed cameras directly into commercial wireless earbuds.
The earbuds don’t remember anything, but the people around you might not know that. That tension sits at the heart of what the UW team developed and raises a question the researchers take seriously: what are the social norms when cameras are embedded in things that no one thinks of as cameras?
The team’s response is to err on the side of limiting data collection. Images are processed and discarded; nothing is saved. But the system gives observers no outward signal that a camera is there, which the researchers admit is more of an open challenge than a solved one.
For technology like this to gain trust, Maruchi Kim, lead researcher and UW doctoral student in the Paul G. Allen School of Computer Science & Engineering, argued that privacy cannot be an afterthought.
“We don’t support photo storage,” Kim said. “The key is to bridge the gap between human interaction and access to AI on the go, especially in hands-free situations.”
The group’s other central argument concerns the form factor, and it’s a direct challenge to Meta, which has spent years and billions of dollars trying to make camera glasses a mainstream product.
The UW team’s position is that smart glasses will never completely shed their social baggage: the memory of Google Glass, the discomfort of being looked at, the visual signal that the wearer has opted into something that most people haven’t. Earbuds don’t have that history.
“From the beginning, we didn’t want to be associated with that,” said Kim.
Putting cameras in the earbuds required solving the power problem first. Cameras use far more power than microphones, so the team opted for a low-power sensor that captures about one frame per second in black and white: slow by video standards, but fast enough for the question-and-answer style the researchers had in mind.
The cameras are angled outward by five to 10 degrees, offering a 98- to 108-degree field of view, and images from both earbuds are stitched into a single frame before processing, cutting response time to about one second.
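The article does not say how the two images are combined, so the following is only a rough sketch of the simplest possible approach: placing the left and right grayscale frames side by side so that a model receives one image instead of two. The `Frame` type, the function name `stitch_side_by_side`, and the tiny sample frames are illustrative assumptions, not part of VueBuds, and a real system would likely also have to handle the overlap between the two fields of view.

```python
from typing import List

# Assumption: a grayscale frame is a list of rows, each a list of 0-255 values.
Frame = List[List[int]]


def stitch_side_by_side(left: Frame, right: Frame) -> Frame:
    """Join two equal-height grayscale frames into one wider frame.

    This is naive concatenation; it does not align or blend overlapping regions.
    """
    if len(left) != len(right):
        raise ValueError("frames must have the same number of rows")
    # Append each right-hand row to the matching left-hand row.
    return [l_row + r_row for l_row, r_row in zip(left, right)]


# Hypothetical 2x2 frames standing in for the left and right earbud cameras.
left_frame = [[10, 20], [30, 40]]
right_frame = [[50, 60], [70, 80]]

combined = stitch_side_by_side(left_frame, right_frame)
print(combined)  # [[10, 20, 50, 60], [30, 40, 70, 80]]
```

Under these assumptions, the combined frame is twice as wide as either input and can be handed to a vision model in a single request.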
Applications range from the convenient to the essential. The system can read text on food packages, identify items, and translate written Korean. But for people with low vision or cataracts, the impact could be more profound.
The team received more than a dozen emails from people with visual impairments explaining what they would use it for: recognizing facial expressions, reading books, watching TV — tasks that existing AI tools can’t easily support in a hands-free, ambient way.
Kim sees another underserved group in the workplace. Electricians, plumbers, and workers in industrial settings often can’t stop to pull out a phone in the middle of a job – a pipe that has to be held in place, a live wire that requires both hands.
For those workers, a voice-activated virtual assistant that doesn’t require touching a screen is the difference between having access to AI and not having it at all.
“There’s a lot of blue-collar work where those people can’t really take advantage of the latest AI advances,” Kim said. “They can’t just pull out their phones and take a picture.”
The hands-free use case extends well beyond the trades: surgeons, chefs, anyone who has ever tried to follow a recipe with wet hands.
The system is still being tested and is not available for purchase. Shyam Gollakota, a professor at the Allen School and the project’s senior researcher, said interest from tech companies is significant, and camera-equipped earbuds could reach consumers within a few years.
On cost, Gollakota is optimistic. The camera sensor itself can run under a dollar at the component level, he said, meaning that at the scale of a major consumer electronics manufacturer, the price premium over standard earbuds is likely to be modest.
Gollakota also quoted a figure of $10, a very conservative estimate that assumes small production volumes.
“What we are doing in universities shows that you can solve technical problems,” said Gollakota. “Then we show a way for these companies and other people to say that this is possible.”


