It can be easy to focus on Apple Vision Pro’s exorbitant price and questionable marketplace performance, but the headset is undeniably a stunning technological achievement, and one that visionOS 2 is about to nudge forward.
Apple unveiled the mixed reality platform update during WWDC 2024, promising new gestures, updated guest mode, more visual experiences like a Bora Bora environment, and immersive videos through Safari.
Perhaps, though, no teased update is more anticipated than the ability to create spatial photos from standard 2D photos.
The platform is still in development beta and should ship as a full update sometime in September, but I’ve tried it out and am free to discuss what is yet another remarkable Vision Pro achievement.
The visionOS 2 dev beta’s new image conversion capability does far more than simply double an image and show one eye a slightly off-kilter version of the same photograph.
Traditional spatial or stereo photography typically involves taking two photos about eye-width apart to replicate our 3D vision capabilities. But turning a single flat digital photo into a spatial one is something else entirely. Apple’s machine learning imaging system can analyze a photo – almost any image – down to the pixel level.
Apple’s understanding of subjects and objects in a photo is well known. On iOS, it’s how the lock screen can separate the subject of a photo from the background just enough that, say, your head can appear overlaid on top of the lock screen’s clock. It’s how you can pluck your puppy out of a photo and drop his furry self into an iMessage.
With spatial photos in Vision Pro, image analysis is combined with the system’s exquisite stereo vision capabilities to stunning effect.
To try it out, I selected a range of photos that I thought might best illustrate the insta-stereoscopic capabilities. I chose photos of me in the city, me with my wife on a trip, some of my favorite images of my mother-in-law’s Yorkie, Charley, and one image that is more than 50 years old: a rare photo of me and my siblings with my grandparents (on my dad’s side).
It’s not a complicated process. I put these images in an album on my phone and then AirDropped them to the Vision Pro headset.
I started by selecting one of my photos, a picture of the dog in my mother-in-law’s backyard. There’s now a little 3D cube in the upper left-hand corner of the photo view window that you select to convert the photo into a spatial-compatible image. As soon as you do that, the system appears to scan the image, picking out the subject(s) and background. A second later, the original image is replaced with a 3D one.
Charley stood out in front of the bushes behind him. If I moved my head, the perspective changed ever so slightly. His fur was clearly defined; it didn’t look as if he was clipped or cut and pasted back in. It looked natural but with real dimension. I switched to immersive mode, where the image frame fades away, and the photo presses in toward you. Charley looked real enough to pet.
This was the case with all the images I chose, but the one that truly startled me was the 50-year-old image of my family. I didn’t have high hopes for this image. It was, after all, a photo of a small 5-inch x 6-inch print.
I repeated the conversion process and was struck a bit speechless: yes, the resolution was lower than the other images, which had more to do with the source than the spatial 3D process, but I had never seen a family image quite like this before.
My long-gone grandparents appeared almost tangible again, and there, sitting at the foot of the couch, was a young me. I looked real but like someone I barely recognized. I’d seen this picture numerous times before but started picking up details I hadn’t noticed, like the fact that my little sister, who was seated on my grandfather’s lap, was holding a crisp dollar bill. The grandparents always came bearing money; I’d kind of forgotten about that.
I’ve written before about the power of these spatial images, but applying them to any old photo takes them to a new level. The technology can, in fact, be applied to a wide variety of otherwise flat images, like screenshots, photos of movie posters, and album covers. The system can analyze and convert pretty much anything.
One thing I did notice with all the image conversions is that some cropping occurs during the process. It’s not a lot, but when I, for instance, converted a photo of Star Trek’s Enterprise starship (pulled off a trading card), it lost the bit of padding around the image so that the edges of the saucer or primary hull filled my view.
visionOS 2 will automatically select 10 photos a day for conversion, but you can also choose the ones you want.
Better control
There are many other, less emotionally triggering visionOS 2 updates, like a new and better way to access the Home Screen, status bar, and control panel. I can now hold my hand out with the palm up in an almost beseeching gesture, look at my palm, and a little circle icon will appear. Then, I tap my fingers once to bring up the Home Screen; no more pressing the Digital Crown (if you don’t want to).
When the circle appears, I can also flip my hand over and tap my index finger and thumb together to pop up the control panel. This last action is way easier than the current visionOS method, which requires you to glance way up to see a little green icon and then tap your fingers. If I make that same gesture and hold the pinch, I can access a new volume slider, which is, again, easier than using the Digital Crown.
The Home Screen is now customizable. I could pinch and hold an app icon to reorganize the apps as I see fit, including moving the Home app out of the “Compatible” folder and onto the main screen.
visionOS 2 also adds a new keyboard-finding capability that lets a real Magic Keyboard or Mac keyboard you’re connected to bleed through, even when you’re in full immersion mode. I immersed myself in the new Bora Bora environment (and immediately wanted to be on a real beach), but when I moved my hands toward my MacBook Air, a window opened around my MacBook’s keyboard. visionOS knows where the keyboard is and makes a little immersion cutout. It’s pretty smart and might make the system more useful for multi-taskers.
Safari also got a bit of an upgrade with the ability to take embedded panorama photos (assuming sites add them) and turn them into 360-degree panoramas in the headset. I was also able to view YouTube videos in much the same way I could watch video in, say, the Disney+ Vision Pro app: as floating screens in an immersive environment.
Apple is still quite interested in giving dev partners and, especially, enterprises more reasons to develop for and integrate Vision Pro into their businesses.
There’s a new Volumetric API that will make it easier for developers to build 3D objects and interactions that do not have to take up the entire Vision Pro environment. I tried an app that let me manipulate a realistic-looking ground speeder without losing access to Safari.
The Tabletop API lets developers build more detailed tabletop experiences and games. I saw a 3D chess board that also incorporated elements of the game Clue, complete with tiny, very realistic-looking rooms built into the virtual tabletop game board. The chess pieces looked especially real.
For enterprises, Apple hopes that a mixed reality approach using the Vision Pro’s main camera will entice them to build things like training systems. I used one such system to check that an under-the-sink garbage disposal was in good working order. Big orange virtual arrows floated around the device, pointing to where I should look while guidance messages instructed me on what to do. When I pushed the garbage disposal to the side, the arrows followed along with it.
Numerous other updates are also available, all designed to make Vision Pro more useful, easier to use, and more inspiring. None of them can change the $3,499 price or the fact that you still have to wear a 1.4-lb. computer on your face to experience them, but if you’re a Vision Pro user, this update may bring a smile to your goggled face.
You can install visionOS 2 on your own Vision Pro headset right now, but remember that it’s still a dev beta and is subject to bugs and changes until the final release arrives.