A month of the Vision Pro

Meta bought Oculus ten years ago this month, for $2bn. Oculus was selling a device that was amazing, and clearly part of the future, but also impractical and nowhere near ready for the mass market. Last month, Apple started selling a device that’s amazing, and clearly part of the future, but also expensive, impractical and clearly nowhere near ready for the mass market. And meanwhile, Meta has gone from the Oculus DK1 to Quest products that are good enough for a passionate base of enthusiasts but, again, clearly haven’t reached the mass market yet. We’re still at the beginning of the S-Curve.

I wrote a long essay about xR and the Vision Pro when it was announced last summer, and I won’t repeat all of that here (I won’t do a detailed hardware / software analysis either - you should read Hugo Barra’s excellent piece). It seems to me that Apple and Meta both think that something in this space can lead to a general and perhaps universal computing device that could be the next smartphone, but they’re trying to build that from opposite directions.

Meta started with games and VR, partly because of what was possible with the hardware of 2014, and it’s spent the last decade working its way upwards, trying to catalyse an ecosystem on the way. Apple thinks that you have to start with the general computing experience, which means iPad-style apps, and AR rather than VR (indeed, there really aren’t any VR experiences from Apple on the device) - and above all, that means text.

If you want to deliver a general computing experience at parity with the phone, tablet or PC we have today, then you need to have apps floating in the room around you, that look as though they’re really there, with really good ‘passthrough’, and you need far better text than we’ve ever seen on a VR device before. That means a much better display, lots more sensors, and a lot more compute (the Vision Pro effectively has similar compute to a MacBook Pro), and this is why the Vision Pro is so heavy and expensive: the components to deliver that experience are years away from fitting into a Quest 3’s size, weight and price. But I think that Apple believes this is the MVP: better screens are always better (obviously), but Apple thinks a display system this good is the minimum possible to have a viable product at all.

There’s a trade-off, obviously, between size and weight etc. on one hand and capability on the other, but the really interesting trade-off that Apple has chosen is time. As Mark Zuckerberg pointed out quite correctly last summer, Apple isn’t doing anything fundamental that no-one else plans to do - there is no equivalent of the iPhone revelation from 2007. However, it decided to do it in 2024, rather than waiting for a couple of years until this is all cheaper and more practical. The Vision Pro is a very un-Apple product launch: it’s effectively a very polished prototype or a developer kit, rather than a consumer product, and Apple doesn’t do that. (It has occasionally sold developer kits in the past, but it didn’t sell them to consumers.)

All of that was pretty clear last summer, but now that this thing has shipped and we can all use it, we can experience both sides of that trade-off in person. My reaction, like most of the reviews, is that it really does deliver the demo Apple gave at the launch. There’s some motion blur and some parts could be more polished for the 1.5 and 2.0, but mostly it looks just like Apple’s marketing materials. You can open a couple of iPad-type apps and float them over the desk in front of you, and they look like they’re really there. You can put a beautiful rendered game on the coffee table, and get up and walk round it. And it’s the best home cinema (for one person) that money can buy. But yes, it’s too heavy to wear all day (and the battery is only good for two hours), and, well, it’s $3,500 plus tax.

A lot of people focus on the price and the weight, but I think that misses the real challenge - the price and the weight can and will come down, but then what? The conclusion of every review is, essentially, ‘it’s amazing, but what’s it for?’

This is not an easy question to answer. In a sense, I think this device might function as a test for that whole general computing thesis. With every previous xR device, you could always say ‘yes, but just imagine what it can be once the tech is better!’ Well, now we have something a lot closer to that ‘just imagine’ device. It’s a lot harder to hide behind plans for the future. This thing doesn’t even have any VR games - it’s naked before us, forced to survive as an actual computer. If we cannot make a compelling general purpose computing experience on a display system this good, then the whole field might have a problem.

Apple’s answer, I think, is that we begin with the use cases we already have on other devices, and then, over time, developers will invent new things that are native to the form. That’s what happened with mobile: we began with the web and mail, and truly mobile-native things came later.

However, I don’t think the future of computing is seeing several apps at once. I don’t think the future of productivity is seeing more rows in your spreadsheet, or more emails at once, or more records in Salesforce at once, on one big screen. I think the future, as it has been for the last 20 or 30 years, is task-specific UIs that reduce complexity and data overload and focus on what you need to see. And obviously, I think the future is AI systems that show you less and tell you more.

And if that’s where productivity is going, that applies even more for consumers. The power-user criticism of the iPad has always been to claim that it can’t replace your Mac, but the real problem for iPad sales has always been that for most consumers, your iPad actually can replace your Mac - but so can your iPhone. The Meta team might claim that the Vision Pro is under-serving VR, but even the iPad is over-serving normal computing for normal users, and the Vision Pro overshoots even further.

Of course, this is a change on two dimensions: it’s a much bigger screen, but it’s also 3D. The puzzle is that games can use 3D, and so can some forms of media (Apple’s immersive videos are fun), but what else? Text is 2D. Email is 2D. Why, after 20 years of GPUs, are computer interfaces all 2D? Is it only because the screens are? Will that change?

This might be like looking at a colour monitor in 1985 and saying that text, spreadsheets and databases don’t need colour, so why bother? But it does seem to me that making computing itself 3D (as opposed to 2D planes in 3D space) is a different character of change to adapting maps or email to a mobile device. It’s also interesting to compare this to pen computing. Bill Gates spent 20 years thinking that pens were the future, but the hardware was never ready. Yet today, Apple has been shipping a technically flawless pen computing system for close to a decade, and it’s irrelevant. Some people use their Apple Pencil, and some even use it for more than drawing, but that was not the future of computer UIs. In hindsight, this was a form of skeuomorphism, and ‘spatial computing’ might be as well.

Of course, making predictions about how entirely new computing experiences will work out is generally little more than guessing. It’s easy to talk about ‘face computers’ and say that ‘no-one would ever do that’, or indeed to say that ‘productivity isn’t 3D’, but you can’t really know. All of us, every day, do lots of things that ‘no-one would ever do’.

On the other hand, something can seem amazing and part of the future, but ‘part of the future’ can come in different sizes. Drones and 3D printing are amazing, but most people don’t have a use for them. A few days after Christmas, once you’d seen the roof of your house or printed a little plastic Eiffel Tower, they went into a cupboard. Looking at VR in particular, I always worry about games consoles. If you’d had a demo of a PlayStation 5 in 1980 or 1990, you would have been amazed, but today the installed base of games consoles and gaming PCs combined is perhaps 300-400m units, and not growing. It’s a big market, but most people see one, say ‘well done, very pretty’ and walk past. Looking at something and saying ‘this is amazing!’ doesn’t have much more predictive power than saying ‘this is stupid!’

Meanwhile, comparisons with iPhones are much-overused, but an iPhone was a drop-in replacement for the phone you already had, and it was better than most phones even before it had 3G or an App Store. A Vision Pro is not a direct replacement for anything, and I think a better comparison is with the original Mac, back in 1984. It cost $2,500 - over $7,200 adjusted for inflation - and most people didn’t really know why they needed one, or indeed why they needed a personal computer at all. Even a decade later there were probably only 20-30m consumer PCs on the planet - it took the web to give most people a reason to buy. Some people like to argue that the ‘metaverse’ will play the same role, but to the extent that ‘metaverse’ means anything (and generally it doesn’t), the definition is circular. “What will we do with VR? The metaverse! What is the metaverse? What we do with VR!”

All of this is a long way to say that there are a lot of questions about what people will want to do, and whether this will work. Check back in 2025, or 2030. However, I think there are two things we can say in March 2024 with much more certainty.

First, Meta and Apple both think that some combination of VR, AR and passthrough is the future, but neither they nor anyone else thinks you would wear a headset all day, no matter how light.

For that, you need glasses, but we don’t know when we’ll have the optics for that and we’re clearly not close (if Apple was close, would it still have shipped the Vision Pro?). Some part of Meta’s $15bn annual Reality Labs investment is going on glasses, plus whatever Apple is allocating, but for the time being this is what we have. That means that we don’t have a device that can replace the smartphone as the universal device that everyone on earth carries every day and uses as their main computer. That in turn means the market size, if all of this works, is an optional smartphone accessory, very similar, in fact, to an iPad. The iPad, after more than a decade of being the ‘right device’ at the right price (which VR is nowhere near), has an installed base of perhaps 300m units: a big market and a decent outcome for shareholders, but only the smartphone is universal, so far. (Indeed, if we’re talking about smartphone accessories, we should also think about Apple’s watch and AirPods, and Meta’s Ray-Ban audio assistant. These are also accessories, and also AR.)

Second, taking one step further back again, even if my doubts are all wrong, we won’t know any of this for years, and right now this is all still in the experimental category. Apple sells more watches in a typical quarter than Meta has active Quest users. Even the iPhone took years to start selling at scale. It’s possible that in five years this will have started to work, and it’s possible that in five years we’ll have concluded that this is a niche, and we’ll have to wait for glasses, contact lenses or neural implants.

Meanwhile, back in March 2024, have you heard about this ‘ChatGPT’ thing?