February 17, 2023
Most of us are watching most of our movies in our homes today, and the Covid-19 pandemic only accelerated this trend. Yet the entire artform of film – indeed, the entire multi-billion-dollar film industry – was built around producing films that were meant to be watched in a cinema.
There’s an aspect of the cinematic experience that goes back to the earliest days of theatrical storytelling: for the audience, cinema is a shared experience. When you physically go to the movies, other people in the room are sharing the same experience at the same time. You laugh at jokes together; you experience fear, joy, wonder and excitement as a group. As convenient as watching films at home has become, we have also lost that shared experience.
But the next wave of interactive video experiences, such as VR (virtual reality) and XR (extended reality), promises to bring back some of that shared experience that is so integral to our humanity. We will be able to interact in real time with people anywhere in the world through VR/XR devices, reclaiming a form of togetherness that dates back to early human history.
Most experts predict that, in the coming years, the world will experience dramatic increases in the volume of video content being consumed. Video content delivery networks (CDNs) are nothing new; in fact, they have been around in some form for a couple of decades.
CDNs help streamline the user experience by bringing content closer to the end users, for reduced latency and less network congestion. Over the years, these networks have become faster and more efficient, reliable and stable. And they have evolved to accommodate a wider range of devices – TVs, laptops, mobile phones, tablets, head-mounted displays and smart glasses.
A big question remains, and it’s what we’d like to explore today: how will video content delivery change as the experiences become more immersive? Most content consumed today is still fairly linear and one-way: you download or stream a video and watch it on the same device. But interactive experiences are the way of the future, and content delivery networks need to evolve to meet the demands of tomorrow’s interaction and immersion, where VR, XR and other technologies become the dominant paradigm.
As we go from the current, relatively passive, content consumption model where you’re watching an episode of Ozark on Netflix or a live sporting event on ESPN+, to a future where we engage in interactive, immersive XR, a lot of innovation needs to happen.
Current models for video content distribution respond to a fairly prescribed set of inputs from users: play, pause, fast forward, rewind, and maybe toggling subtitles on or off. The servers on the CDN side today are designed to respond to those basic inputs.
As content becomes more interactive and we progress toward VR, AR (augmented reality) and XR, there will need to be considerable innovation to provide more sophisticated interactivity. The way users will want to interact with content may not be limited to a few basic commands that can be traced back to the days of the VCR.
In the future, users may want to zoom in, change their angle of view from ground level to bird’s-eye, or choose to chat with another user. These different ways of interacting with content and with other participants, either in-person or virtually, will place different and unfamiliar demands on the CDN.
Interactive and immersive experiences will offer engagement closer to that of video gaming, with many more modes of interaction than we’ve seen in the past. Responding to multiple participants’ requests while rendering and delivering shared content in real time will demand innovation.
As consumers, we already face much more content than we need. Pick any of the major streaming services: Netflix, Disney+, Amazon, Hulu, the list goes on. Each one has more content to offer than any of us can possibly consume.
Why? They want to deliver the most personalized experience possible. Today, personalization – deciding which shows to suggest – is handled by tracking which shows you’ve watched and how you rated them. If they know you like comedies, those will be prioritized.
As content becomes interactive, there will be much more data to mine about your habits since those habits will become interactive as well. Data-rich ecosystems provide content distributors with unparalleled insights into consumer behavior and end-device capabilities. This data can be used to create optimized experiences for target users.
If you and I watch the same movie, but you accessed a very high bitrate stream on your 4K TV while I was served a low-quality stream to my mobile device, we would probably report very different experiences. Our experience is tied to the quality of the service we are provided, and not just to the content itself.
One of the most important things that needs to happen with video content delivery in the future is to make it more equal. People in many developing countries already tend to get much lower quality video experiences. As we head toward higher levels of interactivity, it will become very important for those differences to level out, so that everyone can participate in the interactive experience, regardless of where they are in the world.
For example, let’s say an interactive experience is being enjoyed in the metaverse by participants in four different places. One lives in an underdeveloped region with unstable, intermittent internet connectivity. The experience of all participants will suffer, because the others won’t be able to interact with that participant as naturally as they do with each other. Closing the digital divide will be just as important as ever.
What many people don’t realize is that there is not only more content on each streaming service than we can possibly consume, but each of those video files is also available in several resolutions. The version you are served up depends on the capability of your display, the stability and bandwidth of your internet connection, and other factors.
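As a rough illustration of how a player might choose among those versions, here is a minimal sketch in Python. The bitrate ladder, the `Rendition` type, the `pick_rendition` function and the 0.8 safety margin are all hypothetical, chosen for illustration; real services use their own ladders and more elaborate selection logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rendition:
    name: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # average encoded bitrate


# Hypothetical ladder: the numbers are illustrative, not from any real service.
LADDER = [
    Rendition("240p", 240, 400),
    Rendition("480p", 480, 1200),
    Rendition("720p", 720, 3000),
    Rendition("1080p", 1080, 6000),
    Rendition("2160p", 2160, 16000),
]


def pick_rendition(display_height, bandwidth_kbps, safety=0.8, ladder=LADDER):
    """Pick the highest-bitrate rendition that fits the display and the link.

    Only a fraction (`safety`) of the measured bandwidth is budgeted, leaving
    headroom for an unstable connection.
    """
    budget = bandwidth_kbps * safety
    candidates = [
        r for r in ladder
        if r.height <= display_height and r.bitrate_kbps <= budget
    ]
    if not candidates:
        # Nothing fits: fall back to the lowest-bitrate version.
        return min(ladder, key=lambda r: r.bitrate_kbps)
    return max(candidates, key=lambda r: r.bitrate_kbps)


# A 4K TV on a fast link versus a phone-sized display on a weak mobile link.
tv = pick_rendition(display_height=2160, bandwidth_kbps=25000)
phone = pick_rendition(display_height=720, bandwidth_kbps=2000)
print(tv.name, phone.name)
```

Under these assumed numbers, the same title is served to the TV at a far higher bitrate than to the phone, even though both viewers asked for the same file.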
While a lot of video content is designed for the highest quality (e.g., 4K at 60 fps today), it is increasingly common to watch on mobile devices, anytime and anywhere. In fact, recent research indicates that only four percent of content delivered on mobile networks was played at the highest possible quality.
When content is consumed at a lower resolution and bitrate, much of the quality built into it goes unused. Since a lot of expense goes into developing high quality video, content creators and service providers are very interested in having more people consume it at the highest quality.
This raises the question: if we are already not consuming 2D content at the highest quality, is the outlook bleak for VR and XR over mobile networks? VR and XR are more challenging to deliver than 2D content. They require a lot more bandwidth, extremely low latency, and very high resolution for an optimal experience. This challenge is as yet unresolved, and it is an area with a great deal of potential.
Video powers everything for everyone, everywhere, and it has long been a driver fueling digital innovation. If we can find ways to equalize the quality of experience for all users, we will get closer to achieving the truly shared human experience that is one of the most exciting promises of VR and XR technology.
Tao is VP of Advanced R&D, Media IP at Adeia. He is responsible for initiating and leading R&D projects and for supporting the Adeia CTO in defining the future technology roadmap and research strategy. Prior to joining Adeia, Tao was Senior Director, Applied Research at Dolby, where he managed R&D teams developing core technologies for various applications including Dolby Vision, which won a Best of CES Award in 2020. Earlier, he was with Panasonic Hollywood Lab, where he led the development of specs and encoding systems for Blu-ray and 3D production. Tao was an honouree of the Primetime Engineering Emmy Awards for his contributions to Dolby’s Philo T. Farnsworth Award in 2021, and a recipient of an Emmy Engineering Award in 2008. He won Silver Awards at the Panasonic Technology Symposium in 2004 and 2009. Tao received the award for Most Outstanding Ph.D. Thesis from the Computer Science Association, Australia, in 2001.