On September 1, 2015, Google, along with Microsoft, Cisco, Intel, Mozilla, Amazon, and Netflix, made a stunning announcement that will fundamentally change the future direction of open standards-based WebRTC video. Together they announced the formation of the Alliance for Open Media (release, website) as “an open-source project that will develop next-generation media formats, codecs, and technologies in the public interest.” Later this year the alliance intends to provide information on how other organizations can join and contribute.
This blog post explores why this announcement is so significant for WebRTC, why it excites us at Agora.io, and how we plan to contribute. We believe this alliance has the potential to let vendors focused on specific communications issues, like Agora.io, participate in an open process rather than watching from the sidelines, along with everyone else, as a few large vendors battle over opposing video codecs! This will greatly benefit WebRTC, as well as the Internet and mobile applications that need new innovations and better Quality of Experience.
History – The Clone Wars
WebRTC came into existence only four years ago in May 2011, first released as an open source project by Google followed by standards initiatives around protocols within IETF and browser APIs within the W3C. The goal of this work is to provide open and easy-to-deploy voice, video and file-sharing communications between browsers and applications – without the need for plugins within browsers.
Over the past four years, there have been two main areas of contention within the WebRTC community – comprehensive browser support and the required video codecs. Microsoft Internet Explorer and Apple Safari have been the browsers “missing in action” so far. These issues are somewhat interrelated, since these vendors did not want to commit to WebRTC in their browsers until it was resolved which video codecs they would need to support. Some progress is now being made on the browser side, with Microsoft moving forward with WebRTC capabilities and new codecs being gradually incorporated into the new Microsoft Edge browser. On the Apple side, there has been work to make it easier to include WebRTC primitives within “WebKit” (see here), which underlies Safari, but there is still no official commitment from Apple.
On the audio codec side, the immediate issue was resolved early on – the WebRTC pre-standard required Opus and G.711, both of which are openly available, and everyone agreed to this.
However, on the video codec side, a pitched battle ensued between the current enterprise industry standard H.264, which comes with various licensing requirements, and VP8, which Google provides for free as part of its WebM open source project (building on hundreds of millions of dollars of Google acquisitions and investments). There were strong arguments on both sides – open vs. well understood, integrating with current equipment vs. new and web-based, license fees vs. potentially no license fees, etc. This long battle delayed the video codec standards decision for several years, and the final IETF working group decision in November 2014 was a compromise – browsers and certain classes of applications need to support both VP8 and H.264, while “WebRTC-compatible endpoints” could support whatever they needed according to their communication requirements.
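To see what the “support both” compromise buys in practice, here is a minimal sketch in Python. It is only an illustration under simplifying assumptions: codecs are plain strings, negotiation is reduced to picking the first mutually supported codec in the offerer's preference order, and the names `negotiate`, `browser_a`, `browser_b`, and the two single-codec endpoints are all hypothetical rather than part of any real WebRTC API.

```python
# Illustrative sketch only: real WebRTC negotiation happens through
# SDP offer/answer, not through a helper like this one.

# The two codecs the compromise requires browsers to support.
MANDATORY_FOR_BROWSERS = {"VP8", "H264"}


def negotiate(offered, answerer_supported):
    """Return the first codec in the offerer's preference order
    that the answerer also supports, or None if there is none."""
    for codec in offered:
        if codec in answerer_supported:
            return codec
    return None


# Hypothetical endpoints, each listed in its own preference order.
browser_a = ["VP8", "H264", "VP9"]
browser_b = ["H264", "VP8"]
h264_only_endpoint = ["H264"]  # a "WebRTC-compatible endpoint"
vp8_only_endpoint = ["VP8"]    # another "WebRTC-compatible endpoint"

# Two browsers always share a codec because both carry the mandatory pair.
print(negotiate(browser_a, set(browser_b)))                   # VP8

# A single-codec endpoint can still reach any browser.
print(negotiate(h264_only_endpoint, set(browser_a)))          # H264

# Two single-codec endpoints may have nothing in common.
print(negotiate(h264_only_endpoint, set(vp8_only_endpoint)))  # None
```

The last case shows the added complexity: interoperability is guaranteed only when at least one side is a browser, so application developers still have to think about which codecs their non-browser endpoints carry.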
The Return of the Codecs
The WebRTC “provide both” video codec compromise was a resolution of sorts, albeit at the cost of making things more complicated, but unfortunately the situation remains very unstable.
Both VP8 and H.264 are older video codecs that are being rapidly replaced by new innovations. Google is already shipping VP9, which is now heavily used within YouTube delivering tens of billions of hours of more efficient video streaming, and Google is developing VP10, targeted at optimizing 4K video streams. Meanwhile, the H.265 “HEVC” spec is complete, along with various extra “layers”, and is being incorporated into enterprise communications equipment, but it comes with even more complex licensing requirements. Other vendors unhappy with the situation have also started developing their own open video codecs – such as Mozilla with Daala and Cisco with Thor.
So the most likely prospect before September 1st was that the video codec wars were simply going to get worse and more contentious, making it more and more complex for ordinary application developers to simply build great apps that incorporate interoperable video!
And a further concern is that this key area of modern communications was turning into just a “big vendor” war between the major players. Within the previous codec war, there was no space for other vendors to participate and contribute to these standards to meet expanded needs – we all just got to watch the largest players fight it out, and still not reach a simple decision!
Agora.io believes that there are key innovations needed to enable high-quality live voice and video interactions across the world, across many different mobile networks and devices, and across highly variable network conditions. There is, therefore, a need for continued innovation in the WebRTC industry. And these requirements are different from video compression and streaming needs for entertainment. So we are pleased that there will be an open alliance within which to discuss these evolving needs.
The (Rebel?) Alliance
The immediate objective of the Alliance for Open Media (link) is “to deliver a next-generation video format that is:
- Interoperable and open;
- Optimized for the web;
- Scalable to any modern device at any bandwidth;
- Designed with a low computational footprint and optimized for hardware;
- Capable of consistent, highest-quality, real-time video delivery; and
- Flexible for both commercial and non-commercial content, including user-generated content.”
These are great objectives and are fully aligned with the Agora.io vision for easy worldwide communications, especially between mobile devices. WebRTC authority Tsahi Levent-Levi called these “high goals”, although he jokingly commented that it might be “easier to just bio-engineer Superman”!
But the most important feature of the alliance is its players. The alliance brings together Google, Microsoft, Cisco, Intel, and Mozilla – all the vendors otherwise planning to “build their own open codec”, including key protagonists on both sides of the previous VP8 vs. H.264 battle. This will move the debate about the next generation of open video codecs from a Google vs. all other alternatives battle into a much more focused alliance vs. non-open alternatives (that is, HEVC) discussion. It will focus what would otherwise have been separate VP10, Daala, and Thor initiatives into one common, much better-supported effort. This is good for the industry.
For us, this appears to be the key driver from Google’s perspective – to make it less about them as one vendor and to get back to their original stated goal of ensuring that there are open video alternatives available for broad web and mobile use. The breakthrough here has been to ensure that everybody else gets a say by creating a joint development alliance. Google is giving their technology away for free anyway (despite their large investment) so this is not a revenue issue for them. The question for Google, and all these vendors, is whether the alliance will continue to move fast enough in pushing new innovation that Google and others can then monetize in their applications and services rather than at a codec level.
A key real-world need is hardware acceleration for new codecs, especially on mobile devices. Without hardware acceleration, video quality is lower, CPUs are consumed, devices overheat, and batteries run down faster! Intel’s involvement is a start here, and we would hope that the real mobile hardware producers will become aligned with the alliance. Google has made some progress in starting to get VP8/VP9 hardware acceleration in various chipsets, which previously tended to only have H.264, and with the open multi-vendor nature of the alliance, we would expect these commitments to accelerate.
Which brings us to Netflix and Amazon. These two vendors plus Google/YouTube are responsible for delivering the majority of the Internet’s video streams every day – which account for the majority of the Internet’s overall traffic! They are the biggest customers for video streaming on the planet and have huge experience with what works and does not work with video. Their expertise and requirements will be invaluable in shaping next generation codecs and ensuring that whatever is agreed then gets rapidly adopted in the real world! Of course, we have yet to see if the next major video providers on the Internet will join the alliance – Apple and Facebook?
But there is one point to note about Netflix, Amazon and YouTube – these vendors mainly deliver video streams one-way (which are heavily pre-encoded and can be buffered), largely for entertainment purposes. Compression, bandwidth use, and error recovery are common issues for this use as well as other video uses. But real-time people-to-people interactions, especially across mobile devices and networks, have further requirements that must also be a key input to any new standard. This is the goal of Agora.io and WebRTC, and the focus for the enterprise and browser vendors in the alliance. For WebRTC to be successful as a common global communications foundation, Quality of Experience even in the most difficult mobile and network circumstances is essential.
Quality of Experience – The Force Awakens!
The important news for Agora.io is that the Alliance for Open Media will be a point of collaboration between many different vendors and an opportunity to participate, understand and contribute as next generation codecs are designed and completed. There was not previously any good forum for vendors apart from Google to contribute in an open way to the future of Internet video that had a path to becoming adopted and used across the industry. Now there is.
Current areas of Agora.io innovation for real-time interactions include:
- Optimized transport and routing of real-time voice and video streams to ensure high Quality of Experience (QoE) across variable global networks.
- Optimized handling of rapidly changing and sometimes high packet-loss and jitter situations when connecting to mobile devices over “the last mobile mile” across many different global mobile networks.
- Optimization of voice and video codecs to improve real-time QoE.
- High-scale distributed conferencing and streaming, allowing thousands of users to be efficiently interconnected within a single session.
- Flexible session management that manages global routing to avoid network slow-downs and better facilitate disconnects and re-connects across variable networks.
These innovations are already built into the Agora.io mobile SDKs (link), using our own optimized codecs today, and our customers can benefit from these capabilities immediately. We see the long-term benefit of the Alliance for Open Media being that Agora.io will be able to better align our innovations with new open video standards and contribute openly to the alliance ourselves as appropriate. This will give our customers more choices and better global Quality of Experience for all their users.