How long would you keep watching a grainy livestream that keeps freezing? In today's video-first digital environment, streaming video quality has an enormous impact on user experience. What can you do to ensure a high-quality real-time video experience for your users?
Agora’s Adaptive Video Optimization™ (AVO) is a combination of technologies that work together to deliver the best possible live video quality, with a holistic approach that carefully considers every step, from capture to playback. AVO uses machine learning and AI to ensure superior performance even in less-than-ideal network scenarios.
How does this look in a common real-world scenario? For example, under network conditions with 20% packet loss, real-time video from Zoom’s Video SDK experiences freezes, stutters, and delay, while Agora’s video stream maintains fluency. Zoom's image quality often appears noticeably blurry, while Agora maintains consistent image quality and clarity. Watch this video example below to see how AVO optimizes real-time video in a side-by-side comparison with Zoom.
How is Agora able to deliver smooth, high-quality video under these conditions? Keep reading to learn about the specific user experience pillars and video technologies powering Agora’s Adaptive Video Optimization.
Three Key Pillars for Delivering Exceptional Live Video Experiences
Before diving into the specific technologies powering AVO, let’s look at the core principles we focus on when enhancing the live video user experience (UX): Image Quality, Video Fluency, and Ultra-Low Latency.
Optimized Image Quality: Agora’s technology ensures images are crisp with enhanced contrast, vibrant detail, and depth, offering a clear and more engaging viewing experience that captures every moment in its true essence.
Unmatched Video Fluency: Experience smooth video playback that ensures a fluid and natural viewing experience, eliminating the common frustrations of buffering and stuttering.
Ultra-Low Latency: Ultra-low latency means that users can engage interactively in real time with hosts or one another. Additionally, quicker load times and instant first-frame rendering allow users to switch seamlessly between different video streams or feeds within an application built using Agora, making for a natural and responsive video interaction.
These three key pillars are at the heart of Agora's mission to deliver best-in-class user experiences, ensuring that every interaction with live video is as immersive and engaging as possible.
Optimizing Live Video Processing for Quality
Building on the three key foundational user experience pillars, improvements in video processing are essential to providing an optimal real-time video experience for users. The five stages of video processing that Agora’s AVO optimizes are: pre-processing, encoding, transmission, decoding, and post-processing.
Pre-Processing
In this initial phase, Agora refines the raw video input so that it is of the highest possible quality before encoding. This step sets the stage for the highest-quality output.
Agora’s pre-processing stage includes:
Contrast Adjustment automatically compensates for challenging lighting conditions, such as low-light environments, to ensure a user’s image can be clearly seen.
Perceptual Video Coding (PVC) utilizes deep learning to provide the highest-quality video while consuming up to 30% less bandwidth without compromising visual clarity. PVC considers how our eyes perceive video content and removes any visual redundancies, thus improving compression.
Region of Interest (ROI) framing achieves a better subjective impression of a given video image by allocating more bits to regions of interest, like faces and bodies, than to background objects.
Color Enhancement makes video content vibrant and true to life, ensuring it stands out with accurate, vivid colors that appear stunning on any screen.
Pre-Processing Auto Adjustment balances system consumption and video quality improvement by selecting the proper level of algorithm complexity.
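To make the ROI idea above concrete, here is a minimal sketch of weighted bit allocation. It is not Agora's implementation: the `Region` type, the `allocate_bits` function, and the `roi_weight` value of 3.0 are all hypothetical, chosen only to show how a frame's bit budget can be skewed toward faces and bodies.

```python
from dataclasses import dataclass


@dataclass
class Region:
    name: str
    area: int      # number of pixels (or blocks) covered by the region
    is_roi: bool   # True for faces/bodies, False for background


def allocate_bits(regions, frame_budget_bits, roi_weight=3.0):
    """Split a frame's bit budget so each ROI pixel receives `roi_weight`
    times the bits of a background pixel (the weight is an assumed value)."""
    weighted = [r.area * (roi_weight if r.is_roi else 1.0) for r in regions]
    total = sum(weighted)
    if total == 0:
        return {r.name: 0 for r in regions}
    return {
        r.name: int(frame_budget_bits * w / total)
        for r, w in zip(regions, weighted)
    }


regions = [
    Region("face", area=10_000, is_roi=True),
    Region("background", area=90_000, is_roi=False),
]
allocation = allocate_bits(regions, frame_budget_bits=120_000)
print(allocation)
```

In this example the face covers 10% of the frame but receives 25% of the bits, which is the kind of trade-off ROI-aware encoding makes.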
Encoding
Video encoding compresses raw video frames into a bitstream with a much lower bitrate and converts it as necessary for optimal transmission.
Agora’s Coding Engine (ACE) is a core function of Agora’s AVO and provides optimal encoding by supporting the following key functions:
Agora Coding Technology (ACT) includes an auto-adjust function that dynamically selects the best encoder given device capabilities and interoperability requirements. Agora supports a wide range of video codecs in both hardware and software, such as H.264, H.265, VP8, VP9, and AV1. ACT can also dynamically adjust the codec selection based on changes in CPU load, ensuring seamless session continuity by reverting to H.264 or VP8 as needed, particularly when compatibility issues arise during a session.
Video Quality Control (VQC) leverages machine learning to dynamically adjust video resolution and frame rate based on the available bandwidth to ensure optimal quality of experience.
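As a rough illustration of bandwidth-driven resolution and frame rate selection, the sketch below walks a fixed quality ladder. Agora describes VQC as machine-learning based; this lookup table is a much simpler stand-in, and the `LADDER` values, the `pick_profile` name, and the 0.85 headroom factor are assumptions made for the example.

```python
# Hypothetical quality ladder: (min_kbps, width, height, fps), best first.
LADDER = [
    (2000, 1280, 720, 30),
    (1000, 960, 540, 30),
    (500, 640, 360, 24),
    (200, 320, 180, 15),
]


def pick_profile(estimated_kbps, headroom=0.85):
    """Return the highest rung whose bitrate fits the usable bandwidth.

    `headroom` leaves a safety margin below the bandwidth estimate.
    """
    usable = estimated_kbps * headroom
    for min_kbps, width, height, fps in LADDER:
        if usable >= min_kbps:
            return (width, height, fps)
    # Below the lowest rung: fall back to the smallest profile.
    _, width, height, fps = LADDER[-1]
    return (width, height, fps)


for kbps in (2500, 1100, 100):
    print(kbps, "kbps ->", pick_profile(kbps))
```

A production controller would also smooth its decisions over time so the picture does not flip between rungs on every small change in the estimate.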
Transmission
After Agora optimizes the video in the pre-processing and encoding stages, it is transmitted over Agora’s Software-Defined Real-Time Network (SD-RTN™). Agora's network is a real-time overlay to the internet that adapts automatically to varying network conditions, ensuring that video is routed on the optimal path with minimal latency and the best overall performance.
Agora’s Last Mile Transport (ALT) works between the SDK running on a given device and Agora’s SD-RTN™ to optimize transmission over challenging last-mile connections.
ALT on the SDK (device) side includes the following functions:
Loss Detection Analyzer, a machine learning algorithm that classifies packet loss into congestion, random, or other categories of loss.
Congestion Control and Bandwidth Estimation, which intelligently calculates transmission-related parameters based upon the detected congestion level of the network and estimates the available bandwidth.
Forward Error Correction (FEC) and Automatic Repeat Request (ARQ), which increase packet-loss resiliency.
Pacing to smooth the flow of video transmitted to the SD-RTN™.
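To show how forward error correction adds packet-loss resiliency, here is a minimal sketch of single-parity XOR FEC, the simplest scheme of this kind. The article does not say which FEC scheme Agora uses, so treat this as a generic illustration; the function names are hypothetical and the packets are assumed to be of equal length.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length packets."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets):
    """XOR every packet in a group into one parity packet."""
    return reduce(xor_bytes, packets)


def recover(received, parity):
    """Rebuild the single missing packet (marked None) from the others plus parity."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        raise ValueError("XOR parity can repair exactly one lost packet per group")
    present = [p for p in received if p is not None]
    repaired = list(received)
    repaired[missing[0]] = reduce(xor_bytes, present, parity)
    return repaired


group = [b"AAAA", b"BBBB", b"CCCC"]
parity = make_parity(group)            # sent alongside the group
damaged = [group[0], None, group[2]]   # the second packet was lost in transit
restored = recover(damaged, parity)
print(restored)
```

The receiver repairs the loss without waiting for a retransmission, at the cost of one extra packet per group. ARQ complements this by requesting a resend when FEC alone cannot repair the loss.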
Agora’s SD-RTN™ provides the following key functions to optimize video transmission:
Smart Routing, which chooses the most reliable transmission path with minimal latency.
Pre-Load / Pre-Join, which works in conjunction with SD-RTN™ to predict and pre-set key information to make first-frame rendering extremely fast for users. One key example of where this is very beneficial is in an app where a user is switching between multiple video feeds and wants to see video instantly.
Piecewise Congestion Control, which performs congestion control dynamically on each hop across the network according to the traffic situation.
Agora Last Mile Transport in the SD-RTN™ provides dynamic selection of the right bitstream(s) of the scalable video codec (SVC) to send to the receiving SDK.
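The routing idea can be sketched as a shortest-path search over an overlay graph whose edge weights are measured latencies. This is only an illustration using Dijkstra's algorithm: the node names and latency figures are invented, it minimizes latency alone and ignores reliability, and nothing here is taken from SD-RTN™ internals.

```python
import heapq


def lowest_latency_path(graph, source, target):
    """Dijkstra over {node: {neighbor: latency_ms}}; returns (total_ms, path) or None."""
    queue = [(0, source, [source])]
    best = {source: 0}
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return cost, path
        if cost > best.get(node, float("inf")):
            continue  # stale entry: a cheaper route to this node was already found
        for neighbor, latency in graph.get(node, {}).items():
            new_cost = cost + latency
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical overlay: latencies in milliseconds between relay nodes.
overlay = {
    "sender_edge": {"relay_a": 40, "relay_b": 15},
    "relay_a": {"receiver_edge": 20},
    "relay_b": {"relay_a": 10, "receiver_edge": 60},
    "receiver_edge": {},
}
route = lowest_latency_path(overlay, "sender_edge", "receiver_edge")
print(route)
```

Here the two-relay route (45 ms) beats both single-relay routes (60 ms and 75 ms), which shows why an overlay network may prefer a path with more hops when each hop is fast.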
Decoding
Once the SD-RTN™ routes the video traffic along the optimal path to the destination, the video traffic must again traverse the last mile between the final hop of the SD-RTN™ and the user’s device running an application leveraging the Agora SDK. The SDK supports optimal decoding of the stream by leveraging the following capabilities:
A Jitter Buffer temporarily stores received video packets, re-ordering them if necessary for uniform playback. This smooths out jitter and packet loss in real-time video streams, with an adaptive size that adjusts dynamically based on network conditions. The jitter buffer is carefully tuned to balance latency and quality.
Software and hardware decoding for the given codecs (e.g., H.264, H.265, VP8, VP9, and AV1) supported by Agora Coding Technology (ACT). The combination of software and hardware decoders helps speed up this process and reduce latency.
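The reordering role of a jitter buffer can be sketched with a small priority queue keyed by sequence number. This is a simplified illustration, not Agora's implementation: the `JitterBuffer` class is hypothetical, and it uses a fixed `depth` where the buffer described above adapts its size to network conditions.

```python
import heapq


class JitterBuffer:
    """Holds packets briefly and releases them in sequence-number order."""

    def __init__(self, depth=2):
        self.depth = depth      # packets to keep buffered (fixed in this sketch)
        self._heap = []         # min-heap keyed by sequence number
        self._next_seq = None   # lowest sequence number still playable

    def push(self, seq, payload):
        if self._next_seq is not None and seq < self._next_seq:
            return  # arrived too late: its playback slot has already passed
        heapq.heappush(self._heap, (seq, payload))

    def pop_ready(self):
        """Release the oldest packets while more than `depth` are buffered."""
        out = []
        while len(self._heap) > self.depth:
            seq, payload = heapq.heappop(self._heap)
            self._next_seq = seq + 1
            out.append((seq, payload))
        return out


buffer = JitterBuffer(depth=2)
played = []
for seq in [1, 3, 2, 5, 4, 6]:   # packets arrive out of order
    buffer.push(seq, f"frame-{seq}")
    played.extend(s for s, _ in buffer.pop_ready())
print(played)
```

A larger `depth` tolerates more reordering but adds delay, which is the latency-versus-quality balance the jitter buffer has to strike.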
Post-Processing
Once the video stream reaches its destination and is decoded in real time, it undergoes final enhancements to ensure an optimal viewing experience.
Auto-Adjust avoids device performance issues and overheating through real-time status monitoring and automatically re-configures settings for any resource-intensive features.
Super Clarity leverages deep learning to provide sharper and clearer video images, maintaining the same resolution while consuming minimal CPU resources.
Super Resolution upscales resolution and brings life-like clarity to end users. It also makes screen sharing content sharper and clearer.
Video Quality Assessment (VQA) utilizes deep learning to estimate video viewing quality in real time.
Conclusion
Agora’s Adaptive Video Optimization (AVO) delivers optimum live video quality at every step, from capture to playback.
Leveraging the three essential pillars of user experience—Image Quality, Video Fluency, and Ultra-Low Latency—AVO improves the live video experience, giving the audience a consistently excellent quality of experience and driving increased end-user engagement.
With ultra-low latency, consistently high video frame rates, and reduced frame freeze rates, AVO ensures a smoother real-time experience, even under challenging network conditions with packet loss. With the enhanced image quality provided by default features like Perceptual Video Coding (PVC) and Super Clarity, AVO offers users an unparalleled video communication experience characterized by detail, clarity, and fluidity.
These enhancements in user experience are showcased through our Agora Live App, currently accessible on the Google Play store for Android devices and on the Apple App Store for iPhone.