What it takes to build real-time voice and video infrastructure

In this series, WebRTC expert Tsahi Levent-Levi of BlogGeek.me provides an overview of the essential parts of a real-time voice and video infrastructure—from network to software updates. Check out his informative videos and read how Agora’s platform solves the challenges so you can focus on your innovation.

3.1 WebRTC

Category: Chapter 3: Clients

Browser-based RTE applications depend on WebRTC. Learn about the uses, pros, and cons of WebRTC.

Dive deeper: Read about The Past, Present, and Future of WebRTC.

Transcript

Let’s talk about the different clients that we’ve got in an RTE network, and we’ll start with WebRTC. Here’s what we’re going to do in this lesson: we’re going to introduce WebRTC, explain what exactly it is and how browsers support it, and then understand where it fits in the realm of a real-time engagement platform.


If I had to say what WebRTC is, then for me, this is the definition: WebRTC offers real-time communication natively from a web browser. If you take a modern web browser today, Chrome, Safari, Firefox, Edge, they all support WebRTC (web real time communications). What that means is that I don’t need to install anything or do anything in order to be able to communicate using voice and video in real-time within a browser, besides having a browser.

The second part of it is that WebRTC is a media engine with JavaScript APIs. This is important. We have a single API today in the browsers that JavaScript web developers can use, and it implements a media engine. The media engine is what we have in any Voice over IP or real-time communication platform where we need to manage voice and video in real time. So there is nothing new in WebRTC besides the fact that it’s now available in all browsers and there’s a JavaScript API to it.

WebRTC became an official standard specification of the W3C and the IETF, the two standards organizations, in January 2021. So this is a done deal, and it’s out there and available for all. One thing to remember: WebRTC is a technology and not a solution. I can build stuff with it, but it’s not the solution itself; it’s not my application. This is important because you cannot compare WebRTC to a specific application. You can only compare one application to another. Each one of them might or might not use WebRTC, and if they both use WebRTC, they might end up using it in different ways, and this is going to affect the user experience.

How do we make calls with WebRTC? We’ve got here an example with two browsers and a server. This is the classic example with WebRTC. In order for the user on the left side to find the user on the right side, he first needs to communicate through a web server that knows both of these users. So the user on the left is going to send a message, and this is going to be a message that isn’t part of WebRTC itself. It is related in the sense that it includes components that come from WebRTC, but the message itself is built and communicated by the application layer. So he is going to send a kind of an offer through the server to the other user, and that offer is going to tell him something like, “I want to call you and talk to you, and I want this to be a video call.” That’s just an example.

The other user receives the message, passes it to the WebRTC component inside the browser, gets back locally the answer that he wants, and then just forwards that back as an answer to the first user. Once this initial negotiation is complete, we’re going to have direct communication between the browsers. This is line number five here; this is where media is sent directly between these two browsers. Only WebRTC can do that within a browser: send data directly between browsers and in real time. We might want to route the media through a server instead, and that’s up to us. We might not have a choice other than routing through a server. But at the very basic level, this is how WebRTC works.

There are three main APIs in WebRTC. There’s getUserMedia, which gives us access to the camera, microphone, and screen of the device. Then there’s the peer connection, which does everything from encoding to decoding, to sending the data over the network, receiving it, and dealing with packet losses on the network, things that we will see later in this course. Then there’s the data channel, which is able to send arbitrary data directly between the browsers in a low-latency fashion.
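To make the offer and answer exchange a bit more concrete, here is a minimal sketch in TypeScript that simulates it in memory. Everything in it is an assumption made for illustration: the names SignalingServer, Peer, SignalMessage, and negotiatedWith are hypothetical, the "server" is just an object in the same process, and the sdp strings are placeholders. In a real browser the session descriptions would come from the peer connection, and the direct media path that follows the negotiation is not simulated here.

```typescript
// Hypothetical application-layer signaling, simulated in memory.
// None of these names are WebRTC APIs; they only illustrate the message flow.

type SignalMessage = {
  kind: "offer" | "answer";
  from: string;
  to: string;
  sdp: string; // placeholder for the session description the media engine would produce
};

type Handler = (msg: SignalMessage) => void;

// Stands in for the web server that knows both users.
class SignalingServer {
  private handlers = new Map<string, Handler>();

  register(userId: string, handler: Handler): void {
    this.handlers.set(userId, handler);
  }

  // Relay a message to its recipient without looking inside it.
  // Returns false when the recipient is unknown to the server.
  relay(msg: SignalMessage): boolean {
    const handler = this.handlers.get(msg.to);
    if (!handler) return false;
    handler(msg);
    return true;
  }
}

// Stands in for the application code running in one browser.
class Peer {
  readonly id: string;
  readonly log: string[] = [];
  negotiatedWith: string | null = null;
  private server: SignalingServer;

  constructor(id: string, server: SignalingServer) {
    this.id = id;
    this.server = server;
    server.register(id, (msg) => this.onMessage(msg));
  }

  // Send an offer to the other user through the server.
  call(remoteId: string): boolean {
    this.log.push(`send offer to ${remoteId}`);
    return this.server.relay({
      kind: "offer",
      from: this.id,
      to: remoteId,
      sdp: `offer-from-${this.id}`,
    });
  }

  private onMessage(msg: SignalMessage): void {
    this.log.push(`received ${msg.kind} from ${msg.from}`);
    this.negotiatedWith = msg.from;
    if (msg.kind === "offer") {
      // The callee forwards its answer back through the same server.
      this.server.relay({
        kind: "answer",
        from: this.id,
        to: msg.from,
        sdp: `answer-from-${this.id}`,
      });
    }
  }
}
```

Creating one SignalingServer and two Peer objects and calling call on the first with the second peer's id walks through the same steps as the diagram: the offer goes through the server, and the answer comes back the same way. The server only relays messages, which mirrors the point that signaling belongs to the application layer and not to WebRTC.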

So why all the fuss? I mean, WebRTC is just a media engine, and it exists in the browser. What does it do that is different or newer than what we had before it?

First and foremost, it’s free. You can go to webrtc.org, download the source code, and use it wherever you want. It’s open source in nature. It’s available on all browsers and has a JavaScript API, which means that the target audience here isn’t Voice over IP developers. It’s generic web developers. (I’m calling them generic, but they’re not.) It means these people have not been indoctrinated about what Voice over IP is, and for them this is just another feature in the browser. So they can do whatever they want with it.

For real-time engagement, this is important, because now people can envision the features, use cases, or scenarios that they want, and they can develop them directly on top of browsers with nothing that they need to install. They might still install things if they want to, but they don’t have to; it’s up to them how they deploy.

WebRTC sits at an intersection of worlds: Voice over IP, the traditional way in which, so far, we sent and received media between users (telephony, let’s call it), and then the internet, or the web. This intersection means that it brings with it a lot of challenges to people who know only VoIP or to people who know only web.

The advantages of using WebRTC? Well, there is no installation needed. It’s just there in the browser. It has a large ecosystem and backers, so you know that there are people around you who can help you, and there are many users, so it’s not a technology that is going to die tomorrow. And it is available mostly everywhere.

The challenges of using WebRTC are also quite a few. First, you’re not in control of the client implementation. Think about that for a second. You have the browser; the browser implemented WebRTC and decided which parts it needs and which it doesn’t. Then it gives you a JavaScript API. If the things that are there don’t fit your needs, there is nothing you can do about it besides looking for a workaround or deciding not to work in that browser. So you are not in control, and you need to live with that fact. The same applies to the fact that it might not be optimized for your specific use case. If your use case is different from what the developers of the browser decided to implement in WebRTC, then again, you need to figure out how to deal with that.
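One common way to live with not being in control of the client is to check what the browser exposes before relying on it. The sketch below is only an illustration under stated assumptions: the function name detectWebRtcSupport and the WebRtcSupport type are hypothetical, and the function takes the environment as a parameter so it can be tried outside a browser. The property names it looks for (navigator.mediaDevices.getUserMedia, RTCPeerConnection, and createDataChannel on the peer connection prototype) are the ones browsers expose for the three APIs mentioned earlier.

```typescript
// Hypothetical helper: reports which of the three main WebRTC APIs
// appear to be present on a given global-like object.
type WebRtcSupport = {
  getUserMedia: boolean;
  peerConnection: boolean;
  dataChannel: boolean;
};

function detectWebRtcSupport(env: any): WebRtcSupport {
  const mediaDevices = env?.navigator?.mediaDevices;
  const PeerConnection = env?.RTCPeerConnection;
  const hasPeerConnection = typeof PeerConnection === "function";
  return {
    // Camera and microphone access.
    getUserMedia: typeof mediaDevices?.getUserMedia === "function",
    // The media engine entry point.
    peerConnection: hasPeerConnection,
    // Data channels are created from a peer connection instance.
    dataChannel:
      hasPeerConnection &&
      typeof PeerConnection.prototype?.createDataChannel === "function",
  };
}
```

In a browser, this could be called as detectWebRtcSupport(globalThis), and the application could then decide whether to continue, fall back to a workaround, or tell the user that the browser is not supported. Presence of an API does not guarantee that it behaves the way a given use case needs, so this is a first check and not a full compatibility test.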

Last but not least, the browser implementations of WebRTC move fast and break things. They add features, they deprecate features, and they try things. They optimize the code over time, which changes the behavior, and from time to time it also breaks applications in production. So you need to keep in mind the pace of change in WebRTC within the browser deployments.

So, what did we see? WebRTC offers us built-in support in the browser for real-time communication applications and, by extension, for real-time engagement platforms. It is a good starting point for developers: you can take WebRTC and build applications with it. But it lacks a lot of the infrastructure pieces that are necessary for real-time engagement. You’ve got only the browser component, and a lot of the servers that we’re going to deal with and need are things that you have to add on your own on top of WebRTC. This is part of what you need in a real-time engagement platform. Thank you.