What it takes to build real-time voice and video infrastructure

In this series, WebRTC expert Tsahi Levent-Levi provides an overview of the essential parts of a real-time voice and video infrastructure, from network to software updates. Check out his informative videos and read how Agora’s platform solves the challenges so you can focus on your innovation.

6.2 Scale and Distribution

Category: Chapter 6: Signaling

Learn about how signaling scales, and the important concepts of causality and group chattiness.

Dive deeper: Agora provides cross-platform Real-Time Messaging and Chat SDKs.


We can’t talk about signaling without discussing scale and distribution. In this lesson, we’re going to review signaling and focus on scale and what we need to do there, and then we’re going to look at the aspects of causality and chattiness.


This is what I think of when I look at signaling. Signaling isn’t as simple as it looks; it’s like an iceberg. There’s how it usually seems (we saw that in the previous lesson, when we discussed the two primitives, send and receive), but there is a lot that goes into that. The same is true for scaling.

So, while it looks simple, there is a lot to it. There are a lot of things that are related to the scale of the service, and a lot of things we need to get to the bottom of in order to get signaling right at scale. We also need to know that signaling scales very differently than media.

If we take this orange box and say that it is 10 users, then a media server can handle about 100 users per server, or maybe 500 depending on the use case, sometimes 2,000, but that’s about it. So usually anywhere between 100 and 1,000. A signaling server, on the other hand, usually starts at around 10,000 users that it can handle and moves on from there.
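To make the difference in scale concrete, here is a rough back-of-the-envelope sketch. The per-server capacities are the approximate figures quoted above; the total of 100,000 concurrent users and the `servers_needed` helper are assumptions made only for illustration.

```python
import math


def servers_needed(concurrent_users: int, users_per_server: int) -> int:
    """Smallest number of servers able to hold the given number of users."""
    return math.ceil(concurrent_users / users_per_server)


# Assumption: 100,000 concurrent users, all in calls and all connected to signaling.
CONCURRENT_USERS = 100_000

# Per-server capacities are the rough ranges quoted in the lesson.
media_servers_worst = servers_needed(CONCURRENT_USERS, 100)
media_servers_best = servers_needed(CONCURRENT_USERS, 1_000)
signaling_servers = servers_needed(CONCURRENT_USERS, 10_000)

print(f"media servers:     {media_servers_best} to {media_servers_worst}")
print(f"signaling servers: {signaling_servers}")
```

Under these assumptions, the media tier has to scale out long before the signaling tier does, which is why scaling the signaling server tends to get postponed.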

So, there are different scales here that we need to deal with, and since the capacity of a signaling server starts so high, we usually only get to scaling the signaling server later in our development and implementation. That means we might not think about all the things related to the architecture that we need to put in place from day one.

When the time comes to actually scale a signaling server, the only way to do that, besides growing the server, is to open up additional servers and scale out. So instead of having one, we’re going to have a lot of different servers dealing with signaling. The problem is that there are now a lot of different issues we need to solve.

One of these problems is that we cannot really force groups of users onto the same server. Let’s say I decide to place users on the signaling server closest to where they are. I will be a user sitting in Israel, and you might be a user sitting in India or in the United States, and we’re not on the same server, geographically at least. We cannot even group users together in advance, because for most of these use cases we don’t know how big the group is going to be. I might speak with you today and someone else tomorrow. So the server that I’m on might need to be different.

How do we find which server a user belongs to, if I want to call you and reach you on whichever server you are registered on right now? All of these interactions across servers happen a lot more than they do with media servers.
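One common way to answer the "which server is this user on" question is a shared lookup table that every signaling server can read and write. The sketch below is only an illustration of that idea: `PresenceRegistry`, `route`, and the server and user names are all hypothetical, and a plain dictionary stands in for whatever shared store a real deployment would use.

```python
from typing import Dict, Optional


class PresenceRegistry:
    """Hypothetical shared table mapping user id -> signaling server id.

    A real deployment would keep this in a store that all servers can reach;
    here an in-memory dict stands in for it.
    """

    def __init__(self) -> None:
        self._location: Dict[str, str] = {}

    def register(self, user_id: str, server_id: str) -> None:
        # The latest registration wins, e.g. when a user reconnects elsewhere.
        self._location[user_id] = server_id

    def unregister(self, user_id: str, server_id: str) -> None:
        # Only remove the entry if it still points at the server asking,
        # so a stale disconnect cannot erase a newer registration.
        if self._location.get(user_id) == server_id:
            del self._location[user_id]

    def locate(self, user_id: str) -> Optional[str]:
        return self._location.get(user_id)


def route(registry: PresenceRegistry, sender_server: str, recipient: str) -> str:
    """Decide how the sender's server should handle a message to `recipient`."""
    target = registry.locate(recipient)
    if target is None:
        return "offline"            # nowhere to deliver right now
    if target == sender_server:
        return "local"              # recipient is connected to this server
    return f"forward:{target}"      # hand off to the recipient's server


registry = PresenceRegistry()
registry.register("caller", "sig-eu")   # hypothetical user and server names
registry.register("callee", "sig-us")
print(route(registry, "sig-eu", "callee"))
```

Every cross-server message now costs a lookup plus a forward, which is one reason the shared store itself can turn into a bottleneck.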

Also, remember that we have a lot of messages that go out of the scope and out of the context of the actual media session. How do I deal with database connectivity and the bottlenecks there? A million users in a database who need to be live, because they are sending messages to each other, have a different set of requirements than someone on a call who doesn’t send a lot of messages. Then there’s the problem of guaranteeing the source of truth. I want to show an example of what that means, and I’ll do that through the concept of causality.

We have a group session, and in that group session we have three participants. The green person here is going to send a message to everyone, so both other participants should receive M1, the first message in this conversation. Then the orange person reads M1 and sends a message, M2, to the group. His message might reach the third person before the message from the green person does. So the third person might receive M2 before he receives M1. Is that okay? Are we fine with it? If we want to fix that, what we’re looking for is causality within our messaging system for group sessions. That means a person only gets to read a message after they have received all the earlier messages that its sender had already sent or seen. So M1 needs to be received before M2, or M2 needs to wait until M1 has been received.
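The M1 and M2 scenario above can be sketched as a small receive-side buffer. This is a minimal illustration, not how any particular messaging product does it: it assumes each message carries the ids of the messages its sender had already seen (the `depends_on` field), and `Message` and `CausalInbox` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Message:
    msg_id: str
    sender: str
    # Assumption: the sender attaches the ids of the messages it had already seen.
    depends_on: FrozenSet[str] = frozenset()


class CausalInbox:
    """Holds a message back until everything it depends on has been delivered."""

    def __init__(self) -> None:
        self.delivered: List[str] = []      # ids in the order shown to the user
        self._seen: Set[str] = set()
        self._pending: List[Message] = []

    def receive(self, msg: Message) -> None:
        already_known = msg.msg_id in self._seen or any(
            p.msg_id == msg.msg_id for p in self._pending
        )
        if already_known:
            return                          # ignore duplicates
        self._pending.append(msg)
        self._drain()

    def _drain(self) -> None:
        # Keep delivering until no pending message has all of its dependencies met.
        progressed = True
        while progressed:
            progressed = False
            for msg in list(self._pending):
                if msg.depends_on <= self._seen:
                    self._pending.remove(msg)
                    self._seen.add(msg.msg_id)
                    self.delivered.append(msg.msg_id)
                    progressed = True


m1 = Message("M1", sender="green")
m2 = Message("M2", sender="orange", depends_on=frozenset({"M1"}))

third_person = CausalInbox()
third_person.receive(m2)    # arrives first over the network, so it is held back
third_person.receive(m1)    # unblocks M2
print(third_person.delivered)
```

The price of this guarantee is extra metadata on every message and a buffer on every receiver, both of which grow with the size and activity of the group.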

Then there is group chattiness, which is a problem in large groups. Let’s say the only thing we want to do is read receipts. If I send a message, everyone simply tells me whether they received it, because I want to show that in the UI. If we have a group of two people, that’s easy: I send a message and the read receipt is the message back. If there are 3 participants, we now have 6 messages, because now I am sending a message to the third person as well. He’s replying to me with a read receipt, but the same exchange also needs to happen between the other two participants, so that they can see the read receipts. And if I have 4 participants, we now have 12 messages. If I have n participants, we have n multiplied by (n-1), which is kind of a lot. This means the traffic does not grow linearly with the group size; it grows roughly with the square of it. We have a lot more messages, and the service will be very, very chatty, just because of the messaging indicators that need to go around. So we need to pay special attention to groups and how we scale them.
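The n multiplied by (n-1) count above is easy to check with a few lines of code. In this sketch, `receipt_traffic` applies the formula from the lesson and `ordered_pairs` counts the same thing by brute force, one message per ordered (sender, receiver) pair; both function names are made up for illustration.

```python
from itertools import permutations


def receipt_traffic(participants: int) -> int:
    """Messages in the group, using the lesson's n * (n - 1) count."""
    return participants * (participants - 1)


def ordered_pairs(participants: int) -> int:
    """Same count by brute force: one message per ordered (sender, receiver) pair."""
    members = range(participants)
    return sum(1 for _ in permutations(members, 2))


for n in (2, 3, 4, 10, 100):
    assert receipt_traffic(n) == ordered_pairs(n)
    print(f"{n:>4} participants -> {receipt_traffic(n):>5} messages")
```

Going from 10 to 100 participants multiplies the group size by 10 but the message count by 110, which is the non-linear growth described above.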

So here are two things we need to deal with when talking about signaling and scaling. We’ve got the scaling, or decentralization, of the servers themselves: how do we go from one server doing all of our signaling to multiple servers that are globally distributed? Then we need to think about the group sizes we can support and the chattiness of group messaging. You should take care of scaling early in your design. Doing it later will be a lot more of a problem. Thank you.