Audio, Video and Screenshare with Agora.io in React.
During the development of timefic.com, a meeting management application, I needed to add audio and video to it.
Today, with WebRTC supported by most browsers, even on mobile, it is much simpler to add this to your website.
But, like with everything new, it takes a lot of iterations until you find the right way to understand, structure and integrate the “WebRTC code” into your existing codebase.
So, we are going to divide the problem into 4 categories:
- A React Component: In charge of the lifecycle of the media elements and method calling orchestration.
- WebRTC Methods: Basically native methods provided by the WebRTC service of your choice, separated to achieve only one purpose each.
- DOM manipulation: Although I am using React on my website, the media elements are created and removed explicitly calling DOM operations (we are going to see why in short).
- Actions: Just wrappers for a series of method calls, convenient to keep outside the component for better legibility and understanding.
The React Component
This component is in charge of all your WebRTC needs. I decided to have everything in one place because one thing I learned is that you need to be sure this component is never unmounted (unless you want to exit): on unmount you are going to stop the service, and therefore the audio, video, etc.
For example, I initially had the video inside a tab in the sidebar. So, whenever I changed the tab the audio was lost…
So, this component should be located at (or near) the root of your component hierarchy, while the nodes that will actually hold the video elements can be somewhere else. In fact, the component receives the ids of these nodes as props, so it can “paint” the DOM where it is needed.
Note: This component assumes that you already have loaded the script of your provider, in my case, I am using this:
So, let’s dissect each part of this component:
- Class properties: All properties (from AgoraRTC to shareVideoProfile) are used by the methods we will explain later. Some are initialized with a default value, and the others obtain their value inside a method. Because they are used across all the methods, they are defined on ‘this’.
- State: The state holds the streams initiated by the local peer and the streams that come from the other connected peers. We need the lifecycle method componentDidUpdate to react to stream changes, so the streams must live in the state.
- UpdateState: This is a custom method that “enhances” this.setState with a callback to a function passed as props: syncStreamList. This makes it possible to inform a store outside this component that the stream list has changed. In my case, I show labels with the name of the connected peer over each video, and those labels are not managed by the WebRTC component.
- Constructor: Just a regular one, with binding this to some methods for convenience.
- ComponentDidMount: The video peer is created and, if the peer is also sharing the screen, a second (screen) peer is created.
- ComponentWillReceiveProps: Used to enable/disable audio, video and screenshare when the corresponding props change.
- ComponentDidUpdate: Called after state is changed and used to call the method that updates the streams in the DOM.
- ComponentWillUnmount: Closes everything.
- Render: Just returns null. Why? Because the DOM elements are created with the RTC provider’s API, not by React, so in this case it needs to be like that.
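The pieces above can be condensed into a skeleton. This is a sketch, not the real component: the class, prop and method names come from the description above, the lifecycle bodies are stubbed out, and a small stand-in replaces React.Component so the snippet is self-contained.

```javascript
// Stand-in for React.Component so this sketch runs on its own;
// in the real app you would import React and extend React.Component.
class ComponentStub {
  constructor(props) { this.props = props; this.state = {}; }
  setState(partial, callback) {
    Object.assign(this.state, partial);
    if (callback) callback();
  }
}

class RTCComponent extends ComponentStub {
  constructor(props) {
    super(props);
    this.client = null;       // main audio/video client
    this.shareClient = null;  // second client, only for screenshare
    this.state = { streamList: [] };
    this.updateState = this.updateState.bind(this);
  }

  // setState "enhanced" with a callback that informs an outside store
  updateState(partial) {
    this.setState(partial, () => this.props.syncStreamList(this.state.streamList));
  }

  componentDidMount() { /* start the video peer (and the screen peer if sharing) */ }
  componentDidUpdate() { /* sync the stream list into the DOM */ }
  componentWillUnmount() { /* close streams and clients */ }

  // The SDK creates the DOM nodes imperatively, so React renders nothing
  render() { return null; }
}
```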
The WebRTC Methods
These methods are the ones (and the only ones) that talk to the WebRTC service API. So, I extracted DOM manipulation, React lifecycles and everything else that doesn’t fit that definition.
Remember all code is here.
Init Service
First, we must create a client with our service of choice. We will need one client per stream: if a peer needs to emit audio/video, it will create one client for that, and if it also needs screenshare, it will create another. The output is the clients created, stored at this.client and this.shareClient.
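As a sketch, the two clients could be created like this. The mode and codec options are assumptions (check your SDK version’s createClient signature), and the SDK object is passed in so the helper is easy to test; in the component it would be the global AgoraRTC loaded from the provider’s script.

```javascript
// Create the main client and, only if the peer shares the screen,
// a second one.
function createClients(AgoraRTC, withScreenShare) {
  const options = { mode: 'live', codec: 'h264' }; // assumed options
  const client = AgoraRTC.createClient(options);
  const shareClient = withScreenShare ? AgoraRTC.createClient(options) : null;
  return { client, shareClient };
}
```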
Subscribe Stream Events
The main client (this.client) is going to be aware of:
- Stream added: It will subscribe to that stream.
- Stream subscribed: The consequence of the previous event; when the peer is subscribed we are going to add the stream to the DOM.
- Peers that left the meeting: We will remove the stream.
The share client will not subscribe to these events because it is used only to create a second stream on the same peer.
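The three events above can be wired up as follows. The event names match the classic Agora Web SDK; addStreamToDom and removeStreamFromDom are placeholders for the DOM operations described later.

```javascript
// Wire the main client's stream events. Only this.client subscribes;
// the share client never does.
function subscribeStreamEvents(client, addStreamToDom, removeStreamFromDom) {
  // A remote peer published a stream: ask to receive it
  client.on('stream-added', (evt) => client.subscribe(evt.stream));

  // The subscription completed: put the stream into the DOM
  client.on('stream-subscribed', (evt) => addStreamToDom(evt.stream));

  // A peer left the meeting: remove its stream
  client.on('peer-leave', (evt) => removeStreamFromDom(evt.uid));
}
```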
Join Channel
When we use audio/video and screenshare we need 2 clients, but we need these clients to join the same channel. This may not be the only way to do it, but it is clear in the sense that the screen is just another stream inside the same “room”.
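A sketch of that join step, assuming the classic client.join(token, channel, uid, onSuccess) signature (the token may be null while testing); both clients use the same channel name.

```javascript
// Join the same channel with the main client and, when present,
// with the share client, so the screen is just another stream
// inside the same "room".
function joinChannel(client, shareClient, channelName, token) {
  client.join(token, channelName, null, (uid) => {
    // uid assigned by the service for the audio/video stream
  });
  if (shareClient) {
    shareClient.join(token, channelName, null, (shareUid) => {
      // a second uid, for the screen stream
    });
  }
}
```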
Stream Config
The stream needs a different configuration for audio/video and for screen share. The image shows that (but remember all code is available here).
Stream Init
Here we actually initialize the local stream, call the DOM method to add the node, and publish the stream so the connected peers subscribed to this channel can see it appear about 2 seconds after it is created.
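Putting config and init together as a sketch: the createStream option names follow the classic Agora Web SDK and may differ in your version, and addStreamToDom stands for the DOM operation described later.

```javascript
// Build the stream config (camera/mic vs. screen), initialize the
// local stream, add it to the DOM and publish it to the channel.
function initLocalStream(AgoraRTC, client, uid, isScreen, addStreamToDom) {
  const config = isScreen
    ? { streamID: uid, audio: false, video: false, screen: true }
    : { streamID: uid, audio: true, video: true, screen: false };
  const stream = AgoraRTC.createStream(config);
  stream.init(() => {
    addStreamToDom(stream); // local node, played immediately
    client.publish(stream); // remote peers see it a moment later
  });
  return stream;
}
```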
Manage Audio and Video
The API also allows us to enable and disable the audio and the video of the local peer. Because in timefic.com we can moderate the conversation or have only some peers with video enabled, we need to call these methods once the component is mounted and whenever the props coming from the outside change.
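As a sketch, with the enable/disable method names from the classic Agora Web SDK’s local stream API (newer SDK versions use different names, so check yours):

```javascript
// Toggle the local stream's tracks from the props the component receives.
function manageAudio(stream, enabled) {
  if (enabled) stream.enableAudio();
  else stream.disableAudio();
}

function manageVideo(stream, enabled) {
  if (enabled) stream.enableVideo();
  else stream.disableVideo();
}
```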
The mental model to have is here:
As you can see, every peer publishes its own streams and subscribes to the streams generated by the other peers. How do they know about each other? Because they are on the same channel.
The DOM operations
DOM operations are simple: they look for elements by their id and create or remove them from the DOM. They can also be used to style the nodes and make imperative calls on the streams (like the stream.play method on updateStream). This is actually the reason why we don’t use React to render these nodes.
This operation touches the DOM indirectly: it updates the stream list that is in the state, and then the lifecycle method componentDidUpdate calls the Update Streams operation that finally touches the DOM.
This operation finds and removes the element from the DOM and updates the state to keep it in sync.
When there are state changes, this method checks if there is a new DOM element that needs to be created and, after that, played. Because elements may need different styling (coming from outside) this operation uses the Update Stream Style utility.
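A sketch of that update step. The document object is injected so the helper is testable outside the browser (in the component it is just document); getId() and play() follow the classic Agora Web SDK stream API, and the container id is a made-up example. Styling via the Update Stream Style utility is omitted.

```javascript
// For each stream in the state, create its DOM node if it is missing
// and start playback inside it.
function updateStreams(doc, streamList, containerId) {
  streamList.forEach((stream) => {
    const id = String(stream.getId());
    if (!doc.getElementById(id)) {
      const node = doc.createElement('section');
      node.setAttribute('id', id);
      doc.getElementById(containerId).appendChild(node);
      stream.play(id); // the SDK injects the <video> element into the node
    }
  });
}
```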
The Actions
Actions are just wrappers that call methods in a sequence that makes sense.
Start Peer
Call the following methods (by chaining promises in this order):
- Init Service
- Subscribe Stream Events
- Join Channel
- Stream Config
- Stream Init
- Manage Audio
- Manage Video
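The steps above can be chained as a single action. This is a sketch: each step is assumed to be a method on the component that can be chained with promises.

```javascript
// Start Peer: run the WebRTC methods in order by chaining promises.
function startPeer(methods) {
  return Promise.resolve()
    .then(methods.initService)
    .then(methods.subscribeStreamEvents)
    .then(methods.joinChannel)
    .then(methods.streamConfig)
    .then(methods.streamInit)
    .then(methods.manageAudio)
    .then(methods.manageVideo);
}
```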
Start Share
Call the following methods:
- Init Service
- Join Channel
- Stream Config
- Stream Init
As you can see, this second sequence does not have the Subscribe Stream Events step. This is because all subscriptions are already managed on the channel that the peer joined in the Start Peer action.
Stop Share
Just removes the screen share stream and closes it.
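A sketch of that teardown, assuming the share client and stream created earlier; unpublish, close and leave exist on the classic Agora Web SDK client and stream objects, and removeStreamFromDom is the DOM operation described above.

```javascript
// Stop Share: remove the screen stream from the DOM, stop publishing
// it, release the capture and leave with the share client.
function stopShare(shareClient, shareStream, removeStreamFromDom) {
  removeStreamFromDom(shareStream.getId());
  shareClient.unpublish(shareStream);
  shareStream.close();
  shareClient.leave();
}
```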
Some lessons learned that can save you some time:
- Chrome requires an extension for screen sharing. For Agora, for example, that extension can be found in the Chrome Web Store, but it works only with the demo they have available here. So, for localhost development, I had to download the extension, change its manifest.json to allow localhost, and then load it in Chrome instead of the provider’s one. In production, I will need to upload an extension to the Chrome Web Store pointing to timefic.com.
- Be careful with your global CSS styles. I had (I don’t remember why) the video tag’s CSS property backface-visibility set to hidden, so I was going crazy because I was not able to see my local stream, only a black square… It was just CSS. Local streams are rotated 180deg so that you see yourself in a more natural, mirrored way, but because of that I was seeing only the dark side… 😝.
- Choose the right provider: So far I am satisfied with Agora.io because the support in Slack is great and this is very important when you are just starting!
I hope this post saves you some headache, cheers from Chile 🇨🇱!