How to Create an Online Karaoke App Using Agora SDK

Content is king, but video content in particular is emperor when it comes to digital outreach and audience impact. Did you know…

The average user watches 40 minutes of YouTube video on their device every day
6 billion hours of YouTube videos are watched every month
Facebook posts with videos are shared 2X more than text- or image-based content
Facebook Live videos are watched 3X longer than other videos

Almost every online platform, big or small, is finding some way to incorporate video chat or other interactivity in order to boost usage and user engagement. Otherwise they run the risk of losing relevancy with each generation.

For instance, let’s look at one of the digital powerhouses: Facebook. At its F8 Conference in 2018, Facebook announced one-to-one and group video chat for Instagram Direct. This feature allows users to connect with each other using real-time video chat, even if they cannot physically be together. There is no doubt that the social media giants are paying more attention to implementing real-time audio and video technology on their platforms.

Agora is already helping many businesses transform their traditional applications so that they excel in the video revolution. Developers and companies turn to Agora to bring an unparalleled level of excellence to their apps. Here’s just one example. Well before Instagram’s announcement, Momo, the top social and dating application in China, had started to explore a new use case for real-time audio and video communication that is becoming popular in the market: “KTV Together.”

KTV refers to Karaoke Television: people sing along with recorded music using a microphone. KTV is heavily ingrained in Southeast Asian culture. Traditionally, people needed a karaoke box and a TV to get started. Since nearly everyone has a smartphone now, online karaoke is a new area to explore in the social media market, one that the Agora SDK is more than ready to help you discover and dominate.

How Does “KTV Together” Work?

The following demo video shows the specific workflow and basic features:

After the host creates a chat room, they can turn on the KTV feature and grant microphone access.
The room host picks a song online (through a karaoke database, such as Karafun). The participants and the host can start singing along with the song by following the video and subtitles.
Other audience members can send requests to take over the microphone, letting them pick a different song and sing.
Singers can also adjust the volume and the background music.
The room host can play, pause, or skip the current song.
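
The workflow above can be sketched as a small in-memory room model. This is only an illustration of the microphone handoff logic, not Agora SDK code: the `KtvRoom` class and all of its method names are hypothetical, and a real app would keep this state on a server and broadcast changes to every client.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Deque, Optional


@dataclass
class KtvRoom:
    """Hypothetical in-memory model of a KTV room (not an Agora SDK type)."""
    host: str
    ktv_enabled: bool = False
    singer: Optional[str] = None  # who currently holds the microphone
    requests: Deque[str] = field(default_factory=deque)  # pending mic requests
    playing: bool = False

    def enable_ktv(self) -> None:
        # Turning on the KTV feature makes the host the first singer.
        self.ktv_enabled = True
        self.singer = self.host

    def request_mic(self, user: str) -> None:
        # Audience members queue up; duplicate requests are ignored.
        if self.ktv_enabled and user != self.singer and user not in self.requests:
            self.requests.append(user)

    def approve_next(self) -> Optional[str]:
        # The host hands the microphone to the longest-waiting requester.
        if self.requests:
            self.singer = self.requests.popleft()
            self.playing = False  # the new singer picks a song before playback resumes
        return self.singer

    def set_playing(self, actor: str, playing: bool) -> bool:
        # Only the host may play or pause; returns whether the change was applied.
        if actor != self.host:
            return False
        self.playing = playing
        return True
```

For example, after `room = KtvRoom(host="host")` and `room.enable_ktv()`, two calls to `room.request_mic(...)` queue two audience members, and `room.approve_next()` passes the microphone to whoever asked first.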

Example of KTV Together on Momo, the top social and dating application in China

The Challenges of Online KTV

KTV is very similar to the basic live broadcasting use case. However, it has its own challenges.

Song Synchronization Control

Online KTV emphasizes participants singing “together.” The room host can invite multiple listeners to sing, and everyone has a chance to shine in the chat room.

In this process, “microphone” access can be passed to different audience members in sequence, while the host still controls playback, such as playing or pausing the song. However, if we use RTMP for transmission, the network delay is relatively high. Even under good network conditions, when the host pauses or skips the song, it may take 3-4 seconds for the audience’s clients to receive the change. Or the song may have already started while the next singer’s client has not yet started playing it. If the network is poor, the delay may exceed 10 seconds.
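
To make the synchronization problem concrete, here is a minimal sketch of one common way to compensate for control-message delay: the host stamps each command with the song position and the send time, and each client skips ahead by the transit time. This is an assumption for illustration, not a description of how Agora handles it; `PlaybackCommand` and `local_position_ms` are hypothetical names, and the sketch assumes the host and listener clocks are already synchronized.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlaybackCommand:
    """Hypothetical control message sent by the host (not an Agora SDK type)."""
    action: str        # "play" or "pause"
    position_ms: int   # song position when the host issued the command
    sent_at_ms: int    # host clock, assumed synchronized with listeners


def local_position_ms(cmd: PlaybackCommand, now_ms: int) -> int:
    """Song position a listener should be at when handling cmd at now_ms."""
    if cmd.action == "pause":
        # A paused song stays where the host stopped it, however late we hear.
        return cmd.position_ms
    # For "play", skip ahead by the time the command spent in transit.
    transit_ms = max(0, now_ms - cmd.sent_at_ms)
    return cmd.position_ms + transit_ms
```

With a 3-second delivery delay, a listener who receives a "play from 0" command would start at the 3-second mark instead of at the beginning. That keeps the listener aligned with the host, but the first seconds of the song are lost, which is why a lower-latency transport matters.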

High Sound Quality and Image Quality

Singers want the best experience so that they can show off their singing skills. If the platform cannot provide high sound quality, the user experience will suffer. The music video image quality in this scenario is equivalent to the video quality in one-on-one or group communication, so issues such as jitter or blur also exist in online karaoke apps. If developers want to build the platform entirely on their own, they have to implement transmission over UDP and optimize their strategies for edge node deployment, network congestion, weak network transmission, etc.
At the same time, sound and image quality and low latency pull in opposite directions: improving one tends to cost the other. In addition to the transmission strategy, developers need to optimize codec algorithms as much as possible to reduce audio and video delay on the client side.

How Should We Solve It?

Here is the logic as shown above:

The room host turns on the “KTV together” feature and becomes the lead singer.
The host’s client pulls and reads data from the third-party karaoke database.
An audience member sends a request to gain “microphone” access.
After the host approves the request, the audience member can pick songs from the karaoke database.
The host’s voice and the background music are transmitted to the cloud (Agora’s SD-RTN) over UDP.
The host’s audio and background music are then delivered to the audience side over UDP.
When it is the next singer’s turn, that singer gets “microphone” access. Apart from music control, each singer in the sequence should have the same access as the host.
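
The access rule in the last step can be sketched as a permission table. The role and permission names below are hypothetical and chosen only to illustrate the rule that a singer has everything the host has except music control; they are not part of the Agora SDK.

```python
from enum import Enum, Flag, auto


class Permission(Flag):
    PUBLISH_AUDIO = auto()  # send voice to the room
    PICK_SONG = auto()      # choose from the karaoke database
    CONTROL_MUSIC = auto()  # play, pause, or skip the current song


class Role(Enum):
    HOST = "host"
    SINGER = "singer"
    AUDIENCE = "audience"


HOST_PERMISSIONS = (
    Permission.PUBLISH_AUDIO | Permission.PICK_SONG | Permission.CONTROL_MUSIC
)

ROLE_PERMISSIONS = {
    Role.HOST: HOST_PERMISSIONS,
    # A singer gets everything the host has except music control.
    Role.SINGER: HOST_PERMISSIONS & ~Permission.CONTROL_MUSIC,
    # Listeners hold no permissions until the host approves a mic request.
    Role.AUDIENCE: Permission(0),
}


def can(role: Role, permission: Permission) -> bool:
    """Return True if the given role holds the given permission."""
    return permission in ROLE_PERMISSIONS[role]
```

Deriving the singer's permissions from the host's, rather than listing them separately, means any permission later added to the host is granted to singers automatically unless it is explicitly excluded.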

The Advantages of Using Agora SDK for an Online Karaoke App

High-Quality Audio and Video Experience

In terms of audio, the Agora SDK provides industry-leading audio and video encoding/decoding technology, which supports high-quality audio at a 44.1 kHz sampling rate and 192 kbps, comparable to traditional KTV. The SDK also supports 720p and 1080p high-definition video transmission to ensure clear video on the client side.
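
To put those audio figures in perspective, the short calculation below compares the quoted 192 kbps stream with uncompressed audio at the same sampling rate. The 16-bit sample size and two channels are assumptions (CD-style stereo PCM); the article itself only gives the sampling rate and the bitrate.

```python
SAMPLE_RATE_HZ = 44_100  # from the text
ENCODED_KBPS = 192       # from the text
BITS_PER_SAMPLE = 16     # assumption: CD-style PCM
CHANNELS = 2             # assumption: stereo


def pcm_kbps(sample_rate_hz: int, bits_per_sample: int, channels: int) -> float:
    """Bitrate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


def bytes_per_minute(kbps: int) -> int:
    """Bytes needed to carry one minute of audio at the given bitrate."""
    return kbps * 1000 * 60 // 8


raw_kbps = pcm_kbps(SAMPLE_RATE_HZ, BITS_PER_SAMPLE, CHANNELS)
compression_ratio = raw_kbps / ENCODED_KBPS

print(f"uncompressed: {raw_kbps} kbps")
print(f"compression ratio: {compression_ratio:.2f}x")
print(f"one minute at {ENCODED_KBPS} kbps: {bytes_per_minute(ENCODED_KBPS)} bytes")
```

Under these assumptions the uncompressed stream is 1411.2 kbps, so a 192 kbps encoding is roughly a 7x reduction, and one minute of singing costs about 1.44 MB of audio data.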

Lower Video/Audio Latency and Better Synchronization

With Agora’s SD-RTN™ real-time communication network, audio and video data are transmitted over a private UDP-based protocol with lower latency. That’s why the Agora SDK can provide better audio and video synchronization between hosts and audiences. At the same time, SD-RTN™ uses algorithms to optimize the routing path, automatically avoiding network congestion and limiting the impact of network failures.
SD-RTN™ has nearly 200 data centers deployed around the world and supports more than 200 countries and regions, so it can power applications across the global market.

Support Voice Control

In addition to the basic KTV features, Agora’s SDK allows developers to add host controls in their application, such as volume adjustment, skipping songs, switching microphone access, etc.

How will you use Agora SDK for your Karaoke app? Do you plan to create a whole new Karaoke function from scratch or make it an add-on to your platform? We want to see the brilliance you brainstorm and how you apply Agora to make your vision a reality.
