Multimodal Communications in the Metaverse

Summary: In this blog post we will discuss how enabling multimodal forms of communication across different devices can help drive the adoption of spatially oriented virtual spaces.

There are many advantages to supporting both native and web applications. Doing so allows you to reach diverse types of users across a variety of devices. While web applications offer wider reach, native applications provide access to device-specific features and functionality, resulting in more comprehensive and functional experiences. However, as we consider user experience on different devices, it is important to acknowledge that a one-size-fits-all approach is neither practical nor desirable, especially within the context of shared 3D virtual spaces.

In this post, we dive deeper into the ways that real-time engagement solutions, along with quality-of-life improvements like 3D spatial audio and AINS (AI Noise Suppression), can help instill a sense of presence for any user, on any device. This will be a key driving factor in the widespread adoption of 3D virtual spaces and the metaverse.

Asymmetrical User Experiences

The 2D internet has its advantages. It is easy to access and universally available. Yet its flat nature restricts our ability to feel a sense of depth, spatial awareness, and presence in virtual environments. Traditional 2D websites and applications are functionally a series of text blocks, lists, and forms that we can interact with. Even real-time interactive video, which has emerged in recent years as the dominant medium for online content, has its limits: the term “Zoom fatigue” has seeped into the lexicon of the global workforce. We have all begun to experience the limitations of interactivity in 2D spaces. So where do we go next?

Companies experimenting with mixed reality and real-time rendered 3D spaces aim to address this pain by enabling embodied interactivity in digital environments. These technologies have the potential to revolutionize the way we interact with digital content, allowing us to move beyond the constraints of 2D space and offer engagement opportunities that more closely resemble our physical interactions.

However, the transition to a 3D interaction design paradigm presents a steep learning curve for both users and developers. Just as video games went through the proverbial teenage years of 3D gaming in the 1990s and early 2000s, modern digital natives will experience facets of this awkward transition over the next few years across a broad spectrum of online interactions. Moreover, as we discussed in part 1 of this series, ubiquitous, cross-platform availability is a key component in driving mainstream adoption of the metaverse. This means that the transition to 3D will require a willingness to experiment with applications that offer both 2D and 3D interfaces, depending on the end user’s device type. This is where the concept of asymmetrical user experiences comes into play.

Context-appropriate interactions

Asymmetrical user experiences are an extension of the responsive design trend that took hold around 2010, when designers were looking for ways to adapt to the growing number of devices that people used to access the internet. As we move from 2D to 3D interactivity, we need to apply the same wisdom and craft asymmetrical user experiences whose interactions are appropriate to the affordances of each type of device.

An affordance is a potential action that a user can take in relation to an object. For our purposes, affordances suggest how a device can be used to interact with a virtual environment. To illustrate, let us look at the interactions afforded to virtual reality (VR), augmented reality (AR), desktop, and mobile users.

When wearing VR and AR headsets, users have a greater sense of immersion. The affordances of these devices approximate physical realism and responsiveness, allowing for more intuitive interactions. For example, a user can reach out and physically manipulate digital objects. This creates a more natural and immersive experience, enabling users to perform actions that feel closer to how they would behave in the real world. Likewise, the naturalistic movement afforded by these technologies enables users to engage in more personal real-time conversations that are closer to the types of interactions that they might have in co-located, physical environments.

On the other side of the spectrum, we have desktop users, who typically rely on a mouse and keyboard for interaction. The affordances here are narrower, as users’ actions are primarily limited to clicking, dragging, and typing. While these interactions are familiar and efficient for many tasks, they lack the tactile and sensory feedback that is present in AR/VR. However, desktop users have access to a wide range of software and applications that may not be available on, or optimized for, other devices. The once-limiting array of text blocks, lists, and forms now becomes a set of tools for efficient multitasking.

Mobile devices, with touch screen and sensor-based interactions, provide a balance between the spatial freedom afforded to AR/VR users and the multitasking efficiency of desktop users. While the touch screen lacks the physical feedback of buttons or controllers, it allows for intuitive interactions. Additionally, the camera allows users to participate in mixed reality experiences, even if the smaller screen size limits the users’ perceived sense of immersion.

The key takeaway is that any virtual space, particularly a spatially oriented one, should enable users to interact with it in a way that makes sense for the tools in their hands. An excellent example of this concept is Beame, a telepresence platform by Aetho, which enables users to “communicate and collaborate meaningfully by teleporting people and content into [their] real world environment.”

Source: Beame.me

The core value proposition of Beame is that it enables users to meet “face-to-face” using avatars in a virtual or mixed reality environment. Users can access the digital realm using AR/VR headsets, or even through a mobile device if a headset is unavailable. In addition, a web portal enables anyone to participate in the meeting using more traditional 2D real-time engagement features, like video conferencing and text chat. Altogether, the cross-platform nature of Beame, and its asymmetrical user experience considerations for 2D and 3D users, yield an inclusive platform that enables everyone to participate in a context-appropriate manner.
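To make the idea of asymmetrical user experiences more concrete, here is a minimal sketch of how an application might pick an interface per device. It is only an illustration: the names `Device`, `Experience`, `EXPERIENCES`, and `experience_for` are hypothetical, and the mapping is our own reading of the affordances described above, not Beame's or any other product's actual implementation.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Device(Enum):
    """Device categories discussed in this post (hypothetical names)."""
    VR_HEADSET = "vr_headset"
    AR_HEADSET = "ar_headset"
    MOBILE = "mobile"
    DESKTOP_WEB = "desktop_web"


@dataclass(frozen=True)
class Experience:
    interface: str              # "3d" or "2d"
    inputs: Tuple[str, ...]     # how the user acts on the space
    channels: Tuple[str, ...]   # how the user communicates with others


# Assumed mapping from device to a context-appropriate experience.
# A real product would tune this to its own features and user research.
EXPERIENCES = {
    Device.VR_HEADSET: Experience(
        "3d", ("hand and body movement",), ("avatar", "voice")),
    Device.AR_HEADSET: Experience(
        "3d", ("hand and body movement",), ("avatar", "voice")),
    Device.MOBILE: Experience(
        "3d", ("touch screen", "camera and sensors"), ("avatar", "voice")),
    Device.DESKTOP_WEB: Experience(
        "2d", ("mouse", "keyboard"), ("video conferencing", "text chat")),
}


def experience_for(device: Device) -> Experience:
    """Return the experience that fits the tools in the user's hands."""
    return EXPERIENCES[device]


if __name__ == "__main__":
    for device in Device:
        exp = experience_for(device)
        print(f"{device.value}: {exp.interface} via {', '.join(exp.inputs)}")
```

Keeping the mapping in one table means that every participant joins the same shared session, while the interface each person sees is decided in a single, easy-to-review place.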

Build simultaneously for any metaverse scenario

By considering the needs and preferences of a diverse range of users, we can create virtual experiences that are accessible, inclusive, and appealing to a broad audience. This, in turn, will drive the adoption of 3D virtual spaces, as more users will be exposed to the benefits of more immersive and interactive virtual environments, even from the comfort of familiar 2D interfaces.

Agora’s platform-agnostic solutions have been designed to work seamlessly across a broad spectrum of device types, making Agora an essential RTE (real-time engagement) platform for enhancing virtual communication and collaboration. Whether you are working on a web or native application or building a virtual world with a 3D engine like Unity or Unreal, Agora can help you connect with your audience in real time across devices.

Agora also offers extensions to enhance the experience for all users. For example, the 3D spatial audio extension helps users sense where others are relative to them in a virtual space, regardless of their device and interface. AI Noise Suppression can likewise improve the quality of online interactions by making it easier for users to communicate effectively, even in noisy or distracting environments. Altogether, Agora’s complete solution stack is the ideal infrastructure to build the internet of tomorrow, today.
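For readers curious about the general idea behind spatial audio, the sketch below shows one simple, textbook way to turn positions into left and right channel volumes: voices get quieter with distance and shift toward the side they come from. This is not Agora's API or algorithm. The function `stereo_gains`, the `ref_distance` parameter, and the assumption that the listener faces the positive y direction are all ours, chosen purely for illustration.

```python
import math
from typing import Tuple

Point = Tuple[float, float]  # (x, y) position on a flat floor plan


def stereo_gains(listener: Point, source: Point,
                 ref_distance: float = 1.0) -> Tuple[float, float]:
    """Return (left, right) gains for a voice heard from `source`.

    Assumptions: the listener faces +y and never rotates; loudness uses
    a simple inverse-distance rolloff; panning uses an equal-power curve.
    Sources directly ahead and directly behind sound the same here.
    """
    dx = source[0] - listener[0]
    dy = source[1] - listener[1]
    distance = math.hypot(dx, dy)

    # Full volume inside ref_distance, then fall off as 1 / distance.
    gain = ref_distance / max(distance, ref_distance)

    # pan is -1.0 for fully left, 0.0 for centered, +1.0 for fully right.
    pan = dx / distance if distance > 0 else 0.0

    # Equal-power panning keeps perceived loudness steady across the arc.
    angle = (pan + 1.0) * math.pi / 4.0
    return gain * math.cos(angle), gain * math.sin(angle)


if __name__ == "__main__":
    me = (0.0, 0.0)
    print(stereo_gains(me, (0.0, 1.0)))   # speaker straight ahead, close by
    print(stereo_gains(me, (4.0, 0.0)))   # speaker to the right, farther away
```

Because the result depends only on positions, the same calculation could serve a headset user walking around a room and a desktop user moving an avatar with a mouse.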

Want to learn more? Check out Agora’s Metaverse solutions.

Stay tuned for part 3 of our infrastructure for the metaverse series, where we will discuss how developers can account for passive and active users who will be gaining access to virtual spaces from a variety of social contexts.

In the meantime, sign up for free to discover Agora’s potential for yourself.
