A short decade ago, telemedicine was still in its nascent stage, struggling to fit into the daily workflows of a medical practice. The barriers to adoption ranged from a lack of staff education to reimbursement considerations to limited patient access to the technology. But the pandemic has put the technology on the fast track.

Now that going to a medical site for a face-to-face visit has become plainly risky, the demand for telemedicine services has skyrocketed. This unprecedented expansion is also supported by relaxed regulatory restrictions on the use of telemedicine technology and on insurance coverage.

To live up to expectations and effectively serve the growing user base during COVID-19 and beyond, privacy and scalability issues need to be addressed head-on. We discussed HIPAA compliance at length earlier. Here is our take on how to bring reliable scalability into the equation.

Highly available architectures

High availability refers to a system's ability to deliver dependable performance with minimal downtime over a long period of time. For telemedicine applications, this is a major requirement, since a disruption in service delivery can result not only in revenue loss but also in worse health outcomes.

To keep a critical video conferencing application resilient, highly available architectures leverage redundant, fault-tolerant configurations of network connections, power supplies, and other components. Redundancy is also supported through clustering, a method of grouping signaling and media servers so that if one server fails, another can immediately take over its workload.
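As a rough illustration of the failover idea, the sketch below picks the first healthy server from a cluster and estimates how redundancy improves availability. The `MediaServer` class, the `healthy` flag, and the server names are hypothetical, and the availability formula assumes that replicas fail independently, which real deployments only approximate.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MediaServer:
    name: str      # hypothetical server identifier
    healthy: bool  # outcome of the latest health check (assumed to exist)


def pick_active(cluster: List[MediaServer]) -> Optional[MediaServer]:
    """Return the first healthy server in priority order, or None if all are down."""
    for server in cluster:
        if server.healthy:
            return server
    return None


def combined_availability(single: float, replicas: int) -> float:
    """Availability of a cluster that stays up while at least one replica is up.

    Assumes independent failures, which is an idealization.
    """
    return 1 - (1 - single) ** replicas


cluster = [
    MediaServer("media-1", healthy=False),  # primary has failed
    MediaServer("media-2", healthy=True),   # standby takes over
]
active = pick_active(cluster)
print(active.name if active else "no healthy server")
print(round(combined_availability(0.99, 2), 6))
```

Under these assumptions, two servers that are each available 99% of the time would give roughly 99.99% combined availability.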

Highly available cluster architecture

Cloud-based deployment

When it comes to scaling your business, cloud infrastructure offers flexibility that on-prem deployment can hardly compete with. Whether it's point-to-point video conferencing or simultaneous multipoint connections, cloud capacity can be provisioned on demand, so you pay only for the resources you actually need.

Moreover, predictive autoscaling can make resource provisioning faster and more accurate. By tracking capacity utilization and traffic trends, ML-driven algorithms can forecast future demand and provision the right number of instances just in time to respond smoothly to fluctuating activity.
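To make the forecasting step concrete, here is a minimal sketch that extrapolates recent load with a least-squares linear trend and converts the forecast into an instance count. It is not the algorithm any cloud provider uses; the function names, the `sessions_per_instance` capacity figure, and the 20% `headroom` are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def forecast_next(history: Sequence[float]) -> float:
    """Extrapolate one step ahead using a least-squares linear trend."""
    n = len(history)
    if n == 0:
        raise ValueError("history must not be empty")
    if n == 1:
        return float(history[0])
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    # Load cannot be negative, so clamp the forecast at zero.
    return max(0.0, intercept + slope * n)


def instances_needed(
    history: Sequence[float],
    sessions_per_instance: int,  # assumed capacity of one media server
    headroom: float = 0.2,       # assumed safety margin on top of the forecast
    min_instances: int = 1,
) -> int:
    """Number of instances to provision for the next interval."""
    predicted = forecast_next(history) * (1 + headroom)
    return max(min_instances, math.ceil(predicted / sessions_per_instance))


# Concurrent sessions observed over the last four intervals (made-up numbers).
recent_sessions = [100, 120, 140, 160]
print(forecast_next(recent_sessions))
print(instances_needed(recent_sessions, sessions_per_instance=50))
```

A production system would typically use a richer model that accounts for daily and weekly seasonality, but the shape of the calculation (forecast, add headroom, divide by per-instance capacity) stays the same.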

Amazon autoscaling architecture

How we can help

Looking to develop a scalable telemedicine video conferencing solution? Backed by over a decade of experience in the video domain, we are experts in building WebRTC-enabled telehealth solutions that are dependable, scalable, and designed to meet the stringent HIPAA compliance needs.

Hybrid topologies

In video conferencing, three main topologies are used: peer-to-peer (P2P), selective forwarding unit (SFU), and multipoint control unit (MCU). P2P, as the name suggests, lets video conferencing participants connect directly, without routing media through a server. While it is a cost-effective solution, it does not scale well as the number of participants grows: each endpoint has to send and receive a separate audio and video stream for every other participant, which drains bandwidth, overwhelms processing capacity, and degrades video quality.

To scale your video conferencing solution beyond a doctor-patient consultation, SFU- and MCU-based topologies can help. In SFU settings, users send their streams just once to the server, which then decides which streams to forward to which participants. Compared with P2P, this topology significantly reduces upstream bandwidth usage and the CPU burden on each client.

Also called a videoconferencing bridge, an MCU receives encrypted streams from all participants, then decodes and mixes them into a single media stream that is sent to end users. This means that no matter the total number of participants, a client has just one bidirectional connection with the MCU, which notably reduces client-side bandwidth and processing load, although the server-side decoding and mixing can add some latency. In addition to transcoding, recording, and mixing, server-based topologies also enable more advanced processing capabilities like CV-based face recognition, speech recognition, and more.
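The difference between the three topologies is easiest to see by counting the media streams a single client has to handle. The sketch below does that under simplifying assumptions: each participant publishes one combined audio/video stream, an SFU forwards every other participant's stream, and there is no simulcast or stream selection. The function name is hypothetical.

```python
from typing import Tuple


def streams_per_client(topology: str, participants: int) -> Tuple[int, int]:
    """Return (upstream, downstream) stream counts handled by one client."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    others = participants - 1
    if topology == "p2p":
        # Full mesh: one stream to and one stream from every other peer.
        return others, others
    if topology == "sfu":
        # One upload to the server; the server forwards the others' streams.
        return 1, others
    if topology == "mcu":
        # One upload to the server; one mixed stream comes back.
        return 1, 1
    raise ValueError(f"unknown topology: {topology}")


for topology in ("p2p", "sfu", "mcu"):
    up, down = streams_per_client(topology, participants=6)
    print(f"{topology}: {up} up, {down} down")
```

For a six-person call, the client's upload count drops from five streams in a mesh to one with either server-based topology, and only the MCU also reduces the download side to a single stream.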

Multiparty WebRTC architectures

To be able to support both one-to-one consults and many-to-many training sessions, make sure that your telemedicine video conferencing solution leverages a hybrid approach. Depending on the use case, a hybrid architecture enables seamless switching between topologies to meet scalability demands without compromising the quality of experience.
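One way to picture the switching logic is a small policy function that maps a session's characteristics to a topology. The rules below are only an illustrative assumption, not a recommendation from any particular media server: the parameter names and the cut-off of two participants for P2P are hypothetical, and a real system would also weigh network conditions and cost.

```python
def choose_topology(
    participants: int,
    server_side_processing: bool = False,  # e.g. recording or mixing is required
    constrained_clients: bool = False,     # e.g. low-bandwidth or low-power devices
) -> str:
    """Pick a topology for a session under a simple, illustrative policy."""
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    # A one-to-one consult with no server features can stay peer-to-peer.
    if participants == 2 and not server_side_processing and not constrained_clients:
        return "p2p"
    # Clients that can only handle one incoming stream get a mixed stream.
    if constrained_clients:
        return "mcu"
    # Otherwise let the server forward streams selectively.
    return "sfu"


print(choose_topology(2))                              # doctor-patient consult
print(choose_topology(2, server_side_processing=True)) # recorded consult
print(choose_topology(30, constrained_clients=True))   # training on weak devices
```

Keeping the decision in one place like this makes it easier to re-evaluate the topology when a session grows, for example when a third participant joins a one-to-one consult.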

Performance testing

The efforts to ensure scalability of your telemedicine video conferencing solution would not be complete without a comprehensive performance testing strategy that includes stress and load tests.

Typically, there are two vectors to load testing a video conferencing solution: how well a service handles multiple sessions in parallel, and how well it accommodates an increasing number of users within a single session. By scaling the size and the number of sessions at the same time, you can address both needs, though this can be demanding in terms of the bandwidth and processing power required.
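The two vectors can be combined into a simple test matrix. The sketch below only generates the scenarios to run, ordered by total simulated clients; it does not drive any real load testing tool, and the function name and the sample numbers are assumptions for illustration.

```python
from itertools import product
from typing import Dict, List, Sequence


def load_test_plan(
    session_counts: Sequence[int],
    session_sizes: Sequence[int],
) -> List[Dict[str, int]]:
    """Build every combination of parallel sessions and participants per session."""
    plan = []
    for sessions, size in product(session_counts, session_sizes):
        plan.append(
            {
                "sessions": sessions,
                "participants_per_session": size,
                "total_clients": sessions * size,
            }
        )
    # Run the lightest scenarios first so failures surface at the lowest load.
    return sorted(plan, key=lambda step: step["total_clients"])


plan = load_test_plan(session_counts=[1, 10], session_sizes=[2, 5])
for step in plan:
    print(step)
```

Ordering the scenarios by total clients helps show whether a failure is tied to the number of sessions, the size of a session, or simply the overall load.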

WebRTC has many moving parts to it. You have the user devices, signaling servers, the application server, TURN and STUN servers, sometimes media servers. The areas you need to focus in your WebRTC testing will be different than those of someone else and would depend on both the use case and the architecture chosen.


Tsahi Levent-Levi, BlogGeek.me

Analyst & Consultant for WebRTC, CPaaS and ML/AI

Since a WebRTC-enabled solution is composed of many components, make sure that the load testing tools you choose cover signaling mechanisms, transport protocols, encoders/decoders, TURN/STUN servers, and more.

Wrapping up

Pushed into the mainstream by the recent pandemic, telemedicine is now trying to catch up with the snowballing demand. But scaling telehealth video conferencing is no easy feat and requires a well-grounded approach to architecture design, topology selection, deployment options, and WebRTC-specific performance testing.


To get your project underway, simply contact us and an expert will get in touch with you as soon as possible.