A lot has been said about the impact of WebRTC on the telephony landscape and its technology stack. However, this particular article really got me thinking – can WebRTC really replace SIP?

The WebRTC battle is a major one and there’s a lot to talk about, so this article is in no way exhaustive. In this debate, any internet-related buzzword can find its place – net neutrality, data privacy, LTE, Internet of Things, etc. It’s the butterfly effect of bike-shedding (also called Parkinson’s law of triviality), as anything and everything internet-related has a direct impact on WebRTC’s future.

If this is the first time you hear about those weird acronyms, hang on tight: it’s not that complicated! WebRTC stands for Web Real-Time Communication. It’s an API, drafted by the W3C and built into browsers, that enables browser-to-browser voice chat, video chat, and P2P file sharing without the need for either internal or external plugins.

SIP stands for Session Initiation Protocol. It is a text-based protocol used in Internet telephony (VoIP) for signaling and controlling multimedia sessions.

So, SIP and WebRTC are different in their functions? Even though the two can overlap in use cases, SIP and WebRTC play different roles in communications. As Steve Anderson puts it beautifully: “It’s like the square and rectangle concept; all squares are rectangles, but not all rectangles are squares. SIP can exist without WebRTC, but WebRTC needs the help of a protocol to fully operate.” More specifically, WebRTC needs a signaling protocol to fully operate.

The WebRTC vs. SIP battle is actually a set of two different battles going on at once:

  1. SIP vs. Signaling Protocol X
  2. WebRTC vs. VoIP (Browser vs. PSTN)

The Protocol battle

The W3C draft of WebRTC is still a work in progress and has no protocol specified for certain features, like signaling. Some view this as a gap in the WebRTC standard; others view it as an opportunity to try different options and learn while adapting. Above all, this lack of a signaling protocol brings telco people from all over the world together for a heated debate – they don’t even agree on how to spell “signalling/signaling”. In the end, the telecommunication environment is so diverse that having different ways of doing signaling at the moment is not so bad.

Let’s take a look at the different ways of doing signaling in WebRTC:

SIP over WebSockets

SIP is a standard in VoIP and is used across telephony thanks to its interoperability with the PSTN; SIP over WebSockets (standardized as RFC 7118) brings it to the browser. Despite its convenient semantics for the telco world and its widespread usage, SIP was not invented with WebRTC in mind, and applications need a custom stack or gateway. However, service providers are opening their networks to the Web by making SIP infrastructures accessible over WebSockets. Additionally, open source frameworks like JsSIP or sipML5 implement SIP messaging in JavaScript, making SIP signaling more accessible to the Web. SIP is widespread and has a large community actively working on its integration with WebRTC, but should we limit ourselves to SIP? Will any other signaling system prevail over it?
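To make the "custom stack" part a bit more concrete, here is a minimal TypeScript sketch of what a browser-side stack has to produce before anything goes over the WebSocket: a plain-text SIP request. The formatSipRequest helper, the SipRequest shape, the header values, and the example.com / client.invalid addresses are all made up for illustration. The frameworks mentioned above handle far more than this (transactions, authentication, SDP bodies), so read it as a sketch of the message format, not of how any of them work internally.

```typescript
// Hypothetical shape for a body-less SIP request; not taken from any library.
interface SipRequest {
  method: string;                    // e.g. "REGISTER" or "OPTIONS"
  requestUri: string;                // e.g. "sip:example.com"
  headers: Array<[string, string]>;  // header name/value pairs, in order
}

// SIP is text-based: a request line, header lines, then a blank line,
// each terminated by CRLF. This sketch only covers requests without a body.
function formatSipRequest(req: SipRequest): string {
  const lines: string[] = [`${req.method} ${req.requestUri} SIP/2.0`];
  for (const [name, value] of req.headers) {
    lines.push(`${name}: ${value}`);
  }
  lines.push("Content-Length: 0");
  return lines.join("\r\n") + "\r\n\r\n";
}

// Illustrative values only. The "WS" transport token in the Via header is
// the one RFC 7118 defines for SIP over WebSockets.
const register = formatSipRequest({
  method: "REGISTER",
  requestUri: "sip:example.com",
  headers: [
    ["Via", "SIP/2.0/WS client.invalid;branch=z9hG4bK-demo"],
    ["Max-Forwards", "70"],
    ["From", "<sip:alice@example.com>;tag=demo1"],
    ["To", "<sip:alice@example.com>"],
    ["Call-ID", "demo-call-id"],
    ["CSeq", "1 REGISTER"],
    ["Contact", "<sip:alice@client.invalid;transport=ws>"],
  ],
});

console.log(register);
```

In a browser, a string like this would typically be sent through a WebSocket opened with the "sip" subprotocol, and the gateway on the other side would treat it like any other SIP request.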

JSON over WebSockets/Comet

JSON (JavaScript Object Notation) is a subset of JavaScript and is therefore natively readable by web browsers, which is why it’s considered the most intuitive signaling mechanism for WebRTC. It needs no extra parsing library or encoding layer in the browser and is seen as a rising challenger to SIP over WebSockets. Its biggest disadvantage is the same as SIP’s: it needs a custom gateway or stack to interoperate with other types of communication services, as there is no one JSON signaling protocol. Despite its drawbacks, JSON signaling has a huge community supporting it by developing third-party frameworks, libraries and services such as PubNub, Firebase, or even a dedicated federation like Matrix.org.
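Since there is no single JSON signaling protocol, every application ends up defining its own message format. The TypeScript sketch below shows one possible shape. The SignalMessage type, its field names, and the encodeSignal / decodeSignal helpers are all assumptions for illustration; they are not part of WebRTC or of any of the services mentioned above.

```typescript
// Hypothetical message format: each app defines its own, this is one option.
type SignalMessage =
  | { type: "offer" | "answer"; from: string; to: string; sdp: string }
  | { type: "candidate"; from: string; to: string; candidate: string }
  | { type: "bye"; from: string; to: string };

// Encoding is just JSON serialization.
function encodeSignal(msg: SignalMessage): string {
  return JSON.stringify(msg);
}

// Decoding validates the shape, because the other peer (or a gateway)
// may speak a different dialect of "JSON signaling".
function decodeSignal(raw: string): SignalMessage {
  const data: unknown = JSON.parse(raw);
  if (typeof data !== "object" || data === null) {
    throw new Error("signal must be a JSON object");
  }
  const { type, from, to, sdp, candidate } = data as Record<string, unknown>;
  if (typeof from !== "string" || typeof to !== "string") {
    throw new Error("signal needs string 'from' and 'to' fields");
  }
  if (type === "offer" || type === "answer") {
    if (typeof sdp !== "string") throw new Error("missing sdp");
    return { type, from, to, sdp };
  }
  if (type === "candidate") {
    if (typeof candidate !== "string") throw new Error("missing candidate");
    return { type: "candidate", from, to, candidate };
  }
  if (type === "bye") {
    return { type: "bye", from, to };
  }
  throw new Error("unknown signal type");
}

// Round trip with placeholder values (the SDP here is not a real session description).
const wire = encodeSignal({ type: "offer", from: "alice", to: "bob", sdp: "v=0" });
const parsed = decodeSignal(wire);
console.log(parsed.type); // prints "offer"
```

The validation step is where the "custom gateway or stack" cost shows up: two applications that both use JSON still need to agree on, or translate between, their message formats.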

Other signaling options include XMPP, which requires the use of the Jingle extension, or WebRTC’s data channel.

Why is it so important? Because of the heavy past of telecommunications (legacy telephony) and its very modular future (web development), locking WebRTC to one particular protocol could shut it out of one camp or the other and greatly hurt its potential. There is currently no standardization because there is no right answer: what might work for one use case might not be ideal for another. In the end, it will come down to the ease of implementation of the protocol’s frameworks and the support of its community. Since the number of candidates is not that large, there is no need for immediate standardization. Maybe what we will see is a decoupling of what developers will use and what telcos will use. Eventually, they will all evolve to be compatible with one another and produce truly unified communication systems.

The Browser vs. The PSTN

The standardization of a signaling protocol isn’t WebRTC’s only battle. WebRTC’s browser and device compatibility has the potential to change the way we communicate, potentially eclipsing the PSTN (think very long-term). However, WebRTC’s future, along with that of its cousin the PSTN, will depend heavily on a lot of different factors that shape the way we perceive communications in general – here are a few of them:

WebRTC Standardization & Browser Battles

We’ve discussed WebRTC’s fragmentation in signaling protocols. However, that is not all: WebRTC is seeing an even fiercer codec war between VPn and H.26n (n = version number). Again, this war is between the browsers’ needs (more like ‘wants’) and the need for WebRTC and telcos to be interoperable. In the future, this type of fragmentation, along with the acceptance rate of WebRTC in browsers, will define whether it will be easier to call our friends via the browser or just pick up the phone.

WebRTC Going Mobile

The communication world is changing, not only through new protocols but also new hardware. In a handful of decades, we went from a dial tone to having small computers in our pockets. WebRTC’s future is also defined by what the phone is becoming and what it will “mean” to communicate. A phone isn’t just a phone anymore; it’s a computer with a browser. As we move towards new communication devices like wearables, they will need fast connections and low-cost data plans to give WebRTC a chance to surpass the convenience of the phone call – we’ve seen the rise of LTE services boosting internet speed on the go and bringing the concept of VoLTE. By offering services on both ends (data and voice), carriers will be at the center of the WebRTC vs. PSTN battle.

Identity Management

One of the major contributions of the PSTN was the emergence of the E.164 numbering plan, which has become a vital part of the internet’s ecosystem, mainly for its use in defining identities. A phone number, with context from the carrier, contains a ton of information and, combined with other sources of identity, can help create rich online user profiles. As with any of the other debates, some believe that the phone number could be a “universal identifier” regrouping billing, communications, and personal data, and be linked to government ID databases. On the other end are people who think the phone number will become extinct and we’ll all just have a personal WebRTC URL, SIP URIs, or social IDs. “Identity” on the web is completely fragmented: no one has a single identity. This is actually a good thing because it enables contextual identity – offering selected quantities of information to be shared depending on the website/application/context. Because of this convenience, phone numbers could intersect with WebRTC, especially for voice chats, for example as a fallback or secondary access to unified applications or conferencing. “There won’t be one ID to rule them all” – identity will be a combination of phone, OAuth, emails, URIs, and WebRTC links.
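Part of what makes the phone number attractive as an identifier is that its international form is easy to check and to carry between contexts. As a rough TypeScript sketch: the pattern below is a commonly used approximation of the E.164 shape (a "+", a non-zero first digit, at most 15 digits in total), and normalizeE164 and toTelUri are hypothetical helpers written for this article. It only checks the shape of the number, not whether the country code is assigned or the number exists.

```typescript
// Approximate E.164 shape: "+", a non-zero leading digit, 15 digits at most.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Hypothetical helper: strips common visual separators and checks the shape.
// Returns null rather than guessing a country code for national-format input.
function normalizeE164(input: string): string | null {
  const compact = input.replace(/[\s().-]/g, "");
  return E164_PATTERN.test(compact) ? compact : null;
}

// Hypothetical helper: the same identifier expressed as a "tel:" URI,
// one of several representations an application might hand out per context.
function toTelUri(e164: string): string {
  return `tel:${e164}`;
}

const formatted = normalizeE164("+1 (415) 555-0100"); // "+14155550100"
const national = normalizeE164("415-555-0100");       // null: no country code

if (formatted !== null) {
  console.log(toTelUri(formatted)); // prints "tel:+14155550100"
}
console.log(national);
```

An application mixing phone numbers with OAuth accounts, emails, or WebRTC links could keep the normalized number as just one entry among a user’s identifiers, and decide per context which one to expose.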

Many other questions remain that will shape the outcome of this long battle. How and how much will we be communicating? What will we communicate? Are we going to share more media or just voice and text? How will IoT affect our means of communication? Will we ditch the PSTN? So far we’ve seen great examples of unified communication combining both WebRTC and the PSTN. The future may look bright for both.

Obviously, WebRTC has pushed the limits of telecommunications and is disrupting the market in some way. However, this isn’t a cliché Western movie where only one sheriff can stay in town. This is the Internet: having both WebRTC and the PSTN around, or SIP and JSON, adds more power to telecommunications. We can combine their powers to develop more affordable, accessible, unified, and contextual communication.

In his article, Andrew puts it this way:

“For me, the ultimate answer to the question of one over the other comes down to this: Do you like apples more than oranges? While both are fruit, they are very different and each serves a role that the other cannot fill.”

“In other words, winner, winner, chicken dinner. We really can get along, after all.”

Thanks to Dan Burnett, Victor Pascual, Andrew Prokop, Torrey Searle, and Vincent Morsiani for their enlightenment and guidance.