Monday, February 26, 2007

VoIP Protocols in a Nutshell & Skype’s Hidden Story

Voice on the Internet is carried by a family of protocols broadly known as Voice over Internet Protocol (VoIP). VoIP functionality is provided by open protocols such as Session Initiation Protocol (SIP), H.323 and IAX, as well as by proprietary systems such as Skype. The main ones are described below:

Skype
Skype is one of the most popular VoIP clients. Skype clients are self-contained and form a peer-to-peer (p2p) network, while Skype maintains a central login server for authentication. Skype’s protocol is proprietary and its messaging is encrypted, so it does not interoperate with other VoIP protocols. Skype nodes continuously maintain UDP connections to surrounding nodes, whereas TCP is used for call setup. Because Skype clients process data continuously, even the best CPUs and batteries available today in mobile devices are not a good match for a Skype client. The continuous data flow of Skype clients is not attractive to cell phone carriers either. The Skype architecture is also not suited to providing E911 services. Skype’s voice quality, however, is admirable.

All Skype users contribute to Skype’s revenue, not only by paying for SkypeOut or SkypeIn but also by offering their own resources to Skype. To overcome NAT and firewall problems, Skype’s heavy client uses a user’s resources to serve both that user and other users. Skype clients with a public IP address and enough bandwidth (known as supernodes) help other users with NAT and firewall traversal. Any node with a public address can become a supernode, and users have no control over this, although from Skype version 3.0 one can prevent a client from becoming a supernode. Most users are not aware of the supernode concept, so Skype has no fear of a supernode shortage.

Even though no network maintenance is required at the client end, a solution such as Skype is not a good choice for an organization, because the organization does not own the technology and therefore has very little control over it. For example, a Skype price hike in the middle of an organization’s budget year would be a concern for management.

H.323
H.323 is an umbrella recommendation from the International Telecommunication Union (ITU) that covers all aspects of multimedia communication over IP networks. It is part of the H.32x series of recommendations, which also describes multimedia communication over other networks such as Integrated Services Digital Network (ISDN) and Public Switched Telephone Network (PSTN). Its binary encoding makes development harder, but it also minimizes the number of bits that have to be sent over the wire. The need for extra network components such as Gatekeepers and Multipoint Control Units (MCUs) makes it complex. H.323 is better suited to interfacing with the PSTN than with the Internet. H.323 uses TCP as the transport protocol for call signaling. It also lacks user traceability in case of emergency.
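The trade-off between binary and text encoding can be illustrated with a small sketch. This is not real H.323 encoding (H.323 messages are encoded with ASN.1 packed encoding rules); the field names, values and the fixed binary layout below are all hypothetical and only show why a binary format needs fewer bytes but is harder to read and debug.

```python
import struct

# Hypothetical call-setup fields, invented for this illustration only.
fields = {"call_ref": 4711, "caller": 1001, "callee": 2002,
          "media_port": 5004, "codec": 18}

def encode_text(f):
    """Text form: one 'name: value' line per field, readable by a human."""
    return "".join(f"{name}: {value}\r\n" for name, value in f.items()).encode("ascii")

# Assumed fixed layout: '!' = network byte order, no padding;
# H = unsigned 16-bit integer, B = unsigned 8-bit integer.
BINARY_LAYOUT = struct.Struct("!HHHHB")

def encode_binary(f):
    """Binary form: values only, in an order both sides must agree on."""
    return BINARY_LAYOUT.pack(f["call_ref"], f["caller"], f["callee"],
                              f["media_port"], f["codec"])

def decode_binary(data):
    """The receiver needs the same layout to make sense of the bytes."""
    return dict(zip(fields.keys(), BINARY_LAYOUT.unpack(data)))

text = encode_text(fields)
binary = encode_binary(fields)
print(len(text), "bytes as text,", len(binary), "bytes as binary")
```

The binary form here is 9 bytes, several times smaller than the text form, but it cannot be inspected or extended without knowing the exact layout, which is the development cost mentioned above.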

H.323 APIs: ooH323c, OpenH323. To my knowledge, there are not enough good Java APIs for H.323 (one may like to try J323 Engine).

SIP
Session Initiation Protocol (SIP) is a lightweight, text-based, client-server protocol that works over both UDP and TCP. SIP was developed by the Internet Engineering Task Force (IETF) to set up, modify and tear down multimedia sessions over the Internet. Like H.323, the SIP architecture requires extra hardware and software in the network, such as proxy servers, redirect servers and registration (registrar) servers. However, it fits in better with other Internet protocols. Unlike HTTP, SIP is not a transfer protocol designed to carry large amounts of data. It does not define any specific mechanism for E911 service or for NAT and firewall traversal. SIP's planned move to a p2p architecture will increase its complexity, as it has for Skype. For the actual voice data, both H.323 and SIP use Real-Time Transport Protocol (RTP). Recent industry trends show that H.323 is losing the race to SIP.
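Because SIP is text based, a request can be built and read with ordinary string handling. The following is a minimal sketch in Python, not a usable SIP stack: the helper names build_invite and parse_message are made up for this post, the addresses, tag, branch and Call-ID are illustration values, and the request carries no session description body.

```python
CRLF = "\r\n"

def build_invite(caller, callee, call_id, local_host, transport="UDP"):
    """Return a bare-bones INVITE request as text (illustration values only)."""
    user = caller.split("@")[0]
    lines = [
        f"INVITE sip:{callee} SIP/2.0",
        f"Via: SIP/2.0/{transport} {local_host};branch=z9hG4bK-example",
        "Max-Forwards: 70",
        f"From: <sip:{caller}>;tag=example-tag",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        f"Contact: <sip:{user}@{local_host}>",
        "Content-Length: 0",
    ]
    # Headers end with an empty line, as in HTTP.
    return CRLF.join(lines) + CRLF + CRLF

def parse_message(raw):
    """Split a SIP message into its start line and a dict of headers."""
    head, _, _body = raw.partition(CRLF + CRLF)
    start_line, *header_lines = head.split(CRLF)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")   # split at the first colon only
        headers[name.strip().lower()] = value.strip()
    return start_line, headers

request = build_invite("alice@example.com", "bob@example.org",
                       "abc123@host.example.com", "host.example.com")
start_line, headers = parse_message(request)
print(start_line)
print(headers["cseq"])
```

The same request could be sent over UDP or TCP unchanged apart from the transport named in the Via header, which is part of what makes SIP easy to debug compared with a binary protocol.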

SIP APIs: JAIN-SIP, jSIP.

IAX
Inter-Asterisk eXchange (IAX) is a protocol used by Asterisk, an open source Private Branch Exchange (PBX) system from Digium. In its second version, IAX is known as IAX2. It enables VoIP connections between Asterisk servers and IAX2 clients. Even though IAX2 is not yet an official standard, it is well known for its low bandwidth consumption. IAX2 uses in-band data streams, i.e. both signaling and media are transmitted over the same channel, whereas in SIP and H.323 signaling and media travel independently of each other. That makes IAX2 a NAT-friendly protocol. However, it also means that the PBX needs to separate voice from signaling. IAX2 “trunking” allows one IP packet to carry data for more than one active call, which further reduces bandwidth use.
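The bandwidth effect of trunking can be estimated with some simple arithmetic. The sketch below compares sending one packet per call with packing all calls into one trunk packet. The IPv4 and UDP header sizes are standard; the IAX2 header sizes and the 20-byte voice payload are assumptions chosen for illustration and should be checked against the IAX2 specification before relying on the numbers.

```python
IP_UDP_HEADER = 20 + 8     # IPv4 header + UDP header, in bytes
MINI_FRAME_HEADER = 4      # assumed per-packet header for an untrunked voice frame
TRUNK_HEADER = 8           # assumed per-packet header for a trunk frame
TRUNK_ENTRY_HEADER = 4     # assumed per-call header inside a trunk frame

def untrunked_bytes(calls, payload):
    """Bytes per packet interval when every call sends its own IP packet."""
    return calls * (IP_UDP_HEADER + MINI_FRAME_HEADER + payload)

def trunked_bytes(calls, payload):
    """Bytes per packet interval when all calls share one trunk packet."""
    return IP_UDP_HEADER + TRUNK_HEADER + calls * (TRUNK_ENTRY_HEADER + payload)

def kbps(bytes_per_interval, interval_ms=20):
    """Convert bytes per interval to kbit/s (bits per millisecond)."""
    return bytes_per_interval * 8 / interval_ms

payload = 20   # assumed: 20 ms of audio from an 8 kbit/s codec
for calls in (1, 5, 20):
    print(calls, "calls:",
          kbps(untrunked_bytes(calls, payload)), "kbit/s untrunked,",
          kbps(trunked_bytes(calls, payload)), "kbit/s trunked")
```

Under these assumed sizes a single call costs slightly more with trunking (60 bytes per interval instead of 52), but with 20 calls the trunk needs about half the bytes (516 instead of 1040), because the IP and UDP headers are paid once per packet rather than once per call.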

All these different, standalone voice architectures coexist, each with its own advantages and disadvantages.

Some Important Links:

Skype vs SIP [1, 2]
SIP vs H.323 [1, 2]
SIP vs IAX [1]