Saturday, February 20, 2010

FAQ on using Flash Player to make phone calls

I present my answers to some frequently asked questions (FAQ) on using Flash Player to make phone calls.

1. Is a Flash application a good choice for VoIP?

It depends. An RTMP-based application is not a good choice, whereas an application based on the newer RTMFP is good for Flash-to-Flash Internet voice. For Flash-to-phone applications, Flash as it stands is not a good choice. Flash is good at user interface and is ubiquitously available, but the TCP-based RTMP is not suitable for real-time interactive media, and the UDP-based RTMFP is proprietary, so it cannot interwork with existing SIP-based VoIP systems.

Secondly, Flash Player is missing some crucial VoIP pieces such as good silence suppression and echo cancellation, so a Flash-based VoIP client becomes useless without a headset.

Thirdly, although Flash Player supports the open-standard Speex audio codec, many existing VoIP providers do not support Speex and expect only traditional voice codecs like G.729 and G.723.1. So you may also need to incorporate transcoding, which is CPU intensive. Video transcoding is more difficult because of the proprietary video codec in Flash Player.
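As a rough illustration of the codec mismatch described above, here is a minimal sketch that checks whether a client and a provider share any audio codec. The function names and the codec lists are assumptions made up for this example; they are not taken from any real SIP stack.

```python
def common_codecs(client_codecs, provider_codecs):
    """Return codecs offered by both sides, in the client's preference order."""
    provider = {name.lower() for name in provider_codecs}
    return [name for name in client_codecs if name.lower() in provider]


def needs_transcoding(client_codecs, provider_codecs):
    """Transcoding is needed when the two sides share no codec at all."""
    return not common_codecs(client_codecs, provider_codecs)


# Assumed codec lists, for illustration only.
flash_client = ["Speex"]
traditional_provider = ["G.729", "G.723.1"]
speex_capable_provider = ["G.729", "speex"]

print(needs_transcoding(flash_client, traditional_provider))    # True
print(needs_transcoding(flash_client, speex_capable_provider))  # False
```

When the check returns True, something in the media path (typically the gateway or converter) has to decode one codec and re-encode to the other, which is where the CPU cost comes from.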

2. Will there be any performance degradation when the call goes through the following path? (Flash Client -> Media Server -> RTMP to SIP Converter -> VoIP Server -> VoIP/PSTN Gateway -> PSTN Network -> Telephone)

Yes. If you can avoid intermediaries to cut down on media path latency, it will help a lot. Typically the VoIP Server (or SIP proxy server) is independent of the media path, so it does not affect media quality. But the media path goes through the Media Server (FMS?) and the RTMP to SIP converter, and over TCP at that. This degrades the quality a lot. One option is to remove the "Media Server" from your path by having the Flash Client connect directly to the RTMP to SIP converter. Also, if you can reduce the network distance between the Flash Client and the RTMP to SIP Converter, that will help a lot.

Secondly, with Flash Player you may need to do audio transcoding in your RTMP to SIP converter. This further degrades the performance and limits the scalability of your converter.
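To make the effect of intermediaries concrete, here is a small back-of-the-envelope sketch that adds up one-way delay along the media path. Every delay value below is a made-up placeholder, not a measurement, and the 150 ms figure is the commonly cited one-way target from ITU-T G.114.

```python
G114_TARGET_MS = 150  # commonly cited one-way delay target (ITU-T G.114)


def total_delay_ms(hops):
    """Sum the one-way delay (ms) over a list of (label, delay_ms) hops."""
    return sum(delay for _, delay in hops)


# Hypothetical per-hop delays in milliseconds; replace with your own measurements.
full_path = [
    ("Flash Client -> Media Server", 40),
    ("Media Server -> RTMP to SIP Converter", 30),
    ("audio transcoding in converter", 20),
    ("Converter -> VoIP/PSTN Gateway", 30),
    ("PSTN Network -> Telephone", 40),
]

# Same path with the Media Server removed: the client connects to the converter directly.
direct_path = [("Flash Client -> RTMP to SIP Converter", 45)] + full_path[2:]

for name, path in (("full path", full_path), ("direct path", direct_path)):
    total = total_delay_ms(path)
    verdict = "within" if total <= G114_TARGET_MS else "over"
    print(f"{name}: {total} ms ({verdict} the {G114_TARGET_MS} ms target)")
```

With these placeholder numbers the full path adds up to 160 ms and the direct path to 135 ms. The point is only that each extra hop eats into a fixed budget, not that these particular values are realistic.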

3. Some experts say that development in C or C++ is preferred over Flash Player for VoIP calls to phones, for performance reasons. Is that true?

A native VoIP client is preferred over Flash Player because the media packets can go directly from the client to the telephone instead of going through the RTMP to SIP converter. The advantage arises because (1) the native client can use UDP instead of being restricted to TCP-based RTMP, and (2) the network distance is shorter for a direct path. Even if your converter is on a good network and close to your client, so that network distance is not much of an issue, UDP versus TCP makes a large difference in the quality of a native VoIP client compared with Flash Player.

In general the network component affects the quality more than the programming language. So whether you use C/C++, Python, Java or some other language, it doesn't matter much. But if you can have end-to-end media path over UDP between the two clients, or between the client and the gateway, it is much better. Obviously with Flash Player you cannot have the packets go directly unless your RTMP to SIP converter is local to the Flash Client.

All the existing good quality systems (Skype, GTalk) tend to use end-to-end media-path over UDP as much as possible.
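The UDP-versus-TCP point can be illustrated with a toy model of head-of-line blocking. This is a simplified sketch under stated assumptions: fixed one-way delay, a fixed retransmission penalty for a lost packet, and strict in-order delivery for the reliable transport. It does not model real TCP behavior, and all parameter names and values are invented for the example.

```python
def delivery_times(num_packets, interval_ms, one_way_ms, lost, retransmit_ms, reliable):
    """Return the time (ms) each packet reaches the application, or None if it never does.

    reliable=True models a TCP-like transport: lost packets are retransmitted and
    later packets wait behind them. reliable=False models a UDP-like transport:
    lost packets are simply gone and later packets are unaffected.
    """
    times = []
    blocked_until = 0  # earliest time the next in-order packet can be delivered
    for seq in range(num_packets):
        arrival = seq * interval_ms + one_way_ms
        if seq in lost:
            if not reliable:
                times.append(None)  # dropped; the codec conceals the gap
                continue
            arrival += retransmit_ms  # wait for the retransmission
        if reliable:
            arrival = max(arrival, blocked_until)  # in-order delivery
            blocked_until = arrival
        times.append(arrival)
    return times


# Five 20 ms voice packets, 50 ms one-way delay, packet 1 lost, 200 ms retransmit penalty.
print(delivery_times(5, 20, 50, {1}, 200, reliable=True))   # [50, 270, 270, 270, 270]
print(delivery_times(5, 20, 50, {1}, 200, reliable=False))  # [50, None, 90, 110, 130]
```

In the reliable case a single loss delays every packet queued behind it, which is heard as a stall followed by a burst. In the unreliable case one 20 ms frame is missing and everything else arrives on schedule, which is usually the better trade for interactive voice.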

4. There are different media servers available, like Adobe Flash Media Server (FMS), Wowza, Red5, etc. Which one is the best choice?

Do you still want to pursue an RTMP to SIP converter? Anyway, in terms of performance I would guess that FMS is the best choice. But if your aim is to build an RTMP to SIP converter, then Red5 is probably the best. FMS is proprietary with few customization or programming options available, so you cannot easily integrate a SIP stack or an RTMP to SIP converter with FMS. On the other hand, Red5 is completely open source and written in Java, so it allows easy integration with other Java-based SIP stacks. Additionally, you could integrate SIP stacks written in other advanced languages such as Python or Ruby because Red5 allows applications in those languages, whereas an FMS application is restricted to ActionScript 1.0.

I haven't worked with or used Wowza so I cannot comment on that. I have worked with FMS and Red5 though, as well as Python based rtmplite and siprtmp projects.

6. We are now confused about whether to develop our VoIP application in Flash technology or Qt/Java/C#. What would be your choice?

I think that decision mostly comes from your business case. But I would suggest a non-Flash technology if possible and if your business demands very good quality of voice service. If your VoIP client will be assisting your main business, then people won't mind downloading and installing the VoIP client. The advantage Flash has is that it is already available in most people's browsers, so it doesn't require an additional download or installation. So if your VoIP application is only a small part of your main web-based business, then I think Flash technology will be better.

Another option is to use the Gmail video/voice architecture described in my article. Basically it uses Flash Player for user interface, but all the networking or voice related processing happens using their native GoogleTalk plugin.


Henry said...

It would be interesting to update this blog in the light of AIR 2.0 indicating support for UDP and Google Android. Here are some links:

ChottuRock said...

I want to know if it is possible to use SIP in Adobe RTMFP networks. I am new to this technology and really interested to know whether we could route the voice through third-party software like Solicall pro to improve the voice quality and do AEC. I have built a project on this using Adobe Stratus and FMS. The voice quality is not that good and there are echo problems in it. Please suggest any methods through which I can improve the voice quality using third-party software with the present Adobe Stratus network.

Henry said...

I suggest subscribing to the IETF Dispatch list and following the RTC-Web thread. Here is a sample from today:

Stefan Håkansson LK
[dispatch] RTC-Web Use cases

Kundan Singh said...

RTMFP is still a closed/proprietary protocol, unlike RTMP, which now has an open specification. So, unless someone figures out how the RTMFP wire protocol works, it may not be possible to interoperate with SIP.

On the other hand, you can definitely use SIP to initiate an RTMFP session by putting an "rtmfp://..." URL in the SDP's media lines, but this will not interoperate with existing SIP phones.

A second option could be to use Adobe's Flash Media Gateway (a SIP gateway), but you need to verify with Adobe sales whether it will work with RTMFP.

I looked at the Solicall web site, and I think it should work well in solving the echo problem. I have seen other projects (I forget the names) that install a separate plugin for audio, instead of using Flash Player, to solve the echo problem.

Finally, as Henry mentioned, based on the RTC-Web work, in a few years we should have standards-based voice from within the browser, along with the other voice quality improvements.