Wednesday, July 06, 2016

Command line Twilio client

[This post is neither contributed nor endorsed by Twilio]

Twilio client [1] enables embedding voice conversation in web and mobile apps by creating a voice pipe between your browser or mobile device and the service. One thing missing is the ability to create such a voice pipe from programs that are neither web nor mobile apps, such as a command line application. There is a shell script [2] to place a test call from the command line, but it uses pre-defined text or a recorded file for media, and loses the real-time interactive nature of the voice path. In particular, it does not create a voice pipe between the local machine and the service.

Motivation

The ability to connect a real-time interactive voice path from an application opens the door to a wide range of other use cases, such as media path processing and analysis, e.g., for real-time transcription, or bridging calls between diverse services, e.g., translating between IM and voice call. Secondly, such a mechanism is independent of a specific browser or mobile platform, and can work in headless mode for automated testing or client-side programmability of a voice call or its media path.

At a high level, there are four potential ways to create such a voice pipe. Two of these can be accomplished using my rtclite project, and are described in more detail in this article.

  1. WebRTC - using Twilio 1.3+ web client with command line WebRTC app
  2. Using Twilio mobile client interface ported to command line app
  3. RTMP - using Twilio 1.2 web client API with command line RTMP client
  4. SIP - send/receive SIP call to/from Twilio [3]

The first two approaches essentially implement a command line version of the web and mobile SDKs. For example, a WebRTC stack compiled for Linux/OS X may be used to accomplish (1). The third approach uses a command line RTMP client and an older version of the client API, and the fourth uses a command line SIP endpoint to dial into the service.

We describe how to do the last two using software pieces from our open source rtclite project [5]. The following video demonstration shows the command line call initiation. Don't forget to view in full screen!



Connect to Twilio from command line SIP endpoint

The approach is described on the provider's website [3], including the steps for creating a SIP domain/endpoint such as yourname.sip.twilio.com. The description there is targeted at VoIP providers rather than clients. Currently it is not possible [4] to configure your SIP softphone to use the Twilio service as a SIP server. Once the SIP domain is created, you can send a request to "sip:something@yourname.sip.twilio.com". The sender's IP address needs to be white-listed, or the caller's credentials need to be preapproved for authentication. We have only tried IP address white-listing, and not SIP authentication using credentials.

After configuring a SIP endpoint, you should be able to see the SIP domain and its associated voice URL on the provider's website, e.g., "yourname.sip.twilio.com" mapped to a voice URL of "http://yourserver/yourtwiml.xml". Simple call forwarding can be done using the following TwiML. These steps can be used to connect to any other TwiML application from the command line SIP endpoint, if you configure your SIP endpoint's voice URL accordingly.

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Dial timeout="20" callerId="1415xxxxxxx" method="POST">415yyyyyyy</Dial>
</Response>

You can also create a programmable server-side script to derive the target number from the called SIP address. For example, send a call to sip:1415yyyyyyy@yourname.sip.twilio.com, and in your script, extract the user name part of the URL to populate the Dial tag's content.
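The server-side derivation described above can be sketched in Python as follows. This is a minimal sketch and not part of rtclite: the function names user_from_sip_uri and dial_twiml are hypothetical, and it assumes your web framework hands the called SIP address to the script as a plain string (which request parameter carries it is something to verify against the provider's documentation).

```python
import re
from xml.etree import ElementTree as ET


def user_from_sip_uri(uri):
    """Return the user part of a SIP URI such as sip:user@host, or None."""
    match = re.match(r'^sips?:([^@;:]+)@', uri.strip(), re.IGNORECASE)
    return match.group(1) if match else None


def dial_twiml(called_uri, caller_id, timeout=20):
    """Build TwiML that dials the number found in the called SIP address."""
    user = user_from_sip_uri(called_uri)
    response = ET.Element('Response')
    if user is None:
        # No user part to dial, so reject the call instead of guessing a target.
        ET.SubElement(response, 'Reject')
    else:
        dial = ET.SubElement(response, 'Dial',
                             timeout=str(timeout), callerId=caller_id)
        dial.text = user
    return ('<?xml version="1.0" encoding="UTF-8"?>\n'
            + ET.tostring(response, encoding='unicode'))
```

For example, dial_twiml("sip:1415yyyyyyy@yourname.sip.twilio.com", "1415xxxxxxx") returns a Response whose Dial tag contains 1415yyyyyyy. Your script would return this string as the body of the voice URL response.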

Once the initial configuration is complete, use the SIP endpoint available in the rtclite project to initiate a call. More details on the command line SIP endpoint are available in my previous blog post as well as on the project website [6]. After setting up the initial dependencies of the project, run the caller module as follows.

$ python -m rtclite.app.sip.caller --domain=example.net --use-lf --samplerate=48000 \
         --to=sip:something@yourname.sip.twilio.com

The "domain", "use-lf" and "samplerate" options are described on the project website. The "to" option is important, and represents the target address to call. Once the call is established, the py-audio project's modules are used to interface with the audio device, and to send and receive audio in the call.

Some important notes follow. For a demo account with the provider, the callerId and the target number in the Dial tag must both be verified for your account. Some Internet Service Providers (ISPs) may block SIP requests on port 5060. In fact, I sometimes experience a Wi-Fi reset of my residential equipment when I send a SIP packet from my home machine to an outside VoIP service. Router interference with SIP messages may also be the reason for incorrect CRLF handling, and hence for the need for the use-lf option. Using TLS may be an option to work around router interference. Make sure that the sample rate specified on the command line matches a sample rate allowed by the audio device on your client machine. On Mac OS X, audio capture is usually done at 48 kHz.
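To make the line ending issue concrete: SIP is a text protocol whose lines are terminated by CRLF. The sketch below only illustrates the difference between CRLF and bare LF framing. The function format_request and the header values are hypothetical, it assumes that use-lf switches to bare LF as its name suggests, and it is not how rtclite builds its messages.

```python
def format_request(method, uri, headers, use_lf=False):
    """Format a body-less SIP request using CRLF (standard) or bare LF."""
    eol = '\n' if use_lf else '\r\n'
    lines = ['%s %s SIP/2.0' % (method, uri)]
    lines.extend('%s: %s' % (name, value) for name, value in headers)
    # An empty line marks the end of the headers.
    return eol.join(lines) + eol + eol


headers = [('To', '<sip:something@yourname.sip.twilio.com>'),
           ('Content-Length', '0')]
standard = format_request('OPTIONS', 'sip:something@yourname.sip.twilio.com', headers)
lf_only = format_request('OPTIONS', 'sip:something@yourname.sip.twilio.com', headers,
                         use_lf=True)
```

Comparing repr(standard) with repr(lf_only) shows the only difference is the line terminator, which is the part an interfering router or a non-conforming SIP entity may mishandle.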

Connect to Twilio from command line RTMP client

Here, we exploit the older Flash-based Twilio Web Client SDK version 1.2. First, we describe how to figure out the client-server RTMP message flow, and next we show how to do this using the rtmpclient module from the rtclite project.

The web client SDK 1.2 used this twilio.js JavaScript file. A quick look indicates that it internally loads another twilio.js file.

This second JavaScript file shows that the Flash-based client-server connection is accomplished using the MediaStream class, which internally attempts an RTMP connection to the service.

Using the hello-client-monkey.php example from the provider, but hosted on your own website, and running Wireshark on the default RTMP port 1935, you can find more information about this client-server exchange. Make sure to use http instead of https for your hosted web app to avoid encryption. The following screenshots show a few example RTMP messages, their order, and the parameters sent in the initial connect request.

In summary, the NetConnection's connect is done to "rtmp://chunder.twilio.com/chunder", with six additional arguments. The first argument looks like the capability token generated by the helper library. The second and third are null and an empty string, respectively. The fourth is a JSON-formatted string with some client-side attributes. The fifth looks like the account SID, and the last one looks like the client SDK version. After the connect is complete, there are two createStream calls and a startCall RPC method. This is followed by an RPC method, callsid, received from the server. Next, the "input" stream name is published, and the "output" stream name is played. This is followed by a stream of audio data.
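The observed exchange can be written down as plain data, which is a convenient starting point for any RTMP client library. This is a minimal sketch based only on the capture summarized above: the names build_connect_args and CALL_FLOW are hypothetical, the meaning of each argument is inferred from its appearance rather than documented, and no rtmpclient API is assumed.

```python
import json

RTMP_URL = 'rtmp://chunder.twilio.com/chunder'


def build_connect_args(capability_token, account_sid, client_attrs, sdk_version):
    """Return the six extra arguments seen in the NetConnection connect."""
    return [
        capability_token,          # 1: looks like the capability token
        None,                      # 2: null
        '',                        # 3: empty string
        json.dumps(client_attrs),  # 4: JSON string of client-side attributes
        account_sid,               # 5: looks like the account SID
        sdk_version,               # 6: looks like the client SDK version
    ]


# Order of operations seen after connect, as (kind, name) pairs.
CALL_FLOW = [
    ('call', 'createStream'),
    ('call', 'createStream'),
    ('call', 'startCall'),
    ('receive', 'callsid'),
    ('publish', 'input'),
    ('play', 'output'),
]
```

The arguments to startCall and the contents of the client-side attributes are not listed in the summary above, so they are deliberately left out here rather than guessed.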

We wrote an example application, client.py [7], which uses the rtmpclient module from the rtclite project and automates this client-server exchange. It then connects the audio sent and received on the two streams to the local audio device. Although not well tested, there are other command line options in this application that allow recording the received stream, or playing a file to the sent stream.

The command line application takes three mandatory parameters, and prompts for any that are not supplied. These parameters are the account SID, auth token and application SID. The client application then connects to the service over RTMP using the supplied parameters. An example command line invocation is shown below, where the three parameters are fake - use the correct values in your test!

$ python -m rtclite.vnd.twilio.client --account=ACXXXXX --token=YYYYY --app=APZZZZZ

Use ctrl-C to terminate the client application.

Comparison

Here, we attempt to compare our two approaches for command line client: SIP and RTMP.

The audio codecs used are G.711 in SIP and Speex in RTMP. Thus the bandwidth requirement is higher for SIP, but it can theoretically give better quality, e.g., for real-time transcription. It may be possible to send G.711 in RTMP to the server, but it is not clear how to force the server to send back G.711 in RTMP. The media path is over UDP (RTP) for SIP vs. TCP for RTMP, making RTMP more susceptible to network issues such as latency in interactive conversations. While both are voice only at this time, a WebRTC or Mobile SDK based approach may allow a video pipe. The py-webrtc project may become useful for this, once it is completed.
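As a rough illustration of the bandwidth point, the arithmetic for the SIP/RTP side can be worked out directly. G.711 carries 8-bit samples at 8 kHz, i.e., 64 kbit/s of payload. The 20 ms packet interval and plain IPv4/UDP/RTP headers below are assumptions, and the function name rtp_bitrate is hypothetical. No figure is assumed for Speex, since its bitrate depends on the configured mode and quality.

```python
def rtp_bitrate(payload_bps, packet_ms=20, header_bytes=40):
    """Bits per second on the wire for an RTP stream over UDP/IPv4.

    header_bytes = 12 (RTP) + 8 (UDP) + 20 (IPv4), without extensions.
    """
    packets_per_second = 1000.0 / packet_ms
    return payload_bps + header_bytes * 8 * packets_per_second


G711_BPS = 8000 * 8  # 8 kHz sampling, 8 bits per sample
print(rtp_bitrate(G711_BPS))  # 80000.0 bits per second in each direction
```

To compare with the RTMP path, substitute the Speex payload rate you observe in your own capture; the TCP and RTMP framing overhead would need to be accounted for separately.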

The SIP approach has provider-supplied documentation on how to handle incoming calls, whereas the RTMP approach requires more work to figure that out. The SIP approach seems to target server-to-server call flows, e.g., to connect your soft PBX to the provider's service. Due to router interference with SIP messages, it may not work in all cases or all the time from a client machine. On the other hand, the RTMP approach is inspired by the client SDK, and is hence suitable for clients. However, the Flash-based client has been deprecated by the provider, and the corresponding service may no longer be available in the near future. Moreover, RTMP is a largely obsolete technology. Using the WebRTC or Mobile SDK approach will work better in that case. Once SIP registration becomes available on the provider's service, a standard command line SIP endpoint [6] should be enough to send and receive calls.

References

  1. Twilio client: embed voice conversation in Web and Mobile Apps, https://www.twilio.com/client
  2. Twilio Labs: place a Twilio call from the shell, https://www.twilio.com/labs/bash
  3. Programmable voice SIP, https://www.twilio.com/docs/api/twilio-sip
  4. Can I configure my soft phone to work with the Twilio SIP endpoint, https://www.twilio.com/help/faq/voice/can-i-configure-my-soft-phone-to-work-with-the-twilio-sip-endpoint
  5. Rtclite: light weight implementations of real-time communication protocols and applications in Python, https://github.com/theintencity/rtclite
  6. Command line SIP endpoint, https://github.com/theintencity/rtclite/blob/master/rtclite/app/sip/caller.md
  7. Command line Twilio client, https://github.com/theintencity/rtclite/blob/master/rtclite/vnd/twilio/client.py

Tuesday, July 05, 2016

How to make phone calls from command line?

This article presents a command line SIP endpoint available as part of my rtclite [1] project. It is useful in a number of scenarios such as:
  • dialing out a phone number from command line, 
  • performing automated VoIP system tests, 
  • showing quick demos of communication systems, or 
  • experimenting with media processing on the voice path, e.g., for speech recognition, recording or text-to-speech. 
These tasks cannot easily be done using existing user interface based web, installed or mobile apps. I implemented something like this, named sipua [2], about 15 years ago in C/C++ at Columbia University. There are other projects such as sipp [3], pjsua [4] and sipcmd [5] that implement some version of a command line SIP user agent, but they may have limitations such as lack of support for an audio capture device, or being hard to extend with new media processing capabilities such as text-to-speech. This article describes a SIP endpoint written in Python [6] as part of my open source project.

Click here to see the full description of this SIP endpoint written in Python [6].

The project page shows how to use the SIP endpoint to send and receive instant messages and voice calls. The command line options allow you to configure various attributes such as whether to register with a SIP server, how to respond to incoming requests, how to work around SIP entities that incorrectly handle CRLF for line endings, how to test signaling without a media path, and so on. The project page also describes how to use the module in your own Python project.

The following video demo of the SIP endpoint shows interoperability with X-lite terminal and dialing out to toll-free numbers using a VoIP provider. It shows how to dial a phone number using a VoIP provider, how to send DTMF digits from the terminal, and some experimental features such as text-to-speech and speech recognition. Do not forget to watch in full screen!


References

  1. Rtclite: light weight implementations of real-time communication protocols and applications in Python, http://www.rtclite.com, https://github.com/theintencity/rtclite
  2. Sipua: a SIP test user agent for Solaris, Linux, FreeBSD and Windows NT, http://www1.cs.columbia.edu/irt/cinema/doc/sipua.html
  3. SIPp: free open source test tool/traffic generator for SIP, http://sipp.sourceforge.net/ 
  4. Pjsua: open source command line SIP user agent (softphone), http://www.pjsip.org/pjsua.htm 
  5. sipcmd: the command line SIP/H.323/RTP softphone, https://github.com/tmakkonen/sipcmd
  6. caller: SIP application to initiate or receive VoIP calls from command line. https://github.com/theintencity/rtclite/blob/master/rtclite/app/sip/caller.md