Wednesday, July 06, 2016

Command line Twilio client

[This post is neither contributed not endorsed by Twilio]

Twilio client [1] enables embedding voice conversation in web and mobile apps, by creating a voice pipe between your browser or mobile device and the service. One thing missing there is the ability to create such a voice pipe from non-mobile or non-web programs, such as a command line application. There is a shell script [2] to place a call for testing from command line, but it uses pre-defined text or recorded file for media, and looses the real-time interactive nature of the voice path. In particular, it does not create a voice pipe between the local machine and the service.


Ability to connect real-time interactive voice path from an application opens doors to wide range of other use cases such as media path processing and analysis, e.g., for real-time transcription, or to bridge call between diverse services, e.g., translate between IM and voice call. Secondly, such as mechanism is independent of a specific browser or mobile platform, and can work in headless mode for automated testing or client-side programmability of voice call or its media path.

At the high level, there are four potential ways to create such a voice pipe. Two of these can be accomplished using my rtclite project, that we will describe in more detail in this article.

  1. WebRTC - using Twilio 1.3+ web client with command line WebRTC app
  2. Using Twilio mobile client interface ported to command line app
  3. RTMP - using Twilio 1.2 web client API with command line RTMP client
  4. SIP - send/receive SIP call to/from Twilio [3]

The first two approaches essentially implement a command line version of the web and mobile SDKs. For example, a WebRTC stack compiled for Linux/OS X may be used to accomplish (1). The third approach uses a command line RTMP client and an older version of client API, and the fourth one uses a command line SIP endpoint to dial into the service.

We describe how to do the last two using software pieces from our open source rtclite project [5]. The following video demonstration shows the command line call initiation. Don't forget to view in full screen!

Connect to Twilio from command line SIP endpoint

The approach is described on the provider's website [3] including the steps for creating the SIP domain/endpoint such as The description there is targeted for VoIP providers rather than client. Currently it is not possible [4] to configure your SIP softphone to use the Twilio service as SIP server.  Once the SIP domain is created, you can send request to "". The sender's IP address needs to be white-listed or the caller's credential needs to be preapproved for authentication. we have only tried the IP address white-listing, and not the SIP authentication using credentials.

After configuring a SIP endpoint, you should be able to see the SIP domain and its associated voice URL on the provider's website, e.g., "" mapped to voice URL of "http://yourserver/yourtwiml.xml".A simple call forwarding can be done using the following TwiML. These steps can be used to connect to any other TwiML application from the command line SIP endpoint, if you configure your SIP endpoint's voice URL accordingly.

<?xml version="1.0" encoding="UTF-8"?>
  <Dial timeout="20" callerId="1415xxxxxxx" method="POST">415yyyyyyy</Dial>

You can also create programmable server side script to derive the target number using the called SIP address. For example, send call to, and in your script, extract the user name part of the URL to populate the Dial tag's content.

Once the initial configuration is complete, use the SIP endpoint available in the rtclite project to initiate a call. More details on the command line SIP endpoint are available in my previous blog post as well as on the project website [6]. After setting up the initial dependencies of the project, run the caller module as follows.

$ python -m --use-lf --samplerate=48000 \

The "domain", "use-lf" and "samplerate" options are described on the project website. The "to" option is important, and represents the target address to call to. Once the call is established, the py-audio project's modules are used to interface with the audio device, and to send and receive audio in the call.

Some important notes follow. For a demo account with the provider, your callerId and target number in the Dial tag must both be verified for your account. Some Internet Service Providers (ISP) may block a SIP request on port 5060. In fact, I sometime experience wifi reset of my residential equipment, when I send a SIP packet out from my home machine to outside VoIP service. Router interference of SIP messages may also be the reason for incorrect CRLF handling, and the need for use-lf option. Using TLS may be an option to work around router interference. Make sure that the sample rate specified on the command matches the allowed sample rate of the audio device on your client machine. On Mac OS X, audio capture is usually done with 48kHz.

Connect to Twilio from command line RTMP client

Here, we exploit the older Flash-based Twilio Web Client SDK version 1.2. First, we describe how to figure out the client-server RTMP message flow, and next we show how to do this using the rtmpclient module from the rtclite project.

The web client SDK 1.2 used this twilio.js javascript file. A quick look indicates that it internally loads another twilio.js file.

This second JavaScript file shows that the Flash-based client-server connection is accomplished using the MediaStream class which internally attempts RTMP connection to the service.

Using the hello-client-monkey.php example from the provider, but hosted on your website, and running wireshark on default RTMP port 1935, you can find more information about this client-server exchange. Make sure to use http instead of https for your hosted web app to avoid encryption. The following screenshots show a few example RTMP messages, and their order, and the parameters sent in the initial connect request.

In summary, the NetConnection's connect is done to "rtmp://", with additional 6 arguments. The first argument looks like the capability token generated by the helper library. The second and third are null and empty string respectively. The fourth one is a JSON formatted string with some client side attributes. The fifth looks like the account SID. And last one looks like the client SDK version. After connect is complete, there are two createStream calls and a startCall RPC method. This is followed by a received RPC method callsid from the server. Next, the "input" stream name is published, and "output" stream name is played. This is followed by bunch of audio data.

We wrote an example application, [7], which uses the rtmpclient module from the rtclite project, and automates client-server exchange process. It then connects the audio sent/received on the two streams to the local audio device. Although, not well tested, there are other command line options in this application that allow recording the received stream, or playing a file to the sent stream.

The command line application takes three mandatory parameters, which if not supplied, will be prompted for. These parameters are for account SID, auth token and application SID. The client application then connects to the service using RTMP and uses those supplied parameters. An example command line invocation is shown below, where the three parameters are fake - use the correct values in your test!

$ python -m rtclite.vnd.twilio.client --account=ACXXXXX --token=YYYYY --app=APZZZZZ

Use ctrl-C to terminate the client application.


Here, we attempt to compare our two approaches for command line client: SIP and RTMP.

The audio codec used are G.711 in SIP and Speex in RTMP. Thus bandwidth requirement is more for SIP, but can theoretically give better quality, e.g., for real-time transcription. It may be possible to send G.711 in RTMP to server, but is not clear how to force the server to send back G.711 in RTMP. The media path is over UDP (RTP) for SIP vs. TCP for RTMP - making RTMP one more suseptible to network issues such as latency in interactive conversations. While both these are voice only at this time, using WebRTC or Mobile SDK based approach may allow video pipe. The py-webrtc project may become useful for this, once it is completed.

The SIP approach has provider supplied documentation on how to do incoming calls, but the RTMP approach requires more work to figure that out. The SIP approach seems to target server-to-server call flows, e.g., to connect your soft PBX to provider service. Due to router interference of SIP messages, it may not work in all the cases or all the time from a client machine. On the other hand, the RTMP approach is inspired by the client SDK, and hence suitable for clients. However, Flash-based client has been deprecated by the provider, and the corresponding service may no longer be available in near future. Moreover, RTMP is kind of an obsolete technology. Using WebRTC or Mobile SDK approach will work better in that case. Once SIP registration becomes available on the provider service, a standard command line SIP endpoint [6] should be enough for send/receive of calls.


  1. Twilio client: embed voice conversation in Web and Mobile Apps,
  2. Twilio Labs: place a Twilio call from the shell,
  3. Programmable voice SIP,
  4. Can I configure my soft phone to work with the Twilio SIP endpoint,
  5. Rtclite: light weight implementations of real-time communication protocols and applications in Python,
  6. Command line SIP endpoint,
  7. Command line Twilio client,

No comments: