Voice API Apps

Introduction

The Voice API Apps are a set of pre-configured apps for outbound scenarios. Currently, the set consists of the following three apps:

  • Notification
  • One Time Password (OTP)
  • Request DTMF

An app is started by sending a request to the CM Server(s), which place the call and, once it is answered, run the flow of the app to completion.

Prerequisites

Obtaining an account

An account for the Voice API, or the Voice API Apps, can be obtained from our Support Team (Tel: +31 76 572 4082, E-mail: support@cmtelecom.com, available 24/7).

Connection details

In order to send instructions to initiate the different apps, the CM Server(s) can be reached at:

https://voiceapi.cmtelecom.com/v2.0/

Please note that each app has its own endpoint, as described in the relevant chapter of this document.

Custom audio files

If custom audio files are to be used, they should preferably meet the following specification:

  • Bit rate: 64 kbps
  • Sample size: 8 bit
  • Channels: Mono
  • Audio sample rate: 8 kHz (8000 Hz)
  • Audio codec: G711-A-law (PCM)
  • Filename: *.wav


Other formats might work, but mileage may vary.
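
As a rough pre-upload check, the sketch below reads the `fmt ` chunk of a WAV file and compares it with the specification above. The names `read_wav_format` and `spec_problems` are illustrative and not part of the Voice API; format tag 6 is the RIFF/WAVE identifier for A-law.

```python
import struct

WAVE_FORMAT_ALAW = 6  # RIFF/WAVE format tag for G.711 A-law

# Values taken from the specification list above.
EXPECTED = {"format_tag": WAVE_FORMAT_ALAW, "channels": 1,
            "sample_rate": 8000, "bit_rate": 64000, "sample_size": 8}


def read_wav_format(data: bytes) -> dict:
    """Return the fields of the 'fmt ' chunk found in RIFF/WAVE bytes."""
    if data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, size = struct.unpack_from("<4sI", data, pos)
        if chunk_id == b"fmt ":
            tag, channels, rate, byte_rate, _align, bits = struct.unpack_from(
                "<HHIIHH", data, pos + 8)
            return {"format_tag": tag, "channels": channels, "sample_rate": rate,
                    "bit_rate": byte_rate * 8, "sample_size": bits}
        pos += 8 + size + (size & 1)  # chunks are padded to an even length
    raise ValueError("no 'fmt ' chunk found")


def spec_problems(fmt: dict) -> list:
    """List every field that deviates from the preferred specification."""
    return [f"{name}: expected {want}, found {fmt[name]}"
            for name, want in EXPECTED.items() if fmt[name] != want]
```

Reading a file in binary mode and passing its contents to `read_wav_format` gives a dictionary that `spec_problems` turns into a list of deviations; an empty list means the file matches the preferred format.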

In order to use custom audio files, they need to be available to the Voice API. This is facilitated using an SFTP server, which allows you to upload your custom audio files onto our servers. The URL and credentials for this server will be supplied upon registration.

You are free to create any directory structure on this server. Just make sure you supply the full path from the root of your SFTP account when sending an instruction that requires an audio file.

If you want to use custom spelling audio, these files must be placed in the correct structure, as explained in the next chapter.

Custom audio files for the OTP app

In order to use custom audio files with the OTP app, you need to upload a set of audio files in the following directory structure on the SFTP server:

/spelling/en-GB/*.wav

Here, ‘en-GB’ is the language (including locale) of the set. This must be a 5-character string. Inside this folder, you need to upload a .wav file for every number or letter you want to be able to read aloud, for example:

0.wav 1.wav 2.wav a.wav b.wav c.wav

Note that these file names are all lower case.
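
To see which files a given code needs, a small helper like the following could be used. It is only a sketch: `spelling_files` is a hypothetical name, and it assumes one lower-case file per distinct character in the directory layout shown above.

```python
def spelling_files(code: str, locale: str = "en-GB") -> list:
    """List the audio files needed to read `code` aloud, one per distinct character."""
    if len(locale) != 5:
        raise ValueError("locale must be a 5-character string such as 'en-GB'")
    needed = []
    for char in code.lower():  # file names are all lower case
        path = f"/spelling/{locale}/{char}.wav"
        if path not in needed:
            needed.append(path)
    return needed
```

Comparing this list with a directory listing of the SFTP server shows which recordings are still missing before a code is sent.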

Text-To-Speech

The Voice API supports Text-To-Speech (or TTS) in all instructions where you can provide a prompt to the caller/callee. When using TTS, you can provide the voice you want to use. Currently we support the following voices:

Language / Locale Gender Number of voices available
cy-GB Female 1
da-DK Female 1
da-DK Male 1
de-DE Female 2
de-DE Male 1
en-AU Female 1
en-AU Male 1
en-GB Female 2
en-GB Male 2
en-IN Female 1
en-US Female 5
en-US Male 2
es-ES Female 1
es-ES Male 1
es-US Female 1
es-US Male 1
fr-CA Female 1
fr-FR Female 1
fr-FR Male 1
is-IS Female 1
is-IS Male 1
it-IT Female 1
it-IT Male 1
ja-JP Female 1
nb-NO Female 1
nl-NL Female 1
nl-NL Male 1
pl-PL Female 2
pl-PL Male 2
pt-BR Female 1
pt-BR Male 1
pt-PT Female 1
pt-PT Male 1
ro-RO Female 1
ru-RU Female 1
ru-RU Male 1
sv-SE Female 1
tr-TR Female 1

When using TTS (or the Spelling Instruction), you can provide the voice to use in the JSON body. The voice part of the JSON body has the following variables:

Variable definition

Variable | Data type | Length | Required | Description
language | Alphanumeric | 5 | No (default: en-GB) | The language of the voice to use.
gender | Alphanumeric | 6 | No (default: Female) | The gender of the voice, either 'Male' or 'Female'.
number | Numeric | 3 | No (default: 1) | The number of the voice to use, if the given combination of language and gender provides multiple voices.

"voice": {
    "language": "nl-NL",
    "gender": "Male",
    "number": 1
}
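
The voice object can be built with a small helper that applies the documented defaults and rejects values outside the table above. This is a sketch; `make_voice` is an illustrative name, and the checks only mirror the lengths and values listed here, not the full list of supported voices.

```python
def make_voice(language: str = "en-GB", gender: str = "Female", number: int = 1) -> dict:
    """Build the "voice" part of a request body using the documented defaults."""
    if len(language) != 5:
        raise ValueError("language must be a 5-character locale such as 'nl-NL'")
    if gender not in ("Male", "Female"):
        raise ValueError("gender must be 'Male' or 'Female'")
    if not 1 <= number <= 999:  # the table allows at most 3 digits
        raise ValueError("number must be between 1 and 999")
    return {"language": language, "gender": gender, "number": number}
```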

The Apps

The next chapters describe the available apps and the parameters each of them requires or accepts. All apps share the same base structure and the same security system: each app is initiated by sending a POST request to the CM Server(s) with a JSON body containing the parameters and their values.

The CM server will only accept your requests if they contain the correct information in the Authorization header. Please see the Authentication section for more information.

Notification App

In order to initiate a Notification call, you can use the CM endpoint at:

https://voiceapi.cmtelecom.com/v2.0/Notification

The flow of the call is:

  1. Call the callee
  2. Wait for answer
  3. Play the prompt
  4. Hang up

Variable definition

Variable | Data type | Length | Required | Description
instruction-id | Alphanumeric | 64 | No | A string to identify the instruction, generated by the customer’s server. Useful to match events to the instruction they belong to.
callee | Alphanumeric | 24 | Yes | The number to dial, in international format.
caller | Alphanumeric | 24 | Yes | The number to show as the caller, in international format.*
anonymous | Boolean | 1 | No (default: false) | The caller number is hidden when set to true.
prompt | Alphanumeric | 500 | Yes | The text to say (TTS) or the (path and) name of the file to play. The path is always relative to the root of the customer’s SFTP folder.
prompt-type | Alphanumeric | 4 | No (default: File) | The type of the prompt, either TTS (Text-To-Speech) or File.
voice | JSON | * | No | The voice to use if using TTS. See Text-To-Speech.

* Please note that it is technically possible to supply any caller id, but you are not allowed (by law) to (ab)use telephone numbers not owned by you.

{
    "callee": "0031761234567",
    "caller": "0031769876543",
    "anonymous": false,
    "prompt": "The following is an automated voice message. We would like to inform you that you have a dentist appointment for next Tuesday. We hope to see you then. Best regards, your dentist.",
    "prompt-type": "TTS",
    "voice": {
        "language": "en-GB",
        "gender": "Male",
        "number": 1
    }
}
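
A body like the one above could be assembled in Python as follows. This is a minimal sketch: `notification_body` is a hypothetical helper, the checks only mirror the variable definition, and the compact JSON output (no newlines) is a choice made here to keep signing simple.

```python
import json


def notification_body(callee: str, caller: str, prompt: str,
                      prompt_type: str = "File", **optional) -> str:
    """Serialise a Notification instruction to the JSON text that is signed and POSTed."""
    if prompt_type not in ("TTS", "File"):
        raise ValueError("prompt-type must be 'TTS' or 'File'")
    if len(prompt) > 500:
        raise ValueError("prompt is limited to 500 characters")
    payload = {"callee": callee, "caller": caller,
               "prompt": prompt, "prompt-type": prompt_type}
    # Optional variables are passed with underscores, e.g. instruction_id="abc".
    for name, value in optional.items():
        payload[name.replace("_", "-")] = value
    return json.dumps(payload, separators=(",", ":"))
```

For example, `notification_body("0031761234567", "0031769876543", "Hello", "TTS", anonymous=False)` returns a single-line JSON string that can be signed and sent as-is.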

Notification synchronously

As an alternative to the callback scenario, the Notification app can also be used synchronously. In this case, the POST request waits for the call to finish and then returns a call-finished event containing the final call information.

In order to use this alternative, you can use the following endpoint:

https://voiceapi.cmtelecom.com/v2.0/Notification/sync

Send exactly the same request as you would to the normal Notification endpoint.

The resulting JSON message will contain information on the completed call.

Variable definition

Variable | Data type | Length | Required | Description
type | Alphanumeric | 32 | Yes | The type of the event. Always “call-finished” for an event that completes an App call.
call-id | UUID / GUID | 36 | Yes | The 36-character (lowercase, including dashes) hexadecimal representation of the call identifier. This identifier is included in all requests.
instruction-id | Alphanumeric | 64 | Yes | The instruction identifier as supplied with the instruction this event belongs to.
result | JSON | * | Yes | The final result of the call. See Result codes.
started-on | Alphanumeric | 27 | Yes | The timestamp* (in UTC) of the start of the call.
answered-on | Alphanumeric | 27 | No (can be null) | The timestamp* (in UTC) at which the call was answered.
finished-on | Alphanumeric | 27 | Yes | The timestamp* (in UTC) of the end of the call.

* Timestamps are all in the format yyyy-MM-ddTHH:mm:ss.ffffffZ. Note that the examples in this document show up to seven fractional digits, so a parser should not assume exactly six.

{
  "type": "call-finished",
  "call-id": "f4de7e75-492a-4647-8684-6633f7756b35",
  "instruction-id": "MY TEST",
  "result": {
    "code": 10,
    "description": "Finished successfully"
  },
  "started-on": "2017-04-12T09:23:10.4888099Z",
  "answered-on": "2017-04-12T09:23:14.3037423Z",
  "finished-on": "2017-04-12T09:23:31.9903801Z"
}
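
The response can be parsed with the standard library. The sketch below is one possible approach, with illustrative function names; it pads or truncates the fractional seconds to six digits because the sample timestamps carry a varying number of them.

```python
import json
from datetime import datetime, timezone


def parse_timestamp(value):
    """Parse a Voice API timestamp into an aware UTC datetime (None stays None)."""
    if value is None:
        return None
    base, _, fraction = value.rstrip("Z").partition(".")
    microseconds = int((fraction + "000000")[:6])  # pad or truncate to 6 digits
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=microseconds, tzinfo=timezone.utc)


def summarise_call(event_json: str) -> dict:
    """Extract the call id, the success flag and the connected time in seconds."""
    event = json.loads(event_json)
    answered = parse_timestamp(event.get("answered-on"))
    finished = parse_timestamp(event["finished-on"])
    return {
        "call_id": event["call-id"],
        "succeeded": event["result"]["code"] == 10,
        "seconds_connected": (finished - answered).total_seconds() if answered else 0.0,
    }
```

Because answered-on can be null, the sketch reports a connected time of 0.0 seconds for calls that were never answered.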

One Time Password (OTP) App

In order to initiate an OTP call, you can use the CM endpoint at:

https://voiceapi.cmtelecom.com/v2.0/OTP

The flow of the app is as follows:

  1. Call the callee
  2. Wait for answer
  3. Play the intro-prompt (e.g. This is the CM OTP Service)
  4. Play the code-prompt (e.g. Your login code is)
  5. Play the code (e.g. A B C 1 2 3)
  6. Play the replay-prompt (e.g. Press 1 to repeat the code)
  7. Play the outro-prompt (e.g. Thank you for using this service, the call will now be disconnected.)
  8. Hang up
  8. Hang up

If the callee presses 1 during step 6, the flow will repeat from step 4.
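
The flow above can be modelled in a few lines to reason about what a callee will hear. This is a sketch under an assumption that the text does not confirm: that max-replays caps the number of repeats and that the replay-prompt is played after every reading of the code. `otp_flow` is a hypothetical name.

```python
def otp_flow(replay_presses: int, max_replays: int = 3) -> list:
    """Return the order of prompts when the callee presses 1 `replay_presses` times."""
    repeats = min(replay_presses, max_replays)  # assumed meaning of max-replays
    steps = ["intro-prompt"]
    for _ in range(repeats + 1):  # the first reading plus each repeat
        steps += ["code-prompt", "code", "replay-prompt"]
    steps += ["outro-prompt", "hang up"]
    return steps
```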

Variable definition

Variable | Data type | Length | Required | Description
instruction-id | Alphanumeric | 64 | No | A string to identify the instruction, generated by the customer’s server. Useful to match events to the instruction they belong to.
callee | Alphanumeric | 24 | Yes | The number to dial, in international format.
caller | Alphanumeric | 24 | Yes | The number to show as the caller, in international format.*
anonymous | Boolean | 1 | No (default: false) | The caller number is hidden when set to true.
intro-prompt | Alphanumeric | 500 | No | The text to say (TTS) or the (path and) name of the file to play as the intro. The path is always relative to the root of the customer’s SFTP folder.
intro-prompt-type | Alphanumeric | 4 | No (default: File) | The type of the prompt, either TTS (Text-To-Speech) or File.
code-prompt | Alphanumeric | 500 | No | The text to say (TTS) or the (path and) name of the file to play before reading the code. This prompt is repeated when the code is repeated. The path is always relative to the root of the customer’s SFTP folder.
code-prompt-type | Alphanumeric | 4 | No (default: File) | The type of the prompt, either TTS (Text-To-Speech) or File.
code | Alphanumeric | 64 | Yes | The code to read to the callee. Note that this code is read character by character, not as a word or number.
code-type | Alphanumeric | 8 | No (default: Default) | The type of audio to use, either default prompts (Default), your own prompts (Custom) or a TTS voice (TTS).
replay-prompt | Alphanumeric | 500 | No | The text to say (TTS) or the (path and) name of the file to play, to instruct the callee to press 1 to repeat the code. The path is always relative to the root of the customer’s SFTP folder.
replay-prompt-type | Alphanumeric | 4 | No (default: File) | The type of the prompt, either TTS (Text-To-Speech) or File.
outro-prompt | Alphanumeric | 500 | No | The text to say (TTS) or the (path and) name of the file to play before ending the call. The path is always relative to the root of the customer’s SFTP folder.
outro-prompt-type | Alphanumeric | 4 | No (default: File) | The type of the prompt, either TTS (Text-To-Speech) or File.
max-replays | Numeric | 1 | No (default: 3) | The maximum number of times the code can be repeated.
auto-replay | Boolean | 1 | No (default: false) | If true, the code-prompt and code are repeated automatically. If false, the replay-prompt is played and the app waits for the callee to press "1".
voice | JSON | * | No | The voice to use if using TTS. See Text-To-Speech.

* Please note that it is technically possible to supply any caller id, but you are not allowed (by law) to (ab)use telephone numbers not owned by you.

{
    "instruction-id": "893a663c-0a9c-4d85-b717-ce2787b3937f",
    "callee": "0031761234567",
    "caller": "0031769876543",
    "anonymous": false,
    "intro-prompt": "/myprompts/Welcome.wav",
    "intro-prompt-type": "File",
    "code": "123abc",
    "code-type": "TTS",
    "replay-prompt": "/myprompts/Press1ToReplay.wav",
    "replay-prompt-type": "File",
    "outro-prompt": "Thank you for using this service.",
    "outro-prompt-type": "TTS",
    "max-replays": 3,
    "auto-replay": false,
    "voice": {
        "language": "en-GB",
        "gender": "Female",
        "number": 1
    }
}

OTP synchronously

As an alternative to the callback scenario, the OTP app can also be used synchronously. In this case, the POST request waits for the call to finish and then returns a call-finished event containing the final call information.

In order to use this alternative, you can use the following endpoint:

https://voiceapi.cmtelecom.com/v2.0/OTP/sync

Send exactly the same request as you would to the normal OTP endpoint.

The resulting JSON message will contain information on the completed call.

Variable definition

Variable | Data type | Length | Required | Description
type | Alphanumeric | 32 | Yes | The type of the event. Always “call-finished” for an event that completes an App call.
call-id | UUID / GUID | 36 | Yes | The 36-character (lowercase, including dashes) hexadecimal representation of the call identifier. This identifier is included in all requests.
instruction-id | Alphanumeric | 64 | Yes | The instruction identifier as supplied with the instruction this event belongs to.
result | JSON | * | Yes | The final result of the call. See Result codes.
started-on | Alphanumeric | 27 | Yes | The timestamp* (in UTC) of the start of the call.
answered-on | Alphanumeric | 27 | No (can be null) | The timestamp* (in UTC) at which the call was answered.
finished-on | Alphanumeric | 27 | Yes | The timestamp* (in UTC) of the end of the call.

* Timestamps are all in the format yyyy-MM-ddTHH:mm:ss.ffffffZ. Note that the examples in this document show up to seven fractional digits, so a parser should not assume exactly six.

{
  "type": "call-finished",
  "call-id": "d4726774-047b-4e71-9184-3d9517d130f8",
  "instruction-id": "893a663c-0a9c-4d85-b717-ce2787b3937f",
  "result": {
    "code": 10,
    "description": "Finished successfully"
  },
  "started-on": "2017-04-12T09:54:19.9467117Z",
  "answered-on": "2017-04-12T09:54:26.4824263Z",
  "finished-on": "2017-04-12T09:55:16.0153205Z"
}
Request-DTMF App

In order to initiate a Request-DTMF call, you can use the CM endpoint at:

https://voiceapi.cmtelecom.com/v2.0/DTMF

The flow of a call is as follows:

  1. Call the callee
  2. Wait for answer
  3. Play the prompt (e.g. Please enter some digits)
  4. Play the valid-prompt or the invalid-prompt, depending on whether the input matches the regex and the minimum and maximum number of digits (e.g. Thank you for your input, or That's not right)
  5. Hang up

If the input was not valid, step 3 is repeated after step 4, until the callee has entered valid input or max-attempts attempts have been made.

Variable definition

Variable | Data type | Length | Required | Description
instruction-id | Alphanumeric | 64 | No | A string to identify the instruction, generated by the customer’s server. Useful to match events to the instruction they belong to.
callee | Alphanumeric | 24 | Yes | The number to dial, in international format.
caller | Alphanumeric | 24 | Yes | The number to show as the caller, in international format.*
anonymous | Boolean | 1 | No (default: false) | The caller number is hidden when set to true.
prompt | Alphanumeric | 500 | No | The text to say (TTS) or the (path and) name of the file to play as the intro. The path is always relative to the root of the customer’s SFTP folder.
prompt-type | Alphanumeric | 4 | No (default: File) | The type of the prompt, either TTS (Text-To-Speech) or File.
valid-prompt | Alphanumeric | 500 | No | The text to say (TTS) or the (path and) name of the file to play when valid input is received. The path is always relative to the root of the customer’s SFTP folder.
valid-prompt-type | Alphanumeric | 4 | No (default: File) | The type of the prompt, either TTS (Text-To-Speech) or File.
invalid-prompt | Alphanumeric | 500 | No | The text to say (TTS) or the (path and) name of the file to play when invalid input is received. This prompt is played after each invalid attempt. The path is always relative to the root of the customer’s SFTP folder.
invalid-prompt-type | Alphanumeric | 4 | No (default: File) | The type of the prompt, either TTS (Text-To-Speech) or File.
min-digits | Numeric | 2 | Yes | The minimum number of digits required as input from the callee.
max-digits | Numeric | 2 | Yes | The maximum number of digits accepted as input from the callee.
max-attempts | Numeric | 2 | Yes | The maximum number of attempts the callee gets to supply valid input.
timeout | Numeric | 8 | No (default: 5000) | The maximum time in ms between the end of the prompt audio and the first digit, or between digits. If no digit is received before this timeout, it is counted as an attempt and the prompt is restarted. The value must be between 1000 and 10000 ms.
terminators | Alphanumeric | 8 | No (default: #) | A list of digits that cause the input to be terminated. Used in cases where you want to state “Enter your … number, ending with a #”.
regex | Alphanumeric | 64 | No (default: [0-9]*) | The regex to match the input against. An attempt fails if the input does not match this regular expression. Please note that you may need to escape certain characters in JSON.
voice | JSON | * | No | The voice to use if using TTS. See Text-To-Speech.
callback-url | Alphanumeric | 256 | No | The URL (including http(s)://) for the callback. Defaults to the configured callback URL if this variable is not supplied.

* Please note that it is technically possible to supply any caller id, but you are not allowed (by law) to (ab)use telephone numbers not owned by you.

{
    "instruction-id": "DTMF 234-ed7", 
    "callee": "0031769876543",
    "caller": "0031761234567",
    "anonymous": false,
    "prompt": "This is an automated voice response system from C M. Please enter some digits.",
    "prompt-type": "TTS",
    "valid-prompt": "Thank you for your input.",
    "valid-prompt-type": "TTS",
    "invalid-prompt": "That's not right!",
    "invalid-prompt-type": "TTS",
    "min-digits": 2,
    "max-digits": 5,
    "max-attempts": 3,
    "timeout": 5000,
    "terminators": "#*",
    "regex": "^[1-3]+[90]$",
    "voice": {
        "language": "en-GB",
        "gender": "Male",
        "number": 2
    }
}
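
Before sending a request like the one above, it can help to try the intended rules against sample input locally. The sketch below mirrors the min-digits, max-digits and regex checks in Python. It assumes the whole input must match the expression, and Python's `re` module may differ in details from the regex engine used by the CM Server(s); `is_valid_input` is a hypothetical name.

```python
import re


def is_valid_input(digits: str, min_digits: int, max_digits: int,
                   regex: str = "[0-9]*") -> bool:
    """Approximate the server-side validation of DTMF input (assumed full match)."""
    if not min_digits <= len(digits) <= max_digits:
        return False
    return re.fullmatch(regex, digits) is not None
```

With the values from the example request, `is_valid_input("1239", 2, 5, "^[1-3]+[90]$")` is accepted, while input ending in any digit other than 9 or 0 is rejected.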

Request-DTMF App callback

When the call is completed, the CM Server(s) will send a POST request with a JSON body to your server (at the supplied callback URL). The body has the following content:

Variable definition

Variable | Data type | Length | Required | Description
type | Alphanumeric | 32 | Yes | The type of the event. Always “dtmf” for an event that returns the DTMF digits received as the result of a get-dtmf instruction.
call-id | UUID / GUID | 36 | Yes | The 36-character (lowercase, including dashes) hexadecimal representation of the call identifier. This identifier is included in all requests.
instruction-id | Alphanumeric | 64 | Yes | The instruction identifier as supplied with the instruction this event belongs to.
digits | Alphanumeric | 64 | Yes | The DTMF data that was received, excluding the terminator symbol if it was used (usually #). Empty if no (correct) DTMF input was received from the callee.

{
  "type": "dtmf",
  "call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
  "instruction-id": "DTMF 234-ed7",
  "digits": "1234"
}

Request-DTMF synchronously

As an alternative to the callback scenario, the Request-DTMF app can also be used synchronously. In this case, the POST request waits for the call to finish and then returns a call-finished event that also contains the digits entered by the callee.

In order to use this alternative, you can use the following endpoint:

https://voiceapi.cmtelecom.com/v2.0/DTMF/sync

Variable definition

Variable | Data type | Length | Required | Description
type | Alphanumeric | 32 | Yes | The type of the event. Always “call-finished” for an event that completes an App call.
call-id | UUID / GUID | 36 | Yes | The 36-character (lowercase, including dashes) hexadecimal representation of the call identifier. This identifier is included in all requests.
instruction-id | Alphanumeric | 64 | Yes | The instruction identifier as supplied with the instruction this event belongs to.
result | JSON | * | Yes | The final result of the call. See Result codes.
started-on | Alphanumeric | 27 | Yes | The timestamp* (in UTC) of the start of the call.
answered-on | Alphanumeric | 27 | No (can be null) | The timestamp* (in UTC) at which the call was answered.
finished-on | Alphanumeric | 27 | Yes | The timestamp* (in UTC) of the end of the call.
digits | Alphanumeric | 64 | Yes | The DTMF data that was received, excluding the terminator symbol if it was used (usually #). Empty (null) if no (correct) DTMF input was received from the callee.

* Timestamps are all in the format yyyy-MM-ddTHH:mm:ss.ffffffZ. Note that the examples in this document show up to seven fractional digits, so a parser should not assume exactly six.

{
  "type": "call-finished",
  "call-id": "121c6803-a127-43f1-ab20-ed963513c526",
  "instruction-id": "MY TEST",
  "result": {
    "code": 10,
    "description": "Finished successfully"
  },
  "started-on": "2017-04-12T07:09:25.078274Z",
  "answered-on": "2017-04-12T07:09:27.1507692Z",
  "finished-on": "2017-04-12T07:10:00.0157796Z",
  "digits": "54321"
}

Result codes

When you make a synchronous call, or receive a result on a callback URL, part of the message contains the result of the call.

The result is a JSON object containing a code and a description.

Variable definition

Variable | Data type | Length | Required | Description
code | Numeric | 2 | Yes | The code of the result.
description | Alphanumeric | 128 | Yes | The description of the result.

Possible results

Code | Description | Explanation
9 | Cancelled | The call was not answered and was cancelled when the timeout (60 seconds) was reached.
10 | Finished successfully | The call was placed and answered before ending.
11 | Failed | The call could not be made.
12 | Call rejected | The callee actively refused the call. This could also mean that the telecom provider owning the number refused it, so please check the validity of the number.

"result": {
    "code": 10,
    "description": "Finished successfully"
}
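
In client code, the codes from the table above can be kept in a small lookup. The names below are illustrative; the mapping itself is copied from the table, and unknown codes fall back to the description sent by the server.

```python
# Result codes as listed in the "Possible results" table.
RESULT_CODES = {
    9: "Cancelled",
    10: "Finished successfully",
    11: "Failed",
    12: "Call rejected",
}


def is_success(result: dict) -> bool:
    """True only for code 10, the single success code in the table."""
    return result.get("code") == 10


def describe_result(result: dict) -> str:
    """Prefer the documented description, fall back to what the server sent."""
    code = result.get("code")
    return RESULT_CODES.get(code, result.get("description", f"Unknown code {code}"))
```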

Authentication

To secure the exchange, every request sent between the Voice API and your server is signed with an HMAC-SHA-256 signature. Only the HTTP request is signed. The reply does not need a signature: the signature proves who originated the request, and the reply necessarily comes from the server that was called, especially since HTTPS is required.

The signature is the HMAC-SHA-256 hash of the request body, keyed with the shared key that you receive from CM. The examples in this chapter show it as 64 lowercase hexadecimal characters.

In order to validate the request, the signature is placed in the "Authorization" header of the request. When the message is sent from your server to CM, the header must also contain your username, following the syntax username=<username>;signature=<signature>, for example:

username=myusername1234;signature=a2e806968a3e163ef56e12ad812f4350fe889cb559049af6860406a4c9e468b9


If the Authorization header is missing, or if the signature does not match the one calculated by the CM Server(s), you will receive a 401 - Unauthorized response.
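
A minimal sketch of how the signature and the Authorization header might be computed in Python. It assumes that the body and the shared key are encoded as UTF-8 and that the signature is the lowercase hexadecimal digest, which is what the example header suggests; the function names are illustrative.

```python
import hashlib
import hmac


def sign(body: bytes, shared_key: str) -> str:
    """HMAC-SHA-256 over the exact request body, as a lowercase hex string."""
    return hmac.new(shared_key.encode("utf-8"), body, hashlib.sha256).hexdigest()


def authorization_header(username: str, body: bytes, shared_key: str) -> str:
    """Header value for requests sent from your server to CM."""
    return f"username={username};signature={sign(body, shared_key)}"
```

The bytes passed to `sign` should be the same bytes that are sent as the request body; signing one serialisation and sending another leads to a 401.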

If the message is sent from CM to your server, the header will only contain the signature, following the syntax: signature=<signature>, example:

signature=a2e806968a3e163ef56e12ad812f4350fe889cb559049af6860406a4c9e468b9
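
On your own server, an incoming callback can be checked by recomputing the signature over the raw request body and comparing it with the header. The sketch below assumes UTF-8 for the shared key and a hexadecimal signature; `verify_callback` is a hypothetical helper, and how the raw body and header are obtained depends on your web framework.

```python
import hashlib
import hmac


def verify_callback(body: bytes, authorization: str, shared_key: str) -> bool:
    """Return True if the Authorization header matches the HMAC of the raw body."""
    prefix = "signature="
    if not authorization.startswith(prefix):
        return False
    received = authorization[len(prefix):].strip().lower()
    expected = hmac.new(shared_key.encode("utf-8"), body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

The comparison must use the body exactly as received, before any JSON parsing or re-serialisation.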

Check authentication

In order to test your authentication logic, the Voice API hosts a special "Check authentication" endpoint at:

https://voiceapi.cmtelecom.com/v2.0/CheckAuthentication

This endpoint accepts POST requests, containing any body you like. You have to add the Authorization header as described in this chapter, signing the chosen body for the request.

The service will respond with either a 200 - OK (in case the signature is correct) or a 401 - Unauthorized message.

For this example, let us assume your username is 'myusername' and your HMAC key (Shared Key) is KWWppDsf1bm8nZZqmnCtl/RZR&CB2wHq.

So, for instance, you might send the following body:

check authentication


Using this body as the HMAC body and your shared key as the HMAC Key, we get the following hash:

dc05cbba45eb2276fecc3e723413113e7edd6721ff2df8ce12c5828ef513a57e


Combining that with your username, this will give the following Authorization header:

username=myusername;signature=dc05cbba45eb2276fecc3e723413113e7edd6721ff2df8ce12c5828ef513a57e


Sending this to the endpoint would result in a 200 - OK if the user existed. Because this example user does not exist, the actual response is a 401 - Unauthorized. With real credentials, changing anything in either the body or the signature makes the endpoint return a 401 - Unauthorized.

curl --request POST \
  --url https://voiceapi.cmtelecom.com/v2.0/CheckAuthentication \
  --header 'authorization: username=myusername;signature=dc05cbba45eb2276fecc3e723413113e7edd6721ff2df8ce12c5828ef513a57e' \
  --data 'check authentication'
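
The same request could be sent from Python with the standard library. This is a sketch rather than an official client: the function names are illustrative, the key is assumed to be UTF-8 encoded, and building the request is kept separate from sending it so that the header can be inspected without network access.

```python
import hashlib
import hmac
import urllib.error
import urllib.request

CHECK_URL = "https://voiceapi.cmtelecom.com/v2.0/CheckAuthentication"


def build_check_request(username: str, shared_key: str,
                        body: bytes = b"check authentication") -> urllib.request.Request:
    """Create the signed POST request without sending it."""
    signature = hmac.new(shared_key.encode("utf-8"), body, hashlib.sha256).hexdigest()
    request = urllib.request.Request(CHECK_URL, data=body, method="POST")
    request.add_header("Authorization", f"username={username};signature={signature}")
    return request


def check_authentication(username: str, shared_key: str) -> int:
    """Send the request and return the HTTP status code (200 or 401 per the text above)."""
    try:
        with urllib.request.urlopen(build_check_request(username, shared_key),
                                    timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # urllib raises for 4xx responses such as 401
```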
Example 1 - a Notification request

The signature for this example is constructed as the HMAC-SHA256 hash over the following JSON string:

HMAC Body =

{ "callee": "0031123456789", "caller": "0031765727000", "anonymous": false, "prompt": "The following is an automated voice message. We would like to inform you that you have a dentist appointment for next Tuesday. We hope to see you then. Best regards, your dentist.", "prompt-type": "TTS", "voice": { "language": "en-GB", "gender": "Male", "number": 1 }}

HMAC key (Shared Key) = Kq5@h^S%wcC0i1=IL\1pbT20n^A3Ha-s

Resulting in the following hash:

11f7abc6dd11e3f62c3cea8227821a17d3f87acf558d43fac2ad6435de118346


Which would make the complete Authorization header:

username=myusername1234;signature=11f7abc6dd11e3f62c3cea8227821a17d3f87acf558d43fac2ad6435de118346

Please note that the header contains both your username and the calculated signature.

curl --request POST \
  --url https://voiceapi.cmtelecom.com/v2.0/Notification/ \
  --header 'authorization: username=myusername1234;signature=11f7abc6dd11e3f62c3cea8227821a17d3f87acf558d43fac2ad6435de118346' \
  --header 'cache-control: no-cache' \
  --header 'content-type: application/json' \
  --data '{ "callee": "0031123456789", "caller": "0031765727000", "anonymous": false, "prompt": "The following is an automated voice message. We would like to inform you that you have a dentist appointment for next Tuesday. We hope to see you then. Best regards, your dentist.", "prompt-type": "TTS", "voice": { "language": "en-GB", "gender": "Male", "number": 1 }}'

Example 2 - a DTMF result

NOTE: The synchronous result message is not signed; only the message posted to the callback URL is.

The signature for this example is constructed as the HMAC-SHA256 hash over the following:

HMAC body =

{ "type": "dtmf", "call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e", "instruction-id": "DTMF 234-ed7", "digits": "1234" }

HMAC key (Shared Key) = TeZ,-KZ$^Q5<A~<P-&zsqy>WsFq<*zn!

Resulting in the following hash:

7015b2c8c6109c7e765ef2e728347feda0b7199baaa8ef6d42f870ead1074fd2


And an Authorization header with the following content:

signature=7015b2c8c6109c7e765ef2e728347feda0b7199baaa8ef6d42f870ead1074fd2

Please note that the header contains only the signature, not your username.

Tips

When testing with Postman, it is advisable to use a JSON body without newlines. If you do use newlines, you might run into authorization issues, because the newlines sent by Postman (or other tools), usually \r\n, can differ from the newlines used by the (online) tool that calculates the signature. In code, newlines do not matter as long as you calculate the signature over the exact body you send: that is the body the CM servers receive, so they will calculate the same signature.
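
The effect of newline differences can be reproduced in a few lines. The sketch below uses a placeholder key and illustrative function names; it signs the same JSON once with \n and once with \r\n newlines, and shows one way to avoid the issue by serialising once and signing exactly those bytes.

```python
import hashlib
import hmac
import json


def sign(body: bytes, shared_key: str) -> str:
    return hmac.new(shared_key.encode("utf-8"), body, hashlib.sha256).hexdigest()


def serialise_and_sign(payload: dict, shared_key: str):
    """Serialise once, then sign and send exactly these bytes."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")  # no newlines
    return body, sign(body, shared_key)


# The same JSON content with \n and with \r\n newlines gives different signatures.
key = "example-shared-key"  # placeholder, not a real key
pretty = json.dumps({"callee": "0031761234567", "prompt-type": "TTS"}, indent=2)
signature_lf = sign(pretty.encode("utf-8"), key)
signature_crlf = sign(pretty.replace("\n", "\r\n").encode("utf-8"), key)
print(signature_lf == signature_crlf)
```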