The Voice API is a system that enables you to easily write IVR (Interactive Voice Response) applications without setting up complicated telephone systems.
The Voice API is actually not a web API, but rather a web client, as it will call your server to inform it of updates and ask for the next step(s) to perform. Only the endpoint that allows you to initiate an outbound call is an actual web API.
The Voice API server calls your HTTP(S) server with a POST request containing JSON data about a new incoming call, a newly set up outgoing call, or a status update on a call (for instance, that an audio file has finished playing). Your server has to acknowledge this information and reply with the next steps, such as "play an audio file", "make a voice recording" or "get DTMF (number) input".
When the Voice API has performed these steps, it will again contact your server with updates on these steps and your server will again give it the next step(s), etc.
Only when the Voice API sends a "disconnected" message does it not expect a new step; in that case it just expects a 200 OK response.
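The request/response loop described above can be sketched as a single dispatch function. This is a minimal sketch under stated assumptions, not CM's reference implementation: the function name `next_instructions`, the prompt path and the instruction-id values are hypothetical, while the event and instruction field names follow the JSON examples elsewhere in this document.

```python
def next_instructions(event: dict) -> list:
    """Return the next step(s) for one event posted by the Voice API."""
    call_id = event["call-id"]

    if event["type"] == "disconnected":
        # The call is over: no new step is expected, only a 200 OK.
        return []

    if event["type"] == "new-call":
        # Hypothetical first step: play a welcome file to the caller.
        return [{
            "type": "play",
            "call-id": call_id,
            "instruction-id": "welcome-1",     # generated by your server
            "prompt": "prompts/en/hello.wav",  # hypothetical file path
            "prompt-type": "File",
        }]

    # Any other event (done, dtmf, recorded, ...): end the call in this sketch.
    return [{
        "type": "disconnect",
        "call-id": call_id,
        "instruction-id": "end-1",
    }]
```

A real flow would branch on more event types and keep per-call state keyed by `call-id`.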
Authentication for the Voice API uses your CM API key (also known as your product token).
An API Key can be obtained from the API Settings menu in the Voice Management app. To access this app, you need at least a prepaid CM.com platform account.
In order to authenticate your request, add the following header, containing your API Key:
X-CM-PRODUCTTOKEN
The location (URL) of your server needs to be configured in the system, along with the inbound phone number(s) associated (if any). Using this information, the Voice API knows what incoming phone call to connect to what server.
For outbound calls, only the URL of your server has to be known.
Right now, this is something that CM has to do, but a portal is on its way!
If custom audio files are to be used, they should preferably meet the following specification:
Other formats might work, but mileage may vary.
Uploading files using Audio Manager
In order to use custom audio files, they need to be available to the Voice API. You can use the Audio Manager CM Platform app for this (https://audiomanager.cmtelecom.com). This app is available to all customers with a Voice API / apps account. If you do not have access to the app, please contact support@cm.nl.
You are free to create any directory structure on this server, just make sure you supply the whole path from the root of the Audio Manager when sending an instruction that requires an audio file. E.g. /test-folder/testaudio.wav
If you want to use custom spelling audio, these files must be placed in the correct structure, as explained in the next chapter.
In order to use files with the Spell instruction (See chapter Spell instruction), you need to upload a set of audio files in the following directory structure on the SFTP server:
/spelling/en-GB/*.wav
Where ‘en-GB’ is the language (including locale) of the set. This must be a 5-character string. Inside this folder, you need to upload a .wav file for every number or letter you want to be able to read aloud, like:
0.wav, 1.wav, 2.wav, a.wav, b.wav, c.wav
Note that these file names are all lower case.
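To check which files a given code would need under this directory layout, a small helper can list the expected paths. This is only a sketch: the function name `spelling_files` is hypothetical, and lowercasing the characters is an assumption based on the note that file names are all lower case.

```python
def spelling_files(code: str, language: str = "en-GB") -> list:
    """List the .wav paths that spelling `code` would require."""
    if len(language) != 5:
        raise ValueError("language must be a 5-character string such as 'en-GB'")
    # One file per letter or digit; file names are lower case.
    return [f"/spelling/{language}/{ch.lower()}.wav" for ch in code if ch.isalnum()]


print(spelling_files("A1"))
```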
The Voice API supports Text-To-Speech (or TTS) in all instructions where you can provide a prompt to the caller/callee. When using TTS, you can provide the voice you want to use. Currently we support the following voices:
| Language / Locale | Gender | Number of voices available |
|---|---|---|
| cy-GB | Female | 1 |
| da-DK | Female | 1 |
| da-DK | Male | 1 |
| de-DE | Female | 2 |
| de-DE | Male | 1 |
| en-AU | Female | 1 |
| en-AU | Male | 1 |
| en-GB | Female | 2 |
| en-GB | Male | 2 |
| en-IN | Female | 1 |
| en-US | Female | 5 |
| en-US | Male | 2 |
| es-ES | Female | 1 |
| es-ES | Male | 1 |
| es-US | Female | 1 |
| es-US | Male | 1 |
| fr-CA | Female | 1 |
| fr-FR | Female | 1 |
| fr-FR | Male | 1 |
| hi-IN | Female | 1 |
| is-IS | Female | 1 |
| is-IS | Male | 1 |
| it-IT | Female | 1 |
| it-IT | Male | 1 |
| ja-JP | Female | 1 |
| ja-JP | Male | 1 |
| ko-KR | Female | 1 |
| nb-NO | Female | 1 |
| nl-NL | Female | 1 |
| nl-NL | Male | 1 |
| pl-PL | Female | 2 |
| pl-PL | Male | 2 |
| pt-BR | Female | 1 |
| pt-BR | Male | 1 |
| pt-PT | Female | 1 |
| pt-PT | Male | 1 |
| ro-RO | Female | 1 |
| ru-RU | Female | 1 |
| ru-RU | Male | 1 |
| sv-SE | Female | 1 |
| tr-TR | Female | 1 |
| zh-CHS | Female | 1 |
When using TTS (or the Spelling Instruction), you can provide the voice to use in the JSON body. The voice part of the JSON body has the following variables:
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| language | Alphanumeric | 5 | No (Default: en-GB) | The language of the voice to use |
| gender | Alphanumeric | 6 | No (Default: Female) | The gender of the voice, either 'Male' or 'Female' |
| number | Numeric | 3 | No (Default: 1) | The number of the voice to use, if the given combination of language and gender provides multiple voices. |
| volume | Numeric | 1 | No (Default: 0) | The volume level of the voice. Must be between -4 and 4. |
"voice": {
  "language": "nl-NL",
  "gender": "Male",
  "number": 1,
  "volume": 2
}
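The constraints in the table above can be checked before a voice object is sent. The helper below is a minimal sketch; the name `make_voice` is hypothetical, and the checks only mirror the documented lengths, values and defaults.

```python
def make_voice(language: str = "en-GB", gender: str = "Female",
               number: int = 1, volume: int = 0) -> dict:
    """Build the "voice" part of an instruction, using the documented defaults."""
    if len(language) != 5:
        raise ValueError("language must be 5 characters, e.g. 'nl-NL'")
    if gender not in ("Male", "Female"):
        raise ValueError("gender must be 'Male' or 'Female'")
    if number < 1:
        raise ValueError("number must be at least 1")
    if not -4 <= volume <= 4:
        raise ValueError("volume must be between -4 and 4")
    return {"language": language, "gender": gender,
            "number": number, "volume": volume}


voice = make_voice("nl-NL", "Male", volume=2)
```

Whether a given language, gender and number combination actually exists still has to be checked against the list of supported voices.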
Your server, which handles the POST commands of the Voice API, should be an HTTP(S) server with a basic POST handler. If you want to encrypt the data sent between the Voice API and your server (which we highly recommend), this server has to be an HTTPS server, using its own SSL certificate.
All the different POST commands will be sent to the same endpoint (same URL), no matter the contents. The difference is purely in the contents of the JSON body.
The server must respond as quickly as possible (ideally within 300 ms), because any delay makes the call feel awkward and unnatural to the caller. If the server does not respond within 5000 ms (5 seconds), an error prompt will be played to the caller and the call will be disconnected. A Disconnected Event will also be sent to your server.
Every communication (except instructions to initiate an outbound call) between the Voice API server and your server is initiated by the Voice API server, which sends a POST command to your server. The next step(s) is/are specified in the response to this POST command.
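A single-endpoint POST handler of this kind could look roughly as follows, using only the Python standard library. This is a sketch, not a production server: `decide`, `handle_payload`, `VoiceApiHandler` and `serve` are hypothetical names, `decide` is a placeholder for your own call-flow logic, and the reply is serialized as a bare JSON array, following the combined-instructions example in this document.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def decide(events: list) -> list:
    """Hypothetical placeholder: return the next instruction(s) for these events."""
    return []


def handle_payload(raw: bytes) -> bytes:
    """Parse a POST body (one event or an array of events) and build the reply."""
    payload = json.loads(raw)
    events = payload if isinstance(payload, list) else [payload]
    return json.dumps(decide(events)).encode("utf-8")


class VoiceApiHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Every event arrives on the same URL; only the JSON body differs.
        length = int(self.headers.get("Content-Length", "0"))
        body = handle_payload(self.rfile.read(length))
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the handler until interrupted (plain HTTP; put TLS in front for HTTPS)."""
    HTTPServer(("", port), VoiceApiHandler).serve_forever()
```

Keeping `decide` free of slow I/O helps stay well within the response time described above.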
The following events are supported by the Voice API:
The Voice API supports sending and receiving multiple events and instructions at once.
When a call is received for your phone number, an HTTP POST will be sent to your server, informing it of the new call and asking for the first instruction to perform on this call.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “new-call” when a new call is received. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| caller | Alphanumeric | 25 | Yes | This is the phone number of the caller, if known, “anonymous” otherwise. Phone numbers are always in international format E.164. |
| callee | Alphanumeric | 25 | Yes | The phone number called by the caller. Phone numbers are always in international format E.164. |
| direction | Alphanumeric | 8 | Yes | The direction of the call, either "inbound" or "outbound". |
{
  "type": "new-call",
  "call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
  "caller": "+31...",
  "callee": "+31...",
  "direction": "inbound"
}
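Because the caller field holds either an E.164 number or the literal "anonymous", it can be useful to normalize it when a new-call event arrives. The sketch below assumes the standard E.164 shape (a plus sign followed by at most 15 digits); the helper name `caller_number` is hypothetical.

```python
import re

# E.164: "+" followed by up to 15 digits, the first of which is not 0.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")


def caller_number(event: dict):
    """Return the caller's number, or None when the caller is anonymous."""
    caller = event["caller"]
    if caller == "anonymous":
        return None
    if not E164.match(caller):
        raise ValueError(f"unexpected caller format: {caller!r}")
    return caller
```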
This HTTP POST is sent after the last instruction has been completed, for instance when the audio file has been successfully played to the caller, as requested.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “done” for events that simply indicate that an instruction has been performed. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | The instruction identifier as supplied with the instruction this event belongs to. |
| voicemail-detected | Boolean | 1 | No | Only sent when instructed to do voicemail detection. See Play-instruction for more details on voicemail detection. |
{
"type": "done",
"call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
"instruction-id": "PLAY welcome.wav INTRO 234d23q"
}
This HTTP POST will be sent as a response to a ‘get-dtmf’ instruction, after we have received DTMF data from the caller.
In case no input, or no correct input was received, the field “digits” will contain an empty string.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “dtmf” for an event that returns the dtmf digits received as the result of a get-dtmf instruction. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | The instruction identifier as supplied with the instruction this event belongs to. |
| digits | Alphanumeric | 64 | Yes | This is the DTMF data that was received, excluding the terminator symbol if it was used (usually #). Empty if no (correct) dtmf input was received from the caller. |
{
"type": "dtmf",
"call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
"instruction-id": "DTMF 234-ed7",
"digits": "1234"
}
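Since an empty "digits" string signals missing or incorrect input, a menu handler typically needs a fallback branch. The following is a minimal sketch; `route_menu`, the menu contents and the branch names are hypothetical.

```python
def route_menu(event: dict, menu: dict, fallback: str) -> str:
    """Map the digits of a dtmf event to a menu branch name."""
    if event.get("type") != "dtmf":
        raise ValueError("expected a dtmf event")
    digits = event.get("digits", "")
    if not digits:
        # No input, or no correct input, was received from the caller.
        return fallback
    return menu.get(digits, fallback)


menu = {"1": "sales", "2": "support"}  # hypothetical menu
branch = route_menu({"type": "dtmf", "digits": "2"}, menu, "repeat-menu")
```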
This HTTP POST is sent after a recording has been made. It sends the name of the file (a UUID + .wav), which can be downloaded from the FTPS server in the /recordings folder. Please note that the recording might not be immediately available on the server. The event simply indicates that the end user has finished the recording and should be presented with the next step in the flow.
You can easily read back the recording to the user by issuing a Play File instruction using the just recorded file (i.e. /recordings/filename.wav) as the file to be played.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “recorded” for the event informing the server that a recording has been made. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | The instruction identifier as supplied with the instruction this event belongs to. |
| file-name | Alphanumeric | 40 | Yes | The filename, which is made up as a UUID + .wav. |
{
"type": "recorded",
"call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
"instruction-id": "RECORD NAME",
"file-name": "96c6cf33-5da0-4612-870a-00e7ba6dddc2.wav"
}
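Reading the recording back to the caller amounts to turning the recorded event into a play instruction. The sketch below shows one way to do that; the function name `playback_instruction` and the instruction-id value are hypothetical, and the play fields follow the Play instruction described in this document.

```python
def playback_instruction(recorded: dict, instruction_id: str) -> dict:
    """Build a play instruction for the file named in a recorded event."""
    if recorded.get("type") != "recorded":
        raise ValueError("expected a recorded event")
    return {
        "type": "play",
        "call-id": recorded["call-id"],
        "instruction-id": instruction_id,
        # Recordings are stored in the /recordings folder.
        "prompt": "/recordings/" + recorded["file-name"],
        "prompt-type": "File",
    }


event = {
    "type": "recorded",
    "call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
    "instruction-id": "RECORD NAME",
    "file-name": "96c6cf33-5da0-4612-870a-00e7ba6dddc2.wav",
}
reply = [playback_instruction(event, "PLAYBACK NAME")]
```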
This HTTP POST is sent after a bridge has been attempted.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “bridged” for the event informing the server that a call has been bridged. |
| connected | Boolean | 1 | Yes | True means the other party is connected, false means no connection could be made. |
{
"type": "bridged",
"call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
"instruction-id": "7",
"connected": true
}
This HTTP POST is sent whenever the connection with the caller is lost. Only if it is a result of a disconnect instruction by the customer will it include an instruction-id.
Be advised, this event can happen at any time during the call so please make sure your software can handle this event at any moment.
Whenever a caller disconnects before an instruction has been completed, a disconnect will be sent to the server, but no done or other event indicating that the last instruction was completed.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “disconnected” for the event informing the server that the call has been disconnected. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | No | The instruction identifier as supplied with the instruction this event belongs to, if this disconnect is the result of such an instruction. Omitted if the disconnect was initiated by the caller or the result of an error. |
{
"type": "disconnected",
"call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
"instruction-id": "end-call 273487"
}
When the Voice API receives multiple instructions in a single reply from your server, the resulting events of these instructions are combined in a single POST afterwards. For instance:
This could be a POST after receiving instructions for playing a file, retrieving some dtmf and disconnecting afterwards.
Please note that even though the instructions are combined, they still each have their unique instruction-id.
For a series of instructions, if the caller disconnects during the execution of the instructions, only the completed ones will have an event in the POST, together with the disconnect event.
[
{
"type": "done",
"call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
"instruction-id": "PLAY WELCOME welcome.wav"
},
{
"type": "dtmf",
"call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
"instruction-id": "GET-DTMF 007",
"digits": "1234"
},
{
"type": "disconnected",
"call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e",
"instruction-id": "END-OF-CALL 1237 FINAL"
}
]
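A handler therefore has to accept both a single event object and an array of events, and it should notice when the batch ends with a disconnect. A possible sketch, with the hypothetical helper name `summarize`:

```python
def summarize(payload):
    """Return (instruction-ids of completed steps, whether the call was disconnected)."""
    events = payload if isinstance(payload, list) else [payload]
    completed = [
        e["instruction-id"]
        for e in events
        if e["type"] != "disconnected" and "instruction-id" in e
    ]
    disconnected = any(e["type"] == "disconnected" for e in events)
    return completed, disconnected


batch = [
    {"type": "done", "call-id": "c1", "instruction-id": "PLAY WELCOME welcome.wav"},
    {"type": "dtmf", "call-id": "c1", "instruction-id": "GET-DTMF 007", "digits": "1234"},
    {"type": "disconnected", "call-id": "c1", "instruction-id": "END-OF-CALL 1237 FINAL"},
]
completed, disconnected = summarize(batch)
```

Instructions missing from `completed` were not finished, for example because the caller hung up first.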
Upon receiving a POST command from the Voice API server, your server needs to reply with the next step to take. This chapter describes the possible steps and their possible parameters.
Please note that the instruction-id values have to be generated on your server and will be used in the result that is sent once the instruction has been performed by the API.
The following instructions are supported:
Since the Voice API is capable of sending arrays of events and instructions, instructions are always encapsulated by an array called 'instructions'.
This instructs the Voice API server to play an audio file to the caller. The file needs to be available on the FTP server at CM.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “play” for an instruction to play an audio file or a TTS prompt to the caller. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. |
| prompt | Alphanumeric | 500 | Yes | The text to say (TTS) or the (path and) name of the file to play. The path is always relative to the root of the FTP folder of the customer. |
| prompt-type | Alphanumeric | 4 | No (default = File) | The type of the prompt, either TTS (Text-To-Speech) or File. |
| voice | JSON | * | No | The voice to use if using TTS. See Text-To-Speech. |
| call-leg | Alphanumeric | 4 | No (default = Both) | When the call is bridged to another, this determines what leg to play the audio on. A = first connected party, B = second connected party, Both = both parties. |
| terminators | Alphanumeric | 8 | No (default = *) | The key(s) that can be pressed to stop the playback. |
Notice
Due to the fact that voicemail detection was not up to CM standard, it has been switched off for the time being. We are working on a new solution.
{
"type": "play",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "instruct-007",
"prompt": "prompts/en/hello.wav",
"prompt-type": "File",
"terminators": "#"
}
This instructs the Voice API server to ask for and receive DTMF input from the caller. It plays the given prompt, which should contain the instruction for the caller, and records the DTMF that the caller sends during or after this prompt. Note that the prompt will stop playing on input by the caller.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “get-dtmf” for an instruction to ask the caller for dtmf input. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. |
| min-digits | Numeric | 8 | No (default = 1) | Minimum number of digits to receive. Value must be between 1 and 64. |
| max-digits | Numeric | 8 | No (default = 1) | Maximum number of digits to receive. Value must be between 1 and 64, but greater than or equal to min-digits. |
| max-attempts | Numeric | 8 | No (default = 1) | Maximum number of attempts before receiving DTMF is cancelled. An attempt fails when the user enters too few digits before pressing the terminator, or when the input does not match the regex. Value must be between 1 and 10. |
| timeout | Numeric | 8 | No (default = 5000) | The max. time in ms between the end of the prompt audio and the first digit, or between digits. If no digit is received before this timeout, it is counted as an attempt and the prompt is restarted. Value must be between 1000 and 10000 ms. |
| terminators | Alphanumeric | 8 | No (default = #) | A list of digits that cause the input to be terminated. Used in cases where you want to state “Enter your … number, ending with a #”. |
| prompt | Alphanumeric | 128 | Yes | The text to say (TTS) or the (path and) name of the file to play. The path is always relative to the root of the FTP folder. |
| prompt-type | Alphanumeric | 4 | No (default = File) | The type of the prompt, either TTS (Text-To-Speech) or File. |
| invalid-prompt | Alphanumeric | 128 | Yes | The text to say (TTS) or the (path and) name of the file to play when invalid dtmf was received. The path is always relative to the root of the FTP folder. |
| invalid-prompt-type | Alphanumeric | 4 | No (default = File) | The type of the prompt, either TTS (Text-To-Speech) or File. |
| voice | JSON | * | No | The voice to use if using TTS. See Text-To-Speech. |
| regex | Alphanumeric | 64 | No (default is [0-9]*) | The regex to match the input against. An attempt will fail if the input does not match this regular expression. Please note that you may need to escape certain characters in JSON. |
{
  "type": "get-dtmf",
  "call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
  "instruction-id": "i0012",
  "min-digits": 1,
  "max-digits": 4,
  "max-attempts": 3,
  "timeout": 1000,
  "terminators": "#*",
  "prompt": "Please enter some digits.",
  "prompt-type": "TTS",
  "invalid-prompt": "That was not correct.",
  "invalid-prompt-type": "TTS",
  "voice": {
    "language": "en-GB",
    "gender": "Female",
    "number": 2
  },
  "regex": "[1-9]\\d*"
}
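The value ranges in the table above lend themselves to a small builder that rejects out-of-range values before they reach the API. This is a sketch only: `make_get_dtmf` is a hypothetical name, and compiling the regex with Python's `re` module is merely a sanity check, on the assumption that the API's regex dialect is similar.

```python
import json
import re


def make_get_dtmf(call_id, instruction_id, prompt, invalid_prompt, *,
                  prompt_type="File", min_digits=1, max_digits=1,
                  max_attempts=1, timeout=5000, terminators="#",
                  regex="[0-9]*"):
    """Build a get-dtmf instruction after checking the documented ranges."""
    if not 1 <= min_digits <= 64:
        raise ValueError("min-digits must be between 1 and 64")
    if not min_digits <= max_digits <= 64:
        raise ValueError("max-digits must be between min-digits and 64")
    if not 1 <= max_attempts <= 10:
        raise ValueError("max-attempts must be between 1 and 10")
    if not 1000 <= timeout <= 10000:
        raise ValueError("timeout must be between 1000 and 10000 ms")
    re.compile(regex)  # raises re.error on a malformed pattern
    return {
        "type": "get-dtmf", "call-id": call_id, "instruction-id": instruction_id,
        "min-digits": min_digits, "max-digits": max_digits,
        "max-attempts": max_attempts, "timeout": timeout,
        "terminators": terminators,
        "prompt": prompt, "prompt-type": prompt_type,
        "invalid-prompt": invalid_prompt, "invalid-prompt-type": prompt_type,
        "regex": regex,
    }


instruction = make_get_dtmf("81536d6f-6a9f-4906-8ef8-cb1e5643f885", "i0012",
                            "Please enter some digits.", "That was not correct.",
                            prompt_type="TTS", max_digits=4, max_attempts=3,
                            regex=r"[1-9]\d*")
# json.dumps takes care of escaping the backslash in the regex.
body = json.dumps(instruction)
```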
This instruction spells out a given code to the caller. Please note that the code is read character per character, so 123 is read as “one, two, three”, not as “one hundred and twenty-three”.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “spell” for an instruction to spell out a code to the caller. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. |
| code | Alphanumeric | 64 | Yes | The code to read to the caller. Note that this code is read character per character, not as a word or number. |
| code-type | Alphanumeric | 8 | No (default = Default) | The type of audio to use, either default prompts (Default), your own prompts (Custom) or a TTS voice (TTS). |
| voice | JSON | * | No | The voice to use (for all types of code). See also Text-To-Speech. |
The supported languages for the Default code-type at the time of writing are:
| Language | Parameter Value | Available Characters |
|---|---|---|
| English (Default) | en-GB | 0-9, A-Z |
| Dutch | nl-NL | 0-9, A-Z |
| Spanish | es-ES | 0-9 |
| Italian | it-IT | 0-9 |
| German | de-DE | 0-9 |
| French | fr-FR | 0-9 |
If you want to use your own custom prompts, see chapter Custom audio files for spell instruction for more information.
{
  "type": "spell",
  "call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
  "instruction-id": "SPELL12357",
  "code": "12357",
  "code-type": "Custom",
  "voice": {
    "language": "en-GB"
  }
}
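Before sending a spell instruction that relies on the default prompts, a code can be checked against the characters listed per language in the table above. The sketch below uses hypothetical names (`DEFAULT_SPELL_CHARS`, `can_spell_with_defaults`) and assumes that letters are matched case-insensitively.

```python
import string

DIGITS = set(string.digits)
DIGITS_AND_LETTERS = DIGITS | set(string.ascii_uppercase)

# Characters available in the default prompts, per the table above.
DEFAULT_SPELL_CHARS = {
    "en-GB": DIGITS_AND_LETTERS,
    "nl-NL": DIGITS_AND_LETTERS,
    "es-ES": DIGITS,
    "it-IT": DIGITS,
    "de-DE": DIGITS,
    "fr-FR": DIGITS,
}


def can_spell_with_defaults(code: str, language: str = "en-GB") -> bool:
    """True if every character of `code` has a default prompt in `language`."""
    allowed = DEFAULT_SPELL_CHARS.get(language)
    if allowed is None:
        return False
    return all(ch.upper() in allowed for ch in code)
```

When the check fails, a Custom or TTS code-type may be the better choice.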
This instruction makes a recording of the voice of the caller. This can be used to have the caller say his name, or his place of residence.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “record” for an instruction to make a recording. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. |
| max-recording-time | Numeric | 8 | Yes | The maximum time (in seconds) to record. The value must be between 1 and 120 seconds. |
| silence-time | Numeric | 8 | No (default = 3) | The time (in seconds) the caller needs to be silent for the recording to stop. Value must be between 1 and 30 seconds. |
| silence-threshold | Numeric | 8 | No (default = 200) | The "sound energy" below which audio is seen as "silent". A higher value will help ending the recording with silence-time in noisy environments. Value must be between 1 and 1000. |
| terminators | Alphanumeric | 8 | No (default = *) | The key(s) that can be pressed to stop the recording. |
| prompt | Alphanumeric | 500 | Yes | The text to say (TTS) or the (path and) name of the file to play. The path is always relative to the root of the FTP folder. |
| prompt-type | Alphanumeric | 4 | No (default = File) | The type of the prompt, either TTS (Text-To-Speech) or File. |
| voice | JSON | * | No | The voice to use if using TTS. See Text-To-Speech. |
{
"type": "record",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "RECORD-NAME",
"max-recording-time": 30,
"silence-time": 3,
"silence-threshold": 500,
"terminators": "#",
"prompt": "prompts/en-GB/SayYourName.wav",
"prompt-type": "File"
}
This instructs the Voice API server to bridge (forward) the call to another callee.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “bridge” for an instruction to bridge a call |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. |
| callee | Alphanumeric | 24 | Yes | The number to dial in international format. |
| caller | Alphanumeric | 24 | Yes | The number to show as the caller in international format.* |
| anonymous | Boolean | 1 | No (default = false) | The caller number will not be shown to the callee if this is set to true. Please note that you still need to supply a valid caller. |
| max-ring-time | Numeric | 2 | No (default = 30) | The maximum time (in seconds) for the phone of the callee to ring. |
| ring-back | JSON | * | No (default = European tone) | The ringback sound to play to the first party while the phone of the callee is ringing. |
* Please note that it is technically possible to supply any caller id, but you are not allowed (by law) to (ab)use telephone numbers not owned by you.
{
"type": "bridge",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "BRIDGE-TO-31761234567",
"callee": "0031761234567",
"caller": "0031769876543",
"max-ring-time": 30,
"ringback": [
{
"beep-duration": 1000,
"primary-beep-frequency": 425.0,
"secondary-beep-frequency": 0.0,
"pause-duration": 3500
}
]
}
The ringback is a separate piece of JSON, constructed as an array of tones, each defined with 4 properties:
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| beep-duration | Numeric | 4 | No (default = 1000) | The duration of the beep, in milliseconds |
| primary-beep-frequency | Numeric | 4 + 1 decimal | No (default = 425.0) | The primary frequency of the beep, in Hz |
| secondary-beep-frequency | Numeric | 4 + 1 decimal | No (default = 0.0) | The secondary frequency of the beep, in Hz |
| pause-duration | Numeric | 4 | No (default = 3500) | The pause after the beep, in milliseconds |
These beeps are played after each other, in an endless loop, until either the callee answers or the max-ring-time is reached. Leaving everything at the default setting will result in the standard European ringback.
[
{
"beep-duration": 400,
"primary-beep-frequency": 400.0,
"secondary-beep-frequency": 425.0,
"pause-duration": 200
},
{
"beep-duration": 400,
"primary-beep-frequency": 400.0,
"secondary-beep-frequency": 425.0,
"pause-duration": 2200
}
]
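A ringback definition like the one above can be assembled from per-tone defaults, and the length of one loop follows from adding up the beep and pause durations. In this sketch the names `make_tone` and `cycle_length` are hypothetical, and the durations are assumed to be in milliseconds.

```python
def make_tone(beep_duration=1000, primary=425.0, secondary=0.0,
              pause_duration=3500) -> dict:
    """Build one ringback tone, using the documented defaults."""
    return {
        "beep-duration": beep_duration,
        "primary-beep-frequency": primary,
        "secondary-beep-frequency": secondary,
        "pause-duration": pause_duration,
    }


def cycle_length(ringback: list) -> int:
    """Total duration of one pass through the tones before the loop repeats."""
    return sum(t["beep-duration"] + t["pause-duration"] for t in ringback)


# The two-beep pattern from the example above.
double_ring = [
    make_tone(400, 400.0, 425.0, 200),
    make_tone(400, 400.0, 425.0, 2200),
]
```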
This instructs the Voice API server to just wait and do nothing. Usually only used to have a bridged call just go for a given time.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “wait” for an instruction to just wait. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. |
| duration | Numeric | 4 | Yes | The duration (in seconds) to wait. |
The wait instruction will result in a Done event.
{
"type": "wait",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "42",
"duration": 120
}
This instruction ends the connection with the caller. This should normally only be done when the IVR flow has completed, preferably after an audio file that explains that the conversation is over and the connection will be ended. This gives the caller the opportunity to hang up before the system does.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “disconnect” for an instruction to disconnect the call. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | Yes | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. |
{
"type": "disconnect",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "end-call 56739"
}
In order to send a sequence of instructions that need to be executed in the given order, you can combine multiple instructions in a JSON array when replying to a POST command. For instance:
Please note that even though the instructions are bundled, they still each need their own (unique) instruction-id.
[
{
"type": "play",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "instruct-007",
"prompt": "prompts/en/hello.wav",
"prompt-type": "File",
"terminators": "#"
},
{
"type": "get-dtmf",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "DTMF-75-Q8",
"min-digits": 1,
"max-digits": 4,
"max-attempts": 3,
"timeout": 1000,
"terminators": "#*",
"prompt": "Please enter some digits.",
"prompt-type": "TTS",
"invalid-prompt": "That was not correct.",
"invalid-prompt-type": "TTS",
"voice": {
"language": "en-GB",
"gender": "Female",
"number": 2
},
"regex": "[1-9]\\d*"
},
{
"type": "disconnect",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "END-CALL 78374"
}
]
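Since every instruction in a bundle needs its own instruction-id, it can help to verify a batch before replying. The check below is a sketch with a hypothetical name, `check_batch`; requiring a single call-id per batch is an assumption based on the examples, not a documented rule.

```python
def check_batch(instructions: list) -> list:
    """Raise ValueError if a batch reuses an instruction-id or mixes call-ids."""
    ids = [i["instruction-id"] for i in instructions]
    duplicates = sorted({x for x in ids if ids.count(x) > 1})
    if duplicates:
        raise ValueError(f"duplicate instruction-id(s): {duplicates}")
    # Assumption: all instructions in one reply belong to the same call.
    if len({i["call-id"] for i in instructions}) > 1:
        raise ValueError("instructions refer to different call-ids")
    return instructions
```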
In order to initiate an outbound call, you can use the CM endpoint at:
https://voiceapi.cmtelecom.com/v2.0/VoiceApi (which is the endpoint for an outbound call for the VoiceAPI)
The CM server will only accept your request if it contains the correct information in the Authorization header. Please see section Authentication - example 2 for more info.
In order to initiate an outbound call, you can send a place-call instruction to the Voice API server(s). The API-call will immediately return, returning the call-id for the new call (if the instruction is accepted).
When (and if) the phone call is answered, a POST command will be sent to your server, equal to the flow of an incoming call. Basically the only difference is the field direction, which will now contain the word outbound rather than inbound.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| instruction-id | Alphanumeric | 64 | No | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. |
| callee | Alphanumeric | 24 | Yes | The number to dial in international format. |
| caller | Alphanumeric | 24 | Yes | The number to show as the caller in international format.* |
| callback-url | Alphanumeric | 256 | No | The url (including http(s)://) for the callback from the VoiceAPI. Defaults to the configured callback url if this variable is not supplied. |
| anonymous | Boolean | 1 | No (Default: false) | The caller number is hidden when set to true. |
In contrast to the other instructions, the place-call instruction does not have a type field - since the type is determined by the endpoint used - and the instruction-id is not required - since the resulting event is logically linked to the instruction.
There are also other instructions you can send to the CM server(s) like the place-call instruction, which will initiate a pre-configured flow, without the need of a server to handle POST-commands. These instructions are explained in the Voice API Apps documents.
* Please note that it is technically possible to supply any caller id, but you are not allowed (by law) to (ab)use telephone numbers not owned by you.
{
  "instruction-id": "Dial out to 0031765727001",
  "callee": "0031765727000",
  "caller": "0031765727001",
  "callback-url": "https://voiceapicallback.cm.com:1234",
  "anonymous": false
}
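Putting the endpoint, the product token header and the place-call body together, an outbound call request could be prepared as shown below. This is a sketch that only builds the request and does not send it: `place_call_request` is a hypothetical name, the `X-CM-PRODUCTTOKEN` header is taken from the authentication description earlier in this document, and the `Content-Type` header is an assumption.

```python
import json
import urllib.request

ENDPOINT = "https://voiceapi.cmtelecom.com/v2.0/VoiceApi"


def place_call_request(product_token, callee, caller, *, callback_url=None,
                       instruction_id=None, anonymous=False):
    """Prepare (but do not send) a place-call POST request."""
    body = {"callee": callee, "caller": caller, "anonymous": anonymous}
    if callback_url is not None:
        body["callback-url"] = callback_url
    if instruction_id is not None:
        body["instruction-id"] = instruction_id
    req = urllib.request.Request(
        ENDPOINT, data=json.dumps(body).encode("utf-8"), method="POST")
    req.add_header("X-CM-PRODUCTTOKEN", product_token)
    req.add_header("Content-Type", "application/json")  # assumption
    return req


req = place_call_request("your-product-token", "0031765727000", "0031765727001",
                         callback_url="https://voiceapicallback.cm.com:1234")
# Sending would be: urllib.request.urlopen(req)
```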
The call-queued event is the only event that is actually sent as a response to a POST command (namely the place-call instruction).
The call-queued event only informs your software of the fact that the call has been accepted, is currently queued to be dialled and has the given call-id assigned to it. Further processing is done in the exact same way as for inbound calls, with the only difference being the value of the direction field in the new-call event, which is now outbound.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the instruction. Always “call-queued” for the event informing the server that the call has been initiated. |
| call-id | UUID / GUID | 36 | Yes | The identifier assigned to the new call. It is included in all further events for this call. |
| instruction-id | Alphanumeric | 64 | No | A string to identify the instruction, useful to match events to the instruction they belong to, generated by the customer’s server. Only available if supplied in the Place Call instruction. |
| success | Boolean | 1 | Yes | True if the number was dialled, false otherwise |
{
"type": "call-queued",
"call-id": "c67f305e-48f7-4019-9bc1-63a36532b448",
"instruction-id": "PLACE outbound call",
"success": true
}
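A minimal Python sketch of reading this response could look as follows. The function name `parse_call_queued` is hypothetical, and the sketch assumes that the response body is exactly the JSON shown above.

```python
import json


def parse_call_queued(response_text):
    """Return (call_id, instruction_id, success) from a call-queued response."""
    event = json.loads(response_text)
    if event.get("type") != "call-queued":
        raise ValueError(f"unexpected event type: {event.get('type')!r}")
    # instruction-id is only present if it was supplied with the place-call instruction.
    return event["call-id"], event.get("instruction-id"), bool(event["success"])


sample = """{
"type": "call-queued",
"call-id": "c67f305e-48f7-4019-9bc1-63a36532b448",
"instruction-id": "PLACE outbound call",
"success": true
}"""
call_id, instruction_id, success = parse_call_queued(sample)
```

Storing the returned call-id makes it possible to match the later new-call event and its follow-up events to this outbound call.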
When an exception occurs in the Voice API, it will send a POST command to your server to inform it of this exception.
Whenever the Voice API receives an instruction that could not be properly parsed from the JSON text, it will send a new POST command with an Invalid JSON exception.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the event. Always exception for an exception. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | No | The instruction identifier as supplied with the instruction this exception belongs to, if it was available and could be read from the JSON. |
| code | Numeric | 8 | Yes | Code for the exception, for an Invalid JSON exception, this is always 400. |
| title | Alphanumeric | 32 | Yes | Title of the exception. For an Invalid JSON exception this will always read “invalid json”. |
| message | Alphanumeric | 1000 | Yes | Readable description of the exception. |
Please note that – in contrast to all other exceptions – the field instruction-id is optional for this exception. If the received JSON was so malformed that the Voice API could not get this value from the JSON string, this field will be omitted.
{
"type": "exception",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"code": 400,
"title": "invalid json",
"message": "The JSON could not be properly parsed."
}
Whenever the JSON string can be parsed, but the contents do not represent a valid instruction, i.e. the type of instruction is unknown, the Voice API will send a new POST command with an invalid instruction exception.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the event. Always exception for an exception. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | No | The instruction identifier as supplied with the instruction this exception belongs to, if it was available and could be read from the JSON. |
| code | Numeric | 8 | Yes | Code for the exception, for an invalid instruction exception, this is always 405. |
| title | Alphanumeric | 32 | Yes | Title of the exception. For an Invalid Instruction exception this will always read “invalid instruction”. |
| message | Alphanumeric | 1000 | Yes | Readable description of the exception. |
{
"type": "exception",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "DOES THIS WORK",
"code": 405,
"title": "invalid instruction",
"message": "The type of instruction could not be mapped."
}
When the Voice API misses a required parameter for an instruction, or finds an invalid value for a parameter, it will send a new POST command with an Invalid Parameter exception.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the event. Always exception for an exception. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | No | The instruction identifier as supplied with the instruction this exception belongs to, if it was available and could be read from the JSON. |
| code | Numeric | 8 | Yes | Code for the exception, for an invalid parameter exception, this is always 406. |
| title | Alphanumeric | 32 | Yes | Title of the exception. For an Invalid Parameter exception this will always read “invalid parameter”. |
| message | Alphanumeric | 1000 | No | Readable description of the exception. |
{
"type": "exception",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "TEST123",
"code": 406,
"title": "invalid parameter",
"message": "The value for min-digits needs to be numeric."
}
When the Voice API receives an instruction to play a file, but it cannot find the file specified, it will send a new POST command with information on what file could not be found.
| Variable | Data type | Length | Required | Description |
|---|---|---|---|---|
| type | Alphanumeric | 32 | Yes | The type of the event. Always exception for an exception. |
| call-id | UUID / GUID | 36 | Yes | The 36 character (lowercase, including dashes) hexadecimal representation of the call identifier. This number is included in all requests. |
| instruction-id | Alphanumeric | 64 | No | The instruction identifier as supplied with the instruction this exception belongs to, if it was available and could be read from the JSON. |
| code | Numeric | 8 | Yes | Code for the exception, for a file not found exception, this is always 404. |
| title | Alphanumeric | 32 | No | Title of the exception. Always “file not found” for a File Not Found exception. |
| message | Alphanumeric | 1000 | Yes | Readable description of the exception. |
{
"type": "exception",
"call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
"instruction-id": "9510d84e-58e8-4836-839b-c05ba4615571",
"code": 404,
"title": "file not found",
"message": "The following file could not be found: prompts/en/helo.wav."
}
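The four exception events share the same layout and differ only in their code and title, so a server can handle them in one place. The Python sketch below shows one possible way to do that. The names `EXCEPTION_KINDS` and `describe_exception` are assumptions for this example; the codes are the ones listed in the tables above.

```python
# Exception codes as documented for the Voice API exception events.
EXCEPTION_KINDS = {
    400: "invalid json",
    404: "file not found",
    405: "invalid instruction",
    406: "invalid parameter",
}


def describe_exception(event):
    """Turn a parsed exception event (a dict) into a single log line."""
    if event.get("type") != "exception":
        raise ValueError("not an exception event")
    kind = EXCEPTION_KINDS.get(event.get("code"), "unknown")
    # instruction-id may be missing, for example when the JSON was unreadable.
    instruction_id = event.get("instruction-id", "<none>")
    message = event.get("message", "")
    return f"[{kind}] call={event['call-id']} instruction={instruction_id}: {message}"


line = describe_exception({
    "type": "exception",
    "call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885",
    "code": 400,
    "title": "invalid json",
    "message": "The JSON could not be properly parsed.",
})
```

Looking up the numeric code rather than the title is a design choice of this sketch; either field could be used, since both are fixed per exception type.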
To provide security, every request sent between the Voice API and your server is signed with an HMAC-SHA-256 signature. This applies only to the HTTP request; the reply does not need to be signed. The signature serves to prove the originator of the request, and the reply logically comes from the server that was called, especially since HTTPS is required.
The signature is calculated with the HMAC-SHA-256 algorithm over the body of the request. The key for the hashing algorithm is the shared key (which you receive from CM).
To allow the request to be validated, the signature is placed in the "Authorization" header of the request. If the message is sent from CM to your server, the header will contain only the signature, following the syntax: signature=<signature>, example:
signature=a2e806968a3e163ef56e12ad812f4350fe889cb559049af6860406a4c9e468b9
When the message is sent from your server to CM, it must also contain your username, following the syntax: username=<username>;signature=<signature>, example:
username=myusername1234;signature=a2e806968a3e163ef56e12ad812f4350fe889cb559049af6860406a4c9e468b9
If the Authorization header is missing, or if the signature does not match the one calculated by the CM Server(s), you will receive a 401 - Unauthorized.
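The two header formats can be sketched in Python as shown below. The function names are hypothetical. The sketch also assumes that the shared key is encoded as UTF-8 and that the signature is the lowercase hexadecimal digest, which matches the form of the examples but is not stated explicitly in this document.

```python
import hashlib
import hmac


def sign(body: bytes, shared_key: str) -> str:
    """HMAC-SHA-256 over the raw request body, as lowercase hex (assumed encoding)."""
    return hmac.new(shared_key.encode("utf-8"), body, hashlib.sha256).hexdigest()


def outbound_authorization(username: str, body: bytes, shared_key: str) -> str:
    """Header value for a request sent from your server to CM."""
    return f"username={username};signature={sign(body, shared_key)}"


def verify_inbound(header: str, body: bytes, shared_key: str) -> bool:
    """Check the header of a request sent from CM to your server."""
    prefix = "signature="
    if not header.startswith(prefix):
        return False
    received = header[len(prefix):].encode("utf-8")
    expected = sign(body, shared_key).encode("utf-8")
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(received, expected)
```

A server would typically call `verify_inbound` with the raw bytes of the request body, before parsing the JSON, and reject the request when it returns `False`.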
In order to test your authentication logic, the VoiceAPI hosts a special "Check authentication" endpoint at:
https://voiceapi.cmtelecom.com/v2.0/CheckAuthentication
This endpoint accepts POST requests, containing any body you like. You have to add the Authorization header as described in this chapter, signing the chosen body for the request.
The service will respond with either a 200 - OK (in case the signature is correct) or a 401 - Unauthorized message.
For this example, let us assume your username is 'myusername' and your HMAC key (Shared Key) is KWWppDsf1bm8nZZqmnCtl/RZR&CB2wHq.
So, for instance, you might send the following body:
check authentication
Using this body as the HMAC body and your shared key as the HMAC Key, we get the following hash:
dc05cbba45eb2276fecc3e723413113e7edd6721ff2df8ce12c5828ef513a57e
Combining that with your username, this will give the following Authorization header:
username=myusername;signature=dc05cbba45eb2276fecc3e723413113e7edd6721ff2df8ce12c5828ef513a57e
Sending this to the endpoint should result in a 200 - OK. (With these exact example values it will not, since this user does not exist.) If you change anything in either the body or the signature, the endpoint will return a 401 - Unauthorized.
curl --request POST \
  --url https://voiceapi.cmtelecom.com/v2.0/CheckAuthentication \
  --header 'authorization: username=myusername;signature=dc05cbba45eb2276fecc3e723413113e7edd6721ff2df8ce12c5828ef513a57e' \
  --data 'check authentication'
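The same check can be prepared from code. The Python sketch below builds the request with the standard library but does not send it. The function name `build_check_request` is hypothetical, and the sketch assumes a UTF-8 encoded key and a lowercase hexadecimal signature.

```python
import hashlib
import hmac
import urllib.request

CHECK_AUTH_URL = "https://voiceapi.cmtelecom.com/v2.0/CheckAuthentication"


def build_check_request(username: str, shared_key: str, body: bytes):
    """Build (but do not send) a signed POST request for the check endpoint."""
    signature = hmac.new(shared_key.encode("utf-8"), body, hashlib.sha256).hexdigest()
    request = urllib.request.Request(CHECK_AUTH_URL, data=body, method="POST")
    request.add_header("Authorization", f"username={username};signature={signature}")
    return request


request = build_check_request(
    "myusername",
    "KWWppDsf1bm8nZZqmnCtl/RZR&CB2wHq",
    b"check authentication",
)
# To send it, call urllib.request.urlopen(request). That call returns normally
# on a 200 response and raises urllib.error.HTTPError on a 401 response.
```

Signing the same `body` bytes that are passed as `data` keeps the signature and the transmitted body in step.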
For a message sent from CM to your server, the signature is constructed as the HMAC-SHA256 hash over the following:
HMAC body =
{ "type": "dtmf", "call-id": "586b1c6a-3e7c-41a6-bc27-80c2360f842e", "instruction-id": "4a5114dd-4fb3-47d2-947a-1d4599a5023f", "digits": "1234" }
HMAC key (Shared Key) = >=1WbAS5=uZC>GzC?c8Ow:$b@f>qBezC
Resulting in the following hash:
840430e6e3b67a54cae22345c399a0a6d4208559341956c16a5f25401334979a
And an Authorization header with the following content:
signature=840430e6e3b67a54cae22345c399a0a6d4208559341956c16a5f25401334979a
Please note that the header contains only the signature, not your username.
For a message sent from your server to CM, the signature is constructed as the HMAC-SHA256 hash over the following JSON string:
HMAC Body =
{ "type": "get-dtmf", "call-id": "81536d6f-6a9f-4906-8ef8-cb1e5643f885", "instruction-id": "8a39e321-e832-4dd5-8c73-d244e0fff7b4", "min-digits": 1, "max-digits": 4, "max-attempts": 3, "timeout": 1000, "terminators": "#*", "prompt": "prompts/en/EnterSomething.wav", "prompt-type": "File", "invalid-prompt": "prompts/en/Retry.wav", "invalid-prompt-type": "File", "regex": "[1-9]\\d*" }
HMAC key (Shared Key) = Jq5+mr0ORnw?AjY5X;@FH=ke>x9!+*L=
Resulting in the following hash:
1063e00569c743ec016a8acc958e67df5c3d986c174074a8b92fccfb1d3198e0
Which would make the complete Authorization header:
username=myusername1234;signature=1063e00569c743ec016a8acc958e67df5c3d986c174074a8b92fccfb1d3198e0
Please note that the header now contains both your username and the calculated signature.
When testing with Postman, it is advisable to use a JSON body without newlines. If you do use newlines, you might end up with authorization issues, as the newlines in Postman (or other tools) might be different (usually \r\n) from the newlines used in the (online) tool you use to calculate the signature.
In code, it does not matter whether you use newlines, as long as you calculate the signature over the exact body that you send. The CM servers calculate the signature over the body they receive, so both sides will arrive at the same signature.
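The effect of newlines can be illustrated with a short Python sketch. The key and the payload below are made-up values for this example, and the sketch assumes a UTF-8 encoded key and a hexadecimal signature.

```python
import hashlib
import hmac
import json


def sign(body: bytes, shared_key: str) -> str:
    """HMAC-SHA-256 over the given bytes, as lowercase hex (assumed encoding)."""
    return hmac.new(shared_key.encode("utf-8"), body, hashlib.sha256).hexdigest()


shared_key = "example-shared-key"  # made-up key, for illustration only

# The same JSON text with Unix (\n) and Windows (\r\n) newlines.
body_lf = b'{\n"type": "dtmf",\n"digits": "1234"\n}'
body_crlf = body_lf.replace(b"\n", b"\r\n")
signatures_differ = sign(body_lf, shared_key) != sign(body_crlf, shared_key)

# In code: serialise once, then sign and send the very same bytes.
payload = {"type": "get-dtmf", "min-digits": 1, "max-digits": 4}
body = json.dumps(payload).encode("utf-8")
signature = sign(body, shared_key)
```

Both bodies parse to the same JSON, but their bytes differ, so `signatures_differ` is true. Keeping the serialised `body` in one variable and using it for both signing and sending avoids this mismatch.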