
Developing an Alexa Skill with the Jovo framework



As a mobile developer, I know cross-platform development can be hard, and the same problem of maintaining a different codebase per platform exists in the voice app development world.

Maintaining a separate codebase for each project can be a real pain; that’s why React Native is one of the best options for many developers. Here at MagmaLabs we are no exception: we are true JavaScript lovers, and if you are up to date with the software development industry, you know JavaScript is everywhere.

Jovo was the first open-source framework that allowed you to build voice apps for both Amazon Alexa and Google Assistant with one codebase. You can think of Jovo as the React Native of voice app development: just as React Native does for mobile, Jovo makes it easier for us to develop voice apps, and the way it works is pretty easy to understand.

Requests and responses, inputs and outputs: this is what developing an Alexa Skill consists of. Inputs are basically the user’s speech when interacting with Alexa; they consist of utterances or instructions spoken by the user. Alexa sends the raw audio to the Alexa API, where the speech is transcribed to text. The result is interpreted against the Voice User Interface configuration set in the skill interface in order to derive meaning from the user input and understand what the user wants to express; the API service then sends this information to our Jovo app server as a JSON request.

Once our server receives that JSON request, the fun really starts. As developers, we need to figure out how to respond to the user based on the information in the input, sending back a JSON response that includes the text of the desired response. The Alexa API then handles the text and transforms it into voice through a text-to-speech process. Jovo helps you keep track of the context of every request so the thread of the conversation is not lost.
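To give an idea of what goes over the wire, a trimmed-down response to the Alexa API looks roughly like this (the real payload carries more fields, such as session attributes; the text here is illustrative):

```json
{
  "version": "1.0",
  "response": {
    "outputSpeech": {
      "type": "PlainText",
      "text": "Hello World! What's your name?"
    },
    "shouldEndSession": false
  }
}
```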

A Jovo project uses the Express framework to run a server, and it commonly has two main elements:

  • Skill Service: The actual code of your voice app that is later hosted somewhere
  • Skill Interface: Voice user interface configuration, including project.js, models and platforms folders

Basic concepts

Invocation name

Users say a skill’s invocation name to begin an interaction with a particular custom skill. The invocation is the first element of the Jovo Language Model; it sets the invocation name of your voice application, which is what the user says in order to start a conversation with your skill.
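In a Jovo project, the invocation name sits at the top of the language model file (e.g. models/en-US.json). A minimal sketch, assuming the skill is invoked as “hello world”:

```json
{
  "invocation": "hello world"
}
```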

Intents

It is impossible to talk about voice app development without talking about intents. An intent represents an action that fulfills a user’s spoken request. Intents are specified in the interaction model file and represent the core functionality of a skill. For example, in Jovo, this is how a simple intent declaration looks:
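A simple intent declaration in the Jovo language model looks roughly like this (the sample phrases below are illustrative):

```json
{
  "name": "HelloWorldIntent",
  "phrases": [
    "hello",
    "say hello",
    "say hello world"
  ]
}
```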

The name attribute contains the name of the intent, and phrases is where you set a few sample utterances that specify the words and phrases users can say to invoke that intent. You map these utterances to your intents, and this mapping forms the interaction model for the skill.

Slots

Intents can optionally have arguments, called slots. You can use them if your skill needs to retrieve specific information that can be used for a better experience. For example:
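A sketch of what an intent with a slot looks like in the Jovo model, assuming a name slot mapped to Amazon’s built-in first-name type:

```json
{
  "name": "MyNameIsIntent",
  "phrases": [
    "{name}",
    "my name is {name}",
    "i am {name}"
  ],
  "inputs": [
    {
      "name": "name",
      "type": {
        "alexa": "AMAZON.US_FIRST_NAME"
      }
    }
  ]
}
```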

 

Slots are defined with different types. The name slot in the example above uses Amazon’s built-in AMAZON.US_FIRST_NAME type to recognize words that indicate first names (such as “John” and “Bruce”). Amazon provides built-in support for a large library of slot types. This includes:

  • Types that convert data such as dates and numbers.
  • Types that provide recognition for lists of values. For example, first names or city names.
  • All built-in types have the prefix AMAZON.

Install Jovo

Jovo provides a CLI, and the easiest way to install it is by using npm.

$ npm install -g jovo-cli

Creating a new project

After installing the Jovo CLI, you can type jovo to check that everything works as expected. You will see the command options the Jovo CLI provides.


  Commands:

    help [command...]             Provides help for a given command.
    exit                         Exits application.
    new [options] [directory]      Create a new Jovo project
    init [options] [platform]      Initializes platform-specific projects in
                                 app.json.
    build [options]               Build platform-specific language models
                                 based on jovo models folder.
    deploy [options]              Deploys the project to the voice
                                 platform.
    get [options]                 Downloads an existing platform project
                                 into the platforms folder.
    run [options]                 Runs a local development server
                                 (webhook).

Using the command jovo new [project_name], you can create a new Jovo project from a default template that contains nothing more than “Hello World” code.

$ jovo new HelloWorld

I'm setting everything up

√ Creating new directory /HelloWorld

√ Downloading and extracting template helloworld

√ Installing npm dependencies

Installation completed.

Run your code

Jovo currently supports an Express server, so the project can be run using node index.js or the command jovo run. The terminal will show something like this:


$ jovo run

Example server listening on port 3000!

This is your webhook url: https://webhook.jovo.cloud/...
# Copy the link above!

One of the most useful tools Jovo provides is its debugger, a powerful tool for testing the logic of your code. When you run the code, Jovo gives you a webhook URL that acts as an endpoint for the voice platform and forwards incoming requests to your local server running on port 3000.

 

Understanding the Jovo project structure

The folder structure is described in the docs. For now, you only have to touch the app.js file; this is where the logic will take place.

The app.js file contains a handler which is where you will spend most of your time when you are building the logic behind your voice app. It already has a “HelloWorldIntent,” as you can see below:


app.setHandler({
    LAUNCH() {
        this.toIntent('HelloWorldIntent');
    },

    HelloWorldIntent() {
        this.ask('Hello World! What\'s your name?', 'Please tell me your name.');
    },

    MyNameIsIntent() {
        this.tell('Hey ' + this.$inputs.name.value + ', nice to meet you!');
    },
});

Conclusion

This blog post focused on explaining some basic concepts and project setup. Jovo provides a lot more features that ease development. Special components such as the SpeechBuilder make it easier to add text, pauses, and MP3 audio files to Alexa responses; the VisualBuilder component provides help with screen displays. Jovo even provides unit testing using Jest.
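To give an idea of what a SpeechBuilder-style helper does under the hood, here is a self-contained plain-JavaScript sketch (not the actual Jovo API) that chains text, pauses, and audio into the SSML markup Alexa ultimately receives:

```javascript
// Simplified illustration of the builder pattern behind SpeechBuilder-style
// helpers: each call appends an SSML fragment and returns `this` for chaining.
class SimpleSpeechBuilder {
  constructor() {
    this.parts = [];
  }
  addText(text) {
    this.parts.push(text);
    return this;
  }
  addBreak(time) {
    // SSML pause, e.g. <break time="300ms"/>
    this.parts.push(`<break time="${time}"/>`);
    return this;
  }
  addAudio(url) {
    // SSML audio playback from a URL
    this.parts.push(`<audio src="${url}"/>`);
    return this;
  }
  build() {
    return `<speak>${this.parts.join(' ')}</speak>`;
  }
}

const ssml = new SimpleSpeechBuilder()
  .addText('Hello World!')
  .addBreak('300ms')
  .addAudio('https://example.com/welcome.mp3')
  .build();

console.log(ssml);
```

The URL and fragment order here are illustrative; the point is that chained builder calls concatenate SSML so you never assemble the markup by hand.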

Jovo helps developers save time, which saves the company money: it lets us build voice apps faster and spend our time actually creating instead of just figuring out technical details.
