
Putting a Chat Bot with Amazon Lex in a mobile application

Image by Brian Telintelo

In December, Amazon announced Amazon Lex, and we’ve been excited about building an Amazon Lex chat-bot inside a mobile app ever since. I’ve even put together a quick tutorial below about how to let consumers talk to your mobile app by typing or voice.

But let’s back up a few steps. First, what is Lex? It’s a new service inside the Amazon Web Services (AWS) platform that lets developers take advantage of the same technology that powers Amazon Alexa, the popular voice assistant that lets people control everything from their lights to their music with simple voice commands.

Lex gives mobile developers like us the ability to create sophisticated chat-bots right inside an Android or iOS application. We’re able to tap into the brains of Lex, which contain much more than speech-to-text translation. Lex tries to understand user input in meaningful ways, so an application can return appropriate results without much work. It also integrates with other AWS services, such as Lambda, API Gateway, and EC2, to give feedback to the user.

So how do you get started building an Amazon Lex chat-bot?

Building Blocks of a Chat Bot

At first glance, building a chat bot seems rather complicated. Taking any kind of user input and returning something meaningful seems challenging. But the goal should be to handle small amounts of data so the user can complete a task quickly. Anything more and the user will probably be overwhelmed. Simple is best for these bots.

Intents

Amazon Lex's Interface for creating an intent

Where to start with a bot? Determining a bot’s intents is the first step. An intent in a chat-bot is defined as “the goal the user wants to achieve”. Let’s use a pizza ordering chat-bot as an example. One intent could be defined as “Ordering a Pizza” while another intent could be “What Toppings Are Available”. There can be many intents per bot. Each intent should be kept basic, small, and tied to a single well-defined goal so it doesn’t overcomplicate the user’s experience.

Utterances

Amazon Lex's Interface for adding sample utterances

An utterance is a spoken or typed phrase that invokes the intent. “I’d like to order a pizza” and “Can I order a Meat Lovers pizza for delivery?” are phrases that would invoke the “Ordering a Pizza” intent. Once Lex has worked out the user’s intent from an utterance, the intent determines which questions to ask to fulfill the user’s goal.

Slots

Now that the intent has been defined as “Ordering a Pizza”, the chat bot needs to fill in all the necessary data slots. A slot is an input (a string, date, boolean, number, etc.) that is needed to reach the goal of the intent. The chat bot needs to be smart enough to figure out which questions to ask the user in order to fill all the slots. For example, the “Ordering a Pizza” intent might need to ask the following:

  1. What type of pizza?
    1. Pepperoni
    2. Veggie
    3. Meat Lovers
  2. Delivery Address?
    1. Address Input Type
  3. Payment Method?
    1. Cash
    2. Credit Card
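The slot list above can be sketched as a small data model. This is only an illustration of the idea, not the Amazon Lex API: the `Slot` and `Intent` types, the `makePizzaOrderIntent` helper, and the intent name `OrderPizza` are hypothetical, while the slot names match the ones used later in this post.

```swift
// Hypothetical model of an intent and its slots (not part of the Lex SDK).
struct Slot {
    let name: String
    let prompt: String
    let allowedValues: [String]?   // nil means free-form input, such as an address
    var value: String?

    init(name: String, prompt: String, allowedValues: [String]? = nil) {
        self.name = name
        self.prompt = prompt
        self.allowedValues = allowedValues
        self.value = nil
    }
}

struct Intent {
    let name: String
    let sampleUtterances: [String]
    var slots: [Slot]

    /// The prompt for the first slot that is still empty, or nil once every slot is filled.
    var nextPrompt: String? {
        return slots.first(where: { $0.value == nil })?.prompt
    }

    /// Stores a value in the named slot. Returns false if the slot does not
    /// exist or the value is not one of the slot's allowed values.
    @discardableResult
    mutating func fill(_ slotName: String, with value: String) -> Bool {
        guard let index = slots.firstIndex(where: { $0.name == slotName }) else { return false }
        if let allowed = slots[index].allowedValues, !allowed.contains(value) { return false }
        slots[index].value = value
        return true
    }
}

func makePizzaOrderIntent() -> Intent {
    return Intent(
        name: "OrderPizza",
        sampleUtterances: ["I'd like to order a pizza"],
        slots: [
            Slot(name: "pizzaType",
                 prompt: "What kind of pizza would you like to order?",
                 allowedValues: ["Pepperoni", "Veggie", "Meat Lovers"]),
            Slot(name: "deliveryAddress",
                 prompt: "Where should I deliver the pizza?"),
            Slot(name: "paymentMethod",
                 prompt: "How do you plan on paying for the pizza?",
                 allowedValues: ["Cash", "Credit Card"])
        ]
    )
}

var order = makePizzaOrderIntent()
order.fill("pizzaType", with: "Meat Lovers")
print(order.nextPrompt ?? "All slots filled")   // prints "Where should I deliver the pizza?"
```

The bot keeps asking `nextPrompt` until it returns nil, at which point the intent is ready to be fulfilled.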

Here is an example text conversation showing how the bot fills the slots:

User: I’d like to order a pizza

Bot: Ok, what kind of pizza would you like to order?

User: I would really like to try out your Meat Pizza please!!!

Bot: Ok great! One Meat Lovers pizza has been added to your order.

Bot: Where should I deliver the pizza?

User: 5155 Financial Way

Bot: I’ll deliver to 5155 Financial Way Mason, OH 45040

Bot: How do you plan on paying for the pizza?

User: Cash

Bot: Great you’re all set!  Your order will be delivered in approximately 30 minutes.

Amazon Lex's Interface for adding slot data parameters

Each slot has a name, a slot type, a version, a prompt, and a flag indicating whether it is required. The prompt is what Lex uses to ask the user for input. The slot type defines the valid values a user can respond with. Slot types can be either custom-defined or one of Amazon’s built-in types. Notice the use of the built-in StreetAddress slot type in the example. Lex can help determine and sanitize a user’s address without any code. The pizzaType and paymentMethod slot types are defined on another screen and are simply lists of string values that are valid responses.

Lex Slot Types

What Lex Provides

Lex is the engine that has to figure out what the user wants. The user says they want a “Meat Pizza”, but it is up to the Lex bot to translate that to “Meat Lovers”. The Lex bot must also ignore unimportant input and recognize multiple slot responses. For example, the user might have said, “I’d like to pay cash for a pepperoni pizza please”. It is up to Lex to figure out that the user actually filled two slots with a single voice or text input.
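To make those two behaviors concrete, here is a deliberately naive sketch. It is not how Lex works internally (Lex relies on trained language understanding, not keyword lookup), and the `synonyms` table and `extractSlots` function are made up for illustration. It only shows what "translating Meat Pizza to Meat Lovers" and "filling two slots from one sentence" look like as input and output.

```swift
import Foundation

// Toy illustration only: a keyword table standing in for Lex's language understanding.
// Each slot maps lowercase phrases a user might say to the canonical slot value.
let synonyms: [String: [String: String]] = [
    "pizzaType": [
        "pepperoni": "Pepperoni",
        "veggie": "Veggie",
        "meat lovers": "Meat Lovers",
        "meat pizza": "Meat Lovers"
    ],
    "paymentMethod": [
        "cash": "Cash",
        "credit card": "Credit Card"
    ]
]

/// Returns every slot whose value can be recognized in the utterance.
/// Words that match nothing are simply ignored.
func extractSlots(from utterance: String) -> [String: String] {
    let text = utterance.lowercased()
    var filled: [String: String] = [:]
    for (slot, phrases) in synonyms {
        for (phrase, canonical) in phrases where text.contains(phrase) {
            filled[slot] = canonical
        }
    }
    return filled
}

// One sentence fills both the pizzaType and paymentMethod slots.
let slots = extractSlots(from: "I'd like to pay cash for a pepperoni pizza please")
print(slots)
```

With Lex, none of this matching code has to be written or maintained; the service does the equivalent work from the sample utterances and slot types configured in the console.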

Business Logic Takes Over

Now that the chat-bot has gathered all the necessary data, it can be passed along in a normal HTTP request or to a Lambda function for processing. For example, if the application server had a URL endpoint for placing an order, it might take three parameters: pizzaType, deliveryAddress, and paymentMethod. The slots set up in the intent can now be used to call that endpoint. This can be done within the AWS console by passing the data to a Lambda function, or the parameters can be returned to the client application, which then calls a REST endpoint.
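For the client-side option, a minimal sketch of turning the three slot values into a request URL might look like the following. The host `api.example.com`, the `/orders` path, and the `makeOrderURL` function are placeholders invented for this example; only the three parameter names come from the intent described above.

```swift
import Foundation

/// Builds the order URL from the three slot values.
/// The host and path are placeholders, not a real service.
func makeOrderURL(pizzaType: String, deliveryAddress: String, paymentMethod: String) -> URL? {
    var components = URLComponents()
    components.scheme = "https"
    components.host = "api.example.com"
    components.path = "/orders"
    // URLComponents percent-encodes the values, so spaces in the address are safe.
    components.queryItems = [
        URLQueryItem(name: "pizzaType", value: pizzaType),
        URLQueryItem(name: "deliveryAddress", value: deliveryAddress),
        URLQueryItem(name: "paymentMethod", value: paymentMethod)
    ]
    return components.url
}

if let url = makeOrderURL(pizzaType: "Meat Lovers",
                          deliveryAddress: "5155 Financial Way",
                          paymentMethod: "Cash") {
    print(url.absoluteString)
}
```

A real ordering endpoint would more likely accept a POST with a request body; query parameters are used here only to keep the example short.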

Integrating Amazon Lex with an iOS or Android application

AWS provides libraries that let iOS and Android developers integrate with its services. With the announcement of Lex, AWS also provided client libraries for the bot service. This blog post demonstrates Lex integration on iOS, but the concepts for Android are similar.

Configuration - iOS/Swift

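import AWSCore

// CognitoIdentityId is a constant holding the Cognito identity pool ID for your AWS account.
let credentialsProvider = AWSCognitoCredentialsProvider(regionType: .usEast1, identityPoolId: CognitoIdentityId)

// Lex is currently only available in us-east-1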
let configuration = AWSServiceConfiguration(region: .usEast1, credentialsProvider: credentialsProvider)

AWSServiceManager.default().defaultServiceConfiguration = configuration

The above code sets up our application to talk to our AWS account (the CognitoIdentityId is specific to an AWS account). Once authentication and region configuration are complete, the app can integrate text or voice bot services.

import AWSLex

// Name the bot to connect to and the alias (version) of the bot to use.
// As a chat bot is updated and changed, a version history is kept.
// $LATEST tells the app to always use the most recent version of the bot.
let chatConfig = AWSLexInteractionKitConfig.defaultInteractionKitConfig(withBotName: "PizzaBot", botAlias: "$LATEST")

// Register a client for the voice button under the key "AWSLexVoiceButton".
AWSLexInteractionKit.register(with: configuration!, interactionKitConfiguration: chatConfig, forKey: "AWSLexVoiceButton")

// Register a second client for text chat under the key "chatConfig", with automatic audio playback turned off.
chatConfig.autoPlayback = false
AWSLexInteractionKit.register(with: configuration!, interactionKitConfiguration: chatConfig, forKey: "chatConfig")

Invoking the Bot with Voice or Text

There is actually very little work to do for the app itself.  Simply invoke the bot with either text or voice and handle the output parameters when the user is done.

Voice Chat Button - View Controller

Use AWSLexVoiceButton and the corresponding AWSLexVoiceButtonDelegate for output callbacks

import UIKit
import AWSLex

class VoiceChatViewController: UIViewController, AWSLexVoiceButtonDelegate {

    // Both outlets are connected in the storyboard.
    @IBOutlet weak var voiceButton: AWSLexVoiceButton!
    @IBOutlet weak var output: UILabel!

    override func viewDidLoad() {
        super.viewDidLoad()
        // Receive the bot's responses and errors in this view controller.
        self.voiceButton.delegate = self
    }

    // Called when the bot responds; the label is updated on the main queue.
    func voiceButton(_ button: AWSLexVoiceButton, on response: AWSLexVoiceButtonResponse) {
        DispatchQueue.main.async {
            print("output \(String(describing: response.outputText))")
            self.output.text = response.outputText
        }
    }

    // Called when the voice interaction fails.
    func voiceButton(_ button: AWSLexVoiceButton, onError error: Error) {
        print("error \(error)")
    }

}

The above code simply links a button from the storyboard and provides success/error callbacks with the bot’s response.

Text Chat

A text view controller works similarly to a voice view controller. To send text to a bot, use AWSLexInteractionKit, which has several methods that accept text input and respond with text or even audio output. This class also helps store state, such as the slots already filled by the user’s input and the intent being invoked.

Conclusion

While Amazon Lex is still in beta, it is becoming increasingly easy to create chat bots within a mobile application. This can be one more way for an app or product to reach consumers in a meaningful way.