Tokenize
A chat message must first be tokenized into its constituent natural-language constructs, such as words and punctuation marks, so that meaning can subsequently be attached to each part of the message.
Basic request
Send the following chat request:
POST /zoo-chatbot/tokenize
{
"message": "any giraffes?"
}
Response
Notice how the message is split into two words and a question mark, with high probabilities of correctness:
{
"tokens": [
"any",
"giraffes",
"?"
],
"probabilities": [
{
"token": "any",
"probability": 1
},
{
"token": "giraffes",
"probability": 0.9925965989
},
{
"token": "?",
"probability": 1
}
]
}
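For illustration, here is a minimal Python client sketch for this endpoint. The base URL and the absence of authentication are assumptions made for the sketch and are not part of the documented API:

import requests

# Placeholder host; replace with the actual deployment's base URL.
BASE_URL = "https://api.example.com"

def tokenize(message):
    """POST a message to the tokenize endpoint and return the parsed JSON body."""
    response = requests.post(
        BASE_URL + "/zoo-chatbot/tokenize",
        json={"message": message},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()

result = tokenize("any giraffes?")
print(result["tokens"])  # ['any', 'giraffes', '?']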
Erroneous request
Chatbots must deal with typos and grammatical mistakes. Send the following request, which contains such errors:
POST /zoo-chatbot/tokenize
{
"message": "giraffes like to eat? leaves,oh"
}
Response
Notice that the correctness probabilities are lower for the tokens in the erroneous parts of the message:
{
"tokens": [
"giraffes",
"like",
"to",
"eat",
"?",
"leaves",
",oh"
],
"probabilities": [
{
"token": "giraffes",
"probability": 1.0
},
{
"token": "like",
"probability": 1.0
},
{
"token": "to",
"probability": 1.0
},
{
"token": "eat",
"probability": 0.9912258614
},
{
"token": "?",
"probability": 1.0
},
{
"token": "leaves",
"probability": 0.9503636232
},
{
"token": ",oh",
"probability": 0.9828333777
}
]
}
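Building on the tokenize helper sketched earlier, a client could use these probabilities to surface the suspect parts of a message. The 0.995 threshold below is an arbitrary illustrative choice, not a value defined by the API:

def flag_suspect_tokens(result, threshold=0.995):
    """Return tokens whose correctness probability falls below the threshold."""
    return [
        entry["token"]
        for entry in result["probabilities"]
        if entry["probability"] < threshold
    ]

result = tokenize("giraffes like to eat? leaves,oh")
print(flag_suspect_tokens(result))  # ['eat', 'leaves', ',oh'] for the sample response above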