StupidRPG: Natural Language Processing

Language parsing is hard. The parser in SRPG took me a couple months to get working the way I wanted, and it’s a very simple parser compared to others I’ve seen. I’ve previously listed the command patterns (or sentence patterns) SRPG can understand; the list has grown gradually during the course of SRPG’s development, but at this point it’s unlikely to change much prior to the first release. It covers 99% of the patterns I’m interested in supporting.

This will be a long, moderately-technical post.

Pattern List Reference
// Pattern list
// Cannot start with a modifier
// In order of priority (highest to lowest):
var patterns = [
  'VERB',                              // inventory
  'VERB NOUN',                         // eat baby
  'VERB MODIFIER',                     // saunter west
  'VERB MODIFIER NOUN',                // get in closet
  'VERB NOUN MODIFIER NOUN',           // attack goblin with hammer
  'VERB MODIFIER MODIFIER NOUN',       // look north through telescope
  'VERB MODIFIER NOUN MODIFIER NOUN',  // look at bob through telescope
  // Overflow patterns
  'VERB MODIFIER NOUN MODIFIER TEXT',  // talk to goblin about greatest fears
  'VERB MODIFIER MODIFIER TEXT',       // look north through telescope saucily
  'VERB NOUN MODIFIER TEXT',           // ask bob about back pain
  'VERB MODIFIER TEXT'                 // talk about floops
];

Patterns are only the tip of the iceberg though. A lot of work goes on behind the scenes to convert arbitrary text from the player into a meaningful action within the game. The end result functions similarly to a lexer/parser in that it processes a string via a set of grammar rules. The parser performs the following steps:

  1. Do basic sanitization (e.g. remove extra whitespace).
  2. If a command interrupt is set, give it a chance to take over.
  3. For each pattern, do steps 4-7:
    4. Convert the input string to lowercase for easier parsing.
    5. Break the new string into tokens (words).
    6. Do sanity checks.
    7. For each token in the pattern, try to match it to the next available command token(s).
  8. If no VERB was matched, return an error. All commands have to contain a verb.
  9. If a portion of the input wasn’t matched, return an error.
  10. Apply filter restrictions (more on that later), and return an error if the filters fail.
  11. Dispatch the action to the relevant object or verb.
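
To make the flow concrete, here’s a rough sketch of how those steps could hang together in code. This isn’t SRPG’s actual implementation; sanitize, tryPatterns, applyFilters, dispatch, and the NLP.error/NLP.partialFail/NLP.interrupt helpers are all placeholder names standing in for the steps above.

// Hypothetical outline of the parse pipeline -- not SRPG's real code.
NLP.parse = function(input) {
  var text = sanitize(input);                    // step 1: trim extra whitespace, etc.

  if(NLP.interrupt != null && NLP.interrupt(text)) {
    return;                                      // step 2: the interrupt took over
  }

  var result = tryPatterns(text);                // steps 3-7: pattern loop + token matching

  if(result == null || result.verb == null) {
    return NLP.error('Every command needs a verb.');   // step 8
  }
  if(result.failed || result.unmatched.length > 0) {
    return NLP.partialFail(result);                    // step 9
  }
  if(!applyFilters(result)) {
    return;                                      // step 10: a filter canceled the action
  }

  result.input = input;                          // callbacks also get the full input string
  dispatch(result);                              // step 11: object callback or verb default
};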

Some steps are more significant than others. I’ll go over the more important/interesting ones in detail.

Pattern Loop (Step 3)

The pattern loop compares each command pattern to the input tokens. Most of the work is done in steps 6-7, but before the process gets there, the input string is segmented and a couple checks are performed. Notably, if the number of input tokens is less than the number of pattern tokens, it skips to the next pattern without doing any more (wasted) work. If a halt has been triggered by the token matching, the pattern processing will stop rather than continuing to the next pattern.
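A minimal sketch of that loop might look like this. Again, tryPatterns and matchPattern are my names, not SRPG’s; matchPattern is sketched in the next section, and it doesn’t model the halt flag checked here.

// Hypothetical sketch of the pattern loop (step 3) -- names are mine, not SRPG's.
function tryPatterns(text) {
  var best = null;

  for(var p = 0; p < patterns.length; p++) {
    var lowered = text.toLowerCase();                 // step 4
    var inputTokens = lowered.split(' ');             // step 5
    var patternTokens = patterns[p].split(' ');

    // Step 6: fewer input tokens than pattern tokens? Skip the pattern entirely.
    if(inputTokens.length < patternTokens.length) {
      continue;
    }

    var result = matchPattern(patternTokens, lowered);   // step 7

    if(!result.failed && result.unmatched === '') {
      return result;                // full match: stop here
    }
    if(result.halt) {
      return result;                // a halt from token matching stops the pattern loop
    }
    if(best == null) {
      best = result;                // remember the first partial match for feedback
    }
  }

  return best;
}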

Token Matching (Steps 6-7)

For each token in the pattern, the NLP attempts to match as many of the input tokens as possible.

Here’s a simple example: GO NORTH. This is going to match the VERB MODIFIER pattern (the third pattern). First, however, the NLP will try to use the first pattern, VERB. It will look for a verb called ‘go north’, and upon failing to find one, it will continue to the next pattern: VERB NOUN. This time it will successfully match the verb ‘go’, but it will then fail on the second pattern token, NOUN, because there is no object called ‘north’. Finally, it will get to the third pattern: VERB MODIFIER. As in the second attempt, it will match the verb ‘go’, and then match the modifier ‘north’ (available for the ‘go’ verb).

Longer input strings take a lot more work. Keep in mind that the NLP will always try to match as much text as possible in each step. A long command like LOOK AT THE MOON THROUGH THE TELESCOPE will end up making the following comparisons for the command pattern VERB NOUN MODIFIER NOUN:

  • Is ‘look at the moon through the telescope’ a verb? No
  • Is ‘look at the moon through the’ a verb? No
  • Is ‘look at the moon through’ a verb? No
  • Is ‘look at the moon’ a verb? No
  • Is ‘look at the’ a verb? No
  • Is ‘look at’ a verb? Yes!
  • Is ‘the moon through the telescope’ a noun? No
  • Is ‘the moon through the’ a noun? No
  • Is ‘the moon through’ a noun? No
  • Is ‘the moon’ a noun? Yes!
  • Is ‘through the telescope’ a modifier? No
  • Is ‘through the’ a modifier? No
  • Is ‘through’ a modifier? Yes!
  • Is ‘the telescope’ a noun? Yes!

If you’re feeling cross-eyed, don’t worry, that’s perfectly normal.
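
Conceptually, that barrage of questions comes from a loop over the pattern tokens that hands the unmatched tail of the input to the next matcher each time. Here’s a hedged sketch; matchPattern, NLP.matchNoun, and NLP.matchModifier (and its extra verb argument) are my assumptions, while NLP.matchVerb is the real function shown just below.

// Hypothetical driver for step 7 -- not SRPG's actual code. Each matcher is
// assumed to return {match, string} like NLP.matchVerb below, where 'string'
// is the still-unmatched remainder of the input.
function matchPattern(patternTokens, text) {
  var result = {'verb': null, 'nouns': [], 'modifiers': [], 'overflow': '', 'unmatched': '', 'failed': false};
  var remaining = text;

  for(var i = 0; i < patternTokens.length; i++) {
    var attempt;

    if(patternTokens[i] === 'VERB') {
      attempt = NLP.matchVerb(remaining);
      result.verb = attempt.match;
    } else if(patternTokens[i] === 'NOUN') {
      attempt = NLP.matchNoun(remaining);
      if(attempt.match != null) { result.nouns.push(attempt.match); }
    } else if(patternTokens[i] === 'MODIFIER') {
      attempt = NLP.matchModifier(remaining, result.verb);   // modifiers depend on the verb (e.g. 'north' for 'go')
      if(attempt.match != null) { result.modifiers.push(attempt.match); }
    } else {                                                 // 'TEXT': overflow swallows whatever is left
      attempt = {'match': remaining, 'string': ''};
      result.overflow = remaining;
    }

    if(attempt.match == null) {
      result.failed = true;               // this pattern token couldn't be satisfied
      result.unmatched = remaining;
      return result;
    }

    remaining = attempt.string;           // keep shrinking the unmatched tail
  }

  result.unmatched = remaining;           // leftover text means partial comprehension
  return result;
}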

The verb matching function is listed below. The functions for matching verbs, nouns, and modifiers all work in basically the same way. They get a subset of the provided tokens, starting with all of the tokens, and shrink the set until they get a match or run out of tokens.

NLP.matchVerb()
NLP.matchVerb = function(string) {
  var tokens = string.split(' ');

  // Loop through token list, try to parse longest first (most tokens)
  for(var i = tokens.length; i > 0; i--)
  {
    var verb = tokens.slice(0, i).join(' ');
    var action = ECS.getAction(verb);

    if(action != null)
    {
      return {'match':action, 'string':tokens.slice(i).join(' ')};
    }
  }

  return {'match':null, 'string':string};
};

On line 8 (the ECS.getAction call), the function asks the ECS for a verb with the given alias. Most verbs have multiple aliases. Here’s the list for the ‘look’ command: 'aliases': ['look', 'l', 'look at', 'peer', 'glance', 'inspect', 'examine', 'x'].

If a verb was found, the match is returned in two parts: 1) the verb object, and 2) the unmatched remainder of the text. In the above example sequence, matchVerb would return ‘the moon through the telescope’ along with the ‘look’ verb.
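
For the telescope command, the call and its result would look roughly like this (the exact shape of the returned action object depends on the ECS):

var result = NLP.matchVerb('look at the moon through the telescope');
// result.match  --> the 'look' action object (matched via its 'look at' alias)
// result.string --> 'the moon through the telescope'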

Eventually, the NLP matches all the input tokens or determines that not all tokens can be matched. The latter case leads into step 9.

Partial Comprehension (Step 9)

Good feedback is important in the input processor. It should always be as clear as possible to the player why a command failed, so they can resolve it with minimal fuss. Let’s look at two examples:

  • EAT THE CAMEL
  • GO NORTH WHILE CARTWHEELING

The first example is handled by the NLP directly. A generic response is provided because the verb ‘eat’ was understood but ‘the camel’ wasn’t identified as an object. The response will be: I understood everything up until ‘the camel’. You want to eat, plus something.

Ok, that’s a good start. The phrasing is a little clunky in this context (probably in most contexts, honestly), but it gets the message across. The player should probably verify that there is indeed a camel in the current location. We can do better though. The NLP allows verbs to specify their own callback for handling failed inputs.

The second example uses the ‘move’ verb (‘go’ is an alias for ‘move’, and ‘north’ is one of the available modifiers). Entering the command above will produce this response, custom-tailored by the verb itself: I understand you want to go somewhere, but I don’t know how to go (while cartwheeling).

Better.
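
I won’t pretend this is SRPG’s exact wiring, but conceptually a verb’s custom failure callback might be defined something like this (the ‘parseFail’ property name and the verb’s shape are my assumptions):

// Hypothetical sketch of a verb with a custom failure callback.
var moveVerb = {
  'aliases': ['move', 'go'],
  'modifiers': ['north', 'south', 'east', 'west'],   // partial list, guessed

  // Called when the verb matched but the rest of the input couldn't be parsed.
  'parseFail': function(unmatched) {
    return "I understand you want to go somewhere, but I don't know how to go (" + unmatched + ").";
  }
};

For GO NORTH WHILE CARTWHEELING, the unmatched portion (‘while cartwheeling’) would be passed in, producing the response above.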

Filters (Step 10)

I’ll have to save a more thorough discussion of Filters for another post. The basic idea is that the NLP’s parsing can be interrupted if certain conditions are met. For example, the LOOK and TAKE verbs have a ‘darkness’ filter applied to them. If the current location is dark, the filter fails (canceling the action).
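
As a rough illustration only (the filter interface here is guessed, not SRPG’s actual API):

// Hypothetical sketch of the 'darkness' filter -- the API shape is an assumption.
var darknessFilter = function(action, context) {
  if(context.currentLocation.isDark) {
    context.print("It's too dark to see anything here.");   // placeholder message
    return false;   // filter fails: the action is canceled
  }
  return true;      // filter passes: the action goes ahead
};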

Dispatching Actions (Step 11)

Alright. Tokens are parsed, command patterns attempted, filters passed. We’ve got a verb, nouns (maybe), modifiers (maybe). Time to actually trigger the command. If we parsed any nouns, we also decide which one is the target and try to hand control off to it. In general, most actions in the game can be customized per-object. If the object doesn’t provide a callback for the verb, or the object’s callback declines to handle the action, control is instead handed off to the verb’s default callback.

In either case, an object containing the matched noun(s), matched modifier(s), and the full input string is passed to the callback.
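
In sketch form (the property names like callbacks and defaultCallback are placeholders, not SRPG’s actual ones):

// Hypothetical sketch of dispatch (step 11).
function dispatch(result) {
  var payload = {
    'nouns': result.nouns,
    'modifiers': result.modifiers,
    'input': result.input
  };

  var target = result.nouns[0];   // the first matched noun is treated as the target

  // Give the target object first crack at handling the verb...
  if(target != null && target.callbacks != null && target.callbacks[result.verb.name] != null) {
    if(target.callbacks[result.verb.name](payload) !== false) {
      return;   // the object handled the action
    }
  }

  // ...otherwise (no callback, or it declined) fall back to the verb's default.
  result.verb.defaultCallback(payload);
}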

That’s it. The NLP’s job is done.

Areas for Improvement

I don’t plan on doing any major refactoring of the NLP in the near future. I’ve already spent more time on it than I should, and it handles pretty much every type of useful command I can think of right now. That said, it has a few notable shortcomings and areas for improvement.

Precedence: if the player enters a command like TALK ABOUT SELF (which matches the very last command pattern in the list), the NLP will first attempt to match the command against every other pattern in the list. It’s designed to ‘fail fast’ if the string can’t possibly match the current pattern, but it still feels a bit inefficient to me. I can’t justify more time tinkering with it right now, as it works fine and I haven’t identified any performance issues. Just a feeling there’s probably a better way.

Action Targets: currently, the NLP interprets the first noun matched as the target of the action. This leads to problems with commands like LOOK THROUGH TELESCOPE AT BOB vs. LOOK AT BOB THROUGH TELESCOPE, which will interpret the telescope and Bob as the target of the command, respectively. When control is passed to the telescope (first case), the command works fine. When control is passed to Bob (second case), the action will fail (or at least not give the expected result). This is a tricky one to fix.

Disambiguation: the NLP does not currently perform disambiguation when two objects have similar or identical names. The classic Infocom games display a message like “Do you mean the red ball or the blue ball?” and then interpret the player’s next input as a clarification. Hmm, that sounds a lot like a command interrupt…

Complex Identifiers: once again, the Infocom parser has some cool tricks. For certain verbs, it allows lists of objects to be parsed (useful for picking up multiple items, for example). Advanced cases like TAKE ALL EXCEPT HAMMER AND SCREWDRIVER are understood. Raw text and numbers can also be used mid-input, whereas in SRPG any non-standard text is handled in the ‘overflow’, which always comes at the end of the input string.

I could go on, but at this point I’m just listing all the really smart things Infocom and Inform do that SRPG doesn’t.