Autocorrect in Conversational AI Cloud (CAIC)
Autocorrect is a feature similar to spell check that automatically corrects misspelled words in user input. When a user submits a question or text, the system checks the words against a set of dictionaries and corrects them whenever possible.
The Autocorrect function compares the words in the user’s input to two types of dictionaries:
Generic Language-Specific Library: This is a standard dictionary for the language.
Custom Project-Specific Library: This dictionary is built from all the words in the project’s Questions and Answers (Q&A) and is automatically updated whenever changes are made to them. This library offers more specialized word coverage compared to the generic library.
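As a rough illustration of how a project-specific library could be derived from Q&A content, the sketch below collects every word from a list of question/answer pairs into a set. The helper name `build_project_library`, the shape of the input data, and the tokenization rule are all assumptions made for this example; they do not describe how Conversational AI Cloud builds its library internally.

```python
import re


def build_project_library(qas):
    """Collect the lowercase words of every question and answer.

    Hypothetical helper: `qas` is assumed to be a list of
    (question, answer) string pairs.
    """
    words = set()
    for question, answer in qas:
        for text in (question, answer):
            # Assumed tokenization: runs of letters, apostrophes allowed.
            words.update(re.findall(r"[a-z']+", text.lower()))
    return words


qas = [("How do I report an absence?", "Use the HR portal.")]
library = build_project_library(qas)
```

Rebuilding the set whenever a Q&A changes mirrors the automatic update described above: after the rebuild, every word that occurs in a question or answer is part of the library.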
Autocorrect does not always succeed. It applies strict rules and only corrects a word that closely matches a known entry in one of the dictionaries; a word that is too different from every known word is left unchanged. Autocorrect can also occasionally turn a valid word into an incorrect one, especially if that word doesn’t appear in either dictionary. In such cases, adding the word to a Q&A adds it to the project-specific library, so that it is recognized and handled properly in the future.
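The behavior described above can be sketched as a lookup that only accepts very close matches. This is a minimal illustration, not the actual Autocorrect algorithm: the use of edit distance, the `max_distance=1` threshold, and the function names are assumptions chosen to show the idea of a strict match against two libraries.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def autocorrect_word(word, generic_library, project_library, max_distance=1):
    """Return the closest known word, or the input if nothing is close enough.

    Hypothetical sketch: the threshold stands in for the "strict rules".
    """
    known = set(generic_library) | set(project_library)
    if word in known:
        return word  # already a known word, nothing to correct
    # Closest candidate first; ties are broken alphabetically.
    best = min(sorted(known),
               key=lambda candidate: edit_distance(word, candidate),
               default=None)
    if best is not None and edit_distance(word, best) <= max_distance:
        return best
    return word  # too different from every known word: left unchanged


generic = {"holiday", "absence"}
project = {"caic"}

fixed = autocorrect_word("holliday", generic, project)        # close match
untouched = autocorrect_word("xyzzy", generic, project)       # no close match
miscorrected = autocorrect_word("holidays", generic, project)  # valid word, unknown
recognized = autocorrect_word("holidays", generic, project | {"holidays"})
```

In this sketch, "holidays" is a valid word that is missing from both libraries, so it is changed to "holiday". Once it is added to the project library (the equivalent of adding it to a Q&A), it is left as it is.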
In the Entity Editor, any manual additions made to the dictionaries are automatically normalized, including lemmatization. However, Autocorrect itself is not applied to manual entries.
Impact of Autocorrect on Your Project
Autocorrect is automatically available for all new projects created in Conversational AI Cloud. Although it handles most corrections, some words may still need to be added to entities manually, especially words that Autocorrect does not recognize or replace.
How Autocorrect Affects Words Missing from Entity Recognition
Before Autocorrect is enabled for a project, any words with typos usually appear on the list of "Words Missing from Entity Recognition." Once Autocorrect is active, the process works as follows:
Corrected Words: If a misspelled word is corrected to a valid word in the dictionary, it will no longer appear on the "Words Missing from Entity Recognition" list because it is now recognized as part of the entity.
Incorrectly Corrected Words: If a misspelled word is corrected to a word that does not match any entity (e.g., "abcense" is changed to "absence", but "absence" is not part of an entity), the corrected word will show up on the "Words Missing from Entity Recognition" list.
Ultimately, the key factor is the result (whether the corrected word matches an entity), not the original misspelling in the user input.
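The rule above can be illustrated with a small sketch that decides list membership from the corrected words only. The function name and the representation of entities as a plain set of words are assumptions for illustration, not the platform's actual data model.

```python
def words_missing_from_entities(corrected_words, entity_words):
    """Return the corrected words that match no entity, keeping input order.

    Hypothetical helper: only the result of Autocorrect is inspected,
    never the original misspelling.
    """
    return [word for word in corrected_words if word not in entity_words]


entity_words = {"holiday", "leave"}

# Assumed Autocorrect results, e.g. "holliday" -> "holiday"
# and "abcense" -> "absence".
corrected = ["holiday", "absence"]

missing = words_missing_from_entities(corrected, entity_words)
```

Here "holiday" matches an entity and drops off the list, while "absence" stays on it because no entity contains that word, regardless of how the user originally typed it.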