I allow users to type Russian words in Latin letters. If a user misspells a Russian word typed in Latin letters, I want the Solr spellchecker to suggest the correct word in Cyrillic (Russian words in the index are stored in Cyrillic). However, if the user misspells a non-Russian word (for example, a brand name), it should be corrected in Latin letters (non-Russian words in the index are stored in Latin).
For example, "tilevizor smasung" should be corrected to "телевизор samsung".
I'm currently using the following configuration:
<fieldType name="spell_ru" class="solr.TextField" positionIncrementGap="100" omitNorms="true">
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ICUTransformFilterFactory" id="Any-Cyrillic; NFD; [^\p{Alnum}] Remove" />
  </analyzer>
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.LengthFilterFactory" min="3" max="256" />
  </analyzer>
</fieldType>
The query analyzer converts the whole query to Cyrillic letters, so correction of Russian words works, but correction of Latin words does not: tilevizor is corrected to телевизор, but smasung is not corrected to samsung.
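To make the failure concrete, here is a rough Python sketch of what I believe is happening. It is not Solr code: `LATIN_TO_CYRILLIC` is a toy letter-for-letter map standing in for the ICU Any-Cyrillic transform, `INDEX_TERMS` is a hypothetical two-word index, and `suggest` is a made-up helper that picks the closest index term within two edits.

```python
# Toy stand-in for the ICU "Any-Cyrillic" transform (assumption: one letter
# maps to one letter; the real transform is more elaborate).
LATIN_TO_CYRILLIC = str.maketrans({
    "a": "а", "b": "б", "d": "д", "e": "е", "g": "г", "i": "и",
    "k": "к", "l": "л", "m": "м", "n": "н", "o": "о", "r": "р",
    "s": "с", "t": "т", "u": "у", "v": "в", "z": "з",
})

# Hypothetical index: the Russian word is stored in Cyrillic, the brand in Latin.
INDEX_TERMS = ["телевизор", "samsung"]


def edit_distance(a, b):
    """Plain Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def suggest(word, transliterate, max_edits=2):
    """Return the closest index term within max_edits, or None."""
    probe = word.lower()
    if transliterate:
        probe = probe.translate(LATIN_TO_CYRILLIC)
    best = min(INDEX_TERMS, key=lambda term: edit_distance(probe, term))
    return best if edit_distance(probe, best) <= max_edits else None


print(suggest("tilevizor", transliterate=True))
print(suggest("smasung", transliterate=True))
print(suggest("smasung", transliterate=False))
```

In this sketch the transliterated "smasung" shares no characters with the Latin term "samsung", so no suggestion falls within the edit limit, while the untransliterated form is only two edits away.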
Any ideas on how I can make the spellchecker correct both Cyrillic and Latin words?