NOTE: You are looking at documentation for an older release. For the latest information, see the current release documentation.
phonetic token filter
The phonetic token filter takes the following settings:
- encoder: Which phonetic encoder to use. Accepts metaphone (default), double_metaphone, soundex, refined_soundex, caverphone1, caverphone2, cologne, nysiis, koelnerphonetik, haasephonetik, beider_morse, and daitch_mokotoff.
- replace: Whether the original token should be replaced by the phonetic token. Accepts true (default) and false. Not supported by beider_morse encoding.
PUT phonetic_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_metaphone"
            ]
          }
        },
        "filter": {
          "my_metaphone": {
            "type": "phonetic",
            "encoder": "metaphone",
            "replace": false
          }
        }
      }
    }
  }
}

GET phonetic_sample/_analyze
{
  "analyzer": "my_analyzer",
  "text": "Joe Bloggs"
}
Double metaphone settings
If the double_metaphone encoder is used, then this additional setting is supported:
- max_code_len: The maximum length of the emitted metaphone token. Defaults to 4.
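As a minimal sketch of how this setting might be applied, the request below defines a filter that uses the double_metaphone encoder with a longer code length. The index name (phonetic_dm_sample), analyzer name (my_dm_analyzer), and filter name (my_double_metaphone) are hypothetical, and the value 6 is an arbitrary illustration rather than a recommendation.

```console
# Hypothetical index, analyzer, and filter names, chosen for illustration only.
# The max_code_len value of 6 is arbitrary; the documented default is 4.
PUT phonetic_dm_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_dm_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_double_metaphone"
            ]
          }
        },
        "filter": {
          "my_double_metaphone": {
            "type": "phonetic",
            "encoder": "double_metaphone",
            "max_code_len": 6
          }
        }
      }
    }
  }
}
```

Running the _analyze request from the earlier example against this index and analyzer should show the tokens the filter emits for a given text.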
Beider Morse settings
If the beider_morse encoder is used, then these additional settings are supported:
- rule_type: Whether matching should be exact or approx (default).
- name_type: Whether names are ashkenazi, sephardic, or generic (default).
- languageset: An array of languages to check. If not specified, then the language will be guessed. Accepts: any, common, cyrillic, english, french, german, hebrew, hungarian, polish, romanian, russian, spanish.
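The following sketch shows one way these settings could be combined in a single filter definition. The index name (phonetic_bm_sample), analyzer name (my_bm_analyzer), and filter name (my_beider_morse) are hypothetical, and the chosen languages are only an illustration. The replace setting is left out because it is not supported by beider_morse encoding.

```console
# Hypothetical index, analyzer, and filter names, chosen for illustration only.
# rule_type and name_type are set to their documented defaults to make them explicit.
# The languageset values are an arbitrary pick from the accepted list.
PUT phonetic_bm_sample
{
  "settings": {
    "index": {
      "analysis": {
        "analyzer": {
          "my_bm_analyzer": {
            "tokenizer": "standard",
            "filter": [
              "lowercase",
              "my_beider_morse"
            ]
          }
        },
        "filter": {
          "my_beider_morse": {
            "type": "phonetic",
            "encoder": "beider_morse",
            "rule_type": "approx",
            "name_type": "generic",
            "languageset": [
              "english",
              "german"
            ]
          }
        }
      }
    }
  }
}
```

Omitting languageset leaves the language to be guessed, as described above.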