Class PhoneticFilterFactory

java.lang.Object
org.apache.lucene.analysis.AbstractAnalysisFactory
org.apache.lucene.analysis.TokenFilterFactory
org.apache.lucene.analysis.phonetic.PhoneticFilterFactory
All Implemented Interfaces:
org.apache.lucene.util.ResourceLoaderAware

public class PhoneticFilterFactory extends org.apache.lucene.analysis.TokenFilterFactory implements org.apache.lucene.util.ResourceLoaderAware
Factory for PhoneticFilter.

Creates tokens based on phonetic encoders from Apache Commons Codec.

This takes one required argument, "encoder", and the rest are optional:

encoder
required, one of "DoubleMetaphone", "Metaphone", "Soundex", "RefinedSoundex", "Caverphone" (v2.0), "ColognePhonetic" or "Nysiis" (case insensitive). Any other value is resolved as a class name: it is used as is if it already contains a '.', otherwise it is looked up in the same package as the encoders listed above.
inject
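(default=true) if true, the encoded tokens are added to the stream alongside the original tokens, at the same position (position increment 0); if false, the encoded token replaces the original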
maxCodeLength
(optional) The maximum length of the phonetic codes, as defined by the encoder. Specifying this for an encoder that does not support it is an error.
 <fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
   <analyzer>
     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
     <filter class="solr.PhoneticFilterFactory" encoder="DoubleMetaphone" inject="true"/>
   </analyzer>
 </fieldType>
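
The resolution rule for the encoder argument described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the factory's actual code: the class EncoderNameResolution and the method resolveClassName are hypothetical names, the package org.apache.commons.codec.language is assumed to be the one holding the listed encoders, and "Caverphone" is assumed to map to the Commons Codec class Caverphone2 because the list above says v2.0. The sketch only computes a class name; it does not load or instantiate anything.

```java
import java.util.Locale;
import java.util.Map;

public class EncoderNameResolution {
    // Assumption: the built-in encoders live in this Commons Codec package.
    static final String ENCODER_PACKAGE = "org.apache.commons.codec.language.";

    // Short names from the parameter list, keyed in upper case so the
    // lookup is case insensitive.
    static final Map<String, String> KNOWN = Map.of(
        "DOUBLEMETAPHONE", ENCODER_PACKAGE + "DoubleMetaphone",
        "METAPHONE", ENCODER_PACKAGE + "Metaphone",
        "SOUNDEX", ENCODER_PACKAGE + "Soundex",
        "REFINEDSOUNDEX", ENCODER_PACKAGE + "RefinedSoundex",
        "CAVERPHONE", ENCODER_PACKAGE + "Caverphone2", // assumed v2.0 class
        "COLOGNEPHONETIC", ENCODER_PACKAGE + "ColognePhonetic",
        "NYSIIS", ENCODER_PACKAGE + "Nysiis");

    // Hypothetical helper: returns the class name an "encoder" value would
    // resolve to under the rule described in the parameter list.
    static String resolveClassName(String encoder) {
        String known = KNOWN.get(encoder.toUpperCase(Locale.ROOT));
        if (known != null) {
            return known; // one of the listed short names
        }
        // Contains a '.': treat it as a fully qualified class name.
        // Otherwise: assume it sits in the same package as the others.
        return encoder.contains(".") ? encoder : ENCODER_PACKAGE + encoder;
    }

    public static void main(String[] args) {
        System.out.println(resolveClassName("soundex"));
        // prints org.apache.commons.codec.language.Soundex
        System.out.println(resolveClassName("com.example.MyEncoder"));
        // prints com.example.MyEncoder
        System.out.println(resolveClassName("MatchRatingApproachEncoder"));
        // prints org.apache.commons.codec.language.MatchRatingApproachEncoder
    }
}
```

Here com.example.MyEncoder is a placeholder for a custom encoder class; whether a resolved name can actually be loaded depends on what is on the classpath.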
Since:
3.1
SPI Name (case-insensitive: if the name is 'htmlStrip', 'htmlstrip' can be used when looking up the service):
"phonetic"
  • Field Summary

    Fields
    Modifier and Type      Field              Description
    static final String    ENCODER            parameter name: either a short name or a full class name
    static final String    INJECT             parameter name: true if encoded tokens should be added as synonyms
    static final String    MAX_CODE_LENGTH    parameter name: restricts the length of the phonetic code
    static final String    NAME               SPI name

    Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory

    LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
  • Constructor Summary

    Constructors
    Constructor                                       Description
    PhoneticFilterFactory()                           Default ctor for compatibility with SPI
    PhoneticFilterFactory(Map<String,String> args)    Creates a new PhoneticFilterFactory
  • Method Summary

    Modifier and Type                             Method                                                  Description
    PhoneticFilter                                create(org.apache.lucene.analysis.TokenStream input)
    protected org.apache.commons.codec.Encoder    getEncoder()                                            Must be thread-safe.
    void                                          inform(org.apache.lucene.util.ResourceLoader loader)

    Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory

    availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters

    Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory

    defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

    • PhoneticFilterFactory

      public PhoneticFilterFactory(Map<String,String> args)
      Creates a new PhoneticFilterFactory
    • PhoneticFilterFactory

      public PhoneticFilterFactory()
      Default ctor for compatibility with SPI
  • Method Details

    • inform

      public void inform(org.apache.lucene.util.ResourceLoader loader) throws IOException
      Specified by:
      inform in interface org.apache.lucene.util.ResourceLoaderAware
      Throws:
      IOException
    • getEncoder

      protected org.apache.commons.codec.Encoder getEncoder()
      Must be thread-safe.
    • create

      public PhoneticFilter create(org.apache.lucene.analysis.TokenStream input)
      Specified by:
      create in class org.apache.lucene.analysis.TokenFilterFactory