Class StempelFilter

java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.stempel.StempelFilter
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.lucene.util.Unwrappable<org.apache.lucene.analysis.TokenStream>

public final class StempelFilter extends org.apache.lucene.analysis.TokenFilter
Transforms the token stream as per the stemming algorithm.

Note: the input to the stemming filter must already be in lower case, so a LowerCaseFilter or LowerCaseTokenizer must appear earlier in the analysis chain, before this filter, for stemming to work properly.
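
As a rough illustration of why lowercasing has to come first, the plain-Java sketch below uses a toy, case-sensitive suffix stripper. The class LowercaseBeforeStemSketch and its toyStem method are hypothetical stand-ins and not part of Lucene; the sketch assumes, for illustration only, that the stemmer recognizes lowercase patterns and nothing else.

```java
import java.util.Locale;

final class LowercaseBeforeStemSketch {

    // Toy stand-in for a stemmer (NOT the Stempel algorithm): it only knows
    // the lowercase suffix "y", so uppercase input never matches.
    static String toyStem(String token) {
        return token.endsWith("y") ? token.substring(0, token.length() - 1) : token;
    }

    // Intended order: lowercase first, then stem.
    static String lowercaseThenStem(String token) {
        return toyStem(token.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(toyStem("KOTY"));           // suffix is not recognized
        System.out.println(lowercaseThenStem("KOTY")); // suffix is recognized
    }
}
```

In a real analysis chain the same ordering would be expressed by wrapping the tokenizer in a lowercasing filter and passing the result to StempelFilter as its input stream.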

  • Nested Class Summary

    Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource

    org.apache.lucene.util.AttributeSource.State
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    DEFAULT_MIN_LENGTH
    Minimum length of input words to be processed.

    Fields inherited from class org.apache.lucene.analysis.TokenFilter

    input

    Fields inherited from class org.apache.lucene.analysis.TokenStream

    DEFAULT_TOKEN_ATTRIBUTE_FACTORY
  • Constructor Summary

    Constructors
    Constructor
    Description
    StempelFilter(org.apache.lucene.analysis.TokenStream in, StempelStemmer stemmer)
    Create filter using the supplied stemming table.
    StempelFilter(org.apache.lucene.analysis.TokenStream in, StempelStemmer stemmer, int minLength)
    Create filter using the supplied stemming table and minimum word length.
  • Method Summary

    Modifier and Type
    Method
    Description
    boolean
    incrementToken()
    Returns the next input token after it has been stemmed.

    Methods inherited from class org.apache.lucene.analysis.TokenFilter

    close, end, reset, unwrap

    Methods inherited from class org.apache.lucene.util.AttributeSource

    addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait
  • Field Details

    • DEFAULT_MIN_LENGTH

      public static final int DEFAULT_MIN_LENGTH
      Minimum length of input words to be processed. Shorter words are returned unchanged.
  • Constructor Details

    • StempelFilter

      public StempelFilter(org.apache.lucene.analysis.TokenStream in, StempelStemmer stemmer)
      Create filter using the supplied stemming table.
      Parameters:
      in - input token stream
      stemmer - stemmer
    • StempelFilter

      public StempelFilter(org.apache.lucene.analysis.TokenStream in, StempelStemmer stemmer, int minLength)
      Create filter using the supplied stemming table and minimum word length.
      Parameters:
      in - input token stream
      stemmer - stemmer
      minLength - For performance reasons words shorter than minLength characters are not processed, but simply returned.
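
The effect of minLength can be sketched in plain Java, independent of Lucene. The class MinLengthSketch, its filter method, and the toy suffix rules are all hypothetical and are not the Stempel algorithm. The sketch follows the documented wording (tokens shorter than minLength are passed through); how the real filter treats a token of exactly minLength characters is not shown here and should be checked against the implementation.

```java
import java.util.ArrayList;
import java.util.List;

final class MinLengthSketch {

    // Toy stand-in for a stemmer: strips a trailing "ami" or "y".
    // Purely illustrative; the real filter delegates to StempelStemmer.
    static String toyStem(String token) {
        if (token.endsWith("ami")) {
            return token.substring(0, token.length() - 3);
        }
        if (token.endsWith("y")) {
            return token.substring(0, token.length() - 1);
        }
        return token;
    }

    // Tokens shorter than minLength are returned unchanged; the rest are stemmed.
    static List<String> filter(List<String> tokens, int minLength) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.length() < minLength ? token : toyStem(token));
        }
        return out;
    }

    public static void main(String[] args) {
        // "my" is shorter than 3 characters, so it skips the stemmer entirely.
        System.out.println(filter(List.of("koty", "my", "domami"), 3));
    }
}
```

Besides saving work, the length guard in this sketch also keeps very short tokens such as "my" from being reduced to a single letter by an over-eager suffix rule.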
  • Method Details

    • incrementToken

      public boolean incrementToken() throws IOException
      Returns the next input token after it has been stemmed.
      Specified by:
      incrementToken in class org.apache.lucene.analysis.TokenStream
      Throws:
      IOException