Class MorfologikFilter

java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.morfologik.MorfologikFilter
All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.lucene.util.Unwrappable<org.apache.lucene.analysis.TokenStream>

public class MorfologikFilter extends org.apache.lucene.analysis.TokenFilter
A TokenFilter that uses the Morfologik library to transform input tokens into lemmas annotated with morphosyntactic (POS) tags. Applies to Polish only.

MorfologikFilter contains a MorphosyntacticTagsAttribute, which provides morphosyntactic annotations for produced lemmas. See the Morfologik documentation for details.

  • Nested Class Summary

    Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource

    org.apache.lucene.util.AttributeSource.State
  • Field Summary

    Fields inherited from class org.apache.lucene.analysis.TokenStream

    DEFAULT_TOKEN_ATTRIBUTE_FACTORY
  • Constructor Summary

    Constructors
    Constructor
    Description
    MorfologikFilter(org.apache.lucene.analysis.TokenStream in)
    Creates a filter with the default (Polish) dictionary.
    MorfologikFilter(org.apache.lucene.analysis.TokenStream in, morfologik.stemming.Dictionary dict)
    Creates a filter with a given dictionary.
  • Method Summary

    Modifier and Type
    Method
    Description
    final boolean
    incrementToken()
    Retrieves the next token (possibly from the list of lemmas).
    void
    reset()
    Resets the stems accumulator and hands over to the superclass.

    Methods inherited from class org.apache.lucene.analysis.TokenFilter

    close, end, unwrap

    Methods inherited from class org.apache.lucene.util.AttributeSource

    addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString

    Methods inherited from class java.lang.Object

    clone, finalize, getClass, notify, notifyAll, wait, wait, wait
  • Constructor Details

    • MorfologikFilter

      public MorfologikFilter(org.apache.lucene.analysis.TokenStream in)
      Creates a filter with the default (Polish) dictionary.
    • MorfologikFilter

      public MorfologikFilter(org.apache.lucene.analysis.TokenStream in, morfologik.stemming.Dictionary dict)
      Creates a filter with a given dictionary.
      Parameters:
      in - Input token stream.
      dict - Dictionary to use for stemming.
  • Method Details

    • incrementToken

      public final boolean incrementToken() throws IOException
      Retrieves the next token (possibly from the list of lemmas).
      Specified by:
      incrementToken in class org.apache.lucene.analysis.TokenStream
      Throws:
      IOException
    • reset

      public void reset() throws IOException
      Resets the stems accumulator and hands over to the superclass.
      Overrides:
      reset in class org.apache.lucene.analysis.TokenFilter
      Throws:
      IOException
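
The interplay between incrementToken() and reset() described above can be illustrated with a small stand-alone model. This is a sketch under stated assumptions, not the Lucene implementation: the class LemmaExpansionSketch, its fields, and the toy dictionary are all hypothetical, and it uses only the Java standard library. It assumes the filter follows the usual Lucene convention that the first lemma of a word takes a position increment of 1 and any further lemmas of the same word are emitted at the same position (increment 0), and that words missing from the dictionary pass through unchanged. Lowercase fallback, keyword handling, and POS tags are left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Stand-alone model of a lemma-expanding filter. Hypothetical; not the Lucene class. */
final class LemmaExpansionSketch {
    private final List<String> input;              // stands in for the wrapped TokenStream
    private final Map<String, List<String>> dict;  // toy table: surface form -> lemmas
    private Iterator<String> upstream;             // null until reset() is called
    private List<String> pending = Collections.emptyList(); // lemmas of the current word
    private int pendingIndex = 0;                  // next lemma to emit from pending
    private String term;                           // text of the token just produced
    private int positionIncrement;                 // 1 = new position, 0 = same position

    LemmaExpansionSketch(List<String> input, Map<String, List<String>> dict) {
        this.input = input;
        this.dict = dict;
    }

    /** Clears the accumulated lemmas and rewinds the modelled upstream. */
    void reset() {
        pending = Collections.emptyList();
        pendingIndex = 0;
        upstream = input.iterator();
    }

    /** Advances to the next token; returns false once the stream is exhausted. */
    boolean incrementToken() {
        if (upstream == null) {
            throw new IllegalStateException("reset() must be called first");
        }
        // Lemmas left over from the previous word are emitted first, at the same position.
        if (pendingIndex < pending.size()) {
            term = pending.get(pendingIndex++);
            positionIncrement = 0;
            return true;
        }
        if (!upstream.hasNext()) {
            return false;
        }
        // Otherwise pull the next word and look up its lemmas.
        String surface = upstream.next();
        pending = dict.getOrDefault(surface, Collections.emptyList());
        pendingIndex = 0;
        // Unknown words pass through unchanged; known words yield their first lemma.
        term = pending.isEmpty() ? surface : pending.get(pendingIndex++);
        positionIncrement = 1;
        return true;
    }

    /** Current token rendered as "term/positionIncrement". */
    String current() {
        return term + "/" + positionIncrement;
    }

    /** Consumes the whole stream using the reset-then-loop pattern. */
    static List<String> drain(LemmaExpansionSketch s) {
        List<String> out = new ArrayList<>();
        s.reset();
        while (s.incrementToken()) {
            out.add(s.current());
        }
        return out;
    }

    public static void main(String[] args) {
        // Illustrative, hand-written entries; not taken from the Morfologik dictionary.
        Map<String, List<String>> toy =
            Map.of("psy", List.of("pies"), "lata", List.of("lato", "rok"));
        LemmaExpansionSketch s =
            new LemmaExpansionSketch(List.of("psy", "lata", "xyz"), toy);
        System.out.println(drain(s)); // prints [pies/1, lato/1, rok/0, xyz/1]
    }
}
```

In this model a single input word can produce several output tokens, which is why the summary says the next token is "possibly from the list of lemmas". It also shows why reset() has to clear the accumulator: without that step, lemmas left pending from an abandoned pass would leak into the start of the next one.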