Class MorfologikFilter
java.lang.Object
org.apache.lucene.util.AttributeSource
org.apache.lucene.analysis.TokenStream
org.apache.lucene.analysis.TokenFilter
org.apache.lucene.analysis.morfologik.MorfologikFilter
- All Implemented Interfaces:
Closeable, AutoCloseable, org.apache.lucene.util.Unwrappable<org.apache.lucene.analysis.TokenStream>
public class MorfologikFilter
extends org.apache.lucene.analysis.TokenFilter
A TokenFilter that uses the Morfologik library to transform input tokens into lemmas and
morphosyntactic (POS) tags. With the default dictionary it applies to Polish only; a different
dictionary can be supplied through the two-argument constructor.
MorfologikFilter adds a MorphosyntacticTagsAttribute, which provides the
morphosyntactic annotations for each produced lemma. See the Morfologik documentation for details.
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.util.AttributeSource
org.apache.lucene.util.AttributeSource.State
Field Summary
Fields inherited from class org.apache.lucene.analysis.TokenStream
DEFAULT_TOKEN_ATTRIBUTE_FACTORY
Constructor Summary
Constructors
- MorfologikFilter(org.apache.lucene.analysis.TokenStream in): Creates a filter with the default (Polish) dictionary.
- MorfologikFilter(org.apache.lucene.analysis.TokenStream in, morfologik.stemming.Dictionary dict): Creates a filter with a given dictionary.
Method Summary
Methods
- final boolean incrementToken(): Retrieves the next token (possibly from the list of lemmas).
- void reset(): Resets the stems accumulator and hands over to the superclass.
Methods inherited from class org.apache.lucene.analysis.TokenFilter
close, end, unwrap
Methods inherited from class org.apache.lucene.util.AttributeSource
addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
Constructor Details
MorfologikFilter
public MorfologikFilter(org.apache.lucene.analysis.TokenStream in)
Creates a filter with the default (Polish) dictionary.
MorfologikFilter
public MorfologikFilter(org.apache.lucene.analysis.TokenStream in, morfologik.stemming.Dictionary dict)
Creates a filter with a given dictionary.
- Parameters:
in - input token stream.
dict - Dictionary to use for stemming.
Method Details
incrementToken
public final boolean incrementToken() throws IOException
Retrieves the next token (possibly from the list of lemmas).
- Specified by:
incrementToken in class org.apache.lucene.analysis.TokenStream
- Throws:
IOException
reset
public void reset() throws IOException
Resets the stems accumulator and hands over to the superclass.
- Overrides:
reset in class org.apache.lucene.analysis.TokenFilter
- Throws:
IOException
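The interplay between incrementToken and reset described above can be pictured with a small model that uses only the Java standard library. This is a sketch, not the Lucene implementation: the names LemmaStackSketch, LemmaFilter and analyze are hypothetical, a plain Map stands in for the Morfologik dictionary, and its entries are hand-written toy data. The sketch also assumes that additional lemmas of one input token are emitted at the same position (position increment 0) and that words missing from the dictionary pass through unchanged; this page does not state either detail.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Toy model of a lemma-buffering filter. Hypothetical names; not Lucene code. */
public class LemmaStackSketch {

    /** Hypothetical stand-in for the filter; plain strings stand in for tokens. */
    static final class LemmaFilter {
        private final Iterator<String> input;
        private final Map<String, List<String>> dictionary;
        // Plays the role of the "stems accumulator": lemmas of the current input token.
        private final List<String> lemmas = new ArrayList<>();
        private int nextLemma = 0;
        private String term;
        private int positionIncrement;

        LemmaFilter(Iterator<String> input, Map<String, List<String>> dictionary) {
            this.input = input;
            this.dictionary = dictionary;
        }

        /** Advances to the next output token; returns false once the input is exhausted. */
        boolean incrementToken() {
            if (nextLemma < lemmas.size()) {
                // A lemma of the previous input token is still pending:
                // emit it at the same position (assumption of this sketch).
                term = lemmas.get(nextLemma++);
                positionIncrement = 0;
                return true;
            }
            if (!input.hasNext()) {
                return false;
            }
            String surface = input.next();
            lemmas.clear();
            nextLemma = 0;
            lemmas.addAll(dictionary.getOrDefault(surface, List.of()));
            // Unknown words pass through unchanged (assumption of this sketch).
            term = lemmas.isEmpty() ? surface : lemmas.get(nextLemma++);
            positionIncrement = 1;
            return true;
        }

        /** Drops any buffered lemmas so the filter starts from a clean state. */
        void reset() {
            lemmas.clear();
            nextLemma = 0;
        }
    }

    /** Runs the toy filter over the words and renders each token as "term(+increment)". */
    public static List<String> analyze(List<String> words, Map<String, List<String>> dictionary) {
        LemmaFilter filter = new LemmaFilter(words.iterator(), dictionary);
        filter.reset();
        List<String> out = new ArrayList<>();
        while (filter.incrementToken()) {
            out.add(filter.term + "(+" + filter.positionIncrement + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-written toy entries, not taken from the Morfologik dictionary.
        Map<String, List<String>> toyDictionary = Map.of(
                "psy", List.of("pies"),
                "kopie", List.of("kopia", "kopa"));
        System.out.println(analyze(List.of("psy", "kopie", "xyz"), toyDictionary));
    }
}
```

With the toy dictionary in main, the input word "kopie" yields two output tokens, the second with increment 0, while "xyz" is emitted as-is. In real code the same loop shape applies to a Lucene TokenStream: call reset(), loop on incrementToken(), then call end() and close().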