Class UkrainianMorfologikAnalyzer

java.lang.Object
org.apache.lucene.analysis.Analyzer
org.apache.lucene.analysis.StopwordAnalyzerBase
org.apache.lucene.analysis.uk.UkrainianMorfologikAnalyzer
All Implemented Interfaces:
Closeable, AutoCloseable

public final class UkrainianMorfologikAnalyzer extends org.apache.lucene.analysis.StopwordAnalyzerBase
A dictionary-based Analyzer for Ukrainian.
Since:
6.2.0
  • Nested Class Summary

    Nested classes/interfaces inherited from class org.apache.lucene.analysis.Analyzer

    org.apache.lucene.analysis.Analyzer.ReuseStrategy, org.apache.lucene.analysis.Analyzer.TokenStreamComponents
  • Field Summary

    Fields inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase

    stopwords

    Fields inherited from class org.apache.lucene.analysis.Analyzer

    GLOBAL_REUSE_STRATEGY, PER_FIELD_REUSE_STRATEGY
  • Constructor Summary

    Constructors
    Constructor
    Description
    UkrainianMorfologikAnalyzer()
    Builds an analyzer with the default stop words.
    UkrainianMorfologikAnalyzer(org.apache.lucene.analysis.CharArraySet stopwords)
    Builds an analyzer with the given stop words.
    UkrainianMorfologikAnalyzer(org.apache.lucene.analysis.CharArraySet stopwords, org.apache.lucene.analysis.CharArraySet stemExclusionSet)
    Builds an analyzer with the given stop words and stem exclusion set.
  • Method Summary

    Modifier and Type
    Method
    Description
    protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents
    createComponents(String fieldName)
    Creates an Analyzer.TokenStreamComponents which tokenizes all the text in the provided Reader.
    static org.apache.lucene.analysis.CharArraySet
    getDefaultStopwords()
    Returns the default stopword set for this analyzer.
    protected Reader
    initReader(String fieldName, Reader reader)

    Methods inherited from class org.apache.lucene.analysis.StopwordAnalyzerBase

    getStopwordSet, loadStopwordSet, loadStopwordSet, loadStopwordSet

    Methods inherited from class org.apache.lucene.analysis.Analyzer

    attributeFactory, close, getOffsetGap, getPositionIncrementGap, getReuseStrategy, initReaderForNormalization, normalize, normalize, tokenStream, tokenStream

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • UkrainianMorfologikAnalyzer

      public UkrainianMorfologikAnalyzer()
      Builds an analyzer with the default stop words.
    • UkrainianMorfologikAnalyzer

      public UkrainianMorfologikAnalyzer(org.apache.lucene.analysis.CharArraySet stopwords)
      Builds an analyzer with the given stop words.
      Parameters:
      stopwords - a stopword set
    • UkrainianMorfologikAnalyzer

      public UkrainianMorfologikAnalyzer(org.apache.lucene.analysis.CharArraySet stopwords, org.apache.lucene.analysis.CharArraySet stemExclusionSet)
      Builds an analyzer with the given stop words and stem exclusion set. If a non-empty stem exclusion set is provided, this analyzer adds a SetKeywordMarkerFilter before stemming.
      Parameters:
      stopwords - a stopword set
      stemExclusionSet - a set of terms not to be stemmed
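
The constructors above can be exercised with a short program like the following. This is a minimal sketch, not an official example: it assumes the Lucene morfologik analysis module (which ships this class and its Ukrainian dictionary) is on the classpath, and the class name `UkrainianAnalyzerExample`, the helper `analyze`, the field name `"body"`, and the sample words are all hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.CharArraySet;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.apache.lucene.analysis.uk.UkrainianMorfologikAnalyzer;

public class UkrainianAnalyzerExample {

  /** Runs the analyzer over the text and collects every emitted term. */
  static List<String> analyze(Analyzer analyzer, String text) throws IOException {
    List<String> terms = new ArrayList<>();
    // "body" is a hypothetical field name chosen for this sketch.
    try (TokenStream ts = analyzer.tokenStream("body", text)) {
      CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
      ts.reset();                 // required before the first incrementToken()
      while (ts.incrementToken()) {
        terms.add(term.toString());
      }
      ts.end();                   // finalizes end-of-stream state
    }
    return terms;
  }

  public static void main(String[] args) throws IOException {
    // Hypothetical custom sets; 'true' makes lookups case-insensitive.
    CharArraySet stopwords = new CharArraySet(Arrays.asList("та", "і"), true);
    CharArraySet stemExclusions = new CharArraySet(Arrays.asList("львів"), true);

    try (Analyzer defaults = new UkrainianMorfologikAnalyzer();
         Analyzer customStops = new UkrainianMorfologikAnalyzer(stopwords);
         Analyzer customBoth = new UkrainianMorfologikAnalyzer(stopwords, stemExclusions)) {
      String text = "Київ та Львів";
      System.out.println(analyze(defaults, text));
      System.out.println(analyze(customStops, text));
      System.out.println(analyze(customBoth, text));
    }
  }
}
```

The exact terms printed depend on the bundled dictionary, so no expected output is shown here.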
  • Method Details

    • getDefaultStopwords

      public static org.apache.lucene.analysis.CharArraySet getDefaultStopwords()
      Returns the default stopword set for this analyzer.
    • initReader

      protected Reader initReader(String fieldName, Reader reader)
      Overrides:
      initReader in class org.apache.lucene.analysis.Analyzer
    • createComponents

      protected org.apache.lucene.analysis.Analyzer.TokenStreamComponents createComponents(String fieldName)
      Creates an Analyzer.TokenStreamComponents which tokenizes all the text in the provided Reader.
      Specified by:
      createComponents in class org.apache.lucene.analysis.Analyzer
      Returns:
      An Analyzer.TokenStreamComponents built from a StandardTokenizer filtered with LowerCaseFilter, StopFilter, SetKeywordMarkerFilter (if a stem exclusion set is provided), and MorfologikFilter on the Ukrainian dictionary.
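
The order of the filter chain described above can be illustrated with a toy pipeline in plain Java. This is only a sketch of the ordering, not Lucene's implementation: it uses no Lucene classes, the class `FilterChainSketch` and its three-entry `LEMMAS` map are hypothetical stand-ins for the real tokenizer, filters, and Morfologik dictionary, and unlike the real MorfologikFilter it emits at most one lemma per token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

public class FilterChainSketch {

  // Hypothetical miniature lemma dictionary (surface form -> lemma).
  static final Map<String, String> LEMMAS =
      Map.of("книги", "книга", "читали", "читати", "журнали", "журнал");

  static List<String> analyze(String text, Set<String> stopwords, Set<String> stemExclusions) {
    List<String> out = new ArrayList<>();
    // 1. Tokenize: split on non-letter, non-digit runs (crude stand-in for StandardTokenizer).
    for (String token : text.split("[^\\p{L}\\p{Nd}]+")) {
      if (token.isEmpty()) {
        continue; // split() yields an empty first element when the text starts with a delimiter
      }
      // 2. Lowercase (stand-in for LowerCaseFilter).
      String term = token.toLowerCase(Locale.ROOT);
      // 3. Drop stop words (stand-in for StopFilter).
      if (stopwords.contains(term)) {
        continue;
      }
      // 4. Mark protected terms (stand-in for SetKeywordMarkerFilter).
      boolean keyword = stemExclusions.contains(term);
      // 5. Replace unprotected terms with their lemma; unknown terms pass through unchanged.
      out.add(keyword ? term : LEMMAS.getOrDefault(term, term));
    }
    return out;
  }

  public static void main(String[] args) {
    Set<String> stops = Set.of("ми", "і");
    System.out.println(analyze("Ми читали книги і журнали", stops, Set.of()));
    System.out.println(analyze("Ми читали книги і журнали", stops, Set.of("книги")));
  }
}
```

With an empty exclusion set the sketch returns `[читати, книга, журнал]`; with `книги` excluded it returns `[читати, книги, журнал]`, showing that a keyword-marked term bypasses the dictionary step.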