Interface DocumentClassifier<T>
- All Known Implementing Classes:
KNearestNeighborDocumentClassifier, SimpleNaiveBayesDocumentClassifier
public interface DocumentClassifier<T>
A classifier (see
http://en.wikipedia.org/wiki/Classifier_(mathematics)) that
assigns classes of type T to Documents.
WARNING: This API is experimental and might change in incompatible ways in the next release.
-
Method Summary
Modifier and Type, Method, Description:
- ClassificationResult<T> assignClass(org.apache.lucene.document.Document document): Assign a class (with score) to the given Document.
- List<ClassificationResult<T>> getClasses(org.apache.lucene.document.Document document): Get all the classes (sorted by score, descending) assigned to the given Document.
- List<ClassificationResult<T>> getClasses(org.apache.lucene.document.Document document, int max): Get the first max classes (sorted by score, descending) assigned to the given Document.
-
Method Details
-
assignClass
ClassificationResult<T> assignClass(org.apache.lucene.document.Document document) throws IOException
Assign a class (with score) to the given Document.
- Parameters:
document - a Document to be classified. Fields are considered features for the classification.
- Returns:
a ClassificationResult holding the assigned class of type T and its score
- Throws:
IOException - If there is a low-level I/O error.
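The call pattern for assignClass can be sketched as follows. To keep the example compilable without Lucene on the classpath, it uses hypothetical stand-ins: ResultSketch, DocumentClassifierSketch and a Map<String, String> in place of ClassificationResult, DocumentClassifier and Document. The toy "offer means spam" rule and its scores are invented for illustration and are not how any Lucene classifier works.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins so the sketch compiles without Lucene on the classpath.
// Real code would use org.apache.lucene.document.Document, ClassificationResult<T>
// and DocumentClassifier<T> instead.
final class ResultSketch<T> {
    final T assignedClass;
    final double score;

    ResultSketch(T assignedClass, double score) {
        this.assignedClass = assignedClass;
        this.score = score;
    }
}

interface DocumentClassifierSketch<T> {
    ResultSketch<T> assignClass(Map<String, String> document) throws IOException;
}

final class AssignClassExample {
    // Toy rule, not a real model: a body mentioning "offer" is labelled "spam".
    static final DocumentClassifierSketch<String> TOY = document -> {
        String body = document.getOrDefault("body", "");
        if (body.contains("offer")) {
            return new ResultSketch<String>("spam", 0.9);
        }
        return new ResultSketch<String>("ham", 0.6);
    };

    // Classifies one document and formats the label together with its score.
    static String describe(DocumentClassifierSketch<String> classifier,
                           Map<String, String> document) {
        try {
            ResultSketch<String> result = classifier.assignClass(document);
            return result.assignedClass + " (" + result.score + ")";
        } catch (IOException e) {
            // The interface declares IOException for low-level I/O failures.
            throw new UncheckedIOException(e);
        }
    }

    // Builds a one-field document (field name -> value) and classifies it.
    static String demo() {
        Map<String, String> document = new HashMap<>();
        document.put("body", "limited offer, act now");
        return describe(TOY, document);
    }
}
```

The point of the sketch is that each field of the document acts as a feature, and that the caller receives the class and its score together in one result object.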
-
getClasses
List<ClassificationResult<T>> getClasses(org.apache.lucene.document.Document document) throws IOException
Get all the classes (sorted by score, descending) assigned to the given Document.
- Parameters:
document - a Document to be classified. Fields are considered features for the classification.
- Returns:
the whole list of ClassificationResult (the classes and scores). Returns null if the classifier cannot produce a list.
- Throws:
IOException - If there is a low-level I/O error.
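Because getClasses may return null, callers may want to normalise the result before iterating. The following is a minimal sketch of one way to do that. ListingClassifierSketch and classesOrEmpty are hypothetical names, the document is modelled as a Map<String, String>, and results are simplified to plain String labels rather than ClassificationResult objects.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for DocumentClassifier<T>; results are reduced to bare labels.
interface ListingClassifierSketch<T> {
    List<T> getClasses(Map<String, String> document) throws IOException;
}

final class NullSafeClasses {
    // Replaces a null result ("classifier cannot produce a list") with an empty list.
    static <T> List<T> classesOrEmpty(ListingClassifierSketch<T> classifier,
                                      Map<String, String> document) throws IOException {
        List<T> classes = classifier.getClasses(document);
        return classes == null ? Collections.<T>emptyList() : classes;
    }

    // Runs the helper against a toy classifier that either returns a list or null.
    static List<String> demo(boolean supportsLists) {
        ListingClassifierSketch<String> classifier = document ->
                supportsLists ? Arrays.asList("sports", "politics") : null;
        try {
            return classesOrEmpty(classifier, Collections.<String, String>emptyMap());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Whether an empty list is an acceptable substitute for null depends on the caller: code that must distinguish "no classes" from "lists unsupported" should check for null explicitly instead.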
-
getClasses
List<ClassificationResult<T>> getClasses(org.apache.lucene.document.Document document, int max) throws IOException
Get the first max classes (sorted by score, descending) assigned to the given Document.
- Parameters:
document - a Document to be classified. Fields are considered features for the classification.
max - the maximum number of elements in the returned list
- Returns:
the list of ClassificationResult (the classes and scores), cut to at most max elements. Returns null if the classifier cannot produce a list.
- Throws:
IOException - If there is a low-level I/O error.
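The "sorted by score, descending, then cut to max" contract can be illustrated with a small standalone sketch. It does not reproduce how the Lucene implementations compute their lists. ScoredLabel and firstMax are hypothetical names, and the labels and scores are made up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for ClassificationResult<String>: a label with a score.
final class ScoredLabel {
    final String label;
    final double score;

    ScoredLabel(String label, double score) {
        this.label = label;
        this.score = score;
    }
}

final class TopClasses {
    // Sorts a copy by score (highest first) and keeps at most max entries.
    static List<ScoredLabel> firstMax(List<ScoredLabel> results, int max) {
        List<ScoredLabel> sorted = new ArrayList<>(results);
        sorted.sort(Comparator.comparingDouble((ScoredLabel r) -> r.score).reversed());
        int limit = Math.max(0, Math.min(max, sorted.size()));
        return new ArrayList<>(sorted.subList(0, limit));
    }

    // Concatenates the labels of the two best-scoring entries.
    static String demo() {
        List<ScoredLabel> results = new ArrayList<>();
        results.add(new ScoredLabel("a", 0.2));
        results.add(new ScoredLabel("b", 0.7));
        results.add(new ScoredLabel("c", 0.5));
        StringBuilder labels = new StringBuilder();
        for (ScoredLabel r : firstMax(results, 2)) {
            labels.append(r.label);
        }
        return labels.toString();
    }
}
```

Asking for more elements than exist returns the whole sorted list, which matches reading max as an upper bound rather than an exact size.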
-