Defines a program for analyzing a document, splitting it into normalized terms that can be fed to the strus IR engine.
#include <documentAnalyzerMapInterface.hpp>
virtual strus::DocumentAnalyzerMapInterface::~DocumentAnalyzerMapInterface() [inline, virtual]
virtual void strus::DocumentAnalyzerMapInterface::addAnalyzer(const std::string& mimeType, const std::string& scheme, DocumentAnalyzerInstanceInterface* analyzer) [pure virtual]
Declare an analyzer to be used for the analysis of a specific document class.
- Parameters
-
[in] | mimeType | MIME type of the document to process with this analyzer (must be defined) |
[in] | scheme | scheme of the document to process with this analyzer (can be empty meaning not defined) |
[in] | analyzer | analyzer to use for the defined class of documents (with ownership) |
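The "(with ownership)" contract of addAnalyzer can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in, not the strus implementation: `AnalyzerStub` replaces DocumentAnalyzerInstanceInterface, `AnalyzerRegistrySketch` replaces the map, and both the exception on an empty MIME type and the fallback to the entry with an empty scheme are assumptions made for illustration only.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical stand-in for strus::DocumentAnalyzerInstanceInterface.
struct AnalyzerStub {
    explicit AnalyzerStub(std::string name_) : name(std::move(name_)) {}
    virtual ~AnalyzerStub() = default;
    std::string name;
};

// Toy registry keyed by (mimeType, scheme). It only illustrates the
// ownership transfer described above; it is not how strus implements it.
class AnalyzerRegistrySketch {
public:
    // Takes ownership of the raw pointer immediately, so the analyzer is
    // released even if registration fails.
    void addAnalyzer(const std::string& mimeType, const std::string& scheme,
                     AnalyzerStub* analyzer) {
        std::unique_ptr<AnalyzerStub> owned(analyzer);
        if (mimeType.empty()) {
            // Assumption: the sketch reports the "must be defined" rule
            // with an exception.
            throw std::invalid_argument("mimeType must be defined");
        }
        map_[std::make_pair(mimeType, scheme)] = std::move(owned);
    }

    // Assumed lookup policy: exact (mimeType, scheme) match first, then the
    // entry registered with an empty scheme. Returns nullptr if none matches.
    const AnalyzerStub* find(const std::string& mimeType,
                             const std::string& scheme) const {
        auto it = map_.find(std::make_pair(mimeType, scheme));
        if (it == map_.end() && !scheme.empty()) {
            it = map_.find(std::make_pair(mimeType, std::string()));
        }
        return it == map_.end() ? nullptr : it->second.get();
    }

private:
    std::map<std::pair<std::string, std::string>,
             std::unique_ptr<AnalyzerStub>> map_;
};
```

In this sketch the caller passes a freshly allocated analyzer and must not delete it afterwards, for example `registry.addAnalyzer("application/xml", "", new AnalyzerStub("generic-xml"));`.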
Segment and tokenize a document, assign types to tokens and metadata and normalize their values.
- Parameters
-
[in] | content | document content string to analyze |
[in] | dclass | description of the content type and encoding to process |
- Returns
- the analyzed document
Create an analyzer interface to instrument and then add with addAnalyzer.
- Parameters
-
[in] | mimetype | MIME type of the document for this analyzer, determines the document segmenter |
[in] | scheme | scheme of the document to determine the segmenter options (can be empty meaning not defined) |
- Returns
- the analyzer (with ownership)
Create the context used for analyzing multipart or very big documents.
- Parameters
-
[in] | dclass | description of the content type and encoding to process |
- Returns
- the document analyzer context (with ownership)
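Why a separate context is useful for multipart or very big documents can be shown with a small sketch of incremental processing. The class and method names below (`ChunkedContextSketch`, `putInput`, `analyzeNext`) are hypothetical and chosen for illustration; the real context interface is not described on this page. The sketch also assumes, purely as a toy model, that parts are separated by newlines and that "analysis" just returns the raw part.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a document analyzer context: input arrives in
// chunks, and complete parts are handed out as soon as they are available.
class ChunkedContextSketch {
public:
    // Feed the next chunk of input; eof marks the final chunk.
    void putInput(const char* chunk, std::size_t size, bool eof) {
        buffer_.append(chunk, size);
        eof_ = eof;
    }

    // Fetch the next complete part. Returns false when more input is needed
    // or when everything has been consumed.
    bool analyzeNext(std::string& part) {
        std::size_t pos = buffer_.find('\n');
        if (pos == std::string::npos) {
            if (!eof_ || buffer_.empty()) return false;
            part = buffer_;  // last part, no trailing separator
            buffer_.clear();
            return true;
        }
        part = buffer_.substr(0, pos);
        buffer_.erase(0, pos + 1);
        return true;
    }

private:
    std::string buffer_;
    bool eof_ = false;
};

// Drives the context over content split into fixed-size chunks, so the whole
// input never has to be handed over in a single call.
std::vector<std::string> analyzeInChunks(const std::string& content,
                                         std::size_t chunkSize) {
    if (chunkSize == 0) chunkSize = 1;  // avoid a loop that makes no progress
    ChunkedContextSketch ctx;
    std::vector<std::string> parts;
    std::size_t pos = 0;
    do {
        std::size_t n = std::min(chunkSize, content.size() - pos);
        ctx.putInput(content.data() + pos, n, pos + n == content.size());
        pos += n;
        std::string part;
        while (ctx.analyzeNext(part)) parts.push_back(part);
    } while (pos < content.size());
    return parts;
}
```

The caller alternates between feeding input and draining results, which keeps memory use bounded by the size of one part plus one chunk rather than by the size of the whole document.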
Get the analyzer interface assigned to a document class.
- Parameters
-
[in] | dclass | description of the content type and encoding to process |
- Returns
- a reference to the analyzer interface
Return a structure with all definitions for introspection.
- Returns
- the structure with all definitions for introspection
The documentation for this class was generated from the following file: