strusAnalyzer  0.17
strus::ContentStatisticsInterface Class Reference (abstract)

Defines a program for analyzing a document, splitting it into normalized terms that can be fed to the strus IR engine.

#include <contentStatisticsInterface.hpp>

Public Member Functions

virtual ~ContentStatisticsInterface ()
 Destructor.

virtual void addLibraryElement (const std::string &type, const std::string &regex, int priority, int minLength, int maxLength, TokenizerFunctionInstanceInterface *tokenizer, const std::vector< NormalizerFunctionInstanceInterface * > &normalizers)=0
 Declare an element of the library used to categorize features.

virtual void addVisibleAttribute (const std::string &name)=0
 Define an attribute to be visible in content statistics path conditions.

virtual void addSelectorExpression (const std::string &expression)=0
 Define a selector expression that is chosen for content elements that match it.

virtual ContentStatisticsContextInterface * createContext () const =0
 Create the context used for collecting document statistics.

virtual analyzer::ContentStatisticsView view () const =0
 Return a structure with all definitions for introspection.
 

Detailed Description

Defines a program for analyzing a document, splitting it into normalized terms that can be fed to the strus IR engine.

Constructor & Destructor Documentation

virtual strus::ContentStatisticsInterface::~ContentStatisticsInterface ( )
inline, virtual

Destructor.

Member Function Documentation

virtual void strus::ContentStatisticsInterface::addLibraryElement ( const std::string &  type,
const std::string &  regex,
int  priority,
int  minLength,
int  maxLength,
TokenizerFunctionInstanceInterface *  tokenizer,
const std::vector< NormalizerFunctionInstanceInterface * > &  normalizers 
)
pure virtual

Declare an element of the library used to categorize features.

Parameters
[in] type: type name of the feature
[in] regex: regular expression that has to match the whole segment for the segment to be considered a candidate
[in] priority: non-negative number specifying the priority given to matches; if several elements match, only the ones with the highest priority are selected
[in] minLength: minimum number of tokens, or -1 for no restriction
[in] maxLength: maximum number of tokens, or -1 for no restriction
[in] tokenizer: tokenizer to use for this feature (ownership is passed to this object)
[in] normalizers: list of normalizers to use for this feature (ownership of the elements is passed to this object)
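
The selection rule described by the parameters above (whole-segment regex match, optional token count limits, highest priority wins) can be illustrated with a small self-contained sketch. It does not use the strus headers and is not the library's implementation: LibraryElement, countTokens and classify are hypothetical names, and whitespace splitting stands in for the real tokenizer.

```cpp
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for one addLibraryElement() declaration (not a strus type).
struct LibraryElement
{
    std::string type;
    std::string regex;
    int priority;   // non-negative
    int minLength;  // minimum number of tokens, -1 for no restriction
    int maxLength;  // maximum number of tokens, -1 for no restriction
};

// Assumption: whitespace-separated tokens stand in for the tokenizer result.
static int countTokens( const std::string& segment)
{
    std::istringstream in( segment);
    std::string tok;
    int n = 0;
    while (in >> tok) ++n;
    return n;
}

// Return the types of all elements that match the whole segment, satisfy the
// token count limits, and share the highest priority among those matches.
static std::vector<std::string> classify(
        const std::vector<LibraryElement>& library, const std::string& segment)
{
    std::vector<std::string> best;
    int bestPriority = -1;
    const int ntok = countTokens( segment);
    for (const LibraryElement& e : library)
    {
        // regex_match succeeds only if the expression covers the whole segment
        if (!std::regex_match( segment, std::regex( e.regex))) continue;
        if (e.minLength >= 0 && ntok < e.minLength) continue;
        if (e.maxLength >= 0 && ntok > e.maxLength) continue;
        if (e.priority > bestPriority)
        {
            bestPriority = e.priority;
            best.clear();
        }
        if (e.priority == bestPriority) best.push_back( e.type);
    }
    return best;
}
```

With a library holding a high-priority "number" element and a lower-priority catch-all, a segment such as "42" matches both, but only "number" is reported because of its higher priority.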
virtual void strus::ContentStatisticsInterface::addSelectorExpression ( const std::string &  expression)
pure virtual

Define a selector expression that is chosen for content elements that match it.

Parameters
[in] expression: expression for selecting chunks
virtual void strus::ContentStatisticsInterface::addVisibleAttribute ( const std::string &  name)
pure virtual

Define an attribute to be visible in content statistics path conditions.

Parameters
[in] name: name of the attribute to show in a path
virtual ContentStatisticsContextInterface* strus::ContentStatisticsInterface::createContext ( ) const
pure virtual

Create the context used for collecting document statistics.

Returns
the document content statistics context (with ownership)
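
Because the returned pointer is owned by the caller, one common way to handle it is to wrap it in a std::unique_ptr right away. The sketch below is only an illustration of that pattern: the classes in namespace sketch are hypothetical stand-ins that mirror the documented signature, not the strus headers, and the counting dummy exists only to make the ownership transfer observable.

```cpp
#include <memory>

// Hypothetical stand-ins mirroring only the documented signature (NOT the strus headers).
namespace sketch {

class ContextInterface
{
public:
    virtual ~ContextInterface() {}
};

class StatisticsInterface
{
public:
    virtual ~StatisticsInterface() {}
    // Same contract as documented: the caller owns the returned object.
    virtual ContextInterface* createContext() const = 0;
};

// Counter of live contexts, used only to observe when a context is released.
static int g_liveContexts = 0;

class CountingContext : public ContextInterface
{
public:
    CountingContext() { ++g_liveContexts; }
    ~CountingContext() override { --g_liveContexts; }
};

class DummyStatistics : public StatisticsInterface
{
public:
    ContextInterface* createContext() const override
    {
        return new CountingContext();
    }
};

// Take ownership immediately so the context is deleted on every exit path.
inline std::unique_ptr<ContextInterface> makeContext( const StatisticsInterface& stats)
{
    return std::unique_ptr<ContextInterface>( stats.createContext());
}

} // namespace sketch
```

Holding the result in a smart pointer keeps the "with ownership" contract explicit at the call site and avoids a manual delete.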
virtual analyzer::ContentStatisticsView strus::ContentStatisticsInterface::view ( ) const
pure virtual

Return a structure with all definitions for introspection.

Returns
the structure with all definitions for introspection

The documentation for this class was generated from the following file: