strusAnalyzer  0.17
strus::TokenizerFunctionInstanceInterface Class Reference (abstract)

Interface for tokenization.

#include <tokenizerFunctionInstanceInterface.hpp>

Public Member Functions

virtual ~TokenizerFunctionInstanceInterface ()
 Destructor.
 
virtual bool concatBeforeTokenize () const =0
 Flag defined by tokenizer indicating that different segments defined by the tag hierarchy should be concatenated before tokenization.
 
virtual std::vector< analyzer::Token > tokenize (const char *src, std::size_t srcsize) const =0
 Tokenize a segment into a list of tokens.
 
virtual analyzer::FunctionView view () const =0
 Get the definition of the function as structure for introspection.
 

Detailed Description

Interface for tokenization.

Constructor & Destructor Documentation

virtual strus::TokenizerFunctionInstanceInterface::~TokenizerFunctionInstanceInterface ( )
inline virtual

Destructor.

Member Function Documentation

virtual bool strus::TokenizerFunctionInstanceInterface::concatBeforeTokenize ( ) const
pure virtual

Flag defined by tokenizer indicating that different segments defined by the tag hierarchy should be concatenated before tokenization.

Returns
true if the argument chunks should be passed as one concatenated string, false otherwise
Remarks
This flag is needed for context-sensitive tokenization, for example for recognizing punctuation.
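
To illustrate why the flag matters, the following is a minimal sketch of how a caller might honor concatBeforeTokenize(). Everything in the concat_sketch namespace is a hypothetical stand-in so that the example compiles without the strus headers: the reduced Tokenizer interface returns plain strings instead of analyzer::Token, and SentenceTokenizer and tokenizeSegments are invented for illustration. How the strus analyzer actually joins segments is not described here; plain concatenation without a separator is an assumption.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace concat_sketch {

// Hypothetical, reduced stand-in for the interface (only the two members used here).
struct Tokenizer {
    virtual ~Tokenizer() {}
    virtual bool concatBeforeTokenize() const = 0;
    // Returns token texts for brevity; the real method returns analyzer::Token objects.
    virtual std::vector<std::string> tokenize(const char* src, std::size_t srcsize) const = 0;
};

// Toy sentence splitter: a token ends at '.' or at the end of the input.
struct SentenceTokenizer : Tokenizer {
    bool concat;
    explicit SentenceTokenizer(bool concat_) : concat(concat_) {}

    bool concatBeforeTokenize() const override { return concat; }

    std::vector<std::string> tokenize(const char* src, std::size_t srcsize) const override {
        std::vector<std::string> out;
        std::string cur;
        for (std::size_t i = 0; i < srcsize; ++i) {
            if (src[i] == '.') {
                if (!cur.empty()) out.push_back(cur);
                cur.clear();
            } else {
                cur.push_back(src[i]);
            }
        }
        if (!cur.empty()) out.push_back(cur);  // unterminated rest of the input
        return out;
    }
};

// Caller side (assumed behavior): join all segments first if the tokenizer asks
// for it, otherwise tokenize every segment on its own.
std::vector<std::string> tokenizeSegments(const Tokenizer& tk,
                                          const std::vector<std::string>& segments) {
    if (tk.concatBeforeTokenize()) {
        std::string joined;
        for (const std::string& s : segments) joined += s;
        return tk.tokenize(joined.data(), joined.size());
    }
    std::vector<std::string> out;
    for (const std::string& s : segments) {
        std::vector<std::string> part = tk.tokenize(s.data(), s.size());
        out.insert(out.end(), part.begin(), part.end());
    }
    return out;
}

} // namespace concat_sketch
```

With the segments "One sen" and "tence.Two", a tokenizer that returns true sees the whole sentence and yields two tokens, while one that returns false cuts the sentence at the segment boundary and yields three.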
virtual std::vector<analyzer::Token> strus::TokenizerFunctionInstanceInterface::tokenize ( const char * src, std::size_t srcsize ) const
pure virtual

Tokenize a segment into a list of tokens.

Parameters
[in] src      pointer to segment to tokenize
[in] srcsize  size of the segment to tokenize in bytes
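
A whitespace tokenizer written against this signature might look roughly like the sketch below. The types in the ws_sketch namespace are hypothetical stand-ins so that the example compiles without the strus headers: the Token fields (ordpos, origpos, origsize) are assumed for illustration and may not match the real analyzer::Token, and the view() method is left out of the stand-in interface. The sketch relies only on srcsize to find the end of the segment and does not assume a terminating null byte.

```cpp
#include <cstddef>
#include <string>
#include <vector>

namespace ws_sketch {

// Hypothetical token layout; the real analyzer::Token may differ.
struct Token {
    int ordpos;            // ordering key (assumed here: byte offset of the token)
    std::size_t origpos;   // byte offset of the token in the segment
    std::size_t origsize;  // length of the token in bytes
};

// Hypothetical stand-in for the interface documented above (view() omitted).
class TokenizerFunctionInstanceInterface {
public:
    virtual ~TokenizerFunctionInstanceInterface() {}
    virtual bool concatBeforeTokenize() const = 0;
    virtual std::vector<Token> tokenize(const char* src, std::size_t srcsize) const = 0;
};

class WhitespaceTokenizer : public TokenizerFunctionInstanceInterface {
public:
    // Splitting on whitespace needs no context across segments.
    bool concatBeforeTokenize() const override { return false; }

    std::vector<Token> tokenize(const char* src, std::size_t srcsize) const override {
        std::vector<Token> out;
        std::size_t i = 0;
        while (i < srcsize) {
            // Skip leading whitespace.
            while (i < srcsize && isSpace(src[i])) ++i;
            const std::size_t start = i;
            // Consume one run of non-whitespace bytes.
            while (i < srcsize && !isSpace(src[i])) ++i;
            if (i > start) {
                out.push_back(Token{static_cast<int>(start), start, i - start});
            }
        }
        return out;
    }

private:
    static bool isSpace(char c) {
        return c == ' ' || c == '\t' || c == '\n' || c == '\r';
    }
};

} // namespace ws_sketch
```

Reporting offsets and sizes instead of copying the token text keeps the tokens cheap and lets the caller map each token back to its position in the original segment.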
virtual analyzer::FunctionView strus::TokenizerFunctionInstanceInterface::view ( ) const
pure virtual

Get the definition of the function as structure for introspection.

Returns
structure for introspection

The documentation for this class was generated from the following file: