Module perlin::language

This module will provide the utilities needed to process natural-language text for information retrieval.

It will contain methods for tokenization, stemming, normalization and so on.

At the moment, though, it provides only a very basic analyzer function.

Functions

basic_analyzer

Analyzes a string and returns a vector of terms. Splits the input at non-alphanumeric characters and converts each token to lowercase.
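
The behavior described above can be approximated with the standard library alone. The following is a minimal sketch, not the crate's actual implementation: the name `basic_analyzer_sketch`, its signature, and the decision to drop empty tokens (which arise when separators are adjacent) are assumptions made for illustration.

```rust
/// Hypothetical stand-in for an analyzer of this kind; the name and
/// signature are assumptions, not perlin's actual API.
fn basic_analyzer_sketch(input: &str) -> Vec<String> {
    input
        // Split wherever a character is not alphanumeric (Unicode-aware).
        .split(|c: char| !c.is_alphanumeric())
        // Assumption: adjacent separators produce empty slices, which are dropped.
        .filter(|token| !token.is_empty())
        // Normalize each remaining token to lowercase.
        .map(|token| token.to_lowercase())
        .collect()
}
```

Under these assumptions, calling the sketch on `"Hello, World! It's 2024."` yields the terms `hello`, `world`, `it`, `s` and `2024`. Note that the apostrophe acts as a separator, so contractions are split into two terms.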