§AdaRank: a boosting algorithm for information retrieval
AdaRank is a popular learning-to-rank algorithm based on AdaBoost. See the original paper for more details.
AdaRank uses boosting to learn a ranking function from a set of features. AdaBoost, the algorithm it builds on, is a popular ensemble method that learns a strong classifier from a set of weak classifiers.
use adarank::AdaRank;
use adarank::eval::map::MAP;
use adarank::loader::svmlight::SVMLight;
let corpus = std::path::Path::new("benchmarks/OHSUMED").join("Data/All/OHSUMED.txt");
// Load a SVMLight dataset.
let ohsumed_dataset = SVMLight::load(corpus.to_str().unwrap()).unwrap();
// Clone a `RankList` to test later...
let test_sample = ohsumed_dataset[0].clone();
// Create an AdaRank learner with MAP as the evaluation metric, 50 iterations,
// 3 max consecutive selections, and 0.003 tolerance.
let mut adarank = AdaRank::new(ohsumed_dataset, Box::new(MAP), 50, 3, 0.003, None, None);
// Fit the learner to the dataset.
adarank.fit().unwrap();
// Get the test `DataPoint` from the `RankList`.
let dp = test_sample.get(0).unwrap();
// Predict the score for the test `DataPoint`.
let doc_label = adarank.predict(&dp);
println!("Document {} has the score {:.2} for query {}",
    dp.get_description().unwrap(),
    doc_label,
    dp.get_query_id());
A good place to get started is the example source code.
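For intuition about the boosting step mentioned in the description, here is a minimal, self-contained sketch of the two update rules from the published AdaRank formulation: the weight given to a selected weak ranker, and the reweighting of training queries. It is not taken from this crate's source. The function names `weak_ranker_weight` and `reweight_queries` are hypothetical, and the per-query scores are assumed to be metric values in the range [-1, 1].

```rust
/// Weight (alpha) of a weak ranker, given the current per-query weights `p`
/// and the weak ranker's per-query metric scores `e` (assumed in [-1, 1]).
/// Hypothetical helper for illustration; not part of the crate's API.
/// Note: a ranker that scores exactly 1.0 on every query yields infinity.
fn weak_ranker_weight(p: &[f64], e: &[f64]) -> f64 {
    let num: f64 = p.iter().zip(e).map(|(pi, ei)| pi * (1.0 + ei)).sum();
    let den: f64 = p.iter().zip(e).map(|(pi, ei)| pi * (1.0 - ei)).sum();
    0.5 * (num / den).ln()
}

/// New distribution over queries, given the per-query metric scores of the
/// ensemble built so far. Queries the ensemble ranks poorly get more weight.
/// Hypothetical helper for illustration; not part of the crate's API.
fn reweight_queries(ensemble_scores: &[f64]) -> Vec<f64> {
    let unnormalized: Vec<f64> = ensemble_scores.iter().map(|e| (-e).exp()).collect();
    let total: f64 = unnormalized.iter().sum();
    unnormalized.into_iter().map(|w| w / total).collect()
}
```

Calling `reweight_queries(&[1.0, 0.0])` gives the second query the larger weight, so the next round focuses on the query the ensemble currently ranks worse.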
Modules§
- datapoint
- Define a core primitive for the library: `DataPoint`. A `DataPoint` is an element of a `RankList` in a `DataSet`.
- ensemble
- Define a class of `Ranker`s based on ensemble methods.
- error
- Define the error type for the library.
- eval
- Define evaluators for the library. Evaluators are used to evaluate the performance of a `Learner`.
- learner
- Define the `Learner` primitive. A `Learner` defines the operations required to train a `Ranker`.
- loader
- Define the loader for the library. A `Loader` is used to load a `DataSet` from an IO stream.
- ranker
- Define the `Ranker` primitive. All AI algorithms in the library are `Ranker`s, which means they can be used to predict the score of `DataPoint`s in `RankList`s.
- ranklist
- Define a core primitive for the library: `RankList`. A `RankList` is a list of `DataPoint`s and provides methods for ranking them.
- utils
- Utility functions for the library. These functions are not part of the core API, but are useful inside the library.
Macros§
- dp
- A macro to create a new `DataPoint` from a label, a query_id, the features and a description. The features are given as a vector of `Feature`s. The description is optional.
- rl
- A macro to create a `RankList` from a vector of `DataPoint`s, each represented by a tuple of label, query_id, features and the optional description.
Type Aliases§
- DataSet
- For simplicity, a `DataSet` is a vector of `RankList`s.
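
To make the nesting of these primitives concrete, the following sketch groups data points into one rank list per query and collects the rank lists into a vector. It uses stand-in types only: `ToyDataPoint`, `ToyRankList`, `ToyDataSet` and `group_by_query` are hypothetical names, and the crate's real `DataPoint` and `RankList` have their own fields and methods. The sketch also assumes that points for the same query are adjacent in the input.

```rust
// Stand-in types for illustration only; these are NOT the crate's types.
#[derive(Debug, Clone, PartialEq)]
struct ToyDataPoint {
    label: u8,
    query_id: u32,
    features: Vec<f32>,
    description: Option<String>,
}

// A rank list holds the points of one query; a dataset is a vector of rank lists.
type ToyRankList = Vec<ToyDataPoint>;
type ToyDataSet = Vec<ToyRankList>;

/// Group consecutive points that share a query id into one rank list each.
/// Assumes the input is already ordered so that equal query ids are adjacent.
fn group_by_query(points: Vec<ToyDataPoint>) -> ToyDataSet {
    let mut dataset: ToyDataSet = Vec::new();
    for point in points {
        // Does the most recent rank list belong to the same query?
        let same_query = dataset
            .last()
            .map_or(false, |list| list[0].query_id == point.query_id);
        if same_query {
            dataset.last_mut().unwrap().push(point);
        } else {
            dataset.push(vec![point]);
        }
    }
    dataset
}
```

Three points with query ids 1, 1 and 2 produce a dataset of two rank lists, the first holding two points and the second holding one.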