Crate text_analysis

Text_Analysis

Analyzes text stored as *.txt or *.pdf files in a chosen directory. Files in subdirectories are not read. The crate counts all words and then, for every unique word, searches for the words in its vicinity (±5 words). Results are stored in the file [date/time]results_word_analysis.txt.

Usage: text_analysis path
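
To illustrate what "doesn't read files in subdirectories" means in practice, here is a minimal sketch of a non-recursive directory scan using only the standard library. The function name list_text_files_sketch is hypothetical and not part of this crate, and matching extensions case-insensitively is an assumption; the crate's own file discovery may differ.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Hypothetical helper (not part of text_analysis): lists the *.txt and *.pdf
/// files directly inside `dir`, without descending into subdirectories.
fn list_text_files_sketch(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut files = Vec::new();
    // read_dir only yields the direct children of `dir`.
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        // Skip subdirectories and anything else that is not a regular file.
        if !path.is_file() {
            continue;
        }
        // Assumption: extensions are compared case-insensitively.
        let ext = path
            .extension()
            .and_then(|e| e.to_str())
            .map(|e| e.to_lowercase());
        if matches!(ext.as_deref(), Some("txt") | Some("pdf")) {
            files.push(path);
        }
    }
    // Sort for a stable order; read_dir makes no ordering guarantee.
    files.sort();
    Ok(files)
}
```

Because the loop never calls itself on a child directory, a file such as sub/notes.txt is ignored even though its extension matches.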

Example

use std::collections::HashMap;
use text_analysis::{count_words, sort_map_to_vec, trim_to_words, words_near};

// Split the input into single words.
let content_string: String = "An example phrase including two times the word two".to_string();
let content_vec: Vec<String> = trim_to_words(content_string).unwrap();

// Count every word, then sort the counts by frequency.
let word_frequency = count_words(&content_vec).unwrap();
let words_sorted = sort_map_to_vec(word_frequency).unwrap();

// For every unique word, collect the words found in its vicinity (+-5 words).
let mut words_near_map: HashMap<String, HashMap<String, u32>> = HashMap::new();
for (index_rang, word) in words_sorted.iter().enumerate() {
    words_near_map.extend(words_near(&word, index_rang, &content_vec, &words_sorted).unwrap());
}

// Build a readable report: one entry per unique word.
let mut result_as_string = String::new();
for word in words_sorted {
    let (word_only, frequency) = &word;
    let words_near = &words_near_map[word_only];
    let combined = format!(
        "Word: {:?}, Frequency: {:?},\nWords near: {:?} \n\n",
        word_only,
        frequency,
        sort_map_to_vec(words_near.to_owned()).unwrap()
    );
    result_as_string.push_str(&combined);
}
// Use {} rather than {:?} so the line breaks in the report are printed, not escaped.
println!("{}", result_as_string);

Functions

count_words

Counts the words in the given &Vec. Returns a Result containing a HashMap<String, u32> that maps each word to its count.
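
The counting step can be pictured with a short sketch based on the standard HashMap entry API. The name count_words_sketch is hypothetical; this is an illustration of the technique, not the crate's source, and unlike count_words it returns the map directly instead of wrapping it in a Result.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for illustration only: counts how often each word
/// occurs in `words`.
fn count_words_sketch(words: &[String]) -> HashMap<String, u32> {
    let mut counts: HashMap<String, u32> = HashMap::new();
    for word in words {
        // Insert the word with a count of 0 on first sight, then increment.
        *counts.entry(word.clone()).or_insert(0) += 1;
    }
    counts
}
```

For the words "two", "and", "two" this yields a map with "two" counted twice and "and" counted once.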

save_file

Saves a file to the given path. Returns a Result.
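
The parameters of save_file are not listed on this page, so the following is only a sketch of the general idea using std::fs::write. The name save_text_sketch and its two parameters are assumptions made for illustration.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper (signature assumed): writes `content` to `path`,
/// creating the file or replacing it if it already exists.
fn save_text_sketch(path: &Path, content: &str) -> io::Result<()> {
    // fs::write opens, writes and closes the file in one call and
    // reports any I/O failure through the returned Result.
    fs::write(path, content)
}
```

Returning io::Result lets the caller decide whether a failed write should abort the analysis or only be reported.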

sort_map_to_vec

Sorts the words of a HashMap<Word, Frequency> by frequency into a Vector. Returns a Result.
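
A minimal sketch of turning a frequency map into a sorted vector follows. The name sort_by_frequency_sketch is hypothetical, and two details are assumptions the description above does not state: the order is descending (most frequent first), and ties are broken alphabetically to make the output deterministic.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for illustration only: converts a frequency map
/// into a vector of (word, count) pairs sorted by count.
fn sort_by_frequency_sketch(map: HashMap<String, u32>) -> Vec<(String, u32)> {
    let mut pairs: Vec<(String, u32)> = map.into_iter().collect();
    // Assumption: highest count first; equal counts are ordered alphabetically,
    // because HashMap iteration order is otherwise arbitrary.
    pairs.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    pairs
}
```

Without the tie-break, words with the same count could appear in a different order on every run.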

trim_to_words

Splits the content of a file into single words, collected in a Vector. Returns a Result.
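
One way to split text into single words is sketched below. The name split_into_words_sketch is hypothetical, and both the splitting rule (every non-alphanumeric character is a separator) and the lowercasing are assumptions; the crate's trim_to_words may normalize text differently.

```rust
/// Hypothetical stand-in for illustration only: splits `content` into
/// lowercase words, treating every non-alphanumeric character as a separator.
fn split_into_words_sketch(content: &str) -> Vec<String> {
    content
        .split(|c: char| !c.is_alphanumeric())
        // Consecutive separators produce empty pieces; drop them.
        .filter(|piece| !piece.is_empty())
        // Assumption: lowercase so that "Two" and "two" count as the same word.
        .map(|piece| piece.to_lowercase())
        .collect()
}
```

With this rule, punctuation attached to a word (for example a trailing comma) never ends up in the word list.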

words_near

Searches for words within ±5 positions of the given word. Returns a Result.
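
The ±5 window can be sketched as follows. The name words_near_sketch and its simplified signature are hypothetical and differ from the crate's words_near, which takes four arguments in the example above. The sketch assumes that the window is clipped at the start and end of the text and that another occurrence of the target word inside the window is counted as a neighbour.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for illustration only: counts the words found within
/// `window` positions before or after each occurrence of `target`.
fn words_near_sketch(target: &str, words: &[String], window: usize) -> HashMap<String, u32> {
    let mut near: HashMap<String, u32> = HashMap::new();
    for (i, word) in words.iter().enumerate() {
        if word.as_str() != target {
            continue;
        }
        // Clip the window to the bounds of the text.
        let start = i.saturating_sub(window);
        let end = (i + window + 1).min(words.len());
        for (offset, neighbour) in words[start..end].iter().enumerate() {
            // Skip the occurrence itself; other occurrences of the target
            // inside the window still count (assumption).
            if start + offset == i {
                continue;
            }
            *near.entry(neighbour.clone()).or_insert(0) += 1;
        }
    }
    near
}
```

Calling it with a window of 5 mirrors the ±5 vicinity described above; a neighbour that falls into the windows of two occurrences of the target is counted twice.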