fetcher is a flexible async framework for building robust data pipelines that extract, transform, and deliver data from various sources to diverse destinations. In simpler terms, it makes it easy to create an app that periodically checks a source, for example a website, for some data, makes it pretty, and sends it to the users.
fetcher is designed to be easily extensible so that it can cover as many use cases as possible, while providing built-in tools for the most common ones.
§Architecture
At the heart of fetcher is the Task. It represents a specific instance of a data pipeline which consists of 3 main stages:
- Source: Fetches data from an external source (e.g. HTTP endpoint, email inbox).
- Action: Applies transformations (filters, modifications, parsing) to the fetched data.
- Sink: Sends the transformed data to a destination (e.g. Discord channel, Telegram bot, another program’s stdin).
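The three stages can be pictured as three small traits chained together. The sketch below is a synchronous, std-only illustration and not fetcher's actual (async) API: the trait names echo the stage names above, but the methods `fetch`, `apply`, and `send`, the `run_task` function, and the concrete types `StaticSource`, `TrimNonEmpty`, and `MemorySink` are all hypothetical.

```rust
// Hypothetical, simplified stand-ins for the three pipeline stages.
// fetcher's real traits are async and operate on entries, not plain strings.

/// Stage 1: produces raw items from somewhere.
trait Source {
    fn fetch(&mut self) -> Vec<String>;
}

/// Stage 2: transforms or filters the fetched items.
trait Action {
    fn apply(&self, items: Vec<String>) -> Vec<String>;
}

/// Stage 3: delivers the final items to a destination.
trait Sink {
    fn send(&mut self, item: &str);
}

/// A source that returns a fixed list (stands in for an HTTP endpoint).
struct StaticSource(Vec<String>);

impl Source for StaticSource {
    fn fetch(&mut self) -> Vec<String> {
        self.0.clone()
    }
}

/// An action that trims whitespace and drops items that end up empty.
struct TrimNonEmpty;

impl Action for TrimNonEmpty {
    fn apply(&self, items: Vec<String>) -> Vec<String> {
        items
            .into_iter()
            .map(|s| s.trim().to_owned())
            .filter(|s| !s.is_empty())
            .collect()
    }
}

/// A sink that collects everything it is sent (stands in for a chat channel).
#[derive(Default)]
struct MemorySink {
    sent: Vec<String>,
}

impl Sink for MemorySink {
    fn send(&mut self, item: &str) {
        self.sent.push(item.to_owned());
    }
}

/// Runs one pass of the pipeline: source -> actions (in order) -> sink.
fn run_task(source: &mut dyn Source, actions: &[Box<dyn Action>], sink: &mut dyn Sink) {
    let mut items = source.fetch();
    for action in actions {
        items = action.apply(items);
    }
    for item in &items {
        sink.send(item);
    }
}
```

Keeping each stage behind its own trait is what lets a pipeline mix and match stages: any source can feed any chain of actions, and any sink can receive the result.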
An Entry is the unit of data flowing through the pipeline. It contains:
- id: A unique identifier for the entry, used for tracking read/unread status and replies.
- raw_contents: The raw, untransformed data fetched from the source.
- msg: A Message that contains the formatted and structured data, like title, body, and link, that will end up being sent to a sink.
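To make the shape of an entry concrete, here is a minimal sketch of the three fields described above. The field names `id`, `raw_contents`, and `msg` come from the description; everything else (the `Option<String>` types, the exact `Message` fields, and the `split_title_and_body` helper) is an assumption made for illustration, and the real types in fetcher may differ.

```rust
// Hypothetical mirror of the fields described above; the real types in
// fetcher may have more fields and different exact types.

/// Formatted, structured content that a sink will eventually send.
#[derive(Debug, Clone, Default, PartialEq)]
struct Message {
    title: Option<String>,
    body: Option<String>,
    link: Option<String>,
}

/// The unit of data flowing through the pipeline.
#[derive(Debug, Clone, Default, PartialEq)]
struct Entry {
    /// Unique identifier, used to tell read entries from unread ones.
    id: Option<String>,
    /// Raw, untransformed data from the source.
    raw_contents: Option<String>,
    /// The message that actions build up step by step.
    msg: Message,
}

/// Example action (hypothetical): use the first line of the raw contents as
/// the title and the remaining text, if any, as the body.
fn split_title_and_body(mut entry: Entry) -> Entry {
    // Compute owned values first so the borrow of `raw_contents` ends
    // before `msg` is modified.
    let parts = entry.raw_contents.as_deref().map(|raw| match raw.split_once('\n') {
        Some((first, rest)) => (first.trim().to_owned(), Some(rest.trim().to_owned())),
        None => (raw.trim().to_owned(), None),
    });
    if let Some((title, body)) = parts {
        entry.msg.title = Some(title);
        entry.msg.body = body;
    }
    entry
}
```

The raw contents stay untouched while the message is filled in, so later actions can still parse the original data.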
A Job is a collection of tasks that are executed together, potentially on a schedule.
Jobs can also be run either concurrently or in parallel as a part of a JobGroup.
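The relationship between tasks, jobs, and job groups can be sketched with plain OS threads. This is only an illustration of the "jobs running in parallel" idea under simplified assumptions: the `Task` and `Job` structs below are hypothetical stand-ins, `run_group_in_parallel` is an invented helper, and fetcher itself is async rather than thread-per-job.

```rust
use std::thread;

/// Hypothetical stand-in for a task: a named unit of work.
struct Task {
    name: String,
}

impl Task {
    /// Pretends to run the pipeline and reports a result line.
    fn run(&self) -> String {
        format!("{} done", self.name)
    }
}

/// Hypothetical stand-in for a job: tasks that are executed together.
struct Job {
    tasks: Vec<Task>,
}

impl Job {
    /// Runs every task in order and collects the results.
    fn run(&self) -> Vec<String> {
        self.tasks.iter().map(Task::run).collect()
    }
}

/// Runs all jobs in parallel, one scoped thread per job, and returns the
/// results in the same order as the input jobs.
fn run_group_in_parallel(jobs: &[Job]) -> Vec<Vec<String>> {
    thread::scope(|scope| {
        // Spawn every job first so they all run at the same time...
        let handles: Vec<_> = jobs
            .iter()
            .map(|job| scope.spawn(move || job.run()))
            .collect();
        // ...then wait for each one in input order.
        handles
            .into_iter()
            .map(|handle| handle.join().expect("job thread panicked"))
            .collect()
    })
}
```

Scoped threads are used here so the jobs can be borrowed rather than moved; an async runtime would achieve the same grouping by spawning or joining futures instead.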
§Getting started
To use fetcher, you need to add it as a dependency to your Cargo.toml file:
[dependencies]
fetcher = "0.15"
tokio = { version = "1", features = ["full"] }
For the smallest example on how to use fetcher, please see examples/simple.rs.
More complete examples can be found in the examples/ directory. They demonstrate how to:
- Fetch data from various sources.
- Transform and filter data using regular expressions, HTML parsing, and JSON parsing.
- Send data to sinks like Telegram and Discord.
- Implement custom sources, actions, and sinks.
- Persist the read filter state in an external storage system.
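As a rough idea of what persisting read filter state involves, the sketch below tracks already-read entry ids and hands every change to a storage backend. It does not use fetcher's own traits: `StateStorage`, `MemoryStorage`, and `ReadTracker` are hypothetical names, and a real backend would write to a file or database instead of memory.

```rust
use std::collections::HashSet;

/// Hypothetical storage backend; a real one might write to a file or database.
trait StateStorage {
    fn save(&mut self, read_ids: &HashSet<String>);
}

/// In-memory backend that keeps the most recently saved ids, sorted.
#[derive(Default)]
struct MemoryStorage {
    last_saved: Vec<String>,
}

impl StateStorage for MemoryStorage {
    fn save(&mut self, read_ids: &HashSet<String>) {
        let mut ids: Vec<String> = read_ids.iter().cloned().collect();
        ids.sort();
        self.last_saved = ids;
    }
}

/// Tracks which entry ids were already read and persists changes.
struct ReadTracker<S: StateStorage> {
    read_ids: HashSet<String>,
    storage: S,
}

impl<S: StateStorage> ReadTracker<S> {
    /// Starts from state loaded earlier (for example, at program start).
    fn new(previously_read: HashSet<String>, storage: S) -> Self {
        Self { read_ids: previously_read, storage }
    }

    /// Keeps only the ids that have not been marked as read yet.
    fn retain_unread(&self, ids: Vec<String>) -> Vec<String> {
        ids.into_iter()
            .filter(|id| !self.read_ids.contains(id))
            .collect()
    }

    /// Marks an id as read and saves the updated state if anything changed.
    fn mark_as_read(&mut self, id: &str) {
        if self.read_ids.insert(id.to_owned()) {
            self.storage.save(&self.read_ids);
        }
    }
}
```

Saving on every change keeps the stored state current, so a restart does not resend entries that were already delivered.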
§Contributing
Contributions are very welcome! Please feel free to submit a pull request or open issues for any bugs, feature requests, or general feedback.
Modules§
- actions: This module contains all Actions that a list of Entries can be run through to view, modify, or filter them.
- auth: This module contains all external manual authentication implementations. For now it’s just Google OAuth2.
- ctrl_c_signal: This module contains the CtrlCSignalChannel.
- entry: This module contains the basic building block of fetcher, Entry, which is passed throughout the program and which all modules either create, modify, or consume.
- error: This module contains all errors that fetcher can emit.
- exec: This module contains the Exec source and sink. It is re-exported in the crate::sinks and crate::sources modules.
- external_save: This module contains the ExternalSave trait that implementors can use to add a way to save read filter data and the entry-to-message map externally.
- job: This module contains the Job struct and the entryway to the library.
- maybe_send: This module contains the MaybeSend, MaybeSync, and MaybeSendSync traits.
- read_filter: This module contains the ReadFilter that is used for keeping track of which entries have or have not been read, including all of its strategies.
- scaffold: This module contains a “scaffold”, in other words, functions that pre-configure your application for common uses of fetcher.
- sinks: This module contains Sink, which can be used to consume a composed Message, as well as the message module itself.
- sources: This module contains Sources that can fetch data and create new Entries out of it.
- task: This module contains the basic block of fetcher, the Task.
- utils: Miscellaneous utility extension traits for external types.
Structs§
- StaticStr: A string that always has a 'static lifetime.