pub struct AppState {
    pub config: Arc<AppConfig>,
    pub renderer: Arc<FallbackRenderer>,
    pub crawl_jobs: Arc<RwLock<HashMap<Uuid, CrawlJob>>>,
    pub extract_jobs: Arc<RwLock<HashMap<Uuid, ExtractRecord>>>,
    pub crawl_semaphore: Arc<Semaphore>,
    pub searxng: Option<Arc<SearxngClient>>,
    pub url_filter: Option<Arc<UrlFilterCfg>>,
}
Shared application state.
Fields
config: Arc<AppConfig>
renderer: Arc<FallbackRenderer>
crawl_jobs: Arc<RwLock<HashMap<Uuid, CrawlJob>>>
extract_jobs: Arc<RwLock<HashMap<Uuid, ExtractRecord>>>
/v2/extract jobs. Separate from crawl_jobs because an extract result
is a single merged JSON object, not a Vec<ScrapeData>.
crawl_semaphore: Arc<Semaphore>
searxng: Option<Arc<SearxngClient>>
SearXNG client. None when [search].searxng_url is unset, in which
case /v1/search returns a clear search_disabled error.
url_filter: Option<Arc<UrlFilterCfg>>
Server-wide default /map URL filter. None disables the filter
entirely (legacy behaviour). Per-request overrides may swap or
extend this at handler time.
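The optional searxng field implies that a search handler must check for None before doing any work. A minimal sketch of that check, assuming a hypothetical SearchError type and a stand-in SearxngClient struct (only the search_disabled code is taken from the description above; the real error type and client are not shown on this page):

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the real SearXNG client.
pub struct SearxngClient {
    pub base_url: String,
}

/// Hypothetical error type; only the "search_disabled" code comes from the docs.
#[derive(Debug, PartialEq)]
pub enum SearchError {
    Disabled,
}

impl SearchError {
    /// Machine-readable error code returned to the caller.
    pub fn code(&self) -> &'static str {
        match self {
            SearchError::Disabled => "search_disabled",
        }
    }
}

/// Return a clone of the shared client, or a clear error when search is not configured.
pub fn require_searxng(
    client: &Option<Arc<SearxngClient>>,
) -> Result<Arc<SearxngClient>, SearchError> {
    client.clone().ok_or(SearchError::Disabled)
}
```

A handler would call require_searxng(&state.searxng) first and map the error to its response format, so the disabled case never reaches the network code.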
Implementations
impl AppState
pub fn new(config: AppConfig) -> CrwResult<Self>
pub async fn start_crawl_job(&self, req: CrawlRequest) -> Uuid
Start a new crawl job and return its UUID. Spawns a background task that acquires the crawl semaphore before running.
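The pattern described here (register the job, return its id at once, and let a background worker wait for a permit) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the crate's implementation: it swaps the async task and async Semaphore for OS threads and a small hand-written semaphore, uses a u64 counter in place of Uuid, and the State, JobStatus, and start_job names are hypothetical.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Condvar, Mutex, RwLock};
use std::thread::{self, JoinHandle};

/// Minimal counting semaphore (stand-in for the async Semaphore held by AppState).
pub struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Self { permits: Mutex::new(permits), available: Condvar::new() }
    }

    /// Block until a permit is free, then take it.
    pub fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    /// Return a permit and wake one waiting worker.
    pub fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

/// Hypothetical job lifecycle; the real CrawlJob type is not shown on this page.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum JobStatus {
    Queued,
    Running,
    Completed,
}

pub struct State {
    pub jobs: Arc<RwLock<HashMap<u64, JobStatus>>>,
    pub semaphore: Arc<Semaphore>,
    next_id: Mutex<u64>,
}

impl State {
    pub fn new(max_concurrent: usize) -> Self {
        Self {
            jobs: Arc::new(RwLock::new(HashMap::new())),
            semaphore: Arc::new(Semaphore::new(max_concurrent)),
            next_id: Mutex::new(0),
        }
    }

    /// Register the job and return its id immediately; the worker waits for a permit.
    /// The JoinHandle is returned only so callers of this sketch can wait for completion.
    pub fn start_job(&self) -> (u64, JoinHandle<()>) {
        let id = {
            let mut next = self.next_id.lock().unwrap();
            *next += 1;
            *next
        };
        self.jobs.write().unwrap().insert(id, JobStatus::Queued);

        let jobs = Arc::clone(&self.jobs);
        let semaphore = Arc::clone(&self.semaphore);
        let handle = thread::spawn(move || {
            semaphore.acquire(); // at most max_concurrent jobs pass this point at once
            jobs.write().unwrap().insert(id, JobStatus::Running);
            // ... the actual crawl work would run here ...
            jobs.write().unwrap().insert(id, JobStatus::Completed);
            semaphore.release();
        });
        (id, handle)
    }
}
```

The id is handed back before the permit is acquired, so a caller can poll the job map while the job is still queued behind other crawls.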
pub async fn start_batch_job(
    &self,
    urls: Vec<String>,
    template: ScrapeRequest,
) -> Uuid
Start a /v2/batch/scrape job over an explicit URL list and return its
UUID. Reuses the crawl-job machinery (crawl_jobs + CrawlState) but
scrapes the given URLs directly — no link discovery, no same-origin
filtering, no dedup; input order is recoverable via metadata.sourceURL.
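Because results may complete in any order, a client that needs them aligned with its input list can match on the source URL. The sketch below assumes a hypothetical trimmed-down ScrapeResult whose source_url field stands in for metadata.sourceURL, and it assumes the reported URL equals the submitted string exactly (redirects or normalization could break that). Since the batch does no dedup, repeated input URLs are paired with results one at a time.

```rust
use std::collections::{HashMap, VecDeque};

/// Hypothetical, trimmed-down result: only the source URL and a payload.
#[derive(Debug, Clone, PartialEq)]
pub struct ScrapeResult {
    pub source_url: String, // stands in for metadata.sourceURL
    pub markdown: String,
}

/// Reorder results to match the input URL list.
/// An input URL with no remaining matching result yields None.
pub fn restore_input_order(
    urls: &[String],
    results: Vec<ScrapeResult>,
) -> Vec<Option<ScrapeResult>> {
    // Group results by source URL, keeping arrival order within each group.
    let mut by_url: HashMap<String, VecDeque<ScrapeResult>> = HashMap::new();
    for result in results {
        by_url
            .entry(result.source_url.clone())
            .or_default()
            .push_back(result);
    }

    // Walk the inputs in order, consuming one matching result per occurrence.
    urls.iter()
        .map(|url| by_url.get_mut(url).and_then(|queue| queue.pop_front()))
        .collect()
}
```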
pub async fn start_extract_job(
    &self,
    urls: Vec<String>,
    template: ScrapeRequest,
) -> Uuid
Start a /v2/extract job. Scrapes each URL with formats:[json] + the
shared schema (already set on template) and merges the per-URL json
objects into one — matching the live API’s single-object data shape.
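The merge step can be pictured as folding several flat objects into one. This page does not state how conflicting keys are resolved, so the policy below is purely an assumption for illustration: a shallow merge in URL order where the first non-null value for a key is kept and later URLs only fill in missing or null keys. The Value enum and merge_objects function are hypothetical stand-ins for a real JSON type.

```rust
use std::collections::BTreeMap;

/// Tiny stand-in for a JSON value (a real implementation would use a JSON library).
#[derive(Debug, Clone, PartialEq)]
pub enum Value {
    Null,
    Text(String),
    Number(f64),
}

pub type Object = BTreeMap<String, Value>;

/// Shallow-merge per-URL objects into a single object.
/// Assumed policy: the first non-null value for a key wins.
pub fn merge_objects(per_url: Vec<Object>) -> Object {
    let mut merged = Object::new();
    for object in per_url {
        for (key, value) in object {
            // Keep an existing value unless it is absent or null.
            let keep_existing = matches!(merged.get(&key), Some(v) if *v != Value::Null);
            if !keep_existing {
                merged.insert(key, value);
            }
        }
    }
    merged
}
```

Whatever the actual policy is, the result is one object rather than one entry per URL, which is why extract jobs are stored separately from crawl jobs.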
Trait Implementations
Auto Trait Implementations
impl !RefUnwindSafe for AppState
impl !UnwindSafe for AppState
impl Freeze for AppState
impl Send for AppState
impl Sync for AppState
impl Unpin for AppState
impl UnsafeUnpin for AppState
Blanket Implementations
impl<T> BorrowMut<T> for T
where
    T: ?Sized,
fn borrow_mut(&mut self) -> &mut T
impl<T> CloneToUninit for T
where
    T: Clone,
impl<A, B, T> HttpServerConnExec<A, B> for T
where
    B: Body,
impl<T> Instrument for T
fn instrument(self, span: Span) -> Instrumented<Self>
fn in_current_span(self) -> Instrumented<Self>
impl<T> IntoEither for T
fn into_either(self, into_left: bool) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left is true.
Converts self into a Right variant of Either<Self, Self>
otherwise.
fn into_either_with<F>(self, into_left: F) -> Either<Self, Self>
Converts self into a Left variant of Either<Self, Self>
if into_left(&self) returns true.
Converts self into a Right variant of Either<Self, Self>
otherwise.