Function classify_url 

pub fn classify_url(url: &str) -> RetryScope

Classifies a URL to determine which retry policy to apply.

This function inspects the URL to distinguish EdgeFirst Studio API calls (StudioApi) from file I/O operations such as S3 or CloudFront transfers (FileIO).

§Classification Algorithm

  1. Parse the URL with a proper URL parser (handles ports, query parameters, and fragments)
  2. Check protocol: only http and https URLs can be classified as StudioApi (all other schemes → FileIO)
  3. Check host: must be edgefirst.studio or *.edgefirst.studio
  4. Check path: must be exactly /api or begin with /api/
  5. If all conditions are met → StudioApi; otherwise → FileIO
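
The checks in steps 2 to 5 can be sketched roughly as follows. This is an illustration only, not the crate's actual implementation. It assumes step 1 has already produced a scheme, an optional host, and a path (for example via the url crate's Url::parse, scheme(), host_str(), and path()), and that the parser has lowercased the host. The Scope enum and classify_parts function are hypothetical names introduced here so the sketch compiles on its own.

```rust
/// Hypothetical stand-in for the crate's `RetryScope`, kept local so the
/// sketch compiles without external dependencies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Scope {
    StudioApi,
    FileIO,
}

/// Applies steps 2-5 to components that a URL parser has already extracted.
/// Assumption: `host` is lowercase, as a conforming parser would return it.
fn classify_parts(scheme: &str, host: Option<&str>, path: &str) -> Scope {
    // Step 2: only http and https can be Studio API calls.
    if scheme != "http" && scheme != "https" {
        return Scope::FileIO;
    }

    // Step 3: the host must be the apex domain or one of its subdomains.
    // The leading dot in the suffix rejects hosts like "notedgefirst.studio".
    let host_ok = match host {
        Some(h) => h == "edgefirst.studio" || h.ends_with(".edgefirst.studio"),
        None => false,
    };
    if !host_ok {
        return Scope::FileIO;
    }

    // Step 4: the path must be exactly "/api" or sit below "/api/".
    // Step 5: everything else falls through to FileIO.
    if path == "/api" || path.starts_with("/api/") {
        Scope::StudioApi
    } else {
        Scope::FileIO
    }
}
```

Working on parsed components means the port, query string, and fragment never reach the comparisons, which is consistent with their having no effect on the result.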

§Edge Cases Handled

  • Port numbers: https://test.edgefirst.studio:8080/api → StudioApi
  • Trailing slashes: https://edgefirst.studio/api/ → StudioApi
  • Query parameters: https://edgefirst.studio/api?foo=bar → StudioApi
  • Subdomains: https://ocean.edgefirst.studio/api → StudioApi
  • Similar domains: https://edgefirst.studio.com/api → FileIO (host is neither edgefirst.studio nor a subdomain of it)
  • Path injection: https://evil.com/edgefirst.studio/api → FileIO (host mismatch)
  • Non-API paths: https://edgefirst.studio/download → FileIO

§Security

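The function uses proper URL parsing to prevent domain spoofing attacks. The domain check is applied only to the parsed host, never to the path or the raw URL string, so a URL such as https://attacker.com/edgefirst.studio/api cannot pass as a Studio API call by embedding the domain in its path.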
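
To illustrate why the host is compared rather than the raw string, the contrast below sets a substring check against a host check. Both functions are hypothetical and exist only for this comparison; neither is taken from the crate.

```rust
/// Deliberately unsafe, for contrast only: matches the domain anywhere
/// in the raw URL string, including inside the path.
fn naive_is_studio(raw_url: &str) -> bool {
    raw_url.contains("edgefirst.studio/api")
}

/// Host-based check on a host that a URL parser has already extracted.
fn host_is_studio(parsed_host: &str) -> bool {
    parsed_host == "edgefirst.studio" || parsed_host.ends_with(".edgefirst.studio")
}
```

The substring check accepts https://attacker.com/edgefirst.studio/api because the domain appears in the path, while the host check rejects it because the parsed host is attacker.com.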

§Examples

use edgefirst_client::{RetryScope, classify_url};

// Studio API URLs
assert_eq!(
    classify_url("https://edgefirst.studio/api"),
    RetryScope::StudioApi
);
assert_eq!(
    classify_url("https://test.edgefirst.studio/api/datasets"),
    RetryScope::StudioApi
);
assert_eq!(
    classify_url("https://test.edgefirst.studio:443/api?token=abc"),
    RetryScope::StudioApi
);

// File I/O URLs (S3, CloudFront, etc.)
assert_eq!(
    classify_url("https://s3.amazonaws.com/bucket/file.bin"),
    RetryScope::FileIO
);
assert_eq!(
    classify_url("https://d123abc.cloudfront.net/dataset.zip"),
    RetryScope::FileIO
);
assert_eq!(
    classify_url("https://edgefirst.studio/download_model"),
    RetryScope::FileIO // Non-API path
);