impactsense-parser 0.1.0

Multi-language static analysis: parse codebases into an in-memory dependency graph for impact analysis
ImpactSense Phase 1 Incremental Pipeline - Handoff Summary

Primary intent
- Build an accuracy-first, push-driven incremental pipeline:
  GitLab Push -> Webhook API -> RabbitMQ -> Consumer -> Mode Decide -> Delta Resolve -> Targeted Parse -> Incremental Neo4j Update -> Impact Query.
- Branch strategy: new work should diverge from erlang_issues (not main).

Repository and branch context
- Parser repo root: /Users/sujal.v/Desktop/impactSenseProject/parser
- Webhook service path: parser/impactsense-webhook
- Current branch lineage: erlang_issues -> incremental-pipeline (both pointed at the same commit when last checked).
- main is on a separate line of history and is behind this work.

Jira updates
- Created subtask CRM-3568 under CRM-3482:
  "Phase 1: Implement repo-level incremental parsing and Neo4j graph updates"
- Added progress comment to CRM-3568 with completion and pending items.

Implemented in code
1) Webhook ingress service (modularized)
- Endpoints:
  - GET /healthz
  - POST /events/vcs/push
- Token validation: the secret sent in the request headers is checked against WEBHOOK_SECRET.
- Payload normalization for GitLab/GitHub/unknown.
- Deterministic job_key generation.
- Branch gating via TRACKED_BRANCH.
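
A minimal sketch of the branch gating, job_key, and token checks listed above, using only the standard library. The PushEvent struct, its field names, the job_key layout, and every function name here are assumptions made for illustration; the actual types in model.rs and webhook.rs may differ.

```rust
/// Hypothetical normalized push event; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct PushEvent {
    pub provider: String, // "gitlab", "github", or "unknown"
    pub repo: String,     // e.g. "group/project"
    pub git_ref: String,  // e.g. "refs/heads/master"
    pub before_sha: String,
    pub after_sha: String,
}

/// Returns the branch name for a `refs/heads/...` ref, or None for tags and other refs.
pub fn branch_of(git_ref: &str) -> Option<&str> {
    git_ref.strip_prefix("refs/heads/")
}

/// Branch gating: only pushes to the tracked branch (TRACKED_BRANCH) pass.
pub fn is_tracked(event: &PushEvent, tracked_branch: &str) -> bool {
    branch_of(&event.git_ref) == Some(tracked_branch)
}

/// Deterministic job key: the same push always yields the same string.
/// The exact layout is an assumption; any stable, collision-free layout would do.
pub fn job_key(event: &PushEvent) -> String {
    format!(
        "{}:{}:{}:{}..{}",
        event.provider, event.repo, event.git_ref, event.before_sha, event.after_sha
    )
}

/// Compares the provided token with the expected secret without returning
/// early on the first differing byte (the length check does return early).
pub fn token_matches(provided: &str, expected: &str) -> bool {
    let (a, b) = (provided.as_bytes(), expected.as_bytes());
    if a.len() != b.len() {
        return false;
    }
    a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y)) == 0
}
```

Building the key from the event fields rather than from a timestamp or random ID means a redelivered webhook maps to the same job_key, which is what later idempotency checks would rely on.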

2) RabbitMQ publishing
- Publishes normalized events to:
  - Exchange: impactsense.events
  - Routing key: push
- Uses persistent publish settings and confirm handling.

3) Modular file split (no monolithic main.rs)
- parser/impactsense-webhook/src/main.rs
- parser/impactsense-webhook/src/config.rs
- parser/impactsense-webhook/src/model.rs
- parser/impactsense-webhook/src/amqp.rs
- parser/impactsense-webhook/src/webhook.rs
- parser/impactsense-webhook/src/consumer.rs
- parser/impactsense-webhook/src/delta.rs

4) Consumer contract
- Consumes from AMQP_QUEUE (default impactsense.push).
- Deserializes event payload.
- Mode fallback:
  - before_sha == 0000000000000000000000000000000000000000 -> bootstrap
  - else -> incremental
- Logs contract details and ack/nack handling.
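
The mode fallback above can be sketched as a small pure function. The Mode enum and decide_mode name are assumptions, not the actual consumer.rs API. The all-zero SHA is what GitLab and GitHub send as "before" when a branch is newly created, so there is no earlier commit to diff against.

```rust
/// Processing mode chosen by the consumer (names are illustrative).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Mode {
    Bootstrap,
    Incremental,
}

/// Chooses bootstrap when `before_sha` is the 40-character all-zero SHA,
/// and incremental for any other value.
pub fn decide_mode(before_sha: &str) -> Mode {
    let is_null_sha = before_sha.len() == 40 && before_sha.bytes().all(|b| b == b'0');
    if is_null_sha {
        Mode::Bootstrap
    } else {
        Mode::Incremental
    }
}
```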

5) Delta resolve implementation
- Runs:
  git -C <repo> diff --name-status --find-renames <before_sha> <after_sha>
- Produces:
  - added_files
  - modified_files
  - deleted_files
  - renamed_files
  - parse_targets (A + M + R.new)
  - cleanup_targets (D + M + R.old)
- Wired into consumer for incremental mode.
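
A minimal sketch of turning the git diff --name-status output into the lists above. It assumes tab-separated output without -z, and it ignores statuses other than A, M, D, and R. The Delta struct layout and the parse_name_status name are assumptions; the real delta.rs may be shaped differently.

```rust
/// Result of delta resolve. Field names follow the list above;
/// the concrete types are assumptions for this sketch.
#[derive(Debug, Default, PartialEq)]
pub struct Delta {
    pub added_files: Vec<String>,
    pub modified_files: Vec<String>,
    pub deleted_files: Vec<String>,
    pub renamed_files: Vec<(String, String)>, // (old path, new path)
    pub parse_targets: Vec<String>,           // A + M + R.new
    pub cleanup_targets: Vec<String>,         // D + M + R.old
}

/// Parses lines such as "A\tpath", "M\tpath", "D\tpath", "R100\told\tnew".
/// Lines with any other status, or with missing columns, are skipped.
pub fn parse_name_status(output: &str) -> Delta {
    let mut delta = Delta::default();
    for line in output.lines() {
        let mut cols = line.split('\t');
        let status = cols.next().unwrap_or("");
        match (status.chars().next(), cols.next(), cols.next()) {
            (Some('A'), Some(path), _) => {
                delta.added_files.push(path.to_string());
                delta.parse_targets.push(path.to_string());
            }
            (Some('M'), Some(path), _) => {
                delta.modified_files.push(path.to_string());
                delta.parse_targets.push(path.to_string());
                delta.cleanup_targets.push(path.to_string());
            }
            (Some('D'), Some(path), _) => {
                delta.deleted_files.push(path.to_string());
                delta.cleanup_targets.push(path.to_string());
            }
            // Renames carry a similarity score after the letter, e.g. R100.
            (Some('R'), Some(old), Some(new)) => {
                delta.renamed_files.push((old.to_string(), new.to_string()));
                delta.parse_targets.push(new.to_string());
                delta.cleanup_targets.push(old.to_string());
            }
            _ => {}
        }
    }
    delta
}
```

Modified files appear in both target lists because their old graph data must be removed before the re-parsed data is written. Paths containing unusual characters may be quoted by git in this output format; handling that (for example with -z) is left out of the sketch.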

6) Build quality
- cargo check passes.
- Lints clean.
- Deprecated tokio-amqp usage removed; the AMQP code was updated accordingly.

Current environment variables used
- BIND_ADDR (default 0.0.0.0:8080)
- WEBHOOK_SECRET
- TRACKED_BRANCH (default master)
- AMQP_ADDR (default amqp://127.0.0.1:5672/%2f)
- AMQP_EXCHANGE (default impactsense.events)
- AMQP_ROUTING_KEY (default push)
- AMQP_QUEUE (default impactsense.push)
- GIT_REPO_PATH (default "..", the parent directory; expected to be the parser repo root, used by delta resolve)
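
A minimal sketch of loading these variables with their defaults, using only the standard library. The Config struct, its field names, and the from_lookup helper are assumptions; the actual config.rs may differ. WEBHOOK_SECRET has no default listed, so it is modeled here as optional.

```rust
/// Hypothetical runtime configuration mirroring the variables listed above.
#[derive(Debug, Clone, PartialEq)]
pub struct Config {
    pub bind_addr: String,
    pub webhook_secret: Option<String>, // no default listed; optional is an assumption
    pub tracked_branch: String,
    pub amqp_addr: String,
    pub amqp_exchange: String,
    pub amqp_routing_key: String,
    pub amqp_queue: String,
    pub git_repo_path: String,
}

impl Config {
    /// Builds the config from any key lookup, which keeps it testable
    /// without touching the process environment.
    pub fn from_lookup(lookup: impl Fn(&str) -> Option<String>) -> Self {
        let get = |key: &str, default: &str| lookup(key).unwrap_or_else(|| default.to_string());
        Config {
            bind_addr: get("BIND_ADDR", "0.0.0.0:8080"),
            webhook_secret: lookup("WEBHOOK_SECRET"),
            tracked_branch: get("TRACKED_BRANCH", "master"),
            amqp_addr: get("AMQP_ADDR", "amqp://127.0.0.1:5672/%2f"),
            amqp_exchange: get("AMQP_EXCHANGE", "impactsense.events"),
            amqp_routing_key: get("AMQP_ROUTING_KEY", "push"),
            amqp_queue: get("AMQP_QUEUE", "impactsense.push"),
            git_repo_path: get("GIT_REPO_PATH", ".."),
        }
    }

    /// Reads the real process environment.
    pub fn from_env() -> Self {
        Self::from_lookup(|key| std::env::var(key).ok())
    }
}
```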

Infra/test status discussed
- GitLab webhook delivery tested with HTTP 202.
- RabbitMQ route test passed:
  - push routed to impactsense.push
  - unknown routing key not routed
- Windows deployment discussed for host 10.166.1.220; the intended port was moved to 8093 during troubleshooting.

Pending implementation (next steps)
1) Targeted parse integration
- Add scanner/parser entrypoint that accepts parse_targets for incremental mode.
- Keep current full scan path for bootstrap.

2) Incremental Neo4j update path
- Clean up by cleanup_targets (deleted files, old paths of renamed files, and stale graph data for modified files).
- Upsert from newly parsed changed files.

3) Retry and DLQ hardening
- Explicit retry policy, max attempts, and dead-letter flow in consumer runtime.
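
One possible shape for the retry decision, as a sketch only: this part is not implemented yet, and the Disposition enum, the 1-based attempt counter, and the function name are all assumptions. How the attempt count is carried (for example in a message header) and how dead-lettering is wired in RabbitMQ are left open.

```rust
/// What the consumer does with a message after one processing attempt.
#[derive(Debug, PartialEq, Eq)]
pub enum Disposition {
    Ack,
    Retry { next_attempt: u32 },
    DeadLetter,
}

/// `attempt` is 1-based: the first delivery is attempt 1.
/// A failed message is retried until `max_attempts` is reached, then dead-lettered.
pub fn disposition(succeeded: bool, attempt: u32, max_attempts: u32) -> Disposition {
    if succeeded {
        Disposition::Ack
    } else if attempt < max_attempts {
        Disposition::Retry { next_attempt: attempt + 1 }
    } else {
        Disposition::DeadLetter
    }
}
```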

4) Idempotency persistence
- Persist processed job_key to prevent duplicate processing on redelivery.
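
A sketch of the idempotency check behind a small trait, so the storage backend can be chosen later. The trait and type names are assumptions. The in-memory set below is only a stand-in: it does not survive restarts, so it does not satisfy the persistence requirement on its own.

```rust
use std::collections::HashSet;

/// Hypothetical store of processed job keys.
pub trait JobKeyStore {
    /// Records `job_key`; returns true if it had not been seen before.
    fn mark_if_new(&mut self, job_key: &str) -> bool;
}

/// In-memory stand-in for a persistent store.
#[derive(Default)]
pub struct InMemoryStore {
    seen: HashSet<String>,
}

impl JobKeyStore for InMemoryStore {
    fn mark_if_new(&mut self, job_key: &str) -> bool {
        // HashSet::insert returns false when the value was already present.
        self.seen.insert(job_key.to_string())
    }
}
```

A consumer would call mark_if_new before processing and ack without further work when it returns false. Whether the key is recorded before or after successful processing is a design decision this sketch does not settle.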

5) End-to-end worker pipeline
- Consumer flow:
  mode -> delta -> targeted parse -> extract -> persist -> run summary metrics.

Operational notes and user preferences
- Accuracy is prioritized over aggressive optimization.
- Bootstrap path must remain available.
- Incremental path should be additive (it must not break the full-scan path).
- Parser repo git history is source-of-truth for delta resolve.
- RabbitMQ chosen as queueing backbone.