Phase 0 Research: LLM Chat Assistant for Project Operations

Feature: 021-llm-project-assistant
Input Spec: spec.md
Related UX: ux_reference.md

1) Intent parsing for operational commands (RU/EN)

Decision

Use hybrid parsing:

  1. Lightweight deterministic recognizers for high-signal commands/entities (IDs, env names, keywords).
  2. LLM interpretation for ambiguous/free-form phrasing.
  3. Confidence threshold gate: return needs_clarification when parser confidence falls below the threshold.
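
The three steps above could be wired together roughly as follows. This is a minimal sketch, not the project's parser: the rule patterns, the ParsedIntent shape, the llm_interpret callback signature, and the 0.7 threshold are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Tuple

CONFIDENCE_THRESHOLD = 0.7  # assumed value; the research note does not fix one


@dataclass
class ParsedIntent:
    state: str  # "resolved" or "needs_clarification"
    operation: Optional[str] = None
    entities: Dict[str, str] = field(default_factory=dict)
    confidence: float = 0.0


# Hypothetical high-signal rules (RU/EN); real recognizers would cover more forms.
_RULES = [
    ("deploy", re.compile(
        r"\b(?:deploy|задеплой)\b.*?\b(?P<env>prod|staging|dev)\b", re.IGNORECASE)),
    ("task_status", re.compile(
        r"\b(?:status|статус)\b.*?(?P<task_id>\d+)", re.IGNORECASE)),
]

# Assumed LLM callback: message -> (operation, entities, confidence in [0, 1]).
LlmInterpreter = Callable[[str], Tuple[Optional[str], Dict[str, str], float]]


def parse(message: str, llm_interpret: LlmInterpreter,
          threshold: float = CONFIDENCE_THRESHOLD) -> ParsedIntent:
    # Step 1: deterministic recognizers win whenever they match.
    for operation, pattern in _RULES:
        match = pattern.search(message)
        if match:
            return ParsedIntent("resolved", operation, match.groupdict(), 1.0)

    # Step 2: fall back to the LLM for free-form phrasing.
    operation, entities, confidence = llm_interpret(message)

    # Step 3: gate on confidence instead of guessing.
    if confidence < threshold:
        return ParsedIntent("needs_clarification", operation, entities, confidence)
    return ParsedIntent("resolved", operation, entities, confidence)
```

Treating a deterministic match as confidence 1.0 is a simplification; a real implementation might still route critical operations through risk classification before doing anything with the result.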

Rationale

Deterministic layer reduces hallucination risk for critical operations; LLM layer preserves natural-language flexibility.

Alternatives considered

  • Pure LLM parser only: rejected due to unsafe variance for critical ops.
  • Pure regex/rules only: rejected due to low language flexibility and high maintenance.

2) Risk classification and confirmation policy

Decision

Introduce explicit risk levels per command template:

  • safe: read/status operations.
  • guarded: state-changing non-production operations.
  • dangerous: production deploy/migration and similarly impactful actions.

Commands classified as dangerous always require an explicit confirmation token before execution.
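
A per-template risk table might look like the sketch below. The template names are hypothetical, and treating unknown templates as dangerous (failing closed) is an assumption of this example rather than something the decision states.

```python
from enum import Enum


class RiskLevel(Enum):
    SAFE = "safe"            # read/status operations
    GUARDED = "guarded"      # state-changing, non-production
    DANGEROUS = "dangerous"  # production deploy/migration and similar


# Hypothetical command templates; the real catalogue lives with the command definitions.
RISK_BY_TEMPLATE = {
    "task_status": RiskLevel.SAFE,
    "create_branch": RiskLevel.GUARDED,
    "deploy_prod": RiskLevel.DANGEROUS,
    "migrate_prod": RiskLevel.DANGEROUS,
}


def classify(template: str) -> RiskLevel:
    # Assumption: an unclassified template is handled as dangerous (fail closed).
    return RISK_BY_TEMPLATE.get(template, RiskLevel.DANGEROUS)


def requires_confirmation(template: str) -> bool:
    # Only the dangerous level is stated to need a confirmation token.
    return classify(template) is RiskLevel.DANGEROUS
```

Whether guarded commands get a lighter prompt is left open here; the sketch only encodes the rule that dangerous commands never run unconfirmed.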

Rationale

This maps to user acceptance criteria and prevents accidental production-impact operations.

Alternatives considered

  • Prompt-only warning without hard confirmation gate: rejected as insufficiently safe.
  • Confirmation for every command: rejected due to severe UX friction.

3) Execution integration strategy

Decision

Assistant dispatch calls existing internal services/endpoints instead of duplicating business logic.

Target mappings:

  • Git -> existing git routes/service methods.
  • Migration/backup/LLM analysis -> task creation with existing plugin IDs.
  • Status/report -> existing tasks/reports APIs.
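
One way to keep the assistant a thin routing layer over these mappings is a registry of handlers that wrap existing calls. The sketch below is illustrative only: Dispatcher, make_task_handler, the create_task signature, and the plugin ID shown in the usage note are hypothetical stand-ins, not the project's actual interfaces.

```python
from typing import Any, Callable, Dict

Params = Dict[str, Any]
Handler = Callable[[Params], Dict[str, Any]]


class Dispatcher:
    """Routes a parsed operation to an already-existing backend call."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, operation: str, handler: Handler) -> None:
        self._handlers[operation] = handler

    def dispatch(self, operation: str, params: Params) -> Dict[str, Any]:
        handler = self._handlers.get(operation)
        if handler is None:
            # No fallback engine: an unmapped operation is an error.
            raise KeyError(f"no existing backend registered for {operation!r}")
        return handler(params)


def make_task_handler(create_task: Callable[[str, Params], Dict[str, Any]],
                      plugin_id: str) -> Handler:
    """Wrap the existing task-creation call for a single plugin ID."""
    def handler(params: Params) -> Dict[str, Any]:
        return create_task(plugin_id, params)
    return handler
```

With a registration such as dispatcher.register("run_migration", make_task_handler(create_task, "migration")), the assistant never executes a migration itself; it only asks the existing task service to create the task, so that service's permission checks still apply.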

Rationale

Reusing existing execution paths preserves permission checks and reduces regression risk.

Alternatives considered

  • New parallel execution engine: rejected (duplicated logic, bypass risk).

4) Confirmation token lifecycle

Decision

Use a confirmation token model with one-time use, a TTL, and binding to the requesting user.

  • Token includes normalized command snapshot + risk metadata.
  • Confirm executes exactly once.
  • Expired/cancelled token cannot be reused.
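
The lifecycle above can be illustrated with a small in-memory store. This is a sketch under stated assumptions: the class and method names, the 300-second default TTL, and the in-memory dict are all hypothetical, and a multi-process deployment would need an atomic consume in shared storage.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict


class ConfirmationError(Exception):
    pass


@dataclass
class PendingConfirmation:
    user_id: str
    command: Dict[str, Any]  # normalized command snapshot
    risk_level: str          # risk metadata
    expires_at: float


class ConfirmationStore:
    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds  # assumed default; not fixed by the note
        self._clock = clock
        self._pending: Dict[str, PendingConfirmation] = {}

    def issue(self, user_id: str, command: Dict[str, Any], risk_level: str) -> str:
        token = secrets.token_urlsafe(16)
        self._pending[token] = PendingConfirmation(
            user_id, dict(command), risk_level, self._clock() + self._ttl)
        return token

    def cancel(self, token: str) -> None:
        self._pending.pop(token, None)

    def consume(self, token: str, user_id: str) -> PendingConfirmation:
        entry = self._pending.get(token)
        if entry is None:
            raise ConfirmationError("unknown, already used, or cancelled token")
        if entry.user_id != user_id:
            # Another user's attempt does not burn the owner's token.
            raise ConfirmationError("token is bound to a different user")
        del self._pending[token]  # one-time use: removed before any execution
        if self._clock() >= entry.expires_at:
            raise ConfirmationError("token expired")
        return entry
```

The caller executes the command stored in the returned snapshot, not whatever the user typed at confirmation time, so what was confirmed is exactly what runs.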

Rationale

This prevents duplicate destructive executions and supports an explicit audit trail.

Alternatives considered

  • Session flag confirmation without token: rejected due to weak idempotency guarantees.

5) Auditability and observability model

Decision

Log each assistant interaction as a structured audit entry:

  • actor, raw message, parsed intent, confidence, risk level, decision, dispatch target, outcome, linked task_id if any.
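
A structured entry with these fields could be modelled as below. The field types, the logger name, the JSON-lines encoding, and the added timestamp are assumptions of this sketch; only the field list itself comes from the decision above.

```python
import json
import logging
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional

audit_logger = logging.getLogger("assistant.audit")  # hypothetical logger name


@dataclass(frozen=True)
class AuditEntry:
    actor: str
    raw_message: str
    parsed_intent: Optional[str]
    confidence: float
    risk_level: Optional[str]
    decision: str                   # e.g. "executed", "denied", "needs_clarification"
    dispatch_target: Optional[str]
    outcome: str
    task_id: Optional[str] = None   # linked task, if any


def to_record(entry: AuditEntry, now: Optional[datetime] = None) -> str:
    payload = asdict(entry)
    # Assumption: a UTC timestamp is attached; it is not in the listed fields.
    payload["timestamp"] = (now or datetime.now(timezone.utc)).isoformat()
    # ensure_ascii=False keeps RU messages readable in the log.
    return json.dumps(payload, ensure_ascii=False, sort_keys=True)


def log_interaction(entry: AuditEntry) -> str:
    record = to_record(entry)
    audit_logger.info(record)
    return record
```

Calling log_interaction for denied, failed, and ambiguous interactions as well as successful ones is what makes the log usable for post-incident analysis.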

Rationale

Required for post-incident analysis and security traceability.

Alternatives considered

  • Log only successful executions: rejected (misses denied/failed attempts and ambiguity events).

Consolidated Research Outcomes for Planning

  • Hybrid parser with confidence gating is selected.
  • Risk-classified confirmation flow is mandatory for dangerous operations.
  • Existing APIs/plugins are the only execution backends.
  • One-time confirmation token with TTL is required for safety and idempotency.
  • Structured assistant audit logging is required for operational trust.