
# Phase 0 Research: LLM Chat Assistant for Project Operations

**Feature**: [`021-llm-project-assistant`](specs/021-llm-project-assistant)
**Input Spec**: [`spec.md`](specs/021-llm-project-assistant/spec.md)
**Related UX**: [`ux_reference.md`](specs/021-llm-project-assistant/ux_reference.md)

## 1) Intent parsing for operational commands (RU/EN)

### Decision
Use hybrid parsing:
1. Lightweight deterministic recognizers for high-signal commands/entities (IDs, env names, keywords).
2. LLM interpretation for ambiguous/free-form phrasing.
3. Confidence threshold gate: when parse confidence falls below the threshold, the result is `needs_clarification` rather than an executable command.
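
The three steps above can be sketched roughly as follows. This is a minimal illustration, not the planned implementation: the `ParsedIntent` shape, the regex patterns, the command names, the `llm_interpret` callable, and the `0.75` threshold are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Optional

CONFIDENCE_THRESHOLD = 0.75  # assumed value; the research note does not fix a number


@dataclass
class ParsedIntent:
    status: str                      # "ok" or "needs_clarification"
    command: Optional[str] = None
    entities: dict = field(default_factory=dict)
    confidence: float = 0.0


# Hypothetical high-signal recognizers (RU/EN keywords, IDs, env names).
_RULES = [
    ("task_status",
     re.compile(r"\b(?:status|статус)\b.*?(?P<task_id>\d+)", re.I)),
    ("deploy",
     re.compile(r"\b(?:deploy|разверни)\b.*?\b(?P<env>dev|staging|prod)\b", re.I)),
]


def parse(message: str,
          llm_interpret: Callable[[str], ParsedIntent]) -> ParsedIntent:
    # Step 1: deterministic recognizers win whenever they match.
    for command, pattern in _RULES:
        match = pattern.search(message)
        if match:
            return ParsedIntent("ok", command, match.groupdict(), 1.0)

    # Step 2: fall back to the LLM for free-form phrasing.
    candidate = llm_interpret(message)

    # Step 3: gate on confidence instead of guessing.
    if candidate.confidence < CONFIDENCE_THRESHOLD:
        return ParsedIntent("needs_clarification", candidate.command,
                            candidate.entities, candidate.confidence)
    return ParsedIntent("ok", candidate.command,
                        candidate.entities, candidate.confidence)
```

Passing the LLM call in as a plain callable keeps the deterministic layer testable without a model; a stub that returns a fixed `ParsedIntent` is enough to exercise the gate.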

### Rationale
The deterministic layer reduces hallucination risk for critical operations; the LLM layer preserves natural-language flexibility.

### Alternatives considered
- **Pure LLM parser only**: rejected due to unsafe variance for critical ops.
- **Pure regex/rules only**: rejected due to low language flexibility and high maintenance.

---

## 2) Risk classification and confirmation policy

### Decision
Introduce explicit risk levels per command template:
- `safe`: read/status operations.
- `guarded`: state-changing non-production operations.
- `dangerous`: production deploy/migration and similarly impactful actions.

A `dangerous` command always requires an explicit confirmation token before execution.
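
A minimal sketch of how risk levels could be attached to command templates and checked before execution. The command names in `COMMAND_RISK` are hypothetical, and treating unknown templates as `dangerous` is an assumption of this example, not something the spec states. The check only tests that a token was supplied; validating the token itself is out of scope here, and the handling of `guarded` commands is left open because the decision above only mandates a gate for `dangerous`.

```python
from enum import Enum
from typing import Optional


class RiskLevel(Enum):
    SAFE = "safe"            # read/status operations
    GUARDED = "guarded"      # state-changing, non-production
    DANGEROUS = "dangerous"  # production deploy/migration and similar


# Hypothetical command templates; the real template list is not defined here.
COMMAND_RISK = {
    "task_status": RiskLevel.SAFE,
    "create_branch": RiskLevel.GUARDED,
    "deploy_prod": RiskLevel.DANGEROUS,
    "run_migration_prod": RiskLevel.DANGEROUS,
}


def classify(command: str) -> RiskLevel:
    # Assumption: templates without an explicit level fail closed.
    return COMMAND_RISK.get(command, RiskLevel.DANGEROUS)


def may_execute(command: str, confirmation_token: Optional[str]) -> bool:
    """Hard gate: dangerous commands never run without a token."""
    if classify(command) is RiskLevel.DANGEROUS:
        return confirmation_token is not None
    return True
```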

### Rationale
This maps to user acceptance criteria and prevents accidental production-impact operations.

### Alternatives considered
- **Prompt-only warning without hard confirmation gate**: rejected as insufficiently safe.
- **Confirmation for every command**: rejected due to severe UX friction.

---

## 3) Execution integration strategy

### Decision
Assistant dispatch calls existing internal services/endpoints instead of duplicating business logic.

Target mappings:
- Git -> existing git routes/service methods.
- Migration/backup/LLM analysis -> task creation with existing plugin IDs.
- Status/report -> existing tasks/reports APIs.
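
One way to picture the mappings above is a dispatch table whose entries only forward to existing services. In this sketch `create_task` and `git_service_call` are stand-ins with invented signatures, and the command names and plugin IDs (`migration-plugin`, `backup-plugin`) are placeholders; the actual service methods and IDs come from the existing codebase.

```python
from typing import Any, Callable, Dict

Params = Dict[str, Any]
Handler = Callable[[Params], Dict[str, Any]]


def create_task(plugin_id: str, params: Params) -> Dict[str, Any]:
    """Stand-in for the existing task-creation service (hypothetical signature)."""
    return {"kind": "task", "plugin_id": plugin_id, "params": params}


def git_service_call(action: str, params: Params) -> Dict[str, Any]:
    """Stand-in for an existing git service method (hypothetical signature)."""
    return {"kind": "git", "action": action, "params": params}


def _task(plugin_id: str) -> Handler:
    # Bind a plugin ID so the table entry stays a thin forwarder.
    return lambda params: create_task(plugin_id, params)


# The assistant owns only this table; business logic stays in the services.
DISPATCH_TABLE: Dict[str, Handler] = {
    "git_commit": lambda params: git_service_call("commit", params),
    "run_migration": _task("migration-plugin"),
    "run_backup": _task("backup-plugin"),
}


def dispatch(command: str, params: Params) -> Dict[str, Any]:
    handler = DISPATCH_TABLE.get(command)
    if handler is None:
        # No fallback execution path: unmapped commands are refused.
        raise LookupError(f"no existing backend is mapped to {command!r}")
    return handler(params)
```

Because every entry delegates, permission checks that live inside the existing services still run on each assistant-initiated call.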

### Rationale
Reusing existing execution paths preserves permission checks and reduces regression risk.

### Alternatives considered
- **New parallel execution engine**: rejected (duplicated logic, bypass risk).

---

## 4) Confirmation token lifecycle

### Decision
Use a confirmation token model with one-time use, a TTL, and user binding.
- Token includes normalized command snapshot + risk metadata.
- Confirm executes exactly once.
- Expired/cancelled token cannot be reused.
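
The lifecycle above could look roughly like this in-memory sketch. The class and method names, the 300-second default TTL, and the injectable `clock` are assumptions made for illustration; a real store would likely be persistent and would need locking or an atomic delete to stay one-time under concurrent requests.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class PendingConfirmation:
    user_id: str
    command_snapshot: dict   # normalized command captured at issue time
    risk_level: str
    expires_at: float


class ConfirmationStore:
    """In-memory, single-process sketch; not thread-safe."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._pending: Dict[str, PendingConfirmation] = {}

    def issue(self, user_id: str, command_snapshot: dict,
              risk_level: str) -> str:
        token = secrets.token_urlsafe(32)
        self._pending[token] = PendingConfirmation(
            user_id, dict(command_snapshot), risk_level,
            self._clock() + self._ttl)
        return token

    def consume(self, token: str, user_id: str) -> Optional[dict]:
        """Return the snapshot to execute, or None if the token is unusable."""
        entry = self._pending.get(token)
        if entry is None or entry.user_id != user_id:
            return None                      # unknown token or wrong user
        del self._pending[token]             # one-time: gone after first use
        if self._clock() >= entry.expires_at:
            return None                      # expired tokens never execute
        return entry.command_snapshot

    def cancel(self, token: str, user_id: str) -> bool:
        entry = self._pending.get(token)
        if entry is None or entry.user_id != user_id:
            return False
        del self._pending[token]
        return True
```

Executing the stored snapshot rather than re-parsing the confirm message means the command that runs is the one the user was shown when the token was issued.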

### Rationale
This prevents duplicate destructive execution and supports an explicit audit trail.

### Alternatives considered
- **Session flag confirmation without token**: rejected due to weak idempotency guarantees.

---

## 5) Auditability and observability model

### Decision
Log each assistant interaction as a structured audit entry:
- actor, raw message, parsed intent, confidence, risk level, decision, dispatch target, outcome, linked `task_id` if any.
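
As an illustration, the fields listed above map naturally onto a dataclass serialized as one JSON object per interaction. The type name, the field types, the logger name `assistant.audit`, and the added `timestamp` field are assumptions of this sketch rather than part of the decision.

```python
import json
import logging
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional

audit_logger = logging.getLogger("assistant.audit")  # hypothetical logger name


@dataclass
class AssistantAuditEntry:
    actor: str
    raw_message: str
    parsed_intent: Optional[str]
    confidence: Optional[float]
    risk_level: Optional[str]
    decision: str                    # e.g. "executed", "denied", "needs_clarification"
    dispatch_target: Optional[str]
    outcome: str
    task_id: Optional[str] = None    # set only when a task was created


def to_log_line(entry: AssistantAuditEntry,
                now: Optional[datetime] = None) -> str:
    record = asdict(entry)
    # Timestamp is an addition of this sketch, not a field named in the spec.
    record["timestamp"] = (now or datetime.now(timezone.utc)).isoformat()
    # ensure_ascii=False keeps RU messages readable in the log.
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


def log_interaction(entry: AssistantAuditEntry) -> None:
    # Called for every interaction, including denied and ambiguous ones.
    audit_logger.info(to_log_line(entry))
```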

### Rationale
Required for post-incident analysis and security traceability.

### Alternatives considered
- **Log only successful executions**: rejected (misses denied/failed attempts and ambiguity events).

---

## Consolidated Research Outcomes for Planning

- Hybrid parser with confidence gating is selected.
- Risk-classified confirmation flow is mandatory for dangerous operations.
- Existing APIs/plugins are the only execution backends.
- One-time confirmation token with TTL is required for safety and idempotency.
- Structured assistant audit logging is required for operational trust.