# Phase 0 Research: LLM Chat Assistant for Project Operations

**Feature**: [`021-llm-project-assistant`](specs/021-llm-project-assistant)
**Input Spec**: [`spec.md`](specs/021-llm-project-assistant/spec.md)
**Related UX**: [`ux_reference.md`](specs/021-llm-project-assistant/ux_reference.md)

## 1) Intent parsing for operational commands (RU/EN)

### Decision
Use hybrid parsing:
1. Lightweight deterministic recognizers for high-signal commands/entities (IDs, env names, keywords).
2. LLM interpretation for ambiguous/free-form phrasing.
3. Confidence threshold gate: return `needs_clarification` when confidence falls below the threshold.
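
The three steps above can be sketched as follows. This is a minimal illustration, not the project's parser: the keyword table, the environment names, the `task <n>` ID pattern, the `0.75` threshold, the `resolved` state name, and the `llm_interpret` callable are all assumptions made for the example.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

CONFIDENCE_THRESHOLD = 0.75  # assumed value; the research does not fix one


@dataclass
class ParsedIntent:
    operation: Optional[str]
    entities: Dict[str, object] = field(default_factory=dict)
    confidence: float = 0.0
    state: str = "needs_clarification"


# Hypothetical RU/EN keyword table and environment names.
KEYWORDS = {
    "status": {"status", "статус"},
    "backup": {"backup", "бэкап"},
    "migrate": {"migrate", "миграция"},
}
ENV_NAMES = {"dev", "staging", "prod"}
TASK_ID = re.compile(r"\btask[-_ ]?(\d+)\b")  # hypothetical ID format


def recognize(message: str) -> Optional[ParsedIntent]:
    """Deterministic layer: succeed only when exactly one known keyword matches."""
    text = message.lower()
    tokens = set(re.findall(r"\w+", text))
    operations = [op for op, words in KEYWORDS.items() if tokens & words]
    if len(operations) != 1:
        return None  # no keyword, or conflicting keywords: defer to the LLM
    entities: Dict[str, object] = {}
    envs = tokens & ENV_NAMES
    if len(envs) == 1:
        entities["env"] = next(iter(envs))
    match = TASK_ID.search(text)
    if match:
        entities["task_id"] = int(match.group(1))
    return ParsedIntent(operations[0], entities, confidence=1.0)


def parse(message: str, llm_interpret: Callable[[str], ParsedIntent]) -> ParsedIntent:
    """Try the recognizers first, fall back to the LLM, then apply the gate."""
    intent = recognize(message) or llm_interpret(message)
    resolved = intent.operation is not None and intent.confidence >= CONFIDENCE_THRESHOLD
    intent.state = "resolved" if resolved else "needs_clarification"
    return intent
```

In this sketch the LLM is injected as a plain callable, so the gate can be exercised in tests without a model: a low-confidence interpretation ends in `needs_clarification` regardless of which operation the model guessed.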

### Rationale
The deterministic layer reduces hallucination risk for critical operations; the LLM layer preserves natural-language flexibility.

### Alternatives considered
- **Pure LLM parser only**: rejected due to unsafe variance for critical ops.
- **Pure regex/rules only**: rejected due to low language flexibility and high maintenance.

---

## 2) Risk classification and confirmation policy

### Decision
Introduce explicit risk levels per command template:
- `safe`: read/status operations.
- `guarded`: state-changing non-production operations.
- `dangerous`: production deploy/migration and similarly impactful actions.

A `dangerous` command always requires an explicit confirmation token before execution.
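
One way to express this policy is a lookup from command template to risk level, with escalation for production targets. The sketch below is illustrative only: the template names, the `prod` environment name, and the choice to treat unknown templates as `dangerous` are assumptions, not decisions recorded in this research.

```python
from enum import Enum
from typing import Optional


class RiskLevel(Enum):
    SAFE = "safe"
    GUARDED = "guarded"
    DANGEROUS = "dangerous"


# Hypothetical command templates with their baseline risk level.
BASE_RISK = {
    "task_status": RiskLevel.SAFE,
    "create_branch": RiskLevel.GUARDED,
    "deploy": RiskLevel.GUARDED,
    "run_migration": RiskLevel.GUARDED,
}
PRODUCTION_ENV = "prod"  # assumed environment name
ESCALATE_IN_PRODUCTION = {"deploy", "run_migration"}


def classify(template: str, env: Optional[str] = None) -> RiskLevel:
    """Return the risk level for a command template and target environment."""
    if template not in BASE_RISK:
        return RiskLevel.DANGEROUS  # assumption: unknown templates fail closed
    if env == PRODUCTION_ENV and template in ESCALATE_IN_PRODUCTION:
        return RiskLevel.DANGEROUS
    return BASE_RISK[template]


def requires_confirmation(level: RiskLevel) -> bool:
    """Only `dangerous` is gated by a confirmation token."""
    return level is RiskLevel.DANGEROUS
```

Keeping the gate in code rather than in the prompt means a `dangerous` command cannot run just because the model omitted a warning.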

### Rationale
This maps to the user acceptance criteria and prevents accidental production-impacting operations.

### Alternatives considered
- **Prompt-only warning without hard confirmation gate**: rejected as insufficiently safe.
- **Confirmation for every command**: rejected due to severe UX friction.

---

## 3) Execution integration strategy

### Decision
Assistant dispatch calls existing internal services/endpoints instead of duplicating business logic.

Target mappings:
- Git -> existing git routes/service methods.
- Migration/backup/LLM analysis -> task creation with existing plugin IDs.
- Status/report -> existing tasks/reports APIs.
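
The mappings above amount to a thin routing table in front of code that already exists. A minimal sketch, assuming stand-in functions for the real services: `git_create_branch`, `create_task`, `get_task_status`, the operation names, and the plugin IDs are all hypothetical placeholders.

```python
from functools import partial
from typing import Any, Callable, Dict

Handler = Callable[[Dict[str, Any]], Any]


# Stand-ins for the existing services (hypothetical names and signatures).
def git_create_branch(params: Dict[str, Any]) -> Dict[str, Any]:
    return {"service": "git", "branch": params["name"]}


def create_task(plugin_id: str, params: Dict[str, Any]) -> Dict[str, Any]:
    return {"service": "tasks", "plugin_id": plugin_id, "params": params}


def get_task_status(params: Dict[str, Any]) -> Dict[str, Any]:
    return {"service": "tasks", "task_id": params["task_id"], "status": "unknown"}


# The assistant owns only this table; the business logic stays in the services.
ROUTES: Dict[str, Handler] = {
    "create_branch": git_create_branch,
    "run_migration": partial(create_task, "migration-plugin"),  # hypothetical ID
    "run_backup": partial(create_task, "backup-plugin"),        # hypothetical ID
    "task_status": get_task_status,
}


def dispatch(operation: str, params: Dict[str, Any]) -> Any:
    """Route a parsed operation to an existing backend, or refuse."""
    handler = ROUTES.get(operation)
    if handler is None:
        raise ValueError(f"no existing backend for operation {operation!r}")
    return handler(params)
```

Because every route ends in an existing service call, any permission checks inside those services still run, and an operation with no route is refused rather than improvised.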

### Rationale
Reusing existing execution paths preserves permission checks and reduces regression risk.

### Alternatives considered
- **New parallel execution engine**: rejected (duplicated logic, bypass risk).

---

## 4) Confirmation token lifecycle

### Decision
Use a confirmation token model with one-time use, a TTL, and user binding.
- The token includes a normalized command snapshot and risk metadata.
- Confirm executes exactly once.
- An expired or cancelled token cannot be reused.
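
The lifecycle can be illustrated with an in-memory store. This is a sketch under stated assumptions: the class and method names, the 300-second default TTL, and the single-process dictionary are invented for the example, and a real deployment would likely need shared, atomic storage.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict


class ConfirmationError(Exception):
    """Raised when a token is unknown, used, cancelled, expired, or foreign."""


@dataclass
class PendingConfirmation:
    user_id: str
    command: Dict[str, Any]  # normalized command snapshot
    risk_level: str          # risk metadata
    expires_at: float


class ConfirmationStore:
    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds  # assumed default; not fixed by the research
        self._clock = clock
        self._pending: Dict[str, PendingConfirmation] = {}

    def issue(self, user_id: str, command: Dict[str, Any], risk_level: str) -> str:
        token = secrets.token_urlsafe(32)
        self._pending[token] = PendingConfirmation(
            user_id, dict(command), risk_level, self._clock() + self._ttl
        )
        return token

    def cancel(self, token: str) -> None:
        self._pending.pop(token, None)

    def consume(self, token: str, user_id: str) -> Dict[str, Any]:
        """Return the command snapshot exactly once, for the bound user only."""
        entry = self._pending.get(token)
        if entry is None:
            raise ConfirmationError("unknown, used, or cancelled token")
        if entry.user_id != user_id:
            raise ConfirmationError("token is bound to another user")
        del self._pending[token]  # removed before returning: one-time use
        if self._clock() >= entry.expires_at:
            raise ConfirmationError("token expired")
        return entry.command
```

Executing the stored snapshot, rather than re-parsing the user's confirmation message, is what ties the confirmation to the exact command that was shown.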

### Rationale
Prevents duplicate destructive execution and supports an explicit audit trail.

### Alternatives considered
- **Session flag confirmation without token**: rejected due to weak idempotency guarantees.

---

## 5) Auditability and observability model

### Decision
Log each assistant interaction as a structured audit entry:
- actor, raw message, parsed intent, confidence, risk level, decision, dispatch target, outcome, linked `task_id` if any.
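
The fields listed above could be captured in a small record type and emitted as one JSON line per interaction. The sketch is illustrative: the field types, the example values, the `assistant.audit` logger name, and the added `timestamp` field are assumptions beyond what this research specifies.

```python
import json
import logging
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional

audit_logger = logging.getLogger("assistant.audit")  # assumed logger name


@dataclass(frozen=True)
class AuditEntry:
    actor: str
    raw_message: str
    parsed_intent: Optional[str]
    confidence: float
    risk_level: Optional[str]
    decision: str                  # e.g. "executed", "denied", "needs_clarification"
    dispatch_target: Optional[str]
    outcome: str
    task_id: Optional[str] = None  # linked task, if any


def to_log_line(entry: AuditEntry, now: Optional[datetime] = None) -> str:
    """Serialize one interaction as a single JSON line."""
    record = asdict(entry)
    # Assumption: a UTC timestamp is added here; it is not in the field list above.
    record["timestamp"] = (now or datetime.now(timezone.utc)).isoformat()
    return json.dumps(record, ensure_ascii=False, sort_keys=True)


def log_interaction(entry: AuditEntry) -> None:
    """Log every interaction, including denied and ambiguous ones."""
    audit_logger.info(to_log_line(entry))
```

An ambiguous request would then be recorded with `decision="needs_clarification"` and no dispatch target, which keeps denied and unclear attempts visible alongside successful executions.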

### Rationale
Required for post-incident analysis and security traceability.

### Alternatives considered
- **Log only successful executions**: rejected (misses denied/failed attempts and ambiguity events).

---

## Consolidated Research Outcomes for Planning

- Hybrid parser with confidence gating is selected.
- Risk-classified confirmation flow is mandatory for dangerous operations.
- Existing APIs/plugins are the only execution backends.
- One-time confirmation token with TTL is required for safety and idempotency.
- Structured assistant audit logging is required for operational trust.