# Feature Specification: LLM Chat Assistant for Project Operations
**Feature Branch**: `021-llm-project-assistant`
**Reference UX**: `ux_reference.md` (See specific folder)
**Created**: 2026-02-23
**Status**: Draft
**Input**: User description: "Create an LLM-based chat assistant in the web interface to manage Git, migrations/backups, LLM analysis, and task statuses through natural-language commands."
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Use chat instead of manual UI navigation (Priority: P1)
As an operator, I can open an in-app chat and submit natural-language commands so I can trigger key actions without navigating multiple pages.
**Why this priority**: Chat entrypoint is the core product value. Without it, no assistant workflow exists.
**Independent Test**: Open the assistant panel, send a command like "создай ветку feature-abc для дашборда 42" ("create branch feature-abc for dashboard 42"), and verify that the assistant returns a structured, actionable response.
**Acceptance Scenarios**:
1. **Given** an authenticated user on any main page, **When** they open the assistant and send a message, **Then** the message appears in chat history and receives a response.
2. **Given** assistant is available, **When** the user asks for supported actions, **Then** assistant returns clear examples for Git, migration/backup, analysis, and task status queries.
---
### User Story 2 - Execute core operations via natural language commands (Priority: P1)
As an operator, I can ask assistant to execute core system operations (Git, migration/backup, LLM analysis/docs, status/report checks) so I can run workflows faster.
**Why this priority**: Command execution across core modules defines feature completeness.
**Independent Test**: Send one command per capability group and verify that the corresponding backend task/API action is initiated with valid parameters.
**Acceptance Scenarios**:
1. **Given** user command targets Git operation, **When** assistant parses intent and entities, **Then** system starts corresponding Git operation (branch/commit/deploy) or returns explicit validation error.
2. **Given** user command targets migration or backup, **When** assistant executes command, **Then** system starts async task and returns created `task_id` immediately.
3. **Given** user command targets LLM validation/documentation, **When** assistant executes command, **Then** system triggers the relevant LLM task/plugin and returns launch confirmation with `task_id`.
4. **Given** user asks for task/report status, **When** assistant handles request, **Then** it returns the latest known status and a link to the existing task tracking UI.
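
The routing these scenarios describe could be sketched roughly as follows. This is a minimal illustration, not the project's implementation: the handler functions, the `(domain, operation)` keys, and the placeholder `task_id` are hypothetical, and RBAC checks and confirmation are deliberately left out. A real dispatcher would call the existing Git, migration, and task services instead of these stubs.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stub handlers; real ones would call the existing services/APIs.
def create_branch(entities: dict) -> dict:
    # Return an explicit validation error instead of guessing missing values.
    if "branch_name" not in entities or "dashboard_id" not in entities:
        return {"state": "failed", "error": "branch_name and dashboard_id are required"}
    return {"state": "success", "branch": entities["branch_name"]}

def start_migration(entities: dict) -> dict:
    # Async operations answer immediately with a task_id (placeholder value here).
    return {"state": "started", "task_id": "task-placeholder"}

# Routing table from (domain, operation) to a handler; the keys are assumptions.
HANDLERS: Dict[Tuple[str, str], Callable[[dict], dict]] = {
    ("git", "create_branch"): create_branch,
    ("migration", "execute"): start_migration,
}

def dispatch(domain: str, operation: str, entities: dict) -> dict:
    """Route a parsed command to its handler, or ask for clarification."""
    handler = HANDLERS.get((domain, operation))
    if handler is None:
        return {
            "state": "needs_clarification",
            "error": f"unsupported command: {domain}.{operation}",
        }
    return handler(entities)
```

Keeping the routing table explicit means an unknown command can never reach a backend call; it falls through to a clarification response.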
---
### User Story 3 - Safe execution with RBAC and explicit confirmations (Priority: P1)
As a system administrator, I need assistant actions to respect existing permissions and require explicit user confirmation for risky operations.
**Why this priority**: Security and production safety are mandatory gate criteria.
**Independent Test**: Run restricted command with unauthorized user (expect deny), and run production deploy command (expect confirm-before-execute flow).
**Acceptance Scenarios**:
1. **Given** user lacks permission for requested operation, **When** assistant tries to execute command, **Then** execution is denied and assistant returns permission error without side effects.
2. **Given** command is classified as dangerous (e.g., deploy to production), **When** assistant detects it, **Then** assistant requires explicit user confirmation before creating task.
3. **Given** confirmation is requested, **When** user cancels or confirmation expires, **Then** operation is not executed.
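
One way to picture the confirm-before-execute flow is an in-memory store of pending confirmations. The sketch below is an assumption-laden illustration: the class and method names, the 300-second default TTL, and the in-memory dictionary are all hypothetical, and a real implementation would likely persist tokens alongside the existing task manager.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class PendingConfirmation:
    user_id: str
    command: dict
    expires_at: float

class ConfirmationStore:
    """Hypothetical store for pending confirmations with TTL and one-time use."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds  # assumed default; the spec does not fix a TTL
        self._clock = clock
        self._pending: Dict[str, PendingConfirmation] = {}

    def request(self, user_id: str, command: dict) -> str:
        """Register a risky command and return the token the user must confirm."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = PendingConfirmation(
            user_id, command, self._clock() + self._ttl
        )
        return token

    def confirm(self, token: str, user_id: str) -> Optional[dict]:
        """Return the command exactly once, or None if the token is unusable."""
        pending = self._pending.get(token)
        if pending is None or pending.user_id != user_id:
            return None
        # Remove the token before checking expiry so it can never be reused.
        del self._pending[token]
        if self._clock() > pending.expires_at:
            return None
        return pending.command

    def cancel(self, token: str, user_id: str) -> bool:
        """Explicit cancel path: drop the pending command without executing it."""
        pending = self._pending.get(token)
        if pending is None or pending.user_id != user_id:
            return False
        del self._pending[token]
        return True
```

Because `confirm` deletes the token before returning the command, a double-sent confirmation yields `None` the second time, which also covers duplicate submission of the same pending confirmation.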
---
### User Story 4 - Reliable feedback and progress tracking (Priority: P2)
As an operator, I need immediate operation feedback, clear errors, and traceable progress for long-running tasks.
**Why this priority**: Strong feedback loop is required for operational trust and usability.
**Independent Test**: Launch long migration via chat and verify immediate "started" message with `task_id`, then check progress in Task Drawer and reports page.
**Acceptance Scenarios**:
1. **Given** assistant starts long operation, **When** execution begins, **Then** assistant responds immediately with initiation status and `task_id`.
2. **Given** operation succeeds or fails, **When** task result becomes available, **Then** assistant can return outcome summary (success/error) on user request.
3. **Given** assistant cannot parse command confidently, **When** ambiguity is detected, **Then** assistant asks clarification question and does not execute action.
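
The clarify-instead-of-execute rule in scenario 3 might look like the sketch below. The threshold of 0.8, the ambiguity margin of 0.1, and the `ParsedIntent` shape are assumptions chosen for illustration; the specification does not fix these values.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

CONFIDENCE_THRESHOLD = 0.8  # assumed value, not specified in this document
AMBIGUITY_MARGIN = 0.1      # assumed: candidates this close count as ambiguous

@dataclass
class ParsedIntent:
    domain: str
    operation: str
    confidence: float
    entities: Dict[str, Any] = field(default_factory=dict)

def decide(candidates: List[ParsedIntent]) -> Dict[str, Any]:
    """Execute only when one candidate is both confident and clearly ahead."""
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    if not ranked:
        return {"execute": False, "state": "needs_clarification", "options": []}
    best = ranked[0]
    too_weak = best.confidence < CONFIDENCE_THRESHOLD
    too_close = (
        len(ranked) > 1
        and best.confidence - ranked[1].confidence < AMBIGUITY_MARGIN
    )
    if too_weak or too_close:
        # Offer the top candidates so the user can pick one explicitly.
        options = [f"{c.domain}.{c.operation}" for c in ranked[:3]]
        return {"execute": False, "state": "needs_clarification", "options": options}
    return {"execute": True, "intent": best}
```

The margin check handles the case where two operations both score highly, which a plain threshold would miss.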
---
### Edge Cases
- User command matches multiple possible operations (ambiguous intent).
- User references non-existent dashboard/environment/task IDs.
- Duplicate command submission (double-send) for destructive operations.
- Confirmation token expires before user confirms.
- Provider/LLM unavailable during command interpretation.
- Task creation succeeds but downstream execution fails immediately.
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: System MUST provide an in-application chat interface for assistant interaction.
- **FR-002**: System MUST preserve chat context per user session for at least the active session duration.
- **FR-003**: Assistant MUST support natural-language command parsing for these domains: Git, migration/backup, LLM analysis/documentation, task/report status.
- **FR-004**: Assistant MUST map parsed commands to existing internal operations/APIs and MUST NOT bypass current service boundaries.
- **FR-005**: Assistant MUST support Git commands for at least: branch creation, commit initiation, deploy initiation.
- **FR-006**: Assistant MUST support migration and backup launch commands and return created `task_id`.
- **FR-007**: Assistant MUST support LLM validation/documentation launch commands and return created `task_id`.
- **FR-008**: Assistant MUST support status queries for existing tasks by `task_id` and by recent user scope.
- **FR-009**: Assistant MUST enforce existing RBAC/permission checks before any operation execution.
- **FR-010**: Assistant MUST classify risky operations and require explicit confirmation before execution.
- **FR-011**: Confirmation flow MUST include timeout/expiry semantics and explicit cancel path.
- **FR-012**: Assistant MUST provide immediate response for long-running operations containing `task_id` and a short tracking hint.
- **FR-013**: Assistant responses MUST include operation result state: `success`, `failed`, `started`, `needs_confirmation`, `needs_clarification`, or `denied`.
- **FR-014**: Assistant MUST surface actionable error details without exposing secrets.
- **FR-015**: Assistant MUST log every attempted assistant command with actor, intent, parameters snapshot, and outcome for auditability.
- **FR-016**: Assistant MUST prevent duplicate execution for the same pending confirmation token.
- **FR-017**: Assistant MUST support multilingual command input at minimum for Russian and English operational phrasing.
- **FR-018**: Assistant MUST degrade safely when intent confidence is below threshold by requesting clarification instead of executing.
- **FR-019**: Assistant MUST link users to existing progress surfaces (Task Drawer and reports page) for task tracking.
- **FR-020**: Assistant MUST support retrieval of last N executed assistant commands for operational traceability.
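
To make FR-009, FR-013, and FR-015 concrete, here is a rough sketch of a guarded execution wrapper. The result states come from FR-013; everything else is an assumption for illustration, including the permission string `plugin:migration`, the glob-style matching of granted patterns, and the audit record layout.

```python
from enum import Enum
from fnmatch import fnmatchcase
from typing import Callable, Dict, Iterable, List

class ResultState(str, Enum):
    # The six states listed in FR-013.
    SUCCESS = "success"
    FAILED = "failed"
    STARTED = "started"
    NEEDS_CONFIRMATION = "needs_confirmation"
    NEEDS_CLARIFICATION = "needs_clarification"
    DENIED = "denied"

def is_allowed(granted: Iterable[str], required: str) -> bool:
    """Assumed semantics: a granted pattern such as 'plugin:*' is a glob."""
    return any(fnmatchcase(required, pattern) for pattern in granted)

def run_guarded(actor: str, granted: Iterable[str], required: str,
                intent: str, params: Dict[str, object],
                action: Callable[[], ResultState],
                audit_log: List[dict]) -> ResultState:
    """Check permission first, run the action, and log every attempt."""
    if not is_allowed(granted, required):
        state = ResultState.DENIED  # the action is never called: no side effects
    else:
        try:
            state = action()
        except Exception:
            state = ResultState.FAILED
    audit_log.append({
        "actor": actor,
        "intent": intent,
        "parameters": dict(params),  # snapshot copy, per FR-015
        "outcome": state.value,
    })
    return state
```

Denied attempts are logged too, so the audit trail covers every attempted command rather than only the executed ones.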
### Key Entities *(include if feature involves data)*
- **AssistantMessage**: One chat message (user or assistant) with timestamp, role, text, and metadata.
- **CommandIntent**: Parsed intent structure containing domain, operation, entities, confidence, and risk level.
- **ExecutionRequest**: Validated command payload mapped to a concrete backend action.
- **ConfirmationToken**: Pending confirmation record for risky operations with TTL and one-time usage.
- **AssistantExecutionLog**: Audit trail entry for command attempts and outcomes.
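
As a rough illustration, the entities above could be modelled as plain dataclasses. The attribute names follow the descriptions in this list where they are given; the concrete types, the `risk_level` values, and the `backend_action` identifier are assumptions rather than a schema defined by this specification.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Any, Dict, Optional

@dataclass
class AssistantMessage:
    role: str  # "user" or "assistant"
    text: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass
class CommandIntent:
    domain: str
    operation: str
    entities: Dict[str, Any]
    confidence: float
    risk_level: str  # assumed values: "safe" or "dangerous"

@dataclass
class ExecutionRequest:
    intent: CommandIntent
    backend_action: str  # hypothetical identifier of the mapped operation
    payload: Dict[str, Any]

@dataclass
class ConfirmationToken:
    token: str
    request: ExecutionRequest
    expires_at: datetime
    used: bool = False

    def is_valid(self, now: datetime) -> bool:
        # One-time use and TTL, as described for this entity.
        return not self.used and now < self.expires_at

@dataclass
class AssistantExecutionLog:
    actor: str
    intent: CommandIntent
    parameters_snapshot: Dict[str, Any]
    outcome: str
    task_id: Optional[str] = None

# Usage: a risky deploy waiting for confirmation (all values are illustrative).
created = datetime(2026, 2, 23, tzinfo=timezone.utc)
intent = CommandIntent("git", "deploy", {"dashboard_id": 42}, 0.93, "dangerous")
request = ExecutionRequest(intent, "git.deploy", {"dashboard_id": 42})
pending = ConfirmationToken("tok-1", request, created + timedelta(minutes=5))
```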
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: At least 90% of predefined core commands (Git/migration/backup/LLM/status) are correctly interpreted in the acceptance test set.
- **SC-002**: 100% of risky operations (production deploy and equivalent) require explicit confirmation before execution.
- **SC-003**: 100% of assistant-started long-running operations return a `task_id` in the first response.
- **SC-004**: Permission bypass rate is 0% in security tests (unauthorized commands never execute).
- **SC-005**: At least 95% of assistant responses return within 2 seconds for parse/dispatch stage (excluding downstream task runtime).
- **SC-006**: At least 90% of operators can launch a target operation from chat faster than through manual multi-page navigation in usability checks.
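
The percentage criteria above can be checked mechanically. The helper below is a small sketch for SC-001 and SC-005; the function names and the input format (a list of pass/fail flags, a list of latencies in seconds) are assumptions about how a test harness might collect results.

```python
from typing import Sequence

def share_meets(passed: int, total: int, required_percent: int) -> bool:
    """True when passed/total is at least required_percent, using exact integers."""
    if total == 0:
        return False  # no evidence means the criterion is not demonstrated
    return passed * 100 >= required_percent * total

def meets_sc001(interpreted_correctly: Sequence[bool]) -> bool:
    # SC-001: at least 90% of predefined core commands interpreted correctly.
    return share_meets(sum(interpreted_correctly), len(interpreted_correctly), 90)

def meets_sc005(latencies_s: Sequence[float], limit_s: float = 2.0) -> bool:
    # SC-005: at least 95% of parse/dispatch responses within 2 seconds.
    within = sum(1 for value in latencies_s if value <= limit_s)
    return share_meets(within, len(latencies_s), 95)
```

Integer arithmetic avoids floating-point surprises exactly at the 90% and 95% boundaries.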
---
## Assumptions
- Existing APIs/plugins for Git, migration, backup, LLM actions, and task status remain authoritative execution backends.
- Existing RBAC permissions (`plugin:*`, `tasks:*`, `admin:*`) remain the access model.
- Task Drawer and reports page remain current progress/result surfaces and will be reused.
- LLM assistant orchestration can use configured provider stack without introducing a separate auth model.
## Dependencies
- Existing routes/services for Git (`/api/git/...`), migration (`/api/execute`), task APIs (`/api/tasks/...`), and LLM provider/task flows.
- Existing authentication and authorization components from multi-user auth implementation.
- Existing task manager persistence/logging for async execution tracking.
---
## Test Data Fixtures *(recommended for CRITICAL components)*
### Fixtures
```yaml
assistant_command_git_branch:
  description: "Valid RU command for branch creation"
  data:
    # EN: "create branch feature/revenue-v2 for dashboard 42"
    message: "сделай ветку feature/revenue-v2 для дашборда 42"
  expected:
    domain: "git"
    operation: "create_branch"
    entities:
      dashboard_id: 42
      branch_name: "feature/revenue-v2"

assistant_command_migration_start:
  description: "Valid migration launch command"
  data:
    # EN: "start the migration from dev to prod for dashboard 42"
    message: "запусти миграцию с dev на prod для дашборда 42"
  expected:
    domain: "migration"
    operation: "execute"
    requires_confirmation: true

assistant_command_prod_deploy_confirmation:
  description: "Risky production deploy requires confirmation"
  data:
    # EN: "deploy dashboard 42 to production"
    message: "задеплой дашборд 42 в production"
  expected:
    state: "needs_confirmation"
    confirmation_required: true

assistant_status_query:
  description: "Task status lookup"
  data:
    # EN: "check the status of task task-123"
    message: "проверь статус задачи task-123"
  expected:
    domain: "status"
    operation: "get_task"
    entities:
      task_id: "task-123"
```
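
A test harness could compare parser output against the `expected` section of each fixture with a recursive subset match, so extra fields in the real output do not fail the test. The sketch below assumes the fixture has already been loaded into a Python dictionary; the `matches_expected` helper and the sample `actual` output are hypothetical.

```python
from typing import Any, Dict

def matches_expected(actual: Dict[str, Any], expected: Dict[str, Any]) -> bool:
    """True when every key in expected is present in actual with an equal value."""
    for key, want in expected.items():
        if key not in actual:
            return False
        got = actual[key]
        if isinstance(want, dict):
            # Nested sections such as `entities` are compared recursively.
            if not isinstance(got, dict) or not matches_expected(got, want):
                return False
        elif got != want:
            return False
    return True

# The `expected` part of the assistant_command_git_branch fixture.
expected = {
    "domain": "git",
    "operation": "create_branch",
    "entities": {"dashboard_id": 42, "branch_name": "feature/revenue-v2"},
}

# Hypothetical parser output; the extra `confidence` field is ignored.
actual = {
    "domain": "git",
    "operation": "create_branch",
    "entities": {"dashboard_id": 42, "branch_name": "feature/revenue-v2"},
    "confidence": 0.97,
}
```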