
Feature Specification: LLM Chat Assistant for Project Operations

Feature Branch: 021-llm-project-assistant
Reference UX: ux_reference.md (See specific folder)
Created: 2026-02-23
Status: Draft
Input: User description: "Create an LLM-based chat assistant in the web interface to manage Git, migrations/backups, LLM analysis, and task statuses through natural-language commands."

User Scenarios & Testing (mandatory)

User Story 1 - Use chat instead of manual UI navigation (Priority: P1)

As an operator, I can open an in-app chat and submit natural-language commands so I can trigger key actions without navigating multiple pages.

Why this priority: Chat entrypoint is the core product value. Without it, no assistant workflow exists.

Independent Test: Open the assistant panel, send a command like "создай ветку feature-abc для дашборда 42" ("create branch feature-abc for dashboard 42"), and verify that the assistant returns a structured, actionable response.

Acceptance Scenarios:

  1. Given an authenticated user on any main page, When they open the assistant and send a message, Then the message appears in chat history and receives a response.
  2. Given assistant is available, When the user asks for supported actions, Then assistant returns clear examples for Git, migration/backup, analysis, and task status queries.

User Story 2 - Execute core operations via natural language commands (Priority: P1)

As an operator, I can ask assistant to execute core system operations (Git, migration/backup, LLM analysis/docs, status/report checks) so I can run workflows faster.

Why this priority: Command execution across core modules defines feature completeness.

Independent Test: Send one command per capability group and verify that the corresponding backend task/API action is initiated with valid parameters.

Acceptance Scenarios:

  1. Given user command targets Git operation, When assistant parses intent and entities, Then system starts corresponding Git operation (branch/commit/deploy) or returns explicit validation error.
  2. Given user command targets migration or backup, When assistant executes command, Then system starts async task and returns created task_id immediately.
  3. Given user command targets LLM validation/documentation, When assistant executes command, Then system triggers the relevant LLM task/plugin and returns launch confirmation with task_id.
  4. Given user asks for task/report status, When assistant handles request, Then it returns the latest known status and a link to the existing task tracking UI.
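
The mapping from a parsed command to a backend action in the scenarios above can be sketched as a dispatch table. This is a minimal illustration, not the project's implementation: the handler functions, the `HANDLERS` table, and the generated `task_id` format are hypothetical, and real handlers would call the existing Git and migration services rather than return canned results.

```python
import uuid
from typing import Callable, Dict, Tuple

def create_branch(entities: dict) -> dict:
    # Hypothetical handler: validate entities before touching any service.
    missing = [k for k in ("dashboard_id", "branch_name") if k not in entities]
    if missing:
        return {"state": "failed", "error": f"missing entities: {', '.join(missing)}"}
    return {"state": "success", "branch": entities["branch_name"]}

def start_migration(entities: dict) -> dict:
    # Hypothetical handler: async operations answer immediately with a task_id.
    # A real task_id would come from the existing task manager.
    return {"state": "started", "task_id": f"task-{uuid.uuid4().hex[:8]}"}

# (domain, operation) -> handler; only two entries are shown for brevity.
HANDLERS: Dict[Tuple[str, str], Callable[[dict], dict]] = {
    ("git", "create_branch"): create_branch,
    ("migration", "execute"): start_migration,
}

def dispatch(domain: str, operation: str, entities: dict) -> dict:
    handler = HANDLERS.get((domain, operation))
    if handler is None:
        # Assumption: unknown operations ask for clarification instead of failing.
        return {"state": "needs_clarification",
                "error": f"unsupported operation: {domain}/{operation}"}
    return handler(entities)
```

Keeping the table keyed by `(domain, operation)` means the assistant can only reach operations that were registered explicitly, which is one way to respect the existing service boundaries.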

User Story 3 - Safe execution with RBAC and explicit confirmations (Priority: P1)

As a system administrator, I need assistant actions to respect existing permissions and require explicit user confirmation for risky operations.

Why this priority: Security and production safety are mandatory gate criteria.

Independent Test: Run restricted command with unauthorized user (expect deny), and run production deploy command (expect confirm-before-execute flow).

Acceptance Scenarios:

  1. Given user lacks permission for requested operation, When assistant tries to execute command, Then execution is denied and assistant returns permission error without side effects.
  2. Given command is classified as dangerous (e.g., deploy to production), When assistant detects it, Then assistant requires explicit user confirmation before creating task.
  3. Given confirmation is requested, When user cancels or confirmation expires, Then operation is not executed.
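
A confirm-before-execute flow with expiry and a cancel path could look like the sketch below. It is an in-memory illustration under stated assumptions: the `ConfirmationStore` class, its method names, and the 300-second default TTL are hypothetical, and a real implementation would likely persist pending confirmations alongside the task manager.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional

@dataclass
class PendingConfirmation:
    user_id: str
    operation: str
    expires_at: float

class ConfirmationStore:
    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._pending: Dict[str, PendingConfirmation] = {}

    def request(self, user_id: str, operation: str) -> str:
        """Register a risky operation and return the token the user must confirm."""
        token = secrets.token_urlsafe(16)
        self._pending[token] = PendingConfirmation(
            user_id, operation, self._clock() + self._ttl)
        return token

    def confirm(self, token: str, user_id: str) -> Optional[str]:
        """Return the operation to execute, or None if the token is unknown,
        expired, already used, or owned by another user."""
        pending = self._pending.get(token)
        if pending is None or pending.user_id != user_id:
            return None
        del self._pending[token]  # one-time use: a second confirm finds nothing
        if self._clock() >= pending.expires_at:
            return None
        return pending.operation

    def cancel(self, token: str, user_id: str) -> bool:
        """Drop a pending confirmation; returns False if there was nothing to cancel."""
        pending = self._pending.get(token)
        if pending is None or pending.user_id != user_id:
            return False
        del self._pending[token]
        return True
```

Removing the token before checking expiry means an expired token is also cleaned up on its first use, and a double-sent confirmation cannot trigger the operation twice.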

User Story 4 - Reliable feedback and progress tracking (Priority: P2)

As an operator, I need immediate operation feedback, clear errors, and traceable progress for long-running tasks.

Why this priority: Strong feedback loop is required for operational trust and usability.

Independent Test: Launch long migration via chat and verify immediate "started" message with task_id, then check progress in Task Drawer and reports page.

Acceptance Scenarios:

  1. Given assistant starts long operation, When execution begins, Then assistant responds immediately with initiation status and task_id.
  2. Given operation succeeds or fails, When task result becomes available, Then assistant can return outcome summary (success/error) on user request.
  3. Given assistant cannot parse command confidently, When ambiguity is detected, Then assistant asks clarification question and does not execute action.
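
The clarification rule in the last scenario can be sketched as a gate over ranked intent candidates. The function name, the `0.75` confidence threshold, the `0.15` margin, and the internal `ready` marker are all assumptions for illustration; the spec does not fix these values.

```python
from typing import List, Optional, Tuple

def choose_intent(candidates: List[Tuple[str, float]],
                  min_confidence: float = 0.75,
                  min_margin: float = 0.15) -> Tuple[str, Optional[str]]:
    """Pick an operation from (operation, confidence) pairs.

    Returns ("ready", operation) when one candidate clearly wins, otherwise
    ("needs_clarification", None) so that nothing is executed.
    """
    if not candidates:
        return ("needs_clarification", None)
    ranked = sorted(candidates, key=lambda c: c[1], reverse=True)
    top_operation, top_score = ranked[0]
    if top_score < min_confidence:
        # The best guess is still too weak to act on.
        return ("needs_clarification", None)
    if len(ranked) > 1 and top_score - ranked[1][1] < min_margin:
        # Two operations are nearly tied: ambiguous intent.
        return ("needs_clarification", None)
    return ("ready", top_operation)
```

Checking the margin between the top two candidates, in addition to the absolute threshold, covers the case where a command plausibly matches several operations at once.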

Edge Cases

  • User command matches multiple possible operations (ambiguous intent).
  • User references non-existent dashboard/environment/task IDs.
  • Duplicate command submission (double-send) for destructive operations.
  • Confirmation token expires before user confirms.
  • Provider/LLM unavailable during command interpretation.
  • Task creation succeeds but downstream execution fails immediately.

Requirements (mandatory)

Functional Requirements

  • FR-001: System MUST provide an in-application chat interface for assistant interaction.
  • FR-002: System MUST preserve chat context per user session for at least the active session duration.
  • FR-003: Assistant MUST support natural-language command parsing for these domains: Git, migration/backup, LLM analysis/documentation, task/report status.
  • FR-004: Assistant MUST map parsed commands to existing internal operations/APIs and MUST NOT bypass current service boundaries.
  • FR-005: Assistant MUST support Git commands for at least: branch creation, commit initiation, deploy initiation.
  • FR-006: Assistant MUST support migration and backup launch commands and return created task_id.
  • FR-007: Assistant MUST support LLM validation/documentation launch commands and return created task_id.
  • FR-008: Assistant MUST support status queries for existing tasks by task_id and for the requesting user's recent tasks.
  • FR-009: Assistant MUST enforce existing RBAC/permission checks before any operation execution.
  • FR-010: Assistant MUST classify risky operations and require explicit confirmation before execution.
  • FR-011: Confirmation flow MUST include timeout/expiry semantics and explicit cancel path.
  • FR-012: Assistant MUST respond immediately when it starts a long-running operation, and the response MUST contain the task_id and a short tracking hint.
  • FR-013: Assistant responses MUST include operation result state: success, failed, started, needs_confirmation, needs_clarification, or denied.
  • FR-014: Assistant MUST surface actionable error details without exposing secrets.
  • FR-015: Assistant MUST log every attempted assistant command with actor, intent, parameters snapshot, and outcome for auditability.
  • FR-016: Assistant MUST prevent duplicate execution for the same pending confirmation token.
  • FR-017: Assistant MUST support multilingual command input at minimum for Russian and English operational phrasing.
  • FR-018: Assistant MUST degrade safely when intent confidence is below threshold by requesting clarification instead of executing.
  • FR-019: Assistant MUST link users to existing progress surfaces (Task Drawer and reports page) for task tracking.
  • FR-020: Assistant MUST support retrieval of last N executed assistant commands for operational traceability.
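
FR-014, FR-015, and FR-020 together suggest an audit log that records every attempt, masks secrets, and can return the last N commands. The sketch below is an in-memory illustration only: the `AuditLog` and `AuditEntry` names, the `SECRET_KEYS` set, and the capacity default are hypothetical, and a production version would presumably write to the existing task manager persistence.

```python
from collections import deque
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Deque, List, Optional

# Hypothetical list of parameter names that must never be stored in clear text.
SECRET_KEYS = {"password", "token", "api_key"}

@dataclass
class AuditEntry:
    actor: str
    intent: str
    parameters: dict
    outcome: str
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

class AuditLog:
    def __init__(self, capacity: int = 1000) -> None:
        # Bounded buffer: the oldest entries are dropped once capacity is reached.
        self._entries: Deque[AuditEntry] = deque(maxlen=capacity)

    def record(self, actor: str, intent: str, parameters: dict, outcome: str) -> AuditEntry:
        # Store a masked copy so later mutation of `parameters` cannot alter the log.
        snapshot = {k: ("***" if k in SECRET_KEYS else v) for k, v in parameters.items()}
        entry = AuditEntry(actor, intent, snapshot, outcome)
        self._entries.append(entry)
        return entry

    def last(self, n: int, actor: Optional[str] = None) -> List[AuditEntry]:
        """Return up to n most recent entries, oldest first, optionally for one actor."""
        matches = [e for e in self._entries if actor is None or e.actor == actor]
        return matches[-n:] if n > 0 else []
```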

Key Entities (include if feature involves data)

  • AssistantMessage: One chat message (user or assistant) with timestamp, role, text, and metadata.
  • CommandIntent: Parsed intent structure containing domain, operation, entities, confidence, and risk level.
  • ExecutionRequest: Validated command payload mapped to a concrete backend action.
  • ConfirmationToken: Pending confirmation record for risky operations with TTL and one-time usage.
  • AssistantExecutionLog: Audit trail entry for command attempts and outcomes.
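
As a rough sketch, three of the entities above could be modeled as dataclasses. The attribute lists follow the descriptions in this section, but the concrete types, the `Role` and `RiskLevel` enums, the dotted `action` string, and the `to_execution_request` helper are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Dict

class Role(str, Enum):
    USER = "user"
    ASSISTANT = "assistant"

class RiskLevel(str, Enum):
    SAFE = "safe"
    DANGEROUS = "dangerous"

@dataclass(frozen=True)
class AssistantMessage:
    role: Role
    text: str
    timestamp: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    metadata: Dict[str, Any] = field(default_factory=dict)

@dataclass(frozen=True)
class CommandIntent:
    domain: str
    operation: str
    entities: Dict[str, Any]
    confidence: float
    risk_level: RiskLevel

@dataclass(frozen=True)
class ExecutionRequest:
    action: str  # hypothetical "domain.operation" identifier of the backend action
    payload: Dict[str, Any]
    requires_confirmation: bool

def to_execution_request(intent: CommandIntent) -> ExecutionRequest:
    """Map a parsed intent to a request; dangerous intents are flagged for confirmation."""
    return ExecutionRequest(
        action=f"{intent.domain}.{intent.operation}",
        payload=dict(intent.entities),  # copy so the request does not alias the intent
        requires_confirmation=intent.risk_level is RiskLevel.DANGEROUS,
    )
```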

Success Criteria (mandatory)

Measurable Outcomes

  • SC-001: At least 90% of predefined core commands (Git/migration/backup/LLM/status) are correctly interpreted in the acceptance test set.
  • SC-002: 100% of risky operations (production deploy and equivalent) require explicit confirmation before execution.
  • SC-003: 100% of assistant-started long-running operations return a task_id in the first response.
  • SC-004: Permission bypass rate is 0% in security tests (unauthorized commands never execute).
  • SC-005: At least 95% of assistant responses return within 2 seconds for parse/dispatch stage (excluding downstream task runtime).
  • SC-006: At least 90% of operators can launch a target operation from chat faster than through manual multi-page navigation in usability checks.
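
A criterion such as SC-005 reduces to a simple share-of-samples check. The helper below is a hypothetical sketch of that arithmetic; it assumes latencies are measured in seconds for the parse/dispatch stage only and treats exactly 2 seconds as within the limit.

```python
from typing import Sequence

def fraction_within(latencies_s: Sequence[float], limit_s: float = 2.0) -> float:
    """Share of samples that finished within the limit (inclusive)."""
    if not latencies_s:
        raise ValueError("at least one latency sample is required")
    return sum(1 for value in latencies_s if value <= limit_s) / len(latencies_s)

def meets_sc005(latencies_s: Sequence[float],
                limit_s: float = 2.0, target: float = 0.95) -> bool:
    return fraction_within(latencies_s, limit_s) >= target
```

For example, 19 fast responses out of 20 give 19/20 = 95%, which meets the target, while 18 out of 20 give 90%, which does not.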

Assumptions

  • Existing APIs/plugins for Git, migration, backup, LLM actions, and task status remain authoritative execution backends.
  • Existing RBAC permissions (plugin:*, tasks:*, admin:*) remain the access model.
  • Task Drawer and reports page remain current progress/result surfaces and will be reused.
  • LLM assistant orchestration can use configured provider stack without introducing a separate auth model.

Dependencies

  • Existing routes/services for Git (/api/git/...), migration (/api/execute), task APIs (/api/tasks/...), and LLM provider/task flows.
  • Existing authentication and authorization components from multi-user auth implementation.
  • Existing task manager persistence/logging for async execution tracking.

Fixtures

assistant_command_git_branch:
  description: "Valid RU command for branch creation"
  data:
    message: "сделай ветку feature/revenue-v2 для дашборда 42"  # "make branch feature/revenue-v2 for dashboard 42"
    expected:
      domain: "git"
      operation: "create_branch"
      entities:
        dashboard_id: 42
        branch_name: "feature/revenue-v2"

assistant_command_migration_start:
  description: "Valid migration launch command"
  data:
    message: "запусти миграцию с dev на prod для дашборда 42"  # "start migration from dev to prod for dashboard 42"
    expected:
      domain: "migration"
      operation: "execute"
      requires_confirmation: true

assistant_command_prod_deploy_confirmation:
  description: "Risky production deploy requires confirmation"
  data:
    message: "задеплой дашборд 42 в production"  # "deploy dashboard 42 to production"
    expected:
      state: "needs_confirmation"
      requires_confirmation: true

assistant_status_query:
  description: "Task status lookup"
  data:
    message: "проверь статус задачи task-123"  # "check the status of task task-123"
    expected:
      domain: "status"
      operation: "get_task"
      entities:
        task_id: "task-123"
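
One way to use these fixtures in tests is to compare the parser's output against the `expected` block as a partial match, so that extra fields such as confidence do not cause failures. The sketch below assumes the fixture has already been loaded into a dictionary; `matches_expected` and the sample `parsed` output are hypothetical.

```python
from typing import Any, Mapping

def matches_expected(actual: Mapping[str, Any], expected: Mapping[str, Any]) -> bool:
    """True if every key in `expected` is present in `actual` with an equal value.

    Nested mappings are compared recursively; extra keys in `actual` are ignored.
    """
    for key, want in expected.items():
        if key not in actual:
            return False
        got = actual[key]
        if isinstance(want, Mapping):
            if not isinstance(got, Mapping) or not matches_expected(got, want):
                return False
        elif got != want:
            return False
    return True

# In-code mirror of the assistant_command_git_branch fixture.
FIXTURE = {
    "message": "сделай ветку feature/revenue-v2 для дашборда 42",
    "expected": {
        "domain": "git",
        "operation": "create_branch",
        "entities": {"dashboard_id": 42, "branch_name": "feature/revenue-v2"},
    },
}

# Hypothetical parser output for the fixture message.
parsed = {
    "domain": "git",
    "operation": "create_branch",
    "entities": {"dashboard_id": 42, "branch_name": "feature/revenue-v2"},
    "confidence": 0.93,
}

fixture_passes = matches_expected(parsed, FIXTURE["expected"])
```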