# Feature Specification: LLM Analysis & Documentation Plugins

**Feature Branch**: `017-llm-analysis-plugin`
**Created**: 2026-01-28
**Status**: Draft
**Input**: User description: "LLM Dashboard Validation Plugin for integrating LLMs into ss-tools. The plugin must support analyzing dashboard correctness via a multimodal LLM (screenshot + logs). A second plugin must handle documenting datasets and dashboards using an LLM. Optionally, include commit message generation in the Git plugin. Provider support: OpenRouter, Kilo Provider, OpenAI API. Integration with the existing PluginBase architecture, Task Manager, and WebSocket logs."

## Clarifications

### Session 2026-01-28

- Q: Notification Content Strategy → A: **Summary with Link**: Include status (Pass/Fail), key issues count, and a direct link to the full report in the UI.
- Q: Dashboard Screenshot Source → A: **Hybrid (Configurable)**: Support both Headless Browser (accurate) and API/Thumbnail (fast) methods, allowing admin configuration.
- Q: Dataset Documentation Output Format → A: **Direct Object Update**: Update the description fields of the dataset and its columns directly within the dataset object (persisted to backend/metadata).
- Q: Git Commit Message Context → A: **Diff + Recent History**: Send the diff, file names, and the last 3 commit messages to match style.
- Q: LLM Failure Handling → A: **Retry then Fail**: Automatically retry 3 times with exponential backoff before failing.

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Dashboard Health Analysis (Priority: P1)

As a Data Engineer, I want to automatically analyze a dashboard's status using visual and log data directly from the Environments interface so that I can identify rendering issues or data errors without manual inspection.

**Why this priority**: Core value proposition of the feature. Enables automated quality assurance.
**Independent Test**: Can be tested by selecting a dashboard in the Environment list and clicking "Validate", or by scheduling a validation task.

**Acceptance Scenarios**:

1. **Given** I am on the Environments page (Dashboard list), **When** I select a dashboard and click "Validate", **Then** the system triggers a validation task with the dashboard's context.
2. **Given** the validation task is running, **When** it completes, **Then** I see the analysis report (visual + logs) in the task output history.
3. **Given** I want regular checks, **When** I configure a schedule for the validation task (similar to the Backup plugin), **Then** the system runs the check automatically at the specified interval.
4. **Given** a validation issue is found, **When** the task completes, **Then** the system sends a notification (Email/Pulse) if configured.

---

### User Story 2 - Automated Dataset Documentation (Priority: P1)

As a Data Steward, I want to generate documentation for datasets and dashboards using LLMs so that I can maintain up-to-date metadata with minimal manual effort.

**Why this priority**: Significantly reduces maintenance overhead for data governance.

**Independent Test**: Can be tested by selecting a dataset and triggering the documentation task, then checking whether a description is generated.

**Acceptance Scenarios**:

1. **Given** a dataset identifier, **When** I run the Documentation task, **Then** the system fetches the dataset's schema and metadata.
2. **Given** the metadata is fetched, **When** it is sent to the LLM, **Then** a structured description/documentation text is returned.
3. **Given** the documentation is generated, **When** the task completes, **Then** the result is available for review (e.g., in the task log or saved to a file/db).

---

### User Story 3 - LLM Provider Configuration (Priority: P1)

As an Administrator, I want to configure different LLM providers (OpenAI, OpenRouter, Kilo) so that I can switch between models based on cost or capability.
**Why this priority**: Prerequisite for any LLM functionality.

**Independent Test**: Can be tested by entering API keys in settings and verifying a connection/test call.

**Acceptance Scenarios**:

1. **Given** the Settings page, **When** I select an LLM provider (e.g., OpenAI) and enter an API key, **Then** the system saves the configuration.
2. **Given** a configured provider, **When** I run an analysis task, **Then** the system uses the selected provider for API calls.

---

### User Story 4 - Git Commit Message Suggestion (Priority: P3)

As a Developer, I want the system to suggest commit messages based on changes directly within the Git plugin interface so that I can maintain a consistent history with minimal effort.

**Why this priority**: Enhances the existing Git workflow and improves commit quality.

**Independent Test**: Can be tested by staging files in the Git plugin and clicking the "Generate Message" button.

**Acceptance Scenarios**:

1. **Given** staged changes in the Git plugin, **When** I click "Generate Message", **Then** the system analyzes the diff using the configured LLM and populates the commit message field with a suggested summary.

### Edge Cases

- What happens when the LLM provider API is down or times out? (System should retry or fail gracefully with a clear error message.)
- What happens if the dashboard screenshot cannot be generated? (System should proceed with logs only or fail, depending on configuration.)
- What happens if the context (logs/metadata) exceeds the LLM's token limit? (System should truncate or summarize input.)
- How does the system handle missing API keys? (Task should fail immediately with a configuration error.)

## Requirements *(mandatory)*

### Functional Requirements

- **FR-001**: System MUST allow configuration of multiple LLM providers, specifically supporting OpenAI API, OpenRouter, and Kilo Provider.
- **FR-002**: System MUST securely store API keys for these providers.
- **FR-003**: System MUST implement a `DashboardValidationPlugin` that integrates with the existing `PluginBase` architecture.
- **FR-004**: `DashboardValidationPlugin` MUST accept a dashboard identifier as input.
- **FR-005**: `DashboardValidationPlugin` MUST be capable of retrieving a visual representation (screenshot) of the dashboard.
- **FR-016**: System MUST support configurable screenshot strategies: 'Headless Browser' (default, high accuracy) and 'API Thumbnail' (fallback/fast).
- **FR-006**: `DashboardValidationPlugin` MUST retrieve recent execution logs associated with the dashboard.
- **FR-007**: `DashboardValidationPlugin` MUST combine visual and text data to prompt a multimodal LLM for analysis.
- **FR-008**: System MUST implement a `DocumentationPlugin` (or similar) for documenting datasets and dashboards.
- **FR-009**: `DocumentationPlugin` MUST retrieve schema and metadata for the target asset.
- **FR-017**: `DocumentationPlugin` MUST apply generated descriptions directly to the target object's metadata fields (dataset description, column descriptions).
- **FR-010**: All LLM interactions MUST be executed as asynchronous tasks via the Task Manager.
- **FR-018**: System MUST implement automatic retry logic (3 attempts with exponential backoff) for failed LLM API calls.
- **FR-011**: Task execution logs and results MUST be streamed via the existing WebSocket infrastructure.
- **FR-012**: System SHOULD expose an interface to generate text summaries for Git diffs, utilizing the diff, file list, and recent commit history as context.
- **FR-013**: System MUST support scheduling of validation tasks for dashboards (leveraging the existing scheduler architecture).
- **FR-014**: System SHOULD support notification dispatch (Email, Pulse) upon validation failure or completion.
- **FR-015**: Notifications MUST contain a summary of results (Status, Issue Count) and a direct link to the full report, avoiding sensitive full details in the message body.
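The retry behavior required by FR-018 (3 attempts, exponential backoff) could be sketched as a small wrapper around any provider call; the function name and parameters here are illustrative assumptions, not part of the spec:

```python
import time


def call_with_retry(call, max_attempts=3, base_delay=1.0):
    """Invoke an LLM API call, retrying transient failures (FR-018 sketch).

    `call` is any zero-argument callable that raises an exception on failure.
    Delays grow as base_delay * 2^(attempt-1): 1s, 2s, 4s, ...
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except Exception:
            if attempt == max_attempts:
                # All retries exhausted: surface the error to the Task Manager.
                raise
            time.sleep(base_delay * 2 ** (attempt - 1))
```

In practice the wrapper would likely catch only provider-specific transient errors (timeouts, 429/5xx) rather than bare `Exception`, so that configuration errors such as a missing API key fail immediately, as the Edge Cases section requires.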
### Key Entities

- **LLMProviderConfig**: Stores provider type (OpenAI, etc.), base URL, model name, and API key.
- **ValidationResult**: Stores the analysis output, timestamp, and a reference to the dashboard.
- **AutoDocumentation**: Stores the generated documentation text for an asset.

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-001**: Users can successfully configure and validate a connection to at least one LLM provider.
- **SC-002**: A dashboard validation task completes within 90 seconds (assuming standard LLM latency).
- **SC-003**: The system successfully processes a multimodal prompt (image + text) and returns a structured analysis.
- **SC-004**: Generated documentation for a standard dataset contains descriptions for at least 80% of the columns (subject to LLM capability, but the pipeline must support it).
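For implementation planning, the Key Entities listed above could be sketched as plain dataclasses; only the fields named in the spec are authoritative, and any additional fields or value conventions shown here are assumptions:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class LLMProviderConfig:
    """Connection settings for one LLM provider."""
    provider: str   # e.g. "openai", "openrouter", "kilo"
    base_url: str
    model: str
    api_key: str    # MUST be stored securely at rest (FR-002)


@dataclass
class ValidationResult:
    """Output of one dashboard validation task."""
    dashboard_id: str
    status: str                         # assumed convention: "pass" | "fail"
    issues: list = field(default_factory=list)
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


@dataclass
class AutoDocumentation:
    """Generated documentation text for a dataset or dashboard."""
    asset_id: str
    asset_type: str   # assumed convention: "dataset" | "dashboard"
    text: str
```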