65 Commits

Author SHA1 Message Date
0f16bab2b8 Seems to work 2026-02-07 11:26:06 +03:00
7de96c17c4 feat(llm-plugin): switch to environment API for log retrieval
- Replace local backend.log reading with Superset API /log/ fetch
- Update DashboardValidationPlugin to use SupersetClient
- Filter logs by dashboard_id and last 24 hours
- Update spec FR-006 to reflect API usage
2026-02-06 17:57:25 +03:00
f018b97ed2 Semantic protocol update - add UX 2026-01-30 18:53:52 +03:00
72846aa835 tasks ux-reference 2026-01-30 13:35:03 +03:00
994c0c3e5d feat(speckit): integrate ux reference into workflows
Introduce a UX reference stage to ensure technical plans align with
user experience goals. Adds a new template, a generation step in the
specification workflow, and mandatory validation checks during
planning to prevent technical compromises from degrading the defined
user experience.
2026-01-30 12:31:19 +03:00
252a8601a9 Seems to be working 2026-01-30 11:10:16 +03:00
8044f85ea4 tasks and workflow updated 2026-01-29 10:06:28 +03:00
d4109e5a03 docs: amend constitution to v2.0.0 (delegate semantics to protocol + add async/testability principles) 2026-01-28 18:48:43 +03:00
b2bbd73439 tasks ready 2026-01-28 18:30:23 +03:00
0e0e26e2f7 semantic update 2026-01-28 16:57:19 +03:00
18b42f8dd0 semantic protocol condense + script update 2026-01-28 15:49:39 +03:00
e7b31accd6 tested 2026-01-27 23:49:19 +03:00
d3c3a80ed2 Handing off for testing 2026-01-27 16:32:08 +03:00
cc244c2d86 tasks ready 2026-01-27 13:26:06 +03:00
d10c23e658 Updated gitignore - removed logs 2026-01-26 22:15:17 +03:00
1042b35d1b Finished the redesign, updated the backup interface 2026-01-26 22:12:35 +03:00
16ffeb1ed6 Done, handed off for testing 2026-01-26 21:17:05 +03:00
da34deac02 tasks ready 2026-01-26 20:58:38 +03:00
51e9ee3fcc semantic update 2026-01-26 11:57:36 +03:00
edf9286071 File storage is ready 2026-01-26 11:08:18 +03:00
a542e7d2df Handing off for testing 2026-01-25 18:33:00 +03:00
a863807cf2 tasks ready 2026-01-24 16:21:43 +03:00
e2bc68683f Update .gitignore 2026-01-24 11:26:19 +03:00
43cb82697b Update backup scheduler task status 2026-01-24 11:26:05 +03:00
4ba28cf93e semantic cleanup 2026-01-23 21:58:32 +03:00
343f2e29f5 Multilingual support + CSS tidy-up 2026-01-23 17:53:46 +03:00
c9a53578fd tasks ready 2026-01-23 14:56:05 +03:00
07ec2d9797 Commit creation and transfer to a new environment now work 2026-01-23 13:57:44 +03:00
e9d3f3c827 tasks ready 2026-01-22 23:59:16 +03:00
26ba015b75 +gitignore 2026-01-22 23:25:29 +03:00
49129d3e86 fix error 2026-01-22 23:18:48 +03:00
d99a13d91f refactor complete 2026-01-22 17:37:17 +03:00
203ce446f4 fix 2026-01-21 14:00:48 +03:00
c96d50a3f4 fix(backend): standardize superset client init and auth
- Update plugins (debug, mapper, search) to explicitly map environment config to SupersetConfig
- Add authenticate method to SupersetClient for explicit session management
- Add get_environment method to ConfigManager
- Fix navbar dropdown hover stability in frontend with invisible bridge
2026-01-20 19:31:17 +03:00
3bbe320949 TaskLog fix 2026-01-19 17:10:43 +03:00
2d2435642d bug fixes 2026-01-19 00:07:06 +03:00
ec8d67c956 bug fixes 2026-01-18 23:21:00 +03:00
76baeb1038 semantic markup update 2026-01-18 21:29:54 +03:00
11c59fb420 semantic checker script update 2026-01-13 17:33:57 +03:00
b2529973eb constitution update 2026-01-13 15:29:42 +03:00
ae1d630ad6 semantics update 2026-01-13 09:11:27 +03:00
9a9c5879e6 tasks.md status 2026-01-12 12:35:45 +03:00
696aac32e7 1st iter 2026-01-12 12:33:51 +03:00
7a9b1a190a tasks ready 2026-01-07 18:59:49 +03:00
a3dc1fb2b9 docs: amend constitution to v1.6.0 (add 'Everything is a Plugin' principle) and refactor 010 plan 2026-01-07 18:36:38 +03:00
297b29986d Product Manager role 2026-01-07 11:39:44 +03:00
4c6fc8256d project map script | semantic parser 2026-01-01 16:58:21 +03:00
a747a163c8 backup worked 2025-12-30 22:02:51 +03:00
fce0941e98 docs ready 2025-12-30 21:30:37 +03:00
45c077b928 +api rework 2025-12-30 20:08:48 +03:00
9ed3a5992d cleaned 2025-12-30 18:20:40 +03:00
a032fe8457 Password prompt 2025-12-30 17:21:12 +03:00
4c9d554432 TaskManager refactor 2025-12-29 10:13:37 +03:00
6962a78112 mappings+migrate 2025-12-27 10:16:41 +03:00
3d75a21127 tech_lead / coder 2roles 2025-12-27 08:02:59 +03:00
07914c8728 semantic add 2025-12-27 07:14:08 +03:00
cddc259b76 new loggers logic in constitution 2025-12-27 06:51:28 +03:00
dcbf0a7d7f tasks ready 2025-12-27 06:37:03 +03:00
65f61c1f80 Merge branch '001-migration-ui-redesign' into master 2025-12-27 05:58:35 +03:00
cb7386f274 superset_tool logger rework 2025-12-27 05:53:30 +03:00
83e34e1799 feat(logging): implement configurable belief state logging
- Add LoggingConfig model and logging field to GlobalSettings
- Implement belief_scope context manager for structured logging
- Add configure_logger for dynamic level and file rotation settings
- Add logging configuration UI to Settings page
- Update ConfigManager to apply logging settings on initialization and updates
2025-12-27 05:39:33 +03:00
d197303b9f 006 plan ready 2025-12-26 19:36:49 +03:00
a43f8fb021 001-migration-ui-redesign (#3)
Reviewed-on: #3
2025-12-26 18:17:58 +03:00
4aa01b6470 Merge branch 'migration' into 001-migration-ui-redesign 2025-12-26 18:16:24 +03:00
4448352ef9 Merge pull request '001-fix-ui-ws-validation' (#2) from 001-fix-ui-ws-validation into migration
Reviewed-on: #2
2025-12-21 00:29:19 +03:00
385 changed files with 113504 additions and 15397 deletions

14
.gitignore vendored

@@ -29,7 +29,7 @@ env/
 backend/backups/*
 # Node.js
-node_modules/
+frontend/node_modules/
 npm-debug.log*
 yarn-debug.log*
 yarn-error.log*
@@ -39,6 +39,7 @@ build/
 dist/
 .env*
 config.json
+package-lock.json
 # Logs
 *.log
@@ -58,6 +59,13 @@ Thumbs.db
 *.ps1
 keyring passwords.py
 *github*
+*git*
 *tech_spec*
 dashboards
+backend/mappings.db
+backend/tasks.db
+backend/logs
+backend/auth.db
+semantics/reports

15
.kilocode/mcp.json Executable file → Normal file
View File

@@ -1,14 +1 @@
{ {"mcpServers":{}}
"mcpServers": {
"tavily": {
"command": "npx",
"args": [
"-y",
"tavily-mcp@0.2.3"
],
"env": {
"TAVILY_API_KEY": "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"
}
}
}
}


@@ -9,6 +9,30 @@ Auto-generated from all feature plans. Last updated: 2025-12-19
 - Python 3.9+, Node.js 18+ + FastAPI, SvelteKit, Tailwind CSS, Pydantic (005-fix-ui-ws-validation)
 - N/A (Configuration based) (005-fix-ui-ws-validation)
 - Filesystem (plugins, logs, backups), SQLite (optional, for job history if needed) (005-fix-ui-ws-validation)
+- Python 3.9+ (Backend), Node.js 18+ (Frontend) + FastAPI, SvelteKit, Tailwind CSS (007-migration-dashboard-grid)
+- N/A (Superset API integration) (007-migration-dashboard-grid)
+- Python 3.9+ (Backend), Node.js 18+ (Frontend) + FastAPI, SvelteKit, Tailwind CSS, Pydantic, Superset API (007-migration-dashboard-grid)
+- N/A (Superset API integration - read-only for metadata) (007-migration-dashboard-grid)
+- Python 3.9+ (backend), Node.js 18+ (frontend) + FastAPI, SvelteKit, Tailwind CSS, Pydantic, SQLAlchemy, Superset API (008-migration-ui-improvements)
+- SQLite (optional for job history), existing database for mappings (008-migration-ui-improvements)
+- Python 3.9+, Node.js 18+ + FastAPI, SvelteKit, Tailwind CSS, Pydantic, SQLAlchemy, Superset API (008-migration-ui-improvements)
+- Python 3.9+, Node.js 18+ + FastAPI, APScheduler, SQLAlchemy, SvelteKit, Tailwind CSS (009-backup-scheduler)
+- SQLite (`tasks.db`), JSON (`config.json`) (009-backup-scheduler)
+- Python 3.9+ (Backend), Node.js 18+ (Frontend) + FastAPI, SvelteKit, Tailwind CSS, Pydantic, SQLAlchemy, `superset_tool` (internal lib) (010-refactor-cli-to-web)
+- SQLite (for job history/results, connection configs), Filesystem (for temporary file uploads) (010-refactor-cli-to-web)
+- Python 3.9+ + FastAPI, Pydantic, requests, pyyaml (migrated from superset_tool) (012-remove-superset-tool)
+- SQLite (tasks.db, migrations.db), Filesystem (012-remove-superset-tool)
+- Filesystem (local git repo), SQLite (for GitServerConfig, Environment) (011-git-integration-dashboard)
+- Python 3.9+ (Backend), Node.js 18+ (Frontend) + FastAPI, SvelteKit, GitPython (or CLI git), Pydantic, SQLAlchemy, Superset API (011-git-integration-dashboard)
+- SQLite (for config/history), Filesystem (local Git repositories) (011-git-integration-dashboard)
+- Node.js 18+ (Frontend Build), Svelte 5.x + SvelteKit, Tailwind CSS, `date-fns` (existing) (013-unify-frontend-css)
+- LocalStorage (for language preference) (013-unify-frontend-css)
+- Python 3.9+ (Backend), Node.js 18+ (Frontend) + FastAPI (Backend), SvelteKit (Frontend) (014-file-storage-ui)
+- Local Filesystem (for artifacts), Config (for storage path) (014-file-storage-ui)
+- Python 3.9+ (Backend), Node.js 18+ (Frontend) + FastAPI (Backend), SvelteKit + Tailwind CSS (Frontend) (015-frontend-nav-redesign)
+- N/A (UI reorganization and API integration) (015-frontend-nav-redesign)
+- SQLite (`auth.db`) for Users, Roles, Permissions, and Mappings. (016-multi-user-auth)
+- SQLite (existing `tasks.db` for results, `auth.db` for permissions, `mappings.db` or new `plugins.db` for provider config/metadata) (017-llm-analysis-plugin)
 - Python 3.9+ (Backend), Node.js 18+ (Frontend Build) (001-plugin-arch-svelte-ui)
@@ -29,9 +53,9 @@ cd src; pytest; ruff check .
 Python 3.9+ (Backend), Node.js 18+ (Frontend Build): Follow standard conventions
 ## Recent Changes
-- 005-fix-ui-ws-validation: Added Python 3.9+ (Backend), Node.js 18+ (Frontend Build)
-- 005-fix-ui-ws-validation: Added Python 3.9+, Node.js 18+ + FastAPI, SvelteKit, Tailwind CSS, Pydantic
-- 005-fix-ui-ws-validation: Added Python 3.9+, Node.js 18+ + FastAPI, SvelteKit, Tailwind CSS, Pydantic
+- 017-llm-analysis-plugin: Added Python 3.9+ (Backend), Node.js 18+ (Frontend)
+- 016-multi-user-auth: Added Python 3.9+ (Backend), Node.js 18+ (Frontend)
+- 015-frontend-nav-redesign: Added Python 3.9+ (Backend), Node.js 18+ (Frontend) + FastAPI (Backend), SvelteKit + Tailwind CSS (Frontend)
 <!-- MANUAL ADDITIONS START -->


@@ -0,0 +1,4 @@
---
description: USE SEMANTIC
---
Read semantic_protocol.md. You MUST use it during development


@@ -63,6 +63,7 @@ Load only the minimal necessary context from each artifact:
 **From constitution:**
 - Load `.specify/memory/constitution.md` for principle validation
+- Load `semantic_protocol.md` for technical standard validation
 ### 3. Build Semantic Models


@@ -1,5 +1,10 @@
 ---
 description: Execute the implementation plan by processing and executing all tasks defined in tasks.md
+handoffs:
+- label: Verify Changes
+agent: speckit.test
+prompt: Verify the implementation of...
+send: true
 ---
 ## User Input
@@ -46,6 +51,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - Automatically proceed to step 3
 3. Load and analyze the implementation context:
+- **REQUIRED**: Read `semantic_protocol.md` for strict coding standards and contract requirements
 - **REQUIRED**: Read tasks.md for the complete task list and execution plan
 - **REQUIRED**: Read plan.md for tech stack, architecture, and file structure
 - **IF EXISTS**: Read data-model.md for entities and relationships
@@ -111,6 +117,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 - **Validation checkpoints**: Verify each phase completion before proceeding
 7. Implementation execution rules:
+- **Strict Adherence**: Apply `semantic_protocol.md` rules - every file must start with [DEF] header, include @TIER, and define contracts
 - **Setup first**: Initialize project structure, dependencies, configuration
 - **Tests before code**: If you need to write tests for contracts, entities, and integration scenarios
 - **Core development**: Implement models, services, CLI commands, endpoints


@@ -22,7 +22,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 1. **Setup**: Run `.specify/scripts/bash/setup-plan.sh --json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
-2. **Load context**: Read FEATURE_SPEC and `.specify/memory/constitution.md`. Load IMPL_PLAN template (already copied).
+2. **Load context**: Read FEATURE_SPEC, `ux_reference.md`, `semantic_protocol.md` and `.specify/memory/constitution.md`. Load IMPL_PLAN template (already copied).
 3. **Execute plan workflow**: Follow the structure in IMPL_PLAN template to:
 - Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
@@ -64,12 +64,22 @@ You **MUST** consider the user input before proceeding (if not empty).
 **Prerequisites:** `research.md` complete
+0. **Validate Design against UX Reference**:
+- Check if the proposed architecture supports the latency, interactivity, and flow defined in `ux_reference.md`.
+- **CRITICAL**: If the technical plan requires compromising the UX defined in `ux_reference.md` (e.g. "We can't do real-time validation because X"), you **MUST STOP** and warn the user. Do not proceed until resolved.
 1. **Extract entities from feature spec** → `data-model.md`:
 - Entity name, fields, relationships
 - Validation rules from requirements
 - State transitions if applicable
-2. **Generate API contracts** from functional requirements:
+2. **Define Module & Function Contracts (Semantic Protocol)**:
+- **MANDATORY**: For every new module, define the [DEF] Header and Module-level Contract (@TIER, @PURPOSE, @INVARIANT) as per `semantic_protocol.md`.
+- **REQUIRED**: Define Function Contracts (@PRE, @POST) for critical logic.
+- Output specific contract definitions to `contracts/modules.md` or append to `data-model.md` to guide implementation.
+- Ensure strict adherence to `semantic_protocol.md` syntax.
+3. **Generate API contracts** from functional requirements:
 - For each user action → endpoint
 - Use standard REST/GraphQL patterns
 - Output OpenAPI/GraphQL schema to `/contracts/`

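Reviewer note: the function contracts (`@PRE`, `@POST`) that the plan workflow above asks for could also be checked at runtime during tests. The following is a minimal sketch under stated assumptions, not the project's mechanism: the `contract` decorator and the `retention_seconds` function are hypothetical, and the comment form of the tags is assumed, since the real syntax is defined in `semantic_protocol.md`, which this diff does not show.

```python
from functools import wraps
from typing import Any, Callable


def contract(pre: Callable[..., bool], post: Callable[[Any], bool]):
    """Check a precondition on the arguments and a postcondition on the result."""
    def decorate(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            if not pre(*args, **kwargs):
                raise ValueError(f"@PRE violated in {func.__name__}")
            result = func(*args, **kwargs)
            if not post(result):
                raise AssertionError(f"@POST violated in {func.__name__}")
            return result
        return wrapper
    return decorate


# Assumed tag style (illustrative only):
# @PRE: retention_days >= 1
# @POST: result is a non-negative number of seconds
@contract(pre=lambda retention_days: retention_days >= 1,
          post=lambda seconds: seconds >= 0)
def retention_seconds(retention_days: int) -> int:
    """Hypothetical helper: convert a retention period in days to seconds."""
    return retention_days * 24 * 60 * 60
```

A test that calls `retention_seconds(0)` and expects a `ValueError` would exercise the precondition directly, which is one way to avoid a "Weak Coverage" finding for a contract that exists but is never tested.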

@@ -70,7 +70,22 @@ Given that feature description, do this:
 3. Load `.specify/templates/spec-template.md` to understand required sections.
-4. Follow this execution flow:
+4. **Generate UX Reference**:
+a. Load `.specify/templates/ux-reference-template.md`.
+b. **Design the User Experience**:
+- **Imagine you are the user**: Visualize the interface and interaction.
+- **Persona**: Define who is using this.
+- **Happy Path**: Write the story of the perfect interaction.
+- **Mockups**: Create concrete CLI text blocks or UI descriptions.
+- **Errors**: Define how the system guides the user out of failure.
+c. Write the `ux_reference.md` file in the feature directory.
+d. **CRITICAL**: This UX Reference is now the source of truth for the "feel" of the feature. The technical spec MUST support this experience.
+5. Follow this execution flow:
 1. Parse user description from Input
 If empty: ERROR "No feature description provided"
@@ -115,6 +130,12 @@ Given that feature description, do this:
 - [ ] Focused on user value and business needs
 - [ ] Written for non-technical stakeholders
 - [ ] All mandatory sections completed
+## UX Consistency
+- [ ] Functional requirements fully support the 'Happy Path' in ux_reference.md
+- [ ] Error handling requirements match the 'Error Experience' in ux_reference.md
+- [ ] No requirements contradict the defined User Persona or Context
 ## Requirement Completeness
@@ -190,7 +211,7 @@ Given that feature description, do this:
 d. **Update Checklist**: After each validation iteration, update the checklist file with current pass/fail status
-7. Report completion with branch name, spec file path, checklist results, and readiness for the next phase (`/speckit.clarify` or `/speckit.plan`).
+7. Report completion with branch name, spec file path, ux_reference file path, checklist results, and readiness for the next phase (`/speckit.clarify` or `/speckit.plan`).
 **NOTE:** The script creates and checks out the new branch and initializes the spec file before writing.


@@ -24,7 +24,7 @@ You **MUST** consider the user input before proceeding (if not empty).
 1. **Setup**: Run `.specify/scripts/bash/check-prerequisites.sh --json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
 2. **Load design documents**: Read from FEATURE_DIR:
-- **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities)
+- **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities), ux_reference.md (experience source of truth)
 - **Optional**: data-model.md (entities), contracts/ (API endpoints), research.md (decisions), quickstart.md (test scenarios)
 - Note: Not all projects have all documents. Generate tasks based on what's available.
@@ -70,6 +70,12 @@ The tasks.md should be immediately executable - each task must be specific enoug
 **Tests are OPTIONAL**: Only generate test tasks if explicitly requested in the feature specification or if user requests TDD approach.
+### UX Preservation (CRITICAL)
+- **Source of Truth**: `ux_reference.md` is the absolute standard for the "feel" of the feature.
+- **Violation Warning**: If any task would inherently violate the UX (e.g. "Remove progress bar to simplify code"), you **MUST** flag this to the user immediately.
+- **Verification Task**: You **MUST** add a specific task at the end of each User Story phase: `- [ ] Txxx [USx] Verify implementation matches ux_reference.md (Happy Path & Errors)`
 ### Checklist Format (REQUIRED)
 Every task MUST strictly follow this format:


@@ -0,0 +1,66 @@
---
description: Run semantic validation and functional tests for a specific feature, module, or file.
handoffs:
- label: Fix Implementation
agent: speckit.implement
prompt: Fix the issues found during testing...
send: true
---
## User Input
```text
$ARGUMENTS
```
**Input format:** Can be a file path, a directory, or a feature name.
## Outline
1. **Context Analysis**:
- Determine the target scope (Backend vs Frontend vs Full Feature).
- Read `semantic_protocol.md` to load validation rules.
2. **Phase 1: Semantic Static Analysis (The "Compiler" Check)**
- **Command:** Use `grep` or script to verify Protocol compliance before running code.
- **Check:**
- Does the file start with `[DEF:...]` header?
- Are `@TIER` and `@PURPOSE` defined?
- Are imports located *after* the contracts?
- Do functions marked "Critical" have `@PRE`/`@POST` tags?
- **Action:** If this phase fails, **STOP** and report "Semantic Compilation Failed". Do not run runtime tests.
3. **Phase 2: Environment Prep**
- Detect project type:
- **Python**: Check if `.venv` is active.
- **Svelte**: Check if `node_modules` exists.
- **Command:** Run linter (e.g., `ruff check`, `eslint`) to catch syntax errors immediately.
4. **Phase 3: Test Execution (Runtime)**
- Select the test runner based on the file path:
- **Backend (`*.py`)**:
- Command: `pytest <path_to_test_file> -v`
- If no specific test file exists, try to find it by convention: `tests/test_<module_name>.py`.
- **Frontend (`*.svelte`, `*.ts`)**:
- Command: `npm run test -- <path_to_component>`
- **Verification**:
- Analyze output logs.
- If tests fail, summarize the failure (AssertionError, Timeout, etc.).
5. **Phase 4: Contract Coverage Check (Manual/LLM verify)**
- Review the test cases executed.
- **Question**: Do the tests explicitly verify the `@POST` guarantees defined in the module header?
- **Report**: Mark as "Weak Coverage" if contracts exist but aren't tested.
## Execution Rules
- **Fail Fast**: If semantic headers are missing, don't waste time running pytest.
- **No Silent Failures**: Always output the full error log if a command fails.
- **Auto-Correction Hint**: If a test fails, suggest the specific `speckit.implement` command to fix it.
## Example Commands
- **Python**: `pytest backend/tests/test_auth.py`
- **Svelte**: `npm run test:unit -- src/components/Button.svelte`
- **Lint**: `ruff check backend/src/api/`

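Reviewer note: Phases 1 and 3 of the `speckit.test` workflow above, together with its "Fail Fast" rule, could be approximated by a small helper. This is a minimal sketch and not part of the commit. The regex for the `[DEF:...]` header, the assumption that the header sits on the first non-blank line, and the names `semantic_issues`, `runner_command`, and `plan` are all illustrative; the authoritative syntax lives in `semantic_protocol.md`, which this diff does not show.

```python
import re
from pathlib import Path
from typing import List

# Assumed pattern: the workflow only says a file must start with a `[DEF:...]` header.
DEF_HEADER = re.compile(r"\[DEF:[^\]]+\]")
# Tags the workflow lists as mandatory in Phase 1.
REQUIRED_TAGS = ("@TIER", "@PURPOSE")


def semantic_issues(source: str) -> List[str]:
    """Return the Phase 1 violations found in one file's text (empty if none)."""
    issues = []
    lines = [line for line in source.splitlines() if line.strip()]
    first = lines[0] if lines else ""
    if not DEF_HEADER.search(first):
        issues.append("missing [DEF:...] header on first non-blank line")
    for tag in REQUIRED_TAGS:
        if tag not in source:
            issues.append(f"missing {tag}")
    return issues


def runner_command(path: str) -> List[str]:
    """Pick the Phase 3 test command from the file extension."""
    p = Path(path)
    if p.suffix == ".py":
        # Convention named in the workflow: tests/test_<module_name>.py
        return ["pytest", f"tests/test_{p.stem}.py", "-v"]
    if p.suffix in {".svelte", ".ts"}:
        return ["npm", "run", "test", "--", path]
    raise ValueError(f"no test runner known for {path}")


def plan(path: str, source: str) -> List[str]:
    """Fail fast: refuse to return a test command when Phase 1 fails."""
    issues = semantic_issues(source)
    if issues:
        raise RuntimeError("Semantic Compilation Failed: " + "; ".join(issues))
    return runner_command(path)
```

The returned list is shaped for `subprocess.run`, so a caller could execute it directly once the semantic check passes. The checks for import placement and for `@PRE`/`@POST` on critical functions are left out because they depend on protocol details not visible here.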
46
.kilocodemodes Normal file

@@ -0,0 +1,46 @@
customModes:
- slug: tester
name: Tester
description: QA and Plan Verification Specialist
roleDefinition: |-
You are Kilo Code, acting as a QA and Verification Specialist. Your primary goal is to validate that the project implementation aligns strictly with the defined specifications and task plans.
Your responsibilities include: - Reading and analyzing task plans and specifications (typically in the `specs/` directory). - Verifying that implemented code matches the requirements. - Executing tests and validating system behavior via CLI or Browser. - Updating the status of tasks in the plan files (e.g., marking checkboxes [x]) as they are verified. - Identifying and reporting missing features or bugs.
whenToUse: Use this mode when you need to audit the progress of a project, verify completed tasks against the plan, run quality assurance checks, or update the status of task lists in specification documents.
groups:
- read
- edit
- command
- browser
- mcp
customInstructions: 1. Always begin by loading the relevant plan or task list from the `specs/` directory. 2. Do not assume a task is done just because it is checked; verify the code or functionality first if asked to audit. 3. When updating task lists, ensure you only mark items as complete if you have verified them.
- slug: semantic
name: Semantic Agent
roleDefinition: |-
You are Kilo Code, a Semantic Agent responsible for maintaining the semantic integrity of the codebase. Your primary goal is to ensure that all code entities (Modules, Classes, Functions, Components) are properly annotated with semantic anchors and tags as defined in `semantic_protocol.md`.
Your core responsibilities are: 1. **Semantic Mapping**: You run and maintain the `generate_semantic_map.py` script to generate up-to-date semantic maps (`semantics/semantic_map.json`, `specs/project_map.md`) and compliance reports (`semantics/reports/*.md`). 2. **Compliance Auditing**: You analyze the generated compliance reports to identify files with low semantic coverage or parsing errors. 3. **Semantic Enrichment**: You actively edit code files to add missing semantic anchors (`[DEF:...]`, `[/DEF:...]`) and mandatory tags (`@PURPOSE`, `@LAYER`, etc.) to improve the global compliance score. 4. **Protocol Enforcement**: You strictly adhere to the syntax and rules defined in `semantic_protocol.md` when modifying code.
You have access to the full codebase and tools to read, write, and execute scripts. You should prioritize fixing "Critical Parsing Errors" (unclosed anchors) before addressing missing metadata.
whenToUse: Use this mode when you need to update the project's semantic map, fix semantic compliance issues (missing anchors/tags/DbC ), or analyze the codebase structure. This mode is specialized for maintaining the `semantic_protocol.md` standards.
description: Codebase semantic mapping and compliance expert
customInstructions: Always check `semantics/reports/` for the latest compliance status before starting work. When fixing a file, try to fix all semantic issues in that file at once. After making a batch of fixes, run `python3 generate_semantic_map.py` to verify improvements.
groups:
- read
- edit
- command
- browser
- mcp
source: project
- slug: product-manager
name: Product Manager
roleDefinition: |-
Your purpose is to rigorously execute the workflows defined in `.kilocode/workflows/`.
You act as the orchestrator for: - Specification (`speckit.specify`, `speckit.clarify`) - Planning (`speckit.plan`) - Task Management (`speckit.tasks`, `speckit.taskstoissues`) - Quality Assurance (`speckit.analyze`, `speckit.checklist`) - Governance (`speckit.constitution`) - Implementation Oversight (`speckit.implement`)
For each task, you must read the relevant workflow file from `.kilocode/workflows/` and follow its Execution Steps precisely.
whenToUse: Use this mode when you need to run any /speckit.* command or when dealing with high-level feature planning, specification writing, or project management tasks.
description: Executes SpecKit workflows for feature management
customInstructions: 1. Always read the specific workflow file in `.kilocode/workflows/` before executing a command. 2. Adhere strictly to the "Operating Constraints" and "Execution Steps" in the workflow files.
groups:
- read
- edit
- command
- mcp
source: project


@@ -1,29 +1,55 @@
# ss-tools Constitution <!--
SYNC IMPACT REPORT
Version: 2.2.0 (ConfigManager Discipline)
Changes:
- Updated Principle II: Added mandatory requirement for using `ConfigManager` (via dependency injection) for all configuration access to ensure consistent environment handling and avoid hardcoded values.
- Updated Principle III: Refined `requestApi` requirement.
Templates Status:
- .specify/templates/plan-template.md: ✅ Aligned.
- .specify/templates/spec-template.md: ✅ Aligned.
- .specify/templates/tasks-template.md: ✅ Aligned.
-->
# Semantic Code Generation Constitution
## Core Principles ## Core Principles
### I. SPA-First Architecture ### I. Semantic Protocol Compliance
The frontend MUST be a Static Single Page Application (SPA) served by the Python backend. No Node.js server is permitted in production. The backend serves the `index.html` entry point for all non-API routes. The file `semantic_protocol.md` is the **sole and authoritative technical standard** for this project.
- **Law**: All code must adhere to the Axioms (Meaning First, Contract First, etc.) defined in the Protocol.
- **Syntax & Structure**: Anchors (`[DEF]`), Tags (`@KEY`), and File Structures must strictly match the Protocol.
- **Compliance**: Any deviation from `semantic_protocol.md` constitutes a build failure.
### II. API-Driven Communication ### II. Everything is a Plugin & Centralized Config
All data retrieval and state changes MUST be performed via the backend REST API or WebSockets. The frontend should not access the database or filesystem directly. All functional extensions, tools, or major features must be implemented as modular Plugins inheriting from `PluginBase`.
- **Modularity**: Logic should not reside in standalone services or scripts unless strictly necessary for core infrastructure. This ensures a unified execution model via the `TaskManager`.
- **Configuration Discipline**: All configuration access (environments, settings, paths) MUST use the `ConfigManager`. In the backend, the singleton instance MUST be obtained via dependency injection (`get_config_manager()`). Hardcoding environment IDs (e.g., "1") or paths is STRICTLY FORBIDDEN.
-### III. Modern Stack Consistency
-The project strictly uses SvelteKit (Frontend), FastAPI (Backend), and Tailwind CSS (Styling). New dependencies must be justified and approved.
+### III. Unified Frontend Experience
+To ensure a consistent and accessible user experience, all frontend implementations must strictly adhere to the unified design and localization standards.
+- **Component Reusability**: All UI elements MUST utilize the standardized Svelte component library (`src/lib/ui`) and centralized design tokens.
+- **Internationalization (i18n)**: All user-facing text MUST be extracted to the translation system (`src/lib/i18n`).
+- **Backend Communication**: All API requests MUST use the `requestApi` wrapper (or its derivatives like `fetchApi`, `postApi`) from `src/lib/api.js`. Direct use of the native `fetch` API for backend communication is FORBIDDEN to ensure consistent authentication (JWT) and error handling.
-### IV. Semantic Protocol Adherence (GRACE-Poly)
-All code generation and modification MUST adhere to the Semantic Protocol defined in `semantic_protocol.md`.
-- **Anchors**: Use `[DEF:id:Type]` and `[/DEF:id]` to define semantic boundaries.
-- **Contracts**: Define `@PRE` and `@POST` conditions in headers.
-- **Logging**: Use structured logging with `[AnchorID][State]` format.
-- **Immutability**: Respect architectural decisions in headers.
+### IV. Security & Access Control
+To support the Role-Based Access Control (RBAC) system, all functional components must define explicit permissions.
+- **Granular Permissions**: Every Plugin MUST define a unique permission string (e.g., `plugin:name:execute`) required for its operation.
+- **Registration**: These permissions MUST be registered in the system database (`auth.db`) during initialization.
+### V. Independent Testability
+Every feature specification MUST define "Independent Tests" that allow the feature to be verified in isolation.
+- **Decoupling**: Features should be designed such that they can be tested without requiring the full application state or external dependencies where possible.
+- **Verification**: A feature is not complete until its Independent Test scenarios pass.
+### VI. Asynchronous Execution
+All long-running or resource-intensive operations (migrations, analysis, backups, external API calls) MUST be executed as asynchronous tasks via the `TaskManager`.
+- **Non-Blocking**: HTTP API endpoints MUST NOT block on these operations; they should spawn a task and return a Task ID.
+- **Observability**: Tasks MUST emit real-time status updates via the WebSocket infrastructure.
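The Non-Blocking and Observability rules can be sketched with plain `asyncio`. Everything here is an assumption made for illustration: this `TaskManager` is an in-memory stand-in, not the project's class, the `report` callback stands in for a WebSocket push, and `start_backup` only mimics the shape of an endpoint handler that spawns work and returns a Task ID.

```python
import asyncio
import uuid
from typing import Awaitable, Callable, Dict, List

Reporter = Callable[[str], None]


class TaskManager:
    """Hypothetical in-memory task manager, for illustration only."""

    def __init__(self) -> None:
        self.status: Dict[str, List[str]] = {}
        self._tasks: Dict[str, asyncio.Task] = {}

    def spawn(self, job: Callable[[Reporter], Awaitable[None]]) -> str:
        task_id = uuid.uuid4().hex
        self.status[task_id] = ["PENDING"]

        def report(state: str) -> None:
            # A real implementation would push this update over a WebSocket.
            self.status[task_id].append(state)

        async def runner() -> None:
            report("RUNNING")
            try:
                await job(report)
                report("SUCCESS")
            except Exception as exc:
                report(f"FAILED: {exc}")

        # Schedule the work and return immediately without awaiting it.
        self._tasks[task_id] = asyncio.create_task(runner())
        return task_id

    async def wait(self, task_id: str) -> None:
        await self._tasks[task_id]


async def backup_job(report: Reporter) -> None:
    await asyncio.sleep(0)  # placeholder for the long-running work
    report("exported 1 dashboard")


async def start_backup(manager: TaskManager) -> Dict[str, str]:
    """Shape of a non-blocking handler: spawn the task, return its ID."""
    return {"task_id": manager.spawn(backup_job)}


async def demo() -> List[str]:
    manager = TaskManager()
    response = await start_backup(manager)
    await manager.wait(response["task_id"])
    return manager.status[response["task_id"]]


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The handler returns as soon as the task is scheduled; callers follow progress through the recorded status updates rather than through the HTTP response.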
 ## Governance
-### Compliance
-All Pull Requests and code modifications must be verified against this Constitution. Violations of Core Principles are considered critical defects.
-### Amendments
-Changes to this Constitution require a formal RFC process and approval from the project lead.
-**Version**: 1.0.0 | **Ratified**: 2025-12-20
+This Constitution establishes the "Semantic Code Generation Protocol" as the supreme law of this repository.
+- **Authoritative Source**: `semantic_protocol.md` defines the specific implementation rules for Principle I.
+- **Amendments**: Changes to core principles require a Constitution amendment. Changes to technical syntax require a Protocol update.
+- **Compliance**: Failure to adhere to the Protocol constitutes a build failure.
+**Version**: 2.2.0 | **Ratified**: 2025-12-19 | **Last Amended**: 2026-01-29


@@ -1,6 +1,7 @@
 # Feature Specification: [FEATURE NAME]
 **Feature Branch**: `[###-feature-name]`
+**Reference UX**: `[ux_reference.md]` (See specific folder)
 **Created**: [DATE]
 **Status**: Draft
 **Input**: User description: "$ARGUMENTS"


@@ -0,0 +1,67 @@
# UX Reference: [FEATURE NAME]
**Feature Branch**: `[###-feature-name]`
**Created**: [DATE]
**Status**: Draft
## 1. User Persona & Context
* **Who is the user?**: [e.g. Junior Developer, System Administrator, End User]
* **What is their goal?**: [e.g. Quickly deploy a hotfix, Visualize complex data]
* **Context**: [e.g. Running a command in a terminal on a remote server, Browsing the dashboard on a mobile device]
## 2. The "Happy Path" Narrative
[Write a short story (3-5 sentences) describing the perfect interaction from the user's perspective. Focus on how it *feels* - is it instant? Does it guide them?]
## 3. Interface Mockups
### CLI Interaction (if applicable)
```bash
# User runs this command:
$ command --flag value
# System responds immediately with:
[ spinner ] specific loading message...
# Success output:
✅ Operation completed successfully in 1.2s
- Created file: /path/to/file
- Updated config: /path/to/config
```
### UI Layout & Flow (if applicable)
**Screen/Component**: [Name]
* **Layout**: [Description of structure, e.g., "Two-column layout, left sidebar navigation..."]
* **Key Elements**:
    * **[Button Name]**: Primary action. Color: Blue.
    * **[Input Field]**: Placeholder text: "Enter your name...". Validation: Real-time.
* **States**:
    * **Default**: Clean state, waiting for input.
    * **Loading**: Skeleton loader replaces content area.
    * **Success**: Toast notification appears top-right: "Saved!" (Green).
## 4. The "Error" Experience
**Philosophy**: Don't just report the error; guide the user to the fix.
### Scenario A: [Common Error, e.g. Invalid Input]
* **User Action**: Enters "123" in a text-only field.
* **System Response**:
    * (UI) Input border turns Red. Message below input: "Please enter text only."
    * (CLI) `❌ Error: Invalid input '123'. Expected text format.`
* **Recovery**: User can immediately re-type without refreshing/re-running.
### Scenario B: [System Failure, e.g. Network Timeout]
* **System Response**: "Unable to connect. Retrying in 3s... (Press C to cancel)"
* **Recovery**: Automatic retry or explicit "Retry Now" button.
## 5. Tone & Voice
* **Style**: [e.g. Concise, Technical, Friendly, Verbose]
* **Terminology**: [e.g. Use "Repository" not "Repo", "Directory" not "Folder"]

README.md

@@ -1,119 +1,77 @@
-# Superset Automation Tools
+# Superset Automation Tools (ss-tools)
 ## Overview
-This repository contains Python scripts and a library (`superset_tool`) for automating tasks in Apache Superset, such as:
-- **Backup**: Exporting all dashboards from a Superset instance to local storage.
-- **Migration**: Transferring and transforming dashboards between Superset environments (for example, Development, Sandbox, Production).
+**ss-tools** is a modern platform for automating and managing the Apache Superset ecosystem. The project has moved from a set of CLI scripts to a full web application with a Backend (FastAPI) + Frontend (SvelteKit) architecture, providing a convenient interface for complex operations.
+## Key Features
+### 🚀 Dashboard Migration and Management
+- **Dashboard Grid**: Convenient view of all dashboards across all environments (Dev, Sandbox, Prod) in a single interface.
+- **Intelligent Mapping**: Automatic and manual mapping of datasets, tables, and schemas when transferring between environments.
+- **Dependency Check**: Validates that all required components are present before migration.
+### 📦 Backup
+- **Scheduler**: Automatic scheduled backups of dashboards and datasets.
+- **Storage**: Local storage of artifacts, manageable through the UI.
+### 🛠 Git Integration
+- **Version Control**: Versioning of Superset assets.
+- **Git Dashboard**: Manage branches, commits, and deployment of changes directly from the interface.
+- **Conflict Resolution**: Built-in tools for resolving conflicts in YAML configurations.
+### 🤖 LLM Analysis (AI Plugin)
+- **Automated Audit**: Analysis of dashboard state based on screenshots and metadata.
+- **Documentation Generation**: Automatic descriptions of datasets and columns using an LLM (OpenAI, OpenRouter, and others).
+- **Smart Validation**: Detection of anomalies and errors in visualizations.
+### 🔐 Security and Administration
+- **Multi-user Auth**: Multi-user access with a role-based model (RBAC).
+- **Connection Management**: Centralized configuration of access to different Superset instances.
+- **Logging**: Detailed execution history of all background tasks.
+## Technology Stack
+- **Backend**: Python 3.9+, FastAPI, SQLAlchemy, APScheduler, Pydantic.
+- **Frontend**: Node.js 18+, SvelteKit, Tailwind CSS.
+- **Database**: SQLite (for storing metadata, tasks, and access settings).
 ## Project Structure
-- `backup_script.py`: Main script for running scheduled backups of Superset dashboards.
-- `migration_script.py`: Main script for transferring specific dashboards between environments, including overriding database connections.
-- `search_script.py`: Script for searching data across all datasets available on the server.
-- `run_mapper.py`: CLI script for mapping dataset metadata.
-- `superset_tool/`:
-- `client.py`: Python client for interacting with the Superset API.
-- `exceptions.py`: Custom exception classes for structured error handling.
-- `models.py`: Pydantic models for validating configuration data.
-- `utils/`:
-- `fileio.py`: File system utilities (working with archives, parsing YAML).
-- `logger.py`: Logger configuration for uniform logging across the project.
-- `network.py`: HTTP client for network requests with authentication and retry handling.
-- `init_clients.py`: Utility for initializing Superset clients for different environments.
-- `dataset_mapper.py`: Dataset metadata mapping logic.
+- `backend/`: Server side, API, and plugin logic.
+- `frontend/`: Client side (SvelteKit application).
+- `specs/`: Feature specifications and implementation plans.
+- `docs/`: Additional documentation on mapping and plugin development.
-## Setup
+## Quick Start
 ### Requirements
 - Python 3.9+
-- `pip` for package management.
-- `keyring` for secure password storage.
-### Installation
-1. **Clone the repository:**
-```bash
-git clone https://prod.gitlab.dwh.rusal.com/dwh_bi/superset-tools.git
-cd superset-tools
-```
-2. **Install dependencies:**
-```bash
-pip install -r requirements.txt
-```
-(You may need to create `requirements.txt` with `pydantic`, `requests`, `keyring`, `PyYAML`, `urllib3`)
-3. **Configure passwords:**
-Use `keyring` to store the passwords of the Superset API users.
-```python
-import keyring
-keyring.set_password("system", "dev migrate", "password for migrate_user")
-keyring.set_password("system", "prod migrate", "password for migrate_user")
-keyring.set_password("system", "sandbox migrate", "password for migrate_user")
-```
-## Usage
-### Running the Project (Web UI)
-To start the backend and frontend servers with a single command:
+- Node.js 18+
+- Configured access to the Superset API
+### Launch
+To set up the environments automatically and start both servers (Backend & Frontend), use the script:
 ```bash
 ./run.sh
 ```
+*The script creates a Python virtual environment, installs the `pip` and `npm` dependencies, and starts the services.*
 Options:
-- `--skip-install`: Skip the dependency check and installation.
+- `--skip-install`: Skip dependency installation.
 - `--help`: Show help.
 Environment variables:
-- `BACKEND_PORT`: Port for the backend (default 8000).
-- `FRONTEND_PORT`: Port for the frontend (default 5173).
+- `BACKEND_PORT`: API port (default 8000).
+- `FRONTEND_PORT`: UI port (default 5173).
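As a rough illustration of the port settings above, the following sketch shows how a Python helper might resolve the same two variables with the documented defaults. The function name `resolve_ports` and the range check are assumptions; the README does not describe how `run.sh` itself reads these values.

```python
import os
from typing import Mapping, Tuple


def resolve_ports(env: Mapping[str, str] = os.environ) -> Tuple[int, int]:
    """Return (backend_port, frontend_port), falling back to the documented defaults."""
    backend = int(env.get("BACKEND_PORT", "8000"))
    frontend = int(env.get("FRONTEND_PORT", "5173"))
    for name, port in (("BACKEND_PORT", backend), ("FRONTEND_PORT", frontend)):
        # Assumed sanity check: reject values outside the valid TCP port range.
        if not 1 <= port <= 65535:
            raise ValueError(f"{name} must be between 1 and 65535, got {port}")
    return backend, frontend
```

Passing the environment as a parameter keeps the helper easy to test without modifying the real process environment.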
-### Backup Script (`backup_script.py`)
-To create backups of dashboards from the configured Superset environments:
-```bash
-python backup_script.py
-```
-By default, backups are saved to `P:\Superset\010 Бекапы\`. Logs are stored in `P:\Superset\010 Бекапы\Logs`.
-### Migration Script (`migration_script.py`)
-To transfer a specific dashboard:
-```bash
-python migration_script.py
-```
-### Search Script (`search_script.py`)
-To search for text patterns in Superset dataset metadata:
+## Development
+The project follows strict development rules:
+1. **Semantic Code Generation**: Uses the `semantic_protocol.md` protocol to ensure code reliability.
+2. **Design by Contract (DbC)**: Defines preconditions and postconditions for key functions.
+3. **Constitution**: Follows the rules described in the project constitution in the `.specify/` folder.
+### Useful Commands
+- **Backend**: `cd backend && .venv/bin/python3 -m uvicorn src.app:app --reload`
+- **Frontend**: `cd frontend && npm run dev`
+- **Tests**: `cd backend && .venv/bin/pytest`
+## Contact and Contributing
+To add new features or fix bugs, please read `docs/plugin_dev.md` and create a corresponding specification in `specs/`.
-```bash
-python search_script.py
-```
-The script uses regular expressions to search dataset fields such as SQL queries. Search results are written to the log and printed to the console.
-### Metadata Mapping Script (`run_mapper.py`)
-To update dataset metadata (for example, verbose names) in Superset:
-```bash
-python run_mapper.py --source <source_type> --dataset-id <dataset_id> [--table-name <table_name>] [--table-schema <table_schema>] [--excel-path <path_to_excel>] [--env <environment>]
-```
-If you use an XLSX file, it must contain two columns: column_name | verbose_name
-Parameters:
-- `--source`: Data source ('postgres', 'excel', or 'both').
-- `--dataset-id`: ID of the dataset to update.
-- `--table-name`: Table name for PostgreSQL.
-- `--table-schema`: Table schema for PostgreSQL.
-- `--excel-path`: Path to the Excel file.
-- `--env`: Superset environment ('dev', 'prod', etc.).
-Usage example:
-```bash
-python run_mapper.py --source postgres --dataset-id 123 --table-name account_debt --table-schema dm_view --env dev
-python run_mapper.py --source=excel --dataset-id=286 --excel-path=H:\dev\ss-tools\286_map.xlsx --env=dev
-```
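The `run_mapper.py` options listed above map naturally onto `argparse`. The sketch below is a reconstruction from the documented flags only, not the removed script's actual code: the `dev` default for `--env` and the cross-field checks (table options for a postgres source, a file path for an excel source) are assumptions.

```python
import argparse
from typing import List, Optional


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="run_mapper.py")
    parser.add_argument("--source", required=True, choices=["postgres", "excel", "both"])
    parser.add_argument("--dataset-id", required=True, type=int)
    parser.add_argument("--table-name")
    parser.add_argument("--table-schema")
    parser.add_argument("--excel-path")
    # Assumed default; the usage line only marks --env as optional.
    parser.add_argument("--env", default="dev")
    return parser


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    parser = build_parser()
    args = parser.parse_args(argv)
    # Assumed cross-field rules, inferred from the parameter descriptions.
    if args.source in ("postgres", "both") and not (args.table_name and args.table_schema):
        parser.error("--table-name and --table-schema are required for a postgres source")
    if args.source in ("excel", "both") and not args.excel_path:
        parser.error("--excel-path is required for an excel source")
    return args
```

Both invocation styles from the usage example, `--flag value` and `--flag=value`, are accepted by `argparse` without extra configuration.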
-## Logging
-Logs are written to a file in the `Logs` directory (for example, `P:\Superset\010 Бекапы\Logs` for backups) and printed to the console. The default logging level is `INFO`.
-## Development and Contributing
-- Follow the **Semantic Code Generation Protocol** (see `semantic_protocol.md`):
-- All definitions are wrapped in `[DEF]...[/DEF]`.
-- Contracts (`@PRE`, `@POST`) are defined BEFORE implementation.
-- Strict typing and immutability of architectural decisions.
-- Follow the project Constitution (`.specify/memory/constitution.md`).
-- Use `Pydantic` models for data validation.
-- Implement comprehensive error handling with custom exceptions.


@@ -1,269 +0,0 @@
2025-12-20 19:55:11,325 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 19:55:11,325 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 19:55:11,327 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 43, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 21:01:49,905 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 21:01:49,906 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 21:01:49,988 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 21:01:49,990 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 22:42:32,538 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 22:42:32,538 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 22:42:32,583 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 22:42:32,587 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 22:54:29,770 - INFO - [BackupPlugin][Entry] Starting backup for .
2025-12-20 22:54:29,771 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 22:54:29,831 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 22:54:29,833 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 22:54:34,078 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 22:54:34,078 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 22:54:34,079 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 22:54:34,079 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 22:59:25,060 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 22:59:25,060 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 22:59:25,114 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 22:59:25,117 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 23:00:31,156 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 23:00:31,156 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 23:00:31,157 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 23:00:31,162 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 23:00:34,710 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 23:00:34,710 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 23:00:34,710 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 23:00:34,711 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 23:01:43,894 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 23:01:43,894 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 23:01:43,895 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 23:01:43,895 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 23:04:07,731 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 23:04:07,731 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 23:04:07,732 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 23:04:07,732 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 23:06:39,641 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 23:06:39,642 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 23:06:39,687 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 23:06:39,689 - CRITICAL - [setup_clients][Failure] Critical error during client initialization: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
Traceback (most recent call last):
File "/home/user/ss-tools/superset_tool/utils/init_clients.py", line 66, in setup_clients
config = SupersetConfig(
^^^^^^^^^^^^^^^
File "/home/user/ss-tools/backend/.venv/lib/python3.12/site-packages/pydantic/main.py", line 250, in __init__
validated_self = self.__pydantic_validator__.validate_python(data, self_instance=self)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
pydantic_core._pydantic_core.ValidationError: 1 validation error for SupersetConfig
base_url
Value error, Invalid URL format: https://superset.bebesh.ru. Must include '/api/v1'. [type=value_error, input_value='https://superset.bebesh.ru', input_type=str]
For further information visit https://errors.pydantic.dev/2.12/v/value_error
2025-12-20 23:30:36,090 - INFO - [BackupPlugin][Entry] Starting backup for superset.
2025-12-20 23:30:36,093 - INFO - [setup_clients][Enter] Starting Superset clients initialization.
2025-12-20 23:30:36,128 - INFO - [setup_clients][Action] Loading environments from ConfigManager
2025-12-20 23:30:36,129 - INFO - [SupersetClient.__init__][Enter] Initializing SupersetClient.
2025-12-20 23:30:36,129 - INFO - [APIClient.__init__][Entry] Initializing APIClient.
2025-12-20 23:30:36,130 - WARNING - [_init_session][State] SSL verification disabled.
2025-12-20 23:30:36,130 - INFO - [APIClient.__init__][Exit] APIClient initialized.
2025-12-20 23:30:36,130 - INFO - [SupersetClient.__init__][Exit] SupersetClient initialized.
2025-12-20 23:30:36,130 - INFO - [get_dashboards][Enter] Fetching dashboards.
2025-12-20 23:30:36,131 - INFO - [authenticate][Enter] Authenticating to https://superset.bebesh.ru/api/v1
2025-12-20 23:30:36,897 - INFO - [authenticate][Exit] Authenticated successfully.
2025-12-20 23:30:37,527 - INFO - [get_dashboards][Exit] Found 11 dashboards.
2025-12-20 23:30:37,527 - INFO - [BackupPlugin][Progress] Found 11 dashboards to export in superset.
2025-12-20 23:30:37,529 - INFO - [export_dashboard][Enter] Exporting dashboard 11.
2025-12-20 23:30:38,224 - INFO - [export_dashboard][Exit] Exported dashboard 11 to dashboard_export_20251220T203037.zip.
2025-12-20 23:30:38,225 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:38,226 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/FCC New Coder Survey 2018/dashboard_export_20251220T203037.zip
2025-12-20 23:30:38,227 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/FCC New Coder Survey 2018
2025-12-20 23:30:38,230 - INFO - [export_dashboard][Enter] Exporting dashboard 10.
2025-12-20 23:30:38,438 - INFO - [export_dashboard][Exit] Exported dashboard 10 to dashboard_export_20251220T203038.zip.
2025-12-20 23:30:38,438 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:38,439 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/COVID Vaccine Dashboard/dashboard_export_20251220T203038.zip
2025-12-20 23:30:38,439 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/COVID Vaccine Dashboard
2025-12-20 23:30:38,440 - INFO - [export_dashboard][Enter] Exporting dashboard 9.
2025-12-20 23:30:38,853 - INFO - [export_dashboard][Exit] Exported dashboard 9 to dashboard_export_20251220T203038.zip.
2025-12-20 23:30:38,853 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:38,856 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/Sales Dashboard/dashboard_export_20251220T203038.zip
2025-12-20 23:30:38,856 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/Sales Dashboard
2025-12-20 23:30:38,858 - INFO - [export_dashboard][Enter] Exporting dashboard 8.
2025-12-20 23:30:38,939 - INFO - [export_dashboard][Exit] Exported dashboard 8 to dashboard_export_20251220T203038.zip.
2025-12-20 23:30:38,940 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:38,941 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/Unicode Test/dashboard_export_20251220T203038.zip
2025-12-20 23:30:38,941 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/Unicode Test
2025-12-20 23:30:38,942 - INFO - [export_dashboard][Enter] Exporting dashboard 7.
2025-12-20 23:30:39,148 - INFO - [export_dashboard][Exit] Exported dashboard 7 to dashboard_export_20251220T203038.zip.
2025-12-20 23:30:39,148 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:39,149 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/Video Game Sales/dashboard_export_20251220T203038.zip
2025-12-20 23:30:39,149 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/Video Game Sales
2025-12-20 23:30:39,150 - INFO - [export_dashboard][Enter] Exporting dashboard 6.
2025-12-20 23:30:39,689 - INFO - [export_dashboard][Exit] Exported dashboard 6 to dashboard_export_20251220T203039.zip.
2025-12-20 23:30:39,689 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:39,690 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/Featured Charts/dashboard_export_20251220T203039.zip
2025-12-20 23:30:39,691 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/Featured Charts
2025-12-20 23:30:39,692 - INFO - [export_dashboard][Enter] Exporting dashboard 5.
2025-12-20 23:30:39,960 - INFO - [export_dashboard][Exit] Exported dashboard 5 to dashboard_export_20251220T203039.zip.
2025-12-20 23:30:39,960 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:39,961 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/Slack Dashboard/dashboard_export_20251220T203039.zip
2025-12-20 23:30:39,961 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/Slack Dashboard
2025-12-20 23:30:39,962 - INFO - [export_dashboard][Enter] Exporting dashboard 4.
2025-12-20 23:30:40,196 - INFO - [export_dashboard][Exit] Exported dashboard 4 to dashboard_export_20251220T203039.zip.
2025-12-20 23:30:40,196 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:40,197 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/deck.gl Demo/dashboard_export_20251220T203039.zip
2025-12-20 23:30:40,197 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/deck.gl Demo
2025-12-20 23:30:40,198 - INFO - [export_dashboard][Enter] Exporting dashboard 3.
2025-12-20 23:30:40,745 - INFO - [export_dashboard][Exit] Exported dashboard 3 to dashboard_export_20251220T203040.zip.
2025-12-20 23:30:40,746 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:40,760 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/Misc Charts/dashboard_export_20251220T203040.zip
2025-12-20 23:30:40,761 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/Misc Charts
2025-12-20 23:30:40,762 - INFO - [export_dashboard][Enter] Exporting dashboard 2.
2025-12-20 23:30:40,928 - INFO - [export_dashboard][Exit] Exported dashboard 2 to dashboard_export_20251220T203040.zip.
2025-12-20 23:30:40,929 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:40,930 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/USA Births Names/dashboard_export_20251220T203040.zip
2025-12-20 23:30:40,931 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/USA Births Names
2025-12-20 23:30:40,932 - INFO - [export_dashboard][Enter] Exporting dashboard 1.
2025-12-20 23:30:41,582 - INFO - [export_dashboard][Exit] Exported dashboard 1 to dashboard_export_20251220T203040.zip.
2025-12-20 23:30:41,582 - INFO - [save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: False
2025-12-20 23:30:41,749 - INFO - [save_and_unpack_dashboard][State] Dashboard saved to: backups/SUPERSET/World Bank's Data/dashboard_export_20251220T203040.zip
2025-12-20 23:30:41,750 - INFO - [archive_exports][Enter] Managing archive in backups/SUPERSET/World Bank's Data
2025-12-20 23:30:41,752 - INFO - [consolidate_archive_folders][Enter] Consolidating archives in backups/SUPERSET
2025-12-20 23:30:41,753 - INFO - [remove_empty_directories][Enter] Starting cleanup of empty directories in backups/SUPERSET
2025-12-20 23:30:41,758 - INFO - [remove_empty_directories][Exit] Removed 0 empty directories.
2025-12-20 23:30:41,758 - INFO - [BackupPlugin][CoherenceCheck:Passed] Backup logic completed for superset.
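
The log lines above share one shape: a timestamp, a level, a `[function][stage]` tag pair, then free text. A minimal sketch of a parser for that shape is shown here; the regular expression and the `LogEntry` / `parse_line` names are assumptions inferred from this excerpt only, not part of the project.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Assumed line shape, inferred from the excerpt above:
#   2025-12-20 23:30:38,438 - INFO - [export_dashboard][Exit] Exported dashboard 10 ...
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<level>[A-Z]+) - "
    r"\[(?P<anchor>[^\]]+)\]\[(?P<stage>[^\]]+)\] "
    r"(?P<message>.*)$"
)


@dataclass
class LogEntry:
    timestamp: datetime
    level: str
    anchor: str   # first bracket, e.g. the function name
    stage: str    # second bracket, e.g. Enter / Exit / State
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Return a LogEntry, or None if the line does not follow the assumed shape."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return LogEntry(
        # %f accepts the three-digit millisecond field after the comma.
        timestamp=datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f"),
        level=match["level"],
        anchor=match["anchor"],
        stage=match["stage"],
        message=match["message"],
    )
```

Pairing each `[export_dashboard][Enter]` entry with the following `[export_dashboard][Exit]` entry and subtracting the timestamps would give a per-dashboard export duration.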


@@ -0,0 +1,44 @@
#!/usr/bin/env python3
# [DEF:backend.delete_running_tasks:Module]
# @PURPOSE: Script to delete tasks with RUNNING status from the database.
# @LAYER: Utility
# @SEMANTICS: maintenance, database, cleanup

from sqlalchemy.orm import Session

from src.core.database import TasksSessionLocal
from src.models.task import TaskRecord


# [DEF:delete_running_tasks:Function]
# @PURPOSE: Delete all tasks with RUNNING status from the database.
# @PRE: Database is accessible and TaskRecord model is defined.
# @POST: All tasks with status 'RUNNING' are removed from the database.
def delete_running_tasks():
    """Delete all tasks with RUNNING status from the database."""
    session: Session = TasksSessionLocal()
    try:
        # Find all task records with RUNNING status
        running_tasks = session.query(TaskRecord).filter(TaskRecord.status == "RUNNING").all()
        if not running_tasks:
            print("No RUNNING tasks found.")
            return
        print(f"Found {len(running_tasks)} RUNNING tasks:")
        for task in running_tasks:
            print(f"- Task ID: {task.id}, Type: {task.type}")
        # Delete the found tasks
        session.query(TaskRecord).filter(TaskRecord.status == "RUNNING").delete(synchronize_session=False)
        session.commit()
        print(f"Successfully deleted {len(running_tasks)} RUNNING tasks.")
    except Exception as e:
        session.rollback()
        print(f"Error deleting tasks: {e}")
    finally:
        session.close()
# [/DEF:delete_running_tasks:Function]


if __name__ == "__main__":
    delete_running_tasks()
# [/DEF:backend.delete_running_tasks:Module]
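
The script above applies the same `RUNNING` filter twice, once to list the tasks and once to delete them, so the printed count can drift from the deleted count if rows change in between. A rough sketch of the same cleanup as a single transactional statement follows. It uses the standard library's `sqlite3` rather than the project's SQLAlchemy session, and the `task_records` table, its columns, and the function names are hypothetical stand-ins for `TaskRecord`.

```python
import sqlite3


def delete_tasks_by_status(conn: sqlite3.Connection, status: str = "RUNNING") -> int:
    """Delete every row with the given status and return how many were removed."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        cursor = conn.execute(
            "DELETE FROM task_records WHERE status = ?", (status,)
        )
    # For a DELETE statement, rowcount is the number of rows removed.
    return cursor.rowcount


def demo() -> int:
    """Build a throwaway in-memory table and clean it up."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE task_records (id TEXT PRIMARY KEY, type TEXT, status TEXT)"
    )
    conn.executemany(
        "INSERT INTO task_records VALUES (?, ?, ?)",
        [
            ("t1", "backup", "RUNNING"),
            ("t2", "migration", "SUCCESS"),
            ("t3", "backup", "RUNNING"),
        ],
    )
    conn.commit()
    removed = delete_tasks_by_status(conn)
    conn.close()
    return removed
```

In the SQLAlchemy version, the value returned by `Query.delete()` is likewise the matched row count, so it could replace `len(running_tasks)` in the success message.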

1
backend/get_full_key.py Normal file

@@ -0,0 +1 @@
{"print(f'Length": {"else": "print('Provider not found')\ndb.close()"}}

1
backend/git_repos/12 Submodule

Submodule backend/git_repos/12 added at 57ab7e8679

58722
backend/logs/app.log.1 Normal file

File diff suppressed because it is too large

Binary file not shown.

BIN
backend/migrations.db Normal file

Binary file not shown.


@@ -1,14 +1,56 @@
fastapi annotated-doc==0.0.4
uvicorn annotated-types==0.7.0
pydantic anyio==4.12.0
authlib APScheduler==3.11.2
python-multipart attrs==25.4.0
starlette Authlib==1.6.6
jsonschema certifi==2025.11.12
requests cffi==2.0.0
keyring charset-normalizer==3.4.4
httpx click==8.3.1
PyYAML cryptography==46.0.3
websockets fastapi==0.126.0
rapidfuzz greenlet==3.3.0
sqlalchemy h11==0.16.0
httpcore==1.0.9
httpx==0.28.1
idna==3.11
jaraco.classes==3.4.0
jaraco.context==6.0.1
jaraco.functools==4.3.0
jeepney==0.9.0
jsonschema==4.25.1
jsonschema-specifications==2025.9.1
keyring==25.7.0
more-itertools==10.8.0
pycparser==2.23
pydantic==2.12.5
pydantic-settings
pydantic_core==2.41.5
python-multipart==0.0.21
PyYAML==6.0.3
passlib[bcrypt]
python-jose[cryptography]
PyJWT
RapidFuzz==3.14.3
referencing==0.37.0
requests==2.32.5
rpds-py==0.30.0
SecretStorage==3.5.0
SQLAlchemy==2.0.45
starlette==0.50.0
typing-inspection==0.4.2
typing_extensions==4.15.0
tzlocal==5.3.1
urllib3==2.6.2
uvicorn==0.38.0
websockets==15.0.1
pandas
psycopg2-binary
openpyxl
GitPython==3.1.44
itsdangerous
email-validator
openai
playwright
tenacity


@@ -1,52 +1,118 @@
# [DEF:AuthModule:Module] # [DEF:backend.src.api.auth:Module]
# @SEMANTICS: auth, authentication, adfs, oauth, middleware #
# @PURPOSE: Implements ADFS authentication using Authlib for FastAPI. It provides a dependency to protect endpoints. # @SEMANTICS: api, auth, routes, login, logout
# @LAYER: UI (API) # @PURPOSE: Authentication API endpoints.
# @RELATION: Used by API routers to protect endpoints that require authentication. # @LAYER: API
# @RELATION: USES -> backend.src.services.auth_service.AuthService
# @RELATION: USES -> backend.src.core.database.get_auth_db
#
# @INVARIANT: All auth endpoints must return consistent error codes.
from fastapi import Depends, HTTPException, status # [SECTION: IMPORTS]
from fastapi.security import OAuth2AuthorizationCodeBearer from fastapi import APIRouter, Depends, HTTPException, status
from authlib.integrations.starlette_client import OAuth from fastapi.security import OAuth2PasswordRequestForm
from starlette.config import Config from sqlalchemy.orm import Session
from ..core.database import get_auth_db
from ..services.auth_service import AuthService
from ..schemas.auth import Token, User as UserSchema
from ..dependencies import get_current_user
from ..core.auth.oauth import oauth, is_adfs_configured
from ..core.auth.logger import log_security_event
from ..core.logger import belief_scope
import starlette.requests
# [/SECTION]
# Placeholder for ADFS configuration. In a real app, this would come from a secure source. # [DEF:router:Variable]
# Create an in-memory .env file # @PURPOSE: APIRouter instance for authentication routes.
from io import StringIO router = APIRouter(prefix="/api/auth", tags=["auth"])
config_data = StringIO(""" # [/DEF:router:Variable]
ADFS_CLIENT_ID=your-client-id
ADFS_CLIENT_SECRET=your-client-secret
ADFS_SERVER_METADATA_URL=https://your-adfs-server/.well-known/openid-configuration
""")
config = Config(config_data)
oauth = OAuth(config)
oauth.register( # [DEF:login_for_access_token:Function]
name='adfs', # @PURPOSE: Authenticates a user and returns a JWT access token.
server_metadata_url=config('ADFS_SERVER_METADATA_URL'), # @PRE: form_data contains username and password.
client_kwargs={'scope': 'openid profile email'} # @POST: Returns a Token object on success.
) # @THROW: HTTPException 401 if authentication fails.
# @PARAM: form_data (OAuth2PasswordRequestForm) - Login credentials.
# @PARAM: db (Session) - Auth database session.
# @RETURN: Token - The generated JWT token.
@router.post("/login", response_model=Token)
async def login_for_access_token(
form_data: OAuth2PasswordRequestForm = Depends(),
db: Session = Depends(get_auth_db)
):
with belief_scope("api.auth.login"):
auth_service = AuthService(db)
user = auth_service.authenticate_user(form_data.username, form_data.password)
if not user:
log_security_event("LOGIN_FAILED", form_data.username, {"reason": "Invalid credentials"})
raise HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Incorrect username or password",
headers={"WWW-Authenticate": "Bearer"},
)
log_security_event("LOGIN_SUCCESS", user.username, {"source": "LOCAL"})
return auth_service.create_session(user)
# [/DEF:login_for_access_token:Function]
oauth2_scheme = OAuth2AuthorizationCodeBearer( # [DEF:read_users_me:Function]
authorizationUrl="https://your-adfs-server/adfs/oauth2/authorize", # @PURPOSE: Retrieves the profile of the currently authenticated user.
tokenUrl="https://your-adfs-server/adfs/oauth2/token", # @PRE: Valid JWT token provided.
) # @POST: Returns the current user's data.
# @PARAM: current_user (UserSchema) - The user extracted from the token.
# @RETURN: UserSchema - The current user profile.
@router.get("/me", response_model=UserSchema)
async def read_users_me(current_user: UserSchema = Depends(get_current_user)):
with belief_scope("api.auth.me"):
return current_user
# [/DEF:read_users_me:Function]
async def get_current_user(token: str = Depends(oauth2_scheme)): # [DEF:logout:Function]
""" # @PURPOSE: Logs out the current user (placeholder for session revocation).
Dependency to get the current user from the ADFS token. # @PRE: Valid JWT token provided.
This is a placeholder and needs to be fully implemented. # @POST: Returns success message.
""" @router.post("/logout")
# In a real implementation, you would: async def logout(current_user: UserSchema = Depends(get_current_user)):
# 1. Validate the token with ADFS. with belief_scope("api.auth.logout"):
# 2. Fetch user information. log_security_event("LOGOUT", current_user.username)
# 3. Create a user object. # In a stateless JWT setup, client-side token deletion is primary.
# For now, we'll just check if a token exists. # Server-side revocation (blacklisting) can be added here if needed.
if not token: return {"message": "Successfully logged out"}
raise HTTPException( # [/DEF:logout:Function]
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Not authenticated", # [DEF:login_adfs:Function]
headers={"WWW-Authenticate": "Bearer"}, # @PURPOSE: Initiates the ADFS OIDC login flow.
) # @POST: Redirects the user to ADFS.
# A real implementation would return a user object. @router.get("/login/adfs")
return {"placeholder_user": "user@example.com"} async def login_adfs(request: starlette.requests.Request):
# [/DEF] with belief_scope("api.auth.login_adfs"):
if not is_adfs_configured():
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="ADFS is not configured. Please set ADFS_CLIENT_ID, ADFS_CLIENT_SECRET, and ADFS_METADATA_URL environment variables."
)
redirect_uri = request.url_for('auth_callback_adfs')
return await oauth.adfs.authorize_redirect(request, str(redirect_uri))
# [/DEF:login_adfs:Function]
# [DEF:auth_callback_adfs:Function]
# @PURPOSE: Handles the callback from ADFS after successful authentication.
# @POST: Provisions user JIT and returns session token.
@router.get("/callback/adfs", name="auth_callback_adfs")
async def auth_callback_adfs(request: starlette.requests.Request, db: Session = Depends(get_auth_db)):
with belief_scope("api.auth.callback_adfs"):
if not is_adfs_configured():
raise HTTPException(
status_code=status.HTTP_503_SERVICE_UNAVAILABLE,
detail="ADFS is not configured. Please set ADFS_CLIENT_ID, ADFS_CLIENT_SECRET, and ADFS_METADATA_URL environment variables."
)
token = await oauth.adfs.authorize_access_token(request)
user_info = token.get('userinfo')
if not user_info:
raise HTTPException(status_code=400, detail="Failed to retrieve user info from ADFS")
auth_service = AuthService(db)
user = auth_service.provision_adfs_user(user_info)
return auth_service.create_session(user)
# [/DEF:auth_callback_adfs:Function]
# [/DEF:backend.src.api.auth:Module]
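
The `logout` handler above notes that server-side revocation (blacklisting) could be added on top of stateless JWTs. One possible shape for that is sketched below, assuming each token carries a unique id (`jti`) and an expiry time. `TokenDenylist` and its methods are hypothetical names, not part of this codebase, and the store is in-memory only.

```python
import time
from typing import Callable, Dict


class TokenDenylist:
    """In-memory denylist of revoked token ids; entries expire with the token."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock                   # injectable so tests can control time
        self._revoked: Dict[str, float] = {}  # jti -> token expiry (epoch seconds)

    def revoke(self, jti: str, expires_at: float) -> None:
        """Remember a token id until the token would have expired anyway."""
        self._purge()
        if expires_at > self._clock():
            self._revoked[jti] = expires_at

    def is_revoked(self, jti: str) -> bool:
        """Return True while the token id is on the list and not yet expired."""
        self._purge()
        return jti in self._revoked

    def _purge(self) -> None:
        # Drop entries whose tokens have expired; they are rejected by the
        # normal expiry check, so keeping them would only waste memory.
        now = self._clock()
        for jti in [j for j, exp in self._revoked.items() if exp <= now]:
            del self._revoked[jti]
```

A per-process dictionary like this does not survive restarts and is not shared between workers; a deployment with several workers would presumably need a shared store instead.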


@@ -1 +1 @@
from . import plugins, tasks, settings from . import plugins, tasks, settings, connections, environments, mappings, migration, git, storage, admin


@@ -0,0 +1,310 @@
# [DEF:backend.src.api.routes.admin:Module]
#
# @TIER: STANDARD
# @SEMANTICS: api, admin, users, roles, permissions
# @PURPOSE: Admin API endpoints for user and role management.
# @LAYER: API
# @RELATION: USES -> backend.src.core.auth.repository.AuthRepository
# @RELATION: USES -> backend.src.dependencies.has_permission
#
# @INVARIANT: All endpoints in this module require 'Admin' role or 'admin' scope.
# [SECTION: IMPORTS]
from typing import List
from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy.orm import Session
from ...core.database import get_auth_db
from ...core.auth.repository import AuthRepository
from ...core.auth.security import get_password_hash
from ...schemas.auth import (
    User as UserSchema, UserCreate, UserUpdate,
    RoleSchema, RoleCreate, RoleUpdate, PermissionSchema,
    ADGroupMappingSchema, ADGroupMappingCreate
)
from ...models.auth import User, Role, Permission, ADGroupMapping
from ...dependencies import has_permission, get_current_user
from ...core.logger import logger, belief_scope
# [/SECTION]
# [DEF:router:Variable]
# @PURPOSE: APIRouter instance for admin routes.
router = APIRouter(prefix="/api/admin", tags=["admin"])
# [/DEF:router:Variable]
# [DEF:list_users:Function]
# @PURPOSE: Lists all registered users.
# @PRE: Current user has 'Admin' role.
# @POST: Returns a list of UserSchema objects.
# @PARAM: db (Session) - Auth database session.
# @RETURN: List[UserSchema] - List of users.
@router.get("/users", response_model=List[UserSchema])
async def list_users(
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:users", "READ"))
):
    with belief_scope("api.admin.list_users"):
        users = db.query(User).all()
        return users
# [/DEF:list_users:Function]

# [DEF:create_user:Function]
# @PURPOSE: Creates a new local user.
# @PRE: Current user has 'Admin' role.
# @POST: New user is created in the database.
# @PARAM: user_in (UserCreate) - New user data.
# @PARAM: db (Session) - Auth database session.
# @RETURN: UserSchema - The created user.
@router.post("/users", response_model=UserSchema, status_code=status.HTTP_201_CREATED)
async def create_user(
    user_in: UserCreate,
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:users", "WRITE"))
):
    with belief_scope("api.admin.create_user"):
        repo = AuthRepository(db)
        if repo.get_user_by_username(user_in.username):
            raise HTTPException(status_code=400, detail="Username already exists")
        new_user = User(
            username=user_in.username,
            email=user_in.email,
            password_hash=get_password_hash(user_in.password),
            auth_source="LOCAL",
            is_active=user_in.is_active
        )
        for role_name in user_in.roles:
            role = repo.get_role_by_name(role_name)
            if role:
                new_user.roles.append(role)
        db.add(new_user)
        db.commit()
        db.refresh(new_user)
        return new_user
# [/DEF:create_user:Function]

# [DEF:update_user:Function]
# @PURPOSE: Updates an existing user.
@router.put("/users/{user_id}", response_model=UserSchema)
async def update_user(
    user_id: str,
    user_in: UserUpdate,
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:users", "WRITE"))
):
    with belief_scope("api.admin.update_user"):
        repo = AuthRepository(db)
        user = repo.get_user_by_id(user_id)
        if not user:
            raise HTTPException(status_code=404, detail="User not found")
        if user_in.email is not None:
            user.email = user_in.email
        if user_in.is_active is not None:
            user.is_active = user_in.is_active
        if user_in.password is not None:
            user.password_hash = get_password_hash(user_in.password)
        if user_in.roles is not None:
            user.roles = []
            for role_name in user_in.roles:
                role = repo.get_role_by_name(role_name)
                if role:
                    user.roles.append(role)
        db.commit()
        db.refresh(user)
        return user
# [/DEF:update_user:Function]

# [DEF:delete_user:Function]
# @PURPOSE: Deletes a user.
@router.delete("/users/{user_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_user(
    user_id: str,
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:users", "WRITE"))
):
    with belief_scope("api.admin.delete_user"):
        logger.info(f"[DEBUG] Attempting to delete user context={{'user_id': '{user_id}'}}")
        repo = AuthRepository(db)
        user = repo.get_user_by_id(user_id)
        if not user:
            logger.warning(f"[DEBUG] User not found for deletion context={{'user_id': '{user_id}'}}")
            raise HTTPException(status_code=404, detail="User not found")
        logger.info(f"[DEBUG] Found user to delete context={{'username': '{user.username}'}}")
        db.delete(user)
        db.commit()
        logger.info(f"[DEBUG] Successfully deleted user context={{'user_id': '{user_id}'}}")
        return None
# [/DEF:delete_user:Function]
# [DEF:list_roles:Function]
# @PURPOSE: Lists all available roles.
# @RETURN: List[RoleSchema] - List of roles.
# @RELATION: CALLS -> backend.src.models.auth.Role
@router.get("/roles", response_model=List[RoleSchema])
async def list_roles(
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:roles", "READ"))
):
    with belief_scope("api.admin.list_roles"):
        return db.query(Role).all()
# [/DEF:list_roles:Function]

# [DEF:create_role:Function]
# @PURPOSE: Creates a new system role with associated permissions.
# @PRE: Role name must be unique.
# @POST: New Role record is created in auth.db.
# @PARAM: role_in (RoleCreate) - New role data.
# @PARAM: db (Session) - Auth database session.
# @RETURN: RoleSchema - The created role.
# @SIDE_EFFECT: Commits new role and associations to auth.db.
# @RELATION: CALLS -> backend.src.core.auth.repository.AuthRepository.get_permission_by_id
@router.post("/roles", response_model=RoleSchema, status_code=status.HTTP_201_CREATED)
async def create_role(
    role_in: RoleCreate,
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:roles", "WRITE"))
):
    with belief_scope("api.admin.create_role"):
        if db.query(Role).filter(Role.name == role_in.name).first():
            raise HTTPException(status_code=400, detail="Role already exists")
        new_role = Role(name=role_in.name, description=role_in.description)
        repo = AuthRepository(db)
        for perm_id_or_str in role_in.permissions:
            perm = repo.get_permission_by_id(perm_id_or_str)
            if not perm and ":" in perm_id_or_str:
                res, act = perm_id_or_str.split(":", 1)
                perm = repo.get_permission_by_resource_action(res, act)
            if perm:
                new_role.permissions.append(perm)
        db.add(new_role)
        db.commit()
        db.refresh(new_role)
        return new_role
# [/DEF:create_role:Function]

# [DEF:update_role:Function]
# @PURPOSE: Updates an existing role's metadata and permissions.
# @PRE: role_id must be a valid existing role UUID.
# @POST: Role record is updated in auth.db.
# @PARAM: role_id (str) - Target role identifier.
# @PARAM: role_in (RoleUpdate) - Updated role data.
# @PARAM: db (Session) - Auth database session.
# @RETURN: RoleSchema - The updated role.
# @SIDE_EFFECT: Commits updates to auth.db.
# @RELATION: CALLS -> backend.src.core.auth.repository.AuthRepository.get_role_by_id
@router.put("/roles/{role_id}", response_model=RoleSchema)
async def update_role(
    role_id: str,
    role_in: RoleUpdate,
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:roles", "WRITE"))
):
    with belief_scope("api.admin.update_role"):
        repo = AuthRepository(db)
        role = repo.get_role_by_id(role_id)
        if not role:
            raise HTTPException(status_code=404, detail="Role not found")
        if role_in.name is not None:
            role.name = role_in.name
        if role_in.description is not None:
            role.description = role_in.description
        if role_in.permissions is not None:
            role.permissions = []
            for perm_id_or_str in role_in.permissions:
                perm = repo.get_permission_by_id(perm_id_or_str)
                if not perm and ":" in perm_id_or_str:
                    res, act = perm_id_or_str.split(":", 1)
                    perm = repo.get_permission_by_resource_action(res, act)
                if perm:
                    role.permissions.append(perm)
        db.commit()
        db.refresh(role)
        return role
# [/DEF:update_role:Function]

# [DEF:delete_role:Function]
# @PURPOSE: Removes a role from the system.
# @PRE: role_id must be a valid existing role UUID.
# @POST: Role record is removed from auth.db.
# @PARAM: role_id (str) - Target role identifier.
# @PARAM: db (Session) - Auth database session.
# @RETURN: None
# @SIDE_EFFECT: Deletes record from auth.db and commits.
# @RELATION: CALLS -> backend.src.core.auth.repository.AuthRepository.get_role_by_id
@router.delete("/roles/{role_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_role(
    role_id: str,
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:roles", "WRITE"))
):
    with belief_scope("api.admin.delete_role"):
        repo = AuthRepository(db)
        role = repo.get_role_by_id(role_id)
        if not role:
            raise HTTPException(status_code=404, detail="Role not found")
        db.delete(role)
        db.commit()
        return None
# [/DEF:delete_role:Function]

# [DEF:list_permissions:Function]
# @PURPOSE: Lists all available system permissions for assignment.
# @POST: Returns a list of all PermissionSchema objects.
# @PARAM: db (Session) - Auth database session.
# @RETURN: List[PermissionSchema] - List of permissions.
# @RELATION: CALLS -> backend.src.core.auth.repository.AuthRepository.list_permissions
@router.get("/permissions", response_model=List[PermissionSchema])
async def list_permissions(
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:roles", "READ"))
):
    with belief_scope("api.admin.list_permissions"):
        repo = AuthRepository(db)
        return repo.list_permissions()
# [/DEF:list_permissions:Function]
# [DEF:list_ad_mappings:Function]
# @PURPOSE: Lists all AD Group to Role mappings.
@router.get("/ad-mappings", response_model=List[ADGroupMappingSchema])
async def list_ad_mappings(
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:settings", "READ"))
):
    with belief_scope("api.admin.list_ad_mappings"):
        return db.query(ADGroupMapping).all()
# [/DEF:list_ad_mappings:Function]

# [DEF:create_ad_mapping:Function]
# @PURPOSE: Creates a new AD Group mapping.
@router.post("/ad-mappings", response_model=ADGroupMappingSchema)
async def create_ad_mapping(
    mapping_in: ADGroupMappingCreate,
    db: Session = Depends(get_auth_db),
    _ = Depends(has_permission("admin:settings", "WRITE"))
):
    with belief_scope("api.admin.create_ad_mapping"):
        new_mapping = ADGroupMapping(
            ad_group=mapping_in.ad_group,
            role_id=mapping_in.role_id
        )
        db.add(new_mapping)
        db.commit()
        db.refresh(new_mapping)
        return new_mapping
# [/DEF:create_ad_mapping:Function]
# [/DEF:backend.src.api.routes.admin:Module]
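
`create_role` and `update_role` above repeat the same lookup: try the reference as a permission id, then fall back to splitting it into a resource and an action. A minimal sketch of that lookup as one shared helper follows. `resolve_permissions` and `InMemoryPermissionRepo` are hypothetical names; only `get_permission_by_id` and `get_permission_by_resource_action` come from the diff. The sketch also assumes references look like `admin:users:READ`, which is a guess based on the `has_permission("admin:users", "READ")` calls; under that assumption the split has to happen at the last colon (`rsplit`), whereas the handlers split at the first.

```python
from typing import Dict, Iterable, List, Optional, Tuple


class InMemoryPermissionRepo:
    """Hypothetical stand-in for AuthRepository, just enough for the example."""

    def __init__(
        self,
        by_id: Dict[str, str],
        by_pair: Dict[Tuple[str, str], str],
    ) -> None:
        self._by_id = by_id
        self._by_pair = by_pair

    def get_permission_by_id(self, perm_id: str) -> Optional[str]:
        return self._by_id.get(perm_id)

    def get_permission_by_resource_action(self, resource: str, action: str) -> Optional[str]:
        return self._by_pair.get((resource, action))


def resolve_permissions(repo, refs: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Resolve each reference by id, then by 'resource:ACTION'.

    Returns (found, unknown) so the caller can decide whether unknown
    references should be ignored or reported.
    """
    found: List[str] = []
    unknown: List[str] = []
    for ref in refs:
        perm = repo.get_permission_by_id(ref)
        if perm is None and ":" in ref:
            # Assumption: the action is the last colon-separated part, so a
            # resource such as 'admin:users' stays intact.
            resource, action = ref.rsplit(":", 1)
            perm = repo.get_permission_by_resource_action(resource, action)
        if perm is None:
            unknown.append(ref)
        else:
            found.append(perm)
    return found, unknown
```

Returning the unresolved references separately is a design choice of this sketch: the handlers in the diff drop them silently, while a caller of this helper could answer with a 400 instead.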


@@ -0,0 +1,100 @@
# [DEF:ConnectionsRouter:Module]
# @SEMANTICS: api, router, connections, database
# @PURPOSE: Defines the FastAPI router for managing external database connections.
# @LAYER: UI (API)
# @RELATION: Depends on SQLAlchemy session.
# @CONSTRAINT: Must use belief_scope for logging.
# [SECTION: IMPORTS]
from typing import List, Optional
from fastapi import APIRouter, Depends, HTTPException, status
from sqlalchemy.orm import Session
from ...core.database import get_db
from ...models.connection import ConnectionConfig
from pydantic import BaseModel, Field
from datetime import datetime
from ...core.logger import logger, belief_scope
# [/SECTION]
router = APIRouter()
# [DEF:ConnectionSchema:Class]
# @PURPOSE: Pydantic model for connection response.
class ConnectionSchema(BaseModel):
    id: str
    name: str
    type: str
    host: Optional[str] = None
    port: Optional[int] = None
    database: Optional[str] = None
    username: Optional[str] = None
    created_at: datetime

    # Pydantic v2 name for `orm_mode` (requirements pin pydantic==2.12.5).
    model_config = {"from_attributes": True}
# [/DEF:ConnectionSchema:Class]

# [DEF:ConnectionCreate:Class]
# @PURPOSE: Pydantic model for creating a connection.
class ConnectionCreate(BaseModel):
    name: str
    type: str
    host: Optional[str] = None
    port: Optional[int] = None
    database: Optional[str] = None
    username: Optional[str] = None
    password: Optional[str] = None
# [/DEF:ConnectionCreate:Class]
# [DEF:list_connections:Function]
# @PURPOSE: Lists all saved connections.
# @PRE: Database session is active.
# @POST: Returns list of connection configs.
# @PARAM: db (Session) - Database session.
# @RETURN: List[ConnectionSchema] - List of connections.
@router.get("", response_model=List[ConnectionSchema])
async def list_connections(db: Session = Depends(get_db)):
    with belief_scope("ConnectionsRouter.list_connections"):
        connections = db.query(ConnectionConfig).all()
        return connections
# [/DEF:list_connections:Function]

# [DEF:create_connection:Function]
# @PURPOSE: Creates a new connection configuration.
# @PRE: Connection name is unique.
# @POST: Connection is saved to DB.
# @PARAM: connection (ConnectionCreate) - Config data.
# @PARAM: db (Session) - Database session.
# @RETURN: ConnectionSchema - Created connection.
@router.post("", response_model=ConnectionSchema, status_code=status.HTTP_201_CREATED)
async def create_connection(connection: ConnectionCreate, db: Session = Depends(get_db)):
    with belief_scope("ConnectionsRouter.create_connection", f"name={connection.name}"):
        # model_dump() is the Pydantic v2 replacement for the deprecated .dict().
        db_connection = ConnectionConfig(**connection.model_dump())
        db.add(db_connection)
        db.commit()
        db.refresh(db_connection)
        logger.info(f"[ConnectionsRouter.create_connection][Success] Created connection {db_connection.id}")
        return db_connection
# [/DEF:create_connection:Function]

# [DEF:delete_connection:Function]
# @PURPOSE: Deletes a connection configuration.
# @PRE: Connection ID exists.
# @POST: Connection is removed from DB.
# @PARAM: connection_id (str) - ID to delete.
# @PARAM: db (Session) - Database session.
# @RETURN: None.
@router.delete("/{connection_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_connection(connection_id: str, db: Session = Depends(get_db)):
    with belief_scope("ConnectionsRouter.delete_connection", f"id={connection_id}"):
        db_connection = db.query(ConnectionConfig).filter(ConnectionConfig.id == connection_id).first()
        if not db_connection:
            logger.error(f"[ConnectionsRouter.delete_connection][State] Connection {connection_id} not found")
            raise HTTPException(status_code=404, detail="Connection not found")
        db.delete(db_connection)
        db.commit()
        logger.info(f"[ConnectionsRouter.delete_connection][Success] Deleted connection {connection_id}")
        return
# [/DEF:delete_connection:Function]
# [/DEF:ConnectionsRouter:Module]
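
`create_connection` above declares `@PRE: Connection name is unique`, but the handler as shown does not check it before inserting. A small sketch of what enforcing that precondition could look like follows, using a plain in-memory store so it runs without a database. `ConnectionStore` and `DuplicateNameError` are hypothetical names used only for illustration.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


class DuplicateNameError(ValueError):
    """Raised when a connection name is already taken."""


@dataclass
class ConnectionStore:
    """In-memory stand-in for the connections table (name -> generated id)."""

    _ids_by_name: Dict[str, str] = field(default_factory=dict)

    def create(self, name: str) -> str:
        # Enforce the stated precondition explicitly instead of trusting callers.
        if name in self._ids_by_name:
            raise DuplicateNameError(f"Connection '{name}' already exists")
        connection_id = str(uuid.uuid4())
        self._ids_by_name[name] = connection_id
        return connection_id
```

In the router, the equivalent would presumably be a lookup by name before `db.add(...)`, answered with an `HTTPException` (409 Conflict would be one reasonable choice) when the name is taken.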


@@ -11,65 +11,120 @@
# [SECTION: IMPORTS] # [SECTION: IMPORTS]
from fastapi import APIRouter, Depends, HTTPException from fastapi import APIRouter, Depends, HTTPException
from typing import List, Dict, Optional from typing import List, Dict, Optional
from backend.src.dependencies import get_config_manager from ...dependencies import get_config_manager, get_scheduler_service, has_permission
from backend.src.core.superset_client import SupersetClient from ...core.superset_client import SupersetClient
from superset_tool.models import SupersetConfig from pydantic import BaseModel, Field
from pydantic import BaseModel from ...core.config_models import Environment as EnvModel
from ...core.logger import belief_scope
# [/SECTION] # [/SECTION]
router = APIRouter(prefix="/api/environments", tags=["environments"]) router = APIRouter()
# [DEF:ScheduleSchema:DataClass]
class ScheduleSchema(BaseModel):
enabled: bool = False
cron_expression: str = Field(..., pattern=r'^(?:@(annually|yearly|monthly|weekly|daily|hourly|reboot)|(((\d+,)*\d+|(\d+(\/|-)\d+)|\d+|\*) ?){4,6})$')  # ^ and $ now wrap both alternatives; previously ^ bound only the @-form and $ only the field form
# [/DEF:ScheduleSchema:DataClass]
# [DEF:EnvironmentResponse:DataClass] # [DEF:EnvironmentResponse:DataClass]
class EnvironmentResponse(BaseModel): class EnvironmentResponse(BaseModel):
id: str id: str
name: str name: str
url: str url: str
# [/DEF:EnvironmentResponse] backup_schedule: Optional[ScheduleSchema] = None
# [/DEF:EnvironmentResponse:DataClass]
# [DEF:DatabaseResponse:DataClass] # [DEF:DatabaseResponse:DataClass]
class DatabaseResponse(BaseModel): class DatabaseResponse(BaseModel):
uuid: str uuid: str
 database_name: str
 engine: Optional[str]
-# [/DEF:DatabaseResponse]
+# [/DEF:DatabaseResponse:DataClass]
 # [DEF:get_environments:Function]
 # @PURPOSE: List all configured environments.
+# @PRE: config_manager is injected via Depends.
+# @POST: Returns a list of EnvironmentResponse objects.
 # @RETURN: List[EnvironmentResponse]
 @router.get("", response_model=List[EnvironmentResponse])
-async def get_environments(config_manager=Depends(get_config_manager)):
-envs = config_manager.get_environments()
-return [EnvironmentResponse(id=e.id, name=e.name, url=e.url) for e in envs]
-# [/DEF:get_environments]
+async def get_environments(
+config_manager=Depends(get_config_manager),
+_ = Depends(has_permission("environments", "READ"))
+):
+with belief_scope("get_environments"):
+envs = config_manager.get_environments()
+# Ensure envs is a list
+if not isinstance(envs, list):
+envs = []
+return [
+EnvironmentResponse(
+id=e.id,
+name=e.name,
+url=e.url,
+backup_schedule=ScheduleSchema(
+enabled=e.backup_schedule.enabled,
+cron_expression=e.backup_schedule.cron_expression
+) if getattr(e, 'backup_schedule', None) else None
+) for e in envs
+]
+# [/DEF:get_environments:Function]
+# [DEF:update_environment_schedule:Function]
+# @PURPOSE: Update backup schedule for an environment.
+# @PRE: Environment id exists, schedule is valid ScheduleSchema.
+# @POST: Backup schedule updated and scheduler reloaded.
+# @PARAM: id (str) - The environment ID.
+# @PARAM: schedule (ScheduleSchema) - The new schedule.
+@router.put("/{id}/schedule")
+async def update_environment_schedule(
+id: str,
+schedule: ScheduleSchema,
+config_manager=Depends(get_config_manager),
+scheduler_service=Depends(get_scheduler_service),
+_ = Depends(has_permission("admin:settings", "WRITE"))
+):
+with belief_scope("update_environment_schedule", f"id={id}"):
+envs = config_manager.get_environments()
+env = next((e for e in envs if e.id == id), None)
+if not env:
+raise HTTPException(status_code=404, detail="Environment not found")
+# Update environment config
+env.backup_schedule.enabled = schedule.enabled
+env.backup_schedule.cron_expression = schedule.cron_expression
+config_manager.update_environment(id, env)
+# Refresh scheduler
+scheduler_service.load_schedules()
+return {"message": "Schedule updated successfully"}
+# [/DEF:update_environment_schedule:Function]
 # [DEF:get_environment_databases:Function]
 # @PURPOSE: Fetch the list of databases from a specific environment.
+# @PRE: Environment id exists.
+# @POST: Returns a list of database summaries from the environment.
 # @PARAM: id (str) - The environment ID.
 # @RETURN: List[Dict] - List of databases.
 @router.get("/{id}/databases")
-async def get_environment_databases(id: str, config_manager=Depends(get_config_manager)):
-envs = config_manager.get_environments()
+async def get_environment_databases(
+id: str,
+config_manager=Depends(get_config_manager),
+_ = Depends(has_permission("admin:settings", "READ"))
+):
+with belief_scope("get_environment_databases", f"id={id}"):
+envs = config_manager.get_environments()
 env = next((e for e in envs if e.id == id), None)
 if not env:
 raise HTTPException(status_code=404, detail="Environment not found")
 try:
 # Initialize SupersetClient from environment config
-# Note: We need to map Environment model to SupersetConfig
-superset_config = SupersetConfig(
-env=env.name,
-base_url=env.url,
-auth={
-"provider": "db", # Defaulting to db provider
-"username": env.username,
-"password": env.password,
-"refresh": "false"
-}
-)
-client = SupersetClient(superset_config)
+client = SupersetClient(env)
 return client.get_databases_summary()
 except Exception as e:
 raise HTTPException(status_code=500, detail=f"Failed to fetch databases: {str(e)}")
-# [/DEF:get_environment_databases]
-# [/DEF:backend.src.api.routes.environments]
+# [/DEF:get_environment_databases:Function]
+# [/DEF:backend.src.api.routes.environments:Module]
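
Review note: `update_environment_schedule` looks the environment up with `next(...)` and then writes to `env.backup_schedule` directly, whereas `get_environments` allows for a missing `backup_schedule`. The sketch below shows the same lookup-and-update flow with that case handled. It is only an illustration: the `Environment` and `Schedule` dataclasses and the `apply_schedule` helper are hypothetical stand-ins, not the project's models, and `LookupError` stands in for the 404 response.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Schedule:  # hypothetical stand-in for ScheduleSchema
    enabled: bool = False
    cron_expression: Optional[str] = None


@dataclass
class Environment:  # hypothetical stand-in for the configured environment model
    id: str
    name: str
    backup_schedule: Optional[Schedule] = None


def apply_schedule(envs: List[Environment], env_id: str, new: Schedule) -> Environment:
    """Find an environment by id and overwrite its backup schedule."""
    env = next((e for e in envs if e.id == env_id), None)
    if env is None:
        # The route would translate this into a 404 response.
        raise LookupError(f"Environment not found: {env_id}")
    if env.backup_schedule is None:
        # Assumed case: a config entry with no schedule yet; create one first.
        env.backup_schedule = Schedule()
    env.backup_schedule.enabled = new.enabled
    env.backup_schedule.cron_expression = new.cron_expression
    return env


envs = [Environment(id="dev", name="Development")]
updated = apply_schedule(envs, "dev", Schedule(enabled=True, cron_expression="0 2 * * *"))
print(updated.backup_schedule)
```

Whether the real environment model can actually carry a missing schedule is not visible in this diff, so the guard is a precaution rather than a confirmed fix.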

@@ -0,0 +1,455 @@
# [DEF:backend.src.api.routes.git:Module]
#
# @SEMANTICS: git, routes, api, fastapi, repository, deployment
# @PURPOSE: Provides FastAPI endpoints for Git integration operations.
# @LAYER: API
# @RELATION: USES -> src.services.git_service.GitService
# @RELATION: USES -> src.api.routes.git_schemas
# @RELATION: USES -> src.models.git
#
# @INVARIANT: All Git operations must be routed through GitService.
from fastapi import APIRouter, Depends, HTTPException
from sqlalchemy.orm import Session
from typing import List, Optional
import typing
from src.dependencies import get_config_manager, has_permission
from src.core.database import get_db
from src.models.git import GitServerConfig, GitStatus, DeploymentEnvironment, GitRepository
from src.api.routes.git_schemas import (
GitServerConfigSchema, GitServerConfigCreate,
GitRepositorySchema, BranchSchema, BranchCreate,
BranchCheckout, CommitSchema, CommitCreate,
DeploymentEnvironmentSchema, DeployRequest, RepoInitRequest
)
from src.services.git_service import GitService
from src.core.logger import logger, belief_scope
router = APIRouter(prefix="/api/git", tags=["git"])
git_service = GitService()
# [DEF:get_git_configs:Function]
# @PURPOSE: List all configured Git servers.
# @PRE: Database session `db` is available.
# @POST: Returns a list of all GitServerConfig objects from the database.
# @RETURN: List[GitServerConfigSchema]
@router.get("/config", response_model=List[GitServerConfigSchema])
async def get_git_configs(
db: Session = Depends(get_db),
_ = Depends(has_permission("admin:settings", "READ"))
):
with belief_scope("get_git_configs"):
return db.query(GitServerConfig).all()
# [/DEF:get_git_configs:Function]
# [DEF:create_git_config:Function]
# @PURPOSE: Register a new Git server configuration.
# @PRE: `config` contains valid GitServerConfigCreate data.
# @POST: A new GitServerConfig record is created in the database.
# @PARAM: config (GitServerConfigCreate)
# @RETURN: GitServerConfigSchema
@router.post("/config", response_model=GitServerConfigSchema)
async def create_git_config(
config: GitServerConfigCreate,
db: Session = Depends(get_db),
_ = Depends(has_permission("admin:settings", "WRITE"))
):
with belief_scope("create_git_config"):
db_config = GitServerConfig(**config.model_dump())
db.add(db_config)
db.commit()
db.refresh(db_config)
return db_config
# [/DEF:create_git_config:Function]
# [DEF:delete_git_config:Function]
# @PURPOSE: Remove a Git server configuration.
# @PRE: `config_id` corresponds to an existing configuration.
# @POST: The configuration record is removed from the database.
# @PARAM: config_id (str)
@router.delete("/config/{config_id}")
async def delete_git_config(
config_id: str,
db: Session = Depends(get_db),
_ = Depends(has_permission("admin:settings", "WRITE"))
):
with belief_scope("delete_git_config"):
db_config = db.query(GitServerConfig).filter(GitServerConfig.id == config_id).first()
if not db_config:
raise HTTPException(status_code=404, detail="Configuration not found")
db.delete(db_config)
db.commit()
return {"status": "success", "message": "Configuration deleted"}
# [/DEF:delete_git_config:Function]
# [DEF:test_git_config:Function]
# @PURPOSE: Validate connection to a Git server using provided credentials.
# @PRE: `config` contains provider, url, and pat.
# @POST: Returns success if the connection is validated via GitService.
# @PARAM: config (GitServerConfigCreate)
@router.post("/config/test")
async def test_git_config(
config: GitServerConfigCreate,
_ = Depends(has_permission("admin:settings", "READ"))
):
with belief_scope("test_git_config"):
success = await git_service.test_connection(config.provider, config.url, config.pat)
if success:
return {"status": "success", "message": "Connection successful"}
else:
raise HTTPException(status_code=400, detail="Connection failed")
# [/DEF:test_git_config:Function]
# [DEF:init_repository:Function]
# @PURPOSE: Link a dashboard to a Git repository and perform initial clone/init.
# @PRE: `dashboard_id` exists and `init_data` contains valid config_id and remote_url.
# @POST: Repository is initialized on disk and a GitRepository record is saved in DB.
# @PARAM: dashboard_id (int)
# @PARAM: init_data (RepoInitRequest)
@router.post("/repositories/{dashboard_id}/init")
async def init_repository(
dashboard_id: int,
init_data: RepoInitRequest,
db: Session = Depends(get_db),
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("init_repository"):
# 1. Get config
config = db.query(GitServerConfig).filter(GitServerConfig.id == init_data.config_id).first()
if not config:
raise HTTPException(status_code=404, detail="Git configuration not found")
try:
# 2. Perform Git clone/init
logger.info(f"[init_repository][Action] Initializing repo for dashboard {dashboard_id}")
git_service.init_repo(dashboard_id, init_data.remote_url, config.pat)
# 3. Save to DB
repo_path = git_service._get_repo_path(dashboard_id)
db_repo = db.query(GitRepository).filter(GitRepository.dashboard_id == dashboard_id).first()
if not db_repo:
db_repo = GitRepository(
dashboard_id=dashboard_id,
config_id=config.id,
remote_url=init_data.remote_url,
local_path=repo_path
)
db.add(db_repo)
else:
db_repo.config_id = config.id
db_repo.remote_url = init_data.remote_url
db_repo.local_path = repo_path
db.commit()
logger.info(f"[init_repository][Coherence:OK] Repository initialized for dashboard {dashboard_id}")
return {"status": "success", "message": "Repository initialized"}
except Exception as e:
db.rollback()
logger.error(f"[init_repository][Coherence:Failed] Failed to init repository: {e}")
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:init_repository:Function]
# [DEF:get_branches:Function]
# @PURPOSE: List all branches for a dashboard's repository.
# @PRE: Repository for `dashboard_id` is initialized.
# @POST: Returns a list of branches from the local repository.
# @PARAM: dashboard_id (int)
# @RETURN: List[BranchSchema]
@router.get("/repositories/{dashboard_id}/branches", response_model=List[BranchSchema])
async def get_branches(
dashboard_id: int,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("get_branches"):
try:
return git_service.list_branches(dashboard_id)
except Exception as e:
raise HTTPException(status_code=404, detail=str(e))
# [/DEF:get_branches:Function]
# [DEF:create_branch:Function]
# @PURPOSE: Create a new branch in the dashboard's repository.
# @PRE: `dashboard_id` repository exists and `branch_data` has name and from_branch.
# @POST: A new branch is created in the local repository.
# @PARAM: dashboard_id (int)
# @PARAM: branch_data (BranchCreate)
@router.post("/repositories/{dashboard_id}/branches")
async def create_branch(
dashboard_id: int,
branch_data: BranchCreate,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("create_branch"):
try:
git_service.create_branch(dashboard_id, branch_data.name, branch_data.from_branch)
return {"status": "success"}
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:create_branch:Function]
# [DEF:checkout_branch:Function]
# @PURPOSE: Switch the dashboard's repository to a specific branch.
# @PRE: `dashboard_id` repository exists and branch `checkout_data.name` exists.
# @POST: The local repository HEAD is moved to the specified branch.
# @PARAM: dashboard_id (int)
# @PARAM: checkout_data (BranchCheckout)
@router.post("/repositories/{dashboard_id}/checkout")
async def checkout_branch(
dashboard_id: int,
checkout_data: BranchCheckout,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("checkout_branch"):
try:
git_service.checkout_branch(dashboard_id, checkout_data.name)
return {"status": "success"}
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:checkout_branch:Function]
# [DEF:commit_changes:Function]
# @PURPOSE: Stage and commit changes in the dashboard's repository.
# @PRE: `dashboard_id` repository exists and `commit_data` has message and files.
# @POST: Specified files are staged and a new commit is created.
# @PARAM: dashboard_id (int)
# @PARAM: commit_data (CommitCreate)
@router.post("/repositories/{dashboard_id}/commit")
async def commit_changes(
dashboard_id: int,
commit_data: CommitCreate,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("commit_changes"):
try:
git_service.commit_changes(dashboard_id, commit_data.message, commit_data.files)
return {"status": "success"}
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:commit_changes:Function]
# [DEF:push_changes:Function]
# @PURPOSE: Push local commits to the remote repository.
# @PRE: `dashboard_id` repository exists and has a remote configured.
# @POST: Local commits are pushed to the remote repository.
# @PARAM: dashboard_id (int)
@router.post("/repositories/{dashboard_id}/push")
async def push_changes(
dashboard_id: int,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("push_changes"):
try:
git_service.push_changes(dashboard_id)
return {"status": "success"}
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:push_changes:Function]
# [DEF:pull_changes:Function]
# @PURPOSE: Pull changes from the remote repository.
# @PRE: `dashboard_id` repository exists and has a remote configured.
# @POST: Remote changes are fetched and merged into the local branch.
# @PARAM: dashboard_id (int)
@router.post("/repositories/{dashboard_id}/pull")
async def pull_changes(
dashboard_id: int,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("pull_changes"):
try:
git_service.pull_changes(dashboard_id)
return {"status": "success"}
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:pull_changes:Function]
# [DEF:sync_dashboard:Function]
# @PURPOSE: Sync dashboard state from Superset to Git using the GitPlugin.
# @PRE: `dashboard_id` is valid; GitPlugin is available.
# @POST: Dashboard YAMLs are exported from Superset and committed to Git.
# @PARAM: dashboard_id (int)
# @PARAM: source_env_id (Optional[str])
@router.post("/repositories/{dashboard_id}/sync")
async def sync_dashboard(
dashboard_id: int,
source_env_id: Optional[str] = None,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("sync_dashboard"):
try:
from src.plugins.git_plugin import GitPlugin
plugin = GitPlugin()
return await plugin.execute({
"operation": "sync",
"dashboard_id": dashboard_id,
"source_env_id": source_env_id
})
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:sync_dashboard:Function]
# [DEF:get_environments:Function]
# @PURPOSE: List all deployment environments.
# @PRE: Config manager is accessible.
# @POST: Returns a list of DeploymentEnvironmentSchema objects.
# @RETURN: List[DeploymentEnvironmentSchema]
@router.get("/environments", response_model=List[DeploymentEnvironmentSchema])
async def get_environments(
config_manager=Depends(get_config_manager),
_ = Depends(has_permission("environments", "READ"))
):
with belief_scope("get_environments"):
envs = config_manager.get_environments()
return [
DeploymentEnvironmentSchema(
id=e.id,
name=e.name,
superset_url=e.url,
is_active=True
) for e in envs
]
# [/DEF:get_environments:Function]
# [DEF:deploy_dashboard:Function]
# @PURPOSE: Deploy dashboard from Git to a target environment.
# @PRE: `dashboard_id` and `deploy_data.environment_id` are valid.
# @POST: Dashboard YAMLs are read from Git and imported into the target Superset.
# @PARAM: dashboard_id (int)
# @PARAM: deploy_data (DeployRequest)
@router.post("/repositories/{dashboard_id}/deploy")
async def deploy_dashboard(
dashboard_id: int,
deploy_data: DeployRequest,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("deploy_dashboard"):
try:
from src.plugins.git_plugin import GitPlugin
plugin = GitPlugin()
return await plugin.execute({
"operation": "deploy",
"dashboard_id": dashboard_id,
"environment_id": deploy_data.environment_id
})
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:deploy_dashboard:Function]
# [DEF:get_history:Function]
# @PURPOSE: View commit history for a dashboard's repository.
# @PRE: `dashboard_id` repository exists.
# @POST: Returns a list of recent commits from the repository.
# @PARAM: dashboard_id (int)
# @PARAM: limit (int)
# @RETURN: List[CommitSchema]
@router.get("/repositories/{dashboard_id}/history", response_model=List[CommitSchema])
async def get_history(
dashboard_id: int,
limit: int = 50,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("get_history"):
try:
return git_service.get_commit_history(dashboard_id, limit)
except Exception as e:
raise HTTPException(status_code=404, detail=str(e))
# [/DEF:get_history:Function]
# [DEF:get_repository_status:Function]
# @PURPOSE: Get current Git status for a dashboard repository.
# @PRE: `dashboard_id` repository exists.
# @POST: Returns the status of the working directory (staged, unstaged, untracked).
# @PARAM: dashboard_id (int)
# @RETURN: dict
@router.get("/repositories/{dashboard_id}/status")
async def get_repository_status(
dashboard_id: int,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("get_repository_status"):
try:
return git_service.get_status(dashboard_id)
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:get_repository_status:Function]
# [DEF:get_repository_diff:Function]
# @PURPOSE: Get Git diff for a dashboard repository.
# @PRE: `dashboard_id` repository exists.
# @POST: Returns the diff text for the specified file or all changes.
# @PARAM: dashboard_id (int)
# @PARAM: file_path (Optional[str])
# @PARAM: staged (bool)
# @RETURN: str
@router.get("/repositories/{dashboard_id}/diff")
async def get_repository_diff(
dashboard_id: int,
file_path: Optional[str] = None,
staged: bool = False,
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("get_repository_diff"):
try:
diff_text = git_service.get_diff(dashboard_id, file_path, staged)
return diff_text
except Exception as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:get_repository_diff:Function]
# [DEF:generate_commit_message:Function]
# @PURPOSE: Generate a suggested commit message using LLM.
# @PRE: Repository for `dashboard_id` is initialized.
# @POST: Returns a suggested commit message string.
# @PARAM: dashboard_id (int)
# @RETURN: dict
@router.post("/repositories/{dashboard_id}/generate-message")
async def generate_commit_message(
dashboard_id: int,
db: Session = Depends(get_db),
_ = Depends(has_permission("plugin:git", "EXECUTE"))
):
with belief_scope("generate_commit_message"):
try:
# 1. Get Diff
diff = git_service.get_diff(dashboard_id, staged=True)
if not diff:
diff = git_service.get_diff(dashboard_id, staged=False)
if not diff:
return {"message": "No changes detected"}
# 2. Get History
history_objs = git_service.get_commit_history(dashboard_id, limit=5)
history = [h.message for h in history_objs if hasattr(h, 'message')]
# 3. Get LLM Client
from ...services.llm_provider import LLMProviderService
from ...plugins.llm_analysis.service import LLMClient
from ...plugins.llm_analysis.models import LLMProviderType
llm_service = LLMProviderService(db)
providers = llm_service.get_all_providers()
provider = next((p for p in providers if p.is_active), None)
if not provider:
raise HTTPException(status_code=400, detail="No active LLM provider found")
api_key = llm_service.get_decrypted_api_key(provider.id)
client = LLMClient(
provider_type=LLMProviderType(provider.provider_type),
api_key=api_key,
base_url=provider.base_url,
default_model=provider.default_model
)
# 4. Generate Message
from ...plugins.git.llm_extension import GitLLMExtension
extension = GitLLMExtension(client)
message = await extension.suggest_commit_message(diff, history)
return {"message": message}
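except HTTPException:
# Let deliberate HTTP errors (e.g. "No active LLM provider found") pass through unchanged.
raise
except Exception as e: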
logger.error(f"Failed to generate commit message: {e}")
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:generate_commit_message:Function]
# [/DEF:backend.src.api.routes.git:Module]
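
Review note: most routes in this module wrap their body in `try` / `except Exception` and convert any failure into an HTTP error. One consequence is that an HTTP error raised on purpose inside the `try` is caught by that same broad handler. The sketch below shows a handler ordering that keeps such errors intact. `ApiError` is a hypothetical stand-in for FastAPI's `HTTPException` so the snippet runs without the framework, and `run_guarded` and the sample operations are illustrative names only.

```python
class ApiError(Exception):
    """Hypothetical stand-in for fastapi.HTTPException."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


def run_guarded(operation):
    """Run a callable, passing ApiError through and wrapping anything else as a 400."""
    try:
        return operation()
    except ApiError:
        # Deliberate API errors keep their own status code and detail.
        raise
    except Exception as exc:
        # Unexpected failures are reported as a generic 400.
        raise ApiError(400, str(exc)) from exc


def missing_repository():
    raise ApiError(404, "Repository not found")


def broken_git_call():
    raise RuntimeError("working tree is locked")


def status_of(operation) -> int:
    """Return the status code a caller would observe for the operation."""
    try:
        run_guarded(operation)
    except ApiError as err:
        return err.status_code
    return 200


print(status_of(missing_repository), status_of(broken_git_call), status_of(lambda: "ok"))
```

With the narrower `except` first, the 404 survives, while the unexpected `RuntimeError` is still reported as a 400.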

@@ -0,0 +1,145 @@
# [DEF:backend.src.api.routes.git_schemas:Module]
#
# @TIER: STANDARD
# @SEMANTICS: git, schemas, pydantic, api, contracts
# @PURPOSE: Defines Pydantic models for the Git integration API layer.
# @LAYER: API
# @RELATION: DEPENDS_ON -> backend.src.models.git
#
# @INVARIANT: All schemas must be compatible with the FastAPI router.
from pydantic import BaseModel, Field
from typing import List, Optional
from datetime import datetime
from uuid import UUID
from src.models.git import GitProvider, GitStatus, SyncStatus
# [DEF:GitServerConfigBase:Class]
# @TIER: TRIVIAL
# @PURPOSE: Base schema for Git server configuration attributes.
class GitServerConfigBase(BaseModel):
name: str = Field(..., description="Display name for the Git server")
provider: GitProvider = Field(..., description="Git provider (GITHUB, GITLAB, GITEA)")
url: str = Field(..., description="Server base URL")
pat: str = Field(..., description="Personal Access Token")
default_repository: Optional[str] = Field(None, description="Default repository path (org/repo)")
# [/DEF:GitServerConfigBase:Class]
# [DEF:GitServerConfigCreate:Class]
# @PURPOSE: Schema for creating a new Git server configuration.
class GitServerConfigCreate(GitServerConfigBase):
"""Schema for creating a new Git server configuration."""
pass
# [/DEF:GitServerConfigCreate:Class]
# [DEF:GitServerConfigSchema:Class]
# @PURPOSE: Schema for representing a Git server configuration with metadata.
class GitServerConfigSchema(GitServerConfigBase):
"""Schema for representing a Git server configuration with metadata."""
id: str
status: GitStatus
last_validated: datetime
class Config:
from_attributes = True
# [/DEF:GitServerConfigSchema:Class]
# [DEF:GitRepositorySchema:Class]
# @PURPOSE: Schema for tracking a local Git repository linked to a dashboard.
class GitRepositorySchema(BaseModel):
"""Schema for tracking a local Git repository linked to a dashboard."""
id: str
dashboard_id: int
config_id: str
remote_url: str
local_path: str
current_branch: str
sync_status: SyncStatus
class Config:
from_attributes = True
# [/DEF:GitRepositorySchema:Class]
# [DEF:BranchSchema:Class]
# @PURPOSE: Schema for representing Git branch metadata.
class BranchSchema(BaseModel):
"""Schema for representing a Git branch."""
name: str
commit_hash: str
is_remote: bool
last_updated: datetime
# [/DEF:BranchSchema:Class]
# [DEF:CommitSchema:Class]
# @PURPOSE: Schema for representing Git commit details.
class CommitSchema(BaseModel):
"""Schema for representing a Git commit."""
hash: str
author: str
email: str
timestamp: datetime
message: str
files_changed: List[str]
# [/DEF:CommitSchema:Class]
# [DEF:BranchCreate:Class]
# @PURPOSE: Schema for branch creation requests.
class BranchCreate(BaseModel):
"""Schema for branch creation requests."""
name: str
from_branch: str
# [/DEF:BranchCreate:Class]
# [DEF:BranchCheckout:Class]
# @PURPOSE: Schema for branch checkout requests.
class BranchCheckout(BaseModel):
"""Schema for branch checkout requests."""
name: str
# [/DEF:BranchCheckout:Class]
# [DEF:CommitCreate:Class]
# @PURPOSE: Schema for staging and committing changes.
class CommitCreate(BaseModel):
"""Schema for staging and committing changes."""
message: str
files: List[str]
# [/DEF:CommitCreate:Class]
# [DEF:ConflictResolution:Class]
# @PURPOSE: Schema for resolving merge conflicts.
class ConflictResolution(BaseModel):
"""Schema for resolving merge conflicts."""
file_path: str
resolution: str = Field(pattern="^(mine|theirs|manual)$")
content: Optional[str] = None
# [/DEF:ConflictResolution:Class]
# [DEF:DeploymentEnvironmentSchema:Class]
# @PURPOSE: Schema for representing a target deployment environment.
class DeploymentEnvironmentSchema(BaseModel):
"""Schema for representing a target deployment environment."""
id: str
name: str
superset_url: str
is_active: bool
class Config:
from_attributes = True
# [/DEF:DeploymentEnvironmentSchema:Class]
# [DEF:DeployRequest:Class]
# @PURPOSE: Schema for dashboard deployment requests.
class DeployRequest(BaseModel):
"""Schema for deployment requests."""
environment_id: str
# [/DEF:DeployRequest:Class]
# [DEF:RepoInitRequest:Class]
# @PURPOSE: Schema for repository initialization requests.
class RepoInitRequest(BaseModel):
"""Schema for repository initialization requests."""
config_id: str
remote_url: str
# [/DEF:RepoInitRequest:Class]
# [/DEF:backend.src.api.routes.git_schemas:Module]
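
Review note: `ConflictResolution.resolution` is constrained by the pattern `^(mine|theirs|manual)$`. The sketch below checks a few candidate values against the same pattern string using the standard-library `re` module. This only approximates the schema's behavior: Pydantic may evaluate the pattern with a different regex engine, and `is_valid_resolution` is a hypothetical helper, not part of the module.

```python
import re

# Same pattern string as the `resolution` field in ConflictResolution.
RESOLUTION_PATTERN = re.compile(r"^(mine|theirs|manual)$")


def is_valid_resolution(value: str) -> bool:
    """Return True when value is exactly one of the three allowed strategies."""
    return RESOLUTION_PATTERN.fullmatch(value) is not None


for candidate in ["mine", "theirs", "manual", "ours", "Mine", ""]:
    print(repr(candidate), is_valid_resolution(candidate))
```

The match is case-sensitive and exact, so a client sending `"Mine"` or `"ours"` should expect a validation error rather than a silent fallback.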

@@ -0,0 +1,207 @@
# [DEF:backend/src/api/routes/llm.py:Module]
# @TIER: STANDARD
# @SEMANTICS: api, routes, llm
# @PURPOSE: API routes for LLM provider configuration and management.
# @LAYER: UI (API)
from fastapi import APIRouter, Depends, HTTPException, status
from typing import List
from ...core.logger import logger
from ...schemas.auth import User
from ...dependencies import get_current_user as get_current_active_user
from ...plugins.llm_analysis.models import LLMProviderConfig, LLMProviderType
from ...services.llm_provider import LLMProviderService
from ...core.database import get_db
from sqlalchemy.orm import Session
# [DEF:router:Global]
# @PURPOSE: APIRouter instance for LLM routes.
router = APIRouter(prefix="/api/llm", tags=["LLM"])
# [/DEF:router:Global]
# [DEF:get_providers:Function]
# @PURPOSE: Retrieve all LLM provider configurations.
# @PRE: User is authenticated.
# @POST: Returns list of LLMProviderConfig.
@router.get("/providers", response_model=List[LLMProviderConfig])
async def get_providers(
current_user: User = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""
Get all LLM provider configurations.
"""
logger.info(f"[llm_routes][get_providers][Action] Fetching providers for user: {current_user.username}")
service = LLMProviderService(db)
providers = service.get_all_providers()
return [
LLMProviderConfig(
id=p.id,
provider_type=LLMProviderType(p.provider_type),
name=p.name,
base_url=p.base_url,
api_key="********",
default_model=p.default_model,
is_active=p.is_active
) for p in providers
]
# [/DEF:get_providers:Function]
# [DEF:create_provider:Function]
# @PURPOSE: Create a new LLM provider configuration.
# @PRE: User is authenticated and has admin permissions.
# @POST: Returns the created LLMProviderConfig.
@router.post("/providers", response_model=LLMProviderConfig, status_code=status.HTTP_201_CREATED)
async def create_provider(
config: LLMProviderConfig,
current_user: User = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""
Create a new LLM provider configuration.
"""
service = LLMProviderService(db)
provider = service.create_provider(config)
return LLMProviderConfig(
id=provider.id,
provider_type=LLMProviderType(provider.provider_type),
name=provider.name,
base_url=provider.base_url,
api_key="********",
default_model=provider.default_model,
is_active=provider.is_active
)
# [/DEF:create_provider:Function]
# [DEF:update_provider:Function]
# @PURPOSE: Update an existing LLM provider configuration.
# @PRE: User is authenticated and has admin permissions.
# @POST: Returns the updated LLMProviderConfig.
@router.put("/providers/{provider_id}", response_model=LLMProviderConfig)
async def update_provider(
provider_id: str,
config: LLMProviderConfig,
current_user: User = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""
Update an existing LLM provider configuration.
"""
service = LLMProviderService(db)
provider = service.update_provider(provider_id, config)
if not provider:
raise HTTPException(status_code=404, detail="Provider not found")
return LLMProviderConfig(
id=provider.id,
provider_type=LLMProviderType(provider.provider_type),
name=provider.name,
base_url=provider.base_url,
api_key="********",
default_model=provider.default_model,
is_active=provider.is_active
)
# [/DEF:update_provider:Function]
# [DEF:delete_provider:Function]
# @PURPOSE: Delete an LLM provider configuration.
# @PRE: User is authenticated and has admin permissions.
# @POST: Returns success status.
@router.delete("/providers/{provider_id}", status_code=status.HTTP_204_NO_CONTENT)
async def delete_provider(
provider_id: str,
current_user: User = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""
Delete an LLM provider configuration.
"""
service = LLMProviderService(db)
if not service.delete_provider(provider_id):
raise HTTPException(status_code=404, detail="Provider not found")
return
# [/DEF:delete_provider:Function]
# [DEF:test_connection:Function]
# @PURPOSE: Test connection to an LLM provider.
# @PRE: User is authenticated.
# @POST: Returns success status and message.
@router.post("/providers/{provider_id}/test")
async def test_connection(
provider_id: str,
current_user: User = Depends(get_current_active_user),
db: Session = Depends(get_db)
):
"""
Test connection to an LLM provider.
"""
logger.info(f"[llm_routes][test_connection][Action] Testing connection for provider_id: {provider_id}")
from ...plugins.llm_analysis.service import LLMClient
service = LLMProviderService(db)
db_provider = service.get_provider(provider_id)
if not db_provider:
raise HTTPException(status_code=404, detail="Provider not found")
api_key = service.get_decrypted_api_key(provider_id)
# Check if API key was successfully decrypted
if not api_key:
logger.error(f"[llm_routes][test_connection] Failed to decrypt API key for provider {provider_id}")
raise HTTPException(
status_code=500,
detail="Failed to decrypt API key. The provider may have been encrypted with a different encryption key. Please update the provider with a new API key."
)
client = LLMClient(
provider_type=LLMProviderType(db_provider.provider_type),
api_key=api_key,
base_url=db_provider.base_url,
default_model=db_provider.default_model
)
try:
# Simple test call
await client.client.models.list()
return {"success": True, "message": "Connection successful"}
except Exception as e:
return {"success": False, "error": str(e)}
# [/DEF:test_connection:Function]
# [DEF:test_provider_config:Function]
# @PURPOSE: Test connection with a provided configuration (not yet saved).
# @PRE: User is authenticated.
# @POST: Returns success status and message.
@router.post("/providers/test")
async def test_provider_config(
config: LLMProviderConfig,
current_user: User = Depends(get_current_active_user)
):
"""
Test connection with a provided configuration.
"""
from ...plugins.llm_analysis.service import LLMClient
logger.info(f"[llm_routes][test_provider_config][Action] Testing config for {config.name}")
# Check if API key is provided
if not config.api_key or config.api_key == "********":
raise HTTPException(
status_code=400,
detail="API key is required for testing connection"
)
client = LLMClient(
provider_type=config.provider_type,
api_key=config.api_key,
base_url=config.base_url,
default_model=config.default_model
)
try:
# Simple test call
await client.client.models.list()
return {"success": True, "message": "Connection successful"}
except Exception as e:
return {"success": False, "error": str(e)}
# [/DEF:test_provider_config:Function]
# [/DEF:backend/src/api/routes/llm.py:Module]
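
Review note: the provider routes never return the stored API key. Every response substitutes `"********"`, and `test_provider_config` rejects that placeholder when it comes back from a client. The sketch below isolates those two rules. `ProviderView`, `masked`, and `has_usable_key` are hypothetical names used for illustration; the real `LLMProviderConfig` has more fields.

```python
from dataclasses import dataclass, replace
from typing import Optional

MASK = "********"  # placeholder the routes return instead of the real key


@dataclass(frozen=True)
class ProviderView:  # hypothetical stand-in for LLMProviderConfig
    id: str
    name: str
    api_key: str


def masked(provider: ProviderView) -> ProviderView:
    """Return a copy that is safe to send to a client."""
    return replace(provider, api_key=MASK)


def has_usable_key(api_key: Optional[str]) -> bool:
    """Reject empty keys and the masked placeholder echoed back by a client."""
    return bool(api_key) and api_key != MASK


stored = ProviderView(id="p1", name="demo", api_key="sk-example")
print(masked(stored).api_key, has_usable_key(stored.api_key), has_usable_key(MASK))
```

A client that edits a provider after reading it back may resend the placeholder. How `update_provider` treats that case depends on `LLMProviderService`, which is not shown in this diff.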

@@ -13,9 +13,10 @@
 from fastapi import APIRouter, Depends, HTTPException
 from sqlalchemy.orm import Session
 from typing import List, Optional
-from backend.src.dependencies import get_config_manager
-from backend.src.core.database import get_db
-from backend.src.models.mapping import DatabaseMapping
+from ...core.logger import belief_scope
+from ...dependencies import get_config_manager, has_permission
+from ...core.database import get_db
+from ...models.mapping import DatabaseMapping
 from pydantic import BaseModel
 # [/SECTION]
@@ -29,7 +30,7 @@ class MappingCreate(BaseModel):
 target_db_uuid: str
 source_db_name: str
 target_db_name: str
-# [/DEF:MappingCreate]
+# [/DEF:MappingCreate:DataClass]
 # [DEF:MappingResponse:DataClass]
 class MappingResponse(BaseModel):
@@ -43,68 +44,83 @@ class MappingResponse(BaseModel):
 class Config:
 from_attributes = True
-# [/DEF:MappingResponse]
+# [/DEF:MappingResponse:DataClass]
 # [DEF:SuggestRequest:DataClass]
 class SuggestRequest(BaseModel):
 source_env_id: str
 target_env_id: str
-# [/DEF:SuggestRequest]
+# [/DEF:SuggestRequest:DataClass]
 # [DEF:get_mappings:Function]
 # @PURPOSE: List all saved database mappings.
+# @PRE: db session is injected.
+# @POST: Returns filtered list of DatabaseMapping records.
 @router.get("", response_model=List[MappingResponse])
 async def get_mappings(
 source_env_id: Optional[str] = None,
 target_env_id: Optional[str] = None,
-db: Session = Depends(get_db)
+db: Session = Depends(get_db),
+_ = Depends(has_permission("plugin:mapper", "EXECUTE"))
 ):
-query = db.query(DatabaseMapping)
+with belief_scope("get_mappings"):
+query = db.query(DatabaseMapping)
 if source_env_id:
 query = query.filter(DatabaseMapping.source_env_id == source_env_id)
 if target_env_id:
 query = query.filter(DatabaseMapping.target_env_id == target_env_id)
 return query.all()
-# [/DEF:get_mappings]
+# [/DEF:get_mappings:Function]
 # [DEF:create_mapping:Function]
 # @PURPOSE: Create or update a database mapping.
+# @PRE: mapping is valid MappingCreate, db session is injected.
+# @POST: DatabaseMapping created or updated in database.
 @router.post("", response_model=MappingResponse)
-async def create_mapping(mapping: MappingCreate, db: Session = Depends(get_db)):
-# Check if mapping already exists
-existing = db.query(DatabaseMapping).filter(
-DatabaseMapping.source_env_id == mapping.source_env_id,
-DatabaseMapping.target_env_id == mapping.target_env_id,
-DatabaseMapping.source_db_uuid == mapping.source_db_uuid
-).first()
-
-if existing:
-existing.target_db_uuid = mapping.target_db_uuid
-existing.target_db_name = mapping.target_db_name
-db.commit()
-db.refresh(existing)
+async def create_mapping(
+mapping: MappingCreate,
+db: Session = Depends(get_db),
+_ = Depends(has_permission("plugin:mapper", "EXECUTE"))
+):
+with belief_scope("create_mapping"):
+# Check if mapping already exists
+existing = db.query(DatabaseMapping).filter(
+DatabaseMapping.source_env_id == mapping.source_env_id,
+DatabaseMapping.target_env_id == mapping.target_env_id,
+DatabaseMapping.source_db_uuid == mapping.source_db_uuid
+).first()
+if existing:
+existing.target_db_uuid = mapping.target_db_uuid
+existing.target_db_name = mapping.target_db_name
+db.commit()
+db.refresh(existing)
+return existing
+new_mapping = DatabaseMapping(**mapping.dict())
+db.add(new_mapping)
+db.commit()
+db.refresh(new_mapping)
return existing return new_mapping
# [/DEF:create_mapping:Function]
new_mapping = DatabaseMapping(**mapping.dict())
db.add(new_mapping)
db.commit()
db.refresh(new_mapping)
return new_mapping
# [/DEF:create_mapping]
# [DEF:suggest_mappings_api:Function] # [DEF:suggest_mappings_api:Function]
# @PURPOSE: Get suggested mappings based on fuzzy matching. # @PURPOSE: Get suggested mappings based on fuzzy matching.
# @PRE: request is valid SuggestRequest, config_manager is injected.
# @POST: Returns mapping suggestions.
@router.post("/suggest") @router.post("/suggest")
async def suggest_mappings_api( async def suggest_mappings_api(
request: SuggestRequest, request: SuggestRequest,
config_manager=Depends(get_config_manager) config_manager=Depends(get_config_manager),
_ = Depends(has_permission("plugin:mapper", "EXECUTE"))
): ):
from backend.src.services.mapping_service import MappingService with belief_scope("suggest_mappings_api"):
service = MappingService(config_manager) from ...services.mapping_service import MappingService
try: service = MappingService(config_manager)
return await service.get_suggestions(request.source_env_id, request.target_env_id) try:
except Exception as e: return await service.get_suggestions(request.source_env_id, request.target_env_id)
raise HTTPException(status_code=500, detail=str(e)) except Exception as e:
# [/DEF:suggest_mappings_api] raise HTTPException(status_code=500, detail=str(e))
# [/DEF:suggest_mappings_api:Function]
# [/DEF:backend.src.api.routes.mappings] # [/DEF:backend.src.api.routes.mappings:Module]
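
Note on create_mapping above: the endpoint acts as an upsert keyed on (source_env_id, target_env_id, source_db_uuid), updating only the target fields when a row already exists. The following is a minimal in-memory sketch of that behavior, not the project's code: the `Mapping` dataclass, the `store` dict, and `upsert_mapping` are hypothetical stand-ins for the SQLAlchemy model and session.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Mapping:
    # Hypothetical stand-in for the DatabaseMapping model.
    source_env_id: str
    target_env_id: str
    source_db_uuid: str
    target_db_uuid: str
    source_db_name: str
    target_db_name: str


Key = Tuple[str, str, str]


def upsert_mapping(store: Dict[Key, Mapping], incoming: Mapping) -> Mapping:
    """Create the mapping, or update the target fields of an existing one."""
    key = (incoming.source_env_id, incoming.target_env_id, incoming.source_db_uuid)
    existing = store.get(key)
    if existing is not None:
        # Same source database in the same environment pair: only retarget it.
        existing.target_db_uuid = incoming.target_db_uuid
        existing.target_db_name = incoming.target_db_name
        return existing
    store[key] = incoming
    return incoming


store: Dict[Key, Mapping] = {}
first = upsert_mapping(store, Mapping("dev", "prod", "src-1", "tgt-1", "sales", "sales_prod"))
second = upsert_mapping(store, Mapping("dev", "prod", "src-1", "tgt-2", "sales", "sales_v2"))
```

In this sketch the second call returns the same object as the first, with the target fields replaced, which mirrors the early return in the `if existing:` branch.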


@@ -0,0 +1,80 @@
# [DEF:backend.src.api.routes.migration:Module]
# @SEMANTICS: api, migration, dashboards
# @PURPOSE: API endpoints for migration operations.
# @LAYER: API
# @RELATION: DEPENDS_ON -> backend.src.dependencies
# @RELATION: DEPENDS_ON -> backend.src.models.dashboard
from fastapi import APIRouter, Depends, HTTPException
from typing import List, Dict
from ...dependencies import get_config_manager, get_task_manager, has_permission
from ...models.dashboard import DashboardMetadata, DashboardSelection
from ...core.superset_client import SupersetClient
from ...core.logger import belief_scope
router = APIRouter(prefix="/api", tags=["migration"])
# [DEF:get_dashboards:Function]
# @PURPOSE: Fetch all dashboards from the specified environment for the grid.
# @PRE: Environment ID must be valid.
# @POST: Returns a list of dashboard metadata.
# @PARAM: env_id (str) - The ID of the environment to fetch from.
# @RETURN: List[DashboardMetadata]
@router.get("/environments/{env_id}/dashboards", response_model=List[DashboardMetadata])
async def get_dashboards(
env_id: str,
config_manager=Depends(get_config_manager),
_ = Depends(has_permission("plugin:migration", "EXECUTE"))
):
with belief_scope("get_dashboards", f"env_id={env_id}"):
environments = config_manager.get_environments()
env = next((e for e in environments if e.id == env_id), None)
if not env:
raise HTTPException(status_code=404, detail="Environment not found")
client = SupersetClient(env)
dashboards = client.get_dashboards_summary()
return dashboards
# [/DEF:get_dashboards:Function]
# [DEF:execute_migration:Function]
# @PURPOSE: Execute the migration of selected dashboards.
# @PRE: Selection must be valid and environments must exist.
# @POST: Starts the migration task and returns the task ID.
# @PARAM: selection (DashboardSelection) - The dashboards to migrate.
# @RETURN: Dict - {"task_id": str, "message": str}
@router.post("/migration/execute")
async def execute_migration(
selection: DashboardSelection,
config_manager=Depends(get_config_manager),
task_manager=Depends(get_task_manager),
_ = Depends(has_permission("plugin:migration", "EXECUTE"))
):
with belief_scope("execute_migration"):
# Validate environments exist
environments = config_manager.get_environments()
env_ids = {e.id for e in environments}
if selection.source_env_id not in env_ids or selection.target_env_id not in env_ids:
raise HTTPException(status_code=400, detail="Invalid source or target environment")
# Create migration task with debug logging
from ...core.logger import logger
# Include replace_db_config in the task parameters
task_params = selection.dict()
task_params['replace_db_config'] = selection.replace_db_config
logger.info(f"Creating migration task with params: {task_params}")
logger.info(f"Available environments: {env_ids}")
logger.info(f"Source env: {selection.source_env_id}, Target env: {selection.target_env_id}")
try:
task = await task_manager.create_task("superset-migration", task_params)
logger.info(f"Task created successfully: {task.id}")
return {"task_id": task.id, "message": "Migration initiated"}
except Exception as e:
logger.error(f"Task creation failed: {e}")
raise HTTPException(status_code=500, detail=f"Failed to create migration task: {str(e)}")
# [/DEF:execute_migration:Function]
# [/DEF:backend.src.api.routes.migration:Module]
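
Note on the migration routes above: both endpoints check environment IDs against the configured list before doing any work (a single lookup in get_dashboards, a set membership test in execute_migration). A minimal standalone sketch of the two checks follows. The `Environment` dataclass and both helper names are assumptions made for illustration; the real code raises HTTPException with 404 and 400 rather than returning a value.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Environment:
    # Hypothetical stand-in for the project's Environment model.
    id: str
    name: str


def find_environment(environments: List[Environment], env_id: str) -> Optional[Environment]:
    """Return the first environment with a matching id, or None (maps to a 404)."""
    return next((e for e in environments if e.id == env_id), None)


def selection_is_valid(environments: List[Environment], source_env_id: str, target_env_id: str) -> bool:
    """Both ends of the migration must be known environments (else a 400)."""
    env_ids = {e.id for e in environments}
    return source_env_id in env_ids and target_env_id in env_ids


envs = [Environment("dev", "Development"), Environment("prod", "Production")]
```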


@@ -7,16 +7,25 @@ from typing import List
from fastapi import APIRouter, Depends from fastapi import APIRouter, Depends
from ...core.plugin_base import PluginConfig from ...core.plugin_base import PluginConfig
from ...dependencies import get_plugin_loader from ...dependencies import get_plugin_loader, has_permission
from ...core.logger import belief_scope
router = APIRouter() router = APIRouter()
@router.get("/", response_model=List[PluginConfig]) # [DEF:list_plugins:Function]
# @PURPOSE: Retrieve a list of all available plugins.
# @PRE: plugin_loader is injected via Depends.
# @POST: Returns a list of PluginConfig objects.
# @RETURN: List[PluginConfig] - List of registered plugins.
@router.get("", response_model=List[PluginConfig])
async def list_plugins( async def list_plugins(
plugin_loader = Depends(get_plugin_loader) plugin_loader = Depends(get_plugin_loader),
_ = Depends(has_permission("plugins", "READ"))
): ):
""" with belief_scope("list_plugins"):
Retrieve a list of all available plugins. """
""" Retrieve a list of all available plugins.
return plugin_loader.get_all_plugin_configs() """
# [/DEF] return plugin_loader.get_all_plugin_configs()
# [/DEF:list_plugins:Function]
# [/DEF:PluginsRouter:Module]


@@ -13,11 +13,11 @@
from fastapi import APIRouter, Depends, HTTPException from fastapi import APIRouter, Depends, HTTPException
from typing import List from typing import List
from ...core.config_models import AppConfig, Environment, GlobalSettings from ...core.config_models import AppConfig, Environment, GlobalSettings
from ...dependencies import get_config_manager from ...models.storage import StorageConfig
from ...dependencies import get_config_manager, has_permission
from ...core.config_manager import ConfigManager from ...core.config_manager import ConfigManager
from ...core.logger import logger from ...core.logger import logger, belief_scope
from superset_tool.client import SupersetClient from ...core.superset_client import SupersetClient
from superset_tool.models import SupersetConfig
import os import os
# [/SECTION] # [/SECTION]
@@ -25,65 +25,110 @@ router = APIRouter()
# [DEF:get_settings:Function] # [DEF:get_settings:Function]
# @PURPOSE: Retrieves all application settings. # @PURPOSE: Retrieves all application settings.
# @PRE: Config manager is available.
# @POST: Returns masked AppConfig.
# @RETURN: AppConfig - The current configuration. # @RETURN: AppConfig - The current configuration.
@router.get("/", response_model=AppConfig) @router.get("", response_model=AppConfig)
async def get_settings(config_manager: ConfigManager = Depends(get_config_manager)): async def get_settings(
logger.info("[get_settings][Entry] Fetching all settings") config_manager: ConfigManager = Depends(get_config_manager),
_ = Depends(has_permission("admin:settings", "READ"))
):
with belief_scope("get_settings"):
logger.info("[get_settings][Entry] Fetching all settings")
config = config_manager.get_config().copy(deep=True) config = config_manager.get_config().copy(deep=True)
# Mask passwords # Mask passwords
for env in config.environments: for env in config.environments:
if env.password: if env.password:
env.password = "********" env.password = "********"
return config return config
# [/DEF:get_settings] # [/DEF:get_settings:Function]
# [DEF:update_global_settings:Function] # [DEF:update_global_settings:Function]
# @PURPOSE: Updates global application settings. # @PURPOSE: Updates global application settings.
# @PRE: New settings are provided.
# @POST: Global settings are updated.
# @PARAM: settings (GlobalSettings) - The new global settings. # @PARAM: settings (GlobalSettings) - The new global settings.
# @RETURN: GlobalSettings - The updated settings. # @RETURN: GlobalSettings - The updated settings.
@router.patch("/global", response_model=GlobalSettings) @router.patch("/global", response_model=GlobalSettings)
async def update_global_settings( async def update_global_settings(
settings: GlobalSettings, settings: GlobalSettings,
config_manager: ConfigManager = Depends(get_config_manager) config_manager: ConfigManager = Depends(get_config_manager),
_ = Depends(has_permission("admin:settings", "WRITE"))
): ):
logger.info("[update_global_settings][Entry] Updating global settings") with belief_scope("update_global_settings"):
logger.info("[update_global_settings][Entry] Updating global settings")
config_manager.update_global_settings(settings) config_manager.update_global_settings(settings)
return settings return settings
# [/DEF:update_global_settings] # [/DEF:update_global_settings:Function]
# [DEF:get_storage_settings:Function]
# @PURPOSE: Retrieves storage-specific settings.
# @RETURN: StorageConfig - The storage configuration.
@router.get("/storage", response_model=StorageConfig)
async def get_storage_settings(
config_manager: ConfigManager = Depends(get_config_manager),
_ = Depends(has_permission("admin:settings", "READ"))
):
with belief_scope("get_storage_settings"):
return config_manager.get_config().settings.storage
# [/DEF:get_storage_settings:Function]
# [DEF:update_storage_settings:Function]
# @PURPOSE: Updates storage-specific settings.
# @PARAM: storage (StorageConfig) - The new storage settings.
# @POST: Storage settings are updated and saved.
# @RETURN: StorageConfig - The updated storage settings.
@router.put("/storage", response_model=StorageConfig)
async def update_storage_settings(
storage: StorageConfig,
config_manager: ConfigManager = Depends(get_config_manager),
_ = Depends(has_permission("admin:settings", "WRITE"))
):
with belief_scope("update_storage_settings"):
is_valid, message = config_manager.validate_path(storage.root_path)
if not is_valid:
raise HTTPException(status_code=400, detail=message)
settings = config_manager.get_config().settings
settings.storage = storage
config_manager.update_global_settings(settings)
return config_manager.get_config().settings.storage
# [/DEF:update_storage_settings:Function]
# [DEF:get_environments:Function] # [DEF:get_environments:Function]
# @PURPOSE: Lists all configured Superset environments. # @PURPOSE: Lists all configured Superset environments.
# @PRE: Config manager is available.
# @POST: Returns list of environments.
# @RETURN: List[Environment] - List of environments. # @RETURN: List[Environment] - List of environments.
@router.get("/environments", response_model=List[Environment]) @router.get("/environments", response_model=List[Environment])
async def get_environments(config_manager: ConfigManager = Depends(get_config_manager)): async def get_environments(
logger.info("[get_environments][Entry] Fetching environments") config_manager: ConfigManager = Depends(get_config_manager),
_ = Depends(has_permission("admin:settings", "READ"))
):
with belief_scope("get_environments"):
logger.info("[get_environments][Entry] Fetching environments")
return config_manager.get_environments() return config_manager.get_environments()
# [/DEF:get_environments] # [/DEF:get_environments:Function]
# [DEF:add_environment:Function] # [DEF:add_environment:Function]
# @PURPOSE: Adds a new Superset environment. # @PURPOSE: Adds a new Superset environment.
# @PRE: Environment data is valid and reachable.
# @POST: Environment is added to config.
# @PARAM: env (Environment) - The environment to add. # @PARAM: env (Environment) - The environment to add.
# @RETURN: Environment - The added environment. # @RETURN: Environment - The added environment.
@router.post("/environments", response_model=Environment) @router.post("/environments", response_model=Environment)
async def add_environment( async def add_environment(
env: Environment, env: Environment,
config_manager: ConfigManager = Depends(get_config_manager) config_manager: ConfigManager = Depends(get_config_manager),
_ = Depends(has_permission("admin:settings", "WRITE"))
): ):
logger.info(f"[add_environment][Entry] Adding environment {env.id}") with belief_scope("add_environment"):
logger.info(f"[add_environment][Entry] Adding environment {env.id}")
# Validate connection before adding # Validate connection before adding
try: try:
superset_config = SupersetConfig( client = SupersetClient(env)
env=env.name,
base_url=env.url,
auth={
"provider": "db",
"username": env.username,
"password": env.password,
"refresh": "true"
}
)
client = SupersetClient(config=superset_config)
client.get_dashboards(query={"page_size": 1}) client.get_dashboards(query={"page_size": 1})
except Exception as e: except Exception as e:
logger.error(f"[add_environment][Coherence:Failed] Connection validation failed: {e}") logger.error(f"[add_environment][Coherence:Failed] Connection validation failed: {e}")
@@ -91,20 +136,23 @@ async def add_environment(
config_manager.add_environment(env) config_manager.add_environment(env)
return env return env
# [/DEF:add_environment] # [/DEF:add_environment:Function]
# [DEF:update_environment:Function] # [DEF:update_environment:Function]
# @PURPOSE: Updates an existing Superset environment. # @PURPOSE: Updates an existing Superset environment.
# @PRE: ID and valid environment data are provided.
# @POST: Environment is updated in config.
# @PARAM: id (str) - The ID of the environment to update. # @PARAM: id (str) - The ID of the environment to update.
# @PARAM: env (Environment) - The updated environment data. # @PARAM: env (Environment) - The updated environment data.
# @RETURN: Environment - The updated environment. # @RETURN: Environment - The updated environment.
@router.put("/environments/{id}", response_model=Environment) @router.put("/environments/{id}", response_model=Environment)
async def update_environment( async def update_environment(
id: str, id: str,
env: Environment, env: Environment,
config_manager: ConfigManager = Depends(get_config_manager) config_manager: ConfigManager = Depends(get_config_manager)
): ):
logger.info(f"[update_environment][Entry] Updating environment {id}") with belief_scope("update_environment"):
logger.info(f"[update_environment][Entry] Updating environment {id}")
# If password is masked, we need the real one for validation # If password is masked, we need the real one for validation
env_to_validate = env.copy(deep=True) env_to_validate = env.copy(deep=True)
@@ -115,17 +163,7 @@ async def update_environment(
# Validate connection before updating # Validate connection before updating
try: try:
superset_config = SupersetConfig( client = SupersetClient(env_to_validate)
env=env_to_validate.name,
base_url=env_to_validate.url,
auth={
"provider": "db",
"username": env_to_validate.username,
"password": env_to_validate.password,
"refresh": "true"
}
)
client = SupersetClient(config=superset_config)
client.get_dashboards(query={"page_size": 1}) client.get_dashboards(query={"page_size": 1})
except Exception as e: except Exception as e:
logger.error(f"[update_environment][Coherence:Failed] Connection validation failed: {e}") logger.error(f"[update_environment][Coherence:Failed] Connection validation failed: {e}")
@@ -134,23 +172,28 @@ async def update_environment(
if config_manager.update_environment(id, env): if config_manager.update_environment(id, env):
return env return env
raise HTTPException(status_code=404, detail=f"Environment {id} not found") raise HTTPException(status_code=404, detail=f"Environment {id} not found")
# [/DEF:update_environment] # [/DEF:update_environment:Function]
# [DEF:delete_environment:Function] # [DEF:delete_environment:Function]
# @PURPOSE: Deletes a Superset environment. # @PURPOSE: Deletes a Superset environment.
# @PRE: ID is provided.
# @POST: Environment is removed from config.
# @PARAM: id (str) - The ID of the environment to delete. # @PARAM: id (str) - The ID of the environment to delete.
@router.delete("/environments/{id}") @router.delete("/environments/{id}")
async def delete_environment( async def delete_environment(
id: str, id: str,
config_manager: ConfigManager = Depends(get_config_manager) config_manager: ConfigManager = Depends(get_config_manager)
): ):
logger.info(f"[delete_environment][Entry] Deleting environment {id}") with belief_scope("delete_environment"):
logger.info(f"[delete_environment][Entry] Deleting environment {id}")
config_manager.delete_environment(id) config_manager.delete_environment(id)
return {"message": f"Environment {id} deleted"} return {"message": f"Environment {id} deleted"}
# [/DEF:delete_environment] # [/DEF:delete_environment:Function]
# [DEF:test_environment_connection:Function] # [DEF:test_environment_connection:Function]
# @PURPOSE: Tests the connection to a Superset environment. # @PURPOSE: Tests the connection to a Superset environment.
# @PRE: ID is provided.
# @POST: Returns success or error status.
# @PARAM: id (str) - The ID of the environment to test. # @PARAM: id (str) - The ID of the environment to test.
# @RETURN: dict - Success message or error. # @RETURN: dict - Success message or error.
@router.post("/environments/{id}/test") @router.post("/environments/{id}/test")
@@ -158,7 +201,8 @@ async def test_environment_connection(
id: str, id: str,
config_manager: ConfigManager = Depends(get_config_manager) config_manager: ConfigManager = Depends(get_config_manager)
): ):
logger.info(f"[test_environment_connection][Entry] Testing environment {id}") with belief_scope("test_environment_connection"):
logger.info(f"[test_environment_connection][Entry] Testing environment {id}")
# Find environment # Find environment
env = next((e for e in config_manager.get_environments() if e.id == id), None) env = next((e for e in config_manager.get_environments() if e.id == id), None)
@@ -166,21 +210,8 @@ async def test_environment_connection(
raise HTTPException(status_code=404, detail=f"Environment {id} not found") raise HTTPException(status_code=404, detail=f"Environment {id} not found")
try: try:
# Create SupersetConfig
# Note: SupersetConfig expects 'auth' dict with specific keys
superset_config = SupersetConfig(
env=env.name,
base_url=env.url,
auth={
"provider": "db", # Defaulting to db for now
"username": env.username,
"password": env.password,
"refresh": "true"
}
)
# Initialize client (this will trigger authentication) # Initialize client (this will trigger authentication)
client = SupersetClient(config=superset_config) client = SupersetClient(env)
# Try a simple request to verify # Try a simple request to verify
client.get_dashboards(query={"page_size": 1}) client.get_dashboards(query={"page_size": 1})
@@ -190,29 +221,7 @@ async def test_environment_connection(
except Exception as e: except Exception as e:
logger.error(f"[test_environment_connection][Coherence:Failed] Connection failed for {id}: {e}") logger.error(f"[test_environment_connection][Coherence:Failed] Connection failed for {id}: {e}")
return {"status": "error", "message": str(e)} return {"status": "error", "message": str(e)}
# [/DEF:test_environment_connection] # [/DEF:test_environment_connection:Function]
# [DEF:validate_backup_path:Function]
# @PURPOSE: Validates if a backup path exists and is writable.
# @PARAM: path (str) - The path to validate.
# @RETURN: dict - Validation result.
@router.post("/validate-path")
async def validate_backup_path(
path_data: dict,
config_manager: ConfigManager = Depends(get_config_manager)
):
path = path_data.get("path")
if not path:
raise HTTPException(status_code=400, detail="Path is required")
logger.info(f"[validate_backup_path][Entry] Validating path: {path}")
valid, message = config_manager.validate_path(path)
if not valid:
return {"status": "error", "message": message}
return {"status": "success", "message": message}
# [/DEF:validate_backup_path]
# [/DEF:SettingsRouter] # [/DEF:SettingsRouter:Module]
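
Note on get_settings above: the handler deep-copies the configuration before masking passwords, so the stored config keeps the real credentials. A minimal sketch of that pattern, assuming plain dataclasses in place of the Pydantic models (`Env`, `Config`, and `masked_copy` are hypothetical names):

```python
import copy
from dataclasses import dataclass, field
from typing import List

MASK = "********"


@dataclass
class Env:
    # Hypothetical stand-in for the Environment model.
    id: str
    password: str = ""


@dataclass
class Config:
    # Hypothetical stand-in for AppConfig.
    environments: List[Env] = field(default_factory=list)


def masked_copy(config: Config) -> Config:
    """Return a deep copy with every non-empty password replaced by MASK."""
    masked = copy.deepcopy(config)
    for env in masked.environments:
        if env.password:
            env.password = MASK
    return masked


original = Config([Env("dev", "s3cret"), Env("prod", "")])
safe = masked_copy(original)
```

Masking a shallow copy would overwrite the live password, because the copy would share the same environment objects; that is the likely reason the handler uses `copy(deep=True)`.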


@@ -0,0 +1,145 @@
# [DEF:storage_routes:Module]
#
# @SEMANTICS: storage, files, upload, download, backup, repository
# @PURPOSE: API endpoints for file storage management (backups and repositories).
# @LAYER: API
# @RELATION: DEPENDS_ON -> backend.src.models.storage
#
# @INVARIANT: All paths must be validated against path traversal.
# [SECTION: IMPORTS]
from pathlib import Path
from fastapi import APIRouter, Depends, UploadFile, File, Form, HTTPException
from fastapi.responses import FileResponse
from typing import List, Optional
from ...models.storage import StoredFile, FileCategory
from ...dependencies import get_plugin_loader, has_permission
from ...plugins.storage.plugin import StoragePlugin
from ...core.logger import belief_scope
# [/SECTION]
router = APIRouter(tags=["storage"])
# [DEF:list_files:Function]
# @PURPOSE: List all files and directories in the storage system.
#
# @PRE: None.
# @POST: Returns a list of StoredFile objects.
#
# @PARAM: category (Optional[FileCategory]) - Filter by category.
# @PARAM: path (Optional[str]) - Subpath within the category.
# @RETURN: List[StoredFile] - List of files/directories.
#
# @RELATION: CALLS -> StoragePlugin.list_files
@router.get("/files", response_model=List[StoredFile])
async def list_files(
category: Optional[FileCategory] = None,
path: Optional[str] = None,
plugin_loader=Depends(get_plugin_loader),
_ = Depends(has_permission("plugin:storage", "READ"))
):
with belief_scope("list_files"):
storage_plugin: StoragePlugin = plugin_loader.get_plugin("storage-manager")
if not storage_plugin:
raise HTTPException(status_code=500, detail="Storage plugin not loaded")
return storage_plugin.list_files(category, path)
# [/DEF:list_files:Function]
# [DEF:upload_file:Function]
# @PURPOSE: Upload a file to the storage system.
#
# @PRE: category must be a valid FileCategory.
# @PRE: file must be a valid UploadFile.
# @POST: Returns the StoredFile object of the uploaded file.
#
# @PARAM: category (FileCategory) - Target category.
# @PARAM: path (Optional[str]) - Target subpath.
# @PARAM: file (UploadFile) - The file content.
# @RETURN: StoredFile - Metadata of the uploaded file.
#
# @SIDE_EFFECT: Writes file to the filesystem.
#
# @RELATION: CALLS -> StoragePlugin.save_file
@router.post("/upload", response_model=StoredFile, status_code=201)
async def upload_file(
category: FileCategory = Form(...),
path: Optional[str] = Form(None),
file: UploadFile = File(...),
plugin_loader=Depends(get_plugin_loader),
_ = Depends(has_permission("plugin:storage", "WRITE"))
):
with belief_scope("upload_file"):
storage_plugin: StoragePlugin = plugin_loader.get_plugin("storage-manager")
if not storage_plugin:
raise HTTPException(status_code=500, detail="Storage plugin not loaded")
try:
return await storage_plugin.save_file(file, category, path)
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:upload_file:Function]
# [DEF:delete_file:Function]
# @PURPOSE: Delete a specific file or directory.
#
# @PRE: category must be a valid FileCategory.
# @POST: Item is removed from storage.
#
# @PARAM: category (FileCategory) - File category.
# @PARAM: path (str) - Relative path of the item.
# @RETURN: None
#
# @SIDE_EFFECT: Deletes item from the filesystem.
#
# @RELATION: CALLS -> StoragePlugin.delete_file
@router.delete("/files/{category}/{path:path}", status_code=204)
async def delete_file(
category: FileCategory,
path: str,
plugin_loader=Depends(get_plugin_loader),
_ = Depends(has_permission("plugin:storage", "WRITE"))
):
with belief_scope("delete_file"):
storage_plugin: StoragePlugin = plugin_loader.get_plugin("storage-manager")
if not storage_plugin:
raise HTTPException(status_code=500, detail="Storage plugin not loaded")
try:
storage_plugin.delete_file(category, path)
except FileNotFoundError:
raise HTTPException(status_code=404, detail="File not found")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:delete_file:Function]
# [DEF:download_file:Function]
# @PURPOSE: Retrieve a file for download.
#
# @PRE: category must be a valid FileCategory.
# @POST: Returns a FileResponse.
#
# @PARAM: category (FileCategory) - File category.
# @PARAM: path (str) - Relative path of the file.
# @RETURN: FileResponse - The file content.
#
# @RELATION: CALLS -> StoragePlugin.get_file_path
@router.get("/download/{category}/{path:path}")
async def download_file(
category: FileCategory,
path: str,
plugin_loader=Depends(get_plugin_loader),
_ = Depends(has_permission("plugin:storage", "READ"))
):
with belief_scope("download_file"):
storage_plugin: StoragePlugin = plugin_loader.get_plugin("storage-manager")
if not storage_plugin:
raise HTTPException(status_code=500, detail="Storage plugin not loaded")
try:
abs_path = storage_plugin.get_file_path(category, path)
filename = Path(path).name
return FileResponse(path=abs_path, filename=filename)
except FileNotFoundError:
raise HTTPException(status_code=404, detail="File not found")
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
# [/DEF:download_file:Function]
# [/DEF:storage_routes:Module]
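
Note on the storage routes above: the module invariant says all paths must be validated against path traversal, and the handlers translate a ValueError from the plugin into a 400 response. The plugin's actual check is not shown in this diff, so the following is only one common way to enforce such an invariant with pathlib; `resolve_inside_root` and `is_inside_root` are hypothetical helpers.

```python
from pathlib import Path


def resolve_inside_root(root: Path, relative_path: str) -> Path:
    """Resolve relative_path under root, raising ValueError if it escapes root."""
    root = root.resolve()
    candidate = (root / relative_path).resolve()
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.relative_to(root)
    except ValueError:
        raise ValueError(f"Path escapes storage root: {relative_path}")
    return candidate


def is_inside_root(root: Path, relative_path: str) -> bool:
    """Boolean convenience wrapper around resolve_inside_root."""
    try:
        resolve_inside_root(root, relative_path)
        return True
    except ValueError:
        return False


storage_root = Path("/tmp/storage-root")
```

Resolving before comparing matters: a plain string prefix check would accept `backups/../../etc/passwd`, while the resolved path clearly falls outside the root. An absolute `relative_path` is rejected as well, because joining it onto the root discards the root.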


@@ -3,12 +3,13 @@
# @PURPOSE: Defines the FastAPI router for task-related endpoints, allowing clients to create, list, and get the status of tasks. # @PURPOSE: Defines the FastAPI router for task-related endpoints, allowing clients to create, list, and get the status of tasks.
# @LAYER: UI (API) # @LAYER: UI (API)
# @RELATION: Depends on the TaskManager. It is included by the main app. # @RELATION: Depends on the TaskManager. It is included by the main app.
from typing import List, Dict, Any from typing import List, Dict, Any, Optional
from fastapi import APIRouter, Depends, HTTPException, status from fastapi import APIRouter, Depends, HTTPException, status
from pydantic import BaseModel from pydantic import BaseModel
from ...core.logger import belief_scope
from ...core.task_manager import TaskManager, Task from ...core.task_manager import TaskManager, Task, TaskStatus, LogEntry
from ...dependencies import get_task_manager from ...dependencies import get_task_manager, has_permission, get_current_user
router = APIRouter() router = APIRouter()
@@ -19,57 +20,192 @@ class CreateTaskRequest(BaseModel):
class ResolveTaskRequest(BaseModel): class ResolveTaskRequest(BaseModel):
resolution_params: Dict[str, Any] resolution_params: Dict[str, Any]
@router.post("/", response_model=Task, status_code=status.HTTP_201_CREATED) class ResumeTaskRequest(BaseModel):
passwords: Dict[str, str]
@router.post("", response_model=Task, status_code=status.HTTP_201_CREATED)
# [DEF:create_task:Function]
# @PURPOSE: Create and start a new task for a given plugin.
# @PARAM: request (CreateTaskRequest) - The request body containing plugin_id and params.
# @PARAM: task_manager (TaskManager) - The task manager instance.
# @PRE: plugin_id must exist and params must be valid for that plugin.
# @POST: A new task is created and started.
# @RETURN: Task - The created task instance.
async def create_task( async def create_task(
request: CreateTaskRequest, request: CreateTaskRequest,
task_manager: TaskManager = Depends(get_task_manager) task_manager: TaskManager = Depends(get_task_manager),
current_user = Depends(get_current_user)
): ):
# Dynamic permission check based on plugin_id
has_permission(f"plugin:{request.plugin_id}", "EXECUTE")(current_user)
""" """
Create and start a new task for a given plugin. Create and start a new task for a given plugin.
""" """
try: with belief_scope("create_task"):
task = await task_manager.create_task( try:
plugin_id=request.plugin_id, # Special handling for validation task to include provider config
params=request.params if request.plugin_id == "llm_dashboard_validation":
) from ...core.database import SessionLocal
return task from ...services.llm_provider import LLMProviderService
except ValueError as e: db = SessionLocal()
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e)) try:
llm_service = LLMProviderService(db)
provider_id = request.params.get("provider_id")
if provider_id:
db_provider = llm_service.get_provider(provider_id)
if not db_provider:
raise ValueError(f"LLM Provider {provider_id} not found")
finally:
db.close()
@router.get("/", response_model=List[Task]) task = await task_manager.create_task(
plugin_id=request.plugin_id,
params=request.params
)
return task
except ValueError as e:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail=str(e))
# [/DEF:create_task:Function]
@router.get("", response_model=List[Task])
# [DEF:list_tasks:Function]
# @PURPOSE: Retrieve a list of tasks with pagination and optional status filter.
# @PARAM: limit (int) - Maximum number of tasks to return.
# @PARAM: offset (int) - Number of tasks to skip.
# @PARAM: status (Optional[TaskStatus]) - Filter by task status.
# @PARAM: task_manager (TaskManager) - The task manager instance.
# @PRE: task_manager must be available.
# @POST: Returns a list of tasks.
# @RETURN: List[Task] - List of tasks.
async def list_tasks( async def list_tasks(
task_manager: TaskManager = Depends(get_task_manager) limit: int = 10,
offset: int = 0,
status: Optional[TaskStatus] = None,
task_manager: TaskManager = Depends(get_task_manager),
_ = Depends(has_permission("tasks", "READ"))
): ):
""" """
Retrieve a list of all tasks. Retrieve a list of tasks with pagination and optional status filter.
""" """
return task_manager.get_all_tasks() with belief_scope("list_tasks"):
return task_manager.get_tasks(limit=limit, offset=offset, status=status)
# [/DEF:list_tasks:Function]
@router.get("/{task_id}", response_model=Task) @router.get("/{task_id}", response_model=Task)
# [DEF:get_task:Function]
# @PURPOSE: Retrieve the details of a specific task.
# @PARAM: task_id (str) - The unique identifier of the task.
# @PARAM: task_manager (TaskManager) - The task manager instance.
# @PRE: task_id must exist.
# @POST: Returns task details or raises 404.
# @RETURN: Task - The task details.
async def get_task( async def get_task(
task_id: str, task_id: str,
task_manager: TaskManager = Depends(get_task_manager) task_manager: TaskManager = Depends(get_task_manager),
_ = Depends(has_permission("tasks", "READ"))
): ):
""" """
Retrieve the details of a specific task. Retrieve the details of a specific task.
""" """
task = task_manager.get_task(task_id) with belief_scope("get_task"):
if not task: task = task_manager.get_task(task_id)
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Task not found") if not task:
return task raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Task not found")
return task
# [/DEF:get_task:Function]
@router.get("/{task_id}/logs", response_model=List[LogEntry])
# [DEF:get_task_logs:Function]
# @PURPOSE: Retrieve logs for a specific task.
# @PARAM: task_id (str) - The unique identifier of the task.
# @PARAM: task_manager (TaskManager) - The task manager instance.
# @PRE: task_id must exist.
# @POST: Returns a list of log entries or raises 404.
# @RETURN: List[LogEntry] - List of log entries.
async def get_task_logs(
task_id: str,
task_manager: TaskManager = Depends(get_task_manager),
_ = Depends(has_permission("tasks", "READ"))
):
"""
Retrieve logs for a specific task.
"""
with belief_scope("get_task_logs"):
task = task_manager.get_task(task_id)
if not task:
raise HTTPException(status_code=status.HTTP_404_NOT_FOUND, detail="Task not found")
return task_manager.get_task_logs(task_id)
# [/DEF:get_task_logs:Function]
@router.post("/{task_id}/resolve", response_model=Task) @router.post("/{task_id}/resolve", response_model=Task)
# [DEF:resolve_task:Function]
# @PURPOSE: Resolve a task that is awaiting mapping.
# @PARAM: task_id (str) - The unique identifier of the task.
# @PARAM: request (ResolveTaskRequest) - The resolution parameters.
# @PARAM: task_manager (TaskManager) - The task manager instance.
# @PRE: task must be in AWAITING_MAPPING status.
# @POST: Task is resolved and resumes execution.
# @RETURN: Task - The updated task object.
async def resolve_task( async def resolve_task(
task_id: str, task_id: str,
request: ResolveTaskRequest, request: ResolveTaskRequest,
task_manager: TaskManager = Depends(get_task_manager) task_manager: TaskManager = Depends(get_task_manager),
_ = Depends(has_permission("tasks", "WRITE"))
): ):
""" """
Resolve a task that is awaiting mapping. Resolve a task that is awaiting mapping.
""" """
try: with belief_scope("resolve_task"):
await task_manager.resolve_task(task_id, request.resolution_params) try:
return task_manager.get_task(task_id) await task_manager.resolve_task(task_id, request.resolution_params)
except ValueError as e: return task_manager.get_task(task_id)
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e)) except ValueError as e:
# [/DEF] raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
# [/DEF:resolve_task:Function]
@router.post("/{task_id}/resume", response_model=Task)
# [DEF:resume_task:Function]
# @PURPOSE: Resume a task that is awaiting input (e.g., passwords).
# @PARAM: task_id (str) - The unique identifier of the task.
# @PARAM: request (ResumeTaskRequest) - The input (passwords).
# @PARAM: task_manager (TaskManager) - The task manager instance.
# @PRE: task must be in AWAITING_INPUT status.
# @POST: Task resumes execution with provided input.
# @RETURN: Task - The updated task object.
async def resume_task(
task_id: str,
request: ResumeTaskRequest,
task_manager: TaskManager = Depends(get_task_manager),
_ = Depends(has_permission("tasks", "WRITE"))
):
"""
Resume a task that is awaiting input (e.g., passwords).
"""
with belief_scope("resume_task"):
try:
task_manager.resume_task_with_password(task_id, request.passwords)
return task_manager.get_task(task_id)
except ValueError as e:
raise HTTPException(status_code=status.HTTP_400_BAD_REQUEST, detail=str(e))
# [/DEF:resume_task:Function]
@router.delete("", status_code=status.HTTP_204_NO_CONTENT)
# [DEF:clear_tasks:Function]
# @PURPOSE: Clear tasks matching the status filter.
# @PARAM: status (Optional[TaskStatus]) - Filter by task status.
# @PARAM: task_manager (TaskManager) - The task manager instance.
# @PRE: task_manager is available.
# @POST: Tasks are removed from memory/persistence.
async def clear_tasks(
status: Optional[TaskStatus] = None,
task_manager: TaskManager = Depends(get_task_manager),
_ = Depends(has_permission("tasks", "WRITE"))
):
"""
Clear tasks matching the status filter. If no filter, clears all non-running tasks.
"""
with belief_scope("clear_tasks", f"status={status}"):
task_manager.clear_tasks(status)
return
# [/DEF:clear_tasks:Function]
# [/DEF:TasksRouter:Module]
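
Review note: `list_tasks` now forwards `limit`, `offset` and `status` to `task_manager.get_tasks`, whose body is not part of this diff. The sketch below shows one plausible reading of that contract (filter by status first, then slice), under the assumption that tasks live in a plain list. The `Task` and `TaskStatus` types here are hypothetical stand-ins, not the project's models.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class TaskStatus(str, Enum):  # hypothetical stand-in for the project's enum
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    SUCCESS = "SUCCESS"


@dataclass
class Task:  # hypothetical minimal task record
    id: str
    status: TaskStatus


def get_tasks(tasks: List[Task], limit: int = 10, offset: int = 0,
              status: Optional[TaskStatus] = None) -> List[Task]:
    """Filter by status first, then apply offset/limit to the filtered list."""
    if limit < 0 or offset < 0:
        raise ValueError("limit and offset must be non-negative")
    selected = [t for t in tasks if status is None or t.status == status]
    return selected[offset:offset + limit]


# Six tasks: odd ids are RUNNING, even ids are SUCCESS.
tasks = [Task(str(i), TaskStatus.RUNNING if i % 2 else TaskStatus.SUCCESS) for i in range(6)]
page = get_tasks(tasks, limit=2, offset=1, status=TaskStatus.RUNNING)
```

Filtering before slicing keeps page sizes stable when a status filter is active. Slicing first would return short or empty pages.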


@@ -6,26 +6,24 @@
import sys import sys
from pathlib import Path from pathlib import Path
# Add project root to sys.path to allow importing superset_tool # project_root is used for static files mounting
# Assuming app.py is in backend/src/
project_root = Path(__file__).resolve().parent.parent.parent project_root = Path(__file__).resolve().parent.parent.parent
sys.path.append(str(project_root))
from fastapi import FastAPI, WebSocket, WebSocketDisconnect, Depends from fastapi import FastAPI, WebSocket, WebSocketDisconnect, Depends, Request, HTTPException
from starlette.middleware.sessions import SessionMiddleware
from fastapi.middleware.cors import CORSMiddleware from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles from fastapi.staticfiles import StaticFiles
from fastapi.responses import FileResponse from fastapi.responses import FileResponse
import asyncio import asyncio
import os import os
from .dependencies import get_task_manager from .dependencies import get_task_manager, get_scheduler_service
from .core.logger import logger from .core.utils.network import NetworkError
from .api.routes import plugins, tasks, settings, environments, mappings from .core.logger import logger, belief_scope
from .api.routes import plugins, tasks, settings, environments, mappings, migration, connections, git, storage, admin, llm
from .api import auth
from .core.database import init_db from .core.database import init_db
# Initialize database
init_db()
# [DEF:App:Global] # [DEF:App:Global]
# @SEMANTICS: app, fastapi, instance # @SEMANTICS: app, fastapi, instance
# @PURPOSE: The global FastAPI application instance. # @PURPOSE: The global FastAPI application instance.
@@ -34,6 +32,35 @@ app = FastAPI(
description="API for managing Superset automation tools and plugins.", description="API for managing Superset automation tools and plugins.",
version="1.0.0", version="1.0.0",
) )
# [/DEF:App:Global]
# [DEF:startup_event:Function]
# @PURPOSE: Handles application startup tasks, such as starting the scheduler.
# @PRE: None.
# @POST: Scheduler is started.
# Startup event
@app.on_event("startup")
async def startup_event():
with belief_scope("startup_event"):
scheduler = get_scheduler_service()
scheduler.start()
# [/DEF:startup_event:Function]
# [DEF:shutdown_event:Function]
# @PURPOSE: Handles application shutdown tasks, such as stopping the scheduler.
# @PRE: None.
# @POST: Scheduler is stopped.
# Shutdown event
@app.on_event("shutdown")
async def shutdown_event():
with belief_scope("shutdown_event"):
scheduler = get_scheduler_service()
scheduler.stop()
# [/DEF:shutdown_event:Function]
# Configure Session Middleware (required by Authlib for OAuth2 flow)
from .core.auth.config import auth_config
app.add_middleware(SessionMiddleware, secret_key=auth_config.SECRET_KEY)
# Configure CORS # Configure CORS
app.add_middleware( app.add_middleware(
@@ -45,33 +72,92 @@ app.add_middleware(
) )
# [DEF:log_requests:Function]
# @PURPOSE: Middleware to log incoming HTTP requests and their response status.
# @PRE: request is a FastAPI Request object.
# @POST: Logs request and response details.
# @PARAM: request (Request) - The incoming request object.
# @PARAM: call_next (Callable) - The next middleware or route handler.
@app.exception_handler(NetworkError)
async def network_error_handler(request: Request, exc: NetworkError):
with belief_scope("network_error_handler"):
logger.error(f"Network error: {exc}")
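# Exception handlers must return a Response; an HTTPException instance is not one.
from fastapi.responses import JSONResponse
return JSONResponse(
status_code=503,
content={"detail": "Environment unavailable. Please check if the Superset instance is running."}
)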
@app.middleware("http")
async def log_requests(request: Request, call_next):
# Avoid spamming logs for polling endpoints
is_polling = request.url.path.endswith("/api/tasks") and request.method == "GET"
if not is_polling:
logger.info(f"Incoming request: {request.method} {request.url.path}")
try:
response = await call_next(request)
if not is_polling:
logger.info(f"Response status: {response.status_code} for {request.url.path}")
return response
except NetworkError as e:
logger.error(f"Network error caught in middleware: {e}")
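# An HTTPException raised inside middleware bypasses FastAPI's exception handlers, so return the response directly.
from fastapi.responses import JSONResponse
return JSONResponse(
status_code=503,
content={"detail": "Environment unavailable. Please check if the Superset instance is running."}
)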
# [/DEF:log_requests:Function]
# Include API routes # Include API routes
app.include_router(auth.router)
app.include_router(admin.router)
app.include_router(plugins.router, prefix="/api/plugins", tags=["Plugins"]) app.include_router(plugins.router, prefix="/api/plugins", tags=["Plugins"])
app.include_router(tasks.router, prefix="/api/tasks", tags=["Tasks"]) app.include_router(tasks.router, prefix="/api/tasks", tags=["Tasks"])
app.include_router(settings.router, prefix="/api/settings", tags=["Settings"]) app.include_router(settings.router, prefix="/api/settings", tags=["Settings"])
app.include_router(environments.router) app.include_router(connections.router, prefix="/api/settings/connections", tags=["Connections"])
app.include_router(environments.router, prefix="/api/environments", tags=["Environments"])
app.include_router(mappings.router) app.include_router(mappings.router)
app.include_router(migration.router)
app.include_router(git.router)
app.include_router(llm.router)
app.include_router(storage.router, prefix="/api/storage", tags=["Storage"])
# [DEF:WebSocketEndpoint:Endpoint] # [DEF:websocket_endpoint:Function]
# @SEMANTICS: websocket, logs, streaming, real-time # @PURPOSE: Provides a WebSocket endpoint for real-time log streaming of a task.
# @PURPOSE: Provides a WebSocket endpoint for clients to connect to and receive real-time log entries for a specific task. # @PRE: task_id must be a valid task ID.
# @POST: WebSocket connection is managed and logs are streamed until disconnect.
@app.websocket("/ws/logs/{task_id}") @app.websocket("/ws/logs/{task_id}")
async def websocket_endpoint(websocket: WebSocket, task_id: str): async def websocket_endpoint(websocket: WebSocket, task_id: str):
await websocket.accept() with belief_scope("websocket_endpoint", f"task_id={task_id}"):
await websocket.accept()
logger.info(f"WebSocket connection accepted for task {task_id}") logger.info(f"WebSocket connection accepted for task {task_id}")
task_manager = get_task_manager() task_manager = get_task_manager()
queue = await task_manager.subscribe_logs(task_id) queue = await task_manager.subscribe_logs(task_id)
try: try:
# Send initial logs if any # Stream new logs
logger.info(f"Starting log stream for task {task_id}")
# Send initial logs first to build context
initial_logs = task_manager.get_task_logs(task_id) initial_logs = task_manager.get_task_logs(task_id)
for log_entry in initial_logs: for log_entry in initial_logs:
# Convert datetime to string for JSON serialization
log_dict = log_entry.dict() log_dict = log_entry.dict()
log_dict['timestamp'] = log_dict['timestamp'].isoformat() log_dict['timestamp'] = log_dict['timestamp'].isoformat()
await websocket.send_json(log_dict) await websocket.send_json(log_dict)
# Stream new logs # Force a check for AWAITING_INPUT status immediately upon connection
logger.info(f"Starting log stream for task {task_id}") # This ensures that if the task is already waiting when the user connects, they get the prompt.
task = task_manager.get_task(task_id)
if task and task.status == "AWAITING_INPUT" and task.input_request:
# Construct a synthetic log entry to trigger the frontend handler
# This is a bit of a hack but avoids changing the websocket protocol significantly
synthetic_log = {
"timestamp": task.logs[-1].timestamp.isoformat() if task.logs else "2024-01-01T00:00:00",
"level": "INFO",
"message": "Task paused for user input (Connection Re-established)",
"context": {"input_request": task.input_request}
}
await websocket.send_json(synthetic_log)
while True: while True:
log_entry = await queue.get() log_entry = await queue.get()
log_dict = log_entry.dict() log_dict = log_entry.dict()
@@ -83,7 +169,9 @@ async def websocket_endpoint(websocket: WebSocket, task_id: str):
if "Task completed successfully" in log_entry.message or "Task failed" in log_entry.message: if "Task completed successfully" in log_entry.message or "Task failed" in log_entry.message:
# Wait a bit to ensure client receives the last message # Wait a bit to ensure client receives the last message
await asyncio.sleep(2) await asyncio.sleep(2)
break # DO NOT BREAK here - allow client to keep connection open if they want to review logs
# or until they disconnect. Breaking closes the socket immediately.
# break
except WebSocketDisconnect: except WebSocketDisconnect:
logger.info(f"WebSocket connection disconnected for task {task_id}") logger.info(f"WebSocket connection disconnected for task {task_id}")
@@ -91,8 +179,7 @@ async def websocket_endpoint(websocket: WebSocket, task_id: str):
logger.error(f"WebSocket error for task {task_id}: {e}") logger.error(f"WebSocket error for task {task_id}: {e}")
finally: finally:
task_manager.unsubscribe_logs(task_id, queue) task_manager.unsubscribe_logs(task_id, queue)
# [/DEF:websocket_endpoint:Function]
# [/DEF]
# [DEF:StaticFiles:Mount] # [DEF:StaticFiles:Mount]
# @SEMANTICS: static, frontend, spa # @SEMANTICS: static, frontend, spa
@@ -102,18 +189,33 @@ if frontend_path.exists():
app.mount("/_app", StaticFiles(directory=str(frontend_path / "_app")), name="static") app.mount("/_app", StaticFiles(directory=str(frontend_path / "_app")), name="static")
# Serve other static files from the root of build directory # Serve other static files from the root of build directory
# [DEF:serve_spa:Function]
# @PURPOSE: Serves frontend static files or index.html for SPA routing.
# @PRE: file_path is requested by the client.
# @POST: Returns the requested file or index.html as a fallback.
@app.get("/{file_path:path}") @app.get("/{file_path:path}")
async def serve_spa(file_path: str): async def serve_spa(file_path: str):
full_path = frontend_path / file_path with belief_scope("serve_spa", f"path={file_path}"):
if full_path.is_file(): # Don't serve SPA for API routes that fell through
return FileResponse(str(full_path)) if file_path.startswith("api/"):
# Fallback to index.html for SPA routing logger.info(f"[DEBUG] API route fell through to serve_spa: {file_path}")
return FileResponse(str(frontend_path / "index.html")) raise HTTPException(status_code=404, detail=f"API endpoint not found: {file_path}")
full_path = frontend_path / file_path
if full_path.is_file():
return FileResponse(str(full_path))
# Fallback to index.html for SPA routing
return FileResponse(str(frontend_path / "index.html"))
# [/DEF:serve_spa:Function]
else: else:
# [DEF:RootEndpoint:Endpoint] # [DEF:read_root:Function]
# @SEMANTICS: root, healthcheck # @PURPOSE: A simple root endpoint to confirm that the API is running when frontend is missing.
# @PURPOSE: A simple root endpoint to confirm that the API is running. # @PRE: None.
# @POST: Returns a JSON message indicating API status.
@app.get("/") @app.get("/")
async def read_root(): async def read_root():
return {"message": "Superset Tools API is running (Frontend build not found)"} with belief_scope("read_root"):
# [/DEF] return {"message": "Superset Tools API is running (Frontend build not found)"}
# [/DEF:read_root:Function]
# [/DEF:StaticFiles:Mount]
# [/DEF:AppModule:Module]
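
Review note: `serve_spa` joins `frontend_path` with the client-supplied `file_path` and serves the result if it is a file. Depending on how the server normalizes the URL, a path containing `..` segments could point outside the build directory. A minimal sketch of a guard, assuming a hypothetical helper named `resolve_static_path` that the route could call before building the `FileResponse`:

```python
from pathlib import Path
from typing import Optional


def resolve_static_path(root: Path, requested: str) -> Optional[Path]:
    """Return the file under root for `requested`, or None if it escapes root or is not a file."""
    root = root.resolve()
    candidate = (root / requested).resolve()
    # Reject anything that resolves outside the build directory (e.g. "../secret.txt").
    if candidate != root and root not in candidate.parents:
        return None
    return candidate if candidate.is_file() else None
```

When the helper returns `None`, the route can fall back to `index.html` exactly as it does today, so SPA routing is unaffected.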


@@ -0,0 +1,45 @@
# [DEF:backend.src.core.auth.config:Module]
#
# @SEMANTICS: auth, config, settings, jwt, adfs
# @PURPOSE: Centralized configuration for authentication and authorization.
# @LAYER: Core
# @RELATION: DEPENDS_ON -> pydantic
#
# @INVARIANT: All sensitive configuration must have defaults or be loaded from environment.
# [SECTION: IMPORTS]
from pydantic import Field
from pydantic_settings import BaseSettings
import os
# [/SECTION]
# [DEF:AuthConfig:Class]
# @PURPOSE: Holds authentication-related settings.
# @PRE: Environment variables may be provided via .env file.
# @POST: Returns a configuration object with validated settings.
class AuthConfig(BaseSettings):
# JWT Settings
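# pydantic-settings ignores Field(env=...); validation_alias is what maps the field to AUTH_SECRET_KEY.
SECRET_KEY: str = Field(default="super-secret-key-change-in-production", validation_alias="AUTH_SECRET_KEY")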
ALGORITHM: str = "HS256"
ACCESS_TOKEN_EXPIRE_MINUTES: int = 480
REFRESH_TOKEN_EXPIRE_DAYS: int = 7
# Database Settings
AUTH_DATABASE_URL: str = Field(default="sqlite:///./backend/auth.db", env="AUTH_DATABASE_URL")
# ADFS Settings
ADFS_CLIENT_ID: str = Field(default="", env="ADFS_CLIENT_ID")
ADFS_CLIENT_SECRET: str = Field(default="", env="ADFS_CLIENT_SECRET")
ADFS_METADATA_URL: str = Field(default="", env="ADFS_METADATA_URL")
class Config:
env_file = ".env"
extra = "ignore"
# [/DEF:AuthConfig:Class]
# [DEF:auth_config:Variable]
# @PURPOSE: Singleton instance of AuthConfig.
auth_config = AuthConfig()
# [/DEF:auth_config:Variable]
# [/DEF:backend.src.core.auth.config:Module]
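
Review note: `SECRET_KEY` ships with a well-known default, and the module invariant allows defaults for sensitive settings. One way to keep the convenience in development while refusing the default in production is sketched below. The `load_secret_key` helper and its `production` flag are assumptions for illustration and are not part of this change.

```python
import os
from typing import Mapping, Optional

# The default value declared on AuthConfig.SECRET_KEY in this diff.
DEFAULT_SECRET = "super-secret-key-change-in-production"


def load_secret_key(env: Optional[Mapping[str, str]] = None, production: bool = False) -> str:
    """Read AUTH_SECRET_KEY, refusing to fall back to the default when production=True."""
    env = os.environ if env is None else env
    secret = env.get("AUTH_SECRET_KEY", DEFAULT_SECRET)
    if production and secret == DEFAULT_SECRET:
        raise RuntimeError("AUTH_SECRET_KEY must be set to a non-default value in production")
    return secret
```

Failing at startup is preferable here because the same key signs both JWTs and the session cookie used by `SessionMiddleware`.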


@@ -0,0 +1,54 @@
# [DEF:backend.src.core.auth.jwt:Module]
#
# @SEMANTICS: jwt, token, session, auth
# @PURPOSE: JWT token generation and validation logic.
# @LAYER: Core
# @RELATION: DEPENDS_ON -> jose
# @RELATION: USES -> backend.src.core.auth.config.auth_config
#
# @INVARIANT: Tokens must include expiration time and user identifier.
# [SECTION: IMPORTS]
from datetime import datetime, timedelta
from typing import Optional, List
from jose import JWTError, jwt
from .config import auth_config
from ..logger import belief_scope
# [/SECTION]
# [DEF:create_access_token:Function]
# @PURPOSE: Generates a new JWT access token.
# @PRE: data dict contains 'sub' (user_id) and optional 'scopes' (roles).
# @POST: Returns a signed JWT string.
#
# @PARAM: data (dict) - Payload data for the token.
# @PARAM: expires_delta (Optional[timedelta]) - Custom expiration time.
# @RETURN: str - The encoded JWT.
def create_access_token(data: dict, expires_delta: Optional[timedelta] = None) -> str:
with belief_scope("create_access_token"):
to_encode = data.copy()
if expires_delta:
expire = datetime.utcnow() + expires_delta
else:
expire = datetime.utcnow() + timedelta(minutes=auth_config.ACCESS_TOKEN_EXPIRE_MINUTES)
to_encode.update({"exp": expire})
encoded_jwt = jwt.encode(to_encode, auth_config.SECRET_KEY, algorithm=auth_config.ALGORITHM)
return encoded_jwt
# [/DEF:create_access_token:Function]
# [DEF:decode_token:Function]
# @PURPOSE: Decodes and validates a JWT token.
# @PRE: token is a signed JWT string.
# @POST: Returns the decoded payload if valid.
#
# @PARAM: token (str) - The JWT to decode.
# @RETURN: dict - The decoded payload.
# @THROW: jose.JWTError - If token is invalid or expired.
def decode_token(token: str) -> dict:
with belief_scope("decode_token"):
payload = jwt.decode(token, auth_config.SECRET_KEY, algorithms=[auth_config.ALGORITHM])
return payload
# [/DEF:decode_token:Function]
# [/DEF:backend.src.core.auth.jwt:Module]
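
Review note: `create_access_token` computes `exp` with `datetime.utcnow()`, which returns a naive datetime and is deprecated since Python 3.12. A timezone-aware variant of the expiry calculation could look like the sketch below. `compute_expiry` is a hypothetical helper, and the 480-minute constant mirrors `ACCESS_TOKEN_EXPIRE_MINUTES` from `AuthConfig`.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

ACCESS_TOKEN_EXPIRE_MINUTES = 480  # same default as AuthConfig in this diff


def compute_expiry(expires_delta: Optional[timedelta] = None,
                   now: Optional[datetime] = None) -> datetime:
    """Return an aware UTC expiry time; `now` is injectable to make tests deterministic."""
    if now is None:
        now = datetime.now(timezone.utc)
    if expires_delta is None:
        expires_delta = timedelta(minutes=ACCESS_TOKEN_EXPIRE_MINUTES)
    return now + expires_delta
```

The result can be placed in the `exp` claim the same way the current code does. Whether the JWT library in use accepts aware datetimes should be confirmed against its documentation before switching.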


@@ -0,0 +1,31 @@
# [DEF:backend.src.core.auth.logger:Module]
#
# @SEMANTICS: auth, logger, audit, security
# @PURPOSE: Audit logging for security-related events.
# @LAYER: Core
# @RELATION: USES -> backend.src.core.logger.belief_scope
#
# @INVARIANT: Must not log sensitive data like passwords or full tokens.
# [SECTION: IMPORTS]
from ..logger import logger, belief_scope
from datetime import datetime
# [/SECTION]
# [DEF:log_security_event:Function]
# @PURPOSE: Logs a security-related event for audit trails.
# @PRE: event_type and username are strings.
# @POST: Security event is written to the application log.
# @PARAM: event_type (str) - Type of event (e.g., LOGIN_SUCCESS, PERMISSION_DENIED).
# @PARAM: username (str) - The user involved in the event.
# @PARAM: details (dict) - Additional non-sensitive metadata.
def log_security_event(event_type: str, username: str, details: dict = None):
with belief_scope("log_security_event", f"{event_type}:{username}"):
timestamp = datetime.utcnow().isoformat()
msg = f"[AUDIT][{timestamp}][{event_type}] User: {username}"
if details:
msg += f" Details: {details}"
logger.info(msg)
# [/DEF:log_security_event:Function]
# [/DEF:backend.src.core.auth.logger:Module]
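
Review note: the module invariant says passwords and full tokens must not be logged, but `log_security_event` formats `details` as given, so the invariant currently depends on every caller. A small redaction step before formatting would enforce it in one place. The sketch below assumes a hypothetical `redact_details` helper and an illustrative key list.

```python
from typing import Any, Dict, Optional

# Assumed list of sensitive keys; adjust to the fields the application actually passes.
SENSITIVE_KEYS = {"password", "token", "access_token", "refresh_token", "secret"}


def redact_details(details: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Return a copy of details with values of sensitive keys masked."""
    if not details:
        return {}
    return {k: "***" if k.lower() in SENSITIVE_KEYS else v for k, v in details.items()}
```

The check is deliberately case-insensitive and shallow. Nested dictionaries would need a recursive variant.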


@@ -0,0 +1,51 @@
# [DEF:backend.src.core.auth.oauth:Module]
#
# @SEMANTICS: auth, oauth, oidc, adfs
# @PURPOSE: ADFS OIDC configuration and client using Authlib.
# @LAYER: Core
# @RELATION: DEPENDS_ON -> authlib
# @RELATION: USES -> backend.src.core.auth.config.auth_config
#
# @INVARIANT: Must use secure OIDC flows.
# [SECTION: IMPORTS]
from authlib.integrations.starlette_client import OAuth
from .config import auth_config
# [/SECTION]
# [DEF:oauth:Variable]
# @PURPOSE: Global Authlib OAuth registry.
oauth = OAuth()
# [/DEF:oauth:Variable]
# [DEF:register_adfs:Function]
# @PURPOSE: Registers the ADFS OIDC client.
# @PRE: ADFS configuration is provided in auth_config.
# @POST: ADFS client is registered in oauth registry.
def register_adfs():
if auth_config.ADFS_CLIENT_ID:
oauth.register(
name='adfs',
client_id=auth_config.ADFS_CLIENT_ID,
client_secret=auth_config.ADFS_CLIENT_SECRET,
server_metadata_url=auth_config.ADFS_METADATA_URL,
client_kwargs={
'scope': 'openid email profile groups'
}
)
# [/DEF:register_adfs:Function]
# [DEF:is_adfs_configured:Function]
# @PURPOSE: Checks if ADFS is properly configured.
# @PRE: None.
# @POST: Returns True if ADFS client is registered, False otherwise.
# @RETURN: bool - Configuration status.
def is_adfs_configured() -> bool:
"""Check if ADFS OAuth client is registered."""
return 'adfs' in oauth._registry
# [/DEF:is_adfs_configured:Function]
# Initial registration
register_adfs()
# [/DEF:backend.src.core.auth.oauth:Module]


@@ -0,0 +1,123 @@
# [DEF:backend.src.core.auth.repository:Module]
#
# @SEMANTICS: auth, repository, database, user, role
# @PURPOSE: Data access layer for authentication-related entities.
# @LAYER: Core
# @RELATION: DEPENDS_ON -> sqlalchemy
# @RELATION: USES -> backend.src.models.auth
#
# @INVARIANT: All database operations must be performed within a session.
# [SECTION: IMPORTS]
from typing import Optional, List
from sqlalchemy.orm import Session
from ...models.auth import User, Role, Permission, ADGroupMapping
from ..logger import belief_scope
# [/SECTION]
# [DEF:AuthRepository:Class]
# @PURPOSE: Encapsulates database operations for authentication.
class AuthRepository:
# [DEF:__init__:Function]
# @PURPOSE: Initializes the repository with a database session.
# @PARAM: db (Session) - SQLAlchemy session.
def __init__(self, db: Session):
self.db = db
# [/DEF:__init__:Function]
# [DEF:get_user_by_username:Function]
# @PURPOSE: Retrieves a user by their username.
# @PRE: username is a string.
# @POST: Returns User object if found, else None.
# @PARAM: username (str) - The username to search for.
# @RETURN: Optional[User] - The found user or None.
def get_user_by_username(self, username: str) -> Optional[User]:
with belief_scope("AuthRepository.get_user_by_username"):
return self.db.query(User).filter(User.username == username).first()
# [/DEF:get_user_by_username:Function]
# [DEF:get_user_by_id:Function]
# @PURPOSE: Retrieves a user by their unique ID.
# @PRE: user_id is a valid UUID string.
# @POST: Returns User object if found, else None.
# @PARAM: user_id (str) - The user's unique identifier.
# @RETURN: Optional[User] - The found user or None.
def get_user_by_id(self, user_id: str) -> Optional[User]:
with belief_scope("AuthRepository.get_user_by_id"):
return self.db.query(User).filter(User.id == user_id).first()
# [/DEF:get_user_by_id:Function]
# [DEF:get_role_by_name:Function]
# @PURPOSE: Retrieves a role by its name.
# @PRE: name is a string.
# @POST: Returns Role object if found, else None.
# @PARAM: name (str) - The role name to search for.
# @RETURN: Optional[Role] - The found role or None.
def get_role_by_name(self, name: str) -> Optional[Role]:
with belief_scope("AuthRepository.get_role_by_name"):
return self.db.query(Role).filter(Role.name == name).first()
# [/DEF:get_role_by_name:Function]
# [DEF:update_last_login:Function]
# @PURPOSE: Updates the last_login timestamp for a user.
# @PRE: user object is a valid User instance.
# @POST: User's last_login is updated in the database.
# @SIDE_EFFECT: Commits the transaction.
# @PARAM: user (User) - The user to update.
def update_last_login(self, user: User):
with belief_scope("AuthRepository.update_last_login"):
from datetime import datetime
user.last_login = datetime.utcnow()
self.db.add(user)
self.db.commit()
# [/DEF:update_last_login:Function]
# [DEF:get_role_by_id:Function]
# @PURPOSE: Retrieves a role by its unique ID.
# @PRE: role_id is a string.
# @POST: Returns Role object if found, else None.
# @PARAM: role_id (str) - The role's unique identifier.
# @RETURN: Optional[Role] - The found role or None.
def get_role_by_id(self, role_id: str) -> Optional[Role]:
with belief_scope("AuthRepository.get_role_by_id"):
return self.db.query(Role).filter(Role.id == role_id).first()
# [/DEF:get_role_by_id:Function]
# [DEF:get_permission_by_id:Function]
# @PURPOSE: Retrieves a permission by its unique ID.
# @PRE: perm_id is a string.
# @POST: Returns Permission object if found, else None.
# @PARAM: perm_id (str) - The permission's unique identifier.
# @RETURN: Optional[Permission] - The found permission or None.
def get_permission_by_id(self, perm_id: str) -> Optional[Permission]:
with belief_scope("AuthRepository.get_permission_by_id"):
return self.db.query(Permission).filter(Permission.id == perm_id).first()
# [/DEF:get_permission_by_id:Function]
# [DEF:get_permission_by_resource_action:Function]
# @PURPOSE: Retrieves a permission by resource and action.
# @PRE: resource and action are strings.
# @POST: Returns Permission object if found, else None.
# @PARAM: resource (str) - The resource name.
# @PARAM: action (str) - The action name.
# @RETURN: Optional[Permission] - The found permission or None.
def get_permission_by_resource_action(self, resource: str, action: str) -> Optional[Permission]:
with belief_scope("AuthRepository.get_permission_by_resource_action"):
return self.db.query(Permission).filter(
Permission.resource == resource,
Permission.action == action
).first()
# [/DEF:get_permission_by_resource_action:Function]
# [DEF:list_permissions:Function]
# @PURPOSE: Lists all available permissions.
# @POST: Returns a list of all Permission objects.
# @RETURN: List[Permission] - List of permissions.
def list_permissions(self) -> List[Permission]:
with belief_scope("AuthRepository.list_permissions"):
return self.db.query(Permission).all()
# [/DEF:list_permissions:Function]
# [/DEF:AuthRepository:Class]
# [/DEF:backend.src.core.auth.repository:Module]


@@ -0,0 +1,42 @@
# [DEF:backend.src.core.auth.security:Module]
#
# @SEMANTICS: security, password, hashing, bcrypt
# @PURPOSE: Utility for password hashing and verification using Passlib.
# @LAYER: Core
# @RELATION: DEPENDS_ON -> passlib
#
# @INVARIANT: Uses bcrypt for hashing with standard work factor.
# [SECTION: IMPORTS]
from passlib.context import CryptContext
# [/SECTION]
# [DEF:pwd_context:Variable]
# @PURPOSE: Passlib CryptContext for password management.
pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")
# [/DEF:pwd_context:Variable]
# [DEF:verify_password:Function]
# @PURPOSE: Verifies a plain password against a hashed password.
# @PRE: plain_password is a string, hashed_password is a bcrypt hash.
# @POST: Returns True if password matches, False otherwise.
#
# @PARAM: plain_password (str) - The unhashed password.
# @PARAM: hashed_password (str) - The stored hash.
# @RETURN: bool - Verification result.
def verify_password(plain_password: str, hashed_password: str) -> bool:
return pwd_context.verify(plain_password, hashed_password)
# [/DEF:verify_password:Function]
# [DEF:get_password_hash:Function]
# @PURPOSE: Generates a bcrypt hash for a plain password.
# @PRE: password is a string.
# @POST: Returns a secure bcrypt hash string.
#
# @PARAM: password (str) - The password to hash.
# @RETURN: str - The generated hash.
def get_password_hash(password: str) -> str:
return pwd_context.hash(password)
# [/DEF:get_password_hash:Function]
# [/DEF:backend.src.core.auth.security:Module]
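
Review note: bcrypt only considers the first 72 bytes of its input. Depending on the backend version, longer passwords are either silently truncated or rejected. A pre-check before calling `get_password_hash` makes that limit explicit. The helper name below is hypothetical.

```python
BCRYPT_MAX_BYTES = 72  # bcrypt ignores input beyond this many bytes


def exceeds_bcrypt_limit(password: str) -> bool:
    """True if the UTF-8 encoding of password is longer than bcrypt can use."""
    return len(password.encode("utf-8")) > BCRYPT_MAX_BYTES
```

The length is measured in encoded bytes, not characters, so a password made of non-ASCII characters hits the limit sooner than its character count suggests.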


@@ -16,7 +16,7 @@ import os
from pathlib import Path from pathlib import Path
from typing import Optional, List from typing import Optional, List
from .config_models import AppConfig, Environment, GlobalSettings from .config_models import AppConfig, Environment, GlobalSettings
from .logger import logger from .logger import logger, configure_logger, belief_scope
# [/SECTION] # [/SECTION]
# [DEF:ConfigManager:Class] # [DEF:ConfigManager:Class]
@@ -30,57 +30,71 @@ class ConfigManager:
# @POST: self.config is an instance of AppConfig # @POST: self.config is an instance of AppConfig
# @PARAM: config_path (str) - Path to the configuration file. # @PARAM: config_path (str) - Path to the configuration file.
def __init__(self, config_path: str = "config.json"): def __init__(self, config_path: str = "config.json"):
# 1. Runtime check of @PRE with belief_scope("__init__"):
assert isinstance(config_path, str) and config_path, "config_path must be a non-empty string" # 1. Runtime check of @PRE
assert isinstance(config_path, str) and config_path, "config_path must be a non-empty string"
logger.info(f"[ConfigManager][Entry] Initializing with {config_path}")
logger.info(f"[ConfigManager][Entry] Initializing with {config_path}")
# 2. Logic implementation
self.config_path = Path(config_path) # 2. Logic implementation
self.config: AppConfig = self._load_config() self.config_path = Path(config_path)
self.config: AppConfig = self._load_config()
# 3. Runtime check of @POST
assert isinstance(self.config, AppConfig), "self.config must be an instance of AppConfig" # Configure logger with loaded settings
configure_logger(self.config.settings.logging)
logger.info(f"[ConfigManager][Exit] Initialized")
# [/DEF:__init__] # 3. Runtime check of @POST
assert isinstance(self.config, AppConfig), "self.config must be an instance of AppConfig"
logger.info(f"[ConfigManager][Exit] Initialized")
# [/DEF:__init__:Function]
# [DEF:_load_config:Function] # [DEF:_load_config:Function]
# @PURPOSE: Loads the configuration from disk or creates a default one. # @PURPOSE: Loads the configuration from disk or creates a default one.
# @PRE: self.config_path is set.
# @POST: isinstance(return, AppConfig) # @POST: isinstance(return, AppConfig)
# @RETURN: AppConfig - The loaded or default configuration. # @RETURN: AppConfig - The loaded or default configuration.
def _load_config(self) -> AppConfig: def _load_config(self) -> AppConfig:
logger.debug(f"[_load_config][Entry] Loading from {self.config_path}") with belief_scope("_load_config"):
logger.debug(f"[_load_config][Entry] Loading from {self.config_path}")
if not self.config_path.exists(): if not self.config_path.exists():
logger.info(f"[_load_config][Action] Config file not found. Creating default.") logger.info(f"[_load_config][Action] Config file not found. Creating default.")
default_config = AppConfig( default_config = AppConfig(
environments=[], environments=[],
settings=GlobalSettings(backup_path="backups") settings=GlobalSettings()
) )
self._save_config_to_disk(default_config) self._save_config_to_disk(default_config)
return default_config return default_config
try: try:
with open(self.config_path, "r") as f: with open(self.config_path, "r") as f:
data = json.load(f) data = json.load(f)
# Check for deprecated field
if "settings" in data and "backup_path" in data["settings"]:
del data["settings"]["backup_path"]
config = AppConfig(**data) config = AppConfig(**data)
logger.info(f"[_load_config][Coherence:OK] Configuration loaded") logger.info(f"[_load_config][Coherence:OK] Configuration loaded")
return config return config
except Exception as e: except Exception as e:
logger.error(f"[_load_config][Coherence:Failed] Error loading config: {e}") logger.error(f"[_load_config][Coherence:Failed] Error loading config: {e}")
# Fallback but try to preserve existing settings if possible?
# For now, return default to be safe, but log the error prominently.
return AppConfig( return AppConfig(
environments=[], environments=[],
settings=GlobalSettings(backup_path="backups") settings=GlobalSettings(storage=StorageConfig())
) )
# [/DEF:_load_config] # [/DEF:_load_config:Function]
# [DEF:_save_config_to_disk:Function] # [DEF:_save_config_to_disk:Function]
# @PURPOSE: Saves the provided configuration object to disk. # @PURPOSE: Saves the provided configuration object to disk.
# @PRE: isinstance(config, AppConfig) # @PRE: isinstance(config, AppConfig)
# @POST: Configuration saved to disk.
# @PARAM: config (AppConfig) - The configuration to save. # @PARAM: config (AppConfig) - The configuration to save.
def _save_config_to_disk(self, config: AppConfig): def _save_config_to_disk(self, config: AppConfig):
logger.debug(f"[_save_config_to_disk][Entry] Saving to {self.config_path}") with belief_scope("_save_config_to_disk"):
logger.debug(f"[_save_config_to_disk][Entry] Saving to {self.config_path}")
# 1. Runtime check of @PRE # 1. Runtime check of @PRE
assert isinstance(config, AppConfig), "config must be an instance of AppConfig" assert isinstance(config, AppConfig), "config must be an instance of AppConfig"
@@ -92,27 +106,35 @@ class ConfigManager:
logger.info(f"[_save_config_to_disk][Action] Configuration saved") logger.info(f"[_save_config_to_disk][Action] Configuration saved")
except Exception as e: except Exception as e:
logger.error(f"[_save_config_to_disk][Coherence:Failed] Failed to save: {e}") logger.error(f"[_save_config_to_disk][Coherence:Failed] Failed to save: {e}")
# [/DEF:_save_config_to_disk] # [/DEF:_save_config_to_disk:Function]
# [DEF:save:Function] # [DEF:save:Function]
# @PURPOSE: Saves the current configuration state to disk. # @PURPOSE: Saves the current configuration state to disk.
# @PRE: self.config is set.
# @POST: self._save_config_to_disk called.
def save(self): def save(self):
self._save_config_to_disk(self.config) with belief_scope("save"):
# [/DEF:save] self._save_config_to_disk(self.config)
# [/DEF:save:Function]
# [DEF:get_config:Function] # [DEF:get_config:Function]
# @PURPOSE: Returns the current configuration. # @PURPOSE: Returns the current configuration.
# @PRE: self.config is set.
# @POST: Returns self.config.
# @RETURN: AppConfig - The current configuration. # @RETURN: AppConfig - The current configuration.
def get_config(self) -> AppConfig: def get_config(self) -> AppConfig:
return self.config with belief_scope("get_config"):
# [/DEF:get_config] return self.config
# [/DEF:get_config:Function]
# [DEF:update_global_settings:Function] # [DEF:update_global_settings:Function]
# @PURPOSE: Updates the global settings and persists the change. # @PURPOSE: Updates the global settings and persists the change.
# @PRE: isinstance(settings, GlobalSettings) # @PRE: isinstance(settings, GlobalSettings)
# @POST: self.config.settings updated and saved.
# @PARAM: settings (GlobalSettings) - The new global settings. # @PARAM: settings (GlobalSettings) - The new global settings.
def update_global_settings(self, settings: GlobalSettings): def update_global_settings(self, settings: GlobalSettings):
logger.info(f"[update_global_settings][Entry] Updating settings") with belief_scope("update_global_settings"):
logger.info(f"[update_global_settings][Entry] Updating settings")
# 1. Runtime check of @PRE # 1. Runtime check of @PRE
assert isinstance(settings, GlobalSettings), "settings must be an instance of GlobalSettings" assert isinstance(settings, GlobalSettings), "settings must be an instance of GlobalSettings"
@@ -120,16 +142,22 @@ class ConfigManager:
# 2. Logic implementation # 2. Logic implementation
self.config.settings = settings self.config.settings = settings
self.save() self.save()
# Reconfigure logger with new settings
configure_logger(settings.logging)
logger.info(f"[update_global_settings][Exit] Settings updated") logger.info(f"[update_global_settings][Exit] Settings updated")
# [/DEF:update_global_settings] # [/DEF:update_global_settings:Function]
# [DEF:validate_path:Function] # [DEF:validate_path:Function]
# @PURPOSE: Validates if a path exists and is writable. # @PURPOSE: Validates if a path exists and is writable.
# @PRE: path is a string.
# @POST: Returns (bool, str) status.
# @PARAM: path (str) - The path to validate. # @PARAM: path (str) - The path to validate.
# @RETURN: tuple (bool, str) - (is_valid, message) # @RETURN: tuple (bool, str) - (is_valid, message)
def validate_path(self, path: str) -> tuple[bool, str]: def validate_path(self, path: str) -> tuple[bool, str]:
p = os.path.abspath(path) with belief_scope("validate_path"):
p = os.path.abspath(path)
if not os.path.exists(p): if not os.path.exists(p):
try: try:
os.makedirs(p, exist_ok=True) os.makedirs(p, exist_ok=True)
@@ -140,28 +168,50 @@ class ConfigManager:
return False, "Path is not writable" return False, "Path is not writable"
return True, "Path is valid and writable" return True, "Path is valid and writable"
# [/DEF:validate_path] # [/DEF:validate_path:Function]
# [DEF:get_environments:Function] # [DEF:get_environments:Function]
# @PURPOSE: Returns the list of configured environments. # @PURPOSE: Returns the list of configured environments.
# @PRE: self.config is set.
# @POST: Returns list of environments.
# @RETURN: List[Environment] - List of environments. # @RETURN: List[Environment] - List of environments.
def get_environments(self) -> List[Environment]: def get_environments(self) -> List[Environment]:
return self.config.environments with belief_scope("get_environments"):
# [/DEF:get_environments] return self.config.environments
# [/DEF:get_environments:Function]
# [DEF:has_environments:Function] # [DEF:has_environments:Function]
# @PURPOSE: Checks if at least one environment is configured. # @PURPOSE: Checks if at least one environment is configured.
# @PRE: self.config is set.
# @POST: Returns boolean indicating if environments exist.
# @RETURN: bool - True if at least one environment exists. # @RETURN: bool - True if at least one environment exists.
def has_environments(self) -> bool: def has_environments(self) -> bool:
return len(self.config.environments) > 0 with belief_scope("has_environments"):
# [/DEF:has_environments] return len(self.config.environments) > 0
# [/DEF:has_environments:Function]
# [DEF:get_environment:Function]
# @PURPOSE: Returns a single environment by ID.
# @PRE: self.config is set and isinstance(env_id, str) and len(env_id) > 0.
# @POST: Returns Environment object if found, None otherwise.
# @PARAM: env_id (str) - The ID of the environment to retrieve.
# @RETURN: Optional[Environment] - The environment with the given ID, or None.
def get_environment(self, env_id: str) -> Optional[Environment]:
with belief_scope("get_environment"):
for env in self.config.environments:
if env.id == env_id:
return env
return None
# [/DEF:get_environment:Function]
# [DEF:add_environment:Function] # [DEF:add_environment:Function]
# @PURPOSE: Adds a new environment to the configuration. # @PURPOSE: Adds a new environment to the configuration.
# @PRE: isinstance(env, Environment) # @PRE: isinstance(env, Environment)
# @POST: Environment added or updated in self.config.environments.
# @PARAM: env (Environment) - The environment to add. # @PARAM: env (Environment) - The environment to add.
def add_environment(self, env: Environment): def add_environment(self, env: Environment):
logger.info(f"[add_environment][Entry] Adding environment {env.id}") with belief_scope("add_environment"):
logger.info(f"[add_environment][Entry] Adding environment {env.id}")
# 1. Runtime check of @PRE # 1. Runtime check of @PRE
assert isinstance(env, Environment), "env must be an instance of Environment" assert isinstance(env, Environment), "env must be an instance of Environment"
@@ -173,16 +223,18 @@ class ConfigManager:
self.save() self.save()
logger.info(f"[add_environment][Exit] Environment added") logger.info(f"[add_environment][Exit] Environment added")
# [/DEF:add_environment] # [/DEF:add_environment:Function]
# [DEF:update_environment:Function] # [DEF:update_environment:Function]
# @PURPOSE: Updates an existing environment. # @PURPOSE: Updates an existing environment.
# @PRE: isinstance(env_id, str) and len(env_id) > 0 and isinstance(updated_env, Environment) # @PRE: isinstance(env_id, str) and len(env_id) > 0 and isinstance(updated_env, Environment)
# @POST: Returns True if environment was found and updated.
# @PARAM: env_id (str) - The ID of the environment to update. # @PARAM: env_id (str) - The ID of the environment to update.
# @PARAM: updated_env (Environment) - The updated environment data. # @PARAM: updated_env (Environment) - The updated environment data.
# @RETURN: bool - True if updated, False otherwise. # @RETURN: bool - True if updated, False otherwise.
def update_environment(self, env_id: str, updated_env: Environment) -> bool: def update_environment(self, env_id: str, updated_env: Environment) -> bool:
logger.info(f"[update_environment][Entry] Updating {env_id}") with belief_scope("update_environment"):
logger.info(f"[update_environment][Entry] Updating {env_id}")
# 1. Runtime check of @PRE # 1. Runtime check of @PRE
assert env_id and isinstance(env_id, str), "env_id must be a non-empty string" assert env_id and isinstance(env_id, str), "env_id must be a non-empty string"
@@ -202,14 +254,16 @@ class ConfigManager:
logger.warning(f"[update_environment][Coherence:Failed] Environment {env_id} not found") logger.warning(f"[update_environment][Coherence:Failed] Environment {env_id} not found")
return False return False
# [/DEF:update_environment] # [/DEF:update_environment:Function]
# [DEF:delete_environment:Function] # [DEF:delete_environment:Function]
# @PURPOSE: Deletes an environment by ID. # @PURPOSE: Deletes an environment by ID.
# @PRE: isinstance(env_id, str) and len(env_id) > 0 # @PRE: isinstance(env_id, str) and len(env_id) > 0
# @POST: Environment removed from self.config.environments if it existed.
# @PARAM: env_id (str) - The ID of the environment to delete. # @PARAM: env_id (str) - The ID of the environment to delete.
def delete_environment(self, env_id: str): def delete_environment(self, env_id: str):
logger.info(f"[delete_environment][Entry] Deleting {env_id}") with belief_scope("delete_environment"):
logger.info(f"[delete_environment][Entry] Deleting {env_id}")
# 1. Runtime check of @PRE # 1. Runtime check of @PRE
assert env_id and isinstance(env_id, str), "env_id must be a non-empty string" assert env_id and isinstance(env_id, str), "env_id must be a non-empty string"
@@ -223,8 +277,8 @@ class ConfigManager:
logger.info(f"[delete_environment][Action] Deleted {env_id}") logger.info(f"[delete_environment][Action] Deleted {env_id}")
else: else:
logger.warning(f"[delete_environment][Coherence:Failed] Environment {env_id} not found") logger.warning(f"[delete_environment][Coherence:Failed] Environment {env_id} not found")
# [/DEF:delete_environment] # [/DEF:delete_environment:Function]
# [/DEF:ConfigManager] # [/DEF:ConfigManager:Class]
# [/DEF:ConfigManagerModule] # [/DEF:ConfigManagerModule:Module]
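
Review note: the new `_load_config` drops the deprecated `settings.backup_path` key before validating the file. Below is a minimal sketch of that load path using plain dictionaries instead of the project's Pydantic models. The function name `load_config_data` and the shape of the fallback dictionary are assumptions made for illustration, not code from this commit.

```python
import json
from pathlib import Path

# Hypothetical fallback mirroring "no environments, default settings".
DEFAULT_DATA = {"environments": [], "settings": {}}


def load_config_data(config_path: Path) -> dict:
    """Read a JSON config and strip the deprecated settings.backup_path key."""
    if not config_path.exists():
        return json.loads(json.dumps(DEFAULT_DATA))  # fresh copy of the default
    try:
        with open(config_path, "r") as f:
            data = json.load(f)
    except (OSError, json.JSONDecodeError):
        # Unreadable or malformed file: fall back to defaults.
        return json.loads(json.dumps(DEFAULT_DATA))
    settings = data.get("settings")
    if isinstance(settings, dict):
        # pop() with a default is a no-op when the key is already gone.
        settings.pop("backup_path", None)
    return data
```

Stripping the key before validation keeps older config files loadable even if the model later rejects unknown fields.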


@@ -7,6 +7,14 @@
from pydantic import BaseModel, Field from pydantic import BaseModel, Field
from typing import List, Optional from typing import List, Optional
from ..models.storage import StorageConfig
# [DEF:Schedule:DataClass]
# @PURPOSE: Represents a backup schedule configuration.
class Schedule(BaseModel):
enabled: bool = False
cron_expression: str = "0 0 * * *" # Default: daily at midnight
# [/DEF:Schedule:DataClass]
# [DEF:Environment:DataClass] # [DEF:Environment:DataClass]
# @PURPOSE: Represents a Superset environment configuration. # @PURPOSE: Represents a Superset environment configuration.
@@ -16,21 +24,40 @@ class Environment(BaseModel):
url: str url: str
username: str username: str
password: str # Will be masked in UI password: str # Will be masked in UI
verify_ssl: bool = True
timeout: int = 30
is_default: bool = False is_default: bool = False
# [/DEF:Environment] backup_schedule: Schedule = Field(default_factory=Schedule)
# [/DEF:Environment:DataClass]
# [DEF:LoggingConfig:DataClass]
# @PURPOSE: Defines the configuration for the application's logging system.
class LoggingConfig(BaseModel):
level: str = "INFO"
file_path: Optional[str] = "logs/app.log"
max_bytes: int = 10 * 1024 * 1024
backup_count: int = 5
enable_belief_state: bool = True
# [/DEF:LoggingConfig:DataClass]
# [DEF:GlobalSettings:DataClass] # [DEF:GlobalSettings:DataClass]
# @PURPOSE: Represents global application settings. # @PURPOSE: Represents global application settings.
class GlobalSettings(BaseModel): class GlobalSettings(BaseModel):
backup_path: str storage: StorageConfig = Field(default_factory=StorageConfig)
default_environment_id: Optional[str] = None default_environment_id: Optional[str] = None
# [/DEF:GlobalSettings] logging: LoggingConfig = Field(default_factory=LoggingConfig)
# Task retention settings
task_retention_days: int = 30
task_retention_limit: int = 100
pagination_limit: int = 10
# [/DEF:GlobalSettings:DataClass]
# [DEF:AppConfig:DataClass] # [DEF:AppConfig:DataClass]
# @PURPOSE: The root configuration model containing all application settings. # @PURPOSE: The root configuration model containing all application settings.
class AppConfig(BaseModel): class AppConfig(BaseModel):
environments: List[Environment] = [] environments: List[Environment] = []
settings: GlobalSettings settings: GlobalSettings
# [/DEF:AppConfig] # [/DEF:AppConfig:DataClass]
# [/DEF:ConfigModels] # [/DEF:ConfigModels:Module]
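
Review note: the updated models give every nested section a default via `default_factory`, which is why `GlobalSettings()` can now be built without arguments. The sketch below shows the same idea with the standard library's `dataclasses` as a stand-in for Pydantic. The classes are trimmed copies for illustration only and are not the project's actual models.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LoggingConfig:
    level: str = "INFO"
    file_path: Optional[str] = "logs/app.log"


@dataclass
class GlobalSettings:
    default_environment_id: Optional[str] = None
    # default_factory builds a new LoggingConfig per instance,
    # so instances never share one mutable default object.
    logging: LoggingConfig = field(default_factory=LoggingConfig)
    task_retention_days: int = 30


first = GlobalSettings()
second = GlobalSettings()
first.logging.level = "DEBUG"  # does not leak into `second`
```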


@@ -5,44 +5,137 @@
# @LAYER: Core # @LAYER: Core
# @RELATION: DEPENDS_ON -> sqlalchemy # @RELATION: DEPENDS_ON -> sqlalchemy
# @RELATION: USES -> backend.src.models.mapping # @RELATION: USES -> backend.src.models.mapping
# @RELATION: USES -> backend.src.core.auth.config
# #
# @INVARIANT: A single engine instance is used for the entire application. # @INVARIANT: A single engine instance is used for the entire application.
# [SECTION: IMPORTS] # [SECTION: IMPORTS]
from sqlalchemy import create_engine from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker, Session from sqlalchemy.orm import sessionmaker, Session
from backend.src.models.mapping import Base from ..models.mapping import Base
# Import models to ensure they're registered with Base
from ..models.task import TaskRecord
from ..models.connection import ConnectionConfig
from ..models.git import GitServerConfig, GitRepository, DeploymentEnvironment
from ..models.auth import User, Role, Permission, ADGroupMapping
from ..models.llm import LLMProvider, ValidationRecord
from .logger import belief_scope
from .auth.config import auth_config
import os import os
from pathlib import Path
# [/SECTION] # [/SECTION]
# [DEF:BASE_DIR:Variable]
# @PURPOSE: Base directory for the backend (where .db files should reside).
BASE_DIR = Path(__file__).resolve().parent.parent.parent
# [/DEF:BASE_DIR:Variable]
# [DEF:DATABASE_URL:Constant] # [DEF:DATABASE_URL:Constant]
DATABASE_URL = os.getenv("DATABASE_URL", "sqlite:///./mappings.db") # @PURPOSE: URL for the main mappings database.
# [/DEF:DATABASE_URL] DATABASE_URL = os.getenv("DATABASE_URL", f"sqlite:///{BASE_DIR}/mappings.db")
# [/DEF:DATABASE_URL:Constant]
# [DEF:TASKS_DATABASE_URL:Constant]
# @PURPOSE: URL for the tasks execution database.
TASKS_DATABASE_URL = os.getenv("TASKS_DATABASE_URL", f"sqlite:///{BASE_DIR}/tasks.db")
# [/DEF:TASKS_DATABASE_URL:Constant]
# [DEF:AUTH_DATABASE_URL:Constant]
# @PURPOSE: URL for the authentication database.
AUTH_DATABASE_URL = os.getenv("AUTH_DATABASE_URL", auth_config.AUTH_DATABASE_URL)
# If it's a relative sqlite path starting with ./backend/, fix it to be absolute or relative to BASE_DIR
if AUTH_DATABASE_URL.startswith("sqlite:///./backend/"):
AUTH_DATABASE_URL = AUTH_DATABASE_URL.replace("sqlite:///./backend/", f"sqlite:///{BASE_DIR}/")
elif AUTH_DATABASE_URL.startswith("sqlite:///./") and not AUTH_DATABASE_URL.startswith("sqlite:///./backend/"):
# If it's just ./ but we are in backend, it's fine, but let's make it absolute for robustness
AUTH_DATABASE_URL = AUTH_DATABASE_URL.replace("sqlite:///./", f"sqlite:///{BASE_DIR}/")
# [/DEF:AUTH_DATABASE_URL:Constant]
# [DEF:engine:Variable] # [DEF:engine:Variable]
# @PURPOSE: SQLAlchemy engine for mappings database.
engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False}) engine = create_engine(DATABASE_URL, connect_args={"check_same_thread": False})
# [/DEF:engine] # [/DEF:engine:Variable]
# [DEF:tasks_engine:Variable]
# @PURPOSE: SQLAlchemy engine for tasks database.
tasks_engine = create_engine(TASKS_DATABASE_URL, connect_args={"check_same_thread": False})
# [/DEF:tasks_engine:Variable]
# [DEF:auth_engine:Variable]
# @PURPOSE: SQLAlchemy engine for authentication database.
auth_engine = create_engine(AUTH_DATABASE_URL, connect_args={"check_same_thread": False})
# [/DEF:auth_engine:Variable]
# [DEF:SessionLocal:Class] # [DEF:SessionLocal:Class]
# @PURPOSE: A session factory for the main mappings database.
# @PRE: engine is initialized.
SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine) SessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=engine)
# [/DEF:SessionLocal] # [/DEF:SessionLocal:Class]
# [DEF:TasksSessionLocal:Class]
# @PURPOSE: A session factory for the tasks execution database.
# @PRE: tasks_engine is initialized.
TasksSessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=tasks_engine)
# [/DEF:TasksSessionLocal:Class]
# [DEF:AuthSessionLocal:Class]
# @PURPOSE: A session factory for the authentication database.
# @PRE: auth_engine is initialized.
AuthSessionLocal = sessionmaker(autocommit=False, autoflush=False, bind=auth_engine)
# [/DEF:AuthSessionLocal:Class]
# [DEF:init_db:Function] # [DEF:init_db:Function]
# @PURPOSE: Initializes the database by creating all tables. # @PURPOSE: Initializes the database by creating all tables.
# @PRE: engine, tasks_engine and auth_engine are initialized.
# @POST: Database tables created in all databases.
# @SIDE_EFFECT: Creates physical database files if they don't exist.
def init_db(): def init_db():
Base.metadata.create_all(bind=engine) with belief_scope("init_db"):
# [/DEF:init_db] Base.metadata.create_all(bind=engine)
Base.metadata.create_all(bind=tasks_engine)
Base.metadata.create_all(bind=auth_engine)
# [/DEF:init_db:Function]
# [DEF:get_db:Function] # [DEF:get_db:Function]
# @PURPOSE: Dependency for getting a database session. # @PURPOSE: Dependency for getting a database session.
# @PRE: SessionLocal is initialized.
# @POST: Session is closed after use. # @POST: Session is closed after use.
# @RETURN: Generator[Session, None, None] # @RETURN: Generator[Session, None, None]
def get_db(): def get_db():
db = SessionLocal() with belief_scope("get_db"):
try: db = SessionLocal()
yield db try:
finally: yield db
db.close() finally:
# [/DEF:get_db] db.close()
# [/DEF:get_db:Function]
# [/DEF:backend.src.core.database] # [DEF:get_tasks_db:Function]
# @PURPOSE: Dependency for getting a tasks database session.
# @PRE: TasksSessionLocal is initialized.
# @POST: Session is closed after use.
# @RETURN: Generator[Session, None, None]
def get_tasks_db():
with belief_scope("get_tasks_db"):
db = TasksSessionLocal()
try:
yield db
finally:
db.close()
# [/DEF:get_tasks_db:Function]
# [DEF:get_auth_db:Function]
# @PURPOSE: Dependency for getting an authentication database session.
# @PRE: AuthSessionLocal is initialized.
# @POST: Session is closed after use.
# @RETURN: Generator[Session, None, None]
def get_auth_db():
with belief_scope("get_auth_db"):
db = AuthSessionLocal()
try:
yield db
finally:
db.close()
# [/DEF:get_auth_db:Function]
# [/DEF:backend.src.core.database:Module]
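
Review note: the `AUTH_DATABASE_URL` block rewrites relative SQLite URLs so they resolve against `BASE_DIR` rather than the process working directory. The same rewrite can be sketched as a small pure function, which is easier to test in isolation. The name `normalize_sqlite_url` is hypothetical, and the sketch uses `PurePosixPath` so that its behavior does not depend on the host platform.

```python
from pathlib import PurePosixPath

SQLITE_PREFIX = "sqlite:///"


def normalize_sqlite_url(url: str, base_dir: PurePosixPath) -> str:
    """Anchor a relative SQLite URL at base_dir; leave other URLs untouched."""
    # Check the longer marker first so "./backend/" is not caught by "./".
    for relative in ("./backend/", "./"):
        marker = SQLITE_PREFIX + relative
        if url.startswith(marker):
            remainder = url[len(marker):]
            return f"{SQLITE_PREFIX}{base_dir}/{remainder}"
    return url
```

Ordering the markers from longest to shortest removes the need for the extra `not startswith(...)` guard used in the `elif` branch.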


@@ -4,12 +4,39 @@
# @LAYER: Core # @LAYER: Core
# @RELATION: Used by the main application and other modules to log events. The WebSocketLogHandler is used by the WebSocket endpoint in app.py. # @RELATION: Used by the main application and other modules to log events. The WebSocketLogHandler is used by the WebSocket endpoint in app.py.
import logging import logging
import threading
from datetime import datetime from datetime import datetime
from typing import Dict, Any, List, Optional from typing import Dict, Any, List, Optional
from collections import deque from collections import deque
from contextlib import contextmanager
from logging.handlers import RotatingFileHandler
from pydantic import BaseModel, Field from pydantic import BaseModel, Field
# Thread-local storage for belief state
_belief_state = threading.local()
# Global flag for belief state logging
_enable_belief_state = True
# [DEF:BeliefFormatter:Class]
# @PURPOSE: Custom logging formatter that adds belief state prefixes to log messages.
class BeliefFormatter(logging.Formatter):
# [DEF:format:Function]
# @PURPOSE: Formats the log record, adding belief state context if available.
# @PRE: record is a logging.LogRecord.
# @POST: Returns formatted string.
# @PARAM: record (logging.LogRecord) - The log record to format.
# @RETURN: str - The formatted log message.
# @SEMANTICS: logging, formatter, context
def format(self, record):
anchor_id = getattr(_belief_state, 'anchor_id', None)
if anchor_id:
record.msg = f"[{anchor_id}][Action] {record.msg}"
return super().format(record)
# [/DEF:format:Function]
# [/DEF:BeliefFormatter:Class]
# Re-using LogEntry from task_manager for consistency # Re-using LogEntry from task_manager for consistency
# [DEF:LogEntry:Class] # [DEF:LogEntry:Class]
# @SEMANTICS: log, entry, record, pydantic # @SEMANTICS: log, entry, record, pydantic
@@ -20,7 +47,88 @@ class LogEntry(BaseModel):
message: str message: str
context: Optional[Dict[str, Any]] = None context: Optional[Dict[str, Any]] = None
# [/DEF] # [/DEF:LogEntry:Class]
# [DEF:belief_scope:Function]
# @PURPOSE: Context manager for structured Belief State logging.
# @PARAM: anchor_id (str) - The identifier for the current semantic block.
# @PARAM: message (str) - Optional entry message.
# @PRE: anchor_id must be provided.
# @POST: Thread-local belief state is updated and entry/exit logs are generated.
# @SEMANTICS: logging, context, belief_state
@contextmanager
def belief_scope(anchor_id: str, message: str = ""):
# Log Entry if enabled
if _enable_belief_state:
entry_msg = f"[{anchor_id}][Entry]"
if message:
entry_msg += f" {message}"
logger.info(entry_msg)
# Set thread-local anchor_id
old_anchor = getattr(_belief_state, 'anchor_id', None)
_belief_state.anchor_id = anchor_id
try:
yield
# Log Coherence OK and Exit
logger.info(f"[{anchor_id}][Coherence:OK]")
if _enable_belief_state:
logger.info(f"[{anchor_id}][Exit]")
except Exception as e:
# Log Coherence Failed
logger.info(f"[{anchor_id}][Coherence:Failed] {str(e)}")
raise
finally:
# Restore old anchor
_belief_state.anchor_id = old_anchor
# [/DEF:belief_scope:Function]
# [DEF:configure_logger:Function]
# @PURPOSE: Configures the logger with the provided logging settings.
# @PRE: config is a valid LoggingConfig instance.
# @POST: Logger level, handlers, and belief state flag are updated.
# @PARAM: config (LoggingConfig) - The logging configuration.
# @SEMANTICS: logging, configuration, initialization
def configure_logger(config):
global _enable_belief_state
_enable_belief_state = config.enable_belief_state
# Set logger level
level = getattr(logging, config.level.upper(), logging.INFO)
logger.setLevel(level)
# Remove existing file handlers
handlers_to_remove = [h for h in logger.handlers if isinstance(h, RotatingFileHandler)]
for h in handlers_to_remove:
logger.removeHandler(h)
h.close()
# Add file handler if file_path is set
if config.file_path:
import os
from pathlib import Path
log_file = Path(config.file_path)
log_file.parent.mkdir(parents=True, exist_ok=True)
file_handler = RotatingFileHandler(
config.file_path,
maxBytes=config.max_bytes,
backupCount=config.backup_count
)
file_handler.setFormatter(BeliefFormatter(
'[%(asctime)s][%(levelname)s][%(name)s] %(message)s'
))
logger.addHandler(file_handler)
# Update existing handlers' formatters to BeliefFormatter
for handler in logger.handlers:
if not isinstance(handler, RotatingFileHandler):
handler.setFormatter(BeliefFormatter(
'[%(asctime)s][%(levelname)s][%(name)s] %(message)s'
))
# [/DEF:configure_logger:Function]
# [DEF:WebSocketLogHandler:Class] # [DEF:WebSocketLogHandler:Class]
# @SEMANTICS: logging, handler, websocket, buffer # @SEMANTICS: logging, handler, websocket, buffer
@@ -30,12 +138,25 @@ class WebSocketLogHandler(logging.Handler):
A logging handler that stores log records and can be extended to send them A logging handler that stores log records and can be extended to send them
over WebSockets. over WebSockets.
""" """
# [DEF:__init__:Function]
# @PURPOSE: Initializes the handler with a fixed-capacity buffer.
# @PRE: capacity is an integer.
# @POST: Instance initialized with empty deque.
# @PARAM: capacity (int) - Maximum number of logs to keep in memory.
# @SEMANTICS: logging, initialization, buffer
def __init__(self, capacity: int = 1000): def __init__(self, capacity: int = 1000):
super().__init__() super().__init__()
self.log_buffer: deque[LogEntry] = deque(maxlen=capacity) self.log_buffer: deque[LogEntry] = deque(maxlen=capacity)
# In a real implementation, you'd have a way to manage active WebSocket connections # In a real implementation, you'd have a way to manage active WebSocket connections
# e.g., self.active_connections: Set[WebSocket] = set() # e.g., self.active_connections: Set[WebSocket] = set()
# [/DEF:__init__:Function]
# [DEF:emit:Function]
# @PURPOSE: Captures a log record, formats it, and stores it in the buffer.
# @PRE: record is a logging.LogRecord.
# @POST: Log is added to the log_buffer.
# @PARAM: record (logging.LogRecord) - The log record to emit.
# @SEMANTICS: logging, handler, buffer
def emit(self, record: logging.LogRecord): def emit(self, record: logging.LogRecord):
try: try:
log_entry = LogEntry( log_entry = LogEntry(
@@ -56,23 +177,55 @@ class WebSocketLogHandler(logging.Handler):
# Example: for ws in self.active_connections: await ws.send_json(log_entry.dict()) # Example: for ws in self.active_connections: await ws.send_json(log_entry.dict())
except Exception: except Exception:
self.handleError(record) self.handleError(record)
# [/DEF:emit:Function]
# [DEF:get_recent_logs:Function]
# @PURPOSE: Returns a list of recent log entries from the buffer.
# @PRE: None.
# @POST: Returns list of LogEntry objects.
# @RETURN: List[LogEntry] - List of buffered log entries.
# @SEMANTICS: logging, buffer, retrieval
def get_recent_logs(self) -> List[LogEntry]: def get_recent_logs(self) -> List[LogEntry]:
""" """
Returns a list of recent log entries from the buffer. Returns a list of recent log entries from the buffer.
""" """
return list(self.log_buffer) return list(self.log_buffer)
# [/DEF:get_recent_logs:Function]
# [/DEF] # [/DEF:WebSocketLogHandler:Class]
# [DEF:Logger:Global] # [DEF:Logger:Global]
# @SEMANTICS: logger, global, instance # @SEMANTICS: logger, global, instance
# @PURPOSE: The global logger instance for the application, configured with both a console handler and the custom WebSocket handler. # @PURPOSE: The global logger instance for the application, configured with both a console handler and the custom WebSocket handler.
logger = logging.getLogger("superset_tools_app") logger = logging.getLogger("superset_tools_app")
# [DEF:believed:Function]
# @PURPOSE: A decorator that wraps a function in a belief scope.
# @PARAM: anchor_id (str) - The identifier for the semantic block.
# @PRE: anchor_id must be a string.
# @POST: Returns a decorator function.
def believed(anchor_id: str):
# [DEF:decorator:Function]
# @PURPOSE: Internal decorator for belief scope.
# @PRE: func must be a callable.
# @POST: Returns the wrapped function.
def decorator(func):
# [DEF:wrapper:Function]
# @PURPOSE: Internal wrapper that enters belief scope.
# @PRE: None.
# @POST: Executes the function within a belief scope.
def wrapper(*args, **kwargs):
with belief_scope(anchor_id):
return func(*args, **kwargs)
# [/DEF:wrapper:Function]
return wrapper
# [/DEF:decorator:Function]
return decorator
# [/DEF:believed:Function]
logger.setLevel(logging.INFO) logger.setLevel(logging.INFO)
# Create a formatter # Create a formatter
formatter = logging.Formatter( formatter = BeliefFormatter(
'[%(asctime)s][%(levelname)s][%(name)s] %(message)s' '[%(asctime)s][%(levelname)s][%(name)s] %(message)s'
) )
@@ -89,4 +242,5 @@ logger.addHandler(websocket_log_handler)
# Example usage: # Example usage:
# logger.info("Application started", extra={"context_key": "context_value"}) # logger.info("Application started", extra={"context_key": "context_value"})
# logger.error("An error occurred", exc_info=True) # logger.error("An error occurred", exc_info=True)
# [/DEF] # [/DEF:Logger:Global]
# [/DEF:LoggerModule:Module]
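
Review note: a self-contained sketch of the scope-plus-decorator pattern introduced in this file, for readers who want to try it outside the project. It differs from the diff in two labeled ways: failures are logged at `error` level rather than `info`, and the decorator uses `functools.wraps` so the wrapped function keeps its name and docstring. The names `scope`, `scoped`, and `current_anchor` are assumptions for illustration.

```python
import functools
import logging
import threading
from contextlib import contextmanager

log = logging.getLogger("belief_scope_sketch")
_state = threading.local()


def current_anchor():
    """Return the anchor of the innermost active scope, or None."""
    return getattr(_state, "anchor_id", None)


@contextmanager
def scope(anchor_id: str):
    previous = current_anchor()
    _state.anchor_id = anchor_id
    log.info("[%s][Entry]", anchor_id)
    try:
        yield
        log.info("[%s][Exit]", anchor_id)
    except Exception as exc:
        log.error("[%s][Coherence:Failed] %s", anchor_id, exc)
        raise
    finally:
        # Restore the enclosing scope so nesting works.
        _state.anchor_id = previous


def scoped(anchor_id: str):
    """Decorator form: run the whole function inside scope(anchor_id)."""
    def decorator(func):
        @functools.wraps(func)  # preserve __name__ and __doc__
        def wrapper(*args, **kwargs):
            with scope(anchor_id):
                return func(*args, **kwargs)
        return wrapper
    return decorator
```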


@@ -15,51 +15,74 @@ import shutil
import tempfile import tempfile
from pathlib import Path from pathlib import Path
from typing import Dict from typing import Dict
from .logger import logger, belief_scope
import yaml
# [/SECTION] # [/SECTION]
# [DEF:MigrationEngine:Class] # [DEF:MigrationEngine:Class]
# @PURPOSE: Engine for transforming Superset export ZIPs. # @PURPOSE: Engine for transforming Superset export ZIPs.
class MigrationEngine: class MigrationEngine:
# [DEF:MigrationEngine.transform_zip:Function] # [DEF:transform_zip:Function]
# @PURPOSE: Extracts ZIP, replaces database UUIDs in YAMLs, and re-packages. # @PURPOSE: Extracts ZIP, replaces database UUIDs in YAMLs, and re-packages.
# @PARAM: zip_path (str) - Path to the source ZIP file. # @PARAM: zip_path (str) - Path to the source ZIP file.
# @PARAM: output_path (str) - Path where the transformed ZIP will be saved. # @PARAM: output_path (str) - Path where the transformed ZIP will be saved.
# @PARAM: db_mapping (Dict[str, str]) - Mapping of source UUID to target UUID. # @PARAM: db_mapping (Dict[str, str]) - Mapping of source UUID to target UUID.
# @PARAM: strip_databases (bool) - Whether to remove the databases directory from the archive.
# @PRE: zip_path must point to a valid Superset export archive.
# @POST: Transformed archive is saved to output_path.
# @RETURN: bool - True if successful. # @RETURN: bool - True if successful.
def transform_zip(self, zip_path: str, output_path: str, db_mapping: Dict[str, str]) -> bool: def transform_zip(self, zip_path: str, output_path: str, db_mapping: Dict[str, str], strip_databases: bool = True) -> bool:
""" """
Transform a Superset export ZIP by replacing database UUIDs. Transform a Superset export ZIP by replacing database UUIDs.
""" """
with tempfile.TemporaryDirectory() as temp_dir_str: with belief_scope("MigrationEngine.transform_zip"):
temp_dir = Path(temp_dir_str) with tempfile.TemporaryDirectory() as temp_dir_str:
temp_dir = Path(temp_dir_str)
try: try:
# 1. Extract # 1. Extract
with zipfile.ZipFile(zip_path, 'r') as zf: logger.info(f"[MigrationEngine.transform_zip][Action] Extracting ZIP: {zip_path}")
zf.extractall(temp_dir) with zipfile.ZipFile(zip_path, 'r') as zf:
zf.extractall(temp_dir)
# 2. Transform YAMLs # 2. Transform YAMLs
# Datasets are usually in datasets/*.yaml # Datasets are usually in datasets/*.yaml
dataset_files = list(temp_dir.glob("**/datasets/*.yaml")) dataset_files = list(temp_dir.glob("**/datasets/**/*.yaml")) + list(temp_dir.glob("**/datasets/*.yaml"))
for ds_file in dataset_files: dataset_files = list(set(dataset_files))
self._transform_yaml(ds_file, db_mapping)
logger.info(f"[MigrationEngine.transform_zip][State] Found {len(dataset_files)} dataset files.")
for ds_file in dataset_files:
logger.info(f"[MigrationEngine.transform_zip][Action] Transforming dataset: {ds_file}")
self._transform_yaml(ds_file, db_mapping)
# 3. Re-package # 3. Re-package
with zipfile.ZipFile(output_path, 'w', zipfile.ZIP_DEFLATED) as zf: logger.info(f"[MigrationEngine.transform_zip][Action] Re-packaging ZIP to: {output_path} (strip_databases={strip_databases})")
for root, dirs, files in os.walk(temp_dir): with zipfile.ZipFile(output_path, 'w', zipfile.ZIP_DEFLATED) as zf:
for file in files: for root, dirs, files in os.walk(temp_dir):
file_path = Path(root) / file rel_root = Path(root).relative_to(temp_dir)
arcname = file_path.relative_to(temp_dir)
zf.write(file_path, arcname) if strip_databases and "databases" in rel_root.parts:
logger.info(f"[MigrationEngine.transform_zip][Action] Skipping file in databases directory: {rel_root}")
return True continue
except Exception as e:
print(f"Error transforming ZIP: {e}")
return False
# [DEF:MigrationEngine._transform_yaml:Function] for file in files:
file_path = Path(root) / file
arcname = file_path.relative_to(temp_dir)
zf.write(file_path, arcname)
return True
except Exception as e:
logger.error(f"[MigrationEngine.transform_zip][Coherence:Failed] Error transforming ZIP: {e}")
return False
# [/DEF:transform_zip:Function]
# [DEF:_transform_yaml:Function]
# @PURPOSE: Replaces database_uuid in a single YAML file. # @PURPOSE: Replaces database_uuid in a single YAML file.
# @PARAM: file_path (Path) - Path to the YAML file.
# @PARAM: db_mapping (Dict[str, str]) - UUID mapping dictionary.
# @PRE: file_path must exist and be readable.
# @POST: File is modified in-place if source UUID matches mapping.
def _transform_yaml(self, file_path: Path, db_mapping: Dict[str, str]): def _transform_yaml(self, file_path: Path, db_mapping: Dict[str, str]):
with open(file_path, 'r') as f: with open(file_path, 'r') as f:
data = yaml.safe_load(f) data = yaml.safe_load(f)
@@ -74,8 +97,8 @@ class MigrationEngine:
data['database_uuid'] = db_mapping[source_uuid] data['database_uuid'] = db_mapping[source_uuid]
with open(file_path, 'w') as f: with open(file_path, 'w') as f:
yaml.dump(data, f) yaml.dump(data, f)
# [/DEF:MigrationEngine._transform_yaml] # [/DEF:_transform_yaml:Function]
# [/DEF:MigrationEngine] # [/DEF:MigrationEngine:Class]
# [/DEF:backend.src.core.migration_engine] # [/DEF:backend.src.core.migration_engine:Module]

@@ -1,5 +1,6 @@
from abc import ABC, abstractmethod from abc import ABC, abstractmethod
from typing import Dict, Any from typing import Dict, Any, Optional
from .logger import belief_scope
from pydantic import BaseModel, Field from pydantic import BaseModel, Field
@@ -17,44 +18,114 @@ class PluginBase(ABC):
@property @property
@abstractmethod @abstractmethod
# [DEF:id:Function]
# @PURPOSE: Returns the unique identifier for the plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string ID.
# @RETURN: str - Plugin ID.
def id(self) -> str: def id(self) -> str:
"""A unique identifier for the plugin.""" """A unique identifier for the plugin."""
pass with belief_scope("id"):
pass
# [/DEF:id:Function]
@property @property
@abstractmethod @abstractmethod
# [DEF:name:Function]
# @PURPOSE: Returns the human-readable name of the plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string name.
# @RETURN: str - Plugin name.
def name(self) -> str: def name(self) -> str:
"""A human-readable name for the plugin.""" """A human-readable name for the plugin."""
pass with belief_scope("name"):
pass
# [/DEF:name:Function]
@property @property
@abstractmethod @abstractmethod
# [DEF:description:Function]
# @PURPOSE: Returns a brief description of the plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string description.
# @RETURN: str - Plugin description.
def description(self) -> str: def description(self) -> str:
"""A brief description of what the plugin does.""" """A brief description of what the plugin does."""
pass with belief_scope("description"):
pass
# [/DEF:description:Function]
@property @property
@abstractmethod @abstractmethod
# [DEF:version:Function]
# @PURPOSE: Returns the version of the plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string version.
# @RETURN: str - Plugin version.
def version(self) -> str: def version(self) -> str:
"""The version of the plugin.""" """The version of the plugin."""
pass with belief_scope("version"):
pass
# [/DEF:version:Function]
@property
# [DEF:required_permission:Function]
# @PURPOSE: Returns the required permission string to execute this plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string permission.
# @RETURN: str - Required permission (e.g., "plugin:backup:execute").
def required_permission(self) -> str:
"""The permission string required to execute this plugin."""
with belief_scope("required_permission"):
return f"plugin:{self.id}:execute"
# [/DEF:required_permission:Function]
@property
# [DEF:ui_route:Function]
# @PURPOSE: Returns the frontend route for the plugin's UI, if applicable.
# @PRE: Plugin instance exists.
# @POST: Returns string route or None.
# @RETURN: Optional[str] - Frontend route.
def ui_route(self) -> Optional[str]:
"""
The frontend route for the plugin's UI.
Returns None if the plugin does not have a dedicated UI page.
"""
with belief_scope("ui_route"):
return None
# [/DEF:ui_route:Function]
@abstractmethod @abstractmethod
# [DEF:get_schema:Function]
# @PURPOSE: Returns the JSON schema for the plugin's input parameters.
# @PRE: Plugin instance exists.
# @POST: Returns dict schema.
# @RETURN: Dict[str, Any] - JSON schema.
def get_schema(self) -> Dict[str, Any]: def get_schema(self) -> Dict[str, Any]:
""" """
Returns the JSON schema for the plugin's input parameters. Returns the JSON schema for the plugin's input parameters.
This schema will be used to generate the frontend form. This schema will be used to generate the frontend form.
""" """
pass with belief_scope("get_schema"):
pass
# [/DEF:get_schema:Function]
@abstractmethod @abstractmethod
# [DEF:execute:Function]
# @PURPOSE: Executes the plugin's core logic.
# @PARAM: params (Dict[str, Any]) - Validated input parameters.
# @PRE: params must be a dictionary.
# @POST: Plugin execution is completed.
async def execute(self, params: Dict[str, Any]): async def execute(self, params: Dict[str, Any]):
""" """
Executes the plugin's logic. Executes the plugin's logic.
The `params` argument will be validated against the schema returned by `get_schema()`. The `params` argument will be validated against the schema returned by `get_schema()`.
""" """
pass with belief_scope("execute"):
pass
# [/DEF] # [/DEF:execute:Function]
# [/DEF:PluginBase:Class]
# [DEF:PluginConfig:Class] # [DEF:PluginConfig:Class]
# @SEMANTICS: plugin, config, schema, pydantic # @SEMANTICS: plugin, config, schema, pydantic
@@ -67,5 +138,6 @@ class PluginConfig(BaseModel):
name: str = Field(..., description="Human-readable name for the plugin") name: str = Field(..., description="Human-readable name for the plugin")
description: str = Field(..., description="Brief description of what the plugin does") description: str = Field(..., description="Brief description of what the plugin does")
version: str = Field(..., description="Version of the plugin") version: str = Field(..., description="Version of the plugin")
ui_route: Optional[str] = Field(None, description="Frontend route for the plugin UI")
input_schema: Dict[str, Any] = Field(..., description="JSON schema for input parameters", alias="schema") input_schema: Dict[str, Any] = Field(..., description="JSON schema for input parameters", alias="schema")
# [/DEF] # [/DEF:PluginConfig:Class]
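
The two non-abstract properties added to `PluginBase` give every plugin a default: `required_permission` is derived from `id`, and `ui_route` is `None` unless a subclass overrides it. The sketch below shows how a concrete plugin might pick up or override those defaults. `PluginBaseSketch`, `BackupPlugin`, `DashboardToolPlugin`, and the route `/tools/dashboard` are illustrative assumptions, and the stand-in base class is cut down to the members needed here.

```python
from abc import ABC, abstractmethod
from typing import Optional


class PluginBaseSketch(ABC):
    """Reduced stand-in for PluginBase (assumption: only `id` is abstract here)."""

    @property
    @abstractmethod
    def id(self) -> str:
        """A unique identifier for the plugin."""

    @property
    def required_permission(self) -> str:
        # Default mirrors the diff: the permission string is built from the plugin ID.
        return f"plugin:{self.id}:execute"

    @property
    def ui_route(self) -> Optional[str]:
        # Default mirrors the diff: no dedicated UI page.
        return None


class BackupPlugin(PluginBaseSketch):
    """Hypothetical plugin that relies on both defaults."""

    @property
    def id(self) -> str:
        return "backup"


class DashboardToolPlugin(PluginBaseSketch):
    """Hypothetical plugin that overrides ui_route."""

    @property
    def id(self) -> str:
        return "dashboard-tool"

    @property
    def ui_route(self) -> Optional[str]:
        return "/tools/dashboard"
```

Because the defaults live on the base class, existing plugins need no changes to expose `ui_route` through `PluginConfig`.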

@@ -4,6 +4,7 @@ import sys # Added this line
from typing import Dict, Type, List, Optional from typing import Dict, Type, List, Optional
from .plugin_base import PluginBase, PluginConfig from .plugin_base import PluginBase, PluginConfig
from jsonschema import validate from jsonschema import validate
from .logger import belief_scope
# [DEF:PluginLoader:Class] # [DEF:PluginLoader:Class]
# @SEMANTICS: plugin, loader, dynamic, import # @SEMANTICS: plugin, loader, dynamic, import
@@ -16,16 +17,28 @@ class PluginLoader:
that inherit from PluginBase. that inherit from PluginBase.
""" """
# [DEF:__init__:Function]
# @PURPOSE: Initializes the PluginLoader with a directory to scan.
# @PRE: plugin_dir is a valid directory path.
# @POST: Plugins are loaded and registered.
# @PARAM: plugin_dir (str) - The directory containing plugin modules.
def __init__(self, plugin_dir: str): def __init__(self, plugin_dir: str):
self.plugin_dir = plugin_dir with belief_scope("__init__"):
self._plugins: Dict[str, PluginBase] = {} self.plugin_dir = plugin_dir
self._plugin_configs: Dict[str, PluginConfig] = {} self._plugins: Dict[str, PluginBase] = {}
self._load_plugins() self._plugin_configs: Dict[str, PluginConfig] = {}
self._load_plugins()
# [/DEF:__init__:Function]
# [DEF:_load_plugins:Function]
# @PURPOSE: Scans the plugin directory and loads all valid plugins.
# @PRE: plugin_dir exists or can be created.
# @POST: _load_module is called for each .py file.
def _load_plugins(self): def _load_plugins(self):
""" with belief_scope("_load_plugins"):
Scans the plugin directory, imports modules, and registers valid plugins. """
""" Scans the plugin directory, imports modules, and registers valid plugins.
"""
if not os.path.exists(self.plugin_dir): if not os.path.exists(self.plugin_dir):
os.makedirs(self.plugin_dir) os.makedirs(self.plugin_dir)
@@ -37,22 +50,44 @@ class PluginLoader:
sys.path.insert(0, plugin_parent_dir) sys.path.insert(0, plugin_parent_dir)
for filename in os.listdir(self.plugin_dir): for filename in os.listdir(self.plugin_dir):
file_path = os.path.join(self.plugin_dir, filename)
# Handle directory-based plugins (packages)
if os.path.isdir(file_path):
init_file = os.path.join(file_path, "__init__.py")
if os.path.exists(init_file):
self._load_module(filename, init_file)
continue
# Handle single-file plugins
if filename.endswith(".py") and filename != "__init__.py": if filename.endswith(".py") and filename != "__init__.py":
module_name = filename[:-3] module_name = filename[:-3]
file_path = os.path.join(self.plugin_dir, filename)
self._load_module(module_name, file_path) self._load_module(module_name, file_path)
# [/DEF:_load_plugins:Function]
# [DEF:_load_module:Function]
# @PURPOSE: Loads a single Python module and discovers PluginBase implementations.
# @PRE: module_name and file_path are valid.
# @POST: Plugin classes are instantiated and registered.
# @PARAM: module_name (str) - The name of the module.
# @PARAM: file_path (str) - The path to the module file.
def _load_module(self, module_name: str, file_path: str): def _load_module(self, module_name: str, file_path: str):
""" with belief_scope("_load_module"):
Loads a single Python module and extracts PluginBase subclasses. """
""" Loads a single Python module and extracts PluginBase subclasses.
"""
# Try to determine the correct package prefix based on how the app is running # Try to determine the correct package prefix based on how the app is running
if "backend.src" in __name__: # For standalone execution, we need to handle the import differently
if __name__ == "__main__" or "test" in __name__:
# When running as standalone or in tests, use relative import
package_name = f"plugins.{module_name}"
elif "backend.src" in __name__:
package_prefix = "backend.src.plugins" package_prefix = "backend.src.plugins"
package_name = f"{package_prefix}.{module_name}"
else: else:
package_prefix = "src.plugins" package_prefix = "src.plugins"
package_name = f"{package_prefix}.{module_name}"
package_name = f"{package_prefix}.{module_name}"
# print(f"DEBUG: Loading plugin {module_name} as {package_name}") # print(f"DEBUG: Loading plugin {module_name} as {package_name}")
spec = importlib.util.spec_from_file_location(package_name, file_path) spec = importlib.util.spec_from_file_location(package_name, file_path)
if spec is None or spec.loader is None: if spec is None or spec.loader is None:
@@ -78,11 +113,18 @@ class PluginLoader:
self._register_plugin(plugin_instance) self._register_plugin(plugin_instance)
except Exception as e: except Exception as e:
print(f"Error instantiating plugin {attribute_name} in {module_name}: {e}") # Replace with proper logging print(f"Error instantiating plugin {attribute_name} in {module_name}: {e}") # Replace with proper logging
# [/DEF:_load_module:Function]
# [DEF:_register_plugin:Function]
# @PURPOSE: Registers a PluginBase instance and its configuration.
# @PRE: plugin_instance is a valid implementation of PluginBase.
# @POST: Plugin is added to _plugins and _plugin_configs.
# @PARAM: plugin_instance (PluginBase) - The plugin instance to register.
def _register_plugin(self, plugin_instance: PluginBase): def _register_plugin(self, plugin_instance: PluginBase):
""" with belief_scope("_register_plugin"):
Registers a valid plugin instance. """
""" Registers a valid plugin instance.
"""
plugin_id = plugin_instance.id plugin_id = plugin_instance.id
if plugin_id in self._plugins: if plugin_id in self._plugins:
print(f"Warning: Duplicate plugin ID '{plugin_id}' found. Skipping.") # Replace with proper logging print(f"Warning: Duplicate plugin ID '{plugin_id}' found. Skipping.") # Replace with proper logging
@@ -99,6 +141,7 @@ class PluginLoader:
name=plugin_instance.name, name=plugin_instance.name,
description=plugin_instance.description, description=plugin_instance.description,
version=plugin_instance.version, version=plugin_instance.version,
ui_route=plugin_instance.ui_route,
schema=schema, schema=schema,
) )
# The following line is commented out because it requires a schema to be passed to validate against. # The following line is commented out because it requires a schema to be passed to validate against.
@@ -106,25 +149,53 @@ class PluginLoader:
# validate(instance={}, schema=schema) # validate(instance={}, schema=schema)
self._plugins[plugin_id] = plugin_instance self._plugins[plugin_id] = plugin_instance
self._plugin_configs[plugin_id] = plugin_config self._plugin_configs[plugin_id] = plugin_config
print(f"Plugin '{plugin_instance.name}' (ID: {plugin_id}) loaded successfully.") # Replace with proper logging from ..core.logger import logger
logger.info(f"Plugin '{plugin_instance.name}' (ID: {plugin_id}) loaded successfully.")
except Exception as e: except Exception as e:
print(f"Error validating plugin '{plugin_instance.name}' (ID: {plugin_id}): {e}") # Replace with proper logging from ..core.logger import logger
logger.error(f"Error validating plugin '{plugin_instance.name}' (ID: {plugin_id}): {e}")
# [/DEF:_register_plugin:Function]
# [DEF:get_plugin:Function]
# @PURPOSE: Retrieves a loaded plugin instance by its ID.
# @PRE: plugin_id is a string.
# @POST: Returns plugin instance or None.
# @PARAM: plugin_id (str) - The unique identifier of the plugin.
# @RETURN: Optional[PluginBase] - The plugin instance if found, otherwise None.
def get_plugin(self, plugin_id: str) -> Optional[PluginBase]: def get_plugin(self, plugin_id: str) -> Optional[PluginBase]:
""" with belief_scope("get_plugin"):
Returns a loaded plugin instance by its ID. """
""" Returns a loaded plugin instance by its ID.
"""
return self._plugins.get(plugin_id) return self._plugins.get(plugin_id)
# [/DEF:get_plugin:Function]
# [DEF:get_all_plugin_configs:Function]
# @PURPOSE: Returns a list of all registered plugin configurations.
# @PRE: None.
# @POST: Returns list of all PluginConfig objects.
# @RETURN: List[PluginConfig] - A list of plugin configurations.
def get_all_plugin_configs(self) -> List[PluginConfig]: def get_all_plugin_configs(self) -> List[PluginConfig]:
""" with belief_scope("get_all_plugin_configs"):
Returns a list of all loaded plugin configurations. """
""" Returns a list of all loaded plugin configurations.
"""
return list(self._plugin_configs.values()) return list(self._plugin_configs.values())
# [/DEF:get_all_plugin_configs:Function]
# [DEF:has_plugin:Function]
# @PURPOSE: Checks if a plugin with the given ID is registered.
# @PRE: plugin_id is a string.
# @POST: Returns True if plugin exists.
# @PARAM: plugin_id (str) - The unique identifier of the plugin.
# @RETURN: bool - True if the plugin is registered, False otherwise.
def has_plugin(self, plugin_id: str) -> bool: def has_plugin(self, plugin_id: str) -> bool:
""" with belief_scope("has_plugin"):
Checks if a plugin with the given ID is loaded. """
""" Checks if a plugin with the given ID is loaded.
return plugin_id in self._plugins """
return plugin_id in self._plugins
# [/DEF:has_plugin:Function]
# [/DEF:PluginLoader:Class]
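
With this change `_load_plugins` accepts two layouts inside the plugin directory: a package (a sub-directory containing `__init__.py`) and a single `.py` file. The discovery rule can be sketched on its own, without importing anything. `discover_plugin_modules` and `demo_layout` are hypothetical names for this sketch, and the sorted traversal is an addition for deterministic output; the loader in the diff iterates `os.listdir` directly.

```python
import os
import tempfile
from typing import List, Tuple


def discover_plugin_modules(plugin_dir: str) -> List[Tuple[str, str]]:
    """Return (module_name, file_to_load) pairs using the rule from the diff."""
    found: List[Tuple[str, str]] = []
    for filename in sorted(os.listdir(plugin_dir)):
        file_path = os.path.join(plugin_dir, filename)

        # Directory-based plugins: only directories that are Python packages count.
        if os.path.isdir(file_path):
            init_file = os.path.join(file_path, "__init__.py")
            if os.path.exists(init_file):
                found.append((filename, init_file))
            continue

        # Single-file plugins: any .py file except the directory's own __init__.py.
        if filename.endswith(".py") and filename != "__init__.py":
            found.append((filename[:-3], file_path))
    return found


def demo_layout() -> List[str]:
    """Build a throwaway plugin directory and list the module names found in it."""
    with tempfile.TemporaryDirectory() as plugin_dir:
        for rel in ("__init__.py", "backup.py", "notes.txt",
                    os.path.join("pkg", "__init__.py"),
                    os.path.join("not_a_pkg", "readme.txt")):
            path = os.path.join(plugin_dir, rel)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as fh:
                fh.write("")
        return [name for name, _ in discover_plugin_modules(plugin_dir)]
```

Directories without `__init__.py` and non-Python files are ignored, so stray data folders in the plugin directory do not produce import errors.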

@@ -0,0 +1,119 @@
# [DEF:SchedulerModule:Module]
# @SEMANTICS: scheduler, apscheduler, cron, backup
# @PURPOSE: Manages scheduled tasks using APScheduler.
# @LAYER: Core
# @RELATION: Uses TaskManager to run scheduled backups.
# [SECTION: IMPORTS]
from apscheduler.schedulers.background import BackgroundScheduler
from apscheduler.triggers.cron import CronTrigger
from .logger import logger, belief_scope
from .config_manager import ConfigManager
from typing import Optional
import asyncio
# [/SECTION]
# [DEF:SchedulerService:Class]
# @SEMANTICS: scheduler, service, apscheduler
# @PURPOSE: Provides a service to manage scheduled backup tasks.
class SchedulerService:
# [DEF:__init__:Function]
# @PURPOSE: Initializes the scheduler service with task and config managers.
# @PRE: task_manager and config_manager must be provided.
# @POST: Scheduler instance is created but not started.
def __init__(self, task_manager, config_manager: ConfigManager):
with belief_scope("SchedulerService.__init__"):
self.task_manager = task_manager
self.config_manager = config_manager
self.scheduler = BackgroundScheduler()
self.loop = asyncio.get_event_loop()
# [/DEF:__init__:Function]
# [DEF:start:Function]
# @PURPOSE: Starts the background scheduler and loads initial schedules.
# @PRE: Scheduler should be initialized.
# @POST: Scheduler is running and schedules are loaded.
def start(self):
with belief_scope("SchedulerService.start"):
if not self.scheduler.running:
self.scheduler.start()
logger.info("Scheduler started.")
self.load_schedules()
# [/DEF:start:Function]
# [DEF:stop:Function]
# @PURPOSE: Stops the background scheduler.
# @PRE: Scheduler should be running.
# @POST: Scheduler is shut down.
def stop(self):
with belief_scope("SchedulerService.stop"):
if self.scheduler.running:
self.scheduler.shutdown()
logger.info("Scheduler stopped.")
# [/DEF:stop:Function]
# [DEF:load_schedules:Function]
# @PURPOSE: Loads backup schedules from configuration and registers them.
# @PRE: config_manager must have valid configuration.
# @POST: All enabled backup jobs are added to the scheduler.
def load_schedules(self):
with belief_scope("SchedulerService.load_schedules"):
# Clear existing jobs
self.scheduler.remove_all_jobs()
config = self.config_manager.get_config()
for env in config.environments:
if env.backup_schedule and env.backup_schedule.enabled:
self.add_backup_job(env.id, env.backup_schedule.cron_expression)
# [/DEF:load_schedules:Function]
# [DEF:add_backup_job:Function]
# @PURPOSE: Adds a scheduled backup job for an environment.
# @PRE: env_id and cron_expression must be valid strings.
# @POST: A new job is added to the scheduler or replaced if it already exists.
# @PARAM: env_id (str) - The ID of the environment.
# @PARAM: cron_expression (str) - The cron expression for the schedule.
def add_backup_job(self, env_id: str, cron_expression: str):
with belief_scope("SchedulerService.add_backup_job", f"env_id={env_id}, cron={cron_expression}"):
job_id = f"backup_{env_id}"
try:
self.scheduler.add_job(
self._trigger_backup,
CronTrigger.from_crontab(cron_expression),
id=job_id,
args=[env_id],
replace_existing=True
)
logger.info(f"Scheduled backup job added for environment {env_id}: {cron_expression}")
except Exception as e:
logger.error(f"Failed to add backup job for environment {env_id}: {e}")
# [/DEF:add_backup_job:Function]
# [DEF:_trigger_backup:Function]
# @PURPOSE: Triggered by the scheduler to start a backup task.
# @PRE: env_id must be a valid environment ID.
# @POST: A new backup task is created in the task manager if not already running.
# @PARAM: env_id (str) - The ID of the environment.
def _trigger_backup(self, env_id: str):
with belief_scope("SchedulerService._trigger_backup", f"env_id={env_id}"):
logger.info(f"Triggering scheduled backup for environment {env_id}")
# Check if a backup is already running for this environment
active_tasks = self.task_manager.get_tasks(limit=100)
for task in active_tasks:
if (task.plugin_id == "superset-backup" and
task.status in ["PENDING", "RUNNING"] and
task.params.get("environment_id") == env_id):
logger.warning(f"Backup already running for environment {env_id}. Skipping scheduled run.")
return
# Run the backup task
# We need to run this in the event loop since create_task is async
asyncio.run_coroutine_threadsafe(
self.task_manager.create_task("superset-backup", {"environment_id": env_id}),
self.loop
)
# [/DEF:_trigger_backup:Function]
# [/DEF:SchedulerService:Class]
# [/DEF:SchedulerModule:Module]
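
`BackgroundScheduler` runs jobs on worker threads, while `create_task` is a coroutine, so `_trigger_backup` hands the coroutine to the event loop with `asyncio.run_coroutine_threadsafe`. The stdlib-only sketch below shows that hand-off in isolation. The `create_task` coroutine here is a stand-in and `run_demo` is a hypothetical harness; the real `TaskManager` API is not shown in this diff. Unlike the diff, the sketch waits on the returned future so the result can be inspected.

```python
import asyncio
import threading
from typing import Any, Dict


async def create_task(plugin_id: str, params: Dict[str, Any]) -> Dict[str, Any]:
    """Stand-in for an async task-manager call (assumption, not the real API)."""
    await asyncio.sleep(0)
    return {"plugin_id": plugin_id, "params": params}


def trigger_backup(loop: asyncio.AbstractEventLoop, env_id: str) -> Dict[str, Any]:
    """Called from a non-loop thread, as a scheduler job would be."""
    future = asyncio.run_coroutine_threadsafe(
        create_task("superset-backup", {"environment_id": env_id}),
        loop,
    )
    # Blocking on the result is safe here because this is not the loop's thread.
    return future.result(timeout=5)


def run_demo(env_id: str) -> Dict[str, Any]:
    """Run an event loop in one thread and trigger the coroutine from another."""
    loop = asyncio.new_event_loop()
    loop_thread = threading.Thread(target=loop.run_forever, daemon=True)
    loop_thread.start()
    holder: Dict[str, Any] = {}
    try:
        worker = threading.Thread(
            target=lambda: holder.update(result=trigger_backup(loop, env_id))
        )
        worker.start()
        worker.join(timeout=10)
        return holder["result"]
    finally:
        loop.call_soon_threadsafe(loop.stop)
        loop_thread.join(timeout=5)
        loop.close()
```

One design point worth checking in the real service: the loop captured in `__init__` must be the loop that is actually running when jobs fire, otherwise the submitted coroutine is never executed.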

@@ -1,57 +1,450 @@
# [DEF:backend.src.core.superset_client:Module] # [DEF:backend.src.core.superset_client:Module]
# #
# @SEMANTICS: superset, api, client, database, metadata # @SEMANTICS: superset, api, client, rest, http, dashboard, dataset, import, export
# @PURPOSE: Extends the base SupersetClient with database-specific metadata fetching. # @PURPOSE: Provides a high-level client for interacting with the Superset REST API, encapsulating request logic, error handling, and pagination.
# @LAYER: Core # @LAYER: Core
# @RELATION: INHERITS_FROM -> superset_tool.client.SupersetClient # @RELATION: USES -> backend.src.core.utils.network.APIClient
# @RELATION: USES -> backend.src.core.config_models.Environment
# #
# @INVARIANT: All database metadata requests must include UUID and name. # @INVARIANT: All network operations must use the internal APIClient instance.
# @PUBLIC_API: SupersetClient
# [SECTION: IMPORTS] # [SECTION: IMPORTS]
from typing import List, Dict, Optional, Tuple import json
from superset_tool.client import SupersetClient as BaseSupersetClient import zipfile
from superset_tool.models import SupersetConfig from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple, Union, cast
from requests import Response
from .logger import logger as app_logger, belief_scope
from .utils.network import APIClient, SupersetAPIError, AuthenticationError, DashboardNotFoundError, NetworkError
from .utils.fileio import get_filename_from_headers
from .config_models import Environment
# [/SECTION] # [/SECTION]
# [DEF:SupersetClient:Class] # [DEF:SupersetClient:Class]
# @PURPOSE: Extended SupersetClient for migration-specific operations. # @PURPOSE: Wrapper class over the Superset REST API that provides methods for working with dashboards and datasets.
class SupersetClient(BaseSupersetClient): class SupersetClient:
# [DEF:__init__:Function]
# [DEF:SupersetClient.get_databases_summary:Function] # @PURPOSE: Initializes the client, validates the configuration, and creates the network client.
# @PURPOSE: Fetch a summary of databases including uuid, name, and engine. # @PRE: `env` must be a valid Environment object.
# @POST: Returns a list of database dictionaries with 'engine' field. # @POST: The `env` and `network` attributes are created and ready for use.
# @RETURN: List[Dict] - Summary of databases. # @PARAM: env (Environment) - Environment configuration.
def get_databases_summary(self) -> List[Dict]: def __init__(self, env: Environment):
""" with belief_scope("__init__"):
Fetch a summary of databases including uuid, name, and engine. app_logger.info("[SupersetClient.__init__][Enter] Initializing SupersetClient for env %s.", env.name)
""" self.env = env
query = { # Construct auth payload expected by Superset API
"columns": ["uuid", "database_name", "backend"] auth_payload = {
"username": env.username,
"password": env.password,
"provider": "db",
"refresh": "true"
} }
_, databases = self.get_databases(query=query) self.network = APIClient(
config={
# Map 'backend' to 'engine' for consistency with contracts "base_url": env.url,
for db in databases: "auth": auth_payload
db['engine'] = db.pop('backend', None) },
verify_ssl=env.verify_ssl,
timeout=env.timeout
)
self.delete_before_reimport: bool = False
app_logger.info("[SupersetClient.__init__][Exit] SupersetClient initialized.")
# [/DEF:__init__:Function]
# [DEF:authenticate:Function]
# @PURPOSE: Authenticates the client using the configured credentials.
# @PRE: self.network must be initialized with valid auth configuration.
# @POST: Client is authenticated and tokens are stored.
# @RETURN: Dict[str, str] - Authentication tokens.
def authenticate(self) -> Dict[str, str]:
with belief_scope("SupersetClient.authenticate"):
return self.network.authenticate()
# [/DEF:authenticate:Function]
@property
# [DEF:headers:Function]
# @PURPOSE: Returns the base HTTP headers used by the network client.
# @PRE: APIClient is initialized and authenticated.
# @POST: Returns a dictionary of HTTP headers.
def headers(self) -> dict:
with belief_scope("headers"):
return self.network.headers
# [/DEF:headers:Function]
# [SECTION: DASHBOARD OPERATIONS]
# [DEF:get_dashboards:Function]
# @PURPOSE: Fetches the full list of dashboards, handling pagination automatically.
# @PARAM: query (Optional[Dict]) - Additional query parameters for the API.
# @PRE: Client is authenticated.
# @POST: Returns a tuple with total count and list of dashboards.
# @RETURN: Tuple[int, List[Dict]] - Tuple of (total count, list of dashboards).
def get_dashboards(self, query: Optional[Dict] = None) -> Tuple[int, List[Dict]]:
with belief_scope("get_dashboards"):
app_logger.info("[get_dashboards][Enter] Fetching dashboards.")
validated_query = self._validate_query_params(query or {})
if 'columns' not in validated_query:
validated_query['columns'] = ["slug", "id", "changed_on_utc", "dashboard_title", "published"]
return databases total_count = self._fetch_total_object_count(endpoint="/dashboard/")
# [/DEF:SupersetClient.get_databases_summary] paginated_data = self._fetch_all_pages(
endpoint="/dashboard/",
pagination_options={"base_query": validated_query, "total_count": total_count, "results_field": "result"},
)
app_logger.info("[get_dashboards][Exit] Found %d dashboards.", total_count)
return total_count, paginated_data
# [/DEF:get_dashboards:Function]
# [DEF:SupersetClient.get_database_by_uuid:Function] # [DEF:get_dashboards_summary:Function]
# @PURPOSE: Fetches dashboard metadata optimized for the grid.
# @PRE: Client is authenticated.
# @POST: Returns a list of dashboard metadata summaries.
# @RETURN: List[Dict]
def get_dashboards_summary(self) -> List[Dict]:
with belief_scope("SupersetClient.get_dashboards_summary"):
query = {
"columns": ["id", "dashboard_title", "changed_on_utc", "published"]
}
_, dashboards = self.get_dashboards(query=query)
# Map fields to DashboardMetadata schema
result = []
for dash in dashboards:
result.append({
"id": dash.get("id"),
"title": dash.get("dashboard_title"),
"last_modified": dash.get("changed_on_utc"),
"status": "published" if dash.get("published") else "draft"
})
return result
# [/DEF:get_dashboards_summary:Function]
# [DEF:export_dashboard:Function]
# @PURPOSE: Exports a dashboard as a ZIP archive.
# @PARAM: dashboard_id (int) - ID of the dashboard to export.
# @PRE: dashboard_id must exist in Superset.
# @POST: Returns ZIP content and filename.
# @RETURN: Tuple[bytes, str] - Binary content of the ZIP archive and the file name.
def export_dashboard(self, dashboard_id: int) -> Tuple[bytes, str]:
with belief_scope("export_dashboard"):
app_logger.info("[export_dashboard][Enter] Exporting dashboard %s.", dashboard_id)
response = self.network.request(
method="GET",
endpoint="/dashboard/export/",
params={"q": json.dumps([dashboard_id])},
stream=True,
raw_response=True,
)
response = cast(Response, response)
self._validate_export_response(response, dashboard_id)
filename = self._resolve_export_filename(response, dashboard_id)
app_logger.info("[export_dashboard][Exit] Exported dashboard %s to %s.", dashboard_id, filename)
return response.content, filename
# [/DEF:export_dashboard:Function]
# [DEF:import_dashboard:Function]
# @PURPOSE: Imports a dashboard from a ZIP file.
# @PARAM: file_name (Union[str, Path]) - Path to the ZIP archive.
# @PARAM: dash_id (Optional[int]) - ID of the dashboard to delete if the import fails.
# @PARAM: dash_slug (Optional[str]) - Dashboard slug used to look up the ID.
# @PRE: file_name must be a valid ZIP dashboard export.
# @POST: Dashboard is imported or re-imported after deletion.
# @RETURN: Dict - API response on success.
def import_dashboard(self, file_name: Union[str, Path], dash_id: Optional[int] = None, dash_slug: Optional[str] = None) -> Dict:
with belief_scope("import_dashboard"):
file_path = str(file_name)
self._validate_import_file(file_path)
try:
return self._do_import(file_path)
except Exception as exc:
app_logger.error("[import_dashboard][Failure] First import attempt failed: %s", exc, exc_info=True)
if not self.delete_before_reimport:
raise
target_id = self._resolve_target_id_for_delete(dash_id, dash_slug)
if target_id is None:
app_logger.error("[import_dashboard][Failure] No ID available for delete-retry.")
raise
self.delete_dashboard(target_id)
app_logger.info("[import_dashboard][State] Deleted dashboard ID %s, retrying import.", target_id)
return self._do_import(file_path)
# [/DEF:import_dashboard:Function]
# [DEF:delete_dashboard:Function]
# @PURPOSE: Deletes a dashboard by its ID or slug.
# @PARAM: dashboard_id (Union[int, str]) - ID or slug of the dashboard.
# @PRE: dashboard_id must exist.
# @POST: Dashboard is removed from Superset.
def delete_dashboard(self, dashboard_id: Union[int, str]) -> None:
with belief_scope("delete_dashboard"):
app_logger.info("[delete_dashboard][Enter] Deleting dashboard %s.", dashboard_id)
response = self.network.request(method="DELETE", endpoint=f"/dashboard/{dashboard_id}")
response = cast(Dict, response)
if response.get("result", True) is not False:
app_logger.info("[delete_dashboard][Success] Dashboard %s deleted.", dashboard_id)
else:
app_logger.warning("[delete_dashboard][Warning] Unexpected response while deleting %s: %s", dashboard_id, response)
# [/DEF:delete_dashboard:Function]
# [/SECTION]
# [SECTION: DATASET OPERATIONS]
# [DEF:get_datasets:Function]
# @PURPOSE: Fetches the full list of datasets, handling pagination automatically.
# @PARAM: query (Optional[Dict]) - Additional query parameters.
# @PRE: Client is authenticated.
# @POST: Returns total count and list of datasets.
# @RETURN: Tuple[int, List[Dict]] - Tuple of (total count, list of datasets).
def get_datasets(self, query: Optional[Dict] = None) -> Tuple[int, List[Dict]]:
with belief_scope("get_datasets"):
app_logger.info("[get_datasets][Enter] Fetching datasets.")
validated_query = self._validate_query_params(query or {})
total_count = self._fetch_total_object_count(endpoint="/dataset/")
paginated_data = self._fetch_all_pages(
endpoint="/dataset/",
pagination_options={"base_query": validated_query, "total_count": total_count, "results_field": "result"},
)
app_logger.info("[get_datasets][Exit] Found %d datasets.", total_count)
return total_count, paginated_data
# [/DEF:get_datasets:Function]
# [DEF:get_dataset:Function]
# @PURPOSE: Fetches information about a specific dataset by its ID.
# @PARAM: dataset_id (int) - ID of the dataset.
# @PRE: dataset_id must exist.
# @POST: Returns dataset details.
# @RETURN: Dict - Dataset information.
def get_dataset(self, dataset_id: int) -> Dict:
with belief_scope("SupersetClient.get_dataset", f"id={dataset_id}"):
app_logger.info("[get_dataset][Enter] Fetching dataset %s.", dataset_id)
response = self.network.request(method="GET", endpoint=f"/dataset/{dataset_id}")
response = cast(Dict, response)
app_logger.info("[get_dataset][Exit] Got dataset %s.", dataset_id)
return response
# [/DEF:get_dataset:Function]
# [DEF:update_dataset:Function]
# @PURPOSE: Updates a dataset by its ID.
# @PARAM: dataset_id (int) - Dataset ID.
# @PARAM: data (Dict) - Data to update.
# @PRE: dataset_id must exist.
# @POST: Dataset is updated in Superset.
# @RETURN: Dict - API response.
def update_dataset(self, dataset_id: int, data: Dict) -> Dict:
with belief_scope("SupersetClient.update_dataset", f"id={dataset_id}"):
app_logger.info("[update_dataset][Enter] Updating dataset %s.", dataset_id)
response = self.network.request(
method="PUT",
endpoint=f"/dataset/{dataset_id}",
data=json.dumps(data),
headers={'Content-Type': 'application/json'}
)
response = cast(Dict, response)
app_logger.info("[update_dataset][Exit] Updated dataset %s.", dataset_id)
return response
# [/DEF:update_dataset:Function]
# [/SECTION]
# [SECTION: DATABASE OPERATIONS]
# [DEF:get_databases:Function]
# @PURPOSE: Fetches the full list of databases.
# @PARAM: query (Optional[Dict]) - Additional query parameters.
# @PRE: Client is authenticated.
# @POST: Returns total count and list of databases.
# @RETURN: Tuple[int, List[Dict]] - Tuple of (total count, list of databases).
def get_databases(self, query: Optional[Dict] = None) -> Tuple[int, List[Dict]]:
with belief_scope("get_databases"):
app_logger.info("[get_databases][Enter] Fetching databases.")
validated_query = self._validate_query_params(query or {})
if 'columns' not in validated_query:
validated_query['columns'] = []
total_count = self._fetch_total_object_count(endpoint="/database/")
paginated_data = self._fetch_all_pages(
endpoint="/database/",
pagination_options={"base_query": validated_query, "total_count": total_count, "results_field": "result"},
)
app_logger.info("[get_databases][Exit] Found %d databases.", total_count)
return total_count, paginated_data
# [/DEF:get_databases:Function]
# [DEF:get_database:Function]
# @PURPOSE: Fetches information about a specific database by its ID.
# @PARAM: database_id (int) - Database ID.
# @PRE: database_id must exist.
# @POST: Returns database details.
# @RETURN: Dict - Database information.
def get_database(self, database_id: int) -> Dict:
with belief_scope("get_database"):
app_logger.info("[get_database][Enter] Fetching database %s.", database_id)
response = self.network.request(method="GET", endpoint=f"/database/{database_id}")
response = cast(Dict, response)
app_logger.info("[get_database][Exit] Got database %s.", database_id)
return response
# [/DEF:get_database:Function]
# [DEF:get_databases_summary:Function]
# @PURPOSE: Fetch a summary of databases including uuid, name, and engine.
# @PRE: Client is authenticated.
# @POST: Returns list of database summaries.
# @RETURN: List[Dict] - Summary of databases.
def get_databases_summary(self) -> List[Dict]:
with belief_scope("SupersetClient.get_databases_summary"):
query = {
"columns": ["uuid", "database_name", "backend"]
}
_, databases = self.get_databases(query=query)
# Map 'backend' to 'engine' for consistency with contracts
for db in databases:
db['engine'] = db.pop('backend', None)
return databases
# [/DEF:get_databases_summary:Function]
# [DEF:get_database_by_uuid:Function]
# @PURPOSE: Find a database by its UUID.
# @PARAM: db_uuid (str) - The UUID of the database.
# @PRE: db_uuid must be a valid UUID string.
# @POST: Returns database info or None.
# @RETURN: Optional[Dict] - Database info if found, else None.
def get_database_by_uuid(self, db_uuid: str) -> Optional[Dict]:
with belief_scope("SupersetClient.get_database_by_uuid", f"uuid={db_uuid}"):
query = {
"filters": [{"col": "uuid", "op": "eq", "value": db_uuid}]
}
_, databases = self.get_databases(query=query)
return databases[0] if databases else None
# [/DEF:get_database_by_uuid:Function]
# [/SECTION]
# [SECTION: HELPERS]
# [DEF:_resolve_target_id_for_delete:Function]
# @PURPOSE: Resolves a dashboard ID from either an ID or a slug.
# @PRE: Either dash_id or dash_slug should be provided.
# @POST: Returns the resolved ID or None.
def _resolve_target_id_for_delete(self, dash_id: Optional[int], dash_slug: Optional[str]) -> Optional[int]:
with belief_scope("_resolve_target_id_for_delete"):
if dash_id is not None:
return dash_id
if dash_slug is not None:
app_logger.debug("[_resolve_target_id_for_delete][State] Resolving ID by slug '%s'.", dash_slug)
try:
_, candidates = self.get_dashboards(query={"filters": [{"col": "slug", "op": "eq", "value": dash_slug}]})
if candidates:
target_id = candidates[0]["id"]
app_logger.debug("[_resolve_target_id_for_delete][Success] Resolved slug to ID %s.", target_id)
return target_id
except Exception as e:
app_logger.warning("[_resolve_target_id_for_delete][Warning] Could not resolve slug '%s' to ID: %s", dash_slug, e)
return None
# [/DEF:_resolve_target_id_for_delete:Function]
# [DEF:_do_import:Function]
# @PURPOSE: Performs the actual multipart upload for import.
# @PRE: file_name must be a path to an existing ZIP file.
# @POST: Returns the API response from the upload.
def _do_import(self, file_name: Union[str, Path]) -> Dict:
with belief_scope("_do_import"):
app_logger.debug(f"[_do_import][State] Uploading file: {file_name}")
file_path = Path(file_name)
if not file_path.exists():
app_logger.error(f"[_do_import][Failure] File does not exist: {file_name}")
raise FileNotFoundError(f"File does not exist: {file_name}")
return self.network.upload_file(
endpoint="/dashboard/import/",
file_info={"file_obj": file_path, "file_name": file_path.name, "form_field": "formData"},
extra_data={"overwrite": "true"},
timeout=self.env.timeout * 2,
)
# [/DEF:_do_import:Function]
# [DEF:_validate_export_response:Function]
# @PURPOSE: Validates that the export response is a non-empty ZIP archive.
# @PRE: response must be a valid requests.Response object.
# @POST: Raises SupersetAPIError if validation fails.
def _validate_export_response(self, response: Response, dashboard_id: int) -> None:
with belief_scope("_validate_export_response"):
content_type = response.headers.get("Content-Type", "")
if "application/zip" not in content_type:
raise SupersetAPIError(f"Response is not a ZIP archive (Content-Type: {content_type})")
if not response.content:
raise SupersetAPIError("Export returned empty data")
# [/DEF:_validate_export_response:Function]
# [DEF:_resolve_export_filename:Function]
# @PURPOSE: Determines the filename for an exported dashboard.
# @PRE: response must contain Content-Disposition header or dashboard_id must be provided.
# @POST: Returns a sanitized filename string.
def _resolve_export_filename(self, response: Response, dashboard_id: int) -> str:
with belief_scope("_resolve_export_filename"):
filename = get_filename_from_headers(dict(response.headers))
if not filename:
from datetime import datetime
timestamp = datetime.now().strftime("%Y%m%dT%H%M%S")
filename = f"dashboard_export_{dashboard_id}_{timestamp}.zip"
app_logger.warning("[_resolve_export_filename][Warning] Generated filename: %s", filename)
return filename
# [/DEF:_resolve_export_filename:Function]
# [DEF:_validate_query_params:Function]
# @PURPOSE: Ensures query parameters have default page and page_size.
# @PRE: query can be None or a dictionary.
# @POST: Returns a dictionary with at least page and page_size.
def _validate_query_params(self, query: Optional[Dict]) -> Dict:
with belief_scope("_validate_query_params"):
base_query = {"page": 0, "page_size": 1000}
return {**base_query, **(query or {})}
# [/DEF:_validate_query_params:Function]
# [DEF:_fetch_total_object_count:Function]
# @PURPOSE: Fetches the total number of items for a given endpoint.
# @PRE: endpoint must be a valid Superset API path.
# @POST: Returns the total count as an integer.
def _fetch_total_object_count(self, endpoint: str) -> int:
with belief_scope("_fetch_total_object_count"):
return self.network.fetch_paginated_count(
endpoint=endpoint,
query_params={"page": 0, "page_size": 1},
count_field="count",
)
# [/DEF:_fetch_total_object_count:Function]
# [DEF:_fetch_all_pages:Function]
# @PURPOSE: Iterates through all pages to collect all data items.
# @PRE: pagination_options must contain base_query, total_count, and results_field.
# @POST: Returns a combined list of all items.
def _fetch_all_pages(self, endpoint: str, pagination_options: Dict) -> List[Dict]:
with belief_scope("_fetch_all_pages"):
return self.network.fetch_paginated_data(endpoint=endpoint, pagination_options=pagination_options)
# [/DEF:_fetch_all_pages:Function]
# [DEF:_validate_import_file:Function]
# @PURPOSE: Validates that the file to be imported is a valid ZIP with metadata.yaml.
# @PRE: zip_path must be a path to a file.
# @POST: Raises error if file is missing, not a ZIP, or missing metadata.
def _validate_import_file(self, zip_path: Union[str, Path]) -> None:
with belief_scope("_validate_import_file"):
path = Path(zip_path)
if not path.exists():
raise FileNotFoundError(f"File {zip_path} does not exist")
if not zipfile.is_zipfile(path):
raise SupersetAPIError(f"File {zip_path} is not a ZIP archive")
with zipfile.ZipFile(path, "r") as zf:
if not any(n.endswith("metadata.yaml") for n in zf.namelist()):
raise SupersetAPIError(f"Archive {zip_path} does not contain 'metadata.yaml'")
# [/DEF:_validate_import_file:Function]
# [/SECTION]
# [/DEF:SupersetClient:Class]
# [/DEF:backend.src.core.superset_client:Module]
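
The import pre-check in `_validate_import_file` can be tried in isolation. The following is a minimal standalone sketch, not the project's code: `ImportValidationError` is a hypothetical stand-in for `SupersetAPIError`, and the function is written as a free function rather than a method.

```python
import zipfile
from pathlib import Path
from typing import Union


class ImportValidationError(Exception):
    """Hypothetical stand-in for the project's SupersetAPIError."""


def validate_import_file(zip_path: Union[str, Path]) -> None:
    """Raise if the path is missing, is not a ZIP, or has no metadata.yaml."""
    path = Path(zip_path)
    if not path.exists():
        raise FileNotFoundError(f"File {zip_path} does not exist")
    if not zipfile.is_zipfile(path):
        raise ImportValidationError(f"File {zip_path} is not a ZIP archive")
    with zipfile.ZipFile(path, "r") as zf:
        # Superset export bundles nest metadata.yaml under a top-level folder,
        # so match on the suffix rather than the exact member name.
        if not any(name.endswith("metadata.yaml") for name in zf.namelist()):
            raise ImportValidationError(
                f"Archive {zip_path} does not contain 'metadata.yaml'"
            )
```

Running the three checks before the multipart upload means a bad file fails locally instead of producing a less specific error from the `/dashboard/import/` endpoint.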

@@ -1,203 +0,0 @@
# [DEF:TaskManagerModule:Module]
# @SEMANTICS: task, manager, lifecycle, execution, state
# @PURPOSE: Manages the lifecycle of tasks, including their creation, execution, and state tracking. It uses a thread pool to run plugins asynchronously.
# @LAYER: Core
# @RELATION: Depends on PluginLoader to get plugin instances. It is used by the API layer to create and query tasks.
import asyncio
import uuid
from datetime import datetime
from enum import Enum
from typing import Dict, Any, List, Optional
from concurrent.futures import ThreadPoolExecutor
from pydantic import BaseModel, Field
# Assuming PluginBase and PluginConfig are defined in plugin_base.py
# from .plugin_base import PluginBase, PluginConfig # Not needed here, TaskManager interacts with the PluginLoader
# [DEF:TaskStatus:Enum]
# @SEMANTICS: task, status, state, enum
# @PURPOSE: Defines the possible states a task can be in during its lifecycle.
class TaskStatus(str, Enum):
PENDING = "PENDING"
RUNNING = "RUNNING"
SUCCESS = "SUCCESS"
FAILED = "FAILED"
AWAITING_MAPPING = "AWAITING_MAPPING"
# [/DEF]
# [DEF:LogEntry:Class]
# @SEMANTICS: log, entry, record, pydantic
# @PURPOSE: A Pydantic model representing a single, structured log entry associated with a task.
class LogEntry(BaseModel):
timestamp: datetime = Field(default_factory=datetime.utcnow)
level: str
message: str
context: Optional[Dict[str, Any]] = None
# [/DEF]
# [DEF:Task:Class]
# @SEMANTICS: task, job, execution, state, pydantic
# @PURPOSE: A Pydantic model representing a single execution instance of a plugin, including its status, parameters, and logs.
class Task(BaseModel):
id: str = Field(default_factory=lambda: str(uuid.uuid4()))
plugin_id: str
status: TaskStatus = TaskStatus.PENDING
started_at: Optional[datetime] = None
finished_at: Optional[datetime] = None
user_id: Optional[str] = None
logs: List[LogEntry] = Field(default_factory=list)
params: Dict[str, Any] = Field(default_factory=dict)
# [/DEF]
# [DEF:TaskManager:Class]
# @SEMANTICS: task, manager, lifecycle, execution, state
# @PURPOSE: Manages the lifecycle of tasks, including their creation, execution, and state tracking.
class TaskManager:
"""
Manages the lifecycle of tasks, including their creation, execution, and state tracking.
"""
def __init__(self, plugin_loader):
self.plugin_loader = plugin_loader
self.tasks: Dict[str, Task] = {}
self.subscribers: Dict[str, List[asyncio.Queue]] = {}
self.executor = ThreadPoolExecutor(max_workers=5) # For CPU-bound plugin execution
self.loop = asyncio.get_event_loop()
self.task_futures: Dict[str, asyncio.Future] = {}
# [/DEF]
async def create_task(self, plugin_id: str, params: Dict[str, Any], user_id: Optional[str] = None) -> Task:
"""
Creates and queues a new task for execution.
"""
if not self.plugin_loader.has_plugin(plugin_id):
raise ValueError(f"Plugin with ID '{plugin_id}' not found.")
plugin = self.plugin_loader.get_plugin(plugin_id)
# Validate params against plugin schema (this will be done at a higher level, e.g., API route)
# For now, a basic check
if not isinstance(params, dict):
raise ValueError("Task parameters must be a dictionary.")
task = Task(plugin_id=plugin_id, params=params, user_id=user_id)
self.tasks[task.id] = task
self.loop.create_task(self._run_task(task.id)) # Schedule task for execution
return task
async def _run_task(self, task_id: str):
"""
Internal method to execute a task.
"""
task = self.tasks[task_id]
plugin = self.plugin_loader.get_plugin(task.plugin_id)
task.status = TaskStatus.RUNNING
task.started_at = datetime.utcnow()
self._add_log(task_id, "INFO", f"Task started for plugin '{plugin.name}'")
try:
# Execute plugin in a separate thread to avoid blocking the event loop
# if the plugin's execute method is synchronous and potentially CPU-bound.
# If the plugin's execute method is already async, this can be simplified.
# Pass task_id to plugin so it can signal pause
params = {**task.params, "_task_id": task_id}
await self.loop.run_in_executor(
self.executor,
lambda: asyncio.run(plugin.execute(params)) if asyncio.iscoroutinefunction(plugin.execute) else plugin.execute(params)
)
task.status = TaskStatus.SUCCESS
self._add_log(task_id, "INFO", f"Task completed successfully for plugin '{plugin.name}'")
except Exception as e:
task.status = TaskStatus.FAILED
self._add_log(task_id, "ERROR", f"Task failed: {e}", {"error_type": type(e).__name__})
finally:
task.finished_at = datetime.utcnow()
# In a real system, you might notify clients via WebSocket here
async def resolve_task(self, task_id: str, resolution_params: Dict[str, Any]):
"""
Resumes a task that is awaiting mapping.
"""
task = self.tasks.get(task_id)
if not task or task.status != TaskStatus.AWAITING_MAPPING:
raise ValueError("Task is not awaiting mapping.")
# Update task params with resolution
task.params.update(resolution_params)
task.status = TaskStatus.RUNNING
self._add_log(task_id, "INFO", "Task resumed after mapping resolution.")
# Signal the future to continue
if task_id in self.task_futures:
self.task_futures[task_id].set_result(True)
async def wait_for_resolution(self, task_id: str):
"""
Pauses execution and waits for a resolution signal.
"""
task = self.tasks.get(task_id)
if not task: return
task.status = TaskStatus.AWAITING_MAPPING
self.task_futures[task_id] = self.loop.create_future()
try:
await self.task_futures[task_id]
finally:
del self.task_futures[task_id]
def get_task(self, task_id: str) -> Optional[Task]:
"""
Retrieves a task by its ID.
"""
return self.tasks.get(task_id)
def get_all_tasks(self) -> List[Task]:
"""
Retrieves all registered tasks.
"""
return list(self.tasks.values())
def get_task_logs(self, task_id: str) -> List[LogEntry]:
"""
Retrieves logs for a specific task.
"""
task = self.tasks.get(task_id)
return task.logs if task else []
def _add_log(self, task_id: str, level: str, message: str, context: Optional[Dict[str, Any]] = None):
"""
Adds a log entry to a task and notifies subscribers.
"""
task = self.tasks.get(task_id)
if not task:
return
log_entry = LogEntry(level=level, message=message, context=context)
task.logs.append(log_entry)
# Notify subscribers
if task_id in self.subscribers:
for queue in self.subscribers[task_id]:
self.loop.call_soon_threadsafe(queue.put_nowait, log_entry)
async def subscribe_logs(self, task_id: str) -> asyncio.Queue:
"""
Subscribes to real-time logs for a task.
"""
queue = asyncio.Queue()
if task_id not in self.subscribers:
self.subscribers[task_id] = []
self.subscribers[task_id].append(queue)
return queue
def unsubscribe_logs(self, task_id: str, queue: asyncio.Queue):
"""
Unsubscribes from real-time logs for a task.
"""
if task_id in self.subscribers:
self.subscribers[task_id].remove(queue)
if not self.subscribers[task_id]:
del self.subscribers[task_id]
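
The `_run_task` method above sends every plugin through `run_in_executor` and wraps coroutine plugins in `asyncio.run` inside a worker thread. A minimal sketch of a simpler dispatch, assuming each plugin exposes a single `execute(params)` callable: coroutine plugins are awaited on the running loop, and only synchronous ones go to the thread pool. The names `run_plugin`, `sync_plugin` and `async_plugin` are illustrative and not part of the project.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


async def run_plugin(
    execute: Callable[[Dict[str, Any]], Any],
    params: Dict[str, Any],
    executor: ThreadPoolExecutor,
) -> Any:
    """Await async plugins directly; run blocking plugins in the thread pool."""
    if asyncio.iscoroutinefunction(execute):
        return await execute(params)
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, execute, params)


def sync_plugin(params: Dict[str, Any]) -> int:
    # Hypothetical blocking plugin.
    return params["x"] + 1


async def async_plugin(params: Dict[str, Any]) -> int:
    # Hypothetical coroutine plugin.
    await asyncio.sleep(0)
    return params["x"] * 2


async def main() -> tuple:
    with ThreadPoolExecutor(max_workers=2) as pool:
        first = await run_plugin(sync_plugin, {"x": 3}, pool)
        second = await run_plugin(async_plugin, {"x": 3}, pool)
    return first, second


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Keeping coroutine plugins on the main loop matters for the pause/resume flow: a future created with `self.loop.create_future()` belongs to that loop, so a plugin running under a separate `asyncio.run` loop in a worker thread cannot safely await it.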

@@ -0,0 +1,12 @@
# [DEF:TaskManagerPackage:Module]
# @SEMANTICS: task, manager, package, exports
# @PURPOSE: Exports the public API of the task manager package.
# @LAYER: Core
# @RELATION: Aggregates models and manager.
from .models import Task, TaskStatus, LogEntry
from .manager import TaskManager
__all__ = ["TaskManager", "Task", "TaskStatus", "LogEntry"]
# [/DEF:TaskManagerPackage:Module]
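
As a usage sketch of the pause/resume flow that `TaskManager` exposes (`wait_for_resolution` paired with `resolve_task`), the future handshake can be reduced to a few lines. `PauseRegistry` and `demo` are hypothetical names used only for illustration. The sketch also guards against resuming a task that is not waiting, which is an assumption about desirable behavior rather than something the source does.

```python
import asyncio
from typing import Any, Dict


class PauseRegistry:
    """Keeps one pending future per paused task ID."""

    def __init__(self) -> None:
        self._futures: Dict[str, asyncio.Future] = {}

    async def wait(self, task_id: str) -> Any:
        """Block the calling coroutine until resume() is called for task_id."""
        future = asyncio.get_running_loop().create_future()
        self._futures[task_id] = future
        try:
            return await future
        finally:
            # Always drop the entry, even if the waiter is cancelled.
            self._futures.pop(task_id, None)

    def resume(self, task_id: str, value: Any) -> bool:
        """Wake the waiter; return False if nothing is waiting."""
        future = self._futures.get(task_id)
        if future is None or future.done():
            return False
        future.set_result(value)
        return True


async def demo() -> tuple:
    registry = PauseRegistry()
    waiter = asyncio.create_task(registry.wait("task-1"))
    await asyncio.sleep(0)  # let the waiter register its future
    resumed = registry.resume("task-1", {"mapping": "ok"})
    result = await waiter
    resumed_again = registry.resume("task-1", None)
    return resumed, result, resumed_again


if __name__ == "__main__":
    print(asyncio.run(demo()))
```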

@@ -0,0 +1,47 @@
# [DEF:TaskCleanupModule:Module]
# @SEMANTICS: task, cleanup, retention
# @PURPOSE: Implements task cleanup and retention policies.
# @LAYER: Core
# @RELATION: Uses TaskPersistenceService to delete old tasks.
from datetime import datetime, timedelta
from .persistence import TaskPersistenceService
from ..logger import logger, belief_scope
from ..config_manager import ConfigManager
# [DEF:TaskCleanupService:Class]
# @PURPOSE: Provides methods to clean up old task records.
class TaskCleanupService:
# [DEF:__init__:Function]
# @PURPOSE: Initializes the cleanup service with dependencies.
# @PRE: persistence_service and config_manager are valid.
# @POST: Cleanup service is ready.
def __init__(self, persistence_service: TaskPersistenceService, config_manager: ConfigManager):
self.persistence_service = persistence_service
self.config_manager = config_manager
# [/DEF:__init__:Function]
# [DEF:run_cleanup:Function]
# @PURPOSE: Deletes tasks older than the configured retention period.
# @PRE: Config manager has valid settings.
# @POST: Old tasks are deleted from persistence.
def run_cleanup(self):
with belief_scope("TaskCleanupService.run_cleanup"):
settings = self.config_manager.get_config().settings
retention_days = settings.task_retention_days
# This is a simplified implementation.
# In a real scenario, we would query IDs of tasks older than retention_days.
# For now, we'll log the action.
logger.info(f"Cleaning up tasks older than {retention_days} days.")
# Re-loading tasks to check for limit
tasks = self.persistence_service.load_tasks(limit=1000)
if len(tasks) > settings.task_retention_limit:
to_delete = [t.id for t in tasks[settings.task_retention_limit:]]
self.persistence_service.delete_tasks(to_delete)
logger.info(f"Deleted {len(to_delete)} tasks exceeding limit of {settings.task_retention_limit}")
# [/DEF:run_cleanup:Function]
# [/DEF:TaskCleanupService:Class]
# [/DEF:TaskCleanupModule:Module]
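
`run_cleanup` currently enforces only the count limit; its comments note that the age-based rule is not implemented yet. A minimal sketch of how both rules could be combined when selecting IDs to delete, under stated assumptions: `TaskRecord` is a hypothetical stand-in for the persisted task model, and `select_tasks_to_delete` is not an existing project function.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TaskRecord:
    """Hypothetical minimal view of a persisted task."""
    id: str
    started_at: Optional[datetime]


def select_tasks_to_delete(
    tasks: List[TaskRecord],
    retention_days: int,
    retention_limit: int,
    now: datetime,
) -> List[str]:
    """Return IDs of tasks older than the cutoff or beyond the newest N."""
    cutoff = now - timedelta(days=retention_days)
    # Newest first; tasks that never started sort last.
    ordered = sorted(
        tasks, key=lambda t: t.started_at or datetime.min, reverse=True
    )
    to_delete: List[str] = []
    for index, task in enumerate(ordered):
        too_old = task.started_at is not None and task.started_at < cutoff
        over_limit = index >= retention_limit
        if too_old or over_limit:
            to_delete.append(task.id)
    return to_delete
```

Passing `now` explicitly keeps the selection deterministic and easy to test; the result can then be handed to `delete_tasks` as in the method above.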

@@ -0,0 +1,398 @@
# [DEF:TaskManagerModule:Module]
# @SEMANTICS: task, manager, lifecycle, execution, state
# @PURPOSE: Manages the lifecycle of tasks, including their creation, execution, and state tracking. It uses a thread pool to run plugins asynchronously.
# @LAYER: Core
# @RELATION: Depends on PluginLoader to get plugin instances. It is used by the API layer to create and query tasks.
# @INVARIANT: Task IDs are unique.
# @CONSTRAINT: Must use belief_scope for logging.
# [SECTION: IMPORTS]
import asyncio
from datetime import datetime
from typing import Dict, Any, List, Optional
from concurrent.futures import ThreadPoolExecutor
from .models import Task, TaskStatus, LogEntry
from .persistence import TaskPersistenceService
from ..logger import logger, belief_scope
# [/SECTION]
# [DEF:TaskManager:Class]
# @SEMANTICS: task, manager, lifecycle, execution, state
# @PURPOSE: Manages the lifecycle of tasks, including their creation, execution, and state tracking.
class TaskManager:
"""
Manages the lifecycle of tasks, including their creation, execution, and state tracking.
"""
# [DEF:__init__:Function]
# @PURPOSE: Initialize the TaskManager with dependencies.
# @PRE: plugin_loader is initialized.
# @POST: TaskManager is ready to accept tasks.
# @PARAM: plugin_loader - The plugin loader instance.
def __init__(self, plugin_loader):
with belief_scope("TaskManager.__init__"):
self.plugin_loader = plugin_loader
self.tasks: Dict[str, Task] = {}
self.subscribers: Dict[str, List[asyncio.Queue]] = {}
self.executor = ThreadPoolExecutor(max_workers=5) # Runs synchronous (blocking) plugin code off the event loop
self.persistence_service = TaskPersistenceService()
try:
self.loop = asyncio.get_running_loop()
except RuntimeError:
self.loop = asyncio.get_event_loop()
self.task_futures: Dict[str, asyncio.Future] = {}
# Load persisted tasks on startup
self.load_persisted_tasks()
# [/DEF:__init__:Function]
# [DEF:create_task:Function]
# @PURPOSE: Creates and queues a new task for execution.
# @PRE: Plugin with plugin_id exists. Params are valid.
# @POST: Task is created, added to registry, and scheduled for execution.
# @PARAM: plugin_id (str) - The ID of the plugin to run.
# @PARAM: params (Dict[str, Any]) - Parameters for the plugin.
# @PARAM: user_id (Optional[str]) - ID of the user requesting the task.
# @RETURN: Task - The created task instance.
# @THROWS: ValueError if plugin not found or params invalid.
async def create_task(self, plugin_id: str, params: Dict[str, Any], user_id: Optional[str] = None) -> Task:
with belief_scope("TaskManager.create_task", f"plugin_id={plugin_id}"):
if not self.plugin_loader.has_plugin(plugin_id):
logger.error(f"Plugin with ID '{plugin_id}' not found.")
raise ValueError(f"Plugin with ID '{plugin_id}' not found.")
plugin = self.plugin_loader.get_plugin(plugin_id)
if not isinstance(params, dict):
logger.error("Task parameters must be a dictionary.")
raise ValueError("Task parameters must be a dictionary.")
task = Task(plugin_id=plugin_id, params=params, user_id=user_id)
self.tasks[task.id] = task
self.persistence_service.persist_task(task)
logger.info(f"Task {task.id} created and scheduled for execution")
self.loop.create_task(self._run_task(task.id)) # Schedule task for execution
return task
# [/DEF:create_task:Function]
# [DEF:_run_task:Function]
# @PURPOSE: Internal method to execute a task.
# @PRE: Task exists in registry.
# @POST: Task is executed, status updated to SUCCESS or FAILED.
# @PARAM: task_id (str) - The ID of the task to run.
async def _run_task(self, task_id: str):
with belief_scope("TaskManager._run_task", f"task_id={task_id}"):
task = self.tasks[task_id]
plugin = self.plugin_loader.get_plugin(task.plugin_id)
logger.info(f"Starting execution of task {task_id} for plugin '{plugin.name}'")
task.status = TaskStatus.RUNNING
task.started_at = datetime.utcnow()
self.persistence_service.persist_task(task)
self._add_log(task_id, "INFO", f"Task started for plugin '{plugin.name}'")
try:
# Execute plugin
params = {**task.params, "_task_id": task_id}
if asyncio.iscoroutinefunction(plugin.execute):
task.result = await plugin.execute(params)
else:
task.result = await self.loop.run_in_executor(
self.executor,
plugin.execute,
params
)
logger.info(f"Task {task_id} completed successfully")
task.status = TaskStatus.SUCCESS
self._add_log(task_id, "INFO", f"Task completed successfully for plugin '{plugin.name}'")
except Exception as e:
logger.error(f"Task {task_id} failed: {e}")
task.status = TaskStatus.FAILED
self._add_log(task_id, "ERROR", f"Task failed: {e}", {"error_type": type(e).__name__})
finally:
task.finished_at = datetime.utcnow()
self.persistence_service.persist_task(task)
logger.info(f"Task {task_id} execution finished with status: {task.status}")
# [/DEF:_run_task:Function]
# [DEF:resolve_task:Function]
# @PURPOSE: Resumes a task that is awaiting mapping.
# @PRE: Task exists and is in AWAITING_MAPPING state.
# @POST: Task status updated to RUNNING, params updated, execution resumed.
# @PARAM: task_id (str) - The ID of the task.
# @PARAM: resolution_params (Dict[str, Any]) - Params to resolve the wait.
# @THROWS: ValueError if task not found or not awaiting mapping.
async def resolve_task(self, task_id: str, resolution_params: Dict[str, Any]):
with belief_scope("TaskManager.resolve_task", f"task_id={task_id}"):
task = self.tasks.get(task_id)
if not task or task.status != TaskStatus.AWAITING_MAPPING:
raise ValueError("Task is not awaiting mapping.")
# Update task params with resolution
task.params.update(resolution_params)
task.status = TaskStatus.RUNNING
self.persistence_service.persist_task(task)
self._add_log(task_id, "INFO", "Task resumed after mapping resolution.")
# Signal the future to continue
if task_id in self.task_futures:
self.task_futures[task_id].set_result(True)
# [/DEF:resolve_task:Function]
# [DEF:wait_for_resolution:Function]
# @PURPOSE: Pauses execution and waits for a resolution signal.
# @PRE: Task exists.
# @POST: Execution pauses until future is set.
# @PARAM: task_id (str) - The ID of the task.
async def wait_for_resolution(self, task_id: str):
with belief_scope("TaskManager.wait_for_resolution", f"task_id={task_id}"):
task = self.tasks.get(task_id)
if not task: return
task.status = TaskStatus.AWAITING_MAPPING
self.persistence_service.persist_task(task)
self.task_futures[task_id] = self.loop.create_future()
try:
await self.task_futures[task_id]
finally:
if task_id in self.task_futures:
del self.task_futures[task_id]
# [/DEF:wait_for_resolution:Function]
# [DEF:wait_for_input:Function]
# @PURPOSE: Pauses execution and waits for user input.
# @PRE: Task exists.
# @POST: Execution pauses until future is set via resume_task_with_password.
# @PARAM: task_id (str) - The ID of the task.
async def wait_for_input(self, task_id: str):
with belief_scope("TaskManager.wait_for_input", f"task_id={task_id}"):
task = self.tasks.get(task_id)
if not task: return
# Status is already set to AWAITING_INPUT by await_input()
self.task_futures[task_id] = self.loop.create_future()
try:
await self.task_futures[task_id]
finally:
if task_id in self.task_futures:
del self.task_futures[task_id]
# [/DEF:wait_for_input:Function]
# [DEF:get_task:Function]
# @PURPOSE: Retrieves a task by its ID.
# @PRE: task_id is a string.
# @POST: Returns Task object or None.
# @PARAM: task_id (str) - ID of the task.
# @RETURN: Optional[Task] - The task or None.
def get_task(self, task_id: str) -> Optional[Task]:
with belief_scope("TaskManager.get_task", f"task_id={task_id}"):
return self.tasks.get(task_id)
# [/DEF:get_task:Function]
# [DEF:get_all_tasks:Function]
# @PURPOSE: Retrieves all registered tasks.
# @PRE: None.
# @POST: Returns list of all Task objects.
# @RETURN: List[Task] - All tasks.
def get_all_tasks(self) -> List[Task]:
with belief_scope("TaskManager.get_all_tasks"):
return list(self.tasks.values())
# [/DEF:get_all_tasks:Function]
# [DEF:get_tasks:Function]
# @PURPOSE: Retrieves tasks with pagination and optional status filter.
# @PRE: limit and offset are non-negative integers.
# @POST: Returns a list of tasks sorted by started_at descending.
# @PARAM: limit (int) - Maximum number of tasks to return.
# @PARAM: offset (int) - Number of tasks to skip.
# @PARAM: status (Optional[TaskStatus]) - Filter by task status.
# @RETURN: List[Task] - List of tasks matching criteria.
def get_tasks(self, limit: int = 10, offset: int = 0, status: Optional[TaskStatus] = None) -> List[Task]:
with belief_scope("TaskManager.get_tasks"):
tasks = list(self.tasks.values())
if status:
tasks = [t for t in tasks if t.status == status]
# Sort by started_at descending (most recent first); tasks that have not started sort last
tasks.sort(key=lambda t: t.started_at or datetime.min, reverse=True)
return tasks[offset:offset + limit]
# [/DEF:get_tasks:Function]
# [DEF:get_task_logs:Function]
# @PURPOSE: Retrieves logs for a specific task.
# @PRE: task_id is a string.
# @POST: Returns list of LogEntry objects.
# @PARAM: task_id (str) - ID of the task.
# @RETURN: List[LogEntry] - List of log entries.
def get_task_logs(self, task_id: str) -> List[LogEntry]:
with belief_scope("TaskManager.get_task_logs", f"task_id={task_id}"):
task = self.tasks.get(task_id)
return task.logs if task else []
# [/DEF:get_task_logs:Function]
# [DEF:_add_log:Function]
# @PURPOSE: Adds a log entry to a task and notifies subscribers.
# @PRE: Task exists.
# @POST: Log added to task and pushed to queues.
# @PARAM: task_id (str) - ID of the task.
# @PARAM: level (str) - Log level.
# @PARAM: message (str) - Log message.
# @PARAM: context (Optional[Dict]) - Log context.
def _add_log(self, task_id: str, level: str, message: str, context: Optional[Dict[str, Any]] = None):
with belief_scope("TaskManager._add_log", f"task_id={task_id}"):
task = self.tasks.get(task_id)
if not task:
return
log_entry = LogEntry(level=level, message=message, context=context)
task.logs.append(log_entry)
self.persistence_service.persist_task(task)
# Notify subscribers
if task_id in self.subscribers:
for queue in self.subscribers[task_id]:
self.loop.call_soon_threadsafe(queue.put_nowait, log_entry)
# [/DEF:_add_log:Function]
# [DEF:subscribe_logs:Function]
# @PURPOSE: Subscribes to real-time logs for a task.
# @PRE: task_id is a string.
# @POST: Returns an asyncio.Queue for log entries.
# @PARAM: task_id (str) - ID of the task.
# @RETURN: asyncio.Queue - Queue for log entries.
async def subscribe_logs(self, task_id: str) -> asyncio.Queue:
with belief_scope("TaskManager.subscribe_logs", f"task_id={task_id}"):
queue = asyncio.Queue()
if task_id not in self.subscribers:
self.subscribers[task_id] = []
self.subscribers[task_id].append(queue)
return queue
# [/DEF:subscribe_logs:Function]
# [DEF:unsubscribe_logs:Function]
# @PURPOSE: Unsubscribes from real-time logs for a task.
# @PRE: task_id is a string, queue is asyncio.Queue.
# @POST: Queue removed from subscribers.
# @PARAM: task_id (str) - ID of the task.
# @PARAM: queue (asyncio.Queue) - Queue to remove.
def unsubscribe_logs(self, task_id: str, queue: asyncio.Queue):
with belief_scope("TaskManager.unsubscribe_logs", f"task_id={task_id}"):
if task_id in self.subscribers:
if queue in self.subscribers[task_id]:
self.subscribers[task_id].remove(queue)
if not self.subscribers[task_id]:
del self.subscribers[task_id]
# [/DEF:unsubscribe_logs:Function]
# [DEF:load_persisted_tasks:Function]
# @PURPOSE: Load persisted tasks using persistence service.
# @PRE: None.
# @POST: Persisted tasks loaded into self.tasks.
def load_persisted_tasks(self) -> None:
with belief_scope("TaskManager.load_persisted_tasks"):
loaded_tasks = self.persistence_service.load_tasks(limit=100)
for task in loaded_tasks:
if task.id not in self.tasks:
self.tasks[task.id] = task
# [/DEF:load_persisted_tasks:Function]
# [DEF:await_input:Function]
# @PURPOSE: Transition a task to AWAITING_INPUT state with input request.
# @PRE: Task exists and is in RUNNING state.
# @POST: Task status changed to AWAITING_INPUT, input_request set, persisted.
# @PARAM: task_id (str) - ID of the task.
# @PARAM: input_request (Dict) - Details about required input.
# @THROWS: ValueError if task not found or not RUNNING.
def await_input(self, task_id: str, input_request: Dict[str, Any]) -> None:
with belief_scope("TaskManager.await_input", f"task_id={task_id}"):
task = self.tasks.get(task_id)
if not task:
raise ValueError(f"Task {task_id} not found")
if task.status != TaskStatus.RUNNING:
raise ValueError(f"Task {task_id} is not RUNNING (current: {task.status})")
task.status = TaskStatus.AWAITING_INPUT
task.input_required = True
task.input_request = input_request
self.persistence_service.persist_task(task)
self._add_log(task_id, "INFO", "Task paused for user input", {"input_request": input_request})
# [/DEF:await_input:Function]
# [DEF:resume_task_with_password:Function]
# @PURPOSE: Resume a task that is awaiting input with provided passwords.
# @PRE: Task exists and is in AWAITING_INPUT state.
# @POST: Task status changed to RUNNING, passwords injected, task resumed.
# @PARAM: task_id (str) - ID of the task.
# @PARAM: passwords (Dict[str, str]) - Mapping of database name to password.
# @THROWS: ValueError if task not found, not awaiting input, or passwords invalid.
def resume_task_with_password(self, task_id: str, passwords: Dict[str, str]) -> None:
with belief_scope("TaskManager.resume_task_with_password", f"task_id={task_id}"):
task = self.tasks.get(task_id)
if not task:
raise ValueError(f"Task {task_id} not found")
if task.status != TaskStatus.AWAITING_INPUT:
raise ValueError(f"Task {task_id} is not AWAITING_INPUT (current: {task.status})")
if not isinstance(passwords, dict) or not passwords:
raise ValueError("Passwords must be a non-empty dictionary")
task.params["passwords"] = passwords
task.input_required = False
task.input_request = None
task.status = TaskStatus.RUNNING
self.persistence_service.persist_task(task)
self._add_log(task_id, "INFO", "Task resumed with passwords", {"databases": list(passwords.keys())})
if task_id in self.task_futures and not self.task_futures[task_id].done():
self.task_futures[task_id].set_result(True)
# [/DEF:resume_task_with_password:Function]
# [DEF:clear_tasks:Function]
# @PURPOSE: Clears tasks based on status filter.
# @PRE: status is Optional[TaskStatus].
# @POST: Tasks matching filter (or all non-active) cleared from registry and database.
# @PARAM: status (Optional[TaskStatus]) - Filter by task status.
# @RETURN: int - Number of tasks cleared.
def clear_tasks(self, status: Optional[TaskStatus] = None) -> int:
with belief_scope("TaskManager.clear_tasks"):
tasks_to_remove = []
for task_id, task in list(self.tasks.items()):
# If status is provided, remove only tasks with exactly that status.
# If status is None, remove every task that is not active.
# Active means RUNNING, AWAITING_INPUT or AWAITING_MAPPING; the two AWAITING_* states
# are distinct TaskStatus values for paused tasks and are kept alongside RUNNING.
should_remove = False
if status:
if task.status == status:
should_remove = True
else:
# Clear all non-active tasks (keep RUNNING, AWAITING_INPUT, AWAITING_MAPPING)
if task.status not in [TaskStatus.RUNNING, TaskStatus.AWAITING_INPUT, TaskStatus.AWAITING_MAPPING]:
should_remove = True
if should_remove:
tasks_to_remove.append(task_id)
for tid in tasks_to_remove:
# Cancel future if exists (e.g. for AWAITING_INPUT/MAPPING)
if tid in self.task_futures:
self.task_futures[tid].cancel()
del self.task_futures[tid]
del self.tasks[tid]
# Remove from persistence
self.persistence_service.delete_tasks(tasks_to_remove)
logger.info(f"Cleared {len(tasks_to_remove)} tasks.")
return len(tasks_to_remove)
# [/DEF:clear_tasks:Function]
# [/DEF:TaskManager:Class]
# [/DEF:TaskManagerModule:Module]
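
The pause/resume handshake behind `await_input` and `resume_task_with_password` can be sketched in isolation. This is a minimal illustration, not the project's `TaskManager`: `MiniTaskManager`, `wait_for_input`, `resume` and `demo` are hypothetical names, and persistence and logging are left out. It assumes the worker parks on an `asyncio.Future` stored in `task_futures`, which is what the `set_result(True)` call above suggests.

```python
import asyncio
from typing import Dict


class MiniTaskManager:
    """Hypothetical stand-in that models only the pause/resume handshake."""

    def __init__(self) -> None:
        self.task_futures: Dict[str, asyncio.Future] = {}
        self.params: Dict[str, dict] = {}

    async def wait_for_input(self, task_id: str) -> dict:
        # The worker parks on a future until resume() resolves it.
        future = asyncio.get_running_loop().create_future()
        self.task_futures[task_id] = future
        try:
            await future
        finally:
            self.task_futures.pop(task_id, None)
        return self.params[task_id]

    def resume(self, task_id: str, passwords: Dict[str, str]) -> None:
        if not isinstance(passwords, dict) or not passwords:
            raise ValueError("Passwords must be a non-empty dictionary")
        self.params.setdefault(task_id, {})["passwords"] = passwords
        future = self.task_futures.get(task_id)
        # Skip futures that are already resolved or cancelled.
        if future is not None and not future.done():
            future.set_result(True)


async def demo() -> dict:
    manager = MiniTaskManager()
    worker = asyncio.create_task(manager.wait_for_input("t1"))
    await asyncio.sleep(0)  # give the worker a chance to register its future
    manager.resume("t1", {"analytics_db": "secret"})
    return await worker
```

Calling `asyncio.run(demo())` returns the parameters the worker sees after it has been resumed. The `done()` check keeps a second resume call, or a resume after cancellation, from raising `InvalidStateError`.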


@@ -0,0 +1,68 @@
# [DEF:TaskManagerModels:Module]
# @SEMANTICS: task, models, pydantic, enum, state
# @PURPOSE: Defines the data models and enumerations used by the Task Manager.
# @LAYER: Core
# @RELATION: Used by TaskManager and API routes.
# @INVARIANT: Task IDs are immutable once created.
# @CONSTRAINT: Must use Pydantic for data validation.
# [SECTION: IMPORTS]
import uuid
from datetime import datetime
from enum import Enum
from typing import Dict, Any, List, Optional
from pydantic import BaseModel, Field
# [/SECTION]
# [DEF:TaskStatus:Enum]
# @SEMANTICS: task, status, state, enum
# @PURPOSE: Defines the possible states a task can be in during its lifecycle.
class TaskStatus(str, Enum):
PENDING = "PENDING"
RUNNING = "RUNNING"
SUCCESS = "SUCCESS"
FAILED = "FAILED"
AWAITING_MAPPING = "AWAITING_MAPPING"
AWAITING_INPUT = "AWAITING_INPUT"
# [/DEF:TaskStatus:Enum]
# [DEF:LogEntry:Class]
# @SEMANTICS: log, entry, record, pydantic
# @PURPOSE: A Pydantic model representing a single, structured log entry associated with a task.
class LogEntry(BaseModel):
timestamp: datetime = Field(default_factory=datetime.utcnow)
level: str
message: str
context: Optional[Dict[str, Any]] = None
# [/DEF:LogEntry:Class]
# [DEF:Task:Class]
# @SEMANTICS: task, job, execution, state, pydantic
# @PURPOSE: A Pydantic model representing a single execution instance of a plugin, including its status, parameters, and logs.
class Task(BaseModel):
id: str = Field(default_factory=lambda: str(uuid.uuid4()))
plugin_id: str
status: TaskStatus = TaskStatus.PENDING
started_at: Optional[datetime] = None
finished_at: Optional[datetime] = None
user_id: Optional[str] = None
logs: List[LogEntry] = Field(default_factory=list)
params: Dict[str, Any] = Field(default_factory=dict)
input_required: bool = False
input_request: Optional[Dict[str, Any]] = None
result: Optional[Dict[str, Any]] = None
# [DEF:__init__:Function]
# @PURPOSE: Initializes the Task model and validates input_request for AWAITING_INPUT status.
# @PRE: If status is AWAITING_INPUT, input_request must be provided.
# @POST: Task instance is created or ValueError is raised.
# @PARAM: **data - Keyword arguments for model initialization.
def __init__(self, **data):
super().__init__(**data)
if self.status == TaskStatus.AWAITING_INPUT and not self.input_request:
raise ValueError("input_request is required when status is AWAITING_INPUT")
# [/DEF:__init__:Function]
# [/DEF:Task:Class]
# [/DEF:TaskManagerModels:Module]
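
The invariant enforced in `Task.__init__` (an `AWAITING_INPUT` task must carry an `input_request`) can be shown with a standard-library analogue. This sketch deliberately uses a dataclass rather than Pydantic so it runs without third-party packages; `Status` and `TaskSketch` are hypothetical names, not part of the module above.

```python
import uuid
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Dict, Optional


class Status(str, Enum):
    PENDING = "PENDING"
    AWAITING_INPUT = "AWAITING_INPUT"


@dataclass
class TaskSketch:
    plugin_id: str
    status: Status = Status.PENDING
    input_request: Optional[Dict[str, Any]] = None
    id: str = field(default_factory=lambda: str(uuid.uuid4()))

    def __post_init__(self) -> None:
        # Same rule as Task.__init__: a paused task must say what it is waiting for.
        if self.status == Status.AWAITING_INPUT and not self.input_request:
            raise ValueError("input_request is required when status is AWAITING_INPUT")


ok = TaskSketch(
    plugin_id="backup",
    status=Status.AWAITING_INPUT,
    input_request={"type": "database_password"},
)
```

Because the enum derives from `str`, `ok.status == "AWAITING_INPUT"` holds, which is why the persistence layer can store `status.value` and rebuild the enum from the stored string.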


@@ -0,0 +1,173 @@
# [DEF:TaskPersistenceModule:Module]
# @SEMANTICS: persistence, sqlite, sqlalchemy, task, storage
# @PURPOSE: Handles the persistence of tasks using SQLAlchemy and the tasks.db database.
# @LAYER: Core
# @RELATION: Used by TaskManager to save and load tasks.
# @INVARIANT: Database schema must match the TaskRecord model structure.
# [SECTION: IMPORTS]
from datetime import datetime
from typing import List, Optional, Dict, Any
import json
from sqlalchemy.orm import Session
from ...models.task import TaskRecord
from ..database import TasksSessionLocal
from .models import Task, TaskStatus, LogEntry
from ..logger import logger, belief_scope
# [/SECTION]
# [DEF:TaskPersistenceService:Class]
# @SEMANTICS: persistence, service, database, sqlalchemy
# @PURPOSE: Provides methods to save and load tasks from the tasks.db database using SQLAlchemy.
class TaskPersistenceService:
# [DEF:__init__:Function]
# @PURPOSE: Initializes the persistence service.
# @PRE: None.
# @POST: Service is ready.
def __init__(self):
with belief_scope("TaskPersistenceService.__init__"):
# We use TasksSessionLocal from database.py
pass
# [/DEF:__init__:Function]
# [DEF:persist_task:Function]
# @PURPOSE: Persists or updates a single task in the database.
# @PRE: isinstance(task, Task)
# @POST: Task record created or updated in database.
# @PARAM: task (Task) - The task object to persist.
# @SIDE_EFFECT: Writes to task_records table in tasks.db
def persist_task(self, task: Task) -> None:
with belief_scope("TaskPersistenceService.persist_task", f"task_id={task.id}"):
session: Session = TasksSessionLocal()
try:
record = session.query(TaskRecord).filter(TaskRecord.id == task.id).first()
if not record:
record = TaskRecord(id=task.id)
session.add(record)
record.type = task.plugin_id
record.status = task.status.value
record.environment_id = task.params.get("environment_id") or task.params.get("source_env_id")
record.started_at = task.started_at
record.finished_at = task.finished_at
# Ensure params and result are JSON serializable
def json_serializable(obj):
if isinstance(obj, dict):
return {k: json_serializable(v) for k, v in obj.items()}
elif isinstance(obj, list):
return [json_serializable(v) for v in obj]
elif isinstance(obj, datetime):
return obj.isoformat()
return obj
record.params = json_serializable(task.params)
record.result = json_serializable(task.result)
# Store logs as JSON, converting datetime to string
record.logs = []
for log in task.logs:
log_dict = log.dict()
if isinstance(log_dict.get('timestamp'), datetime):
log_dict['timestamp'] = log_dict['timestamp'].isoformat()
# Also clean up any datetimes in context
if log_dict.get('context'):
log_dict['context'] = json_serializable(log_dict['context'])
record.logs.append(log_dict)
# Extract error if failed
if task.status == TaskStatus.FAILED:
for log in reversed(task.logs):
if log.level == "ERROR":
record.error = log.message
break
session.commit()
except Exception as e:
session.rollback()
logger.error(f"Failed to persist task {task.id}: {e}")
finally:
session.close()
# [/DEF:persist_task:Function]
# [DEF:persist_tasks:Function]
# @PURPOSE: Persists multiple tasks.
# @PRE: isinstance(tasks, list)
# @POST: All tasks in list are persisted.
# @PARAM: tasks (List[Task]) - The list of tasks to persist.
def persist_tasks(self, tasks: List[Task]) -> None:
with belief_scope("TaskPersistenceService.persist_tasks"):
for task in tasks:
self.persist_task(task)
# [/DEF:persist_tasks:Function]
# [DEF:load_tasks:Function]
# @PURPOSE: Loads tasks from the database.
# @PRE: limit is an integer.
# @POST: Returns list of Task objects.
# @PARAM: limit (int) - Max tasks to load.
# @PARAM: status (Optional[TaskStatus]) - Filter by status.
# @RETURN: List[Task] - The loaded tasks.
def load_tasks(self, limit: int = 100, status: Optional[TaskStatus] = None) -> List[Task]:
with belief_scope("TaskPersistenceService.load_tasks"):
session: Session = TasksSessionLocal()
try:
query = session.query(TaskRecord)
if status:
query = query.filter(TaskRecord.status == status.value)
records = query.order_by(TaskRecord.created_at.desc()).limit(limit).all()
loaded_tasks = []
for record in records:
try:
logs = []
if record.logs:
for log_data in record.logs:
# Handle timestamp conversion if it's a string
if isinstance(log_data.get('timestamp'), str):
log_data['timestamp'] = datetime.fromisoformat(log_data['timestamp'])
logs.append(LogEntry(**log_data))
task = Task(
id=record.id,
plugin_id=record.type,
status=TaskStatus(record.status),
started_at=record.started_at,
finished_at=record.finished_at,
params=record.params or {},
result=record.result,
logs=logs
)
loaded_tasks.append(task)
except Exception as e:
logger.error(f"Failed to reconstruct task {record.id}: {e}")
return loaded_tasks
finally:
session.close()
# [/DEF:load_tasks:Function]
# [DEF:delete_tasks:Function]
# @PURPOSE: Deletes specific tasks from the database.
# @PRE: task_ids is a list of strings.
# @POST: Specified task records deleted from database.
# @PARAM: task_ids (List[str]) - List of task IDs to delete.
def delete_tasks(self, task_ids: List[str]) -> None:
if not task_ids:
return
with belief_scope("TaskPersistenceService.delete_tasks"):
session: Session = TasksSessionLocal()
try:
session.query(TaskRecord).filter(TaskRecord.id.in_(task_ids)).delete(synchronize_session=False)
session.commit()
except Exception as e:
session.rollback()
logger.error(f"Failed to delete tasks: {e}")
finally:
session.close()
# [/DEF:delete_tasks:Function]
# [/DEF:TaskPersistenceService:Class]
# [/DEF:TaskPersistenceModule:Module]
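
The datetime handling in `persist_task` and `load_tasks` amounts to a round trip through ISO 8601 strings. The sketch below lifts the nested `json_serializable` helper to module level and runs one log entry through it; the sample `log` dictionary is made up for illustration.

```python
import json
from datetime import datetime
from typing import Any


def json_serializable(obj: Any) -> Any:
    """Recursively replace datetime values with ISO 8601 strings."""
    if isinstance(obj, dict):
        return {k: json_serializable(v) for k, v in obj.items()}
    if isinstance(obj, list):
        return [json_serializable(v) for v in obj]
    if isinstance(obj, datetime):
        return obj.isoformat()
    return obj


# Hypothetical log entry with datetimes at the top level and nested in context.
log = {
    "timestamp": datetime(2026, 1, 26, 11, 8, 18),
    "level": "INFO",
    "message": "started",
    "context": {"seen": [datetime(2026, 1, 26)]},
}

# What would be written to and read back from a JSON column.
stored = json.loads(json.dumps(json_serializable(log)))

# On load, only the top-level timestamp is parsed back into a datetime.
restored_ts = datetime.fromisoformat(stored["timestamp"])
```

Note that values nested in `context` stay strings after loading, which matches `load_tasks` above: it converts only `timestamp`.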


@@ -0,0 +1,237 @@
# [DEF:backend.core.utils.dataset_mapper:Module]
#
# @SEMANTICS: dataset, mapping, postgresql, xlsx, superset
# @PURPOSE: Updates metadata (verbose_map) in Superset datasets, extracting it from PostgreSQL or XLSX files.
# @LAYER: Domain
# @RELATION: DEPENDS_ON -> backend.core.superset_client
# @RELATION: DEPENDS_ON -> pandas
# @RELATION: DEPENDS_ON -> psycopg2
# @PUBLIC_API: DatasetMapper
# [SECTION: IMPORTS]
import pandas as pd # type: ignore
import psycopg2 # type: ignore
from typing import Dict, List, Optional, Any
from ..logger import logger as app_logger, belief_scope
# [/SECTION]
# [DEF:DatasetMapper:Class]
# @PURPOSE: Class for mapping and updating verbose_map in Superset datasets.
class DatasetMapper:
# [DEF:__init__:Function]
# @PURPOSE: Initializes the mapper.
# @POST: The DatasetMapper object is initialized.
def __init__(self):
pass
# [/DEF:__init__:Function]
# [DEF:get_postgres_comments:Function]
# @PURPOSE: Fetches column comments from the PostgreSQL system catalog.
# @PRE: db_config must contain valid connection parameters (host, port, user, password, dbname).
# @PRE: table_name and table_schema must be strings.
# @POST: Returns a dictionary whose keys are column names and whose values are the comments from the database.
# @THROW: Exception - On connection errors or query execution errors.
# @PARAM: db_config (Dict) - Database connection configuration.
# @PARAM: table_name (str) - Table name.
# @PARAM: table_schema (str) - Table schema.
# @RETURN: Dict[str, str] - Dictionary of column comments.
def get_postgres_comments(self, db_config: Dict, table_name: str, table_schema: str) -> Dict[str, str]:
with belief_scope("Fetch comments from PostgreSQL"):
app_logger.info("[get_postgres_comments][Enter] Fetching comments from PostgreSQL for %s.%s.", table_schema, table_name)
# Values are bound as query parameters rather than interpolated into the SQL text.
# Literal percent signs are doubled because psycopg2 treats % as a placeholder marker.
query = """
SELECT
cols.column_name,
CASE
WHEN pg_catalog.col_description(
(SELECT c.oid
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = cols.table_name
AND n.nspname = cols.table_schema),
cols.ordinal_position::int
) LIKE '%%|%%' THEN
split_part(
pg_catalog.col_description(
(SELECT c.oid
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = cols.table_name
AND n.nspname = cols.table_schema),
cols.ordinal_position::int
),
'|',
1
)
ELSE
pg_catalog.col_description(
(SELECT c.oid
FROM pg_catalog.pg_class c
JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace
WHERE c.relname = cols.table_name
AND n.nspname = cols.table_schema),
cols.ordinal_position::int
)
END AS column_comment
FROM
information_schema.columns cols
WHERE cols.table_catalog = %s AND cols.table_name = %s AND cols.table_schema = %s;
"""
comments = {}
try:
with psycopg2.connect(**db_config) as conn, conn.cursor() as cursor:
cursor.execute(query, (db_config.get('dbname'), table_name, table_schema))
for row in cursor.fetchall():
if row[1]:
comments[row[0]] = row[1]
app_logger.info("[get_postgres_comments][Success] Fetched %d comments.", len(comments))
except Exception as e:
app_logger.error("[get_postgres_comments][Failure] %s", e, exc_info=True)
raise
return comments
# [/DEF:get_postgres_comments:Function]
# [DEF:load_excel_mappings:Function]
# @PURPOSE: Loads 'column_name' -> 'verbose_name' mappings from an XLSX file.
# @PRE: file_path must point to an existing XLSX file.
# @POST: Returns a dictionary with the mappings from the file.
# @THROW: Exception - On file read or parsing errors.
# @PARAM: file_path (str) - Path to the XLSX file.
# @RETURN: Dict[str, str] - Dictionary of mappings.
def load_excel_mappings(self, file_path: str) -> Dict[str, str]:
with belief_scope("Load mappings from Excel"):
app_logger.info("[load_excel_mappings][Enter] Loading mappings from %s.", file_path)
try:
df = pd.read_excel(file_path)
mappings = df.set_index('column_name')['verbose_name'].to_dict()
app_logger.info("[load_excel_mappings][Success] Loaded %d mappings.", len(mappings))
return mappings
except Exception as e:
app_logger.error("[load_excel_mappings][Failure] %s", e, exc_info=True)
raise
# [/DEF:load_excel_mappings:Function]
# [DEF:run_mapping:Function]
# @PURPOSE: Main entry point that runs the mapping and updates the dataset's verbose_map in Superset.
# @PRE: superset_client must be authenticated.
# @PRE: dataset_id must be an existing ID in Superset.
# @POST: If changes are found, the dataset in Superset is updated through the API.
# @RELATION: CALLS -> self.get_postgres_comments
# @RELATION: CALLS -> self.load_excel_mappings
# @RELATION: CALLS -> superset_client.get_dataset
# @RELATION: CALLS -> superset_client.update_dataset
# @PARAM: superset_client (Any) - Superset client.
# @PARAM: dataset_id (int) - ID of the dataset to update.
# @PARAM: source (str) - Data source ('postgres', 'excel', 'both').
# @PARAM: postgres_config (Optional[Dict]) - PostgreSQL connection configuration.
# @PARAM: excel_path (Optional[str]) - Path to the XLSX file.
# @PARAM: table_name (Optional[str]) - Table name in PostgreSQL.
# @PARAM: table_schema (Optional[str]) - Table schema in PostgreSQL.
def run_mapping(self, superset_client: Any, dataset_id: int, source: str, postgres_config: Optional[Dict] = None, excel_path: Optional[str] = None, table_name: Optional[str] = None, table_schema: Optional[str] = None):
with belief_scope(f"Run dataset mapping for ID {dataset_id}"):
app_logger.info("[run_mapping][Enter] Starting dataset mapping for ID %d from source '%s'.", dataset_id, source)
mappings: Dict[str, str] = {}
try:
if source in ['postgres', 'both']:
assert postgres_config and table_name and table_schema, "Postgres config is required."
mappings.update(self.get_postgres_comments(postgres_config, table_name, table_schema))
if source in ['excel', 'both']:
assert excel_path, "Excel path is required."
mappings.update(self.load_excel_mappings(excel_path))
if source not in ['postgres', 'excel', 'both']:
app_logger.error("[run_mapping][Failure] Invalid source: %s.", source)
return
dataset_response = superset_client.get_dataset(dataset_id)
dataset_data = dataset_response['result']
original_columns = dataset_data.get('columns', [])
updated_columns = []
changes_made = False
for column in original_columns:
col_name = column.get('column_name')
new_column = {
"column_name": col_name,
"id": column.get("id"),
"advanced_data_type": column.get("advanced_data_type"),
"description": column.get("description"),
"expression": column.get("expression"),
"extra": column.get("extra"),
"filterable": column.get("filterable"),
"groupby": column.get("groupby"),
"is_active": column.get("is_active"),
"is_dttm": column.get("is_dttm"),
"python_date_format": column.get("python_date_format"),
"type": column.get("type"),
"uuid": column.get("uuid"),
"verbose_name": column.get("verbose_name"),
}
new_column = {k: v for k, v in new_column.items() if v is not None}
if col_name in mappings:
mapping_value = mappings[col_name]
if isinstance(mapping_value, str) and new_column.get('verbose_name') != mapping_value:
new_column['verbose_name'] = mapping_value
changes_made = True
updated_columns.append(new_column)
updated_metrics = []
for metric in dataset_data.get("metrics", []):
new_metric = {
"id": metric.get("id"),
"metric_name": metric.get("metric_name"),
"expression": metric.get("expression"),
"verbose_name": metric.get("verbose_name"),
"description": metric.get("description"),
"d3format": metric.get("d3format"),
"currency": metric.get("currency"),
"extra": metric.get("extra"),
"warning_text": metric.get("warning_text"),
"metric_type": metric.get("metric_type"),
"uuid": metric.get("uuid"),
}
updated_metrics.append({k: v for k, v in new_metric.items() if v is not None})
if changes_made:
payload_for_update = {
"database_id": dataset_data.get("database", {}).get("id"),
"table_name": dataset_data.get("table_name"),
"schema": dataset_data.get("schema"),
"columns": updated_columns,
"owners": [owner["id"] for owner in dataset_data.get("owners", [])],
"metrics": updated_metrics,
"extra": dataset_data.get("extra"),
"description": dataset_data.get("description"),
"sql": dataset_data.get("sql"),
"cache_timeout": dataset_data.get("cache_timeout"),
"catalog": dataset_data.get("catalog"),
"default_endpoint": dataset_data.get("default_endpoint"),
"external_url": dataset_data.get("external_url"),
"fetch_values_predicate": dataset_data.get("fetch_values_predicate"),
"filter_select_enabled": dataset_data.get("filter_select_enabled"),
"is_managed_externally": dataset_data.get("is_managed_externally"),
"is_sqllab_view": dataset_data.get("is_sqllab_view"),
"main_dttm_col": dataset_data.get("main_dttm_col"),
"normalize_columns": dataset_data.get("normalize_columns"),
"offset": dataset_data.get("offset"),
"template_params": dataset_data.get("template_params"),
}
payload_for_update = {k: v for k, v in payload_for_update.items() if v is not None}
superset_client.update_dataset(dataset_id, payload_for_update)
app_logger.info("[run_mapping][Success] Dataset %d columns' verbose_name updated.", dataset_id)
else:
app_logger.info("[run_mapping][State] No changes in columns' verbose_name, skipping update.")
except Exception as e:
app_logger.error("[run_mapping][Failure] %s", e, exc_info=True)
return
# [/DEF:run_mapping:Function]
# [/DEF:DatasetMapper:Class]
# [/DEF:backend.core.utils.dataset_mapper:Module]
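
The column-update step of `run_mapping` can be read as a pure function: take the dataset's columns and a name-to-label mapping, and report whether anything changed. The sketch below is a simplified illustration under that reading. `apply_verbose_names` is a hypothetical helper, and unlike the original it copies every non-None key instead of whitelisting the fields accepted by the Superset API.

```python
from typing import Any, Dict, List, Tuple


def apply_verbose_names(
    columns: List[Dict[str, Any]], mappings: Dict[str, Any]
) -> Tuple[List[Dict[str, Any]], bool]:
    """Return updated column dicts and a flag telling whether any label changed."""
    updated: List[Dict[str, Any]] = []
    changed = False
    for column in columns:
        # Drop None values, as run_mapping does before building the payload.
        new_column = {k: v for k, v in column.items() if v is not None}
        value = mappings.get(column.get("column_name"))
        # Non-string values (for example NaN from an empty spreadsheet cell) are ignored.
        if isinstance(value, str) and new_column.get("verbose_name") != value:
            new_column["verbose_name"] = value
            changed = True
        updated.append(new_column)
    return updated, changed


columns = [
    {"column_name": "amount", "verbose_name": None, "type": "NUMERIC"},
    {"column_name": "created", "verbose_name": "Created at", "type": "DATE"},
]
mappings = {"amount": "Order amount", "created": "Created at", "unused": float("nan")}
updated, changed = apply_verbose_names(columns, mappings)
```

Only `amount` receives a new label here; `created` already matches its mapping, so a second run over `updated` reports no changes and the API update would be skipped.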


@@ -0,0 +1,488 @@
# [DEF:backend.core.utils.fileio:Module]
#
# @SEMANTICS: file, io, zip, yaml, temp, archive, utility
# @PURPOSE: Provides a set of utilities for file operations, including temporary files, ZIP archives, YAML files, and directory cleanup.
# @LAYER: Infra
# @RELATION: DEPENDS_ON -> backend.src.core.logger
# @RELATION: DEPENDS_ON -> pyyaml
# @PUBLIC_API: create_temp_file, remove_empty_directories, read_dashboard_from_disk, calculate_crc32, RetentionPolicy, archive_exports, save_and_unpack_dashboard, update_yamls, create_dashboard_export, sanitize_filename, get_filename_from_headers, consolidate_archive_folders
# [SECTION: IMPORTS]
import os
import re
import zipfile
from pathlib import Path
from typing import Any, Optional, Tuple, Dict, List, Union, LiteralString, Generator
from contextlib import contextmanager
import tempfile
from datetime import date, datetime
import shutil
import zlib
from dataclasses import dataclass
import yaml
from ..logger import logger as app_logger, belief_scope
# [/SECTION]
# [DEF:InvalidZipFormatError:Class]
# @PURPOSE: Exception raised when a file is not a valid ZIP archive.
class InvalidZipFormatError(Exception):
pass
# [/DEF:InvalidZipFormatError:Class]
# [DEF:create_temp_file:Function]
# @PURPOSE: Context manager that creates a temporary file or directory and guarantees its removal.
# @PRE: suffix must be a string that determines the resource type.
# @POST: The temporary resource is created and its path is yielded; the resource is removed on context exit.
# @PARAM: content (Optional[bytes]) - Binary content to write to the temporary file.
# @PARAM: suffix (str) - Resource suffix. If it starts with `.dir`, a directory is created.
# @PARAM: mode (str) - File write mode (e.g., 'wb').
# @PARAM: dry_run (bool) - If True, the explicit cleanup on exit is skipped, so a temporary file is left in place.
# @YIELDS: Path - Path to the temporary resource.
# @THROW: IOError - On resource creation errors.
@contextmanager
def create_temp_file(content: Optional[bytes] = None, suffix: str = ".zip", mode: str = 'wb', dry_run: bool = False) -> Generator[Path, None, None]:
with belief_scope("Create temporary resource"):
resource_path = None
is_dir = suffix.startswith('.dir')
try:
if is_dir:
with tempfile.TemporaryDirectory(suffix=suffix) as temp_dir:
resource_path = Path(temp_dir)
app_logger.debug("[create_temp_file][State] Created temporary directory: %s", resource_path)
yield resource_path
else:
fd, temp_path_str = tempfile.mkstemp(suffix=suffix)
resource_path = Path(temp_path_str)
os.close(fd)
if content:
resource_path.write_bytes(content)
app_logger.debug("[create_temp_file][State] Created temporary file: %s", resource_path)
yield resource_path
finally:
if resource_path and resource_path.exists() and not dry_run:
try:
if resource_path.is_dir():
shutil.rmtree(resource_path)
app_logger.debug("[create_temp_file][Cleanup] Removed temporary directory: %s", resource_path)
else:
resource_path.unlink()
app_logger.debug("[create_temp_file][Cleanup] Removed temporary file: %s", resource_path)
except OSError as e:
app_logger.error("[create_temp_file][Failure] Error during cleanup of %s: %s", resource_path, e)
# [/DEF:create_temp_file:Function]
# [DEF:remove_empty_directories:Function]
# @PURPOSE: Recursively removes all empty subdirectories, starting from the given path.
# @PRE: root_dir must be a path to an existing directory.
# @POST: All empty subdirectories are removed and their count is returned.
# @PARAM: root_dir (str) - Path to the root directory to clean up.
# @RETURN: int - Number of removed directories.
def remove_empty_directories(root_dir: str) -> int:
with belief_scope(f"Remove empty directories in {root_dir}"):
app_logger.info("[remove_empty_directories][Enter] Starting cleanup of empty directories in %s", root_dir)
removed_count = 0
if not os.path.isdir(root_dir):
app_logger.error("[remove_empty_directories][Failure] Directory not found: %s", root_dir)
return 0
for current_dir, _, _ in os.walk(root_dir, topdown=False):
if not os.listdir(current_dir):
try:
os.rmdir(current_dir)
removed_count += 1
app_logger.info("[remove_empty_directories][State] Removed empty directory: %s", current_dir)
except OSError as e:
app_logger.error("[remove_empty_directories][Failure] Failed to remove %s: %s", current_dir, e)
app_logger.info("[remove_empty_directories][Exit] Removed %d empty directories.", removed_count)
return removed_count
# [/DEF:remove_empty_directories:Function]
# [DEF:read_dashboard_from_disk:Function]
# @PURPOSE: Reads the binary content of a file from disk.
# @PRE: file_path must point to an existing file.
# @POST: Returns the content bytes and the file name.
# @PARAM: file_path (str) - Path to the file.
# @RETURN: Tuple[bytes, str] - Tuple of (content, file name).
# @THROW: FileNotFoundError - If the file is not found.
def read_dashboard_from_disk(file_path: str) -> Tuple[bytes, str]:
with belief_scope(f"Read dashboard from {file_path}"):
path = Path(file_path)
if not path.is_file():
raise FileNotFoundError(f"Dashboard file not found: {file_path}")
app_logger.info("[read_dashboard_from_disk][Enter] Reading file: %s", file_path)
content = path.read_bytes()
if not content:
app_logger.warning("[read_dashboard_from_disk][Warning] File is empty: %s", file_path)
return content, path.name
# [/DEF:read_dashboard_from_disk:Function]
# [DEF:calculate_crc32:Function]
# @PURPOSE: Computes the CRC32 checksum of a file.
# @PRE: file_path must be a Path object pointing to an existing file.
# @POST: Returns the CRC32 as an 8-digit hex string.
# @PARAM: file_path (Path) - Path to the file.
# @RETURN: str - 8-digit hexadecimal representation of the CRC32.
# @THROW: IOError - On file read errors.
def calculate_crc32(file_path: Path) -> str:
with belief_scope(f"Calculate CRC32 for {file_path}"):
with open(file_path, 'rb') as f:
crc32_value = zlib.crc32(f.read())
return f"{crc32_value:08x}"
# [/DEF:calculate_crc32:Function]
# [SECTION: DATA_CLASSES]
# [DEF:RetentionPolicy:DataClass]
# @PURPOSE: Defines the retention policy for archives (daily, weekly, monthly).
@dataclass
class RetentionPolicy:
daily: int = 7
weekly: int = 4
monthly: int = 12
# [/DEF:RetentionPolicy:DataClass]
# [/SECTION]
# [DEF:archive_exports:Function]
# @PURPOSE: Manages the archive of exported files by applying the retention policy and deduplication.
# @PRE: output_dir must be a path to an existing directory.
# @POST: Old or duplicate archives are removed according to the policy.
# @RELATION: CALLS -> apply_retention_policy
# @RELATION: CALLS -> calculate_crc32
# @PARAM: output_dir (str) - Directory containing the archives.
# @PARAM: policy (RetentionPolicy) - Retention policy.
# @PARAM: deduplicate (bool) - Enables removal of duplicates by CRC32.
def archive_exports(output_dir: str, policy: RetentionPolicy, deduplicate: bool = False) -> None:
with belief_scope(f"Archive exports in {output_dir}"):
output_path = Path(output_dir)
if not output_path.is_dir():
app_logger.warning("[archive_exports][Skip] Archive directory not found: %s", output_dir)
return
app_logger.info("[archive_exports][Enter] Managing archive in %s", output_dir)
# 1. Collect all zip files
zip_files = list(output_path.glob("*.zip"))
if not zip_files:
app_logger.info("[archive_exports][State] No zip files found in %s", output_dir)
return
# 2. Deduplication
if deduplicate:
app_logger.info("[archive_exports][State] Starting deduplication...")
checksums = {}
files_to_remove = []
# Sort by modification time (newest first) to keep the latest version
zip_files.sort(key=lambda f: f.stat().st_mtime, reverse=True)
for file_path in zip_files:
try:
crc = calculate_crc32(file_path)
if crc in checksums:
files_to_remove.append(file_path)
app_logger.debug("[archive_exports][State] Duplicate found: %s (same as %s)", file_path.name, checksums[crc].name)
else:
checksums[crc] = file_path
except Exception as e:
app_logger.error("[archive_exports][Failure] Failed to calculate CRC32 for %s: %s", file_path, e)
for f in files_to_remove:
try:
f.unlink()
zip_files.remove(f)
app_logger.info("[archive_exports][State] Removed duplicate: %s", f.name)
except OSError as e:
app_logger.error("[archive_exports][Failure] Failed to remove duplicate %s: %s", f, e)
# 3. Retention Policy
files_with_dates = []
for file_path in zip_files:
# Try to extract date from filename
# Pattern: ..._YYYYMMDD_HHMMSS.zip or ..._YYYYMMDD.zip
match = re.search(r'_(\d{8})(?:_|\.)', file_path.name)
file_date = None
if match:
try:
date_str = match.group(1)
file_date = datetime.strptime(date_str, "%Y%m%d").date()
except ValueError:
pass
if not file_date:
# Fallback to modification time
file_date = datetime.fromtimestamp(file_path.stat().st_mtime).date()
files_with_dates.append((file_path, file_date))
files_to_keep = apply_retention_policy(files_with_dates, policy)
for file_path, _ in files_with_dates:
if file_path not in files_to_keep:
try:
file_path.unlink()
app_logger.info("[archive_exports][State] Removed by retention policy: %s", file_path.name)
except OSError as e:
app_logger.error("[archive_exports][Failure] Failed to remove %s: %s", file_path, e)
# [/DEF:archive_exports:Function]
# [DEF:apply_retention_policy:Function]
# @PURPOSE: (Helper) Applies the retention policy to a list of files and returns those to keep.
# @PRE: files_with_dates is a list of (Path, date) tuples.
# @POST: Returns a set of files to keep.
# @PARAM: files_with_dates (List[Tuple[Path, date]]) - List of files with their dates.
# @PARAM: policy (RetentionPolicy) - Retention policy.
# @RETURN: set - Set of paths to the files that must be kept.
def apply_retention_policy(files_with_dates: List[Tuple[Path, date]], policy: RetentionPolicy) -> set:
with belief_scope("Apply retention policy"):
# Sort by date (newest first)
sorted_files = sorted(files_with_dates, key=lambda x: x[1], reverse=True)
# Lists that hold the files for each retention category
daily_files = []
weekly_files = []
monthly_files = []
today = date.today()
for file_path, file_date in sorted_files:
# Daily
if (today - file_date).days < policy.daily:
daily_files.append(file_path)
# Weekly
elif (today - file_date).days < policy.weekly * 7:
weekly_files.append(file_path)
# Monthly
elif (today - file_date).days < policy.monthly * 30:
monthly_files.append(file_path)
# Return the set of files that must be kept
files_to_keep = set()
files_to_keep.update(daily_files)
files_to_keep.update(weekly_files[:policy.weekly])
files_to_keep.update(monthly_files[:policy.monthly])
app_logger.debug("[apply_retention_policy][State] Keeping %d files according to retention policy", len(files_to_keep))
return files_to_keep
# [/DEF:apply_retention_policy:Function]
# [DEF:save_and_unpack_dashboard:Function]
# @PURPOSE: Saves the binary content of a ZIP archive to disk and optionally unpacks it.
# @PRE: zip_content must be the bytes of a valid ZIP archive.
# @POST: The ZIP file is saved and, if unpack=True, it is unpacked into output_dir.
# @PARAM: zip_content (bytes) - Content of the ZIP archive.
# @PARAM: output_dir (Union[str, Path]) - Directory to save into.
# @PARAM: unpack (bool) - Whether to unpack the archive.
# @PARAM: original_filename (Optional[str]) - Original file name to save under.
# @RETURN: Tuple[Path, Optional[Path]] - Path to the ZIP file and, if applicable, path to the directory with the unpacked content.
# @THROW: InvalidZipFormatError - On a ZIP format error.
def save_and_unpack_dashboard(zip_content: bytes, output_dir: Union[str, Path], unpack: bool = False, original_filename: Optional[str] = None) -> Tuple[Path, Optional[Path]]:
with belief_scope("Save and unpack dashboard"):
app_logger.info("[save_and_unpack_dashboard][Enter] Processing dashboard. Unpack: %s", unpack)
try:
output_path = Path(output_dir)
output_path.mkdir(parents=True, exist_ok=True)
zip_name = sanitize_filename(original_filename) if original_filename else f"dashboard_export_{datetime.now().strftime('%Y%m%d_%H%M%S')}.zip"
zip_path = output_path / zip_name
zip_path.write_bytes(zip_content)
app_logger.info("[save_and_unpack_dashboard][State] Dashboard saved to: %s", zip_path)
if unpack:
with zipfile.ZipFile(zip_path, 'r') as zip_ref:
zip_ref.extractall(output_path)
app_logger.info("[save_and_unpack_dashboard][State] Dashboard unpacked to: %s", output_path)
return zip_path, output_path
return zip_path, None
except zipfile.BadZipFile as e:
app_logger.error("[save_and_unpack_dashboard][Failure] Invalid ZIP archive: %s", e)
raise InvalidZipFormatError(f"Invalid ZIP file: {e}") from e
# [/DEF:save_and_unpack_dashboard:Function]
# [DEF:update_yamls:Function]
# @PURPOSE: Updates configurations in YAML files by replacing values or applying a regex.
# @PRE: path must be an existing directory.
# @POST: All YAML files in the directory are updated according to the given parameters.
# @RELATION: CALLS -> _update_yaml_file
# @THROW: AssertionError - If `path` does not exist or is not a directory.
# @PARAM: db_configs (Optional[List[Dict]]) - List of configurations to replace.
# @PARAM: path (str) - Path to the directory with YAML files.
# @PARAM: regexp_pattern (Optional[LiteralString]) - Pattern to search for.
# @PARAM: replace_string (Optional[LiteralString]) - Replacement string.
def update_yamls(db_configs: Optional[List[Dict[str, Any]]] = None, path: str = "dashboards", regexp_pattern: Optional[LiteralString] = None, replace_string: Optional[LiteralString] = None) -> None:
with belief_scope("Update YAML configurations"):
app_logger.info("[update_yamls][Enter] Starting YAML configuration update.")
dir_path = Path(path)
assert dir_path.is_dir(), f"Path {path} does not exist or is not a directory"
configs: List[Dict[str, Any]] = db_configs or []
for file_path in dir_path.rglob("*.yaml"):
_update_yaml_file(file_path, configs, regexp_pattern, replace_string)
# [/DEF:update_yamls:Function]
# [DEF:_update_yaml_file:Function]
# @PURPOSE: (Helper) Updates a single YAML file.
# @PRE: file_path must be a Path object pointing to an existing YAML file.
# @POST: The file is updated according to the given configurations or regular expression.
# @PARAM: file_path (Path) - Path to the file.
# @PARAM: db_configs (List[Dict]) - Configurations.
# @PARAM: regexp_pattern (Optional[str]) - Pattern.
# @PARAM: replace_string (Optional[str]) - Replacement.
def _update_yaml_file(file_path: Path, db_configs: List[Dict[str, Any]], regexp_pattern: Optional[str], replace_string: Optional[str]) -> None:
with belief_scope(f"Update YAML file: {file_path}"):
# Read the file contents
try:
with open(file_path, 'r', encoding='utf-8') as f:
content = f.read()
except Exception as e:
app_logger.error("[_update_yaml_file][Failure] Failed to read %s: %s", file_path, e)
return
# If both a pattern and a replace_string are given, apply the regex substitution
if regexp_pattern and replace_string:
try:
new_content = re.sub(regexp_pattern, replace_string, content)
if new_content != content:
with open(file_path, 'w', encoding='utf-8') as f:
f.write(new_content)
# Carry the regex result forward so the db_configs pass below does not overwrite it
content = new_content
app_logger.info("[_update_yaml_file][State] Updated %s using regex pattern", file_path)
except Exception as e:
app_logger.error("[_update_yaml_file][Failure] Error applying regex to %s: %s", file_path, e)
# If configurations are given, replace values (old/new supported)
if db_configs:
try:
# Plain-text replacement of old/new values, to preserve the file structure
modified_content = content
for cfg in db_configs:
# Expected structure: {'old': {...}, 'new': {...}}
old_cfg = cfg.get('old', {})
new_cfg = cfg.get('new', {})
for key, old_val in old_cfg.items():
if key in new_cfg:
new_val = new_cfg[key]
# Replace only exact matches of the old value in the YAML text, using the key as context
if isinstance(old_val, str):
# Look for the pattern: key: "value" or key: value
key_pattern = re.escape(key)
val_pattern = re.escape(old_val)
# Groups: 1=key+separator, 2=opening quote (optional), 3=value, 4=closing quote (optional)
pattern = rf'({key_pattern}\s*:\s*)(["\']?)({val_pattern})(["\']?)'
# [DEF:replacer:Function]
# @PURPOSE: Replacement function that preserves the quotes if they were present.
# @PRE: match must be a regular expression match object.
# @POST: Returns a string with the new value, preserving the prefix and quotes.
def replacer(match):
prefix = match.group(1)
quote_open = match.group(2)
quote_close = match.group(4)
return f"{prefix}{quote_open}{new_val}{quote_close}"
# [/DEF:replacer:Function]
modified_content = re.sub(pattern, replacer, modified_content)
app_logger.info("[_update_yaml_file][State] Replaced '%s' with '%s' for key %s in %s", old_val, new_val, key, file_path)
# Write the modified content back without parsing the YAML, preserving the original formatting
with open(file_path, 'w', encoding='utf-8') as f:
f.write(modified_content)
except Exception as e:
app_logger.error("[_update_yaml_file][Failure] Error performing raw replacement in %s: %s", file_path, e)
# [/DEF:_update_yaml_file:Function]
# [DEF:create_dashboard_export:Function]
# @PURPOSE: Creates a ZIP archive from the given source paths.
# @PRE: source_paths must contain existing paths.
# @POST: The ZIP archive is created at zip_path.
# @PARAM: zip_path (Union[str, Path]) - Path where the ZIP archive is saved.
# @PARAM: source_paths (List[Union[str, Path]]) - List of source paths to archive.
# @PARAM: exclude_extensions (Optional[List[str]]) - List of extensions to exclude.
# @RETURN: bool - `True` on success, `False` on error.
def create_dashboard_export(zip_path: Union[str, Path], source_paths: List[Union[str, Path]], exclude_extensions: Optional[List[str]] = None) -> bool:
with belief_scope(f"Create dashboard export: {zip_path}"):
app_logger.info("[create_dashboard_export][Enter] Packing dashboard: %s -> %s", source_paths, zip_path)
try:
exclude_ext = [ext.lower() for ext in exclude_extensions or []]
with zipfile.ZipFile(zip_path, 'w', zipfile.ZIP_DEFLATED) as zipf:
for src_path_str in source_paths:
src_path = Path(src_path_str)
assert src_path.exists(), f"Path not found: {src_path}"
for item in src_path.rglob('*'):
if item.is_file() and item.suffix.lower() not in exclude_ext:
arcname = item.relative_to(src_path.parent)
zipf.write(item, arcname)
app_logger.info("[create_dashboard_export][Exit] Archive created: %s", zip_path)
return True
except (IOError, zipfile.BadZipFile, AssertionError) as e:
app_logger.error("[create_dashboard_export][Failure] Error: %s", e, exc_info=True)
return False
# [/DEF:create_dashboard_export:Function]
# [DEF:sanitize_filename:Function]
# @PURPOSE: Strips characters that are not allowed in file names from a string.
# @PRE: filename must be a string.
# @POST: Returns the string with special characters replaced by underscores.
# @PARAM: filename (str) - Original file name.
# @RETURN: str - Sanitized string.
def sanitize_filename(filename: str) -> str:
with belief_scope(f"Sanitize filename: {filename}"):
return re.sub(r'[\\/*?:"<>|]', "_", filename).strip()
# [/DEF:sanitize_filename:Function]
# [DEF:get_filename_from_headers:Function]
# @PURPOSE: Extracts the file name from the 'Content-Disposition' HTTP header.
# @PRE: headers must be a dictionary of headers.
# @POST: Returns the file name, or None if the header is missing.
# @PARAM: headers (dict) - Dictionary of HTTP headers.
# @RETURN: Optional[str] - File name or `None`.
def get_filename_from_headers(headers: dict) -> Optional[str]:
with belief_scope("Get filename from headers"):
content_disposition = headers.get("Content-Disposition", "")
if match := re.search(r'filename="?([^"]+)"?', content_disposition):
return match.group(1).strip()
return None
# [/DEF:get_filename_from_headers:Function]
# [DEF:consolidate_archive_folders:Function]
# @PURPOSE: Consolidates archive directories that share a common slug in their names.
# @PRE: root_directory must be a Path object pointing to an existing directory.
# @POST: Directories with the same prefix are merged into one.
# @THROW: AssertionError - If `root_directory` is invalid.
# @PARAM: root_directory (Path) - Root directory to consolidate.
def consolidate_archive_folders(root_directory: Path) -> None:
with belief_scope(f"Consolidate archives in {root_directory}"):
assert isinstance(root_directory, Path), "root_directory must be a Path object."
assert root_directory.is_dir(), "root_directory must be an existing directory."
app_logger.info("[consolidate_archive_folders][Enter] Consolidating archives in %s", root_directory)
# Collect all directories that contain archives
archive_dirs = []
for item in root_directory.iterdir():
if item.is_dir():
# Check whether the directory contains any ZIP archives
if any(item.glob("*.zip")):
archive_dirs.append(item)
# Group by slug (the part of the name before the first '_')
slug_groups = {}
for dir_path in archive_dirs:
dir_name = dir_path.name
slug = dir_name.split('_')[0] if '_' in dir_name else dir_name
if slug not in slug_groups:
slug_groups[slug] = []
slug_groups[slug].append(dir_path)
# Consolidate each group
for slug, dirs in slug_groups.items():
if len(dirs) <= 1:
continue
# Create the target directory
target_dir = root_directory / slug
target_dir.mkdir(exist_ok=True)
app_logger.info("[consolidate_archive_folders][State] Consolidating %d directories under %s", len(dirs), target_dir)
# Move the contents
for source_dir in dirs:
if source_dir == target_dir:
continue
for item in source_dir.iterdir():
dest_item = target_dir / item.name
try:
# shutil.move handles both files and directories
shutil.move(str(item), str(dest_item))
except Exception as e:
app_logger.error("[consolidate_archive_folders][Failure] Failed to move %s to %s: %s", item, dest_item, e)
# Remove the source directory
try:
source_dir.rmdir()
app_logger.info("[consolidate_archive_folders][State] Removed source directory: %s", source_dir)
except Exception as e:
app_logger.error("[consolidate_archive_folders][Failure] Failed to remove source directory %s: %s", source_dir, e)
# [/DEF:consolidate_archive_folders:Function]
# [/DEF:backend.core.utils.fileio:Module]
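
The quote-preserving text replacement used by `_update_yaml_file` can be tried in isolation. The following is a minimal sketch, not the module's code: `replace_yaml_value` and the sample YAML text are hypothetical names and data introduced only for illustration. It reuses the same four-group pattern (key and separator, optional opening quote, old value, optional closing quote) and a replacement callback, so the new value is inserted literally and the original quoting is kept.

```python
import re


def replace_yaml_value(text: str, key: str, old_val: str, new_val: str) -> str:
    """Replace `old_val` with `new_val` where it appears as the value of `key`.

    Hypothetical helper: works on raw text, so comments and formatting survive.
    """
    # Groups: 1=key+separator, 2=opening quote (optional), 3=old value, 4=closing quote (optional)
    pattern = rf'({re.escape(key)}\s*:\s*)(["\']?)({re.escape(old_val)})(["\']?)'

    def replacer(match: re.Match) -> str:
        # A callback keeps new_val literal: backslashes in it are not treated as escapes
        return f"{match.group(1)}{match.group(2)}{new_val}{match.group(4)}"

    return re.sub(pattern, replacer, text)


# Illustrative input only; not taken from a real export
sample = 'database_name: "Prod DB"\nsqlalchemy_uri: postgresql://old-host/db\n'
updated = replace_yaml_value(sample, "database_name", "Prod DB", "Dev DB")
updated = replace_yaml_value(
    updated, "sqlalchemy_uri", "postgresql://old-host/db", "postgresql://new-host/db"
)
print(updated)
```

One caveat of this approach is that the pattern does not anchor the end of the value, so an old value that is a prefix of a longer value on the same key would also match.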


@@ -48,6 +48,6 @@ def suggest_mappings(source_databases: List[Dict], target_databases: List[Dict],
}) })
return suggestions return suggestions
# [/DEF:suggest_mappings] # [/DEF:suggest_mappings:Function]
# [/DEF:backend.src.core.utils.matching] # [/DEF:backend.src.core.utils.matching:Module]


@@ -0,0 +1,340 @@
# [DEF:backend.core.utils.network:Module]
#
# @SEMANTICS: network, http, client, api, requests, session, authentication
# @PURPOSE: Encapsulates low-level HTTP logic for talking to the Superset API, including authentication, session management, retry logic, and error handling.
# @LAYER: Infra
# @RELATION: DEPENDS_ON -> backend.src.core.logger
# @RELATION: DEPENDS_ON -> requests
# @PUBLIC_API: APIClient
# [SECTION: IMPORTS]
from typing import Optional, Dict, Any, List, Union, cast
import json
import io
from pathlib import Path
import requests
from requests.adapters import HTTPAdapter
import urllib3
from urllib3.util.retry import Retry
from ..logger import logger as app_logger, belief_scope
# [/SECTION]
# [DEF:SupersetAPIError:Class]
# @PURPOSE: Base exception for all Superset API related errors.
class SupersetAPIError(Exception):
# [DEF:__init__:Function]
# @PURPOSE: Initializes the exception with a message and context.
# @PRE: message is a string, context is a dict.
# @POST: Exception is initialized with context.
def __init__(self, message: str = "Superset API error", **context: Any):
with belief_scope("SupersetAPIError.__init__"):
self.context = context
super().__init__(f"[API_FAILURE] {message} | Context: {self.context}")
# [/DEF:__init__:Function]
# [/DEF:SupersetAPIError:Class]
# [DEF:AuthenticationError:Class]
# @PURPOSE: Exception raised when authentication fails.
class AuthenticationError(SupersetAPIError):
# [DEF:__init__:Function]
# @PURPOSE: Initializes the authentication error.
# @PRE: message is a string, context is a dict.
# @POST: AuthenticationError is initialized.
def __init__(self, message: str = "Authentication failed", **context: Any):
with belief_scope("AuthenticationError.__init__"):
super().__init__(message, type="authentication", **context)
# [/DEF:__init__:Function]
# [/DEF:AuthenticationError:Class]
# [DEF:PermissionDeniedError:Class]
# @PURPOSE: Exception raised when access is denied.
class PermissionDeniedError(AuthenticationError):
# [DEF:__init__:Function]
# @PURPOSE: Initializes the permission denied error.
# @PRE: message is a string, context is a dict.
# @POST: PermissionDeniedError is initialized.
def __init__(self, message: str = "Permission denied", **context: Any):
with belief_scope("PermissionDeniedError.__init__"):
super().__init__(message, **context)
# [/DEF:__init__:Function]
# [/DEF:PermissionDeniedError:Class]
# [DEF:DashboardNotFoundError:Class]
# @PURPOSE: Exception raised when a dashboard cannot be found.
class DashboardNotFoundError(SupersetAPIError):
# [DEF:__init__:Function]
# @PURPOSE: Initializes the not found error with resource ID.
# @PRE: resource_id is provided.
# @POST: DashboardNotFoundError is initialized.
def __init__(self, resource_id: Union[int, str], message: str = "Dashboard not found", **context: Any):
with belief_scope("DashboardNotFoundError.__init__"):
super().__init__(f"Dashboard '{resource_id}' {message}", subtype="not_found", resource_id=resource_id, **context)
# [/DEF:__init__:Function]
# [/DEF:DashboardNotFoundError:Class]
# [DEF:NetworkError:Class]
# @PURPOSE: Exception raised when a network level error occurs.
class NetworkError(Exception):
# [DEF:__init__:Function]
# @PURPOSE: Initializes the network error.
# @PRE: message is a string.
# @POST: NetworkError is initialized.
def __init__(self, message: str = "Network connection failed", **context: Any):
with belief_scope("NetworkError.__init__"):
self.context = context
super().__init__(f"[NETWORK_FAILURE] {message} | Context: {self.context}")
# [/DEF:__init__:Function]
# [/DEF:NetworkError:Class]
# [DEF:APIClient:Class]
# @PURPOSE: Encapsulates HTTP logic for working with the API, including sessions, authentication, and request handling.
class APIClient:
DEFAULT_TIMEOUT = 30
# [DEF:__init__:Function]
# @PURPOSE: Initializes the API client with configuration, a session, and a logger.
# @PARAM: config (Dict[str, Any]) - Configuration.
# @PARAM: verify_ssl (bool) - Whether to verify SSL.
# @PARAM: timeout (int) - Request timeout.
# @PRE: config must contain 'base_url' and 'auth'.
# @POST: APIClient instance is initialized with a session.
def __init__(self, config: Dict[str, Any], verify_ssl: bool = True, timeout: int = DEFAULT_TIMEOUT):
with belief_scope("__init__"):
app_logger.info("[APIClient.__init__][Entry] Initializing APIClient.")
self.base_url: str = config.get("base_url", "")
self.auth = config.get("auth")
self.request_settings = {"verify_ssl": verify_ssl, "timeout": timeout}
self.session = self._init_session()
self._tokens: Dict[str, str] = {}
self._authenticated = False
app_logger.info("[APIClient.__init__][Exit] APIClient initialized.")
# [/DEF:__init__:Function]
# [DEF:_init_session:Function]
# @PURPOSE: Creates and configures a `requests.Session` with retry logic.
# @PRE: self.request_settings must be initialized.
# @POST: Returns a configured requests.Session instance.
# @RETURN: requests.Session - Configured session.
def _init_session(self) -> requests.Session:
with belief_scope("_init_session"):
session = requests.Session()
retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
adapter = HTTPAdapter(max_retries=retries)
session.mount('http://', adapter)
session.mount('https://', adapter)
if not self.request_settings["verify_ssl"]:
urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
app_logger.warning("[_init_session][State] SSL verification disabled.")
session.verify = self.request_settings["verify_ssl"]
return session
# [/DEF:_init_session:Function]
# [DEF:authenticate:Function]
# @PURPOSE: Authenticates against the Superset API and obtains the access and CSRF tokens.
# @PRE: self.auth and self.base_url must be valid.
# @POST: `self._tokens` is populated and `self._authenticated` is set to `True`.
# @RETURN: Dict[str, str] - Dictionary with the tokens.
# @THROW: AuthenticationError, NetworkError - on failure.
def authenticate(self) -> Dict[str, str]:
with belief_scope("authenticate"):
app_logger.info("[authenticate][Enter] Authenticating to %s", self.base_url)
try:
login_url = f"{self.base_url}/security/login"
# Log the payload keys and values (masking password)
masked_auth = {k: ("******" if k == "password" else v) for k, v in self.auth.items()}
app_logger.info(f"[authenticate][Debug] Login URL: {login_url}")
app_logger.info(f"[authenticate][Debug] Auth payload: {masked_auth}")
response = self.session.post(login_url, json=self.auth, timeout=self.request_settings["timeout"])
if response.status_code != 200:
app_logger.error(f"[authenticate][Error] Status: {response.status_code}, Response: {response.text}")
response.raise_for_status()
access_token = response.json()["access_token"]
csrf_url = f"{self.base_url}/security/csrf_token/"
csrf_response = self.session.get(csrf_url, headers={"Authorization": f"Bearer {access_token}"}, timeout=self.request_settings["timeout"])
csrf_response.raise_for_status()
self._tokens = {"access_token": access_token, "csrf_token": csrf_response.json()["result"]}
self._authenticated = True
app_logger.info("[authenticate][Exit] Authenticated successfully.")
return self._tokens
except requests.exceptions.HTTPError as e:
status_code = e.response.status_code if e.response is not None else None
if status_code in [502, 503, 504]:
raise NetworkError(f"Environment unavailable during authentication (Status {status_code})", status_code=status_code) from e
raise AuthenticationError(f"Authentication failed: {e}") from e
except (requests.exceptions.RequestException, KeyError) as e:
raise NetworkError(f"Network or parsing error during authentication: {e}") from e
# [/DEF:authenticate:Function]
@property
# [DEF:headers:Function]
# @PURPOSE: Returns the HTTP headers for authenticated requests.
# @PRE: APIClient is initialized and authenticated or can be authenticated.
# @POST: Returns headers including auth tokens.
def headers(self) -> Dict[str, str]:
with belief_scope("headers"):
if not self._authenticated: self.authenticate()
return {
"Authorization": f"Bearer {self._tokens['access_token']}",
"X-CSRFToken": self._tokens.get("csrf_token", ""),
"Referer": self.base_url,
"Content-Type": "application/json"
}
# [/DEF:headers:Function]
# [DEF:request:Function]
# @PURPOSE: Performs a generic HTTP request against the API.
# @PARAM: method (str) - HTTP method.
# @PARAM: endpoint (str) - API endpoint.
# @PARAM: headers (Optional[Dict]) - Additional headers.
# @PARAM: raw_response (bool) - Whether to return the raw response.
# @PRE: method and endpoint must be strings.
# @POST: Returns response content or raw Response object.
# @RETURN: `requests.Response` if `raw_response=True`, otherwise `dict`.
# @THROW: SupersetAPIError, NetworkError, and their subclasses.
def request(self, method: str, endpoint: str, headers: Optional[Dict] = None, raw_response: bool = False, **kwargs) -> Union[requests.Response, Dict[str, Any]]:
with belief_scope("request"):
full_url = f"{self.base_url}{endpoint}"
_headers = self.headers.copy()
if headers: _headers.update(headers)
try:
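# Fall back to the configured timeout so a request cannot hang indefinitely
kwargs.setdefault("timeout", self.request_settings["timeout"])
response = self.session.request(method, full_url, headers=_headers, **kwargs)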
response.raise_for_status()
return response if raw_response else response.json()
except requests.exceptions.HTTPError as e:
self._handle_http_error(e, endpoint)
except requests.exceptions.RequestException as e:
self._handle_network_error(e, full_url)
# [/DEF:request:Function]
# [DEF:_handle_http_error:Function]
# @PURPOSE: (Helper) Converts HTTP errors into custom exceptions.
# @PARAM: e (requests.exceptions.HTTPError) - The error.
# @PARAM: endpoint (str) - The endpoint.
# @PRE: e must be a valid HTTPError with a response.
# @POST: Raises a specific SupersetAPIError or subclass.
def _handle_http_error(self, e: requests.exceptions.HTTPError, endpoint: str):
with belief_scope("_handle_http_error"):
status_code = e.response.status_code
if status_code in (502, 503, 504):
raise NetworkError(f"Environment unavailable (Status {status_code})", status_code=status_code) from e
if status_code == 404: raise DashboardNotFoundError(endpoint) from e
if status_code == 403: raise PermissionDeniedError() from e
if status_code == 401: raise AuthenticationError() from e
raise SupersetAPIError(f"API Error {status_code}: {e.response.text}") from e
# [/DEF:_handle_http_error:Function]
# [DEF:_handle_network_error:Function]
# @PURPOSE: (Helper) Converts network errors into `NetworkError`.
# @PARAM: e (requests.exceptions.RequestException) - The error.
# @PARAM: url (str) - URL.
# @PRE: e must be a RequestException.
# @POST: Raises a NetworkError.
def _handle_network_error(self, e: requests.exceptions.RequestException, url: str):
with belief_scope("_handle_network_error"):
if isinstance(e, requests.exceptions.Timeout): msg = "Request timeout"
elif isinstance(e, requests.exceptions.ConnectionError): msg = "Connection error"
else: msg = f"Unknown network error: {e}"
raise NetworkError(msg, url=url) from e
# [/DEF:_handle_network_error:Function]
# [DEF:upload_file:Function]
# @PURPOSE: Uploads a file to the server via multipart/form-data.
# @PARAM: endpoint (str) - Endpoint.
# @PARAM: file_info (Dict[str, Any]) - File information.
# @PARAM: extra_data (Optional[Dict]) - Additional data.
# @PARAM: timeout (Optional[int]) - Timeout.
# @PRE: file_info must contain 'file_obj' and 'file_name'.
# @POST: File is uploaded and response returned.
# @RETURN: API response as a dictionary.
# @THROW: SupersetAPIError, NetworkError, TypeError.
def upload_file(self, endpoint: str, file_info: Dict[str, Any], extra_data: Optional[Dict] = None, timeout: Optional[int] = None) -> Dict:
with belief_scope("upload_file"):
full_url = f"{self.base_url}{endpoint}"
_headers = self.headers.copy(); _headers.pop('Content-Type', None)
file_obj, file_name, form_field = file_info.get("file_obj"), file_info.get("file_name"), file_info.get("form_field", "file")
files_payload = {}
if isinstance(file_obj, (str, Path)):
with open(file_obj, 'rb') as f:
files_payload = {form_field: (file_name, f.read(), 'application/x-zip-compressed')}
elif isinstance(file_obj, io.BytesIO):
files_payload = {form_field: (file_name, file_obj.getvalue(), 'application/x-zip-compressed')}
else:
raise TypeError(f"Unsupported file_obj type: {type(file_obj)}")
return self._perform_upload(full_url, files_payload, extra_data, _headers, timeout)
# [/DEF:upload_file:Function]
# [DEF:_perform_upload:Function]
# @PURPOSE: (Helper) Performs the POST request with the file.
# @PARAM: url (str) - URL.
# @PARAM: files (Dict) - Files.
# @PARAM: data (Optional[Dict]) - Data.
# @PARAM: headers (Dict) - Headers.
# @PARAM: timeout (Optional[int]) - Timeout.
# @PRE: url, files, and headers must be provided.
# @POST: POST request is performed and JSON response returned.
# @RETURN: Dict - Response.
def _perform_upload(self, url: str, files: Dict, data: Optional[Dict], headers: Dict, timeout: Optional[int]) -> Dict:
with belief_scope("_perform_upload"):
try:
response = self.session.post(url, files=files, data=data or {}, headers=headers, timeout=timeout or self.request_settings["timeout"])
response.raise_for_status()
if response.status_code == 200:
try:
return response.json()
except Exception as json_e:
app_logger.debug(f"[_perform_upload][Debug] Response is not valid JSON: {response.text[:200]}...")
raise SupersetAPIError(f"API error during upload: Response is not valid JSON: {json_e}") from json_e
return response.json()
except requests.exceptions.HTTPError as e:
raise SupersetAPIError(f"API error during upload: {e.response.text}") from e
except requests.exceptions.RequestException as e:
raise NetworkError(f"Network error during upload: {e}", url=url) from e
# [/DEF:_perform_upload:Function]
# [DEF:fetch_paginated_count:Function]
# @PURPOSE: Gets the total number of items for pagination.
# @PARAM: endpoint (str) - Endpoint.
# @PARAM: query_params (Dict) - Query parameters.
# @PARAM: count_field (str) - Field holding the count.
# @PRE: query_params must be a dictionary.
# @POST: Returns total count of items.
# @RETURN: int - Count.
def fetch_paginated_count(self, endpoint: str, query_params: Dict, count_field: str = "count") -> int:
with belief_scope("fetch_paginated_count"):
response_json = cast(Dict[str, Any], self.request("GET", endpoint, params={"q": json.dumps(query_params)}))
return response_json.get(count_field, 0)
# [/DEF:fetch_paginated_count:Function]
# [DEF:fetch_paginated_data:Function]
# @PURPOSE: Automatically collects data from all pages of a paginated endpoint.
# @PARAM: endpoint (str) - Endpoint.
# @PARAM: pagination_options (Dict[str, Any]) - Pagination options.
# @PRE: pagination_options must contain 'base_query', 'total_count', 'results_field'.
# @POST: Returns all items across all pages.
# @RETURN: List[Any] - List of data.
def fetch_paginated_data(self, endpoint: str, pagination_options: Dict[str, Any]) -> List[Any]:
with belief_scope("fetch_paginated_data"):
base_query, total_count = pagination_options["base_query"], pagination_options["total_count"]
results_field, page_size = pagination_options["results_field"], base_query.get('page_size')
assert page_size and page_size > 0, "'page_size' must be a positive number."
results = []
for page in range((total_count + page_size - 1) // page_size):
query = {**base_query, 'page': page}
response_json = cast(Dict[str, Any], self.request("GET", endpoint, params={"q": json.dumps(query)}))
results.extend(response_json.get(results_field, []))
return results
# [/DEF:fetch_paginated_data:Function]
# [/DEF:APIClient:Class]
# [/DEF:backend.core.utils.network:Module]
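
The page loop in `fetch_paginated_data` relies on ceiling division, `(total_count + page_size - 1) // page_size`, to decide how many zero-based pages to request. The sketch below isolates that arithmetic without any HTTP calls. `collect_pages`, `fake_fetch`, and the in-memory `ITEMS` list are hypothetical stand-ins for the client's `request` method and a real endpoint, and the `"result"` field name is an assumption.

```python
from typing import Any, Callable, Dict, List


def collect_pages(
    fetch_page: Callable[[int], Dict[str, Any]],
    total_count: int,
    page_size: int,
    results_field: str = "result",
) -> List[Any]:
    """Gather items from every page of a zero-indexed paginated source."""
    if page_size <= 0:
        raise ValueError("'page_size' must be a positive number.")
    # Ceiling division: 7 items at 3 per page -> 3 pages (0, 1, 2)
    page_count = (total_count + page_size - 1) // page_size
    results: List[Any] = []
    for page in range(page_count):
        results.extend(fetch_page(page).get(results_field, []))
    return results


# Hypothetical in-memory "endpoint" used only for this illustration
ITEMS = list(range(7))


def fake_fetch(page: int, page_size: int = 3) -> Dict[str, Any]:
    start = page * page_size
    return {"result": ITEMS[start:start + page_size]}


all_items = collect_pages(fake_fetch, total_count=len(ITEMS), page_size=3)
print(all_items)
```

Because the page count is derived from a total fetched earlier, items added or removed between the count request and the page requests can be missed or duplicated; the sketch shares that limitation.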


@@ -1,13 +1,23 @@
# [DEF:Dependencies:Module] # [DEF:Dependencies:Module]
# @SEMANTICS: dependency, injection, singleton, factory # @SEMANTICS: dependency, injection, singleton, factory, auth, jwt
# @PURPOSE: Manages the creation and provision of shared application dependencies, such as the PluginLoader and TaskManager, to avoid circular imports. # @PURPOSE: Manages the creation and provision of shared application dependencies, such as the PluginLoader and TaskManager, to avoid circular imports.
# @LAYER: Core # @LAYER: Core
# @RELATION: Used by the main app and API routers to get access to shared instances. # @RELATION: Used by the main app and API routers to get access to shared instances.
from pathlib import Path from pathlib import Path
from typing import Optional
from fastapi import Depends, HTTPException, status
from fastapi.security import OAuth2PasswordBearer
from jose import JWTError
from .core.plugin_loader import PluginLoader from .core.plugin_loader import PluginLoader
from .core.task_manager import TaskManager from .core.task_manager import TaskManager
from .core.config_manager import ConfigManager from .core.config_manager import ConfigManager
from .core.scheduler import SchedulerService
from .core.database import init_db, get_auth_db
from .core.logger import logger, belief_scope
from .core.auth.jwt import decode_token
from .core.auth.repository import AuthRepository
from .models.auth import User
# Initialize singletons # Initialize singletons
# Use absolute path relative to this file to ensure plugins are found regardless of CWD # Use absolute path relative to this file to ensure plugins are found regardless of CWD
@@ -15,19 +25,123 @@ project_root = Path(__file__).parent.parent.parent
config_path = project_root / "config.json" config_path = project_root / "config.json"
config_manager = ConfigManager(config_path=str(config_path)) config_manager = ConfigManager(config_path=str(config_path))
# Initialize database before any other services that might use it
init_db()
# [DEF:get_config_manager:Function]
# @PURPOSE: Dependency injector for the ConfigManager.
# @PRE: Global config_manager must be initialized.
# @POST: Returns shared ConfigManager instance.
# @RETURN: ConfigManager - The shared config manager instance.
def get_config_manager() -> ConfigManager: def get_config_manager() -> ConfigManager:
"""Dependency injector for the ConfigManager.""" """Dependency injector for the ConfigManager."""
return config_manager return config_manager
# [/DEF:get_config_manager:Function]
plugin_dir = Path(__file__).parent / "plugins" plugin_dir = Path(__file__).parent / "plugins"
plugin_loader = PluginLoader(plugin_dir=str(plugin_dir))
task_manager = TaskManager(plugin_loader)
plugin_loader = PluginLoader(plugin_dir=str(plugin_dir))
logger.info(f"PluginLoader initialized with directory: {plugin_dir}")
logger.info(f"Available plugins: {[config.name for config in plugin_loader.get_all_plugin_configs()]}")
task_manager = TaskManager(plugin_loader)
logger.info("TaskManager initialized")
scheduler_service = SchedulerService(task_manager, config_manager)
logger.info("SchedulerService initialized")
# [DEF:get_plugin_loader:Function]
# @PURPOSE: Dependency injector for the PluginLoader.
# @PRE: Global plugin_loader must be initialized.
# @POST: Returns shared PluginLoader instance.
# @RETURN: PluginLoader - The shared plugin loader instance.
def get_plugin_loader() -> PluginLoader: def get_plugin_loader() -> PluginLoader:
"""Dependency injector for the PluginLoader.""" """Dependency injector for the PluginLoader."""
return plugin_loader return plugin_loader
# [/DEF:get_plugin_loader:Function]
# [DEF:get_task_manager:Function]
# @PURPOSE: Dependency injector for the TaskManager.
# @PRE: Global task_manager must be initialized.
# @POST: Returns shared TaskManager instance.
# @RETURN: TaskManager - The shared task manager instance.
def get_task_manager() -> TaskManager: def get_task_manager() -> TaskManager:
"""Dependency injector for the TaskManager.""" """Dependency injector for the TaskManager."""
return task_manager return task_manager
# [/DEF] # [/DEF:get_task_manager:Function]
# [DEF:get_scheduler_service:Function]
# @PURPOSE: Dependency injector for the SchedulerService.
# @PRE: Global scheduler_service must be initialized.
# @POST: Returns shared SchedulerService instance.
# @RETURN: SchedulerService - The shared scheduler service instance.
def get_scheduler_service() -> SchedulerService:
"""Dependency injector for the SchedulerService."""
return scheduler_service
# [/DEF:get_scheduler_service:Function]
# [DEF:oauth2_scheme:Variable]
# @PURPOSE: OAuth2 password bearer scheme for token extraction.
oauth2_scheme = OAuth2PasswordBearer(tokenUrl="/api/auth/login")
# [/DEF:oauth2_scheme:Variable]
# [DEF:get_current_user:Function]
# @PURPOSE: Dependency for retrieving the currently authenticated user from a JWT.
# @PRE: JWT token provided in Authorization header.
# @POST: Returns the User object if token is valid.
# @THROW: HTTPException 401 if token is invalid or user not found.
# @PARAM: token (str) - Extracted JWT token.
# @PARAM: db (Session) - Auth database session.
# @RETURN: User - The authenticated user.
def get_current_user(token: str = Depends(oauth2_scheme), db = Depends(get_auth_db)):
credentials_exception = HTTPException(
status_code=status.HTTP_401_UNAUTHORIZED,
detail="Could not validate credentials",
headers={"WWW-Authenticate": "Bearer"},
)
try:
payload = decode_token(token)
username: str = payload.get("sub")
if username is None:
raise credentials_exception
except JWTError:
raise credentials_exception
repo = AuthRepository(db)
user = repo.get_user_by_username(username)
if user is None:
raise credentials_exception
return user
# [/DEF:get_current_user:Function]
# [DEF:has_permission:Function]
# @PURPOSE: Dependency for checking if the current user has a specific permission.
# @PRE: User is authenticated.
# @POST: Returns the authenticated User if the permission is granted.
# @THROW: HTTPException 403 if permission is denied.
# @PARAM: resource (str) - The resource identifier.
# @PARAM: action (str) - The action identifier (READ, EXECUTE, WRITE).
# @RETURN: User - The authenticated user if permission granted.
def has_permission(resource: str, action: str):
    def permission_checker(current_user: User = Depends(get_current_user)):
        # Union of all permissions across all roles
        for role in current_user.roles:
            for perm in role.permissions:
                if perm.resource == resource and perm.action == action:
                    return current_user
        # Special case for Admin role (full access)
        if any(role.name == "Admin" for role in current_user.roles):
            return current_user
        from .core.auth.logger import log_security_event
        log_security_event("PERMISSION_DENIED", current_user.username, {"resource": resource, "action": action})
        raise HTTPException(
            status_code=status.HTTP_403_FORBIDDEN,
            detail=f"Permission denied for {resource}:{action}"
        )
    return permission_checker
# [/DEF:has_permission:Function]
# [/DEF:Dependencies:Module]
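
The permission check in `has_permission` can be read in isolation from FastAPI. The following is a minimal sketch of the same decision rule, assuming plain dataclasses in place of the SQLAlchemy models; `is_allowed` and the sample users are hypothetical names introduced only for illustration and are not part of the commit.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Permission:
    resource: str
    action: str


@dataclass
class Role:
    name: str
    permissions: List[Permission] = field(default_factory=list)


@dataclass
class User:
    username: str
    roles: List[Role] = field(default_factory=list)


def is_allowed(user: User, resource: str, action: str) -> bool:
    """Hypothetical helper mirroring permission_checker's decision rule."""
    # Union of all permissions across all roles.
    for role in user.roles:
        for perm in role.permissions:
            if perm.resource == resource and perm.action == action:
                return True
    # The Admin role grants full access regardless of explicit permissions.
    return any(role.name == "Admin" for role in user.roles)


operator = User("operator", [Role("Operator", [Permission("plugin:backup", "EXECUTE")])])
admin = User("admin", [Role("Admin")])
```

In the real dependency the positive branch returns the `User` object and the negative branch logs a `PERMISSION_DENIED` event before raising HTTP 403; the sketch only reproduces the boolean decision.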

backend/src/models/auth.py Normal file

@@ -0,0 +1,105 @@
# [DEF:backend.src.models.auth:Module]
#
# @TIER: STANDARD
# @SEMANTICS: auth, models, user, role, permission, sqlalchemy
# @PURPOSE: SQLAlchemy models for multi-user authentication and authorization.
# @LAYER: Domain
# @RELATION: INHERITS_FROM -> backend.src.models.mapping.Base
#
# @INVARIANT: Usernames and emails must be unique.
# [SECTION: IMPORTS]
import uuid
from datetime import datetime
from sqlalchemy import Column, String, Boolean, DateTime, ForeignKey, Table, Enum
from sqlalchemy.orm import relationship
from .mapping import Base
# [/SECTION]
# [DEF:generate_uuid:Function]
# @PURPOSE: Generates a unique UUID string.
# @POST: Returns a string representation of a new UUID.
def generate_uuid():
    return str(uuid.uuid4())
# [/DEF:generate_uuid:Function]
# [DEF:user_roles:Table]
# @PURPOSE: Association table for many-to-many relationship between Users and Roles.
user_roles = Table(
    "user_roles",
    Base.metadata,
    Column("user_id", String, ForeignKey("users.id"), primary_key=True),
    Column("role_id", String, ForeignKey("roles.id"), primary_key=True),
)
# [/DEF:user_roles:Table]
# [DEF:role_permissions:Table]
# @PURPOSE: Association table for many-to-many relationship between Roles and Permissions.
role_permissions = Table(
    "role_permissions",
    Base.metadata,
    Column("role_id", String, ForeignKey("roles.id"), primary_key=True),
    Column("permission_id", String, ForeignKey("permissions.id"), primary_key=True),
)
# [/DEF:role_permissions:Table]
# [DEF:User:Class]
# @PURPOSE: Represents an identity that can authenticate to the system.
# @RELATION: HAS_MANY -> Role (via user_roles)
class User(Base):
    __tablename__ = "users"
    id = Column(String, primary_key=True, default=generate_uuid)
    username = Column(String, unique=True, index=True, nullable=False)
    email = Column(String, unique=True, index=True, nullable=True)
    password_hash = Column(String, nullable=True)
    auth_source = Column(String, default="LOCAL")  # LOCAL or ADFS
    is_active = Column(Boolean, default=True)
    created_at = Column(DateTime, default=datetime.utcnow)
    last_login = Column(DateTime, nullable=True)
    roles = relationship("Role", secondary=user_roles, back_populates="users")
# [/DEF:User:Class]
# [DEF:Role:Class]
# @PURPOSE: Represents a collection of permissions.
# @RELATION: HAS_MANY -> User (via user_roles)
# @RELATION: HAS_MANY -> Permission (via role_permissions)
class Role(Base):
    __tablename__ = "roles"
    id = Column(String, primary_key=True, default=generate_uuid)
    name = Column(String, unique=True, index=True, nullable=False)
    description = Column(String, nullable=True)
    users = relationship("User", secondary=user_roles, back_populates="roles")
    permissions = relationship("Permission", secondary=role_permissions, back_populates="roles")
# [/DEF:Role:Class]
# [DEF:Permission:Class]
# @PURPOSE: Represents a specific capability within the system.
# @RELATION: HAS_MANY -> Role (via role_permissions)
class Permission(Base):
    __tablename__ = "permissions"
    id = Column(String, primary_key=True, default=generate_uuid)
    resource = Column(String, nullable=False)  # e.g. "plugin:backup"
    action = Column(String, nullable=False)  # e.g. "READ", "EXECUTE", "WRITE"
    roles = relationship("Role", secondary=role_permissions, back_populates="permissions")
# [/DEF:Permission:Class]
# [DEF:ADGroupMapping:Class]
# @PURPOSE: Maps an Active Directory group to a local System Role.
# @RELATION: DEPENDS_ON -> Role
class ADGroupMapping(Base):
    __tablename__ = "ad_group_mappings"
    id = Column(String, primary_key=True, default=generate_uuid)
    ad_group = Column(String, unique=True, index=True, nullable=False)
    role_id = Column(String, ForeignKey("roles.id"), nullable=False)
    role = relationship("Role")
# [/DEF:ADGroupMapping:Class]
# [/DEF:backend.src.models.auth:Module]
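
`ADGroupMapping` stores one role per Active Directory group, with `ad_group` unique. How the mappings are applied at login is not shown in this diff, so the following is only a sketch under that assumption: the function `resolve_roles`, the group names, and the role ids are all hypothetical.

```python
from typing import Dict, Iterable, List


def resolve_roles(ad_groups: Iterable[str], mappings: Dict[str, str]) -> List[str]:
    """Return distinct role ids for the given AD groups, in first-seen order.

    `mappings` plays the part of the ad_group_mappings table: because
    ad_group is unique there, a dict keyed by group name is a faithful stand-in.
    """
    resolved: List[str] = []
    for group in ad_groups:
        role_id = mappings.get(group)
        # Unmapped groups are ignored; duplicate roles are collapsed.
        if role_id is not None and role_id not in resolved:
            resolved.append(role_id)
    return resolved


mappings = {
    "BI-Admins": "role-admin",
    "BI-Users": "role-viewer",
    "BI-Analysts": "role-viewer",
}
roles = resolve_roles(["BI-Users", "Unmapped-Group", "BI-Analysts", "BI-Admins"], mappings)
```

Two groups may point at the same role, which is why the result is de-duplicated before it would be assigned to `User.roles`.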


@@ -0,0 +1,34 @@
# [DEF:backend.src.models.connection:Module]
#
# @SEMANTICS: database, connection, configuration, sqlalchemy, sqlite
# @PURPOSE: Defines the database schema for external database connection configurations.
# @LAYER: Domain
# @RELATION: DEPENDS_ON -> sqlalchemy
#
# @INVARIANT: All primary keys are UUID strings.
# [SECTION: IMPORTS]
from sqlalchemy import Column, String, Integer, DateTime
from sqlalchemy.sql import func
from .mapping import Base
import uuid
# [/SECTION]
# [DEF:ConnectionConfig:Class]
# @PURPOSE: Stores credentials for external databases used for column mapping.
class ConnectionConfig(Base):
    __tablename__ = "connection_configs"
    id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
    name = Column(String, nullable=False)
    type = Column(String, nullable=False)  # e.g., "postgres"
    host = Column(String, nullable=True)
    port = Column(Integer, nullable=True)
    database = Column(String, nullable=True)
    username = Column(String, nullable=True)
    password = Column(String, nullable=True)  # Encrypted/Obfuscated password
    created_at = Column(DateTime(timezone=True), server_default=func.now())
    updated_at = Column(DateTime(timezone=True), server_default=func.now(), onupdate=func.now())
# [/DEF:ConnectionConfig:Class]
# [/DEF:backend.src.models.connection:Module]


@@ -0,0 +1,31 @@
# [DEF:backend.src.models.dashboard:Module]
# @TIER: STANDARD
# @SEMANTICS: dashboard, model, metadata, migration
# @PURPOSE: Defines data models for dashboard metadata and selection.
# @LAYER: Model
# @RELATION: USED_BY -> backend.src.api.routes.migration
from pydantic import BaseModel
from typing import List
# [DEF:DashboardMetadata:Class]
# @TIER: TRIVIAL
# @PURPOSE: Represents a dashboard available for migration.
class DashboardMetadata(BaseModel):
    id: int
    title: str
    last_modified: str
    status: str
# [/DEF:DashboardMetadata:Class]
# [DEF:DashboardSelection:Class]
# @TIER: TRIVIAL
# @PURPOSE: Represents the user's selection of dashboards to migrate.
class DashboardSelection(BaseModel):
    selected_ids: List[int]
    source_env_id: str
    target_env_id: str
    replace_db_config: bool = False
# [/DEF:DashboardSelection:Class]
# [/DEF:backend.src.models.dashboard:Module]

backend/src/models/git.py Normal file

@@ -0,0 +1,73 @@
# [DEF:GitModels:Module]
# @SEMANTICS: git, models, sqlalchemy, database, schema
# @PURPOSE: Git-specific SQLAlchemy models for configuration and repository tracking.
# @LAYER: Model
# @RELATION: specs/011-git-integration-dashboard/data-model.md
import enum
from datetime import datetime
from sqlalchemy import Column, String, Integer, DateTime, Enum, ForeignKey, Boolean
from sqlalchemy.dialects.postgresql import UUID
import uuid
from src.core.database import Base
class GitProvider(str, enum.Enum):
    GITHUB = "GITHUB"
    GITLAB = "GITLAB"
    GITEA = "GITEA"
class GitStatus(str, enum.Enum):
    CONNECTED = "CONNECTED"
    FAILED = "FAILED"
    UNKNOWN = "UNKNOWN"
class SyncStatus(str, enum.Enum):
    CLEAN = "CLEAN"
    DIRTY = "DIRTY"
    CONFLICT = "CONFLICT"
class GitServerConfig(Base):
    """
    [DEF:GitServerConfig:Class]
    Configuration for a Git server connection.
    """
    __tablename__ = "git_server_configs"
    id = Column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    name = Column(String(255), nullable=False)
    provider = Column(Enum(GitProvider), nullable=False)
    url = Column(String(255), nullable=False)
    pat = Column(String(255), nullable=False)  # Personal access token
    default_repository = Column(String(255), nullable=True)
    status = Column(Enum(GitStatus), default=GitStatus.UNKNOWN)
    last_validated = Column(DateTime, default=datetime.utcnow)
class GitRepository(Base):
    """
    [DEF:GitRepository:Class]
    Tracking for a local Git repository linked to a dashboard.
    """
    __tablename__ = "git_repositories"
    id = Column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    dashboard_id = Column(Integer, nullable=False, unique=True)
    config_id = Column(String(36), ForeignKey("git_server_configs.id"), nullable=False)
    remote_url = Column(String(255), nullable=False)
    local_path = Column(String(255), nullable=False)
    current_branch = Column(String(255), default="main")
    sync_status = Column(Enum(SyncStatus), default=SyncStatus.CLEAN)
class DeploymentEnvironment(Base):
    """
    [DEF:DeploymentEnvironment:Class]
    Target Superset environments for dashboard deployment.
    """
    __tablename__ = "deployment_environments"
    id = Column(String(36), primary_key=True, default=lambda: str(uuid.uuid4()))
    name = Column(String(255), nullable=False)
    superset_url = Column(String(255), nullable=False)
    superset_token = Column(String(255), nullable=False)
    is_active = Column(Boolean, default=True)
# [/DEF:GitModels:Module]

backend/src/models/llm.py Normal file

@@ -0,0 +1,46 @@
# [DEF:backend.src.models.llm:Module]
# @TIER: STANDARD
# @SEMANTICS: llm, models, sqlalchemy, persistence
# @PURPOSE: SQLAlchemy models for LLM provider configuration and validation results.
# @LAYER: Domain
# @RELATION: INHERITS_FROM -> backend.src.models.mapping.Base
from sqlalchemy import Column, String, Boolean, DateTime, JSON, Enum, Text
from datetime import datetime
import uuid
from .mapping import Base
def generate_uuid():
    return str(uuid.uuid4())
# [DEF:LLMProvider:Class]
# @PURPOSE: SQLAlchemy model for LLM provider configuration.
class LLMProvider(Base):
    __tablename__ = "llm_providers"
    id = Column(String, primary_key=True, default=generate_uuid)
    provider_type = Column(String, nullable=False)  # openai, openrouter, kilo
    name = Column(String, nullable=False)
    base_url = Column(String, nullable=False)
    api_key = Column(String, nullable=False)  # Should be encrypted
    default_model = Column(String, nullable=False)
    is_active = Column(Boolean, default=True)
    created_at = Column(DateTime, default=datetime.utcnow)
# [/DEF:LLMProvider:Class]
# [DEF:ValidationRecord:Class]
# @PURPOSE: SQLAlchemy model for dashboard validation history.
class ValidationRecord(Base):
    __tablename__ = "llm_validation_results"
    id = Column(String, primary_key=True, default=generate_uuid)
    dashboard_id = Column(String, nullable=False, index=True)
    timestamp = Column(DateTime, default=datetime.utcnow)
    status = Column(String, nullable=False)  # PASS, WARN, FAIL
    screenshot_path = Column(String, nullable=True)
    issues = Column(JSON, nullable=False)
    summary = Column(Text, nullable=False)
    raw_response = Column(Text, nullable=True)
# [/DEF:ValidationRecord:Class]
# [/DEF:backend.src.models.llm:Module]


@@ -1,5 +1,6 @@
 # [DEF:backend.src.models.mapping:Module]
 #
+# @TIER: STANDARD
 # @SEMANTICS: database, mapping, environment, migration, sqlalchemy, sqlite
 # @PURPOSE: Defines the database schema for environment metadata and database mappings using SQLAlchemy.
 # @LAYER: Domain
@@ -19,6 +20,7 @@ import enum
 Base = declarative_base()
 # [DEF:MigrationStatus:Class]
+# @TIER: TRIVIAL
 # @PURPOSE: Enumeration of possible migration job statuses.
 class MigrationStatus(enum.Enum):
     PENDING = "PENDING"
@@ -26,9 +28,10 @@ class MigrationStatus(enum.Enum):
     COMPLETED = "COMPLETED"
     FAILED = "FAILED"
     AWAITING_MAPPING = "AWAITING_MAPPING"
-# [/DEF:MigrationStatus]
+# [/DEF:MigrationStatus:Class]
 # [DEF:Environment:Class]
+# @TIER: STANDARD
 # @PURPOSE: Represents a Superset instance environment.
 class Environment(Base):
     __tablename__ = "environments"
@@ -37,9 +40,10 @@ class Environment(Base):
     name = Column(String, nullable=False)
     url = Column(String, nullable=False)
     credentials_id = Column(String, nullable=False)
-# [/DEF:Environment]
+# [/DEF:Environment:Class]
 # [DEF:DatabaseMapping:Class]
+# @TIER: STANDARD
 # @PURPOSE: Represents a mapping between source and target databases.
 class DatabaseMapping(Base):
     __tablename__ = "database_mappings"
@@ -52,7 +56,7 @@ class DatabaseMapping(Base):
     source_db_name = Column(String, nullable=False)
     target_db_name = Column(String, nullable=False)
     engine = Column(String, nullable=True)
-# [/DEF:DatabaseMapping]
+# [/DEF:DatabaseMapping:Class]
 # [DEF:MigrationJob:Class]
 # @PURPOSE: Represents a single migration execution job.
@@ -65,6 +69,6 @@ class MigrationJob(Base):
     status = Column(SQLEnum(MigrationStatus), default=MigrationStatus.PENDING)
     replace_db = Column(Boolean, default=False)
     created_at = Column(DateTime(timezone=True), server_default=func.now())
-# [/DEF:MigrationJob]
-# [/DEF:backend.src.models.mapping]
+# [/DEF:MigrationJob:Class]
+# [/DEF:backend.src.models.mapping:Module]


@@ -0,0 +1,31 @@
from datetime import datetime
from enum import Enum
from typing import Optional
from pydantic import BaseModel, Field
# [DEF:FileCategory:Class]
# @PURPOSE: Enumeration of supported file categories in the storage system.
class FileCategory(str, Enum):
    BACKUP = "backups"
    REPOSITORY = "repositorys"
# [/DEF:FileCategory:Class]
# [DEF:StorageConfig:Class]
# @PURPOSE: Configuration model for the storage system, defining paths and naming patterns.
class StorageConfig(BaseModel):
    root_path: str = Field(default="backups", description="Absolute path to the storage root directory.")
    backup_structure_pattern: str = Field(default="{category}/", description="Pattern for backup directory structure.")
    repo_structure_pattern: str = Field(default="{category}/", description="Pattern for repository directory structure.")
    filename_pattern: str = Field(default="{name}_{timestamp}", description="Pattern for filenames.")
# [/DEF:StorageConfig:Class]
# [DEF:StoredFile:Class]
# @PURPOSE: Data model representing metadata for a file stored in the system.
class StoredFile(BaseModel):
    name: str = Field(..., description="Name of the file (including extension).")
    path: str = Field(..., description="Relative path from storage root.")
    size: int = Field(..., ge=0, description="Size of the file in bytes.")
    created_at: datetime = Field(..., description="Creation timestamp.")
    category: FileCategory = Field(..., description="Category of the file.")
    mime_type: Optional[str] = Field(None, description="MIME type of the file.")
# [/DEF:StoredFile:Class]
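
The default patterns in `StorageConfig` (`{category}/` and `{name}_{timestamp}`) read like `str.format` templates. The code that expands them is not part of this diff, so the following is a sketch under that assumption; `build_relative_path` and the timestamp format `%Y%m%dT%H%M%S` are hypothetical choices made for illustration.

```python
from datetime import datetime
from pathlib import PurePosixPath


def build_relative_path(
    category: str,
    name: str,
    when: datetime,
    extension: str,
    structure_pattern: str = "{category}/",
    filename_pattern: str = "{name}_{timestamp}",
) -> str:
    """Expand the two patterns into a path relative to the storage root."""
    directory = structure_pattern.format(category=category)
    # The timestamp layout is an assumption; the source does not specify one.
    filename = filename_pattern.format(name=name, timestamp=when.strftime("%Y%m%dT%H%M%S"))
    return str(PurePosixPath(directory) / f"{filename}.{extension}")


path = build_relative_path("backups", "sales_dashboard", datetime(2026, 1, 26, 11, 8, 18), "zip")
```

The result has the shape that `StoredFile.path` describes: a path relative to `root_path`, with the category as the first directory component.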


@@ -0,0 +1,35 @@
# [DEF:backend.src.models.task:Module]
#
# @SEMANTICS: database, task, record, sqlalchemy, sqlite
# @PURPOSE: Defines the database schema for task execution records.
# @LAYER: Domain
# @RELATION: DEPENDS_ON -> sqlalchemy
#
# @INVARIANT: All primary keys are UUID strings.
# [SECTION: IMPORTS]
from sqlalchemy import Column, String, DateTime, JSON, ForeignKey
from sqlalchemy.sql import func
from .mapping import Base
import uuid
# [/SECTION]
# [DEF:TaskRecord:Class]
# @PURPOSE: Represents a persistent record of a task execution.
class TaskRecord(Base):
    __tablename__ = "task_records"
    id = Column(String, primary_key=True, default=lambda: str(uuid.uuid4()))
    type = Column(String, nullable=False)  # e.g., "backup", "migration"
    status = Column(String, nullable=False)  # Enum: "PENDING", "RUNNING", "SUCCESS", "FAILED"
    environment_id = Column(String, ForeignKey("environments.id"), nullable=True)
    started_at = Column(DateTime(timezone=True), nullable=True)
    finished_at = Column(DateTime(timezone=True), nullable=True)
    logs = Column(JSON, nullable=True)  # Store structured logs as JSON
    error = Column(String, nullable=True)
    result = Column(JSON, nullable=True)
    created_at = Column(DateTime(timezone=True), server_default=func.now())
    params = Column(JSON, nullable=True)
# [/DEF:TaskRecord:Class]
# [/DEF:backend.src.models.task:Module]


@@ -11,10 +11,10 @@ from pathlib import Path
from requests.exceptions import RequestException from requests.exceptions import RequestException
from ..core.plugin_base import PluginBase from ..core.plugin_base import PluginBase
from superset_tool.client import SupersetClient from ..core.logger import belief_scope
from superset_tool.exceptions import SupersetAPIError from ..core.superset_client import SupersetClient
from superset_tool.utils.logger import SupersetLogger from ..core.utils.network import SupersetAPIError
from superset_tool.utils.fileio import ( from ..core.utils.fileio import (
save_and_unpack_dashboard, save_and_unpack_dashboard,
archive_exports, archive_exports,
sanitize_filename, sanitize_filename,
@@ -22,34 +22,78 @@ from superset_tool.utils.fileio import (
remove_empty_directories, remove_empty_directories,
RetentionPolicy RetentionPolicy
) )
from superset_tool.utils.init_clients import setup_clients
from ..dependencies import get_config_manager from ..dependencies import get_config_manager
# [DEF:BackupPlugin:Class]
# @PURPOSE: Implementation of the backup plugin logic.
class BackupPlugin(PluginBase): class BackupPlugin(PluginBase):
""" """
A plugin to back up Superset dashboards. A plugin to back up Superset dashboards.
""" """
@property @property
# [DEF:id:Function]
# @PURPOSE: Returns the unique identifier for the backup plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string ID.
# @RETURN: str - "superset-backup"
def id(self) -> str: def id(self) -> str:
return "superset-backup" with belief_scope("id"):
return "superset-backup"
# [/DEF:id:Function]
@property @property
# [DEF:name:Function]
# @PURPOSE: Returns the human-readable name of the backup plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string name.
# @RETURN: str - Plugin name.
def name(self) -> str: def name(self) -> str:
return "Superset Dashboard Backup" with belief_scope("name"):
return "Superset Dashboard Backup"
# [/DEF:name:Function]
@property @property
# [DEF:description:Function]
# @PURPOSE: Returns a description of the backup plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string description.
# @RETURN: str - Plugin description.
def description(self) -> str: def description(self) -> str:
return "Backs up all dashboards from a Superset instance." with belief_scope("description"):
return "Backs up all dashboards from a Superset instance."
# [/DEF:description:Function]
@property @property
# [DEF:version:Function]
# @PURPOSE: Returns the version of the backup plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string version.
# @RETURN: str - "1.0.0"
def version(self) -> str: def version(self) -> str:
return "1.0.0" with belief_scope("version"):
return "1.0.0"
# [/DEF:version:Function]
@property
# [DEF:ui_route:Function]
# @PURPOSE: Returns the frontend route for the backup plugin.
# @RETURN: str - "/tools/backups"
def ui_route(self) -> str:
with belief_scope("ui_route"):
return "/tools/backups"
# [/DEF:ui_route:Function]
# [DEF:get_schema:Function]
# @PURPOSE: Returns the JSON schema for backup plugin parameters.
# @PRE: Plugin instance exists.
# @POST: Returns dictionary schema.
# @RETURN: Dict[str, Any] - JSON schema.
def get_schema(self) -> Dict[str, Any]: def get_schema(self) -> Dict[str, Any]:
config_manager = get_config_manager() with belief_scope("get_schema"):
envs = [e.name for e in config_manager.get_environments()] config_manager = get_config_manager()
default_path = config_manager.get_config().settings.backup_path envs = [e.name for e in config_manager.get_environments()]
default_path = config_manager.get_config().settings.storage.root_path
return { return {
"type": "object", "type": "object",
@@ -60,74 +104,90 @@ class BackupPlugin(PluginBase):
"description": "The Superset environment to back up.", "description": "The Superset environment to back up.",
"enum": envs if envs else [], "enum": envs if envs else [],
}, },
"backup_path": {
"type": "string",
"title": "Backup Path",
"description": "The root directory to save backups to.",
"default": default_path
}
}, },
"required": ["env", "backup_path"], "required": ["env"],
} }
# [/DEF:get_schema:Function]
# [DEF:execute:Function]
# @PURPOSE: Executes the dashboard backup logic.
# @PARAM: params (Dict[str, Any]) - Backup parameters (env or environment_id).
# @PRE: Target environment must be configured. params must be a dictionary.
# @POST: All dashboards are exported and archived.
async def execute(self, params: Dict[str, Any]): async def execute(self, params: Dict[str, Any]):
env = params["env"] with belief_scope("execute"):
backup_path = Path(params["backup_path"])
logger = SupersetLogger(log_dir=backup_path / "Logs", console=True)
logger.info(f"[BackupPlugin][Entry] Starting backup for {env}.")
try:
config_manager = get_config_manager() config_manager = get_config_manager()
if not config_manager.has_environments(): env_id = params.get("environment_id")
raise ValueError("No Superset environments configured. Please add an environment in Settings.")
# Resolve environment name if environment_id is provided
if env_id:
env_config = next((e for e in config_manager.get_environments() if e.id == env_id), None)
if env_config:
params["env"] = env_config.name
env = params.get("env")
if not env:
raise KeyError("env")
storage_settings = config_manager.get_config().settings.storage
# Use 'backups' subfolder within the storage root
backup_path = Path(storage_settings.root_path) / "backups"
from ..core.logger import logger as app_logger
app_logger.info(f"[BackupPlugin][Entry] Starting backup for {env}.")
try:
config_manager = get_config_manager()
if not config_manager.has_environments():
raise ValueError("No Superset environments configured. Please add an environment in Settings.")
env_config = config_manager.get_environment(env)
if not env_config:
raise ValueError(f"Environment '{env}' not found in configuration.")
clients = setup_clients(logger, custom_envs=config_manager.get_environments()) client = SupersetClient(env_config)
client = clients.get(env)
dashboard_count, dashboard_meta = client.get_dashboards()
if not client: app_logger.info(f"[BackupPlugin][Progress] Found {dashboard_count} dashboards to export in {env}.")
raise ValueError(f"Environment '{env}' not found in configuration.")
dashboard_count, dashboard_meta = client.get_dashboards()
logger.info(f"[BackupPlugin][Progress] Found {dashboard_count} dashboards to export in {env}.")
if dashboard_count == 0: if dashboard_count == 0:
logger.info("[BackupPlugin][Exit] No dashboards to back up.") app_logger.info("[BackupPlugin][Exit] No dashboards to back up.")
return return
for db in dashboard_meta: for db in dashboard_meta:
dashboard_id = db.get('id') dashboard_id = db.get('id')
dashboard_title = db.get('dashboard_title', 'Unknown Dashboard') dashboard_title = db.get('dashboard_title', 'Unknown Dashboard')
if not dashboard_id: if not dashboard_id:
continue continue
try: try:
dashboard_base_dir_name = sanitize_filename(f"{dashboard_title}") dashboard_base_dir_name = sanitize_filename(f"{dashboard_title}")
dashboard_dir = backup_path / env.upper() / dashboard_base_dir_name dashboard_dir = backup_path / env.upper() / dashboard_base_dir_name
dashboard_dir.mkdir(parents=True, exist_ok=True) dashboard_dir.mkdir(parents=True, exist_ok=True)
zip_content, filename = client.export_dashboard(dashboard_id) zip_content, filename = client.export_dashboard(dashboard_id)
save_and_unpack_dashboard( save_and_unpack_dashboard(
zip_content=zip_content, zip_content=zip_content,
original_filename=filename, original_filename=filename,
output_dir=dashboard_dir, output_dir=dashboard_dir,
unpack=False, unpack=False
logger=logger )
)
archive_exports(str(dashboard_dir), policy=RetentionPolicy(), logger=logger) archive_exports(str(dashboard_dir), policy=RetentionPolicy())
except (SupersetAPIError, RequestException, IOError, OSError) as db_error: except (SupersetAPIError, RequestException, IOError, OSError) as db_error:
logger.error(f"[BackupPlugin][Failure] Failed to export dashboard {dashboard_title} (ID: {dashboard_id}): {db_error}", exc_info=True) app_logger.error(f"[BackupPlugin][Failure] Failed to export dashboard {dashboard_title} (ID: {dashboard_id}): {db_error}", exc_info=True)
continue continue
consolidate_archive_folders(backup_path / env.upper(), logger=logger) consolidate_archive_folders(backup_path / env.upper())
remove_empty_directories(str(backup_path / env.upper()), logger=logger) remove_empty_directories(str(backup_path / env.upper()))
logger.info(f"[BackupPlugin][CoherenceCheck:Passed] Backup logic completed for {env}.") app_logger.info(f"[BackupPlugin][CoherenceCheck:Passed] Backup logic completed for {env}.")
except (RequestException, IOError, KeyError) as e: except (RequestException, IOError, KeyError) as e:
logger.critical(f"[BackupPlugin][Failure] Fatal error during backup for {env}: {e}", exc_info=True) app_logger.critical(f"[BackupPlugin][Failure] Fatal error during backup for {env}: {e}", exc_info=True)
raise e raise e
# [/DEF:BackupPlugin] # [/DEF:execute:Function]
# [/DEF:BackupPlugin:Class]
# [/DEF:BackupPlugin:Module]
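
After this change, `execute` no longer takes a `backup_path` parameter: it resolves the environment name (preferring `environment_id` when it matches a configured environment) and writes under `<storage root>/backups/<ENV>/<dashboard title>`. The following is a minimal sketch of just that path logic. `sanitize` is a hypothetical stand-in for `sanitize_filename`, whose real rules are not shown in this diff, and `env_names_by_id` replaces the config manager lookup.

```python
import re
from pathlib import PurePosixPath
from typing import Dict


def sanitize(title: str) -> str:
    """Hypothetical stand-in for sanitize_filename: mask characters unsafe in file names."""
    return re.sub(r'[\\/*?:"<>|]', "_", title).strip()


def resolve_env(params: Dict[str, str], env_names_by_id: Dict[str, str]) -> str:
    """Prefer environment_id when it matches a known environment, else fall back to env."""
    env_id = params.get("environment_id")
    if env_id and env_id in env_names_by_id:
        return env_names_by_id[env_id]
    env = params.get("env")
    if not env:
        raise KeyError("env")
    return env


def dashboard_dir(root_path: str, env: str, title: str) -> PurePosixPath:
    """Build <root>/backups/<ENV>/<sanitized title>, as the plugin does with pathlib."""
    return PurePosixPath(root_path) / "backups" / env.upper() / sanitize(title)


env = resolve_env({"environment_id": "e1"}, {"e1": "dev"})
target = dashboard_dir("/data/storage", env, "Sales: Q1/Q2")
```

Note that an unknown `environment_id` does not fail on its own: the lookup silently falls back to `env`, and only a missing `env` raises `KeyError`.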


@@ -0,0 +1,196 @@
# [DEF:DebugPluginModule:Module]
# @SEMANTICS: plugin, debug, api, database, superset
# @PURPOSE: Implements a plugin for system diagnostics and debugging Superset API responses.
# @LAYER: Plugins
# @RELATION: Inherits from PluginBase. Uses SupersetClient from core.
# @CONSTRAINT: Must use belief_scope for logging.
# [SECTION: IMPORTS]
from typing import Dict, Any, Optional
from ..core.plugin_base import PluginBase
from ..core.superset_client import SupersetClient
from ..core.logger import logger, belief_scope
# [/SECTION]
# [DEF:DebugPlugin:Class]
# @PURPOSE: Plugin for system diagnostics and debugging.
class DebugPlugin(PluginBase):
    """
    Plugin for system diagnostics and debugging.
    """
    @property
    # [DEF:id:Function]
    # @PURPOSE: Returns the unique identifier for the debug plugin.
    # @PRE: Plugin instance exists.
    # @POST: Returns string ID.
    # @RETURN: str - "system-debug"
    def id(self) -> str:
        with belief_scope("id"):
            return "system-debug"
    # [/DEF:id:Function]
    @property
    # [DEF:name:Function]
    # @PURPOSE: Returns the human-readable name of the debug plugin.
    # @PRE: Plugin instance exists.
    # @POST: Returns string name.
    # @RETURN: str - Plugin name.
    def name(self) -> str:
        with belief_scope("name"):
            return "System Debug"
    # [/DEF:name:Function]
    @property
    # [DEF:description:Function]
    # @PURPOSE: Returns a description of the debug plugin.
    # @PRE: Plugin instance exists.
    # @POST: Returns string description.
    # @RETURN: str - Plugin description.
    def description(self) -> str:
        with belief_scope("description"):
            return "Run system diagnostics and debug Superset API responses."
    # [/DEF:description:Function]
    @property
    # [DEF:version:Function]
    # @PURPOSE: Returns the version of the debug plugin.
    # @PRE: Plugin instance exists.
    # @POST: Returns string version.
    # @RETURN: str - "1.0.0"
    def version(self) -> str:
        with belief_scope("version"):
            return "1.0.0"
    # [/DEF:version:Function]
    @property
    # [DEF:ui_route:Function]
    # @PURPOSE: Returns the frontend route for the debug plugin.
    # @RETURN: str - "/tools/debug"
    def ui_route(self) -> str:
        with belief_scope("ui_route"):
            return "/tools/debug"
    # [/DEF:ui_route:Function]
    # [DEF:get_schema:Function]
    # @PURPOSE: Returns the JSON schema for the debug plugin parameters.
    # @PRE: Plugin instance exists.
    # @POST: Returns dictionary schema.
    # @RETURN: Dict[str, Any] - JSON schema.
    def get_schema(self) -> Dict[str, Any]:
        with belief_scope("get_schema"):
            return {
                "type": "object",
                "properties": {
                    "action": {
                        "type": "string",
                        "title": "Action",
                        "enum": ["test-db-api", "get-dataset-structure"],
                        "default": "test-db-api"
                    },
                    "env": {
                        "type": "string",
                        "title": "Environment",
                        "description": "The Superset environment (for dataset structure)."
                    },
                    "dataset_id": {
                        "type": "integer",
                        "title": "Dataset ID",
                        "description": "The ID of the dataset (for dataset structure)."
                    },
                    "source_env": {
                        "type": "string",
                        "title": "Source Environment",
                        "description": "Source env for DB API test."
                    },
                    "target_env": {
                        "type": "string",
                        "title": "Target Environment",
                        "description": "Target env for DB API test."
                    }
                },
                "required": ["action"]
            }
    # [/DEF:get_schema:Function]
    # [DEF:execute:Function]
    # @PURPOSE: Executes the debug logic.
    # @PARAM: params (Dict[str, Any]) - Debug parameters.
    # @PRE: action must be provided in params.
    # @POST: Debug action is executed and results returned.
    # @RETURN: Dict[str, Any] - Execution results.
    async def execute(self, params: Dict[str, Any]) -> Dict[str, Any]:
        with belief_scope("execute"):
            action = params.get("action")
            if action == "test-db-api":
                return await self._test_db_api(params)
            elif action == "get-dataset-structure":
                return await self._get_dataset_structure(params)
            else:
                raise ValueError(f"Unknown action: {action}")
    # [/DEF:execute:Function]
    # [DEF:_test_db_api:Function]
    # @PURPOSE: Tests database API connectivity for source and target environments.
    # @PRE: source_env and target_env params exist in params.
    # @POST: Returns DB counts for both envs.
    # @PARAM: params (Dict) - Plugin parameters.
    # @RETURN: Dict - Comparison results.
    async def _test_db_api(self, params: Dict[str, Any]) -> Dict[str, Any]:
        with belief_scope("_test_db_api"):
            source_env_name = params.get("source_env")
            target_env_name = params.get("target_env")
            if not source_env_name or not target_env_name:
                raise ValueError("source_env and target_env are required for test-db-api")
            from ..dependencies import get_config_manager
            config_manager = get_config_manager()
            results = {}
            for name in [source_env_name, target_env_name]:
                env_config = config_manager.get_environment(name)
                if not env_config:
                    raise ValueError(f"Environment '{name}' not found.")
                client = SupersetClient(env_config)
                client.authenticate()
                count, dbs = client.get_databases()
                results[name] = {
                    "count": count,
                    "databases": dbs
                }
            return results
    # [/DEF:_test_db_api:Function]
    # [DEF:_get_dataset_structure:Function]
    # @PURPOSE: Retrieves the structure of a dataset.
    # @PRE: env and dataset_id params exist in params.
    # @POST: Returns dataset JSON structure.
    # @PARAM: params (Dict) - Plugin parameters.
    # @RETURN: Dict - Dataset structure.
    async def _get_dataset_structure(self, params: Dict[str, Any]) -> Dict[str, Any]:
        with belief_scope("_get_dataset_structure"):
            env_name = params.get("env")
            dataset_id = params.get("dataset_id")
            if not env_name or dataset_id is None:
                raise ValueError("env and dataset_id are required for get-dataset-structure")
            from ..dependencies import get_config_manager
            config_manager = get_config_manager()
            env_config = config_manager.get_environment(env_name)
            if not env_config:
                raise ValueError(f"Environment '{env_name}' not found.")
            client = SupersetClient(env_config)
            client.authenticate()
            dataset_response = client.get_dataset(dataset_id)
            return dataset_response.get('result') or {}
    # [/DEF:_get_dataset_structure:Function]
# [/DEF:DebugPlugin:Class]
# [/DEF:DebugPluginModule:Module]
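
The `execute` method above selects a handler with an `if`/`elif` chain on `action`. A table-driven variant is sketched below as one possible alternative. `DebugDispatcher` and its canned handler bodies are hypothetical stand-ins for illustration only; they do not call Superset and are not the repository's implementation.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict

Handler = Callable[[Dict[str, Any]], Awaitable[Dict[str, Any]]]


class DebugDispatcher:
    """Hypothetical stand-in for DebugPlugin; handlers return canned data."""

    def __init__(self) -> None:
        # Map each supported action name to the coroutine that handles it.
        self._handlers: Dict[str, Handler] = {
            "test-db-api": self._test_db_api,
            "get-dataset-structure": self._get_dataset_structure,
        }

    async def execute(self, params: Dict[str, Any]) -> Dict[str, Any]:
        action = params.get("action")
        handler = self._handlers.get(action)
        if handler is None:
            # Same error contract as the if/elif version.
            raise ValueError(f"Unknown action: {action}")
        return await handler(params)

    async def _test_db_api(self, params: Dict[str, Any]) -> Dict[str, Any]:
        envs = [params.get("source_env"), params.get("target_env")]
        return {"action": "test-db-api", "envs": envs}

    async def _get_dataset_structure(self, params: Dict[str, Any]) -> Dict[str, Any]:
        return {"action": "get-dataset-structure", "dataset_id": params.get("dataset_id")}


def run_action(params: Dict[str, Any]) -> Dict[str, Any]:
    """Synchronous helper for trying the dispatcher from a script or REPL."""
    return asyncio.run(DebugDispatcher().execute(params))
```

Adding an action then means adding one dictionary entry, and the unknown-action error stays in a single place.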

@@ -0,0 +1,66 @@
# [DEF:backend/src/plugins/git/llm_extension:Module]
# @TIER: STANDARD
# @SEMANTICS: git, llm, commit
# @PURPOSE: LLM-based extensions for the Git plugin, specifically for commit message generation.
# @LAYER: Domain
# @RELATION: DEPENDS_ON -> backend.src.plugins.llm_analysis.service.LLMClient
from typing import List, Optional
from tenacity import retry, stop_after_attempt, wait_exponential
from ..llm_analysis.service import LLMClient
from ..llm_analysis.models import LLMProviderType
from ...core.logger import belief_scope, logger
# [DEF:GitLLMExtension:Class]
# @PURPOSE: Provides LLM capabilities to the Git plugin.
class GitLLMExtension:
def __init__(self, client: LLMClient):
self.client = client
# [DEF:suggest_commit_message:Function]
# @PURPOSE: Generates a suggested commit message based on a diff and history.
# @PARAM: diff (str) - The git diff of staged changes.
# @PARAM: history (List[str]) - Recent commit messages for context.
# @RETURN: str - The suggested commit message.
@retry(
stop=stop_after_attempt(2),
wait=wait_exponential(multiplier=1, min=2, max=10),
reraise=True
)
async def suggest_commit_message(self, diff: str, history: List[str]) -> str:
with belief_scope("suggest_commit_message"):
history_text = "\n".join(history)
prompt = f"""
Generate a concise and professional git commit message based on the following diff and recent history.
Use Conventional Commits format (e.g., feat: ..., fix: ..., docs: ...).
Recent History:
{history_text}
Diff:
{diff}
Commit Message:
"""
logger.debug(f"[suggest_commit_message] Calling LLM with model: {self.client.default_model}")
response = await self.client.client.chat.completions.create(
model=self.client.default_model,
messages=[{"role": "user", "content": prompt}],
temperature=0.7
)
logger.debug(f"[suggest_commit_message] LLM Response: {response}")
if not response or not hasattr(response, 'choices') or not response.choices:
error_info = getattr(response, 'error', 'No choices in response')
logger.error(f"[suggest_commit_message] Invalid LLM response. Error info: {error_info}")
# Note: returning here does not trigger the @retry decorator, which only retries on raised exceptions.
# A safe fallback is returned instead of raising so that the UI does not crash.
return "Update dashboard configurations (LLM generation failed)"
return response.choices[0].message.content.strip()
# [/DEF:suggest_commit_message:Function]
# [/DEF:GitLLMExtension:Class]
# [/DEF:backend/src/plugins/git/llm_extension:Module]
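
In `suggest_commit_message`, an empty LLM response returns the fallback string directly, so the retry decorator never sees a failure for that case. The sketch below shows one way to combine the two behaviours: retry on an empty response first, and fall back only once the attempts are used up. It uses a small hand-written helper rather than tenacity, and `EmptyLLMResponse`, `call_with_retry`, and `make_flaky_call` are hypothetical names introduced for illustration.

```python
import asyncio
from typing import Awaitable, Callable

FALLBACK_MESSAGE = "Update dashboard configurations (LLM generation failed)"


class EmptyLLMResponse(Exception):
    """Hypothetical error raised when the provider returns no choices."""


async def call_with_retry(
    fn: Callable[[], Awaitable[str]],
    attempts: int = 2,
    base_delay: float = 0.0,
) -> str:
    """Retry fn on EmptyLLMResponse, then fall back to a safe message."""
    for attempt in range(attempts):
        try:
            return await fn()
        except EmptyLLMResponse:
            if attempt < attempts - 1:
                # Exponential backoff between attempts: base, 2*base, 4*base, ...
                await asyncio.sleep(base_delay * (2 ** attempt))
    return FALLBACK_MESSAGE


def make_flaky_call(failures: int) -> Callable[[], Awaitable[str]]:
    """Build a fake LLM call that fails `failures` times, then succeeds."""
    state = {"calls": 0}

    async def flaky_llm_call() -> str:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise EmptyLLMResponse("no choices in response")
        return "fix: handle empty export archive"

    return flaky_llm_call
```

With the default of two attempts, a call that fails once still yields a generated message, while a call that keeps failing ends with the fallback instead of an exception.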

@@ -0,0 +1,385 @@
# [DEF:backend.src.plugins.git_plugin:Module]
#
# @SEMANTICS: git, plugin, dashboard, version_control, sync, deploy
# @PURPOSE: Provides a plugin for versioning and deploying Superset dashboards.
# @LAYER: Plugin
# @RELATION: INHERITS_FROM -> src.core.plugin_base.PluginBase
# @RELATION: USES -> src.services.git_service.GitService
# @RELATION: USES -> src.core.superset_client.SupersetClient
# @RELATION: USES -> src.core.config_manager.ConfigManager
#
# @INVARIANT: All Git operations must go through GitService.
# @CONSTRAINT: The plugin works only with unpacked Superset YAML exports.
# [SECTION: IMPORTS]
import os
import io
import shutil
import zipfile
from pathlib import Path
from typing import Dict, Any, Optional
from src.core.plugin_base import PluginBase
from src.services.git_service import GitService
from src.core.logger import logger, belief_scope
from src.core.config_manager import ConfigManager
from src.core.superset_client import SupersetClient
# [/SECTION]
# [DEF:GitPlugin:Class]
# @PURPOSE: Implements the Git Integration plugin for managing dashboard versions.
class GitPlugin(PluginBase):
# [DEF:__init__:Function]
# @PURPOSE: Initializes the plugin and its dependencies.
# @PRE: config.json exists or shared config_manager is available.
# @POST: git_service and config_manager are initialized.
def __init__(self):
with belief_scope("GitPlugin.__init__"):
logger.info("[GitPlugin.__init__][Entry] Initializing GitPlugin.")
self.git_service = GitService()
# Robust config path resolution:
# 1. Try absolute path from src/dependencies.py style if possible
# 2. Try relative paths based on common execution patterns
if os.path.exists("../config.json"):
config_path = "../config.json"
elif os.path.exists("config.json"):
config_path = "config.json"
else:
# Fallback to the one initialized in dependencies if we can import it
try:
from src.dependencies import config_manager
self.config_manager = config_manager
logger.info("[GitPlugin.__init__][Exit] GitPlugin initialized using shared config_manager.")
return
except Exception:
config_path = "config.json"
self.config_manager = ConfigManager(config_path)
logger.info(f"[GitPlugin.__init__][Exit] GitPlugin initialized with {config_path}")
# [/DEF:__init__:Function]
@property
# [DEF:id:Function]
# @PURPOSE: Returns the plugin identifier.
# @PRE: GitPlugin is initialized.
# @POST: Returns 'git-integration'.
def id(self) -> str:
with belief_scope("GitPlugin.id"):
return "git-integration"
# [/DEF:id:Function]
@property
# [DEF:name:Function]
# @PURPOSE: Returns the plugin name.
# @PRE: GitPlugin is initialized.
# @POST: Returns the human-readable name.
def name(self) -> str:
with belief_scope("GitPlugin.name"):
return "Git Integration"
# [/DEF:name:Function]
@property
# [DEF:description:Function]
# @PURPOSE: Returns the plugin description.
# @PRE: GitPlugin is initialized.
# @POST: Returns the plugin's purpose description.
def description(self) -> str:
with belief_scope("GitPlugin.description"):
return "Version control for Superset dashboards"
# [/DEF:description:Function]
@property
# [DEF:version:Function]
# @PURPOSE: Returns the plugin version.
# @PRE: GitPlugin is initialized.
# @POST: Returns the version string.
def version(self) -> str:
with belief_scope("GitPlugin.version"):
return "0.1.0"
# [/DEF:version:Function]
@property
# [DEF:ui_route:Function]
# @PURPOSE: Returns the frontend route for the git plugin.
# @RETURN: str - "/git"
def ui_route(self) -> str:
with belief_scope("GitPlugin.ui_route"):
return "/git"
# [/DEF:ui_route:Function]
# [DEF:get_schema:Function]
# @PURPOSE: Returns the JSON schema of the parameters accepted by plugin tasks.
# @PRE: GitPlugin is initialized.
# @POST: Returns a JSON schema dictionary.
# @RETURN: Dict[str, Any] - Parameter schema.
def get_schema(self) -> Dict[str, Any]:
with belief_scope("GitPlugin.get_schema"):
return {
"type": "object",
"properties": {
"operation": {"type": "string", "enum": ["sync", "deploy", "history"]},
"dashboard_id": {"type": "integer"},
"environment_id": {"type": "string"},
"source_env_id": {"type": "string"}
},
"required": ["operation", "dashboard_id"]
}
# [/DEF:get_schema:Function]
# [DEF:initialize:Function]
# @PURPOSE: Performs the plugin's initial setup.
# @PRE: GitPlugin is initialized.
# @POST: The plugin is ready to execute tasks.
async def initialize(self):
with belief_scope("GitPlugin.initialize"):
logger.info("[GitPlugin.initialize][Action] Initializing Git Integration Plugin logic.")
# [DEF:execute:Function]
# @PURPOSE: Main entry point for executing plugin tasks.
# @PRE: task_data contains 'operation' and 'dashboard_id'.
# @POST: Returns the result of the operation.
# @PARAM: task_data (Dict[str, Any]) - Task data.
# @RETURN: Dict[str, Any] - Status and message.
# @RELATION: CALLS -> self._handle_sync
# @RELATION: CALLS -> self._handle_deploy
async def execute(self, task_data: Dict[str, Any]) -> Dict[str, Any]:
with belief_scope("GitPlugin.execute"):
operation = task_data.get("operation")
dashboard_id = task_data.get("dashboard_id")
logger.info(f"[GitPlugin.execute][Entry] Executing operation: {operation} for dashboard {dashboard_id}")
if operation == "sync":
source_env_id = task_data.get("source_env_id")
result = await self._handle_sync(dashboard_id, source_env_id)
elif operation == "deploy":
env_id = task_data.get("environment_id")
result = await self._handle_deploy(dashboard_id, env_id)
elif operation == "history":
result = {"status": "success", "message": "History available via API"}
else:
logger.error(f"[GitPlugin.execute][Coherence:Failed] Unknown operation: {operation}")
raise ValueError(f"Unknown operation: {operation}")
logger.info(f"[GitPlugin.execute][Exit] Operation {operation} completed.")
return result
# [/DEF:execute:Function]
# [DEF:_handle_sync:Function]
# @PURPOSE: Exports a dashboard from Superset and unpacks it into the Git repository.
# @PRE: The repository for the dashboard must exist.
# @POST: Repository files match the current state in Superset.
# @PARAM: dashboard_id (int) - Dashboard ID.
# @PARAM: source_env_id (Optional[str]) - Source environment ID.
# @RETURN: Dict[str, str] - Sync result.
# @SIDE_EFFECT: Modifies files in the repository's local working directory.
# @RELATION: CALLS -> src.services.git_service.GitService.get_repo
# @RELATION: CALLS -> src.core.superset_client.SupersetClient.export_dashboard
async def _handle_sync(self, dashboard_id: int, source_env_id: Optional[str] = None) -> Dict[str, str]:
with belief_scope("GitPlugin._handle_sync"):
try:
# 1. Get the repository
repo = self.git_service.get_repo(dashboard_id)
repo_path = Path(repo.working_dir)
logger.info(f"[_handle_sync][Action] Target repo path: {repo_path}")
# 2. Set up the Superset client
env = self._get_env(source_env_id)
client = SupersetClient(env)
client.authenticate()
# 3. Export the dashboard
logger.info(f"[_handle_sync][Action] Exporting dashboard {dashboard_id} from {env.name}")
zip_bytes, _ = client.export_dashboard(dashboard_id)
# 4. Unpack the export, flattening the directory structure
logger.info(f"[_handle_sync][Action] Unpacking export to {repo_path}")
# Directories and files that a Superset export is expected to contain
managed_dirs = ["dashboards", "charts", "datasets", "databases"]
managed_files = ["metadata.yaml"]
# Remove stale data before unpacking so that no leftover ("ghost") files remain
for d in managed_dirs:
d_path = repo_path / d
if d_path.exists() and d_path.is_dir():
shutil.rmtree(d_path)
for f in managed_files:
f_path = repo_path / f
if f_path.exists():
f_path.unlink()
with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
# Superset exports everything into a dashboard_export_<timestamp>/ subfolder,
# so the name of that folder has to be detected first
namelist = zf.namelist()
if not namelist:
raise ValueError("Export ZIP is empty")
root_folder = namelist[0].split('/')[0]
logger.info(f"[_handle_sync][Action] Detected root folder in ZIP: {root_folder}")
for member in zf.infolist():
if member.filename.startswith(root_folder + "/") and len(member.filename) > len(root_folder) + 1:
# Strip the root folder prefix
relative_path = member.filename[len(root_folder)+1:]
target_path = repo_path / relative_path
if member.is_dir():
target_path.mkdir(parents=True, exist_ok=True)
else:
target_path.parent.mkdir(parents=True, exist_ok=True)
with zf.open(member) as source, open(target_path, "wb") as target:
shutil.copyfileobj(source, target)
# 5. Stage the changes automatically (no commit, so the user can review the diff)
try:
repo.git.add(A=True)
logger.info("[_handle_sync][Action] Changes staged in git")
except Exception as ge:
logger.warning(f"[_handle_sync][Action] Failed to stage changes: {ge}")
logger.info(f"[_handle_sync][Coherence:OK] Dashboard {dashboard_id} synced successfully.")
return {"status": "success", "message": "Dashboard synced and flattened in local repository"}
except Exception as e:
logger.error(f"[_handle_sync][Coherence:Failed] Sync failed: {e}")
raise
# [/DEF:_handle_sync:Function]
# [DEF:_handle_deploy:Function]
# @PURPOSE: Packs the repository into a ZIP and imports it into the target Superset environment.
# @PRE: environment_id must match a configured environment.
# @POST: The dashboard is imported into the target Superset.
# @PARAM: dashboard_id (int) - Dashboard ID.
# @PARAM: env_id (str) - Target environment ID.
# @RETURN: Dict[str, Any] - Deployment result.
# @SIDE_EFFECT: Creates and deletes a temporary ZIP file.
# @RELATION: CALLS -> src.core.superset_client.SupersetClient.import_dashboard
async def _handle_deploy(self, dashboard_id: int, env_id: str) -> Dict[str, Any]:
with belief_scope("GitPlugin._handle_deploy"):
try:
if not env_id:
raise ValueError("Target environment ID required for deployment")
# 1. Get the repository
repo = self.git_service.get_repo(dashboard_id)
repo_path = Path(repo.working_dir)
# 2. Pack into a ZIP
logger.info(f"[_handle_deploy][Action] Packing repository {repo_path} for deployment.")
zip_buffer = io.BytesIO()
# Superset expects a root directory in the ZIP (e.g., dashboard_export_20240101T000000/)
root_dir_name = f"dashboard_export_{dashboard_id}"
with zipfile.ZipFile(zip_buffer, "w", zipfile.ZIP_DEFLATED) as zf:
for root, dirs, files in os.walk(repo_path):
if ".git" in dirs:
dirs.remove(".git")
for file in files:
if file == ".git" or file.endswith(".zip"): continue
file_path = Path(root) / file
# Prepend the root directory name to the archive path
arcname = Path(root_dir_name) / file_path.relative_to(repo_path)
zf.write(file_path, arcname)
zip_buffer.seek(0)
# 3. Set up the Superset client
env = self.config_manager.get_environment(env_id)
if not env:
raise ValueError(f"Environment {env_id} not found")
client = SupersetClient(env)
client.authenticate()
# 4. Import
temp_zip_path = repo_path / f"deploy_{dashboard_id}.zip"
logger.info(f"[_handle_deploy][Action] Saving temporary zip to {temp_zip_path}")
with open(temp_zip_path, "wb") as f:
f.write(zip_buffer.getvalue())
try:
logger.info(f"[_handle_deploy][Action] Importing dashboard to {env.name}")
result = client.import_dashboard(temp_zip_path)
logger.info(f"[_handle_deploy][Coherence:OK] Deployment successful for dashboard {dashboard_id}.")
return {"status": "success", "message": f"Dashboard deployed to {env.name}", "details": result}
finally:
if temp_zip_path.exists():
os.remove(temp_zip_path)
except Exception as e:
logger.error(f"[_handle_deploy][Coherence:Failed] Deployment failed: {e}")
raise
# [/DEF:_handle_deploy:Function]
# [DEF:_get_env:Function]
# @PURPOSE: Helper that resolves the environment configuration.
# @PARAM: env_id (Optional[str]) - Environment ID.
# @PRE: env_id is a string or None.
# @POST: Returns an Environment object from config or DB.
# @RETURN: Environment - Environment configuration object.
def _get_env(self, env_id: Optional[str] = None):
with belief_scope("GitPlugin._get_env"):
logger.info(f"[_get_env][Entry] Fetching environment for ID: {env_id}")
# Priority 1: ConfigManager (config.json)
if env_id:
env = self.config_manager.get_environment(env_id)
if env:
logger.info(f"[_get_env][Exit] Found environment by ID in ConfigManager: {env.name}")
return env
# Priority 2: Database (DeploymentEnvironment)
from src.core.database import SessionLocal
from src.models.git import DeploymentEnvironment
db = SessionLocal()
try:
if env_id:
db_env = db.query(DeploymentEnvironment).filter(DeploymentEnvironment.id == env_id).first()
else:
# If no ID, try to find active or any environment in DB
db_env = db.query(DeploymentEnvironment).filter(DeploymentEnvironment.is_active == True).first()
if not db_env:
db_env = db.query(DeploymentEnvironment).first()
if db_env:
logger.info(f"[_get_env][Exit] Found environment in DB: {db_env.name}")
from src.core.config_models import Environment
# Use token as password for SupersetClient
return Environment(
id=db_env.id,
name=db_env.name,
url=db_env.superset_url,
username="admin",
password=db_env.superset_token,
verify_ssl=True
)
finally:
db.close()
# Priority 3: ConfigManager Default (if no env_id provided)
envs = self.config_manager.get_environments()
if envs:
if env_id:
# If env_id was provided but not found in DB or specifically by ID in config,
# but we have other envs, maybe it's one of them?
env = next((e for e in envs if e.id == env_id), None)
if env:
logger.info(f"[_get_env][Exit] Found environment {env_id} in ConfigManager list")
return env
if not env_id:
logger.info(f"[_get_env][Exit] Using first environment from ConfigManager: {envs[0].name}")
return envs[0]
logger.error(f"[_get_env][Coherence:Failed] No environments configured (searched config.json and DB). env_id={env_id}")
raise ValueError("No environments configured. Please add a Superset Environment in Settings.")
# [/DEF:_get_env:Function]
# [/DEF:initialize:Function]
# [/DEF:GitPlugin:Class]
# [/DEF:backend.src.plugins.git_plugin:Module]
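
`_handle_sync` strips the export's root folder while unpacking, and `_handle_deploy` adds a root folder back when packing. The sketch below reproduces that round trip with the standard library only, and adds a check that rejects archive entries resolving outside the destination directory, which the code above does not perform. `flatten_export`, `pack_repo`, and `roundtrip_demo` are hypothetical helper names, and the sample file contents are made up for the demonstration.

```python
import io
import tempfile
import zipfile
from pathlib import Path
from typing import List


def flatten_export(zip_bytes: bytes, dest: Path) -> List[str]:
    """Extract a ZIP with a single root folder into dest, dropping that folder."""
    dest = dest.resolve()
    written = []
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        names = zf.namelist()
        if not names:
            raise ValueError("Export ZIP is empty")
        prefix = names[0].split("/")[0] + "/"
        for member in zf.infolist():
            if member.is_dir() or not member.filename.startswith(prefix):
                continue
            relative = member.filename[len(prefix):]
            if not relative:
                continue
            target = (dest / relative).resolve()
            # Reject entries such as "root/../../evil.txt" that escape dest.
            if dest not in target.parents:
                raise ValueError(f"Unsafe path in archive: {member.filename}")
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_bytes(zf.read(member))
            written.append(relative)
    return sorted(written)


def pack_repo(repo_path: Path, root_dir_name: str) -> bytes:
    """Zip repo_path under root_dir_name, skipping .git and *.zip files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(repo_path.rglob("*")):
            rel = path.relative_to(repo_path)
            if not path.is_file() or ".git" in rel.parts or path.suffix == ".zip":
                continue
            zf.write(path, (Path(root_dir_name) / rel).as_posix())
    return buffer.getvalue()


def roundtrip_demo() -> List[str]:
    """Build a fake export in memory, flatten it, repack it, list the result."""
    source = io.BytesIO()
    with zipfile.ZipFile(source, "w") as zf:
        zf.writestr("dashboard_export_20240101T000000/metadata.yaml", "version: 1.0.0\n")
        zf.writestr("dashboard_export_20240101T000000/charts/sales.yaml", "slice_name: Sales\n")
    with tempfile.TemporaryDirectory() as tmp:
        flatten_export(source.getvalue(), Path(tmp))
        repacked = pack_repo(Path(tmp), "dashboard_export_42")
    with zipfile.ZipFile(io.BytesIO(repacked)) as zf:
        return sorted(zf.namelist())
```

Calling `roundtrip_demo()` lists the repacked entries under the new root folder. The path check matters because archive member names are data supplied by the remote side and should not be trusted to stay inside the repository.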

@@ -0,0 +1,12 @@
# [DEF:backend/src/plugins/llm_analysis/__init__.py:Module]
# @TIER: TRIVIAL
# @PURPOSE: Initialize the LLM Analysis plugin package.
# @LAYER: Domain
"""
LLM Analysis Plugin for automated dashboard validation and dataset documentation.
"""
from .plugin import DashboardValidationPlugin, DocumentationPlugin
# [/DEF:backend/src/plugins/llm_analysis/__init__.py:Module]

@@ -0,0 +1,61 @@
# [DEF:backend/src/plugins/llm_analysis/models.py:Module]
# @TIER: STANDARD
# @SEMANTICS: pydantic, models, llm
# @PURPOSE: Define Pydantic models for LLM Analysis plugin.
# @LAYER: Domain
from typing import List, Optional
from pydantic import BaseModel, Field
from datetime import datetime
from enum import Enum
# [DEF:LLMProviderType:Class]
# @PURPOSE: Enum for supported LLM providers.
class LLMProviderType(str, Enum):
OPENAI = "openai"
OPENROUTER = "openrouter"
KILO = "kilo"
# [/DEF:LLMProviderType:Class]
# [DEF:LLMProviderConfig:Class]
# @PURPOSE: Configuration for an LLM provider.
class LLMProviderConfig(BaseModel):
id: Optional[str] = None
provider_type: LLMProviderType
name: str
base_url: str
api_key: Optional[str] = None
default_model: str
is_active: bool = True
# [/DEF:LLMProviderConfig:Class]
# [DEF:ValidationStatus:Class]
# @PURPOSE: Enum for dashboard validation status.
class ValidationStatus(str, Enum):
PASS = "PASS"
WARN = "WARN"
FAIL = "FAIL"
# [/DEF:ValidationStatus:Class]
# [DEF:DetectedIssue:Class]
# @PURPOSE: Model for a single issue detected during validation.
class DetectedIssue(BaseModel):
severity: ValidationStatus
message: str
location: Optional[str] = None
# [/DEF:DetectedIssue:Class]
# [DEF:ValidationResult:Class]
# @PURPOSE: Model for dashboard validation result.
class ValidationResult(BaseModel):
id: Optional[str] = None
dashboard_id: str
timestamp: datetime = Field(default_factory=datetime.utcnow)
status: ValidationStatus
screenshot_path: Optional[str] = None
issues: List[DetectedIssue]
summary: str
raw_response: Optional[str] = None
# [/DEF:ValidationResult:Class]
# [/DEF:backend/src/plugins/llm_analysis/models.py:Module]
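
`ValidationStatus` accepts only the exact strings `PASS`, `WARN`, and `FAIL`, so a model reply such as `"pass"` or `"critical"` raises `ValueError` when converted. The sketch below shows one way to normalize loosely formatted values before building a result. `coerce_status` is a hypothetical helper, and defaulting unknown values to `WARN` is an assumption rather than something the models above specify. The enum is redeclared so that the example runs on its own.

```python
from enum import Enum
from typing import Any


class ValidationStatus(str, Enum):
    PASS = "PASS"
    WARN = "WARN"
    FAIL = "FAIL"


def coerce_status(
    raw: Any,
    default: ValidationStatus = ValidationStatus.WARN,
) -> ValidationStatus:
    """Map loosely formatted LLM output onto ValidationStatus."""
    if isinstance(raw, ValidationStatus):
        return raw
    try:
        # Tolerate surrounding whitespace and lowercase variants.
        return ValidationStatus(str(raw).strip().upper())
    except ValueError:
        # Unknown labels (or None) fall back to the assumed default.
        return default
```

Falling back to a middle severity keeps an unexpected label visible in reports without failing the whole validation run; a stricter deployment might prefer to raise instead.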

@@ -0,0 +1,377 @@
# [DEF:backend/src/plugins/llm_analysis/plugin.py:Module]
# @TIER: STANDARD
# @SEMANTICS: plugin, llm, analysis, documentation
# @PURPOSE: Implements DashboardValidationPlugin and DocumentationPlugin.
# @LAYER: Domain
# @RELATION: INHERITS -> backend.src.core.plugin_base.PluginBase
# @RELATION: CALLS -> backend.src.plugins.llm_analysis.service.ScreenshotService
# @RELATION: CALLS -> backend.src.plugins.llm_analysis.service.LLMClient
# @RELATION: CALLS -> backend.src.services.llm_provider.LLMProviderService
# @INVARIANT: All LLM interactions must be executed as asynchronous tasks.
from typing import Dict, Any, Optional, List
import os
import json
import logging
from datetime import datetime, timedelta
from ...core.plugin_base import PluginBase
from ...core.logger import belief_scope, logger
from ...core.database import SessionLocal
from ...core.config_manager import ConfigManager
from ...services.llm_provider import LLMProviderService
from ...core.superset_client import SupersetClient
from .service import ScreenshotService, LLMClient
from .models import LLMProviderType, ValidationStatus, ValidationResult, DetectedIssue
from ...models.llm import ValidationRecord
# [DEF:DashboardValidationPlugin:Class]
# @PURPOSE: Plugin for automated dashboard health analysis using LLMs.
# @RELATION: IMPLEMENTS -> backend.src.core.plugin_base.PluginBase
class DashboardValidationPlugin(PluginBase):
@property
def id(self) -> str:
return "llm_dashboard_validation"
@property
def name(self) -> str:
return "Dashboard LLM Validation"
@property
def description(self) -> str:
return "Automated dashboard health analysis using multimodal LLMs."
@property
def version(self) -> str:
return "1.0.0"
def get_schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"dashboard_id": {"type": "string", "title": "Dashboard ID"},
"environment_id": {"type": "string", "title": "Environment ID"},
"provider_id": {"type": "string", "title": "LLM Provider ID"}
},
"required": ["dashboard_id", "environment_id", "provider_id"]
}
# [DEF:DashboardValidationPlugin.execute:Function]
# @PURPOSE: Executes the dashboard validation task.
# @PRE: params contains dashboard_id, environment_id, and provider_id.
# @POST: Returns a dictionary with validation results and persists them to the database.
# @SIDE_EFFECT: Captures a screenshot, calls LLM API, and writes to the database.
async def execute(self, params: Dict[str, Any]):
with belief_scope("execute", f"plugin_id={self.id}"):
logger.info(f"Executing {self.name} with params: {params}")
dashboard_id = params.get("dashboard_id")
env_id = params.get("environment_id")
provider_id = params.get("provider_id")
task_id = params.get("_task_id")
# Helper to log to both app logger and task manager logs
def task_log(level: str, message: str, context: Optional[Dict] = None):
logger.log(getattr(logging, level.upper()), message)
if task_id:
from ...dependencies import get_task_manager
try:
tm = get_task_manager()
tm._add_log(task_id, level.upper(), message, context)
except Exception: pass
db = SessionLocal()
try:
# 1. Get Environment
from ...dependencies import get_config_manager
config_mgr = get_config_manager()
env = config_mgr.get_environment(env_id)
if not env:
raise ValueError(f"Environment {env_id} not found")
# 2. Get LLM Provider
llm_service = LLMProviderService(db)
db_provider = llm_service.get_provider(provider_id)
if not db_provider:
raise ValueError(f"LLM Provider {provider_id} not found")
logger.info(f"[DashboardValidationPlugin.execute] Retrieved provider config:")
logger.info(f"[DashboardValidationPlugin.execute] Provider ID: {db_provider.id}")
logger.info(f"[DashboardValidationPlugin.execute] Provider Name: {db_provider.name}")
logger.info(f"[DashboardValidationPlugin.execute] Provider Type: {db_provider.provider_type}")
logger.info(f"[DashboardValidationPlugin.execute] Base URL: {db_provider.base_url}")
logger.info(f"[DashboardValidationPlugin.execute] Default Model: {db_provider.default_model}")
logger.info(f"[DashboardValidationPlugin.execute] Is Active: {db_provider.is_active}")
api_key = llm_service.get_decrypted_api_key(provider_id)
logger.info(f"[DashboardValidationPlugin.execute] API Key decrypted: {'yes' if api_key else 'no'}")
logger.info(f"[DashboardValidationPlugin.execute] API Key Length: {len(api_key) if api_key else 0}")
# Check if API key was successfully decrypted
if not api_key:
raise ValueError(
f"Failed to decrypt API key for provider {provider_id}. "
f"The provider may have been encrypted with a different encryption key. "
f"Please update the provider with a new API key through the UI."
)
# 3. Capture Screenshot
screenshot_service = ScreenshotService(env)
storage_root = config_mgr.get_config().settings.storage.root_path
screenshots_dir = os.path.join(storage_root, "screenshots")
os.makedirs(screenshots_dir, exist_ok=True)
filename = f"{dashboard_id}_{datetime.now().strftime('%Y%m%d_%H%M%S')}.png"
screenshot_path = os.path.join(screenshots_dir, filename)
await screenshot_service.capture_dashboard(dashboard_id, screenshot_path)
# 4. Fetch Logs (from Environment /api/v1/log/)
logs = []
try:
client = SupersetClient(env)
# Calculate time window (last 24 hours)
start_time = (datetime.now() - timedelta(hours=24)).isoformat()
# Construct filter for logs
# Note: We filter by dashboard_id matching the object
query_params = {
"filters": [
{"col": "dashboard_id", "opr": "eq", "value": dashboard_id},
{"col": "dttm", "opr": "gt", "value": start_time}
],
"order_column": "dttm",
"order_direction": "desc",
"page": 0,
"page_size": 100
}
response = client.network.request(
method="GET",
endpoint="/log/",
params={"q": json.dumps(query_params)}
)
if isinstance(response, dict) and "result" in response:
for item in response["result"]:
action = item.get("action", "unknown")
dttm = item.get("dttm", "")
details = item.get("json", "")
logs.append(f"[{dttm}] {action}: {details}")
if not logs:
logs = ["No recent logs found for this dashboard."]
except Exception as e:
logger.warning(f"Failed to fetch logs from environment: {e}")
logs = [f"Error fetching remote logs: {str(e)}"]
# 5. Analyze with LLM
llm_client = LLMClient(
provider_type=LLMProviderType(db_provider.provider_type),
api_key=api_key,
base_url=db_provider.base_url,
default_model=db_provider.default_model
)
analysis = await llm_client.analyze_dashboard(screenshot_path, logs)
# Log analysis summary to task logs for better visibility
task_log("INFO", f"[ANALYSIS_SUMMARY] Status: {analysis['status']}")
task_log("INFO", f"[ANALYSIS_SUMMARY] Summary: {analysis['summary']}")
if analysis.get("issues"):
for i, issue in enumerate(analysis["issues"]):
task_log("INFO", f"[ANALYSIS_ISSUE][{i+1}] {issue.get('severity')}: {issue.get('message')} (Location: {issue.get('location', 'N/A')})")
# 6. Persist Result
validation_result = ValidationResult(
dashboard_id=dashboard_id,
status=ValidationStatus(analysis["status"]),
summary=analysis["summary"],
issues=[DetectedIssue(**issue) for issue in analysis["issues"]],
screenshot_path=screenshot_path,
raw_response=str(analysis)
)
db_record = ValidationRecord(
dashboard_id=validation_result.dashboard_id,
status=validation_result.status.value,
summary=validation_result.summary,
issues=[issue.dict() for issue in validation_result.issues],
screenshot_path=validation_result.screenshot_path,
raw_response=validation_result.raw_response
)
db.add(db_record)
db.commit()
# 7. Notification on failure (US1 / FR-015)
if validation_result.status == ValidationStatus.FAIL:
task_log("WARNING", f"Dashboard {dashboard_id} validation FAILED. Summary: {validation_result.summary}")
# Placeholder for Email/Pulse notification dispatch
# In a real implementation, we would call a NotificationService here
# with a payload containing the summary and a link to the report.
# Final log to ensure all analysis is visible in task logs
task_log("INFO", f"Validation completed for dashboard {dashboard_id}. Status: {validation_result.status.value}")
return validation_result.dict()
finally:
db.close()
# [/DEF:DashboardValidationPlugin.execute:Function]
# [/DEF:DashboardValidationPlugin:Class]
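
Provider lookup in `execute` writes information about the decrypted API key to the application log. When key material has to appear in logs at all, a masking helper along the following lines keeps most of it hidden. `mask_secret` is a hypothetical function, and showing the last four characters is an assumption made for this sketch, not a project convention.

```python
from typing import Optional


def mask_secret(secret: Optional[str], visible: int = 4) -> str:
    """Return a log-safe placeholder that reveals at most the last few characters."""
    if not secret:
        return "<empty>"
    if len(secret) <= visible * 2:
        # Too short to reveal anything safely: mask every character.
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]
```

Short secrets are masked entirely so that the visible tail never makes up a large share of the value.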
# [DEF:DocumentationPlugin:Class]
# @PURPOSE: Plugin for automated dataset documentation using LLMs.
# @RELATION: IMPLEMENTS -> backend.src.core.plugin_base.PluginBase
class DocumentationPlugin(PluginBase):
@property
def id(self) -> str:
return "llm_documentation"
@property
def name(self) -> str:
return "Dataset LLM Documentation"
@property
def description(self) -> str:
return "Automated dataset and column documentation using LLMs."
@property
def version(self) -> str:
return "1.0.0"
def get_schema(self) -> Dict[str, Any]:
return {
"type": "object",
"properties": {
"dataset_id": {"type": "string", "title": "Dataset ID"},
"environment_id": {"type": "string", "title": "Environment ID"},
"provider_id": {"type": "string", "title": "LLM Provider ID"}
},
"required": ["dataset_id", "environment_id", "provider_id"]
}
# [DEF:DocumentationPlugin.execute:Function]
# @PURPOSE: Executes the dataset documentation task.
# @PRE: params contains dataset_id, environment_id, and provider_id.
# @POST: Returns generated documentation and updates the dataset in Superset.
# @SIDE_EFFECT: Calls LLM API and updates dataset metadata in Superset.
async def execute(self, params: Dict[str, Any]):
with belief_scope("execute", f"plugin_id={self.id}"):
logger.info(f"Executing {self.name} with params: {params}")
dataset_id = params.get("dataset_id")
env_id = params.get("environment_id")
provider_id = params.get("provider_id")
db = SessionLocal()
try:
# 1. Get Environment
from ...dependencies import get_config_manager
config_mgr = get_config_manager()
env = config_mgr.get_environment(env_id)
if not env:
raise ValueError(f"Environment {env_id} not found")
# 2. Get LLM Provider
llm_service = LLMProviderService(db)
db_provider = llm_service.get_provider(provider_id)
if not db_provider:
raise ValueError(f"LLM Provider {provider_id} not found")
logger.info(f"[DocumentationPlugin.execute] Retrieved provider config:")
logger.info(f"[DocumentationPlugin.execute] Provider ID: {db_provider.id}")
logger.info(f"[DocumentationPlugin.execute] Provider Name: {db_provider.name}")
logger.info(f"[DocumentationPlugin.execute] Provider Type: {db_provider.provider_type}")
logger.info(f"[DocumentationPlugin.execute] Base URL: {db_provider.base_url}")
logger.info(f"[DocumentationPlugin.execute] Default Model: {db_provider.default_model}")
logger.info(f"[DocumentationPlugin.execute] Is Active: {db_provider.is_active}")
api_key = llm_service.get_decrypted_api_key(provider_id)
logger.info(f"[DocumentationPlugin.execute] API Key decrypted: {'yes' if api_key else 'no'}")
logger.info(f"[DocumentationPlugin.execute] API Key Length: {len(api_key) if api_key else 0}")
# Check if API key was successfully decrypted
if not api_key:
raise ValueError(
f"Failed to decrypt API key for provider {provider_id}. "
f"The provider may have been encrypted with a different encryption key. "
f"Please update the provider with a new API key through the UI."
)
# 3. Fetch Metadata (US2 / T024)
from ...core.superset_client import SupersetClient
client = SupersetClient(env)
# Optimistic locking check (T045)
dataset_response = client.get_dataset(int(dataset_id))
# The payload may be wrapped in a "result" key (DebugPlugin reads it that way); unwrap it so columns and table_name resolve.
dataset = dataset_response.get("result") or dataset_response
original_changed_on = dataset.get("changed_on_utc")
# Extract columns and existing descriptions
columns_data = []
for col in dataset.get("columns", []):
columns_data.append({
"name": col.get("column_name"),
"type": col.get("type"),
"description": col.get("description")
})
# 4. Construct Prompt & Analyze (US2 / T025)
llm_client = LLMClient(
provider_type=LLMProviderType(db_provider.provider_type),
api_key=api_key,
base_url=db_provider.base_url,
default_model=db_provider.default_model
)
prompt = f"""
Generate professional documentation for the following dataset and its columns.
Dataset: {dataset.get('table_name')}
Columns: {columns_data}
Provide the documentation in JSON format:
{{
"dataset_description": "General description of the dataset",
"column_descriptions": [
{{
"name": "column_name",
"description": "Generated description"
}}
]
}}
"""
# Using a generic chat completion for text-only US2
# We use the shared get_json_completion method from LLMClient
doc_result = await llm_client.get_json_completion([{"role": "user", "content": prompt}])
# 5. Update Metadata (US2 / T026)
# This part normally goes to mapping_service, but we implement the logic here for the plugin flow
# We'll update the dataset in Superset
update_payload = {
"description": doc_result["dataset_description"],
"columns": []
}
# Map generated descriptions back to column IDs
for col_doc in doc_result["column_descriptions"]:
for col in dataset.get("columns", []):
if col.get("column_name") == col_doc["name"]:
update_payload["columns"].append({
"id": col.get("id"),
"description": col_doc["description"]
})
client.update_dataset(int(dataset_id), update_payload)
return doc_result
finally:
db.close()
# [/DEF:DocumentationPlugin.execute:Function]
# [/DEF:DocumentationPlugin:Class]
# [/DEF:backend/src/plugins/llm_analysis/plugin.py:Module]
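The step that maps generated descriptions back to column IDs rescans every dataset column for each description the model returns. A minimal sketch of the same mapping with a single name-to-ID index is shown here. `build_update_payload` is a hypothetical helper, not part of the plugin, and unlike the loop above it skips names that have no matching column ID.

```python
from typing import Any, Dict, List


def build_update_payload(dataset: Dict[str, Any], doc_result: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: map generated descriptions onto existing column IDs."""
    # Index the dataset columns by name once instead of rescanning per description.
    ids_by_name = {
        col.get("column_name"): col.get("id")
        for col in dataset.get("columns", [])
    }
    columns: List[Dict[str, Any]] = []
    for col_doc in doc_result.get("column_descriptions", []):
        col_id = ids_by_name.get(col_doc.get("name"))
        if col_id is None:
            # The model named a column the dataset does not have; skip it.
            continue
        columns.append({"id": col_id, "description": col_doc.get("description")})
    return {"description": doc_result.get("dataset_description"), "columns": columns}
```

Skipping unknown names keeps a hallucinated column from producing an entry without an ID in the update payload; whether that is the desired behavior for this plugin is an assumption.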


@@ -0,0 +1,60 @@
# [DEF:backend/src/plugins/llm_analysis/scheduler.py:Module]
# @TIER: STANDARD
# @SEMANTICS: scheduler, task, automation
# @PURPOSE: Provides helper functions to schedule LLM-based validation tasks.
# @LAYER: Domain
# @RELATION: DEPENDS_ON -> backend.src.core.scheduler
from typing import Dict, Any
from ...dependencies import get_task_manager, get_scheduler_service
from ...core.logger import belief_scope, logger
# [DEF:schedule_dashboard_validation:Function]
# @PURPOSE: Schedules a recurring dashboard validation task.
# @PARAM: dashboard_id (str) - ID of the dashboard to validate.
# @PARAM: cron_expression (str) - Standard cron expression for scheduling.
# @PARAM: params (Dict[str, Any]) - Task parameters (environment_id, provider_id).
# @SIDE_EFFECT: Adds a job to the scheduler service.
def schedule_dashboard_validation(dashboard_id: str, cron_expression: str, params: Dict[str, Any]):
    with belief_scope("schedule_dashboard_validation", f"dashboard_id={dashboard_id}"):
        scheduler = get_scheduler_service()
        task_manager = get_task_manager()
        # One job per dashboard: re-scheduling replaces the existing job.
        job_id = f"llm_val_{dashboard_id}"
        async def job_func():
            await task_manager.create_task(
                plugin_id="llm_dashboard_validation",
                params={
                    "dashboard_id": dashboard_id,
                    **params
                }
            )
        scheduler.add_job(
            job_func,
            "cron",
            id=job_id,
            replace_existing=True,
            **_parse_cron(cron_expression)
        )
        logger.info(f"Scheduled validation for dashboard {dashboard_id} with cron {cron_expression}")
# [/DEF:schedule_dashboard_validation:Function]
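# [DEF:_parse_cron:Function]
# @PURPOSE: Splits a five-field cron expression into keyword arguments for the cron trigger.
# @PARAM: cron (str) - Cron expression.
# @RETURN: Dict[str, str] - Parsed cron parts.
def _parse_cron(cron: str) -> Dict[str, str]:
    parts = cron.split()
    if len(parts) != 5:
        # An empty dict would register a trigger with no field constraints, so fail loudly instead.
        raise ValueError(f"Expected a 5-field cron expression, got {len(parts)} fields: {cron!r}")
    return {
        "minute": parts[0],
        "hour": parts[1],
        "day": parts[2],
        "month": parts[3],
        "day_of_week": parts[4]
    }
# [/DEF:_parse_cron:Function]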
# [/DEF:backend/src/plugins/llm_analysis/scheduler.py:Module]
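To make the scheduling inputs concrete, the sketch below shows how a five-field cron string turns into the keyword arguments passed to `add_job`, and how the job ID is derived. `cron_to_kwargs` and `validation_job_id` are illustrative names, not functions from this module. If the scheduler service wraps APScheduler (an assumption; the diff does not show it), note that its `day_of_week` numbering starts at Monday, which differs from classic cron.

```python
from typing import Dict

# Field order of a standard five-field cron expression.
CRON_FIELDS = ("minute", "hour", "day", "month", "day_of_week")


def cron_to_kwargs(expression: str) -> Dict[str, str]:
    """Illustrative parser: split a cron expression into trigger keyword arguments."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(f"expected 5 cron fields, got {len(parts)}: {expression!r}")
    return dict(zip(CRON_FIELDS, parts))


def validation_job_id(dashboard_id: str) -> str:
    """Illustrative: one stable job ID per dashboard, so re-scheduling replaces the job."""
    return f"llm_val_{dashboard_id}"


if __name__ == "__main__":
    print(validation_job_id("42"))
    print(cron_to_kwargs("0 6 * * 1"))
```

Because the job ID depends only on the dashboard ID, scheduling the same dashboard twice with `replace_existing=True` leaves a single job with the latest cron expression.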


@@ -0,0 +1,629 @@
# [DEF:backend/src/plugins/llm_analysis/service.py:Module]
# @TIER: STANDARD
# @SEMANTICS: service, llm, screenshot, playwright, openai
# @PURPOSE: Services for LLM interaction and dashboard screenshots.
# @LAYER: Domain
# @RELATION: DEPENDS_ON -> playwright
# @RELATION: DEPENDS_ON -> openai
# @RELATION: DEPENDS_ON -> tenacity
# @INVARIANT: Screenshots must be 1920px width and capture full page height.
import asyncio
import base64
import json
import io
from typing import List, Optional, Dict, Any
from PIL import Image
from playwright.async_api import async_playwright
from openai import AsyncOpenAI, RateLimitError, AuthenticationError as OpenAIAuthenticationError
from tenacity import retry, stop_after_attempt, wait_exponential, retry_if_exception
from .models import LLMProviderType, ValidationResult, ValidationStatus, DetectedIssue
from ...core.logger import belief_scope, logger
from ...core.config_models import Environment
# [DEF:ScreenshotService:Class]
# @PURPOSE: Handles capturing screenshots of Superset dashboards.
class ScreenshotService:
# [DEF:ScreenshotService.__init__:Function]
# @PURPOSE: Initializes the ScreenshotService with environment configuration.
# @PRE: env is a valid Environment object.
def __init__(self, env: Environment):
self.env = env
# [/DEF:ScreenshotService.__init__:Function]
# [DEF:ScreenshotService.capture_dashboard:Function]
# @PURPOSE: Captures a full-page screenshot of a dashboard using Playwright and CDP.
# @PRE: dashboard_id is a valid string, output_path is a writable path.
# @POST: Returns True if screenshot is saved successfully.
# @SIDE_EFFECT: Launches a browser, performs UI login, switches tabs, and writes a PNG file.
# @UX_STATE: [Navigating] -> Loading dashboard UI
# @UX_STATE: [TabSwitching] -> Iterating through dashboard tabs to trigger lazy loading
# @UX_STATE: [CalculatingHeight] -> Determining dashboard dimensions
# @UX_STATE: [Capturing] -> Executing CDP screenshot
async def capture_dashboard(self, dashboard_id: str, output_path: str) -> bool:
with belief_scope("capture_dashboard", f"dashboard_id={dashboard_id}"):
logger.info(f"Capturing screenshot for dashboard {dashboard_id}")
async with async_playwright() as p:
browser = await p.chromium.launch(
headless=True,
args=[
"--disable-blink-features=AutomationControlled",
"--disable-infobars",
"--no-sandbox"
]
)
# Set a realistic user agent to avoid 403 Forbidden from OpenResty/WAF
user_agent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36"
# Construct base UI URL from environment (strip /api/v1 suffix)
base_ui_url = self.env.url.rstrip("/")
if base_ui_url.endswith("/api/v1"):
base_ui_url = base_ui_url[:-len("/api/v1")]
# Create browser context with realistic headers
context = await browser.new_context(
viewport={'width': 1280, 'height': 720},
user_agent=user_agent,
extra_http_headers={
"Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.7",
"Accept-Language": "ru-RU,ru;q=0.9,en-US;q=0.8,en;q=0.7",
"Upgrade-Insecure-Requests": "1",
"Sec-Fetch-Dest": "document",
"Sec-Fetch-Mode": "navigate",
"Sec-Fetch-Site": "none",
"Sec-Fetch-User": "?1"
}
)
logger.info("Browser context created successfully")
page = await context.new_page()
# Bypass navigator.webdriver detection
await page.add_init_script("delete Object.getPrototypeOf(navigator).webdriver")
# 1. Navigate to login page and authenticate
login_url = f"{base_ui_url.rstrip('/')}/login/"
logger.info(f"[DEBUG] Navigating to login page: {login_url}")
response = await page.goto(login_url, wait_until="networkidle", timeout=60000)
if response:
logger.info(f"[DEBUG] Login page response status: {response.status}")
# Wait for login form to be ready
await page.wait_for_load_state("domcontentloaded")
# More exhaustive list of selectors for various Superset versions/themes
selectors = {
"username": ['input[name="username"]', 'input#username', 'input[placeholder*="Username"]', 'input[type="text"]'],
"password": ['input[name="password"]', 'input#password', 'input[placeholder*="Password"]', 'input[type="password"]'],
"submit": ['button[type="submit"]', 'button#submit', '.btn-primary', 'input[type="submit"]']
}
logger.info(f"[DEBUG] Attempting to find login form elements...")
try:
# Find and fill username
u_selector = None
for s in selectors["username"]:
count = await page.locator(s).count()
logger.info(f"[DEBUG] Selector '{s}': {count} elements found")
if count > 0:
u_selector = s
break
if not u_selector:
# Log all input fields on the page for debugging
all_inputs = await page.locator('input').all()
logger.info(f"[DEBUG] Found {len(all_inputs)} input fields on page")
for i, inp in enumerate(all_inputs[:5]): # Log first 5
inp_type = await inp.get_attribute('type')
inp_name = await inp.get_attribute('name')
inp_id = await inp.get_attribute('id')
logger.info(f"[DEBUG] Input {i}: type={inp_type}, name={inp_name}, id={inp_id}")
raise RuntimeError("Could not find username input field on login page")
logger.info(f"[DEBUG] Filling username field with selector: {u_selector}")
await page.fill(u_selector, self.env.username)
# Find and fill password
p_selector = None
for s in selectors["password"]:
if await page.locator(s).count() > 0:
p_selector = s
break
if not p_selector:
raise RuntimeError("Could not find password input field on login page")
logger.info(f"[DEBUG] Filling password field with selector: {p_selector}")
await page.fill(p_selector, self.env.password)
# Click submit
s_selector = selectors["submit"][0]
for s in selectors["submit"]:
if await page.locator(s).count() > 0:
s_selector = s
break
logger.info(f"[DEBUG] Clicking submit button with selector: {s_selector}")
await page.click(s_selector)
# Wait for navigation after login
await page.wait_for_load_state("networkidle", timeout=30000)
# Check if login was successful
if "/login" in page.url:
# Check for error messages on page
error_msg = await page.locator(".alert-danger, .error-message").text_content() if await page.locator(".alert-danger, .error-message").count() > 0 else "Unknown error"
logger.error(f"[DEBUG] Login failed. Still on login page. Error: {error_msg}")
debug_path = output_path.replace(".png", "_debug_failed_login.png")
await page.screenshot(path=debug_path)
raise RuntimeError(f"Login failed: {error_msg}. Debug screenshot saved to {debug_path}")
logger.info(f"[DEBUG] Login successful. Current URL: {page.url}")
# Check cookies after successful login
page_cookies = await context.cookies()
logger.info(f"[DEBUG] Cookies after login: {len(page_cookies)}")
for c in page_cookies:
logger.info(f"[DEBUG] Cookie: name={c['name']}, domain={c['domain']}, value={c.get('value', '')[:20]}...")
except Exception as e:
page_title = await page.title()
logger.error(f"UI Login failed. Page title: {page_title}, URL: {page.url}, Error: {str(e)}")
debug_path = output_path.replace(".png", "_debug_failed_login.png")
await page.screenshot(path=debug_path)
raise RuntimeError(f"Login failed: {str(e)}. Debug screenshot saved to {debug_path}")
# 2. Navigate to dashboard
# @UX_STATE: [Navigating] -> Loading dashboard UI
dashboard_url = f"{base_ui_url.rstrip('/')}/superset/dashboard/{dashboard_id}/?standalone=true"
if base_ui_url.startswith("https://") and dashboard_url.startswith("http://"):
dashboard_url = dashboard_url.replace("http://", "https://")
logger.info(f"[DEBUG] Navigating to dashboard: {dashboard_url}")
# Use networkidle to ensure all initial assets are loaded
response = await page.goto(dashboard_url, wait_until="networkidle", timeout=60000)
if response:
logger.info(f"[DEBUG] Dashboard navigation response status: {response.status}, URL: {response.url}")
try:
# Wait for the dashboard grid to be present
await page.wait_for_selector('.dashboard-component, .dashboard-header, [data-test="dashboard-grid"]', timeout=30000)
logger.info(f"[DEBUG] Dashboard container loaded")
# Wait for charts to finish loading (Superset uses loading spinners/skeletons)
# We wait until loading indicators disappear or a timeout occurs
try:
# Wait for loading indicators to disappear
await page.wait_for_selector('.loading, .ant-skeleton, .spinner', state="hidden", timeout=60000)
logger.info(f"[DEBUG] Loading indicators hidden")
except Exception:
logger.warning(f"[DEBUG] Timeout waiting for loading indicators to hide")
# Wait for charts to actually render their content (e.g., ECharts, NVD3)
# We look for common chart containers that should have content
try:
await page.wait_for_selector('.chart-container canvas, .slice_container svg, .superset-chart-canvas, .grid-content .chart-container', timeout=60000)
logger.info(f"[DEBUG] Chart content detected")
except Exception:
logger.warning(f"[DEBUG] Timeout waiting for chart content")
# Additional check: wait for all chart containers to have non-empty content
logger.info(f"[DEBUG] Waiting for all charts to have rendered content...")
await page.wait_for_function("""() => {
const charts = document.querySelectorAll('.chart-container, .slice_container');
if (charts.length === 0) return true; // No charts to wait for
// Check if all charts have rendered content (canvas, svg, or non-empty div)
return Array.from(charts).every(chart => {
const hasCanvas = chart.querySelector('canvas') !== null;
const hasSvg = chart.querySelector('svg') !== null;
const hasContent = chart.innerText.trim().length > 0 || chart.children.length > 0;
return hasCanvas || hasSvg || hasContent;
});
}""", timeout=60000)
logger.info(f"[DEBUG] All charts have rendered content")
# Scroll to bottom and back to top to trigger lazy loading of all charts
logger.info(f"[DEBUG] Scrolling to trigger lazy loading...")
await page.evaluate("""async () => {
const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
for (let i = 0; i < document.body.scrollHeight; i += 500) {
window.scrollTo(0, i);
await delay(100);
}
window.scrollTo(0, 0);
await delay(500);
}""")
except Exception as e:
logger.warning(f"[DEBUG] Dashboard content wait failed: {e}, proceeding anyway after delay")
# Final stabilization delay - increased for complex dashboards
logger.info(f"[DEBUG] Final stabilization delay...")
await asyncio.sleep(15)
# Logic to handle tabs and full-page capture
try:
# 1. Handle Tabs (Recursive switching)
# @UX_STATE: [TabSwitching] -> Iterating through dashboard tabs to trigger lazy loading
processed_tabs = set()
async def switch_tabs(depth=0):
if depth > 3: return # Limit recursion depth
tab_selectors = [
'.ant-tabs-nav-list .ant-tabs-tab',
'.dashboard-component-tabs .ant-tabs-tab',
'[data-test="dashboard-component-tabs"] .ant-tabs-tab'
]
found_tabs = []
for selector in tab_selectors:
found_tabs = await page.locator(selector).all()
if found_tabs: break
if found_tabs:
logger.info(f"[DEBUG][TabSwitching] Found {len(found_tabs)} tabs at depth {depth}")
for i, tab in enumerate(found_tabs):
try:
tab_text = (await tab.inner_text()).strip()
tab_id = f"{depth}_{i}_{tab_text}"
if tab_id in processed_tabs:
continue
if await tab.is_visible():
logger.info(f"[DEBUG][TabSwitching] Switching to tab: {tab_text}")
processed_tabs.add(tab_id)
is_active = "ant-tabs-tab-active" in (await tab.get_attribute("class") or "")
if not is_active:
await tab.click()
await asyncio.sleep(2) # Wait for content to render
await switch_tabs(depth + 1)
except Exception as tab_e:
logger.warning(f"[DEBUG][TabSwitching] Failed to process tab {i}: {tab_e}")
try:
first_tab = found_tabs[0]
if "ant-tabs-tab-active" not in (await first_tab.get_attribute("class") or ""):
await first_tab.click()
await asyncio.sleep(1)
except Exception: pass
await switch_tabs()
# 2. Calculate full height for screenshot
# @UX_STATE: [CalculatingHeight] -> Determining dashboard dimensions
full_height = await page.evaluate("""() => {
const body = document.body;
const html = document.documentElement;
const dashboardContent = document.querySelector('.dashboard-content');
return Math.max(
body.scrollHeight, body.offsetHeight,
html.clientHeight, html.scrollHeight, html.offsetHeight,
dashboardContent ? dashboardContent.scrollHeight + 100 : 0
);
}""")
logger.info(f"[DEBUG] Calculated full height: {full_height}")
# DIAGNOSTIC: Count chart elements before resize
chart_count_before = await page.evaluate("""() => {
return {
chartContainers: document.querySelectorAll('.chart-container, .slice_container').length,
canvasElements: document.querySelectorAll('canvas').length,
svgElements: document.querySelectorAll('.chart-container svg, .slice_container svg').length,
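// ':visible' is a jQuery extension, not valid CSS: querySelectorAll throws on it, so filter on layout instead
visibleCharts: Array.from(document.querySelectorAll('.chart-container, .slice_container')).filter(el => el.offsetParent !== null).length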
};
}""")
logger.info(f"[DIAGNOSTIC] Chart elements BEFORE viewport resize: {chart_count_before}")
# DIAGNOSTIC: Capture pre-resize screenshot for comparison
pre_resize_path = output_path.replace(".png", "_preresize.png")
try:
await page.screenshot(path=pre_resize_path, full_page=False, timeout=10000)
import os
pre_resize_size = os.path.getsize(pre_resize_path) if os.path.exists(pre_resize_path) else 0
logger.info(f"[DIAGNOSTIC] Pre-resize screenshot saved: {pre_resize_path} ({pre_resize_size} bytes)")
except Exception as pre_e:
logger.warning(f"[DIAGNOSTIC] Failed to capture pre-resize screenshot: {pre_e}")
logger.info(f"[DIAGNOSTIC] Resizing viewport from current to 1920x{int(full_height)}")
await page.set_viewport_size({"width": 1920, "height": int(full_height)})
# DIAGNOSTIC: Increased wait time and log timing
logger.info("[DIAGNOSTIC] Waiting 10 seconds after viewport resize for re-render...")
await asyncio.sleep(10)
logger.info("[DIAGNOSTIC] Wait completed")
# DIAGNOSTIC: Count chart elements after resize and wait
chart_count_after = await page.evaluate("""() => {
return {
chartContainers: document.querySelectorAll('.chart-container, .slice_container').length,
canvasElements: document.querySelectorAll('canvas').length,
svgElements: document.querySelectorAll('.chart-container svg, .slice_container svg').length,
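// ':visible' is a jQuery extension, not valid CSS: querySelectorAll throws on it, so filter on layout instead
visibleCharts: Array.from(document.querySelectorAll('.chart-container, .slice_container')).filter(el => el.offsetParent !== null).length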
};
}""")
logger.info(f"[DIAGNOSTIC] Chart elements AFTER viewport resize + wait: {chart_count_after}")
# DIAGNOSTIC: Check if any charts have error states
chart_errors = await page.evaluate("""() => {
const errors = [];
document.querySelectorAll('.chart-container, .slice_container').forEach((chart, i) => {
const errorEl = chart.querySelector('.error, .alert-danger, .ant-alert-error');
if (errorEl) {
errors.push({index: i, text: errorEl.innerText.substring(0, 100)});
}
});
return errors;
}""")
if chart_errors:
logger.warning(f"[DIAGNOSTIC] Charts with error states detected: {chart_errors}")
else:
logger.info("[DIAGNOSTIC] No chart error states detected")
# 3. Take screenshot using CDP to bypass Playwright's font loading wait
# @UX_STATE: [Capturing] -> Executing CDP screenshot
logger.info("[DEBUG] Attempting full-page screenshot via CDP...")
cdp = await page.context.new_cdp_session(page)
screenshot_data = await cdp.send("Page.captureScreenshot", {
"format": "png",
"fromSurface": True,
"captureBeyondViewport": True
})
image_data = base64.b64decode(screenshot_data["data"])
with open(output_path, 'wb') as f:
f.write(image_data)
# DIAGNOSTIC: Verify screenshot file
import os
final_size = os.path.getsize(output_path) if os.path.exists(output_path) else 0
logger.info(f"[DIAGNOSTIC] Final screenshot saved: {output_path}")
logger.info(f"[DIAGNOSTIC] Final screenshot size: {final_size} bytes ({final_size / 1024:.2f} KB)")
# DIAGNOSTIC: Get image dimensions
try:
with Image.open(output_path) as final_img:
logger.info(f"[DIAGNOSTIC] Final screenshot dimensions: {final_img.width}x{final_img.height}")
except Exception as img_err:
logger.warning(f"[DIAGNOSTIC] Could not read final image dimensions: {img_err}")
logger.info(f"Full-page screenshot saved to {output_path} (via CDP)")
except Exception as e:
logger.error(f"[DEBUG] Full-page/Tab capture failed: {e}")
try:
await page.screenshot(path=output_path, full_page=True, timeout=10000)
except Exception as e2:
logger.error(f"[DEBUG] Fallback screenshot also failed: {e2}")
await page.screenshot(path=output_path, timeout=5000)
await browser.close()
return True
# [/DEF:ScreenshotService.capture_dashboard:Function]
# [/DEF:ScreenshotService:Class]
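The dashboard URL used by `capture_dashboard` is derived from the environment's API URL by trimming a trailing `/api/v1` and appending the standalone dashboard path. A small sketch of that derivation as a pure function follows. `dashboard_ui_url` is a hypothetical name used only for illustration, and the host in the usage note is a placeholder.

```python
def dashboard_ui_url(env_url: str, dashboard_id: str) -> str:
    """Hypothetical helper: build the standalone dashboard URL from an API base URL."""
    base = env_url.rstrip("/")
    suffix = "/api/v1"
    if base.endswith(suffix):
        # The environment stores the REST base URL; the UI lives one level above it.
        base = base[: -len(suffix)]
    return f"{base}/superset/dashboard/{dashboard_id}/?standalone=true"


if __name__ == "__main__":
    print(dashboard_ui_url("https://bi.example.com/api/v1/", "42"))
```

Keeping the derivation in one function makes it easy to cover both URL shapes (with and without the API suffix) in a unit test without launching a browser.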
# [DEF:LLMClient:Class]
# @PURPOSE: Wrapper for LLM provider APIs.
class LLMClient:
# [DEF:LLMClient.__init__:Function]
# @PURPOSE: Initializes the LLMClient with provider settings.
# @PRE: api_key, base_url, and default_model are non-empty strings.
def __init__(self, provider_type: LLMProviderType, api_key: str, base_url: str, default_model: str):
self.provider_type = provider_type
self.api_key = api_key
self.base_url = base_url
self.default_model = default_model
# DEBUG: Log initialization parameters (without exposing full API key)
logger.info(f"[LLMClient.__init__] Initializing LLM client:")
logger.info(f"[LLMClient.__init__] Provider Type: {provider_type}")
logger.info(f"[LLMClient.__init__] Base URL: {base_url}")
logger.info(f"[LLMClient.__init__] Default Model: {default_model}")
logger.info(f"[LLMClient.__init__] API Key (first 8 chars): {api_key[:8] if api_key and len(api_key) > 8 else 'EMPTY_OR_NONE'}...")
logger.info(f"[LLMClient.__init__] API Key Length: {len(api_key) if api_key else 0}")
self.client = AsyncOpenAI(api_key=api_key, base_url=base_url)
# [/DEF:LLMClient.__init__:Function]
# [DEF:LLMClient.get_json_completion:Function]
# @PURPOSE: Helper to handle LLM calls with JSON mode and fallback parsing.
# @PRE: messages is a list of valid message dictionaries.
# @POST: Returns a parsed JSON dictionary.
# @SIDE_EFFECT: Calls external LLM API.
def _should_retry(exception: Exception) -> bool:
"""Custom retry predicate that excludes authentication errors."""
# Don't retry on authentication errors
if isinstance(exception, OpenAIAuthenticationError):
return False
# Retry on rate limit errors and every other regular exception
# (RateLimitError is itself an Exception, so a single check covers both)
return isinstance(exception, Exception)
@retry(
stop=stop_after_attempt(5),
wait=wait_exponential(multiplier=2, min=5, max=60),
retry=retry_if_exception(_should_retry),
reraise=True
)
async def get_json_completion(self, messages: List[Dict[str, Any]]) -> Dict[str, Any]:
with belief_scope("get_json_completion"):
response = None
try:
try:
logger.info(f"[get_json_completion] Attempting LLM call with JSON mode for model: {self.default_model}")
logger.info(f"[get_json_completion] Base URL being used: {self.base_url}")
logger.info(f"[get_json_completion] Number of messages: {len(messages)}")
logger.info(f"[get_json_completion] API Key present: {bool(self.api_key and len(self.api_key) > 0)}")
response = await self.client.chat.completions.create(
model=self.default_model,
messages=messages,
response_format={"type": "json_object"}
)
except Exception as e:
if "JSON mode is not enabled" in str(e) or "400" in str(e):
logger.warning(f"[get_json_completion] JSON mode failed or not supported: {str(e)}. Falling back to plain text response.")
response = await self.client.chat.completions.create(
model=self.default_model,
messages=messages
)
else:
raise e
logger.debug(f"[get_json_completion] LLM Response: {response}")
except OpenAIAuthenticationError as e:
logger.error(f"[get_json_completion] Authentication error: {str(e)}")
# Do not retry on auth errors - re-raise to stop retry
raise
except RateLimitError as e:
logger.warning(f"[get_json_completion] Rate limit hit: {str(e)}")
# Extract retry_delay from error metadata if available
retry_delay = 5.0 # Default fallback
try:
# Based on logs, the raw response is in e.body or e.response.json()
# The logs show 'metadata': {'raw': '...'} which suggests a proxy or specific client wrapper
# Let's try to find the 'retryDelay' in the error message or response
import re
# Try to find "retryDelay": "XXs" in the string representation of the error
error_str = str(e)
match = re.search(r'"retryDelay":\s*"(\d+)s"', error_str)
if match:
retry_delay = float(match.group(1))
else:
# Try to parse from response if it's a standard OpenAI-like error with body
if hasattr(e, 'body') and isinstance(e.body, dict):
# Some providers put it in details
details = e.body.get('error', {}).get('details', [])
for detail in details:
if detail.get('@type') == 'type.googleapis.com/google.rpc.RetryInfo':
delay_str = detail.get('retryDelay', '5s')
retry_delay = float(delay_str.rstrip('s'))
break
except Exception as parse_e:
logger.debug(f"[get_json_completion] Failed to parse retry delay: {parse_e}")
# Add a small safety margin (0.5s) on top of the provider's suggested delay
wait_time = retry_delay + 0.5
logger.info(f"[get_json_completion] Waiting for {wait_time}s before retry...")
await asyncio.sleep(wait_time)
raise
except Exception as e:
logger.error(f"[get_json_completion] LLM call failed: {str(e)}")
raise
if not response or not hasattr(response, 'choices') or not response.choices:
raise RuntimeError(f"Invalid LLM response: {response}")
content = response.choices[0].message.content
logger.debug(f"[get_json_completion] Raw content to parse: {content}")
try:
return json.loads(content)
except json.JSONDecodeError:
logger.warning("[get_json_completion] Failed to parse JSON directly, attempting to extract from code blocks")
if "```json" in content:
json_str = content.split("```json")[1].split("```")[0].strip()
return json.loads(json_str)
elif "```" in content:
json_str = content.split("```")[1].split("```")[0].strip()
return json.loads(json_str)
else:
raise
# [/DEF:LLMClient.get_json_completion:Function]
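The fallback at the end of `get_json_completion` handles models that wrap their JSON in a fenced block instead of returning a bare object. The sketch below isolates that idea in a standalone function. `extract_json` is a hypothetical helper, and it assumes the fence opener sits on its own line (optionally followed by a language tag), which is slightly stricter than the string splitting above.

```python
import json
from typing import Any

FENCE = "`" * 3  # three backticks, built indirectly so this example stays fence-safe


def extract_json(content: str) -> Any:
    """Hypothetical helper: parse JSON directly, else from the first fenced block."""
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        pass
    start = content.find(FENCE)
    if start == -1:
        raise ValueError("no JSON value or fenced block found in LLM output")
    # Skip the opening fence line, including an optional language tag such as "json".
    body_start = content.find("\n", start)
    if body_start == -1:
        raise ValueError("fenced block has no body")
    end = content.find(FENCE, body_start)
    if end == -1:
        raise ValueError("unterminated fenced block in LLM output")
    return json.loads(content[body_start:end].strip())
```

Raising a single `ValueError` for every unparseable shape gives the caller one exception type to handle; `json.JSONDecodeError` is already a subclass of it.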
# [DEF:LLMClient.analyze_dashboard:Function]
# @PURPOSE: Sends dashboard data (screenshot + logs) to LLM for health analysis.
# @PRE: screenshot_path exists, logs is a list of strings.
# @POST: Returns a structured analysis dictionary (status, summary, issues).
# @SIDE_EFFECT: Reads screenshot file and calls external LLM API.
async def analyze_dashboard(self, screenshot_path: str, logs: List[str]) -> Dict[str, Any]:
with belief_scope("analyze_dashboard"):
# Optimize image to reduce token count (US1 / T023)
# Gemini/Gemma models have limits on input tokens, and large images contribute significantly.
try:
with Image.open(screenshot_path) as img:
# Convert to RGB if necessary
if img.mode in ("RGBA", "P"):
img = img.convert("RGB")
# Resize if too large (max 1024px width while maintaining aspect ratio)
# We reduce width further to 1024px to stay within token limits for long dashboards
max_width = 1024
if img.width > max_width or img.height > 2048:
# Calculate scaling factor to fit within 1024x2048
scale = min(max_width / img.width, 2048 / img.height)
if scale < 1.0:
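# Remember the source size first: after resize() img already has the new dimensions
original_width, original_height = img.width, img.height
new_width = int(img.width * scale)
new_height = int(img.height * scale)
img = img.resize((new_width, new_height), Image.Resampling.LANCZOS)
logger.info(f"[analyze_dashboard] Resized image from {original_width}x{original_height} to {new_width}x{new_height}")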
# Compress and convert to base64
buffer = io.BytesIO()
# Lower quality to 60% to further reduce payload size
img.save(buffer, format="JPEG", quality=60, optimize=True)
base_64_image = base64.b64encode(buffer.getvalue()).decode('utf-8')
logger.info(f"[analyze_dashboard] Optimized image size: {len(buffer.getvalue()) / 1024:.2f} KB")
except Exception as img_e:
logger.warning(f"[analyze_dashboard] Image optimization failed: {img_e}. Using raw image.")
with open(screenshot_path, "rb") as image_file:
base_64_image = base64.b64encode(image_file.read()).decode('utf-8')
log_text = "\n".join(logs)
prompt = f"""
Analyze the attached dashboard screenshot and the following execution logs for health and visual issues.
Logs:
{log_text}
Provide the analysis in JSON format with the following structure:
{{
"status": "PASS" | "WARN" | "FAIL",
"summary": "Short summary of findings",
"issues": [
{{
"severity": "WARN" | "FAIL",
"message": "Description of the issue",
"location": "Optional location info (e.g. chart name)"
}}
]
}}
"""
messages = [
{
"role": "user",
"content": [
{"type": "text", "text": prompt},
{
"type": "image_url",
"image_url": {
"url": f"data:image/jpeg;base64,{base_64_image}"
}
}
]
}
]
try:
return await self.get_json_completion(messages)
except Exception as e:
logger.error(f"[analyze_dashboard] Failed to get analysis: {str(e)}")
return {
"status": "FAIL",
"summary": f"Failed to get response from LLM: {str(e)}",
"issues": [{"severity": "FAIL", "message": "LLM provider returned empty or invalid response"}]
}
# [/DEF:LLMClient.analyze_dashboard:Function]
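The image optimization in `analyze_dashboard` scales the screenshot so that it fits inside a 1024x2048 box while keeping its aspect ratio. The arithmetic can be sketched without Pillow as below. `fit_within` is a hypothetical helper, and it uses `round()` rather than `int()` to avoid losing a pixel to floating-point truncation, which is a deliberate deviation from the code above.

```python
from typing import Tuple


def fit_within(width: int, height: int, max_width: int = 1024, max_height: int = 2048) -> Tuple[int, int]:
    """Hypothetical helper: shrink (never enlarge) a size to fit the given box."""
    # The tighter of the two constraints decides the scale; 1.0 prevents upscaling.
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    # A long 1920px-wide dashboard capture is limited by height, not width.
    print(fit_within(1920, 7680))
```

For tall dashboards the height limit dominates, so the resulting width can end up well below 1024px; that trade-off follows from the box constraint in the original code.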
# [/DEF:LLMClient:Class]
# [/DEF:backend/src/plugins/llm_analysis/service.py:Module]
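The rate-limit branch of `get_json_completion` looks for a provider-suggested delay in two places: a `"retryDelay": "NNs"` fragment in the error text, and a `RetryInfo` entry in the error body. A standalone sketch of that lookup is shown here. `parse_retry_delay` is a hypothetical helper, and accepting fractional seconds is an assumption that goes beyond the integer-only pattern in the code above.

```python
import re
from typing import Any

RETRY_INFO_TYPE = "type.googleapis.com/google.rpc.RetryInfo"


def parse_retry_delay(error_text: str, body: Any = None, default: float = 5.0) -> float:
    """Hypothetical helper: extract a suggested retry delay in seconds."""
    # First choice: a '"retryDelay": "17s"' fragment anywhere in the error text.
    match = re.search(r'"retryDelay":\s*"(\d+(?:\.\d+)?)s"', error_text)
    if match:
        return float(match.group(1))
    # Second choice: a structured RetryInfo detail in the error body.
    if isinstance(body, dict):
        error = body.get("error")
        details = error.get("details", []) if isinstance(error, dict) else []
        for detail in details:
            if isinstance(detail, dict) and detail.get("@type") == RETRY_INFO_TYPE:
                return float(str(detail.get("retryDelay", f"{default}s")).rstrip("s"))
    return default
```

Returning a default instead of raising keeps the retry path alive when a provider formats its error differently.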


@@ -0,0 +1,204 @@
# [DEF:MapperPluginModule:Module]
# @SEMANTICS: plugin, mapper, datasets, postgresql, excel
# @PURPOSE: Implements a plugin for mapping dataset columns using external database connections or Excel files.
# @LAYER: Plugins
# @RELATION: Inherits from PluginBase. Uses DatasetMapper from superset_tool.
# @CONSTRAINT: Must use belief_scope for logging.
# [SECTION: IMPORTS]
from typing import Dict, Any, Optional
from ..core.plugin_base import PluginBase
from ..core.superset_client import SupersetClient
from ..core.logger import logger, belief_scope
from ..core.database import SessionLocal
from ..models.connection import ConnectionConfig
from ..core.utils.dataset_mapper import DatasetMapper
# [/SECTION]
# [DEF:MapperPlugin:Class]
# @PURPOSE: Plugin for mapping dataset columns verbose names.
class MapperPlugin(PluginBase):
"""
Plugin for mapping dataset columns verbose names.
"""
@property
# [DEF:id:Function]
# @PURPOSE: Returns the unique identifier for the mapper plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string ID.
# @RETURN: str - "dataset-mapper"
def id(self) -> str:
with belief_scope("id"):
return "dataset-mapper"
# [/DEF:id:Function]
@property
# [DEF:name:Function]
# @PURPOSE: Returns the human-readable name of the mapper plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string name.
# @RETURN: str - Plugin name.
def name(self) -> str:
with belief_scope("name"):
return "Dataset Mapper"
# [/DEF:name:Function]
@property
# [DEF:description:Function]
# @PURPOSE: Returns a description of the mapper plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string description.
# @RETURN: str - Plugin description.
def description(self) -> str:
with belief_scope("description"):
return "Map dataset column verbose names using PostgreSQL comments or Excel files."
# [/DEF:description:Function]
@property
# [DEF:version:Function]
# @PURPOSE: Returns the version of the mapper plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string version.
# @RETURN: str - "1.0.0"
def version(self) -> str:
with belief_scope("version"):
return "1.0.0"
# [/DEF:version:Function]
@property
# [DEF:ui_route:Function]
# @PURPOSE: Returns the frontend route for the mapper plugin.
# @RETURN: str - "/tools/mapper"
def ui_route(self) -> str:
with belief_scope("ui_route"):
return "/tools/mapper"
# [/DEF:ui_route:Function]
# [DEF:get_schema:Function]
# @PURPOSE: Returns the JSON schema for the mapper plugin parameters.
# @PRE: Plugin instance exists.
# @POST: Returns dictionary schema.
# @RETURN: Dict[str, Any] - JSON schema.
def get_schema(self) -> Dict[str, Any]:
with belief_scope("get_schema"):
return {
"type": "object",
"properties": {
"env": {
"type": "string",
"title": "Environment",
"description": "The Superset environment (e.g., 'dev')."
},
"dataset_id": {
"type": "integer",
"title": "Dataset ID",
"description": "The ID of the dataset to update."
},
"source": {
"type": "string",
"title": "Mapping Source",
"enum": ["postgres", "excel"],
"default": "postgres"
},
"connection_id": {
"type": "string",
"title": "Saved Connection",
"description": "The ID of a saved database connection (for postgres source)."
},
"table_name": {
"type": "string",
"title": "Table Name",
"description": "Target table name in PostgreSQL."
},
"table_schema": {
"type": "string",
"title": "Table Schema",
"description": "Target table schema in PostgreSQL.",
"default": "public"
},
"excel_path": {
"type": "string",
"title": "Excel Path",
"description": "Path to the Excel file (for excel source)."
}
},
"required": ["env", "dataset_id", "source"]
}
# [/DEF:get_schema:Function]
# [DEF:execute:Function]
# @PURPOSE: Executes the dataset mapping logic.
# @PARAM: params (Dict[str, Any]) - Mapping parameters.
# @PRE: Params contain valid 'env', 'dataset_id', and 'source'. params must be a dictionary.
# @POST: Updates the dataset in Superset.
# @RETURN: Dict[str, Any] - Execution status.
async def execute(self, params: Dict[str, Any]) -> Dict[str, Any]:
with belief_scope("execute"):
env_name = params.get("env")
dataset_id = params.get("dataset_id")
source = params.get("source")
if not env_name or dataset_id is None or not source:
logger.error("[MapperPlugin.execute][State] Missing required parameters.")
raise ValueError("Missing required parameters: env, dataset_id, source")
# Get config and initialize client
from ..dependencies import get_config_manager
config_manager = get_config_manager()
env_config = config_manager.get_environment(env_name)
if not env_config:
logger.error(f"[MapperPlugin.execute][State] Environment '{env_name}' not found.")
raise ValueError(f"Environment '{env_name}' not found in configuration.")
client = SupersetClient(env_config)
client.authenticate()
postgres_config = None
if source == "postgres":
connection_id = params.get("connection_id")
if not connection_id:
logger.error("[MapperPlugin.execute][State] connection_id is required for postgres source.")
raise ValueError("connection_id is required for postgres source.")
# Load connection from DB
db = SessionLocal()
try:
conn_config = db.query(ConnectionConfig).filter(ConnectionConfig.id == connection_id).first()
if not conn_config:
logger.error(f"[MapperPlugin.execute][State] Connection {connection_id} not found.")
raise ValueError(f"Connection {connection_id} not found.")
postgres_config = {
'dbname': conn_config.database,
'user': conn_config.username,
'password': conn_config.password,
'host': conn_config.host,
'port': str(conn_config.port) if conn_config.port else '5432'
}
finally:
db.close()
logger.info(f"[MapperPlugin.execute][Action] Starting mapping for dataset {dataset_id} in {env_name}")
mapper = DatasetMapper()
try:
mapper.run_mapping(
superset_client=client,
dataset_id=dataset_id,
source=source,
postgres_config=postgres_config,
excel_path=params.get("excel_path"),
table_name=params.get("table_name"),
table_schema=params.get("table_schema") or "public"
)
logger.info(f"[MapperPlugin.execute][Success] Mapping completed for dataset {dataset_id}")
return {"status": "success", "dataset_id": dataset_id}
except Exception as e:
logger.error(f"[MapperPlugin.execute][Failure] Mapping failed: {e}")
raise
# [/DEF:execute:Function]
# [/DEF:MapperPlugin:Class]
# [/DEF:MapperPluginModule:Module]
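
The parameter check at the top of `MapperPlugin.execute` tests `dataset_id` with `is None` rather than for truthiness, which keeps a numeric ID of `0` from being rejected as missing. A minimal standalone sketch of that check, assuming a hypothetical helper name `check_required` that is not part of the plugin:

```python
from typing import Any, Dict


def check_required(params: Dict[str, Any]) -> None:
    """Hypothetical helper mirroring the checks at the top of MapperPlugin.execute."""
    env_name = params.get("env")
    dataset_id = params.get("dataset_id")
    source = params.get("source")
    # `dataset_id is None` (rather than `not dataset_id`) keeps an ID of 0 valid,
    # while empty strings for env/source are still treated as missing.
    if not env_name or dataset_id is None or not source:
        raise ValueError("Missing required parameters: env, dataset_id, source")
```

Calling `check_required({"env": "dev", "dataset_id": 0, "source": "postgres"})` passes, while omitting `dataset_id` raises `ValueError`.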


@@ -12,38 +12,82 @@ import zipfile
import re import re
from ..core.plugin_base import PluginBase from ..core.plugin_base import PluginBase
from superset_tool.client import SupersetClient from ..core.logger import belief_scope
from superset_tool.utils.init_clients import setup_clients from ..core.superset_client import SupersetClient
from superset_tool.utils.fileio import create_temp_file, update_yamls, create_dashboard_export from ..core.utils.fileio import create_temp_file, update_yamls, create_dashboard_export
from ..dependencies import get_config_manager from ..dependencies import get_config_manager
from superset_tool.utils.logger import SupersetLogger
from ..core.migration_engine import MigrationEngine from ..core.migration_engine import MigrationEngine
from ..core.database import SessionLocal from ..core.database import SessionLocal
from ..models.mapping import DatabaseMapping, Environment from ..models.mapping import DatabaseMapping, Environment
# [DEF:MigrationPlugin:Class]
# @PURPOSE: Implementation of the migration plugin logic.
class MigrationPlugin(PluginBase): class MigrationPlugin(PluginBase):
""" """
A plugin to migrate Superset dashboards between environments. A plugin to migrate Superset dashboards between environments.
""" """
@property @property
# [DEF:id:Function]
# @PURPOSE: Returns the unique identifier for the migration plugin.
# @PRE: None.
# @POST: Returns "superset-migration".
# @RETURN: str - "superset-migration"
def id(self) -> str: def id(self) -> str:
return "superset-migration" with belief_scope("id"):
return "superset-migration"
# [/DEF:id:Function]
@property @property
# [DEF:name:Function]
# @PURPOSE: Returns the human-readable name of the migration plugin.
# @PRE: None.
# @POST: Returns the plugin name.
# @RETURN: str - Plugin name.
def name(self) -> str: def name(self) -> str:
return "Superset Dashboard Migration" with belief_scope("name"):
return "Superset Dashboard Migration"
# [/DEF:name:Function]
@property @property
# [DEF:description:Function]
# @PURPOSE: Returns a description of the migration plugin.
# @PRE: None.
# @POST: Returns the plugin description.
# @RETURN: str - Plugin description.
def description(self) -> str: def description(self) -> str:
return "Migrates dashboards between Superset environments." with belief_scope("description"):
return "Migrates dashboards between Superset environments."
# [/DEF:description:Function]
@property @property
# [DEF:version:Function]
# @PURPOSE: Returns the version of the migration plugin.
# @PRE: None.
# @POST: Returns "1.0.0".
# @RETURN: str - "1.0.0"
def version(self) -> str: def version(self) -> str:
return "1.0.0" with belief_scope("version"):
return "1.0.0"
# [/DEF:version:Function]
@property
# [DEF:ui_route:Function]
# @PURPOSE: Returns the frontend route for the migration plugin.
# @RETURN: str - "/migration"
def ui_route(self) -> str:
with belief_scope("ui_route"):
return "/migration"
# [/DEF:ui_route:Function]
# [DEF:get_schema:Function]
# @PURPOSE: Returns the JSON schema for migration plugin parameters.
# @PRE: Config manager is available.
# @POST: Returns a valid JSON schema dictionary.
# @RETURN: Dict[str, Any] - JSON schema.
def get_schema(self) -> Dict[str, Any]: def get_schema(self) -> Dict[str, Any]:
config_manager = get_config_manager() with belief_scope("get_schema"):
config_manager = get_config_manager()
envs = [e.name for e in config_manager.get_environments()] envs = [e.name for e in config_manager.get_environments()]
return { return {
@@ -85,36 +129,152 @@ class MigrationPlugin(PluginBase):
}, },
"required": ["from_env", "to_env", "dashboard_regex"], "required": ["from_env", "to_env", "dashboard_regex"],
} }
# [/DEF:get_schema:Function]
# [DEF:execute:Function]
# @PURPOSE: Executes the dashboard migration logic.
# @PARAM: params (Dict[str, Any]) - Migration parameters.
# @PRE: Source and target environments must be configured.
# @POST: Selected dashboards are migrated.
async def execute(self, params: Dict[str, Any]): async def execute(self, params: Dict[str, Any]):
from_env = params["from_env"] with belief_scope("MigrationPlugin.execute"):
to_env = params["to_env"] source_env_id = params.get("source_env_id")
dashboard_regex = params["dashboard_regex"] target_env_id = params.get("target_env_id")
selected_ids = params.get("selected_ids")
# Legacy support or alternative params
from_env_name = params.get("from_env")
to_env_name = params.get("to_env")
dashboard_regex = params.get("dashboard_regex")
replace_db_config = params.get("replace_db_config", False) replace_db_config = params.get("replace_db_config", False)
from_db_id = params.get("from_db_id") from_db_id = params.get("from_db_id")
to_db_id = params.get("to_db_id") to_db_id = params.get("to_db_id")
logger = SupersetLogger(log_dir=Path.cwd() / "logs", console=True) # [DEF:MigrationPlugin.execute:Action]
logger.info(f"[MigrationPlugin][Entry] Starting migration from {from_env} to {to_env}.") # @PURPOSE: Execute the migration logic with proper task logging.
task_id = params.get("_task_id")
from ..dependencies import get_task_manager
tm = get_task_manager()
class TaskLoggerProxy:
# [DEF:__init__:Function]
# @PURPOSE: Initializes the proxy logger.
# @PRE: None.
# @POST: Instance is initialized.
def __init__(self):
with belief_scope("__init__"):
# Initialize parent with dummy values since we override methods
pass
# [/DEF:__init__:Function]
# [DEF:debug:Function]
# @PURPOSE: Logs a debug message to the task manager.
# @PRE: msg is a string.
# @POST: Log is added to task manager if task_id exists.
def debug(self, msg, *args, extra=None, **kwargs):
with belief_scope("debug"):
if task_id: tm._add_log(task_id, "DEBUG", msg, extra or {})
# [/DEF:debug:Function]
# [DEF:info:Function]
# @PURPOSE: Logs an info message to the task manager.
# @PRE: msg is a string.
# @POST: Log is added to task manager if task_id exists.
def info(self, msg, *args, extra=None, **kwargs):
with belief_scope("info"):
if task_id: tm._add_log(task_id, "INFO", msg, extra or {})
# [/DEF:info:Function]
# [DEF:warning:Function]
# @PURPOSE: Logs a warning message to the task manager.
# @PRE: msg is a string.
# @POST: Log is added to task manager if task_id exists.
def warning(self, msg, *args, extra=None, **kwargs):
with belief_scope("warning"):
if task_id: tm._add_log(task_id, "WARNING", msg, extra or {})
# [/DEF:warning:Function]
# [DEF:error:Function]
# @PURPOSE: Logs an error message to the task manager.
# @PRE: msg is a string.
# @POST: Log is added to task manager if task_id exists.
def error(self, msg, *args, extra=None, **kwargs):
with belief_scope("error"):
if task_id: tm._add_log(task_id, "ERROR", msg, extra or {})
# [/DEF:error:Function]
# [DEF:critical:Function]
# @PURPOSE: Logs a critical message to the task manager.
# @PRE: msg is a string.
# @POST: Log is added to task manager if task_id exists.
def critical(self, msg, *args, extra=None, **kwargs):
with belief_scope("critical"):
if task_id: tm._add_log(task_id, "ERROR", msg, extra or {})
# [/DEF:critical:Function]
# [DEF:exception:Function]
# @PURPOSE: Logs an exception message to the task manager.
# @PRE: msg is a string.
# @POST: Log is added to task manager if task_id exists.
def exception(self, msg, *args, **kwargs):
with belief_scope("exception"):
if task_id: tm._add_log(task_id, "ERROR", msg, {"exception": True})
# [/DEF:exception:Function]
logger = TaskLoggerProxy()
logger.info(f"[MigrationPlugin][Entry] Starting migration task.")
logger.info(f"[MigrationPlugin][Action] Params: {params}")
try: try:
config_manager = get_config_manager() with belief_scope("execute"):
all_clients = setup_clients(logger, custom_envs=config_manager.get_environments()) config_manager = get_config_manager()
from_c = all_clients.get(from_env) environments = config_manager.get_environments()
to_c = all_clients.get(to_env)
# Resolve environments
src_env = None
tgt_env = None
if source_env_id:
src_env = next((e for e in environments if e.id == source_env_id), None)
elif from_env_name:
src_env = next((e for e in environments if e.name == from_env_name), None)
if target_env_id:
tgt_env = next((e for e in environments if e.id == target_env_id), None)
elif to_env_name:
tgt_env = next((e for e in environments if e.name == to_env_name), None)
if not src_env or not tgt_env:
raise ValueError(f"Could not resolve source or target environment. Source: {source_env_id or from_env_name}, Target: {target_env_id or to_env_name}")
from_env_name = src_env.name
to_env_name = tgt_env.name
logger.info(f"[MigrationPlugin][State] Resolved environments: {from_env_name} -> {to_env_name}")
from_c = SupersetClient(src_env)
to_c = SupersetClient(tgt_env)
if not from_c or not to_c: if not from_c or not to_c:
raise ValueError(f"One or both environments ('{from_env}', '{to_env}') not found in configuration.") raise ValueError(f"Clients not initialized for environments: {from_env_name}, {to_env_name}")
_, all_dashboards = from_c.get_dashboards() _, all_dashboards = from_c.get_dashboards()
regex_str = str(dashboard_regex) dashboards_to_migrate = []
dashboards_to_migrate = [ if selected_ids:
d for d in all_dashboards if re.search(regex_str, d["dashboard_title"], re.IGNORECASE) dashboards_to_migrate = [d for d in all_dashboards if d["id"] in selected_ids]
] elif dashboard_regex:
regex_str = str(dashboard_regex)
dashboards_to_migrate = [
d for d in all_dashboards if re.search(regex_str, d["dashboard_title"], re.IGNORECASE)
]
else:
logger.warning("[MigrationPlugin][State] No selection criteria provided (selected_ids or dashboard_regex).")
return
if not dashboards_to_migrate: if not dashboards_to_migrate:
logger.warning("[MigrationPlugin][State] No dashboards found matching the regex.") logger.warning("[MigrationPlugin][State] No dashboards found matching criteria.")
return return
# Fetch mappings from database # Fetch mappings from database
@@ -123,8 +283,8 @@ class MigrationPlugin(PluginBase):
db = SessionLocal() db = SessionLocal()
try: try:
# Find environment IDs by name # Find environment IDs by name
src_env = db.query(Environment).filter(Environment.name == from_env).first() src_env = db.query(Environment).filter(Environment.name == from_env_name).first()
tgt_env = db.query(Environment).filter(Environment.name == to_env).first() tgt_env = db.query(Environment).filter(Environment.name == to_env_name).first()
if src_env and tgt_env: if src_env and tgt_env:
mappings = db.query(DatabaseMapping).filter( mappings = db.query(DatabaseMapping).filter(
@@ -143,55 +303,94 @@ class MigrationPlugin(PluginBase):
try: try:
exported_content, _ = from_c.export_dashboard(dash_id) exported_content, _ = from_c.export_dashboard(dash_id)
with create_temp_file(content=exported_content, dry_run=True, suffix=".zip", logger=logger) as tmp_zip_path: with create_temp_file(content=exported_content, dry_run=True, suffix=".zip") as tmp_zip_path:
if not replace_db_config: # Always transform to strip databases to avoid password errors
to_c.import_dashboard(file_name=tmp_zip_path, dash_id=dash_id, dash_slug=dash_slug) with create_temp_file(suffix=".zip", dry_run=True) as tmp_new_zip:
else: success = engine.transform_zip(str(tmp_zip_path), str(tmp_new_zip), db_mapping, strip_databases=False)
# Check for missing mappings before transformation
# This is a simplified check, in reality we'd check all YAMLs if not success and replace_db_config:
# For US3, we'll just use the engine and handle missing ones there # Signal missing mapping and wait (only if we care about mappings)
with create_temp_file(suffix=".zip", dry_run=True, logger=logger) as tmp_new_zip: if task_id:
# If we have missing mappings, we might need to pause logger.info(f"[MigrationPlugin][Action] Pausing for missing mapping in task {task_id}")
# For now, let's assume the engine can tell us what's missing # In a real scenario, we'd pass the missing DB info to the frontend
success = engine.transform_zip(str(tmp_zip_path), str(tmp_new_zip), db_mapping) # For this task, we'll just simulate the wait
await tm.wait_for_resolution(task_id)
if not success: # After resolution, retry transformation with updated mappings
# Signal missing mapping and wait # (Mappings would be updated in task.params by resolve_task)
task_id = params.get("_task_id") db = SessionLocal()
if task_id: try:
from ..dependencies import get_task_manager src_env = db.query(Environment).filter(Environment.name == from_env_name).first()
tm = get_task_manager() tgt_env = db.query(Environment).filter(Environment.name == to_env_name).first()
logger.info(f"[MigrationPlugin][Action] Pausing for missing mapping in task {task_id}") mappings = db.query(DatabaseMapping).filter(
# In a real scenario, we'd pass the missing DB info to the frontend DatabaseMapping.source_env_id == src_env.id,
# For this task, we'll just simulate the wait DatabaseMapping.target_env_id == tgt_env.id
await tm.wait_for_resolution(task_id) ).all()
# After resolution, retry transformation with updated mappings db_mapping = {m.source_db_uuid: m.target_db_uuid for m in mappings}
# (Mappings would be updated in task.params by resolve_task) finally:
db = SessionLocal() db.close()
try: success = engine.transform_zip(str(tmp_zip_path), str(tmp_new_zip), db_mapping, strip_databases=False)
src_env = db.query(Environment).filter(Environment.name == from_env).first()
tgt_env = db.query(Environment).filter(Environment.name == to_env).first()
mappings = db.query(DatabaseMapping).filter(
DatabaseMapping.source_env_id == src_env.id,
DatabaseMapping.target_env_id == tgt_env.id
).all()
db_mapping = {m.source_db_uuid: m.target_db_uuid for m in mappings}
finally:
db.close()
success = engine.transform_zip(str(tmp_zip_path), str(tmp_new_zip), db_mapping)
if success: if success:
to_c.import_dashboard(file_name=tmp_new_zip, dash_id=dash_id, dash_slug=dash_slug) to_c.import_dashboard(file_name=tmp_new_zip, dash_id=dash_id, dash_slug=dash_slug)
else: else:
logger.error(f"[MigrationPlugin][Failure] Failed to transform ZIP for dashboard {title}") logger.error(f"[MigrationPlugin][Failure] Failed to transform ZIP for dashboard {title}")
logger.info(f"[MigrationPlugin][Success] Dashboard {title} imported.") logger.info(f"[MigrationPlugin][Success] Dashboard {title} imported.")
except Exception as exc: except Exception as exc:
# Check for password error
error_msg = str(exc)
# The error message from Superset is often a JSON string inside a string.
# We need to robustly detect the password requirement.
# Typical error: "Error importing dashboard: databases/PostgreSQL.yaml: {'_schema': ['Must provide a password for the database']}"
if "Must provide a password for the database" in error_msg:
# Extract database name
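# Try to find "databases/DBNAME.yaml" pattern (uses the module-level `re` import;
# a function-local `import re` here would shadow it and break the earlier re.search call)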
db_name = "unknown"
match = re.search(r"databases/([^.]+)\.yaml", error_msg)
if match:
db_name = match.group(1)
else:
# Fallback: try to find 'database 'NAME'' pattern
match_alt = re.search(r"database '([^']+)'", error_msg)
if match_alt:
db_name = match_alt.group(1)
logger.warning(f"[MigrationPlugin][Action] Detected missing password for database: {db_name}")
if task_id:
input_request = {
"type": "database_password",
"databases": [db_name],
"error_message": error_msg
}
tm.await_input(task_id, input_request)
# Wait for user input
await tm.wait_for_input(task_id)
# Resume with passwords
task = tm.get_task(task_id)
passwords = task.params.get("passwords", {})
# Retry import with password
if passwords:
logger.info(f"[MigrationPlugin][Action] Retrying import for {title} with provided passwords.")
to_c.import_dashboard(file_name=tmp_new_zip, dash_id=dash_id, dash_slug=dash_slug, passwords=passwords)
logger.info(f"[MigrationPlugin][Success] Dashboard {title} imported after password injection.")
# Clear passwords from params after use for security
if "passwords" in task.params:
del task.params["passwords"]
continue
logger.error(f"[MigrationPlugin][Failure] Failed to migrate dashboard {title}: {exc}", exc_info=True) logger.error(f"[MigrationPlugin][Failure] Failed to migrate dashboard {title}: {exc}", exc_info=True)
logger.info("[MigrationPlugin][Exit] Migration finished.") logger.info("[MigrationPlugin][Exit] Migration finished.")
except Exception as e: except Exception as e:
logger.critical(f"[MigrationPlugin][Failure] Fatal error during migration: {e}", exc_info=True) logger.critical(f"[MigrationPlugin][Failure] Fatal error during migration: {e}", exc_info=True)
raise e raise e
# [/DEF:MigrationPlugin] # [/DEF:MigrationPlugin.execute:Action]
# [/DEF:execute:Function]
# [/DEF:MigrationPlugin:Class]
# [/DEF:MigrationPlugin:Module]
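
Two pieces of the new `execute` logic can be isolated and tested on their own: the selection precedence (`selected_ids` wins over `dashboard_regex`) and the extraction of the database name from a Superset import error. The sketch below restates both as free functions. The names `select_dashboards` and `extract_db_name` are hypothetical and do not exist in the plugin. The regular expressions are the ones used in the diff.

```python
import re
from typing import Any, Dict, List, Optional


def select_dashboards(
    all_dashboards: List[Dict[str, Any]],
    selected_ids: Optional[List[int]] = None,
    dashboard_regex: Optional[str] = None,
) -> List[Dict[str, Any]]:
    """Hypothetical helper: explicit IDs take precedence over the title regex."""
    if selected_ids:
        return [d for d in all_dashboards if d["id"] in selected_ids]
    if dashboard_regex:
        return [
            d for d in all_dashboards
            if re.search(str(dashboard_regex), d["dashboard_title"], re.IGNORECASE)
        ]
    # No selection criteria: nothing to migrate.
    return []


def extract_db_name(error_msg: str) -> str:
    """Hypothetical helper: find the database name in an import error message."""
    # Primary pattern: the YAML file name inside the export archive.
    match = re.search(r"databases/([^.]+)\.yaml", error_msg)
    if match:
        return match.group(1)
    # Fallback pattern: a quoted name following the word "database".
    match_alt = re.search(r"database '([^']+)'", error_msg)
    if match_alt:
        return match_alt.group(1)
    return "unknown"
```

Note that the primary pattern uses `[^.]+`, so a database file name that itself contains a dot falls through to the fallback or to `"unknown"`.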


@@ -0,0 +1,211 @@
# [DEF:SearchPluginModule:Module]
# @SEMANTICS: plugin, search, datasets, regex, superset
# @PURPOSE: Implements a plugin for searching text patterns across all datasets in a specific Superset environment.
# @LAYER: Plugins
# @RELATION: Inherits from PluginBase. Uses SupersetClient from core.
# @CONSTRAINT: Must use belief_scope for logging.
# [SECTION: IMPORTS]
import re
from typing import Dict, Any, List, Optional
from ..core.plugin_base import PluginBase
from ..core.superset_client import SupersetClient
from ..core.logger import logger, belief_scope
# [/SECTION]
# [DEF:SearchPlugin:Class]
# @PURPOSE: Plugin for searching text patterns in Superset datasets.
class SearchPlugin(PluginBase):
"""
Plugin for searching text patterns in Superset datasets.
"""
@property
# [DEF:id:Function]
# @PURPOSE: Returns the unique identifier for the search plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string ID.
# @RETURN: str - "search-datasets"
def id(self) -> str:
with belief_scope("id"):
return "search-datasets"
# [/DEF:id:Function]
@property
# [DEF:name:Function]
# @PURPOSE: Returns the human-readable name of the search plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string name.
# @RETURN: str - Plugin name.
def name(self) -> str:
with belief_scope("name"):
return "Search Datasets"
# [/DEF:name:Function]
@property
# [DEF:description:Function]
# @PURPOSE: Returns a description of the search plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string description.
# @RETURN: str - Plugin description.
def description(self) -> str:
with belief_scope("description"):
return "Search for text patterns across all datasets in a specific environment."
# [/DEF:description:Function]
@property
# [DEF:version:Function]
# @PURPOSE: Returns the version of the search plugin.
# @PRE: Plugin instance exists.
# @POST: Returns string version.
# @RETURN: str - "1.0.0"
def version(self) -> str:
with belief_scope("version"):
return "1.0.0"
# [/DEF:version:Function]
@property
# [DEF:ui_route:Function]
# @PURPOSE: Returns the frontend route for the search plugin.
# @RETURN: str - "/tools/search"
def ui_route(self) -> str:
with belief_scope("ui_route"):
return "/tools/search"
# [/DEF:ui_route:Function]
# [DEF:get_schema:Function]
# @PURPOSE: Returns the JSON schema for the search plugin parameters.
# @PRE: Plugin instance exists.
# @POST: Returns dictionary schema.
# @RETURN: Dict[str, Any] - JSON schema.
def get_schema(self) -> Dict[str, Any]:
with belief_scope("get_schema"):
return {
"type": "object",
"properties": {
"env": {
"type": "string",
"title": "Environment",
"description": "The Superset environment to search in (e.g., 'dev', 'prod')."
},
"query": {
"type": "string",
"title": "Search Query (Regex)",
"description": "The regex pattern to search for."
}
},
"required": ["env", "query"]
}
# [/DEF:get_schema:Function]
# [DEF:execute:Function]
# @PURPOSE: Executes the dataset search logic.
# @PARAM: params (Dict[str, Any]) - Search parameters.
# @PRE: Params contain valid 'env' and 'query'.
# @POST: Returns a dictionary with count and results list.
# @RETURN: Dict[str, Any] - Search results.
async def execute(self, params: Dict[str, Any]) -> Dict[str, Any]:
with belief_scope("SearchPlugin.execute", f"params={params}"):
env_name = params.get("env")
search_query = params.get("query")
if not env_name or not search_query:
logger.error("[SearchPlugin.execute][State] Missing required parameters.")
raise ValueError("Missing required parameters: env, query")
# Get config and initialize client
from ..dependencies import get_config_manager
config_manager = get_config_manager()
env_config = config_manager.get_environment(env_name)
if not env_config:
logger.error(f"[SearchPlugin.execute][State] Environment '{env_name}' not found.")
raise ValueError(f"Environment '{env_name}' not found in configuration.")
client = SupersetClient(env_config)
client.authenticate()
logger.info(f"[SearchPlugin.execute][Action] Searching for pattern: '{search_query}' in environment: {env_name}")
try:
# Ported logic from search_script.py
_, datasets = client.get_datasets(query={"columns": ["id", "table_name", "sql", "database", "columns"]})
if not datasets:
logger.warning("[SearchPlugin.execute][State] No datasets found.")
return {"count": 0, "results": []}
pattern = re.compile(search_query, re.IGNORECASE)
results = []
for dataset in datasets:
dataset_id = dataset.get('id')
dataset_name = dataset.get('table_name', 'Unknown')
if not dataset_id:
continue
for field, value in dataset.items():
value_str = str(value)
match_obj = pattern.search(value_str)
if match_obj:
results.append({
"dataset_id": dataset_id,
"dataset_name": dataset_name,
"field": field,
"match_context": self._get_context(value_str, match_obj.group() if match_obj else ""),
"full_value": value_str
})
logger.info(f"[SearchPlugin.execute][Success] Found matches in {len(results)} locations.")
return {
"count": len(results),
"results": results
}
except re.error as e:
logger.error(f"[SearchPlugin.execute][Failure] Invalid regex pattern: {e}")
raise ValueError(f"Invalid regex pattern: {e}")
except Exception as e:
logger.error(f"[SearchPlugin.execute][Failure] Error during search: {e}")
raise
# [/DEF:execute:Function]
# [DEF:_get_context:Function]
# @PURPOSE: Extracts a small context around the match for display.
# @PARAM: text (str) - The full text to extract context from.
# @PARAM: match_text (str) - The matched text pattern.
# @PARAM: context_lines (int) - Number of lines of context to include.
# @PRE: text and match_text must be strings.
# @POST: Returns context string.
# @RETURN: str - Extracted context.
def _get_context(self, text: str, match_text: str, context_lines: int = 1) -> str:
"""
Extracts a small context around the match for display.
"""
with belief_scope("_get_context"):
if not match_text:
return text[:100] + "..." if len(text) > 100 else text
lines = text.splitlines()
match_line_index = -1
for i, line in enumerate(lines):
if match_text in line:
match_line_index = i
break
if match_line_index != -1:
start = max(0, match_line_index - context_lines)
end = min(len(lines), match_line_index + context_lines + 1)
context = []
for i in range(start, end):
line_content = lines[i]
if i == match_line_index:
context.append(f"==> {line_content}")
else:
context.append(f" {line_content}")
return "\n".join(context)
return text[:100] + "..." if len(text) > 100 else text
# [/DEF:_get_context:Function]
# [/DEF:SearchPlugin:Class]
# [/DEF:SearchPluginModule:Module]
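
The search loop and `_get_context` above can be exercised without a Superset instance. The following is a minimal sketch under that assumption: `search_datasets` and `get_context` are hypothetical free-function versions of the plugin methods, the dataset list is plain dictionaries, and the four-space indent for non-matching context lines is an arbitrary choice made here.

```python
import re
from typing import Any, Dict, List


def get_context(text: str, match_text: str, context_lines: int = 1) -> str:
    """Return the matching line, marked with '==>', plus surrounding lines."""
    if match_text:
        lines = text.splitlines()
        for i, line in enumerate(lines):
            if match_text in line:
                start = max(0, i - context_lines)
                end = min(len(lines), i + context_lines + 1)
                return "\n".join(
                    f"==> {lines[j]}" if j == i else f"    {lines[j]}"
                    for j in range(start, end)
                )
    # No usable match text, or the match spans several lines: truncate instead.
    return text[:100] + "..." if len(text) > 100 else text


def search_datasets(datasets: List[Dict[str, Any]], query: str) -> List[Dict[str, Any]]:
    """Search every field of every dataset for a case-insensitive regex."""
    pattern = re.compile(query, re.IGNORECASE)  # raises re.error on a bad pattern
    results = []
    for dataset in datasets:
        dataset_id = dataset.get("id")
        if not dataset_id:
            continue
        for field, value in dataset.items():
            value_str = str(value)
            match = pattern.search(value_str)
            if match:
                results.append({
                    "dataset_id": dataset_id,
                    "dataset_name": dataset.get("table_name", "Unknown"),
                    "field": field,
                    "match_context": get_context(value_str, match.group()),
                })
    return results
```

Because every field is stringified before matching, a pattern such as `7` would also match the `id` field of dataset 7. Restricting the searched fields would be a possible refinement.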


@@ -0,0 +1,3 @@
from .plugin import StoragePlugin
__all__ = ["StoragePlugin"]


@@ -0,0 +1,333 @@
# [DEF:StoragePlugin:Module]
#
# @SEMANTICS: storage, files, filesystem, plugin
# @PURPOSE: Provides core filesystem operations for managing backups and repositories.
# @LAYER: App
# @RELATION: IMPLEMENTS -> PluginBase
# @RELATION: DEPENDS_ON -> backend.src.models.storage
#
# @INVARIANT: All file operations must be restricted to the configured storage root.
# [SECTION: IMPORTS]
import os
import shutil
from pathlib import Path
from datetime import datetime
from typing import Dict, Any, List, Optional
from fastapi import UploadFile
from ...core.plugin_base import PluginBase
from ...core.logger import belief_scope, logger
from ...models.storage import StoredFile, FileCategory, StorageConfig
from ...dependencies import get_config_manager
# [/SECTION]
# [DEF:StoragePlugin:Class]
# @PURPOSE: Implementation of the storage management plugin.
class StoragePlugin(PluginBase):
"""
Plugin for managing local file storage for backups and repositories.
"""
# [DEF:__init__:Function]
# @PURPOSE: Initializes the StoragePlugin and ensures required directories exist.
# @PRE: Configuration manager must be accessible.
# @POST: Storage root and category directories are created on disk.
def __init__(self):
with belief_scope("StoragePlugin:init"):
self.ensure_directories()
# [/DEF:__init__:Function]
@property
# [DEF:id:Function]
# @PURPOSE: Returns the unique identifier for the storage plugin.
# @PRE: None.
# @POST: Returns the plugin ID string.
# @RETURN: str - "storage-manager"
def id(self) -> str:
with belief_scope("StoragePlugin:id"):
return "storage-manager"
# [/DEF:id:Function]
@property
# [DEF:name:Function]
# @PURPOSE: Returns the human-readable name of the storage plugin.
# @PRE: None.
# @POST: Returns the plugin name string.
# @RETURN: str - "Storage Manager"
def name(self) -> str:
with belief_scope("StoragePlugin:name"):
return "Storage Manager"
# [/DEF:name:Function]
@property
# [DEF:description:Function]
# @PURPOSE: Returns a description of the storage plugin.
# @PRE: None.
# @POST: Returns the plugin description string.
# @RETURN: str - Plugin description.
def description(self) -> str:
with belief_scope("StoragePlugin:description"):
return "Manages local file storage for backups and repositories."
# [/DEF:description:Function]
@property
# [DEF:version:Function]
# @PURPOSE: Returns the version of the storage plugin.
# @PRE: None.
# @POST: Returns the version string.
# @RETURN: str - "1.0.0"
def version(self) -> str:
with belief_scope("StoragePlugin:version"):
return "1.0.0"
# [/DEF:version:Function]
@property
# [DEF:ui_route:Function]
# @PURPOSE: Returns the frontend route for the storage plugin.
# @RETURN: str - "/tools/storage"
def ui_route(self) -> str:
with belief_scope("StoragePlugin:ui_route"):
return "/tools/storage"
# [/DEF:ui_route:Function]
# [DEF:get_schema:Function]
# @PURPOSE: Returns the JSON schema for storage plugin parameters.
# @PRE: None.
# @POST: Returns a dictionary representing the JSON schema.
# @RETURN: Dict[str, Any] - JSON schema.
def get_schema(self) -> Dict[str, Any]:
with belief_scope("StoragePlugin:get_schema"):
return {
"type": "object",
"properties": {
"category": {
"type": "string",
"enum": [c.value for c in FileCategory],
"title": "Category"
}
},
"required": ["category"]
}
# [/DEF:get_schema:Function]
# [DEF:execute:Function]
# @PURPOSE: Executes storage-related tasks (placeholder for PluginBase compliance).
# @PRE: params must match the plugin schema.
# @POST: Task is executed and logged.
async def execute(self, params: Dict[str, Any]):
with belief_scope("StoragePlugin:execute"):
logger.info(f"[StoragePlugin][Action] Executing with params: {params}")
# [/DEF:execute:Function]
# [DEF:get_storage_root:Function]
# @PURPOSE: Resolves the absolute path to the storage root.
# @PRE: Settings must define a storage root path.
# @POST: Returns a Path object representing the storage root.
def get_storage_root(self) -> Path:
with belief_scope("StoragePlugin:get_storage_root"):
config_manager = get_config_manager()
global_settings = config_manager.get_config().settings
# Use storage.root_path as the source of truth for storage UI
root = Path(global_settings.storage.root_path)
if not root.is_absolute():
# Resolve relative to the backend directory
# Path(__file__) is backend/src/plugins/storage/plugin.py
# parents[3] is therefore the backend/ directory (the ss-tools root would be parents[4])
# We need to ensure it's relative to where backend/ is
project_root = Path(__file__).parents[3]
root = (project_root / root).resolve()
return root
# [/DEF:get_storage_root:Function]
# [DEF:resolve_path:Function]
# @PURPOSE: Resolves a dynamic path pattern using provided variables.
# @PARAM: pattern (str) - The path pattern to resolve.
# @PARAM: variables (Dict[str, str]) - Variables to substitute in the pattern.
# @PRE: pattern must be a valid format string.
# @POST: Returns the resolved path string.
# @RETURN: str - The resolved path.
def resolve_path(self, pattern: str, variables: Dict[str, str]) -> str:
with belief_scope("StoragePlugin:resolve_path"):
# Add common variables
vars_with_defaults = {
"timestamp": datetime.now().strftime("%Y%m%dT%H%M%S"),
**variables
}
try:
resolved = pattern.format(**vars_with_defaults)
# Clean up any double slashes or leading/trailing slashes for relative path
return os.path.normpath(resolved).strip("/")
except KeyError as e:
logger.warning(f"[StoragePlugin][Coherence:Failed] Missing variable for path resolution: {e}")
# Fallback to literal pattern if formatting fails partially (or handle as needed)
return pattern.replace("{", "").replace("}", "")
# [/DEF:resolve_path:Function]
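
The behaviour of `resolve_path` is easiest to see with concrete inputs. Below is a minimal sketch that restates the method as a free function (no logging, no `belief_scope`). The pattern strings and variable names in the usage note are made up for illustration.

```python
import os
from datetime import datetime
from typing import Dict


def resolve_path(pattern: str, variables: Dict[str, str]) -> str:
    """Standalone restatement of StoragePlugin.resolve_path for illustration."""
    # `timestamp` is always available; caller-supplied variables override it.
    values = {"timestamp": datetime.now().strftime("%Y%m%dT%H%M%S"), **variables}
    try:
        resolved = pattern.format(**values)
        # normpath collapses doubled separators; strip removes edge slashes.
        return os.path.normpath(resolved).strip("/")
    except KeyError:
        # Fallback: drop the braces and keep the placeholder names literally.
        return pattern.replace("{", "").replace("}", "")
```

For example, `resolve_path("backups/{env}//{name}/", {"env": "dev", "name": "sales"})` yields `backups/dev/sales` on POSIX systems. When a variable is missing, the fallback strips braces from the whole pattern, so `resolve_path("{env}/{missing}", {"env": "dev"})` returns `env/missing`, not `dev/missing`.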
# [DEF:ensure_directories:Function]
# @PURPOSE: Creates the storage root and category subdirectories if they don't exist.
# @PRE: Storage root must be resolvable.
# @POST: Directories are created on the filesystem.
# @SIDE_EFFECT: Creates directories on the filesystem.
def ensure_directories(self):
with belief_scope("StoragePlugin:ensure_directories"):
root = self.get_storage_root()
for category in FileCategory:
# Use singular name for consistency with BackupPlugin and GitService
path = root / category.value
path.mkdir(parents=True, exist_ok=True)
logger.debug(f"[StoragePlugin][Action] Ensured directory: {path}")
# [/DEF:ensure_directories:Function]
# [DEF:validate_path:Function]
# @PURPOSE: Prevents path traversal attacks by ensuring the path is within the storage root.
# @PRE: path must be a Path object.
# @POST: Returns the resolved absolute path if valid, otherwise raises ValueError.
def validate_path(self, path: Path) -> Path:
with belief_scope("StoragePlugin:validate_path"):
root = self.get_storage_root().resolve()
resolved = path.resolve()
try:
resolved.relative_to(root)
except ValueError:
logger.error(f"[StoragePlugin][Coherence:Failed] Path traversal detected: {resolved} is not under {root}")
raise ValueError("Access denied: Path is outside of storage root.")
return resolved
# [/DEF:validate_path:Function]
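
The containment check in `validate_path` relies on `Path.resolve()` to collapse `..` segments and symlinks before `relative_to()` is applied. A minimal sketch of the same technique as a free function, assuming the storage root is passed in explicitly instead of being read from configuration:

```python
from pathlib import Path


def validate_path(root: Path, path: Path) -> Path:
    """Return the resolved path if it lies under root, else raise ValueError."""
    root = root.resolve()
    resolved = path.resolve()  # collapses ".." and follows symlinks
    try:
        # relative_to raises ValueError when `resolved` is not inside `root`.
        resolved.relative_to(root)
    except ValueError:
        raise ValueError("Access denied: Path is outside of storage root.") from None
    return resolved
```

A path such as `root / ".." / "outside.txt"` resolves to a sibling of the root and is rejected, while `root / "backups" / "a.zip"` is accepted even if the file does not exist yet.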
    # [DEF:list_files:Function]
    # @PURPOSE: Lists all files and directories in a specific category and subpath.
    # @PARAM: category (Optional[FileCategory]) - The category to list.
    # @PARAM: subpath (Optional[str]) - Nested path within the category.
    # @PRE: Storage root must exist.
    # @POST: Returns a list of StoredFile objects.
    # @RETURN: List[StoredFile] - List of file and directory metadata objects.
    def list_files(self, category: Optional[FileCategory] = None, subpath: Optional[str] = None) -> List[StoredFile]:
        with belief_scope("StoragePlugin:list_files"):
            root = self.get_storage_root()
            logger.info(f"[StoragePlugin][Action] Listing files in root: {root}, category: {category}, subpath: {subpath}")
            files = []
            categories = [category] if category else list(FileCategory)
            for cat in categories:
                # Scan the category subfolder + optional subpath
                base_dir = root / cat.value
                if subpath:
                    target_dir = self.validate_path(base_dir / subpath)
                else:
                    target_dir = base_dir
                if not target_dir.exists():
                    continue
                logger.debug(f"[StoragePlugin][Action] Scanning directory: {target_dir}")
                # Use os.scandir for better performance and to distinguish files vs dirs
                with os.scandir(target_dir) as it:
                    for entry in it:
                        # Skip logs
                        if "Logs" in entry.path:
                            continue
                        stat = entry.stat()
                        is_dir = entry.is_dir()
                        files.append(StoredFile(
                            name=entry.name,
                            path=str(Path(entry.path).relative_to(root)),
                            size=stat.st_size if not is_dir else 0,
                            created_at=datetime.fromtimestamp(stat.st_ctime),
                            category=cat,
                            mime_type="directory" if is_dir else None
                        ))
            # Sort: directories first, then by name
            return sorted(files, key=lambda x: (x.mime_type != "directory", x.name))
    # [/DEF:list_files:Function]
    # [DEF:save_file:Function]
    # @PURPOSE: Saves an uploaded file to the specified category and optional subpath.
    # @PARAM: file (UploadFile) - The uploaded file.
    # @PARAM: category (FileCategory) - The target category.
    # @PARAM: subpath (Optional[str]) - The target subpath.
    # @PRE: file must be a valid UploadFile; category must be valid.
    # @POST: File is written to disk and metadata is returned.
    # @RETURN: StoredFile - Metadata of the saved file.
    # @SIDE_EFFECT: Writes file to disk.
    async def save_file(self, file: UploadFile, category: FileCategory, subpath: Optional[str] = None) -> StoredFile:
        with belief_scope("StoragePlugin:save_file"):
            root = self.get_storage_root()
            dest_dir = root / category.value
            if subpath:
                dest_dir = dest_dir / subpath
            # Validate before touching the filesystem, so that a traversal
            # attempt cannot create directories outside the storage root.
            dest_dir = self.validate_path(dest_dir)
            dest_dir.mkdir(parents=True, exist_ok=True)
            # Keep only the final component of the client-supplied filename.
            dest_path = self.validate_path(dest_dir / Path(file.filename).name)
            with dest_path.open("wb") as buffer:
                shutil.copyfileobj(file.file, buffer)
            stat = dest_path.stat()
            return StoredFile(
                name=dest_path.name,
                path=str(dest_path.relative_to(root)),
                size=stat.st_size,
                created_at=datetime.fromtimestamp(stat.st_ctime),
                category=category,
                mime_type=file.content_type
            )
    # [/DEF:save_file:Function]
    # [DEF:delete_file:Function]
    # @PURPOSE: Deletes a file or directory from the specified category and path.
    # @PARAM: category (FileCategory) - The category.
    # @PARAM: path (str) - The relative path of the file or directory.
    # @PRE: path must belong to the specified category and exist on disk.
    # @POST: The file or directory is removed from disk.
    # @SIDE_EFFECT: Removes item from disk.
    def delete_file(self, category: FileCategory, path: str):
        with belief_scope("StoragePlugin:delete_file"):
            root = self.get_storage_root()
            # path is relative to root, but we ensure it starts with category
            full_path = self.validate_path(root / path)
            # Compare the first path component rather than a string prefix,
            # so that "<category>_other/..." is not accepted for "<category>".
            if Path(path).parts[:1] != (category.value,):
                raise ValueError(f"Path {path} does not belong to category {category}")
            if full_path.exists():
                if full_path.is_dir():
                    shutil.rmtree(full_path)
                else:
                    full_path.unlink()
                logger.info(f"[StoragePlugin][Action] Deleted: {full_path}")
            else:
                raise FileNotFoundError(f"Item {path} not found")
    # [/DEF:delete_file:Function]
    # [DEF:get_file_path:Function]
    # @PURPOSE: Returns the absolute path of a file for download.
    # @PARAM: category (FileCategory) - The category.
    # @PARAM: path (str) - The relative path of the file.
    # @PRE: path must belong to the specified category and be a file.
    # @POST: Returns the absolute Path to the file.
    # @RETURN: Path - Absolute path to the file.
    def get_file_path(self, category: FileCategory, path: str) -> Path:
        with belief_scope("StoragePlugin:get_file_path"):
            root = self.get_storage_root()
            file_path = self.validate_path(root / path)
            # Compare the first path component rather than a string prefix.
            if Path(path).parts[:1] != (category.value,):
                raise ValueError(f"Path {path} does not belong to category {category}")
            if not file_path.exists() or file_path.is_dir():
                raise FileNotFoundError(f"File {path} not found")
            return file_path
    # [/DEF:get_file_path:Function]
# [/DEF:StoragePlugin:Class]
# [/DEF:StoragePlugin:Module]
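
The two checks that the storage methods above rely on (containment under the storage root, and membership in a category) can be sketched in isolation as follows. This is a minimal illustration using only `pathlib`, not the plugin's actual code: the helper names `is_within` and `belongs_to_category` and the `/srv/storage` root are hypothetical.

```python
from pathlib import Path


def is_within(root: Path, candidate: Path) -> bool:
    """Return True if candidate resolves to a location at or below root."""
    try:
        # relative_to raises ValueError when candidate is not under root.
        candidate.resolve().relative_to(root.resolve())
    except ValueError:
        return False
    return True


def belongs_to_category(relative_path: str, category: str) -> bool:
    """Compare the first path component, not a raw string prefix."""
    parts = Path(relative_path).parts
    return bool(parts) and parts[0] == category


root = Path("/srv/storage")  # hypothetical storage root
print(is_within(root, root / "backups" / "a.zip"))
print(is_within(root, root / "backups" / ".." / ".." / "etc" / "passwd"))
print(belongs_to_category("backups/a.zip", "backups"))
print(belongs_to_category("backups_old/a.zip", "backups"))
```

Comparing path components avoids the case where a plain `startswith` check would accept a sibling directory whose name merely begins with the category name.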

backend/src/schemas/auth.py Normal file

@@ -0,0 +1,128 @@
# [DEF:backend.src.schemas.auth:Module]
#
# @TIER: STANDARD
# @SEMANTICS: auth, schemas, pydantic, user, token
# @PURPOSE: Pydantic schemas for authentication requests and responses.
# @LAYER: API
# @RELATION: DEPENDS_ON -> pydantic
#
# @INVARIANT: Sensitive fields like password must not be included in response schemas.
# [SECTION: IMPORTS]
from typing import List, Optional
from pydantic import BaseModel, EmailStr, Field
from datetime import datetime
# [/SECTION]
# [DEF:Token:Class]
# @TIER: TRIVIAL
# @PURPOSE: Represents a JWT access token response.
class Token(BaseModel):
    access_token: str
    token_type: str
# [/DEF:Token:Class]
# [DEF:TokenData:Class]
# @TIER: TRIVIAL
# @PURPOSE: Represents the data encoded in a JWT token.
class TokenData(BaseModel):
    username: Optional[str] = None
    scopes: List[str] = []
# [/DEF:TokenData:Class]
# [DEF:PermissionSchema:Class]
# @TIER: TRIVIAL
# @PURPOSE: Represents a permission in API responses.
class PermissionSchema(BaseModel):
    id: Optional[str] = None
    resource: str
    action: str

    class Config:
        from_attributes = True
# [/DEF:PermissionSchema:Class]
# [DEF:RoleSchema:Class]
# @PURPOSE: Represents a role in API responses.
class RoleSchema(BaseModel):
    id: str
    name: str
    description: Optional[str] = None
    permissions: List[PermissionSchema] = []

    class Config:
        from_attributes = True
# [/DEF:RoleSchema:Class]
# [DEF:RoleCreate:Class]
# @PURPOSE: Schema for creating a new role.
class RoleCreate(BaseModel):
    name: str
    description: Optional[str] = None
    permissions: List[str] = []  # List of permission IDs or "resource:action" strings
# [/DEF:RoleCreate:Class]
# [DEF:RoleUpdate:Class]
# @PURPOSE: Schema for updating an existing role.
class RoleUpdate(BaseModel):
    name: Optional[str] = None
    description: Optional[str] = None
    permissions: Optional[List[str]] = None
# [/DEF:RoleUpdate:Class]
# [DEF:ADGroupMappingSchema:Class]
# @PURPOSE: Represents an AD Group to Role mapping in API responses.
class ADGroupMappingSchema(BaseModel):
    id: str
    ad_group: str
    role_id: str

    class Config:
        from_attributes = True
# [/DEF:ADGroupMappingSchema:Class]
# [DEF:ADGroupMappingCreate:Class]
# @PURPOSE: Schema for creating an AD Group mapping.
class ADGroupMappingCreate(BaseModel):
    ad_group: str
    role_id: str
# [/DEF:ADGroupMappingCreate:Class]
# [DEF:UserBase:Class]
# @PURPOSE: Base schema for user data.
class UserBase(BaseModel):
    username: str
    email: Optional[EmailStr] = None
    is_active: bool = True
# [/DEF:UserBase:Class]
# [DEF:UserCreate:Class]
# @PURPOSE: Schema for creating a new user.
class UserCreate(UserBase):
    password: str
    roles: List[str] = []
# [/DEF:UserCreate:Class]
# [DEF:UserUpdate:Class]
# @PURPOSE: Schema for updating an existing user.
class UserUpdate(BaseModel):
    email: Optional[EmailStr] = None
    password: Optional[str] = None
    is_active: Optional[bool] = None
    roles: Optional[List[str]] = None
# [/DEF:UserUpdate:Class]
# [DEF:User:Class]
# @PURPOSE: Schema for user data in API responses.
class User(UserBase):
    id: str
    auth_source: str
    created_at: datetime
    last_login: Optional[datetime] = None
    roles: List[RoleSchema] = []

    class Config:
        from_attributes = True
# [/DEF:User:Class]
# [/DEF:backend.src.schemas.auth:Module]


@@ -0,0 +1,82 @@
# [DEF:backend.src.scripts.create_admin:Module]
#
# @SEMANTICS: admin, setup, user, auth, cli
# @PURPOSE: CLI tool for creating the initial admin user.
# @LAYER: Scripts
# @RELATION: USES -> backend.src.core.auth.security
# @RELATION: USES -> backend.src.core.database
# @RELATION: USES -> backend.src.models.auth
#
# @INVARIANT: Admin user must have the "Admin" role.
# [SECTION: IMPORTS]
import sys
import argparse
from pathlib import Path
# Add src to path
sys.path.append(str(Path(__file__).parent.parent.parent))
from src.core.database import AuthSessionLocal, init_db
from src.core.auth.security import get_password_hash
from src.models.auth import User, Role, Permission
from src.core.logger import logger, belief_scope
# [/SECTION]
# [DEF:create_admin:Function]
# @PURPOSE: Creates an admin user and necessary roles/permissions.
# @PRE: username and password provided via CLI.
# @POST: Admin user exists in auth.db.
#
# @PARAM: username (str) - Admin username.
# @PARAM: password (str) - Admin password.
def create_admin(username, password):
    with belief_scope("create_admin"):
        db = AuthSessionLocal()
        try:
            # 1. Ensure Admin role exists
            admin_role = db.query(Role).filter(Role.name == "Admin").first()
            if not admin_role:
                logger.info("Creating Admin role...")
                admin_role = Role(name="Admin", description="System Administrator")
                db.add(admin_role)
                db.commit()
                db.refresh(admin_role)
            # 2. Check if user already exists
            existing_user = db.query(User).filter(User.username == username).first()
            if existing_user:
                logger.warning(f"User {username} already exists.")
                return
            # 3. Create Admin user
            logger.info(f"Creating admin user: {username}")
            new_user = User(
                username=username,
                password_hash=get_password_hash(password),
                auth_source="LOCAL",
                is_active=True
            )
            new_user.roles.append(admin_role)
            db.add(new_user)
            db.commit()
            logger.info(f"Admin user {username} created successfully.")
        except Exception as e:
            logger.error(f"Failed to create admin user: {e}")
            db.rollback()
        finally:
            db.close()
# [/DEF:create_admin:Function]
if __name__ == "__main__":
    parser = argparse.ArgumentParser(description="Create initial admin user")
    parser.add_argument("--username", required=True, help="Admin username")
    parser.add_argument("--password", required=True, help="Admin password")
    args = parser.parse_args()
    # Ensure DB is initialized before creating admin
    init_db()
    create_admin(args.username, args.password)
# [/DEF:backend.src.scripts.create_admin:Module]
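
The script above takes the password as a required `--password` argument, which typically leaves it visible in shell history and process listings. One possible alternative, sketched here under assumptions, is to make the flag optional and fall back to an environment variable or an interactive prompt. The `resolve_password` helper and the `ADMIN_PASSWORD` variable name are hypothetical and do not appear in the repository.

```python
import argparse
import getpass
import os
from typing import Callable, Mapping, Optional


def resolve_password(
    cli_value: Optional[str],
    env: Mapping[str, str] = os.environ,
    prompt: Callable[[str], str] = getpass.getpass,
) -> str:
    """Pick the password from the CLI flag, then the environment, then a prompt."""
    if cli_value:
        return cli_value
    env_value = env.get("ADMIN_PASSWORD")  # hypothetical variable name
    if env_value:
        return env_value
    # getpass reads from the terminal without echoing the input.
    return prompt("Admin password: ")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Create initial admin user")
    parser.add_argument("--username", required=True, help="Admin username")
    parser.add_argument("--password", default=None, help="Admin password (optional)")
    return parser


# Example run with an explicit argument list and a fake environment.
args = build_parser().parse_args(["--username", "admin"])
password = resolve_password(args.password, env={"ADMIN_PASSWORD": "s3cret"})
print(args.username, len(password))
```

Passing `env` and `prompt` as parameters keeps the helper testable without a terminal; in the script itself the defaults would apply.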


@@ -0,0 +1,44 @@
# [DEF:backend.src.scripts.init_auth_db:Module]
#
# @SEMANTICS: setup, database, auth, migration
# @PURPOSE: Initializes the auth database and creates the necessary tables.
# @LAYER: Scripts
# @RELATION: CALLS -> backend.src.core.database.init_db
#
# @INVARIANT: Safe to run multiple times (idempotent).
# [SECTION: IMPORTS]
import sys
from pathlib import Path
# Add src to path
sys.path.append(str(Path(__file__).parent.parent.parent))
from src.core.database import init_db, auth_engine
from src.core.logger import logger, belief_scope
from src.scripts.seed_permissions import seed_permissions
# [/SECTION]
# [DEF:run_init:Function]
# @PURPOSE: Main entry point for the initialization script.
# @POST: auth.db is initialized with the correct schema and seeded permissions.
def run_init():
    with belief_scope("init_auth_db"):
        logger.info("Initializing authentication database...")
        try:
            init_db()
            logger.info("Authentication database initialized successfully.")
            # Seed permissions
            seed_permissions()
        except Exception as e:
            logger.error(f"Failed to initialize authentication database: {e}")
            sys.exit(1)
# [/DEF:run_init:Function]
if __name__ == "__main__":
    run_init()
# [/DEF:backend.src.scripts.init_auth_db:Module]
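
The `@INVARIANT` above states that the initialization is safe to run multiple times. The internals of `init_db` and `seed_permissions` are not shown in this diff, so the following is only a sketch of how that property can be obtained, written against the standard-library `sqlite3` module rather than the project's own database layer. The table layout and the permission values are hypothetical.

```python
import sqlite3

# Hypothetical seed data; the project's real permission set is not shown here.
DEFAULT_PERMISSIONS = [("dashboards", "read"), ("dashboards", "write"), ("users", "admin")]


def init_schema(conn: sqlite3.Connection) -> None:
    """Create the table only if it is missing, so reruns are harmless."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS permissions ("
        "id INTEGER PRIMARY KEY, "
        "resource TEXT NOT NULL, "
        "action TEXT NOT NULL, "
        "UNIQUE (resource, action))"
    )
    conn.commit()


def seed(conn: sqlite3.Connection) -> None:
    """Insert defaults; rows that already exist are skipped via the UNIQUE constraint."""
    conn.executemany(
        "INSERT OR IGNORE INTO permissions (resource, action) VALUES (?, ?)",
        DEFAULT_PERMISSIONS,
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
for _ in range(2):  # running the initialization twice must not duplicate rows
    init_schema(conn)
    seed(conn)
count = conn.execute("SELECT COUNT(*) FROM permissions").fetchone()[0]
print(count)
```

The row count after the second pass equals the number of default permissions, which is the observable form of the idempotency invariant.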
