⚠️ Work in Progress: This project is undergoing a complete refactor with many new features being added. It is not recommended for production use at this time. Please wait for a stable release.
SeekingData is a cross-platform desktop application that integrates SFT (Supervised Fine-Tuning) data generation and Harbor task management, featuring Material Design 3 for a modern user experience.
| Feature | Description |
|---|---|
| Single Processing | File upload (PDF/DOCX/TXT), URL extraction, AI-powered generation |
| Batch Processing | Bulk URL processing with real-time progress tracking |
| Format Converter | Alpaca ↔ OpenAI bidirectional conversion |
| CoT Generator | Chain of Thought reasoning data generation |
| Image Dataset | Automatic image description generation |
| Video Dataset | Video understanding data processing |
| Dataset Sharing | One-click upload to HuggingFace |
| Feature | Description |
|---|---|
| GitHub Task Generator | Auto-generate tasks from GitHub repositories |
| Visual Task Builder | Drag-and-drop editing with Monaco Editor |
| Task Manager | List, search, view details, export tasks |
| Task Validation | Integrated Harbor validation tools |
- Framework: React 18 + Vite 5
- UI Design: Material Design 3
- Styling: TailwindCSS 3.4
- State Management: Zustand
- Routing: React Router DOM 7
- Code Editor: Monaco Editor
- Flow Editor: React Flow
- Framework: FastAPI 0.115+
- Language: Python 3.12
- Validation: Pydantic 2.10+
- LLM Integration: LiteLLM 1.40+
- Document Processing: Docling 2.0+
- Agent Framework: Camel AI 0.2.89
- Task Framework: Harbor 0.1.45
- Framework: Electron 33
- Packaging: Electron Builder
- Platforms: macOS, Windows, Linux
- Node.js 18+
- Python 3.12+
- uv (Python package manager)
- yarn (Node package manager)
# Clone repository
git clone https://github.com/yourusername/SeekingData.git
cd SeekingData
# Install frontend dependencies
yarn install
# Install backend dependencies
cd backend
uv venv .venv --python 3.12
source .venv/bin/activate # macOS/Linux
# or .venv\Scripts\activate # Windows
uv pip install -r requirements.txt# Terminal 1: Start backend
cd backend
source .venv/bin/activate
uvicorn main:app --reload --port 5001
# Terminal 2: Start frontend
yarn devAccess the application at: http://localhost:3002
# macOS
yarn build:mac
# Windows
yarn build:win
# Linux
yarn build:linux# LLM API Configuration
OPENAI_API_KEY=sk-xxx
# GitHub Token (optional, for GitHub task generation)
GITHUB_TOKEN=ghp_xxx
# Application
APP_NAME=SeekingData
APP_VERSION=0.1.0
DEBUG=trueConfigure via the Settings page in the application:
- API Base URL: LLM provider endpoint
- API Key: Your secret API key
- Model: Model identifier (e.g., qwen/qwen3.5-plus)
- Suggestions Count: Number of suggestions per request (1-10)
SeekingData/
├── src/ # React frontend
│ ├── components/
│ │ ├── sft/ # SFT data generation
│ │ ├── harbor/ # Harbor task management
│ │ ├── ui/ # Material Design 3 components
│ │ └── layout/ # Layout components
│ ├── lib/ # Utilities and stores
│ └── pages/ # Page components
├── backend/ # FastAPI backend
│ ├── agents/ # AI agents (GitHub, etc.)
│ ├── api/routes/ # API endpoints
│ ├── models/ # Pydantic models
│ ├── services/ # Business logic
│ └── tasks/ # Harbor task storage
├── electron/ # Electron main process
├── scripts/ # Build scripts
└── docs/ # Documentation
| Method | Endpoint | Description |
|---|---|---|
| GET | /api/sft/config |
Get current configuration |
| POST | /api/sft/config |
Save configuration |
| POST | /api/sft/generate |
Generate SFT data |
| POST | /api/sft/batch |
Batch URL processing |
| POST | /api/sft/convert |
Format conversion |
| GET | /api/harbor/tasks |
List all tasks |
| POST | /api/harbor/tasks |
Create new task |
| GET | /api/harbor/tasks/{id} |
Get task details |
| POST | /api/harbor/github/generate |
Generate from GitHub |
The application supports any LiteLLM-compatible model:
| Provider | Model Examples |
|---|---|
| OpenAI | gpt-4, gpt-4o, gpt-3.5-turbo |
| Qwen | qwen/qwen3.5-plus, qwen/qwen-max |
| Moonshot | moonshot/kimi-k2.5 |
| Zhipu | zhipu/glm-5, zhipu/glm-4 |
| MiniMax | minimax/MiniMax-M2.5 |
| DeepSeek | openai/deepseek-v3.2 |
Contributions are welcome! Please read our contributing guidelines before submitting a pull request.
This project is licensed under the MIT License - see the LICENSE file for details.
- Harbor - Agent task framework
- Camel AI - AI agent framework
- FastAPI - Modern web framework
- React - UI library
- Electron - Cross-platform desktop apps
- LiteLLM - Unified LLM interface
Made with ❤️ by SeekingX-AILab
