6 Commits

| Author | SHA1 | Message | Date |
|--------|------|---------|------|
| Cameron Cordes | 39d284dbbb | Update dependencies (checks failed: Core Repos/ImageApi/pipeline/pr-master reported a problem with this commit's build) | 2021-10-11 21:48:44 -04:00 |
| Cameron Cordes | 125ba6192e | Elevate insertion logs to info and fix error logs | 2021-10-11 21:33:22 -04:00 |
| Cameron Cordes | 50d557001b | Add created timestamps for tags | 2021-10-11 21:33:22 -04:00 |
| Cameron Cordes | cf9dd826c1 | Improve add tag endpoint and add get tag endpoint (flattened out the add tag logic to make it more functional) | 2021-10-11 21:33:18 -04:00 |
| Cameron Cordes | 4834cacfc3 | Create Tag tables and Add Tag endpoint | 2021-10-10 22:05:36 -04:00 |
| Cameron Cordes | 51081d01c6 | Update dependencies | 2021-10-10 22:04:01 -04:00 |
110 changed files with 1938 additions and 24680 deletions


@@ -1,184 +0,0 @@
---
description: Perform a non-destructive cross-artifact consistency and quality analysis across spec.md, plan.md, and tasks.md after task generation.
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Goal
Identify inconsistencies, duplications, ambiguities, and underspecified items across the three core artifacts (`spec.md`, `plan.md`, `tasks.md`) before implementation. This command MUST run only after `/speckit.tasks` has successfully produced a complete `tasks.md`.
## Operating Constraints
**STRICTLY READ-ONLY**: Do **not** modify any files. Output a structured analysis report. Offer an optional remediation plan (user must explicitly approve before any follow-up editing commands would be invoked manually).
**Constitution Authority**: The project constitution (`.specify/memory/constitution.md`) is **non-negotiable** within this analysis scope. Constitution conflicts are automatically CRITICAL and require adjustment of the spec, plan, or tasks—not dilution, reinterpretation, or silent ignoring of the principle. If a principle itself needs to change, that must occur in a separate, explicit constitution update outside `/speckit.analyze`.
## Execution Steps
### 1. Initialize Analysis Context
Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks` once from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS. Derive absolute paths:
- SPEC = FEATURE_DIR/spec.md
- PLAN = FEATURE_DIR/plan.md
- TASKS = FEATURE_DIR/tasks.md
Abort with an error message if any required file is missing (instruct the user to run the missing prerequisite command).
For single quotes in args like "I'm Groot", use the escape syntax `'I'\''m Groot'` (or double-quote the argument if possible: `"I'm Groot"`).
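That escape rule (close the single-quoted string, emit an escaped quote, reopen it) can be sketched as a tiny helper. Python here is purely illustrative, and the helper name is ours, not part of the command:

```python
def quote_single(arg: str) -> str:
    """Single-quote a shell argument, escaping embedded quotes as '\\'' ."""
    # Each embedded ' becomes '\'' : close the quoted span, emit an
    # escaped literal quote, then reopen the quoted span.
    return "'" + arg.replace("'", "'\\''") + "'"

print(quote_single("I'm Groot"))  # 'I'\''m Groot'
```

Python's standard-library `shlex.quote` performs equivalent quoting, though it escapes embedded quotes with a `'"'"'` sequence instead of `'\''`.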
### 2. Load Artifacts (Progressive Disclosure)
Load only the minimal necessary context from each artifact:
**From spec.md:**
- Overview/Context
- Functional Requirements
- Non-Functional Requirements
- User Stories
- Edge Cases (if present)
**From plan.md:**
- Architecture/stack choices
- Data Model references
- Phases
- Technical constraints
**From tasks.md:**
- Task IDs
- Descriptions
- Phase grouping
- Parallel markers [P]
- Referenced file paths
**From constitution:**
- Load `.specify/memory/constitution.md` for principle validation
### 3. Build Semantic Models
Create internal representations (do not include raw artifacts in output):
- **Requirements inventory**: Each functional + non-functional requirement with a stable key (derive slug based on imperative phrase; e.g., "User can upload file" → `user-can-upload-file`)
- **User story/action inventory**: Discrete user actions with acceptance criteria
- **Task coverage mapping**: Map each task to one or more requirements or stories (inference by keyword / explicit reference patterns like IDs or key phrases)
- **Constitution rule set**: Extract principle names and MUST/SHOULD normative statements
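The stable-key derivation for the requirements inventory can be sketched as follows; Python and the exact normalization rules are illustrative assumptions, not mandated by this command:

```python
import re

def requirement_key(text: str) -> str:
    """Derive a stable slug key from a requirement's imperative phrase."""
    # Lowercase, collapse every non-alphanumeric run to a hyphen,
    # and trim stray hyphens at either end.
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

print(requirement_key("User can upload file"))  # user-can-upload-file
```

Because the key depends only on the phrase itself, rerunning the analysis without spec changes yields the same keys, supporting the deterministic-results principle below.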
### 4. Detection Passes (Token-Efficient Analysis)
Focus on high-signal findings. Limit to 50 findings total; aggregate remainder in overflow summary.
#### A. Duplication Detection
- Identify near-duplicate requirements
- Mark lower-quality phrasing for consolidation
#### B. Ambiguity Detection
- Flag vague adjectives (fast, scalable, secure, intuitive, robust) lacking measurable criteria
- Flag unresolved placeholders (TODO, TKTK, ???, `<placeholder>`, etc.)
#### C. Underspecification
- Requirements with verbs but missing object or measurable outcome
- User stories missing acceptance criteria alignment
- Tasks referencing files or components not defined in spec/plan
#### D. Constitution Alignment
- Any requirement or plan element conflicting with a MUST principle
- Missing mandated sections or quality gates from constitution
#### E. Coverage Gaps
- Requirements with zero associated tasks
- Tasks with no mapped requirement/story
- Non-functional requirements not reflected in tasks (e.g., performance, security)
#### F. Inconsistency
- Terminology drift (same concept named differently across files)
- Data entities referenced in plan but absent in spec (or vice versa)
- Task ordering contradictions (e.g., integration tasks before foundational setup tasks without dependency note)
- Conflicting requirements (e.g., one requires Next.js while another specifies Vue)
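A minimal sketch of how pass B's vague-adjective and placeholder checks could be mechanized (Python is illustrative; the word list and placeholder patterns come straight from the pass description above):

```python
import re

VAGUE = {"fast", "scalable", "secure", "intuitive", "robust"}
PLACEHOLDER = re.compile(r"TODO|TKTK|\?\?\?|<[^>]+>")

def ambiguity_findings(line_no: int, line: str) -> list[str]:
    """Flag vague adjectives and unresolved placeholders on one spec line."""
    findings = []
    for word in re.findall(r"[a-z]+", line.lower()):
        if word in VAGUE:
            findings.append(f"L{line_no}: vague adjective '{word}' lacks measurable criteria")
    if PLACEHOLDER.search(line):
        findings.append(f"L{line_no}: unresolved placeholder")
    return findings

print(ambiguity_findings(12, "The API must be fast and robust (TODO: quantify)"))
```

A real pass would also need measurable-criteria detection nearby (e.g., a number with units on the same line) to avoid flagging adjectives that are in fact quantified.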
### 5. Severity Assignment
Use this heuristic to prioritize findings:
- **CRITICAL**: Violates constitution MUST, missing core spec artifact, or requirement with zero coverage that blocks baseline functionality
- **HIGH**: Duplicate or conflicting requirement, ambiguous security/performance attribute, untestable acceptance criterion
- **MEDIUM**: Terminology drift, missing non-functional task coverage, underspecified edge case
- **LOW**: Style/wording improvements, minor redundancy not affecting execution order
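The tiers could be encoded as a simple decision function; the field names on `finding` are hypothetical, chosen only to mirror the heuristic above:

```python
def severity(finding: dict) -> str:
    """Map a finding to a severity tier per the heuristic above (a sketch)."""
    # Constitution MUST violations and zero-coverage blockers are always CRITICAL.
    if finding.get("constitution_must_violation") or finding.get("blocks_baseline"):
        return "CRITICAL"
    if finding.get("category") in {"duplication", "conflict"} or finding.get("security_or_perf_ambiguity"):
        return "HIGH"
    if finding.get("category") in {"terminology-drift", "nfr-coverage", "edge-case"}:
        return "MEDIUM"
    return "LOW"
```

Keeping the mapping in one place makes the CRITICAL-first ordering easy to audit and keeps reruns deterministic.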
### 6. Produce Compact Analysis Report
Output a Markdown report (no file writes) with the following structure:
## Specification Analysis Report
| ID | Category | Severity | Location(s) | Summary | Recommendation |
|----|----------|----------|-------------|---------|----------------|
| A1 | Duplication | HIGH | spec.md:L120-134 | Two similar requirements ... | Merge phrasing; keep clearer version |
(Add one row per finding; generate stable IDs prefixed by category initial.)
**Coverage Summary Table:**
| Requirement Key | Has Task? | Task IDs | Notes |
|-----------------|-----------|----------|-------|
**Constitution Alignment Issues:** (if any)
**Unmapped Tasks:** (if any)
**Metrics:**
- Total Requirements
- Total Tasks
- Coverage % (requirements with >=1 task)
- Ambiguity Count
- Duplication Count
- Critical Issues Count
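The Coverage % metric falls out directly from the task coverage mapping built in step 3; a sketch, assuming that mapping is a dict from requirement key to mapped task IDs:

```python
def coverage_percent(task_map: dict[str, list[str]]) -> float:
    """Percentage of requirement keys with at least one mapped task ID."""
    if not task_map:
        return 0.0
    covered = sum(1 for tasks in task_map.values() if tasks)
    return round(100 * covered / len(task_map), 1)

print(coverage_percent({"user-can-upload-file": ["T001"], "audit-logging": []}))  # 50.0
```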
### 7. Provide Next Actions
At end of report, output a concise Next Actions block:
- If CRITICAL issues exist: Recommend resolving before `/speckit.implement`
- If only LOW/MEDIUM: User may proceed, but provide improvement suggestions
- Provide explicit command suggestions: e.g., "Run /speckit.specify with refinement", "Run /speckit.plan to adjust architecture", "Manually edit tasks.md to add coverage for 'performance-metrics'"
### 8. Offer Remediation
Ask the user: "Would you like me to suggest concrete remediation edits for the top N issues?" (Do NOT apply them automatically.)
## Operating Principles
### Context Efficiency
- **Minimal high-signal tokens**: Focus on actionable findings, not exhaustive documentation
- **Progressive disclosure**: Load artifacts incrementally; don't dump all content into analysis
- **Token-efficient output**: Limit findings table to 50 rows; summarize overflow
- **Deterministic results**: Rerunning without changes should produce consistent IDs and counts
### Analysis Guidelines
- **NEVER modify files** (this is read-only analysis)
- **NEVER hallucinate missing sections** (if absent, report them accurately)
- **Prioritize constitution violations** (these are always CRITICAL)
- **Use examples over exhaustive rules** (cite specific instances, not generic patterns)
- **Report zero issues gracefully** (emit success report with coverage statistics)
## Context
$ARGUMENTS


@@ -1,294 +0,0 @@
---
description: Generate a custom checklist for the current feature based on user requirements.
---
## Checklist Purpose: "Unit Tests for English"
**CRITICAL CONCEPT**: Checklists are **UNIT TESTS FOR REQUIREMENTS WRITING** - they validate the quality, clarity, and completeness of requirements in a given domain.
**NOT for verification/testing**:
- ❌ NOT "Verify the button clicks correctly"
- ❌ NOT "Test error handling works"
- ❌ NOT "Confirm the API returns 200"
- ❌ NOT checking if code/implementation matches the spec
**FOR requirements quality validation**:
- ✅ "Are visual hierarchy requirements defined for all card types?" (completeness)
- ✅ "Is 'prominent display' quantified with specific sizing/positioning?" (clarity)
- ✅ "Are hover state requirements consistent across all interactive elements?" (consistency)
- ✅ "Are accessibility requirements defined for keyboard navigation?" (coverage)
- ✅ "Does the spec define what happens when logo image fails to load?" (edge cases)
**Metaphor**: If your spec is code written in English, the checklist is its unit test suite. You're testing whether the requirements are well-written, complete, unambiguous, and ready for implementation - NOT whether the implementation works.
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Execution Steps
1. **Setup**: Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json` from repo root and parse JSON for FEATURE_DIR and AVAILABLE_DOCS list.
- All file paths must be absolute.
- For single quotes in args like "I'm Groot", use the escape syntax `'I'\''m Groot'` (or double-quote the argument if possible: `"I'm Groot"`).
2. **Clarify intent (dynamic)**: Derive up to THREE initial contextual clarifying questions (no pre-baked catalog). They MUST:
- Be generated from the user's phrasing + extracted signals from spec/plan/tasks
- Only ask about information that materially changes checklist content
- Be skipped individually if already unambiguous in `$ARGUMENTS`
- Prefer precision over breadth
Generation algorithm:
1. Extract signals: feature domain keywords (e.g., auth, latency, UX, API), risk indicators ("critical", "must", "compliance"), stakeholder hints ("QA", "review", "security team"), and explicit deliverables ("a11y", "rollback", "contracts").
2. Cluster signals into candidate focus areas (max 4) ranked by relevance.
3. Identify probable audience & timing (author, reviewer, QA, release) if not explicit.
4. Detect missing dimensions: scope breadth, depth/rigor, risk emphasis, exclusion boundaries, measurable acceptance criteria.
5. Formulate questions chosen from these archetypes:
- Scope refinement (e.g., "Should this include integration touchpoints with X and Y or stay limited to local module correctness?")
- Risk prioritization (e.g., "Which of these potential risk areas should receive mandatory gating checks?")
- Depth calibration (e.g., "Is this a lightweight pre-commit sanity list or a formal release gate?")
- Audience framing (e.g., "Will this be used by the author only or peers during PR review?")
- Boundary exclusion (e.g., "Should we explicitly exclude performance tuning items this round?")
- Scenario class gap (e.g., "No recovery flows detected—are rollback / partial failure paths in scope?")
Question formatting rules:
- If presenting options, generate a compact table with columns: Option | Candidate | Why It Matters
- Limit to A–E options (five) maximum; omit table if a free-form answer is clearer
- Never ask the user to restate what they already said
- Avoid speculative categories (no hallucination). If uncertain, ask explicitly: "Confirm whether X belongs in scope."
Defaults when interaction impossible:
- Depth: Standard
- Audience: Reviewer (PR) if code-related; Author otherwise
- Focus: Top 2 relevance clusters
Output the questions (label Q1/Q2/Q3). After answers: if ≥2 scenario classes (Alternate / Exception / Recovery / Non-Functional domain) remain unclear, you MAY ask up to TWO more targeted follow-ups (Q4/Q5) with a one-line justification each (e.g., "Unresolved recovery path risk"). Do not exceed five total questions. Skip escalation if user explicitly declines more.
3. **Understand user request**: Combine `$ARGUMENTS` + clarifying answers:
- Derive checklist theme (e.g., security, review, deploy, ux)
- Consolidate explicit must-have items mentioned by user
- Map focus selections to category scaffolding
- Infer any missing context from spec/plan/tasks (do NOT hallucinate)
4. **Load feature context**: Read from FEATURE_DIR:
- spec.md: Feature requirements and scope
- plan.md (if exists): Technical details, dependencies
- tasks.md (if exists): Implementation tasks
**Context Loading Strategy**:
- Load only necessary portions relevant to active focus areas (avoid full-file dumping)
- Prefer summarizing long sections into concise scenario/requirement bullets
- Use progressive disclosure: add follow-on retrieval only if gaps detected
- If source docs are large, generate interim summary items instead of embedding raw text
5. **Generate checklist** - Create "Unit Tests for Requirements":
- Create `FEATURE_DIR/checklists/` directory if it doesn't exist
- Generate unique checklist filename:
- Use short, descriptive name based on domain (e.g., `ux.md`, `api.md`, `security.md`)
- Format: `[domain].md`
- If file exists, append to existing file
- Number items sequentially starting from CHK001
- Each `/speckit.checklist` run creates a NEW file (never overwrites existing checklists)
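The sequential CHK numbering can be sketched in a few lines; the item text below is a hypothetical example, and the helper name is ours:

```python
def checklist_lines(items: list[str], start: int = 1) -> list[str]:
    """Render checklist items as '- [ ] CHK###' lines with incrementing IDs."""
    return [f"- [ ] CHK{start + i:03d} - {item}" for i, item in enumerate(items)]

print(checklist_lines(["Are accessibility requirements specified? [Coverage, Gap]"]))
```

When appending to an existing checklist, `start` would be set past the highest CHK ID already in the file so IDs stay globally unique.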
**CORE PRINCIPLE - Test the Requirements, Not the Implementation**:
Every checklist item MUST evaluate the REQUIREMENTS THEMSELVES for:
- **Completeness**: Are all necessary requirements present?
- **Clarity**: Are requirements unambiguous and specific?
- **Consistency**: Do requirements align with each other?
- **Measurability**: Can requirements be objectively verified?
- **Coverage**: Are all scenarios/edge cases addressed?
**Category Structure** - Group items by requirement quality dimensions:
- **Requirement Completeness** (Are all necessary requirements documented?)
- **Requirement Clarity** (Are requirements specific and unambiguous?)
- **Requirement Consistency** (Do requirements align without conflicts?)
- **Acceptance Criteria Quality** (Are success criteria measurable?)
- **Scenario Coverage** (Are all flows/cases addressed?)
- **Edge Case Coverage** (Are boundary conditions defined?)
- **Non-Functional Requirements** (Performance, Security, Accessibility, etc. - are they specified?)
- **Dependencies & Assumptions** (Are they documented and validated?)
- **Ambiguities & Conflicts** (What needs clarification?)
**HOW TO WRITE CHECKLIST ITEMS - "Unit Tests for English"**:
**WRONG** (Testing implementation):
- "Verify landing page displays 3 episode cards"
- "Test hover states work on desktop"
- "Confirm logo click navigates home"
**CORRECT** (Testing requirements quality):
- "Are the exact number and layout of featured episodes specified?" [Completeness]
- "Is 'prominent display' quantified with specific sizing/positioning?" [Clarity]
- "Are hover state requirements consistent across all interactive elements?" [Consistency]
- "Are keyboard navigation requirements defined for all interactive UI?" [Coverage]
- "Is the fallback behavior specified when logo image fails to load?" [Edge Cases]
- "Are loading states defined for asynchronous episode data?" [Completeness]
- "Does the spec define visual hierarchy for competing UI elements?" [Clarity]
**ITEM STRUCTURE**:
Each item should follow this pattern:
- Question format asking about requirement quality
- Focus on what's WRITTEN (or not written) in the spec/plan
- Include quality dimension in brackets [Completeness/Clarity/Consistency/etc.]
- Reference spec section `[Spec §X.Y]` when checking existing requirements
- Use `[Gap]` marker when checking for missing requirements
**EXAMPLES BY QUALITY DIMENSION**:
Completeness:
- "Are error handling requirements defined for all API failure modes? [Gap]"
- "Are accessibility requirements specified for all interactive elements? [Completeness]"
- "Are mobile breakpoint requirements defined for responsive layouts? [Gap]"
Clarity:
- "Is 'fast loading' quantified with specific timing thresholds? [Clarity, Spec §NFR-2]"
- "Are 'related episodes' selection criteria explicitly defined? [Clarity, Spec §FR-5]"
- "Is 'prominent' defined with measurable visual properties? [Ambiguity, Spec §FR-4]"
Consistency:
- "Do navigation requirements align across all pages? [Consistency, Spec §FR-10]"
- "Are card component requirements consistent between landing and detail pages? [Consistency]"
Coverage:
- "Are requirements defined for zero-state scenarios (no episodes)? [Coverage, Edge Case]"
- "Are concurrent user interaction scenarios addressed? [Coverage, Gap]"
- "Are requirements specified for partial data loading failures? [Coverage, Exception Flow]"
Measurability:
- "Are visual hierarchy requirements measurable/testable? [Acceptance Criteria, Spec §FR-1]"
- "Can 'balanced visual weight' be objectively verified? [Measurability, Spec §FR-2]"
**Scenario Classification & Coverage** (Requirements Quality Focus):
- Check if requirements exist for: Primary, Alternate, Exception/Error, Recovery, Non-Functional scenarios
- For each scenario class, ask: "Are [scenario type] requirements complete, clear, and consistent?"
- If scenario class missing: "Are [scenario type] requirements intentionally excluded or missing? [Gap]"
- Include resilience/rollback when state mutation occurs: "Are rollback requirements defined for migration failures? [Gap]"
**Traceability Requirements**:
- MINIMUM: ≥80% of items MUST include at least one traceability reference
- Each item should reference: spec section `[Spec §X.Y]`, or use markers: `[Gap]`, `[Ambiguity]`, `[Conflict]`, `[Assumption]`
- If no ID system exists: "Is a requirement & acceptance criteria ID scheme established? [Traceability]"
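The ≥80% minimum is mechanically checkable; the pattern below is a sketch that treats any `Spec §` reference or `Gap` / `Ambiguity` / `Conflict` / `Assumption` marker as a traceability reference:

```python
import re

TRACE = re.compile(r"Spec §|\bGap\b|\bAmbiguity\b|\bConflict\b|\bAssumption\b")

def meets_traceability(items: list[str], minimum: float = 0.8) -> bool:
    """True when at least `minimum` of items carry a traceability reference."""
    traced = sum(1 for item in items if TRACE.search(item))
    return traced / len(items) >= minimum
```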
**Surface & Resolve Issues** (Requirements Quality Problems):
Ask questions about the requirements themselves:
- Ambiguities: "Is the term 'fast' quantified with specific metrics? [Ambiguity, Spec §NFR-1]"
- Conflicts: "Do navigation requirements conflict between §FR-10 and §FR-10a? [Conflict]"
- Assumptions: "Is the assumption of 'always available podcast API' validated? [Assumption]"
- Dependencies: "Are external podcast API requirements documented? [Dependency, Gap]"
- Missing definitions: "Is 'visual hierarchy' defined with measurable criteria? [Gap]"
**Content Consolidation**:
- Soft cap: If raw candidate items > 40, prioritize by risk/impact
- Merge near-duplicates checking the same requirement aspect
- If >5 low-impact edge cases, create one item: "Are edge cases X, Y, Z addressed in requirements? [Coverage]"
**🚫 ABSOLUTELY PROHIBITED** - These make it an implementation test, not a requirements test:
- ❌ Any item starting with "Verify", "Test", "Confirm", "Check" + implementation behavior
- ❌ References to code execution, user actions, system behavior
- ❌ "Displays correctly", "works properly", "functions as expected"
- ❌ "Click", "navigate", "render", "load", "execute"
- ❌ Test cases, test plans, QA procedures
- ❌ Implementation details (frameworks, APIs, algorithms)
**✅ REQUIRED PATTERNS** - These test requirements quality:
- ✅ "Are [requirement type] defined/specified/documented for [scenario]?"
- ✅ "Is [vague term] quantified/clarified with specific criteria?"
- ✅ "Are requirements consistent between [section A] and [section B]?"
- ✅ "Can [requirement] be objectively measured/verified?"
- ✅ "Are [edge cases/scenarios] addressed in requirements?"
- ✅ "Does the spec define [missing aspect]?"
6. **Structure Reference**: Generate the checklist following the canonical template in `.specify/templates/checklist-template.md` for title, meta section, category headings, and ID formatting. If template is unavailable, use: H1 title, purpose/created meta lines, `##` category sections containing `- [ ] CHK### <requirement item>` lines with globally incrementing IDs starting at CHK001.
7. **Report**: Output full path to created checklist, item count, and remind user that each run creates a new file. Summarize:
- Focus areas selected
- Depth level
- Actor/timing
- Any explicit user-specified must-have items incorporated
**Important**: Each `/speckit.checklist` command invocation creates a checklist file using short, descriptive names unless file already exists. This allows:
- Multiple checklists of different types (e.g., `ux.md`, `test.md`, `security.md`)
- Simple, memorable filenames that indicate checklist purpose
- Easy identification and navigation in the `checklists/` folder
To avoid clutter, use descriptive types and clean up obsolete checklists when done.
## Example Checklist Types & Sample Items
**UX Requirements Quality:** `ux.md`
Sample items (testing the requirements, NOT the implementation):
- "Are visual hierarchy requirements defined with measurable criteria? [Clarity, Spec §FR-1]"
- "Is the number and positioning of UI elements explicitly specified? [Completeness, Spec §FR-1]"
- "Are interaction state requirements (hover, focus, active) consistently defined? [Consistency]"
- "Are accessibility requirements specified for all interactive elements? [Coverage, Gap]"
- "Is fallback behavior defined when images fail to load? [Edge Case, Gap]"
- "Can 'prominent display' be objectively measured? [Measurability, Spec §FR-4]"
**API Requirements Quality:** `api.md`
Sample items:
- "Are error response formats specified for all failure scenarios? [Completeness]"
- "Are rate limiting requirements quantified with specific thresholds? [Clarity]"
- "Are authentication requirements consistent across all endpoints? [Consistency]"
- "Are retry/timeout requirements defined for external dependencies? [Coverage, Gap]"
- "Is versioning strategy documented in requirements? [Gap]"
**Performance Requirements Quality:** `performance.md`
Sample items:
- "Are performance requirements quantified with specific metrics? [Clarity]"
- "Are performance targets defined for all critical user journeys? [Coverage]"
- "Are performance requirements under different load conditions specified? [Completeness]"
- "Can performance requirements be objectively measured? [Measurability]"
- "Are degradation requirements defined for high-load scenarios? [Edge Case, Gap]"
**Security Requirements Quality:** `security.md`
Sample items:
- "Are authentication requirements specified for all protected resources? [Coverage]"
- "Are data protection requirements defined for sensitive information? [Completeness]"
- "Is the threat model documented and requirements aligned to it? [Traceability]"
- "Are security requirements consistent with compliance obligations? [Consistency]"
- "Are security failure/breach response requirements defined? [Gap, Exception Flow]"
## Anti-Examples: What NOT To Do
**❌ WRONG - These test implementation, not requirements:**
```markdown
- [ ] CHK001 - Verify landing page displays 3 episode cards [Spec §FR-001]
- [ ] CHK002 - Test hover states work correctly on desktop [Spec §FR-003]
- [ ] CHK003 - Confirm logo click navigates to home page [Spec §FR-010]
- [ ] CHK004 - Check that related episodes section shows 3-5 items [Spec §FR-005]
```
**✅ CORRECT - These test requirements quality:**
```markdown
- [ ] CHK001 - Are the number and layout of featured episodes explicitly specified? [Completeness, Spec §FR-001]
- [ ] CHK002 - Are hover state requirements consistently defined for all interactive elements? [Consistency, Spec §FR-003]
- [ ] CHK003 - Are navigation requirements clear for all clickable brand elements? [Clarity, Spec §FR-010]
- [ ] CHK004 - Is the selection criteria for related episodes documented? [Gap, Spec §FR-005]
- [ ] CHK005 - Are loading state requirements defined for asynchronous episode data? [Gap]
- [ ] CHK006 - Can "visual hierarchy" requirements be objectively measured? [Measurability, Spec §FR-001]
```
**Key Differences:**
- Wrong: Tests if the system works correctly
- Correct: Tests if the requirements are written correctly
- Wrong: Verification of behavior
- Correct: Validation of requirement quality
- Wrong: "Does it do X?"
- Correct: "Is X clearly specified?"


@@ -1,181 +0,0 @@
---
description: Identify underspecified areas in the current feature spec by asking up to 5 highly targeted clarification questions and encoding answers back into the spec.
handoffs:
- label: Build Technical Plan
agent: speckit.plan
prompt: Create a plan for the spec. I am building with...
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Outline
Goal: Detect and reduce ambiguity or missing decision points in the active feature specification and record the clarifications directly in the spec file.
Note: This clarification workflow is expected to run (and be completed) BEFORE invoking `/speckit.plan`. If the user explicitly states they are skipping clarification (e.g., exploratory spike), you may proceed, but must warn that downstream rework risk increases.
Execution steps:
1. Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json -PathsOnly` from repo root **once** (combined `--json --paths-only` mode / `-Json -PathsOnly`). Parse minimal JSON payload fields:
- `FEATURE_DIR`
- `FEATURE_SPEC`
- (Optionally capture `IMPL_PLAN`, `TASKS` for future chained flows.)
- If JSON parsing fails, abort and instruct user to re-run `/speckit.specify` or verify feature branch environment.
- For single quotes in args like "I'm Groot", use the escape syntax `'I'\''m Groot'` (or double-quote the argument if possible: `"I'm Groot"`).
2. Load the current spec file. Perform a structured ambiguity & coverage scan using this taxonomy. For each category, mark status: Clear / Partial / Missing. Produce an internal coverage map used for prioritization (do not output raw map unless no questions will be asked).
Functional Scope & Behavior:
- Core user goals & success criteria
- Explicit out-of-scope declarations
- User roles / personas differentiation
Domain & Data Model:
- Entities, attributes, relationships
- Identity & uniqueness rules
- Lifecycle/state transitions
- Data volume / scale assumptions
Interaction & UX Flow:
- Critical user journeys / sequences
- Error/empty/loading states
- Accessibility or localization notes
Non-Functional Quality Attributes:
- Performance (latency, throughput targets)
- Scalability (horizontal/vertical, limits)
- Reliability & availability (uptime, recovery expectations)
- Observability (logging, metrics, tracing signals)
- Security & privacy (authN/Z, data protection, threat assumptions)
- Compliance / regulatory constraints (if any)
Integration & External Dependencies:
- External services/APIs and failure modes
- Data import/export formats
- Protocol/versioning assumptions
Edge Cases & Failure Handling:
- Negative scenarios
- Rate limiting / throttling
- Conflict resolution (e.g., concurrent edits)
Constraints & Tradeoffs:
- Technical constraints (language, storage, hosting)
- Explicit tradeoffs or rejected alternatives
Terminology & Consistency:
- Canonical glossary terms
- Avoided synonyms / deprecated terms
Completion Signals:
- Acceptance criteria testability
- Measurable Definition of Done style indicators
Misc / Placeholders:
- TODO markers / unresolved decisions
- Ambiguous adjectives ("robust", "intuitive") lacking quantification
For each category with Partial or Missing status, add a candidate question opportunity unless:
- Clarification would not materially change implementation or validation strategy
- Information is better deferred to planning phase (note internally)
3. Generate (internally) a prioritized queue of candidate clarification questions (maximum 5). Do NOT output them all at once. Apply these constraints:
- Maximum of 5 total questions across the whole session (matching the quota in the validation step below).
- Each question must be answerable with EITHER:
- A short multiple-choice selection (2–5 distinct, mutually exclusive options), OR
- A one-word / short-phrase answer (explicitly constrain: "Answer in <=5 words").
- Only include questions whose answers materially impact architecture, data modeling, task decomposition, test design, UX behavior, operational readiness, or compliance validation.
- Ensure category coverage balance: attempt to cover the highest impact unresolved categories first; avoid asking two low-impact questions when a single high-impact area (e.g., security posture) is unresolved.
- Exclude questions already answered, trivial stylistic preferences, or plan-level execution details (unless blocking correctness).
- Favor clarifications that reduce downstream rework risk or prevent misaligned acceptance tests.
- If more than 5 categories remain unresolved, select the top 5 by (Impact * Uncertainty) heuristic.
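The (Impact * Uncertainty) selection can be sketched as below; the 1–5 scoring scale and the category names are illustrative assumptions:

```python
def question_queue(categories: dict[str, tuple[int, int]], limit: int = 5) -> list[str]:
    """Rank unresolved categories by impact * uncertainty, keep the top `limit`."""
    ranked = sorted(categories,
                    key=lambda c: categories[c][0] * categories[c][1],
                    reverse=True)
    return ranked[:limit]

print(question_queue({
    "security": (5, 4),    # (impact, uncertainty) -- illustrative scores
    "ux-flow": (3, 2),
    "data-model": (4, 3),
}))
```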
4. Sequential questioning loop (interactive):
- Present EXACTLY ONE question at a time.
- For multiple-choice questions:
- **Analyze all options** and determine the **most suitable option** based on:
- Best practices for the project type
- Common patterns in similar implementations
- Risk reduction (security, performance, maintainability)
- Alignment with any explicit project goals or constraints visible in the spec
- Present your **recommended option prominently** at the top with clear reasoning (1-2 sentences explaining why this is the best choice).
- Format as: `**Recommended:** Option [X] - <reasoning>`
- Then render all options as a Markdown table:
| Option | Description |
|--------|-------------|
| A | <Option A description> |
| B | <Option B description> |
| C | <Option C description> (add D/E as needed up to 5) |
| Short | Provide a different short answer (<=5 words) (Include only if free-form alternative is appropriate) |
- After the table, add: `You can reply with the option letter (e.g., "A"), accept the recommendation by saying "yes" or "recommended", or provide your own short answer.`
- For short-answer style (no meaningful discrete options):
- Provide your **suggested answer** based on best practices and context.
- Format as: `**Suggested:** <your proposed answer> - <brief reasoning>`
- Then output: `Format: Short answer (<=5 words). You can accept the suggestion by saying "yes" or "suggested", or provide your own answer.`
- After the user answers:
- If the user replies with "yes", "recommended", or "suggested", use your previously stated recommendation/suggestion as the answer.
- Otherwise, validate the answer maps to one option or fits the <=5 word constraint.
- If ambiguous, ask for a quick disambiguation (count still belongs to same question; do not advance).
- Once satisfactory, record it in working memory (do not yet write to disk) and move to the next queued question.
- Stop asking further questions when:
- All critical ambiguities resolved early (remaining queued items become unnecessary), OR
- User signals completion ("done", "good", "no more"), OR
- You reach 5 asked questions.
- Never reveal future queued questions in advance.
- If no valid questions exist at start, immediately report no critical ambiguities.
5. Integration after EACH accepted answer (incremental update approach):
- Maintain in-memory representation of the spec (loaded once at start) plus the raw file contents.
- For the first integrated answer in this session:
- Ensure a `## Clarifications` section exists (create it just after the highest-level contextual/overview section per the spec template if missing).
- Under it, create (if not present) a `### Session YYYY-MM-DD` subheading for today.
- Append a bullet line immediately after acceptance: `- Q: <question> → A: <final answer>`.
- Then immediately apply the clarification to the most appropriate section(s):
- Functional ambiguity → Update or add a bullet in Functional Requirements.
- User interaction / actor distinction → Update User Stories or Actors subsection (if present) with clarified role, constraint, or scenario.
- Data shape / entities → Update Data Model (add fields, types, relationships) preserving ordering; note added constraints succinctly.
- Non-functional constraint → Add/modify measurable criteria in Non-Functional / Quality Attributes section (convert vague adjective to metric or explicit target).
- Edge case / negative flow → Add a new bullet under Edge Cases / Error Handling (or create such subsection if template provides placeholder for it).
- Terminology conflict → Normalize term across spec; retain original only if necessary by adding `(formerly referred to as "X")` once.
- If the clarification invalidates an earlier ambiguous statement, replace that statement instead of duplicating; leave no obsolete contradictory text.
- Save the spec file AFTER each integration to minimize risk of context loss (atomic overwrite).
- Preserve formatting: do not reorder unrelated sections; keep heading hierarchy intact.
- Keep each inserted clarification minimal and testable (avoid narrative drift).
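As an illustration, after the first accepted answer the spec's clarification log might read (the question, answer, and date below are invented placeholders, not from a real spec):

```markdown
## Clarifications

### Session 2025-01-15

- Q: Should deleted items be recoverable? → A: Yes, soft-delete with 30-day retention
```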
6. Validation (performed after EACH write plus final pass):
- Clarifications session contains exactly one bullet per accepted answer (no duplicates).
- Total asked (accepted) questions ≤ 5.
- Updated sections contain no lingering vague placeholders the new answer was meant to resolve.
- No contradictory earlier statement remains (scan for now-invalid alternative choices removed).
- Markdown structure valid; only allowed new headings: `## Clarifications`, `### Session YYYY-MM-DD`.
- Terminology consistency: same canonical term used across all updated sections.
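The first two checks are mechanical enough to sketch; the spec content below is illustrative, and a real pass would run against `FEATURE_SPEC`:

```shell
# Minimal sketch of the "one bullet per answer, no duplicates" validation
SPEC=$(mktemp)
cat > "$SPEC" <<'EOF'
## Clarifications

### Session 2025-01-15

- Q: Should deleted items be recoverable? → A: Yes, 30-day soft delete
- Q: Who can assign roles? → A: Admins only
EOF
COUNT=$(grep -c '^- Q: ' "$SPEC")                               # one bullet per accepted answer
DUPES=$(grep '^- Q: ' "$SPEC" | sort | uniq -d | wc -l | tr -d ' ')
echo "count=$COUNT dupes=$DUPES"
[ "$COUNT" -le 5 ] && [ "$DUPES" -eq 0 ] && echo "validation OK"
```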
7. Write the updated spec back to `FEATURE_SPEC`.
8. Report completion (after questioning loop ends or early termination):
- Number of questions asked & answered.
- Path to updated spec.
- Sections touched (list names).
   - Coverage summary table listing each taxonomy category with one Status per category:
     - Resolved (was Partial/Missing and addressed)
     - Deferred (exceeds question quota or better suited for planning)
     - Clear (already sufficient)
     - Outstanding (still Partial/Missing but low impact)
- If any Outstanding or Deferred remain, recommend whether to proceed to `/speckit.plan` or run `/speckit.clarify` again later post-plan.
- Suggested next command.
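For example, the final coverage summary might look like this (category names follow whatever taxonomy was used; the rows are illustrative):

```markdown
| Category | Status |
|----------|--------|
| Functional Scope & Behavior | Resolved |
| Domain & Data Model | Clear |
| Non-Functional Quality Attributes | Deferred (better suited post-plan) |
| Edge Cases & Failure Handling | Outstanding (low impact) |
```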
Behavior rules:
- If no meaningful ambiguities found (or all potential questions would be low-impact), respond: "No critical ambiguities detected worth formal clarification." and suggest proceeding.
- If spec file missing, instruct user to run `/speckit.specify` first (do not create a new spec here).
- Never exceed 5 total asked questions (clarification retries for a single question do not count as new questions).
- Avoid speculative tech stack questions unless the absence blocks functional clarity.
- Respect user early termination signals ("stop", "done", "proceed").
- If no questions asked due to full coverage, output a compact coverage summary (all categories Clear) then suggest advancing.
- If quota reached with unresolved high-impact categories remaining, explicitly flag them under Deferred with rationale.
Context for prioritization: $ARGUMENTS

View File

@@ -1,84 +0,0 @@
---
description: Create or update the project constitution from interactive or provided principle inputs, ensuring all dependent templates stay in sync.
handoffs:
- label: Build Specification
agent: speckit.specify
prompt: Implement the feature specification based on the updated constitution. I want to build...
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Outline
You are updating the project constitution at `.specify/memory/constitution.md`. This file is a TEMPLATE containing placeholder tokens in square brackets (e.g. `[PROJECT_NAME]`, `[PRINCIPLE_1_NAME]`). Your job is to (a) collect/derive concrete values, (b) fill the template precisely, and (c) propagate any amendments across dependent artifacts.
**Note**: If `.specify/memory/constitution.md` does not exist yet, it should have been initialized from `.specify/templates/constitution-template.md` during project setup. If it's missing, copy the template first.
Follow this execution flow:
1. Load the existing constitution at `.specify/memory/constitution.md`.
- Identify every placeholder token of the form `[ALL_CAPS_IDENTIFIER]`.
   **IMPORTANT**: The user might require fewer or more principles than the template contains. If a number is specified, respect it while following the general template structure, and update the document accordingly.
2. Collect/derive values for placeholders:
- If user input (conversation) supplies a value, use it.
- Otherwise infer from existing repo context (README, docs, prior constitution versions if embedded).
- For governance dates: `RATIFICATION_DATE` is the original adoption date (if unknown ask or mark TODO), `LAST_AMENDED_DATE` is today if changes are made, otherwise keep previous.
- `CONSTITUTION_VERSION` must increment according to semantic versioning rules:
- MAJOR: Backward incompatible governance/principle removals or redefinitions.
- MINOR: New principle/section added or materially expanded guidance.
- PATCH: Clarifications, wording, typo fixes, non-semantic refinements.
- If version bump type ambiguous, propose reasoning before finalizing.
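Once the bump type is decided by the rules above, the version arithmetic itself is mechanical; a sketch (the function name is ours, not part of the toolkit):

```shell
# Sketch: compute the next CONSTITUTION_VERSION from a semver bump type
bump_version() {
  ver=$1; kind=$2
  major=${ver%%.*}; rest=${ver#*.}
  minor=${rest%%.*}; patch=${rest#*.}
  case "$kind" in
    MAJOR) echo "$((major + 1)).0.0" ;;   # incompatible governance change
    MINOR) echo "${major}.$((minor + 1)).0" ;;   # new or expanded principle
    PATCH) echo "${major}.${minor}.$((patch + 1))" ;;   # wording/typo fixes
  esac
}
bump_version 2.3.1 MINOR   # a new principle was added
```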
3. Draft the updated constitution content:
- Replace every placeholder with concrete text (no bracketed tokens left except intentionally retained template slots that the project has chosen not to define yet—explicitly justify any left).
   - Preserve the heading hierarchy; comments may be removed once replaced unless they still add clarifying guidance.
   - Ensure each Principle section has a succinct name line, a paragraph (or bullet list) capturing non-negotiable rules, and an explicit rationale if not obvious.
- Ensure Governance section lists amendment procedure, versioning policy, and compliance review expectations.
4. Consistency propagation checklist (convert prior checklist into active validations):
- Read `.specify/templates/plan-template.md` and ensure any "Constitution Check" or rules align with updated principles.
- Read `.specify/templates/spec-template.md` for scope/requirements alignment—update if constitution adds/removes mandatory sections or constraints.
- Read `.specify/templates/tasks-template.md` and ensure task categorization reflects new or removed principle-driven task types (e.g., observability, versioning, testing discipline).
   - Read each command file in `.specify/templates/commands/*.md` (including this one) to verify no outdated references remain (e.g., agent-specific names like CLAUDE where generic guidance is required).
- Read any runtime guidance docs (e.g., `README.md`, `docs/quickstart.md`, or agent-specific guidance files if present). Update references to principles changed.
5. Produce a Sync Impact Report (prepend as an HTML comment at top of the constitution file after update):
- Version change: old → new
- List of modified principles (old title → new title if renamed)
- Added sections
- Removed sections
- Templates requiring updates (✅ updated / ⚠ pending) with file paths
- Follow-up TODOs if any placeholders intentionally deferred.
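A report following that shape might be prepended like this (every value below is an invented example):

```markdown
<!--
Sync Impact Report
- Version change: 2.3.1 → 2.4.0
- Modified principles: "Testing" → "Testing Discipline"
- Added sections: Observability
- Removed sections: none
- Templates: ✅ .specify/templates/plan-template.md, ⚠ pending .specify/templates/tasks-template.md
- Follow-up TODOs: TODO(RATIFICATION_DATE): original adoption date unknown
-->
```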
6. Validation before final output:
- No remaining unexplained bracket tokens.
- Version line matches report.
- Dates ISO format YYYY-MM-DD.
- Principles are declarative, testable, and free of vague language ("should" → replace with MUST/SHOULD rationale where appropriate).
7. Write the completed constitution back to `.specify/memory/constitution.md` (overwrite).
8. Output a final summary to the user with:
- New version and bump rationale.
- Any files flagged for manual follow-up.
- Suggested commit message (e.g., `docs: amend constitution to vX.Y.Z (principle additions + governance update)`).
Formatting & Style Requirements:
- Use Markdown headings exactly as in the template (do not demote/promote levels).
- Wrap long rationale lines to keep readability (<100 chars ideally) but do not hard enforce with awkward breaks.
- Keep a single blank line between sections.
- Avoid trailing whitespace.
If the user supplies partial updates (e.g., only one principle revision), still perform validation and version decision steps.
If critical info missing (e.g., ratification date truly unknown), insert `TODO(<FIELD_NAME>): explanation` and include in the Sync Impact Report under deferred items.
Do not create a new template; always operate on the existing `.specify/memory/constitution.md` file.

View File

@@ -1,135 +0,0 @@
---
description: Execute the implementation plan by processing and executing all tasks defined in tasks.md
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Outline
1. Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
2. **Check checklists status** (if FEATURE_DIR/checklists/ exists):
- Scan all checklist files in the checklists/ directory
- For each checklist, count:
- Total items: All lines matching `- [ ]` or `- [X]` or `- [x]`
- Completed items: Lines matching `- [X]` or `- [x]`
- Incomplete items: Lines matching `- [ ]`
- Create a status table:
```text
| Checklist | Total | Completed | Incomplete | Status |
|-----------|-------|-----------|------------|--------|
| ux.md | 12 | 12 | 0 | ✓ PASS |
| test.md | 8 | 5 | 3 | ✗ FAIL |
| security.md | 6 | 6 | 0 | ✓ PASS |
```
- Calculate overall status:
- **PASS**: All checklists have 0 incomplete items
- **FAIL**: One or more checklists have incomplete items
- **If any checklist is incomplete**:
- Display the table with incomplete item counts
- **STOP** and ask: "Some checklists are incomplete. Do you want to proceed with implementation anyway? (yes/no)"
- Wait for user response before continuing
- If user says "no" or "wait" or "stop", halt execution
- If user says "yes" or "proceed" or "continue", proceed to step 3
- **If all checklists are complete**:
- Display the table showing all checklists passed
- Automatically proceed to step 3
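The counting rules above can be sketched with `grep`; the sample checklist here is illustrative, and a real run would loop over `FEATURE_DIR/checklists/*.md`:

```shell
# Sketch: tally one checklist the way step 2 describes
CHECKLIST=$(mktemp)
cat > "$CHECKLIST" <<'EOF'
- [x] Error states defined
- [X] Empty states defined
- [ ] Loading states defined
EOF
TOTAL=$(grep -cE '^- \[( |x|X)\]' "$CHECKLIST")   # all checkbox lines
DONE=$(grep -cE '^- \[(x|X)\]' "$CHECKLIST")      # completed only
INCOMPLETE=$((TOTAL - DONE))
echo "total=$TOTAL done=$DONE incomplete=$INCOMPLETE"
```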
3. Load and analyze the implementation context:
- **REQUIRED**: Read tasks.md for the complete task list and execution plan
- **REQUIRED**: Read plan.md for tech stack, architecture, and file structure
- **IF EXISTS**: Read data-model.md for entities and relationships
- **IF EXISTS**: Read contracts/ for API specifications and test requirements
- **IF EXISTS**: Read research.md for technical decisions and constraints
- **IF EXISTS**: Read quickstart.md for integration scenarios
4. **Project Setup Verification**:
- **REQUIRED**: Create/verify ignore files based on actual project setup:
**Detection & Creation Logic**:
- Check if the following command succeeds to determine if the repository is a git repo (create/verify .gitignore if so):
```sh
git rev-parse --git-dir 2>/dev/null
```
- Check if Dockerfile* exists or Docker in plan.md → create/verify .dockerignore
- Check if .eslintrc* exists → create/verify .eslintignore
- Check if eslint.config.* exists → ensure the config's `ignores` entries cover required patterns
- Check if .prettierrc* exists → create/verify .prettierignore
- Check if .npmrc or package.json exists → create/verify .npmignore (if publishing)
- Check if terraform files (*.tf) exist → create/verify .terraformignore
- Check if .helmignore needed (helm charts present) → create/verify .helmignore
**If ignore file already exists**: Verify it contains essential patterns, append missing critical patterns only
**If ignore file missing**: Create with full pattern set for detected technology
**Common Patterns by Technology** (from plan.md tech stack):
- **Node.js/JavaScript/TypeScript**: `node_modules/`, `dist/`, `build/`, `*.log`, `.env*`
- **Python**: `__pycache__/`, `*.pyc`, `.venv/`, `venv/`, `dist/`, `*.egg-info/`
- **Java**: `target/`, `*.class`, `*.jar`, `.gradle/`, `build/`
- **C#/.NET**: `bin/`, `obj/`, `*.user`, `*.suo`, `packages/`
- **Go**: `*.exe`, `*.test`, `vendor/`, `*.out`
- **Ruby**: `.bundle/`, `log/`, `tmp/`, `*.gem`, `vendor/bundle/`
- **PHP**: `vendor/`, `*.log`, `*.cache`, `*.env`
- **Rust**: `target/`, `debug/`, `release/`, `*.rs.bk`, `*.rlib`, `*.prof*`, `.idea/`, `*.log`, `.env*`
- **Kotlin**: `build/`, `out/`, `.gradle/`, `.idea/`, `*.class`, `*.jar`, `*.iml`, `*.log`, `.env*`
- **C++**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.so`, `*.a`, `*.exe`, `*.dll`, `.idea/`, `*.log`, `.env*`
- **C**: `build/`, `bin/`, `obj/`, `out/`, `*.o`, `*.a`, `*.so`, `*.exe`, `Makefile`, `config.log`, `.idea/`, `*.log`, `.env*`
- **Swift**: `.build/`, `DerivedData/`, `*.swiftpm/`, `Packages/`
- **R**: `.Rproj.user/`, `.Rhistory`, `.RData`, `.Ruserdata`, `*.Rproj`, `packrat/`, `renv/`
- **Universal**: `.DS_Store`, `Thumbs.db`, `*.tmp`, `*.swp`, `.vscode/`, `.idea/`
**Tool-Specific Patterns**:
- **Docker**: `node_modules/`, `.git/`, `Dockerfile*`, `.dockerignore`, `*.log*`, `.env*`, `coverage/`
- **ESLint**: `node_modules/`, `dist/`, `build/`, `coverage/`, `*.min.js`
- **Prettier**: `node_modules/`, `dist/`, `build/`, `coverage/`, `package-lock.json`, `yarn.lock`, `pnpm-lock.yaml`
- **Terraform**: `.terraform/`, `*.tfstate*`, `*.tfvars`, `.terraform.lock.hcl`
- **Kubernetes/k8s**: `*.secret.yaml`, `secrets/`, `.kube/`, `kubeconfig*`, `*.key`, `*.crt`
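The "append missing critical patterns only" rule can be sketched as follows; the patterns assume a Node.js stack detected from plan.md, and the temp file stands in for a real `.gitignore`:

```shell
# Sketch: idempotently ensure critical patterns exist in an ignore file
IGNORE=$(mktemp)
for p in 'node_modules/' 'dist/' '.env*'; do
  grep -qxF "$p" "$IGNORE" || echo "$p" >> "$IGNORE"
done
# A second pass appends nothing, so existing files are only topped up
for p in 'node_modules/' 'dist/' '.env*'; do
  grep -qxF "$p" "$IGNORE" || echo "$p" >> "$IGNORE"
done
wc -l < "$IGNORE" | tr -d ' '
```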
5. Parse tasks.md structure and extract:
- **Task phases**: Setup, Tests, Core, Integration, Polish
- **Task dependencies**: Sequential vs parallel execution rules
- **Task details**: ID, description, file paths, parallel markers [P]
- **Execution flow**: Order and dependency requirements
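Extracting those details from a single task line might look like this (the task line is invented, and the regexes are a naive sketch rather than a full parser):

```shell
# Sketch: pull the ID, parallel marker, and file path out of one task line
LINE='- [ ] T012 [P] [US1] Create User model in src/models/user.py'
ID=$(printf '%s' "$LINE" | grep -oE 'T[0-9]+')
PARALLEL=$(printf '%s' "$LINE" | grep -oE '\[P\]' || true)
FILE=$(printf '%s' "$LINE" | grep -oE '[A-Za-z0-9_./-]+\.[a-z]+$')
echo "id=$ID parallel=${PARALLEL:-no} file=$FILE"
```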
6. Execute implementation following the task plan:
- **Phase-by-phase execution**: Complete each phase before moving to the next
- **Respect dependencies**: Run sequential tasks in order, parallel tasks [P] can run together
- **Follow TDD approach**: Execute test tasks before their corresponding implementation tasks
- **File-based coordination**: Tasks affecting the same files must run sequentially
- **Validation checkpoints**: Verify each phase completion before proceeding
7. Implementation execution rules:
- **Setup first**: Initialize project structure, dependencies, configuration
   - **Tests before code**: If tests are requested, write tests for contracts, entities, and integration scenarios before the corresponding implementation
- **Core development**: Implement models, services, CLI commands, endpoints
- **Integration work**: Database connections, middleware, logging, external services
- **Polish and validation**: Unit tests, performance optimization, documentation
8. Progress tracking and error handling:
- Report progress after each completed task
- Halt execution if any non-parallel task fails
- For parallel tasks [P], continue with successful tasks, report failed ones
- Provide clear error messages with context for debugging
- Suggest next steps if implementation cannot proceed
- **IMPORTANT** For completed tasks, make sure to mark the task off as [X] in the tasks file.
9. Completion validation:
- Verify all required tasks are completed
- Check that implemented features match the original specification
- Validate that tests pass and coverage meets requirements
- Confirm the implementation follows the technical plan
- Report final status with summary of completed work
Note: This command assumes a complete task breakdown exists in tasks.md. If tasks are incomplete or missing, suggest running `/speckit.tasks` first to regenerate the task list.

View File

@@ -1,90 +0,0 @@
---
description: Execute the implementation planning workflow using the plan template to generate design artifacts.
handoffs:
- label: Create Tasks
agent: speckit.tasks
prompt: Break the plan into tasks
send: true
- label: Create Checklist
agent: speckit.checklist
prompt: Create a checklist for the following domain...
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Outline
1. **Setup**: Run `.specify/scripts/powershell/setup-plan.ps1 -Json` from repo root and parse JSON for FEATURE_SPEC, IMPL_PLAN, SPECS_DIR, BRANCH. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
2. **Load context**: Read FEATURE_SPEC and `.specify/memory/constitution.md`. Load IMPL_PLAN template (already copied).
3. **Execute plan workflow**: Follow the structure in IMPL_PLAN template to:
- Fill Technical Context (mark unknowns as "NEEDS CLARIFICATION")
- Fill Constitution Check section from constitution
- Evaluate gates (ERROR if violations unjustified)
- Phase 0: Generate research.md (resolve all NEEDS CLARIFICATION)
- Phase 1: Generate data-model.md, contracts/, quickstart.md
- Phase 1: Update agent context by running the agent script
- Re-evaluate Constitution Check post-design
4. **Stop and report**: Command ends after Phase 2 planning. Report branch, IMPL_PLAN path, and generated artifacts.
## Phases
### Phase 0: Outline & Research
1. **Extract unknowns from Technical Context** above:
- For each NEEDS CLARIFICATION → research task
- For each dependency → best practices task
- For each integration → patterns task
2. **Generate and dispatch research agents**:
```text
For each unknown in Technical Context:
Task: "Research {unknown} for {feature context}"
For each technology choice:
Task: "Find best practices for {tech} in {domain}"
```
3. **Consolidate findings** in `research.md` using format:
- Decision: [what was chosen]
- Rationale: [why chosen]
- Alternatives considered: [what else evaluated]
**Output**: research.md with all NEEDS CLARIFICATION resolved
### Phase 1: Design & Contracts
**Prerequisites:** `research.md` complete
1. **Extract entities from feature spec** → `data-model.md`:
- Entity name, fields, relationships
- Validation rules from requirements
- State transitions if applicable
2. **Define interface contracts** (if project has external interfaces) → `/contracts/`:
- Identify what interfaces the project exposes to users or other systems
- Document the contract format appropriate for the project type
- Examples: public APIs for libraries, command schemas for CLI tools, endpoints for web services, grammars for parsers, UI contracts for applications
- Skip if project is purely internal (build scripts, one-off tools, etc.)
3. **Agent context update**:
- Run `.specify/scripts/powershell/update-agent-context.ps1 -AgentType claude`
- These scripts detect which AI agent is in use
- Update the appropriate agent-specific context file
- Add only new technology from current plan
- Preserve manual additions between markers
**Output**: data-model.md, /contracts/*, quickstart.md, agent-specific file
## Key rules
- Use absolute paths
- ERROR on gate failures or unresolved clarifications

View File

@@ -1,258 +0,0 @@
---
description: Create or update the feature specification from a natural language feature description.
handoffs:
- label: Build Technical Plan
agent: speckit.plan
prompt: Create a plan for the spec. I am building with...
- label: Clarify Spec Requirements
agent: speckit.clarify
prompt: Clarify specification requirements
send: true
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Outline
The text the user typed after `/speckit.specify` in the triggering message **is** the feature description. Assume you always have it available in this conversation even if `$ARGUMENTS` appears literally below. Do not ask the user to repeat it unless they provided an empty command.
Given that feature description, do this:
1. **Generate a concise short name** (2-4 words) for the branch:
- Analyze the feature description and extract the most meaningful keywords
- Create a 2-4 word short name that captures the essence of the feature
- Use action-noun format when possible (e.g., "add-user-auth", "fix-payment-bug")
- Preserve technical terms and acronyms (OAuth2, API, JWT, etc.)
- Keep it concise but descriptive enough to understand the feature at a glance
- Examples:
- "I want to add user authentication" → "user-auth"
- "Implement OAuth2 integration for the API" → "oauth2-api-integration"
- "Create a dashboard for analytics" → "analytics-dashboard"
- "Fix payment processing timeout bug" → "fix-payment-timeout"
2. **Check for existing branches before creating new one**:
a. First, fetch all remote branches to ensure we have the latest information:
```bash
git fetch --all --prune
```
b. Find the highest feature number across all sources for the short-name:
- Remote branches: `git ls-remote --heads origin | grep -E 'refs/heads/[0-9]+-<short-name>$'`
- Local branches: `git branch | grep -E '^[* ]*[0-9]+-<short-name>$'`
- Specs directories: Check for directories matching `specs/[0-9]+-<short-name>`
c. Determine the next available number:
- Extract all numbers from all three sources
- Find the highest number N
- Use N+1 for the new branch number
   d. Run the script `.specify/scripts/powershell/create-new-feature.ps1` with the calculated number and short-name:
      - Pass `--number N+1` and `--short-name "your-short-name"` along with the feature description
      - Bash example: `.specify/scripts/powershell/create-new-feature.ps1 --json --number 5 --short-name "user-auth" "Add user authentication"`
      - PowerShell example: `.specify/scripts/powershell/create-new-feature.ps1 -Json -Number 5 -ShortName "user-auth" "Add user authentication"`
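The "highest N across all sources, then N+1" computation can be sketched like this; the branch list below is an illustrative stand-in for the merged output of `git ls-remote`, `git branch`, and a `specs/` directory listing:

```shell
# Sketch: compute the next feature number for short name "user-auth"
EXISTING='3-user-auth
7-user-auth
5-user-auth'
HIGHEST=$(printf '%s\n' "$EXISTING" | sed -E 's/^([0-9]+)-.*/\1/' | sort -n | tail -n 1)
NEXT=$(( ${HIGHEST:-0} + 1 ))   # defaults to 1 when no matches exist
echo "next=$NEXT"
```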
**IMPORTANT**:
- Check all three sources (remote branches, local branches, specs directories) to find the highest number
- Only match branches/directories with the exact short-name pattern
- If no existing branches/directories found with this short-name, start with number 1
- You must only ever run this script once per feature
- The JSON is provided in the terminal as output - always refer to it to get the actual content you're looking for
- The JSON output will contain BRANCH_NAME and SPEC_FILE paths
   - For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot")
3. Load `.specify/templates/spec-template.md` to understand required sections.
4. Follow this execution flow:
1. Parse user description from Input
If empty: ERROR "No feature description provided"
2. Extract key concepts from description
Identify: actors, actions, data, constraints
3. For unclear aspects:
- Make informed guesses based on context and industry standards
- Only mark with [NEEDS CLARIFICATION: specific question] if:
- The choice significantly impacts feature scope or user experience
- Multiple reasonable interpretations exist with different implications
- No reasonable default exists
- **LIMIT: Maximum 3 [NEEDS CLARIFICATION] markers total**
- Prioritize clarifications by impact: scope > security/privacy > user experience > technical details
4. Fill User Scenarios & Testing section
If no clear user flow: ERROR "Cannot determine user scenarios"
5. Generate Functional Requirements
Each requirement must be testable
Use reasonable defaults for unspecified details (document assumptions in Assumptions section)
6. Define Success Criteria
Create measurable, technology-agnostic outcomes
Include both quantitative metrics (time, performance, volume) and qualitative measures (user satisfaction, task completion)
Each criterion must be verifiable without implementation details
7. Identify Key Entities (if data involved)
8. Return: SUCCESS (spec ready for planning)
5. Write the specification to SPEC_FILE using the template structure, replacing placeholders with concrete details derived from the feature description (arguments) while preserving section order and headings.
6. **Specification Quality Validation**: After writing the initial spec, validate it against quality criteria:
a. **Create Spec Quality Checklist**: Generate a checklist file at `FEATURE_DIR/checklists/requirements.md` using the checklist template structure with these validation items:
```markdown
# Specification Quality Checklist: [FEATURE NAME]
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: [DATE]
**Feature**: [Link to spec.md]
## Content Quality
- [ ] No implementation details (languages, frameworks, APIs)
- [ ] Focused on user value and business needs
- [ ] Written for non-technical stakeholders
- [ ] All mandatory sections completed
## Requirement Completeness
- [ ] No [NEEDS CLARIFICATION] markers remain
- [ ] Requirements are testable and unambiguous
- [ ] Success criteria are measurable
- [ ] Success criteria are technology-agnostic (no implementation details)
- [ ] All acceptance scenarios are defined
- [ ] Edge cases are identified
- [ ] Scope is clearly bounded
- [ ] Dependencies and assumptions identified
## Feature Readiness
- [ ] All functional requirements have clear acceptance criteria
- [ ] User scenarios cover primary flows
- [ ] Feature meets measurable outcomes defined in Success Criteria
- [ ] No implementation details leak into specification
## Notes
- Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`
```
b. **Run Validation Check**: Review the spec against each checklist item:
- For each item, determine if it passes or fails
- Document specific issues found (quote relevant spec sections)
c. **Handle Validation Results**:
      - **If all items pass**: Mark checklist complete and proceed to step 7
- **If items fail (excluding [NEEDS CLARIFICATION])**:
1. List the failing items and specific issues
2. Update the spec to address each issue
3. Re-run validation until all items pass (max 3 iterations)
4. If still failing after 3 iterations, document remaining issues in checklist notes and warn user
- **If [NEEDS CLARIFICATION] markers remain**:
1. Extract all [NEEDS CLARIFICATION: ...] markers from the spec
2. **LIMIT CHECK**: If more than 3 markers exist, keep only the 3 most critical (by scope/security/UX impact) and make informed guesses for the rest
3. For each clarification needed (max 3), present options to user in this format:
```markdown
## Question [N]: [Topic]
**Context**: [Quote relevant spec section]
**What we need to know**: [Specific question from NEEDS CLARIFICATION marker]
**Suggested Answers**:
| Option | Answer | Implications |
|--------|--------|--------------|
| A | [First suggested answer] | [What this means for the feature] |
| B | [Second suggested answer] | [What this means for the feature] |
| C | [Third suggested answer] | [What this means for the feature] |
| Custom | Provide your own answer | [Explain how to provide custom input] |
**Your choice**: _[Wait for user response]_
```
4. **CRITICAL - Table Formatting**: Ensure markdown tables are properly formatted:
- Use consistent spacing with pipes aligned
- Each cell should have spaces around content: `| Content |` not `|Content|`
- Header separator must have at least 3 dashes: `|--------|`
- Test that the table renders correctly in markdown preview
5. Number questions sequentially (Q1, Q2, Q3 - max 3 total)
6. Present all questions together before waiting for responses
7. Wait for user to respond with their choices for all questions (e.g., "Q1: A, Q2: Custom - [details], Q3: B")
8. Update the spec by replacing each [NEEDS CLARIFICATION] marker with the user's selected or provided answer
9. Re-run validation after all clarifications are resolved
d. **Update Checklist**: After each validation iteration, update the checklist file with current pass/fail status
7. Report completion with branch name, spec file path, checklist results, and readiness for the next phase (`/speckit.clarify` or `/speckit.plan`).
**NOTE:** The script creates and checks out the new branch and initializes the spec file before writing.
## General Guidelines
- Focus on **WHAT** users need and **WHY**.
- Avoid HOW to implement (no tech stack, APIs, code structure).
- Written for business stakeholders, not developers.
- DO NOT create any checklists that are embedded in the spec. That will be a separate command.
### Section Requirements
- **Mandatory sections**: Must be completed for every feature
- **Optional sections**: Include only when relevant to the feature
- When a section doesn't apply, remove it entirely (don't leave as "N/A")
### For AI Generation
When creating this spec from a user prompt:
1. **Make informed guesses**: Use context, industry standards, and common patterns to fill gaps
2. **Document assumptions**: Record reasonable defaults in the Assumptions section
3. **Limit clarifications**: Maximum 3 [NEEDS CLARIFICATION] markers - use only for critical decisions that:
- Significantly impact feature scope or user experience
- Have multiple reasonable interpretations with different implications
- Lack any reasonable default
4. **Prioritize clarifications**: scope > security/privacy > user experience > technical details
5. **Think like a tester**: Every vague requirement should fail the "testable and unambiguous" checklist item
6. **Common areas needing clarification** (only if no reasonable default exists):
- Feature scope and boundaries (include/exclude specific use cases)
- User types and permissions (if multiple conflicting interpretations possible)
- Security/compliance requirements (when legally/financially significant)
**Examples of reasonable defaults** (don't ask about these):
- Data retention: Industry-standard practices for the domain
- Performance targets: Standard web/mobile app expectations unless specified
- Error handling: User-friendly messages with appropriate fallbacks
- Authentication method: Standard session-based or OAuth2 for web apps
- Integration patterns: Use project-appropriate patterns (REST/GraphQL for web services, function calls for libraries, CLI args for tools, etc.)
### Success Criteria Guidelines
Success criteria must be:
1. **Measurable**: Include specific metrics (time, percentage, count, rate)
2. **Technology-agnostic**: No mention of frameworks, languages, databases, or tools
3. **User-focused**: Describe outcomes from user/business perspective, not system internals
4. **Verifiable**: Can be tested/validated without knowing implementation details
**Good examples**:
- "Users can complete checkout in under 3 minutes"
- "System supports 10,000 concurrent users"
- "95% of searches return results in under 1 second"
- "Task completion rate improves by 40%"
**Bad examples** (implementation-focused):
- "API response time is under 200ms" (too technical, use "Users see results instantly")
- "Database can handle 1000 TPS" (implementation detail, use user-facing metric)
- "React components render efficiently" (framework-specific)
- "Redis cache hit rate above 80%" (technology-specific)

View File

@@ -1,137 +0,0 @@
---
description: Generate an actionable, dependency-ordered tasks.md for the feature based on available design artifacts.
handoffs:
- label: Analyze For Consistency
agent: speckit.analyze
prompt: Run a project analysis for consistency
send: true
- label: Implement Project
agent: speckit.implement
prompt: Start the implementation in phases
send: true
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Outline
1. **Setup**: Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax: e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
2. **Load design documents**: Read from FEATURE_DIR:
- **Required**: plan.md (tech stack, libraries, structure), spec.md (user stories with priorities)
- **Optional**: data-model.md (entities), contracts/ (interface contracts), research.md (decisions), quickstart.md (test scenarios)
- Note: Not all projects have all documents. Generate tasks based on what's available.
3. **Execute task generation workflow**:
- Load plan.md and extract tech stack, libraries, project structure
- Load spec.md and extract user stories with their priorities (P1, P2, P3, etc.)
- If data-model.md exists: Extract entities and map to user stories
- If contracts/ exists: Map interface contracts to user stories
- If research.md exists: Extract decisions for setup tasks
- Generate tasks organized by user story (see Task Generation Rules below)
- Generate dependency graph showing user story completion order
- Create parallel execution examples per user story
- Validate task completeness (each user story has all needed tasks, independently testable)
4. **Generate tasks.md**: Use `.specify/templates/tasks-template.md` as structure, fill with:
- Correct feature name from plan.md
- Phase 1: Setup tasks (project initialization)
- Phase 2: Foundational tasks (blocking prerequisites for all user stories)
- Phase 3+: One phase per user story (in priority order from spec.md)
- Each phase includes: story goal, independent test criteria, tests (if requested), implementation tasks
- Final Phase: Polish & cross-cutting concerns
- All tasks must follow the strict checklist format (see Task Generation Rules below)
- Clear file paths for each task
- Dependencies section showing story completion order
- Parallel execution examples per story
- Implementation strategy section (MVP first, incremental delivery)
5. **Report**: Output path to generated tasks.md and summary:
- Total task count
- Task count per user story
- Parallel opportunities identified
- Independent test criteria for each story
- Suggested MVP scope (typically just User Story 1)
- Format validation: Confirm ALL tasks follow the checklist format (checkbox, ID, labels, file paths)
Context for task generation: $ARGUMENTS
The tasks.md should be immediately executable - each task must be specific enough that an LLM can complete it without additional context.
## Task Generation Rules
**CRITICAL**: Tasks MUST be organized by user story to enable independent implementation and testing.
**Tests are OPTIONAL**: Only generate test tasks if explicitly requested in the feature specification or if user requests TDD approach.
### Checklist Format (REQUIRED)
Every task MUST strictly follow this format:
```text
- [ ] [TaskID] [P?] [Story?] Description with file path
```
**Format Components**:
1. **Checkbox**: ALWAYS start with `- [ ]` (markdown checkbox)
2. **Task ID**: Sequential number (T001, T002, T003...) in execution order
3. **[P] marker**: Include ONLY if task is parallelizable (different files, no dependencies on incomplete tasks)
4. **[Story] label**: REQUIRED for user story phase tasks only
- Format: [US1], [US2], [US3], etc. (maps to user stories from spec.md)
- Setup phase: NO story label
- Foundational phase: NO story label
- User Story phases: MUST have story label
- Polish phase: NO story label
5. **Description**: Clear action with exact file path
**Examples**:
- ✅ CORRECT: `- [ ] T001 Create project structure per implementation plan`
- ✅ CORRECT: `- [ ] T005 [P] Implement authentication middleware in src/middleware/auth.py`
- ✅ CORRECT: `- [ ] T012 [P] [US1] Create User model in src/models/user.py`
- ✅ CORRECT: `- [ ] T014 [US1] Implement UserService in src/services/user_service.py`
- ❌ WRONG: `- [ ] Create User model` (missing ID and Story label)
- ❌ WRONG: `T001 [US1] Create model` (missing checkbox)
- ❌ WRONG: `- [ ] [US1] Create User model` (missing Task ID)
- ❌ WRONG: `- [ ] T001 [US1] Create model` (missing file path)
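The format rules above are mechanical enough to check programmatically. The following is a minimal illustrative validator (a hypothetical helper, not part of the toolkit) showing how the checkbox, Task ID, and description requirements line up:

```rust
// Minimal sketch: validate a tasks.md checklist line against the
// required shape `- [ ] T### [P?] [US#?] Description ...`.
// Hypothetical helper for illustration only; not part of spec-kit.
fn is_valid_task_line(line: &str) -> bool {
    // 1. Must start with a markdown checkbox.
    let Some(rest) = line.strip_prefix("- [ ] ") else {
        return false;
    };
    // 2. Task ID: 'T' followed by digits (e.g. T001).
    let mut parts = rest.split_whitespace();
    let Some(id) = parts.next() else { return false };
    if !(id.starts_with('T') && id.len() >= 4 && id[1..].chars().all(|c| c.is_ascii_digit())) {
        return false;
    }
    // 3. Optional [P]/[US#] labels, then a non-empty description.
    parts.next().is_some()
}

fn main() {
    assert!(is_valid_task_line(
        "- [ ] T012 [P] [US1] Create User model in src/models/user.py"
    ));
    assert!(!is_valid_task_line("T001 [US1] Create model")); // missing checkbox
    assert!(!is_valid_task_line("- [ ] Create User model")); // missing Task ID
}
```

Note this sketch checks only the line's shape; it does not verify that a file path is present in the description.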
### Task Organization
1. **From User Stories (spec.md)** - PRIMARY ORGANIZATION:
- Each user story (P1, P2, P3...) gets its own phase
- Map all related components to their story:
- Models needed for that story
- Services needed for that story
- Interfaces/UI needed for that story
- If tests requested: Tests specific to that story
- Mark story dependencies (most stories should be independent)
2. **From Contracts**:
- Map each interface contract → to the user story it serves
- If tests requested: Each interface contract → contract test task [P] before implementation in that story's phase
3. **From Data Model**:
- Map each entity to the user story(ies) that need it
- If entity serves multiple stories: Put in earliest story or Setup phase
- Relationships → service layer tasks in appropriate story phase
4. **From Setup/Infrastructure**:
- Shared infrastructure → Setup phase (Phase 1)
- Foundational/blocking tasks → Foundational phase (Phase 2)
- Story-specific setup → within that story's phase
### Phase Structure
- **Phase 1**: Setup (project initialization)
- **Phase 2**: Foundational (blocking prerequisites - MUST complete before user stories)
- **Phase 3+**: User Stories in priority order (P1, P2, P3...)
- Within each story: Tests (if requested) → Models → Services → Endpoints → Integration
- Each phase should be a complete, independently testable increment
- **Final Phase**: Polish & Cross-Cutting Concerns

View File

@@ -1,30 +0,0 @@
---
description: Convert existing tasks into actionable, dependency-ordered GitHub issues for the feature based on available design artifacts.
tools: ['github/github-mcp-server/issue_write']
---
## User Input
```text
$ARGUMENTS
```
You **MUST** consider the user input before proceeding (if not empty).
## Outline
1. Run `.specify/scripts/powershell/check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks` from repo root and parse FEATURE_DIR and AVAILABLE_DOCS list. All paths must be absolute. For single quotes in args like "I'm Groot", use escape syntax, e.g. 'I'\''m Groot' (or double-quote if possible: "I'm Groot").
1. From the executed script, extract the path to **tasks**.
1. Get the Git remote by running:
```bash
git config --get remote.origin.url
```
> [!CAUTION]
> ONLY PROCEED TO NEXT STEPS IF THE REMOTE IS A GITHUB URL
1. For each task in the list, use the GitHub MCP server to create a new issue in the repository that the Git remote points to.
> [!CAUTION]
> UNDER NO CIRCUMSTANCES EVER CREATE ISSUES IN REPOSITORIES THAT DO NOT MATCH THE REMOTE URL

.gitignore vendored
View File

@@ -2,7 +2,6 @@
database/target
*.db
.env
/tmp
# Default ignored files
.idea/shelf/
@@ -12,4 +11,3 @@ database/target
.idea/dataSources.local.xml
# Editor-based HTTP Client requests
.idea/httpRequests/
/.claude/settings.local.json

.idea/image-api.iml generated
View File

@@ -3,7 +3,6 @@
<component name="NewModuleRootManager">
<content url="file://$MODULE_DIR$">
<sourceFolder url="file://$MODULE_DIR$/src" isTestSource="false" />
<excludeFolder url="file://$MODULE_DIR$/.idea/dataSources" />
<excludeFolder url="file://$MODULE_DIR$/target" />
</content>
<orderEntry type="inheritedJdk" />

View File

@@ -1,149 +0,0 @@
<!--
Sync Impact Report
==================
Version change: (new) -> 1.0.0
Modified principles: N/A (initial ratification)
Added sections:
- Core Principles (5 principles)
- Technology Stack & Constraints
- Development Workflow
- Governance
Removed sections: N/A
Templates requiring updates:
- .specify/templates/plan-template.md — ✅ no changes needed (Constitution Check section is generic)
- .specify/templates/spec-template.md — ✅ no changes needed
- .specify/templates/tasks-template.md — ✅ no changes needed
- .specify/templates/checklist-template.md — ✅ no changes needed
- .specify/templates/agent-file-template.md — ✅ no changes needed
Follow-up TODOs: None
-->
# ImageApi Constitution
## Core Principles
### I. Layered Architecture
All features MUST follow the established layered architecture:
- **HTTP Layer** (`main.rs`, feature modules): Route handlers, request
parsing, response formatting. No direct database access.
- **Service Layer** (`files.rs`, `exif.rs`, `memories.rs`, etc.): Business
logic. No HTTP-specific types.
- **DAO Layer** (`database/` trait definitions): Trait-based data access
contracts. Every DAO MUST be defined as a trait to enable mock
implementations for testing.
- **Database Layer** (Diesel ORM, `schema.rs`): Concrete `Sqlite*Dao`
implementations. All queries traced with OpenTelemetry.
New features MUST NOT bypass layers (e.g., HTTP handlers MUST NOT
execute raw SQL). Actix actors are permitted for long-running async
work (video processing, file watching) but MUST interact with the
DAO layer through the established trait interfaces.
### II. Path Safety (NON-NEGOTIABLE)
All user-supplied file paths MUST be validated against `BASE_PATH`
using `is_valid_full_path()` before any filesystem operation. This
prevents directory traversal attacks.
- Paths stored in the database MUST be relative to `BASE_PATH`.
- Paths passed to external tools (ffmpeg, image processing) MUST be
fully resolved absolute paths.
- Extension detection MUST use the centralized helpers in
`file_types.rs` (case-insensitive). Manual string matching on
extensions is prohibited.
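The core of this traversal defense can be sketched as a lexical check that a user-supplied relative path never climbs above the base directory. This is a simplified illustration of the idea behind `is_valid_full_path()`; the project's actual helper may differ:

```rust
use std::path::{Component, Path};

// Sketch: reject absolute paths and any sequence of components that
// would escape BASE_PATH (e.g. "photos/../../etc/passwd"). Purely
// lexical; the real helper may also canonicalize against BASE_PATH.
fn stays_under_base(user_path: &str) -> bool {
    let candidate = Path::new(user_path);
    if candidate.is_absolute() {
        return false;
    }
    // Track directory depth; a `..` at depth 0 escapes the base.
    let mut depth: i32 = 0;
    for comp in candidate.components() {
        match comp {
            Component::ParentDir => {
                depth -= 1;
                if depth < 0 {
                    return false;
                }
            }
            Component::Normal(_) => depth += 1,
            _ => {}
        }
    }
    true
}

fn main() {
    assert!(stays_under_base("2021/10/photo.jpg"));
    assert!(stays_under_base("a/b/../c.jpg")); // resolves inside the base
    assert!(!stays_under_base("../etc/passwd"));
    assert!(!stays_under_base("/etc/passwd"));
}
```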
### III. Trait-Based Testability
All data access MUST go through trait-based DAOs so that every
handler and service can be tested with mock implementations.
- Each DAO trait MUST be defined in `src/database/` and require
`Sync + Send`.
- Mock DAOs for testing MUST live in `src/testhelpers.rs`.
- Integration tests against real SQLite MUST use in-memory databases
via `in_memory_db_connection()` from `database::test`.
- Handler tests MUST use `actix_web::test` utilities with JWT token
injection (using `Claims::valid_user()` and the `test_key` secret).
- New DAO implementations MUST include a `#[cfg(test)]` constructor
(e.g., `from_connection`) accepting an injected connection.
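The pattern this principle mandates can be sketched as follows; the trait and type names here (`TagDao`, `MockTagDao`) are illustrative, not the project's actual definitions:

```rust
// Sketch of the trait-based DAO pattern: services depend on the trait,
// so tests can inject a mock instead of a real SQLite-backed DAO.
trait TagDao: Send + Sync {
    fn tag_names_for_image(&self, image_id: i32) -> Vec<String>;
}

// A mock implementation of the kind that would live in testhelpers.
struct MockTagDao;

impl TagDao for MockTagDao {
    fn tag_names_for_image(&self, _image_id: i32) -> Vec<String> {
        vec!["vacation".to_string(), "beach".to_string()]
    }
}

// A service function written against the trait works unchanged with
// either the real DAO or the mock.
fn describe_image(dao: &dyn TagDao, image_id: i32) -> String {
    dao.tag_names_for_image(image_id).join(", ")
}

fn main() {
    let dao = MockTagDao;
    assert_eq!(describe_image(&dao, 1), "vacation, beach");
}
```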
### IV. Environment-Driven Configuration
Server behavior MUST be controlled through environment variables
loaded from `.env` files. Hard-coded paths, URLs, or secrets are
prohibited.
- Required variables MUST call `.expect()` with a clear message at
startup so misconfiguration fails fast.
- Optional variables MUST use `.unwrap_or_else()` with sensible
defaults and be documented in `README.md`.
- Any new environment variable MUST be added to the README
environment section before the feature is considered complete.
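A sketch of the required/optional variable pattern described above. The lookup is injected as a closure so the pattern is easy to demonstrate in isolation; in the server itself `std::env::var` would supply the values. The variable names are illustrative:

```rust
// Required variables fail fast with a clear message; optional ones
// fall back to a sensible, documented default.
fn load_config<F>(get: F) -> (String, String)
where
    F: Fn(&str) -> Option<String>,
{
    // Required: misconfiguration aborts startup with a clear message.
    let base_path = get("BASE_PATH")
        .expect("BASE_PATH must be set (root directory for image storage)");
    // Optional: default documented in README.md.
    let bind_addr = get("BIND_ADDR").unwrap_or_else(|| "127.0.0.1:8080".to_string());
    (base_path, bind_addr)
}

fn main() {
    // Simulated environment for the example.
    let fake_env = |key: &str| match key {
        "BASE_PATH" => Some("/srv/images".to_string()),
        _ => None,
    };
    let (base, addr) = load_config(fake_env);
    assert_eq!(base, "/srv/images");
    assert_eq!(addr, "127.0.0.1:8080");
}
```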
### V. Observability
All database operations and HTTP handlers MUST be instrumented
with OpenTelemetry spans via the `trace_db_call` helper or
equivalent tracing macros.
- Release builds export traces to the configured OTLP endpoint.
- Debug builds use the basic logger.
- Prometheus metrics (`imageserver_image_total`,
`imageserver_video_total`) MUST be maintained for key counters.
- Errors MUST be logged at `error!` level with sufficient context
for debugging without reproducing the issue.
## Technology Stack & Constraints
- **Language**: Rust (stable toolchain, Cargo build system)
- **HTTP Framework**: Actix-web 4
- **ORM**: Diesel 2.2 with SQLite backend
- **Auth**: JWT (HS256) via `jsonwebtoken` crate, bcrypt password
hashing
- **Video Processing**: ffmpeg/ffprobe (CLI, must be on PATH)
- **Image Processing**: `image` crate for thumbnails, `kamadak-exif`
for EXIF extraction
- **Tracing**: OpenTelemetry with OTLP export (release),
basic logger (debug)
- **Testing**: `cargo test`, `actix_web::test`, in-memory SQLite
External dependencies (ffmpeg, Ollama) are optional runtime
requirements. The server MUST start and serve core functionality
(images, thumbnails, tags) without them. Features that depend on
optional services MUST degrade gracefully with logged warnings,
not panics.
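The graceful-degradation requirement can be sketched as a startup probe for the optional tool; the probe below is an assumption about how such a check might look, not the server's actual code:

```rust
use std::process::Command;

// Probe for an optional external tool. If it is missing, log a warning
// and continue serving core functionality instead of panicking.
fn ffprobe_available() -> bool {
    Command::new("ffprobe").arg("-version").output().is_ok()
}

fn main() {
    if ffprobe_available() {
        println!("video features enabled");
    } else {
        // Degrade gracefully: images, thumbnails, and tags still work.
        eprintln!("warning: ffprobe not found on PATH; video features disabled");
    }
}
```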
## Development Workflow
- `cargo fmt` MUST pass before committing.
- `cargo clippy` warnings MUST be resolved or explicitly suppressed
with a justification comment.
- `cargo test` MUST pass with all tests green before merging to
master.
- Database schema changes MUST use Diesel migrations
(`diesel migration generate`), with hand-written SQL in `up.sql`
and `down.sql`, followed by `diesel print-schema` to regenerate
`schema.rs`.
- Features MUST be developed on named branches
(`###-feature-name`) and merged to master via pull request.
- File uploads MUST preserve existing files (append timestamp on
conflict, never overwrite).
## Governance
This constitution defines the non-negotiable architectural and
development standards for the ImageApi project. All code changes
MUST comply with these principles.
- **Amendments**: Any change to this constitution MUST be documented
with a version bump, rationale, and updated Sync Impact Report.
- **Versioning**: MAJOR for principle removals/redefinitions, MINOR
for new principles or material expansions, PATCH for wording
clarifications.
- **Compliance**: Pull request reviews SHOULD verify adherence to
these principles. The CLAUDE.md file provides runtime development
guidance and MUST remain consistent with this constitution.
**Version**: 1.0.0 | **Ratified**: 2026-02-26 | **Last Amended**: 2026-02-26

View File

@@ -1,148 +0,0 @@
#!/usr/bin/env pwsh
# Consolidated prerequisite checking script (PowerShell)
#
# This script provides unified prerequisite checking for Spec-Driven Development workflow.
# It replaces the functionality previously spread across multiple scripts.
#
# Usage: ./check-prerequisites.ps1 [OPTIONS]
#
# OPTIONS:
# -Json Output in JSON format
# -RequireTasks Require tasks.md to exist (for implementation phase)
# -IncludeTasks Include tasks.md in AVAILABLE_DOCS list
# -PathsOnly Only output path variables (no validation)
# -Help, -h Show help message
[CmdletBinding()]
param(
[switch]$Json,
[switch]$RequireTasks,
[switch]$IncludeTasks,
[switch]$PathsOnly,
[switch]$Help
)
$ErrorActionPreference = 'Stop'
# Show help if requested
if ($Help) {
Write-Output @"
Usage: check-prerequisites.ps1 [OPTIONS]
Consolidated prerequisite checking for Spec-Driven Development workflow.
OPTIONS:
-Json Output in JSON format
-RequireTasks Require tasks.md to exist (for implementation phase)
-IncludeTasks Include tasks.md in AVAILABLE_DOCS list
-PathsOnly Only output path variables (no prerequisite validation)
-Help, -h Show this help message
EXAMPLES:
# Check task prerequisites (plan.md required)
.\check-prerequisites.ps1 -Json
# Check implementation prerequisites (plan.md + tasks.md required)
.\check-prerequisites.ps1 -Json -RequireTasks -IncludeTasks
# Get feature paths only (no validation)
.\check-prerequisites.ps1 -PathsOnly
"@
exit 0
}
# Source common functions
. "$PSScriptRoot/common.ps1"
# Get feature paths and validate branch
$paths = Get-FeaturePathsEnv
if (-not (Test-FeatureBranch -Branch $paths.CURRENT_BRANCH -HasGit:$paths.HAS_GIT)) {
exit 1
}
# If paths-only mode, output paths and exit (support combined -Json -PathsOnly)
if ($PathsOnly) {
if ($Json) {
[PSCustomObject]@{
REPO_ROOT = $paths.REPO_ROOT
BRANCH = $paths.CURRENT_BRANCH
FEATURE_DIR = $paths.FEATURE_DIR
FEATURE_SPEC = $paths.FEATURE_SPEC
IMPL_PLAN = $paths.IMPL_PLAN
TASKS = $paths.TASKS
} | ConvertTo-Json -Compress
} else {
Write-Output "REPO_ROOT: $($paths.REPO_ROOT)"
Write-Output "BRANCH: $($paths.CURRENT_BRANCH)"
Write-Output "FEATURE_DIR: $($paths.FEATURE_DIR)"
Write-Output "FEATURE_SPEC: $($paths.FEATURE_SPEC)"
Write-Output "IMPL_PLAN: $($paths.IMPL_PLAN)"
Write-Output "TASKS: $($paths.TASKS)"
}
exit 0
}
# Validate required directories and files
if (-not (Test-Path $paths.FEATURE_DIR -PathType Container)) {
Write-Output "ERROR: Feature directory not found: $($paths.FEATURE_DIR)"
Write-Output "Run /speckit.specify first to create the feature structure."
exit 1
}
if (-not (Test-Path $paths.IMPL_PLAN -PathType Leaf)) {
Write-Output "ERROR: plan.md not found in $($paths.FEATURE_DIR)"
Write-Output "Run /speckit.plan first to create the implementation plan."
exit 1
}
# Check for tasks.md if required
if ($RequireTasks -and -not (Test-Path $paths.TASKS -PathType Leaf)) {
Write-Output "ERROR: tasks.md not found in $($paths.FEATURE_DIR)"
Write-Output "Run /speckit.tasks first to create the task list."
exit 1
}
# Build list of available documents
$docs = @()
# Always check these optional docs
if (Test-Path $paths.RESEARCH) { $docs += 'research.md' }
if (Test-Path $paths.DATA_MODEL) { $docs += 'data-model.md' }
# Check contracts directory (only if it exists and has files)
if ((Test-Path $paths.CONTRACTS_DIR) -and (Get-ChildItem -Path $paths.CONTRACTS_DIR -ErrorAction SilentlyContinue | Select-Object -First 1)) {
$docs += 'contracts/'
}
if (Test-Path $paths.QUICKSTART) { $docs += 'quickstart.md' }
# Include tasks.md if requested and it exists
if ($IncludeTasks -and (Test-Path $paths.TASKS)) {
$docs += 'tasks.md'
}
# Output results
if ($Json) {
# JSON output
[PSCustomObject]@{
FEATURE_DIR = $paths.FEATURE_DIR
AVAILABLE_DOCS = $docs
} | ConvertTo-Json -Compress
} else {
# Text output
Write-Output "FEATURE_DIR:$($paths.FEATURE_DIR)"
Write-Output "AVAILABLE_DOCS:"
# Show status of each potential document
Test-FileExists -Path $paths.RESEARCH -Description 'research.md' | Out-Null
Test-FileExists -Path $paths.DATA_MODEL -Description 'data-model.md' | Out-Null
Test-DirHasFiles -Path $paths.CONTRACTS_DIR -Description 'contracts/' | Out-Null
Test-FileExists -Path $paths.QUICKSTART -Description 'quickstart.md' | Out-Null
if ($IncludeTasks) {
Test-FileExists -Path $paths.TASKS -Description 'tasks.md' | Out-Null
}
}

View File

@@ -1,137 +0,0 @@
#!/usr/bin/env pwsh
# Common PowerShell functions analogous to common.sh
function Get-RepoRoot {
try {
$result = git rev-parse --show-toplevel 2>$null
if ($LASTEXITCODE -eq 0) {
return $result
}
} catch {
# Git command failed
}
# Fall back to script location for non-git repos
return (Resolve-Path (Join-Path $PSScriptRoot "../../..")).Path
}
function Get-CurrentBranch {
# First check if SPECIFY_FEATURE environment variable is set
if ($env:SPECIFY_FEATURE) {
return $env:SPECIFY_FEATURE
}
# Then check git if available
try {
$result = git rev-parse --abbrev-ref HEAD 2>$null
if ($LASTEXITCODE -eq 0) {
return $result
}
} catch {
# Git command failed
}
# For non-git repos, try to find the latest feature directory
$repoRoot = Get-RepoRoot
$specsDir = Join-Path $repoRoot "specs"
if (Test-Path $specsDir) {
$latestFeature = ""
$highest = 0
Get-ChildItem -Path $specsDir -Directory | ForEach-Object {
if ($_.Name -match '^(\d{3})-') {
$num = [int]$matches[1]
if ($num -gt $highest) {
$highest = $num
$latestFeature = $_.Name
}
}
}
if ($latestFeature) {
return $latestFeature
}
}
# Final fallback
return "main"
}
function Test-HasGit {
try {
git rev-parse --show-toplevel 2>$null | Out-Null
return ($LASTEXITCODE -eq 0)
} catch {
return $false
}
}
function Test-FeatureBranch {
param(
[string]$Branch,
[bool]$HasGit = $true
)
# For non-git repos, we can't enforce branch naming but still provide output
if (-not $HasGit) {
Write-Warning "[specify] Warning: Git repository not detected; skipped branch validation"
return $true
}
if ($Branch -notmatch '^[0-9]{3}-') {
Write-Output "ERROR: Not on a feature branch. Current branch: $Branch"
Write-Output "Feature branches should be named like: 001-feature-name"
return $false
}
return $true
}
function Get-FeatureDir {
param([string]$RepoRoot, [string]$Branch)
Join-Path $RepoRoot "specs/$Branch"
}
function Get-FeaturePathsEnv {
$repoRoot = Get-RepoRoot
$currentBranch = Get-CurrentBranch
$hasGit = Test-HasGit
$featureDir = Get-FeatureDir -RepoRoot $repoRoot -Branch $currentBranch
[PSCustomObject]@{
REPO_ROOT = $repoRoot
CURRENT_BRANCH = $currentBranch
HAS_GIT = $hasGit
FEATURE_DIR = $featureDir
FEATURE_SPEC = Join-Path $featureDir 'spec.md'
IMPL_PLAN = Join-Path $featureDir 'plan.md'
TASKS = Join-Path $featureDir 'tasks.md'
RESEARCH = Join-Path $featureDir 'research.md'
DATA_MODEL = Join-Path $featureDir 'data-model.md'
QUICKSTART = Join-Path $featureDir 'quickstart.md'
CONTRACTS_DIR = Join-Path $featureDir 'contracts'
}
}
function Test-FileExists {
param([string]$Path, [string]$Description)
if (Test-Path -Path $Path -PathType Leaf) {
Write-Output "✓ $Description"
return $true
} else {
Write-Output "✗ $Description"
return $false
}
}
function Test-DirHasFiles {
param([string]$Path, [string]$Description)
if ((Test-Path -Path $Path -PathType Container) -and (Get-ChildItem -Path $Path -ErrorAction SilentlyContinue | Where-Object { -not $_.PSIsContainer } | Select-Object -First 1)) {
Write-Output "✓ $Description"
return $true
} else {
Write-Output "✗ $Description"
return $false
}
}

View File

@@ -1,283 +0,0 @@
#!/usr/bin/env pwsh
# Create a new feature
[CmdletBinding()]
param(
[switch]$Json,
[string]$ShortName,
[int]$Number = 0,
[switch]$Help,
[Parameter(ValueFromRemainingArguments = $true)]
[string[]]$FeatureDescription
)
$ErrorActionPreference = 'Stop'
# Show help if requested
if ($Help) {
Write-Host "Usage: ./create-new-feature.ps1 [-Json] [-ShortName <name>] [-Number N] <feature description>"
Write-Host ""
Write-Host "Options:"
Write-Host " -Json Output in JSON format"
Write-Host " -ShortName <name> Provide a custom short name (2-4 words) for the branch"
Write-Host " -Number N Specify branch number manually (overrides auto-detection)"
Write-Host " -Help Show this help message"
Write-Host ""
Write-Host "Examples:"
Write-Host " ./create-new-feature.ps1 'Add user authentication system' -ShortName 'user-auth'"
Write-Host " ./create-new-feature.ps1 'Implement OAuth2 integration for API'"
exit 0
}
# Check if feature description provided
if (-not $FeatureDescription -or $FeatureDescription.Count -eq 0) {
Write-Error "Usage: ./create-new-feature.ps1 [-Json] [-ShortName <name>] <feature description>"
exit 1
}
$featureDesc = ($FeatureDescription -join ' ').Trim()
# Resolve repository root. Prefer git information when available, but fall back
# to searching for repository markers so the workflow still functions in repositories that
# were initialized with --no-git.
function Find-RepositoryRoot {
param(
[string]$StartDir,
[string[]]$Markers = @('.git', '.specify')
)
$current = Resolve-Path $StartDir
while ($true) {
foreach ($marker in $Markers) {
if (Test-Path (Join-Path $current $marker)) {
return $current
}
}
$parent = Split-Path $current -Parent
if ($parent -eq $current) {
# Reached filesystem root without finding markers
return $null
}
$current = $parent
}
}
function Get-HighestNumberFromSpecs {
param([string]$SpecsDir)
$highest = 0
if (Test-Path $SpecsDir) {
Get-ChildItem -Path $SpecsDir -Directory | ForEach-Object {
if ($_.Name -match '^(\d+)') {
$num = [int]$matches[1]
if ($num -gt $highest) { $highest = $num }
}
}
}
return $highest
}
function Get-HighestNumberFromBranches {
param()
$highest = 0
try {
$branches = git branch -a 2>$null
if ($LASTEXITCODE -eq 0) {
foreach ($branch in $branches) {
# Clean branch name: remove leading markers and remote prefixes
$cleanBranch = $branch.Trim() -replace '^\*?\s+', '' -replace '^remotes/[^/]+/', ''
# Extract feature number if branch matches pattern ###-*
if ($cleanBranch -match '^(\d+)-') {
$num = [int]$matches[1]
if ($num -gt $highest) { $highest = $num }
}
}
}
} catch {
# If git command fails, return 0
Write-Verbose "Could not check Git branches: $_"
}
return $highest
}
function Get-NextBranchNumber {
param(
[string]$SpecsDir
)
# Fetch all remotes to get latest branch info (suppress errors if no remotes)
try {
git fetch --all --prune 2>$null | Out-Null
} catch {
# Ignore fetch errors
}
# Get highest number from ALL branches (not just matching short name)
$highestBranch = Get-HighestNumberFromBranches
# Get highest number from ALL specs (not just matching short name)
$highestSpec = Get-HighestNumberFromSpecs -SpecsDir $SpecsDir
# Take the maximum of both
$maxNum = [Math]::Max($highestBranch, $highestSpec)
# Return next number
return $maxNum + 1
}
function ConvertTo-CleanBranchName {
param([string]$Name)
return $Name.ToLower() -replace '[^a-z0-9]', '-' -replace '-{2,}', '-' -replace '^-', '' -replace '-$', ''
}
$fallbackRoot = (Find-RepositoryRoot -StartDir $PSScriptRoot)
if (-not $fallbackRoot) {
Write-Error "Error: Could not determine repository root. Please run this script from within the repository."
exit 1
}
try {
$repoRoot = git rev-parse --show-toplevel 2>$null
if ($LASTEXITCODE -eq 0) {
$hasGit = $true
} else {
throw "Git not available"
}
} catch {
$repoRoot = $fallbackRoot
$hasGit = $false
}
Set-Location $repoRoot
$specsDir = Join-Path $repoRoot 'specs'
New-Item -ItemType Directory -Path $specsDir -Force | Out-Null
# Function to generate branch name with stop word filtering and length filtering
function Get-BranchName {
param([string]$Description)
# Common stop words to filter out
$stopWords = @(
'i', 'a', 'an', 'the', 'to', 'for', 'of', 'in', 'on', 'at', 'by', 'with', 'from',
'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had',
'do', 'does', 'did', 'will', 'would', 'should', 'could', 'can', 'may', 'might', 'must', 'shall',
'this', 'that', 'these', 'those', 'my', 'your', 'our', 'their',
'want', 'need', 'add', 'get', 'set'
)
# Convert to lowercase and extract words (alphanumeric only)
$cleanName = $Description.ToLower() -replace '[^a-z0-9\s]', ' '
$words = $cleanName -split '\s+' | Where-Object { $_ }
# Filter words: remove stop words and words shorter than 3 chars (unless they're uppercase acronyms in original)
$meaningfulWords = @()
foreach ($word in $words) {
# Skip stop words
if ($stopWords -contains $word) { continue }
# Keep words that are length >= 3 OR appear as uppercase in original (likely acronyms)
if ($word.Length -ge 3) {
$meaningfulWords += $word
} elseif ($Description -cmatch "\b$($word.ToUpper())\b") {
# Keep short words if they appear as uppercase in original (likely acronyms)
$meaningfulWords += $word
}
}
# If we have meaningful words, use first 3-4 of them
if ($meaningfulWords.Count -gt 0) {
$maxWords = if ($meaningfulWords.Count -eq 4) { 4 } else { 3 }
$result = ($meaningfulWords | Select-Object -First $maxWords) -join '-'
return $result
} else {
# Fallback to original logic if no meaningful words found
$result = ConvertTo-CleanBranchName -Name $Description
$fallbackWords = ($result -split '-') | Where-Object { $_ } | Select-Object -First 3
return [string]::Join('-', $fallbackWords)
}
}
# Generate branch name
if ($ShortName) {
# Use provided short name, just clean it up
$branchSuffix = ConvertTo-CleanBranchName -Name $ShortName
} else {
# Generate from description with smart filtering
$branchSuffix = Get-BranchName -Description $featureDesc
}
# Determine branch number
if ($Number -eq 0) {
if ($hasGit) {
# Check existing branches on remotes
$Number = Get-NextBranchNumber -SpecsDir $specsDir
} else {
# Fall back to local directory check
$Number = (Get-HighestNumberFromSpecs -SpecsDir $specsDir) + 1
}
}
$featureNum = ('{0:000}' -f $Number)
$branchName = "$featureNum-$branchSuffix"
# GitHub enforces a 244-byte limit on branch names
# Validate and truncate if necessary
$maxBranchLength = 244
if ($branchName.Length -gt $maxBranchLength) {
# Calculate how much we need to trim from suffix
# Account for: feature number (3) + hyphen (1) = 4 chars
$maxSuffixLength = $maxBranchLength - 4
# Truncate suffix
$truncatedSuffix = $branchSuffix.Substring(0, [Math]::Min($branchSuffix.Length, $maxSuffixLength))
# Remove trailing hyphen if truncation created one
$truncatedSuffix = $truncatedSuffix -replace '-$', ''
$originalBranchName = $branchName
$branchName = "$featureNum-$truncatedSuffix"
Write-Warning "[specify] Branch name exceeded GitHub's 244-byte limit"
Write-Warning "[specify] Original: $originalBranchName ($($originalBranchName.Length) bytes)"
Write-Warning "[specify] Truncated to: $branchName ($($branchName.Length) bytes)"
}
if ($hasGit) {
try {
git checkout -b $branchName | Out-Null
} catch {
Write-Warning "Failed to create git branch: $branchName"
}
} else {
Write-Warning "[specify] Warning: Git repository not detected; skipped branch creation for $branchName"
}
$featureDir = Join-Path $specsDir $branchName
New-Item -ItemType Directory -Path $featureDir -Force | Out-Null
$template = Join-Path $repoRoot '.specify/templates/spec-template.md'
$specFile = Join-Path $featureDir 'spec.md'
if (Test-Path $template) {
Copy-Item $template $specFile -Force
} else {
New-Item -ItemType File -Path $specFile | Out-Null
}
# Set the SPECIFY_FEATURE environment variable for the current session
$env:SPECIFY_FEATURE = $branchName
if ($Json) {
$obj = [PSCustomObject]@{
BRANCH_NAME = $branchName
SPEC_FILE = $specFile
FEATURE_NUM = $featureNum
HAS_GIT = $hasGit
}
$obj | ConvertTo-Json -Compress
} else {
Write-Output "BRANCH_NAME: $branchName"
Write-Output "SPEC_FILE: $specFile"
Write-Output "FEATURE_NUM: $featureNum"
Write-Output "HAS_GIT: $hasGit"
Write-Output "SPECIFY_FEATURE environment variable set to: $branchName"
}

View File

@@ -1,61 +0,0 @@
#!/usr/bin/env pwsh
# Setup implementation plan for a feature
[CmdletBinding()]
param(
[switch]$Json,
[switch]$Help
)
$ErrorActionPreference = 'Stop'
# Show help if requested
if ($Help) {
Write-Output "Usage: ./setup-plan.ps1 [-Json] [-Help]"
Write-Output " -Json Output results in JSON format"
Write-Output " -Help Show this help message"
exit 0
}
# Load common functions
. "$PSScriptRoot/common.ps1"
# Get all paths and variables from common functions
$paths = Get-FeaturePathsEnv
# Check if we're on a proper feature branch (only for git repos)
if (-not (Test-FeatureBranch -Branch $paths.CURRENT_BRANCH -HasGit $paths.HAS_GIT)) {
exit 1
}
# Ensure the feature directory exists
New-Item -ItemType Directory -Path $paths.FEATURE_DIR -Force | Out-Null
# Copy plan template if it exists, otherwise note it or create empty file
$template = Join-Path $paths.REPO_ROOT '.specify/templates/plan-template.md'
if (Test-Path $template) {
Copy-Item $template $paths.IMPL_PLAN -Force
Write-Output "Copied plan template to $($paths.IMPL_PLAN)"
} else {
Write-Warning "Plan template not found at $template"
# Create a basic plan file if template doesn't exist
New-Item -ItemType File -Path $paths.IMPL_PLAN -Force | Out-Null
}
# Output results
if ($Json) {
$result = [PSCustomObject]@{
FEATURE_SPEC = $paths.FEATURE_SPEC
IMPL_PLAN = $paths.IMPL_PLAN
SPECS_DIR = $paths.FEATURE_DIR
BRANCH = $paths.CURRENT_BRANCH
HAS_GIT = $paths.HAS_GIT
}
$result | ConvertTo-Json -Compress
} else {
Write-Output "FEATURE_SPEC: $($paths.FEATURE_SPEC)"
Write-Output "IMPL_PLAN: $($paths.IMPL_PLAN)"
Write-Output "SPECS_DIR: $($paths.FEATURE_DIR)"
Write-Output "BRANCH: $($paths.CURRENT_BRANCH)"
Write-Output "HAS_GIT: $($paths.HAS_GIT)"
}

View File

@@ -1,452 +0,0 @@
#!/usr/bin/env pwsh
<#!
.SYNOPSIS
Update agent context files with information from plan.md (PowerShell version)
.DESCRIPTION
Mirrors the behavior of scripts/bash/update-agent-context.sh:
1. Environment Validation
2. Plan Data Extraction
3. Agent File Management (create from template or update existing)
4. Content Generation (technology stack, recent changes, timestamp)
5. Multi-Agent Support (claude, gemini, copilot, cursor-agent, qwen, opencode, codex, windsurf, kilocode, auggie, roo, codebuddy, amp, shai, q, agy, bob, qodercli)
.PARAMETER AgentType
Optional agent key to update a single agent. If omitted, updates all existing agent files (creating a default Claude file if none exist).
.EXAMPLE
./update-agent-context.ps1 -AgentType claude
.EXAMPLE
./update-agent-context.ps1 # Updates all existing agent files
.NOTES
Relies on common helper functions in common.ps1
#>
param(
[Parameter(Position=0)]
[ValidateSet('claude','gemini','copilot','cursor-agent','qwen','opencode','codex','windsurf','kilocode','auggie','roo','codebuddy','amp','shai','q','agy','bob','qodercli','generic')]
[string]$AgentType
)
$ErrorActionPreference = 'Stop'
# Import common helpers
$ScriptDir = Split-Path -Parent $MyInvocation.MyCommand.Path
. (Join-Path $ScriptDir 'common.ps1')
# Acquire environment paths
$envData = Get-FeaturePathsEnv
$REPO_ROOT = $envData.REPO_ROOT
$CURRENT_BRANCH = $envData.CURRENT_BRANCH
$HAS_GIT = $envData.HAS_GIT
$IMPL_PLAN = $envData.IMPL_PLAN
$NEW_PLAN = $IMPL_PLAN
# Agent file paths
$CLAUDE_FILE = Join-Path $REPO_ROOT 'CLAUDE.md'
$GEMINI_FILE = Join-Path $REPO_ROOT 'GEMINI.md'
$COPILOT_FILE = Join-Path $REPO_ROOT '.github/agents/copilot-instructions.md'
$CURSOR_FILE = Join-Path $REPO_ROOT '.cursor/rules/specify-rules.mdc'
$QWEN_FILE = Join-Path $REPO_ROOT 'QWEN.md'
$AGENTS_FILE = Join-Path $REPO_ROOT 'AGENTS.md'
$WINDSURF_FILE = Join-Path $REPO_ROOT '.windsurf/rules/specify-rules.md'
$KILOCODE_FILE = Join-Path $REPO_ROOT '.kilocode/rules/specify-rules.md'
$AUGGIE_FILE = Join-Path $REPO_ROOT '.augment/rules/specify-rules.md'
$ROO_FILE = Join-Path $REPO_ROOT '.roo/rules/specify-rules.md'
$CODEBUDDY_FILE = Join-Path $REPO_ROOT 'CODEBUDDY.md'
$QODER_FILE = Join-Path $REPO_ROOT 'QODER.md'
$AMP_FILE = Join-Path $REPO_ROOT 'AGENTS.md'
$SHAI_FILE = Join-Path $REPO_ROOT 'SHAI.md'
$Q_FILE = Join-Path $REPO_ROOT 'AGENTS.md'
$AGY_FILE = Join-Path $REPO_ROOT '.agent/rules/specify-rules.md'
$BOB_FILE = Join-Path $REPO_ROOT 'AGENTS.md'
$TEMPLATE_FILE = Join-Path $REPO_ROOT '.specify/templates/agent-file-template.md'
# Parsed plan data placeholders
$script:NEW_LANG = ''
$script:NEW_FRAMEWORK = ''
$script:NEW_DB = ''
$script:NEW_PROJECT_TYPE = ''
function Write-Info {
param(
[Parameter(Mandatory=$true)]
[string]$Message
)
Write-Host "INFO: $Message"
}
function Write-Success {
param(
[Parameter(Mandatory=$true)]
[string]$Message
)
Write-Host "$([char]0x2713) $Message"
}
function Write-WarningMsg {
param(
[Parameter(Mandatory=$true)]
[string]$Message
)
Write-Warning $Message
}
function Write-Err {
param(
[Parameter(Mandatory=$true)]
[string]$Message
)
Write-Host "ERROR: $Message" -ForegroundColor Red
}
function Validate-Environment {
if (-not $CURRENT_BRANCH) {
Write-Err 'Unable to determine current feature'
if ($HAS_GIT) { Write-Info "Make sure you're on a feature branch" } else { Write-Info 'Set SPECIFY_FEATURE environment variable or create a feature first' }
exit 1
}
if (-not (Test-Path $NEW_PLAN)) {
Write-Err "No plan.md found at $NEW_PLAN"
Write-Info 'Ensure you are working on a feature with a corresponding spec directory'
if (-not $HAS_GIT) { Write-Info 'Use: $env:SPECIFY_FEATURE=your-feature-name or create a new feature first' }
exit 1
}
if (-not (Test-Path $TEMPLATE_FILE)) {
Write-Err "Template file not found at $TEMPLATE_FILE"
Write-Info 'Run specify init to scaffold .specify/templates, or add agent-file-template.md there.'
exit 1
}
}
function Extract-PlanField {
param(
[Parameter(Mandatory=$true)]
[string]$FieldPattern,
[Parameter(Mandatory=$true)]
[string]$PlanFile
)
if (-not (Test-Path $PlanFile)) { return '' }
# Lines like **Language/Version**: Python 3.12
$regex = "^\*\*$([Regex]::Escape($FieldPattern))\*\*: (.+)$"
Get-Content -LiteralPath $PlanFile -Encoding utf8 | ForEach-Object {
if ($_ -match $regex) {
$val = $Matches[1].Trim()
if ($val -notin @('NEEDS CLARIFICATION','N/A')) { return $val }
}
} | Select-Object -First 1
}
function Parse-PlanData {
param(
[Parameter(Mandatory=$true)]
[string]$PlanFile
)
if (-not (Test-Path $PlanFile)) { Write-Err "Plan file not found: $PlanFile"; return $false }
Write-Info "Parsing plan data from $PlanFile"
$script:NEW_LANG = Extract-PlanField -FieldPattern 'Language/Version' -PlanFile $PlanFile
$script:NEW_FRAMEWORK = Extract-PlanField -FieldPattern 'Primary Dependencies' -PlanFile $PlanFile
$script:NEW_DB = Extract-PlanField -FieldPattern 'Storage' -PlanFile $PlanFile
$script:NEW_PROJECT_TYPE = Extract-PlanField -FieldPattern 'Project Type' -PlanFile $PlanFile
if ($NEW_LANG) { Write-Info "Found language: $NEW_LANG" } else { Write-WarningMsg 'No language information found in plan' }
if ($NEW_FRAMEWORK) { Write-Info "Found framework: $NEW_FRAMEWORK" }
if ($NEW_DB -and $NEW_DB -ne 'N/A') { Write-Info "Found database: $NEW_DB" }
if ($NEW_PROJECT_TYPE) { Write-Info "Found project type: $NEW_PROJECT_TYPE" }
return $true
}
function Format-TechnologyStack {
param(
[Parameter(Mandatory=$false)]
[string]$Lang,
[Parameter(Mandatory=$false)]
[string]$Framework
)
$parts = @()
if ($Lang -and $Lang -ne 'NEEDS CLARIFICATION') { $parts += $Lang }
if ($Framework -and $Framework -notin @('NEEDS CLARIFICATION','N/A')) { $parts += $Framework }
if (-not $parts) { return '' }
return ($parts -join ' + ')
}
function Get-ProjectStructure {
param(
[Parameter(Mandatory=$false)]
[string]$ProjectType
)
if ($ProjectType -match 'web') { return "backend/`nfrontend/`ntests/" } else { return "src/`ntests/" }
}
function Get-CommandsForLanguage {
param(
[Parameter(Mandatory=$false)]
[string]$Lang
)
switch -Regex ($Lang) {
'Python' { return "cd src; pytest; ruff check ." }
'Rust' { return "cargo test; cargo clippy" }
'JavaScript|TypeScript' { return "npm test; npm run lint" }
default { if ($Lang) { return "# Add commands for $Lang" } else { return '# Add commands for your technology stack' } }
}
}
function Get-LanguageConventions {
param(
[Parameter(Mandatory=$false)]
[string]$Lang
)
if ($Lang) { return "${Lang}: Follow standard conventions" } else { return 'General: Follow standard conventions' }
}
function New-AgentFile {
param(
[Parameter(Mandatory=$true)]
[string]$TargetFile,
[Parameter(Mandatory=$true)]
[string]$ProjectName,
[Parameter(Mandatory=$true)]
[datetime]$Date
)
if (-not (Test-Path $TEMPLATE_FILE)) { Write-Err "Template not found at $TEMPLATE_FILE"; return $false }
$temp = New-TemporaryFile
Copy-Item -LiteralPath $TEMPLATE_FILE -Destination $temp -Force
$projectStructure = Get-ProjectStructure -ProjectType $NEW_PROJECT_TYPE
$commands = Get-CommandsForLanguage -Lang $NEW_LANG
$languageConventions = Get-LanguageConventions -Lang $NEW_LANG
# Copied as-is; the 'escaped_' names are historical (no escaping is applied here)
$escaped_lang = $NEW_LANG
$escaped_framework = $NEW_FRAMEWORK
$escaped_branch = $CURRENT_BRANCH
$content = Get-Content -LiteralPath $temp -Raw -Encoding utf8
$content = $content -replace '\[PROJECT NAME\]',$ProjectName
$content = $content -replace '\[DATE\]',$Date.ToString('yyyy-MM-dd')
# Build the technology stack string safely
$techStackForTemplate = ""
if ($escaped_lang -and $escaped_framework) {
$techStackForTemplate = "- $escaped_lang + $escaped_framework ($escaped_branch)"
} elseif ($escaped_lang) {
$techStackForTemplate = "- $escaped_lang ($escaped_branch)"
} elseif ($escaped_framework) {
$techStackForTemplate = "- $escaped_framework ($escaped_branch)"
}
$content = $content -replace '\[EXTRACTED FROM ALL PLAN.MD FILES\]',$techStackForTemplate
# Embed the project structure; [Regex]::Escape turns its real newlines into
# literal \n sequences, which are converted back to newlines at the end.
$escapedStructure = [Regex]::Escape($projectStructure)
$content = $content -replace '\[ACTUAL STRUCTURE FROM PLANS\]',$escapedStructure
$content = $content -replace '\[ONLY COMMANDS FOR ACTIVE TECHNOLOGIES\]',$commands
$content = $content -replace '\[LANGUAGE-SPECIFIC, ONLY FOR LANGUAGES IN USE\]',$languageConventions
# Build the recent changes string safely
$recentChangesForTemplate = ""
if ($escaped_lang -and $escaped_framework) {
$recentChangesForTemplate = "- ${escaped_branch}: Added ${escaped_lang} + ${escaped_framework}"
} elseif ($escaped_lang) {
$recentChangesForTemplate = "- ${escaped_branch}: Added ${escaped_lang}"
} elseif ($escaped_framework) {
$recentChangesForTemplate = "- ${escaped_branch}: Added ${escaped_framework}"
}
$content = $content -replace '\[LAST 3 FEATURES AND WHAT THEY ADDED\]',$recentChangesForTemplate
# Convert literal \n sequences introduced by Escape to real newlines
$content = $content -replace '\\n',[Environment]::NewLine
$parent = Split-Path -Parent $TargetFile
if (-not (Test-Path $parent)) { New-Item -ItemType Directory -Path $parent | Out-Null }
Set-Content -LiteralPath $TargetFile -Value $content -NoNewline -Encoding utf8
Remove-Item $temp -Force
return $true
}
function Update-ExistingAgentFile {
param(
[Parameter(Mandatory=$true)]
[string]$TargetFile,
[Parameter(Mandatory=$true)]
[datetime]$Date
)
if (-not (Test-Path $TargetFile)) { return (New-AgentFile -TargetFile $TargetFile -ProjectName (Split-Path $REPO_ROOT -Leaf) -Date $Date) }
$techStack = Format-TechnologyStack -Lang $NEW_LANG -Framework $NEW_FRAMEWORK
$newTechEntries = @()
if ($techStack) {
$escapedTechStack = [Regex]::Escape($techStack)
if (-not (Select-String -Pattern $escapedTechStack -Path $TargetFile -Quiet)) {
$newTechEntries += "- $techStack ($CURRENT_BRANCH)"
}
}
if ($NEW_DB -and $NEW_DB -notin @('N/A','NEEDS CLARIFICATION')) {
$escapedDB = [Regex]::Escape($NEW_DB)
if (-not (Select-String -Pattern $escapedDB -Path $TargetFile -Quiet)) {
$newTechEntries += "- $NEW_DB ($CURRENT_BRANCH)"
}
}
$newChangeEntry = ''
if ($techStack) { $newChangeEntry = "- ${CURRENT_BRANCH}: Added ${techStack}" }
elseif ($NEW_DB -and $NEW_DB -notin @('N/A','NEEDS CLARIFICATION')) { $newChangeEntry = "- ${CURRENT_BRANCH}: Added ${NEW_DB}" }
$lines = Get-Content -LiteralPath $TargetFile -Encoding utf8
$output = New-Object System.Collections.Generic.List[string]
$inTech = $false; $inChanges = $false; $techAdded = $false; $changeAdded = $false; $existingChanges = 0
for ($i=0; $i -lt $lines.Count; $i++) {
$line = $lines[$i]
if ($line -eq '## Active Technologies') {
$output.Add($line)
$inTech = $true
continue
}
if ($inTech -and $line -match '^##\s') {
if (-not $techAdded -and $newTechEntries.Count -gt 0) { $newTechEntries | ForEach-Object { $output.Add($_) }; $techAdded = $true }
$output.Add($line); $inTech = $false; continue
}
if ($inTech -and [string]::IsNullOrWhiteSpace($line)) {
if (-not $techAdded -and $newTechEntries.Count -gt 0) { $newTechEntries | ForEach-Object { $output.Add($_) }; $techAdded = $true }
$output.Add($line); continue
}
if ($line -eq '## Recent Changes') {
$output.Add($line)
if ($newChangeEntry) { $output.Add($newChangeEntry); $changeAdded = $true }
$inChanges = $true
continue
}
if ($inChanges -and $line -match '^##\s') { $output.Add($line); $inChanges = $false; continue }
if ($inChanges -and $line -match '^- ') {
if ($existingChanges -lt 2) { $output.Add($line); $existingChanges++ }
continue
}
if ($line -match '\*\*Last updated\*\*: .*\d{4}-\d{2}-\d{2}') {
$output.Add(($line -replace '\d{4}-\d{2}-\d{2}',$Date.ToString('yyyy-MM-dd')))
continue
}
$output.Add($line)
}
# Post-loop check: if we're still in the Active Technologies section and haven't added new entries
if ($inTech -and -not $techAdded -and $newTechEntries.Count -gt 0) {
$newTechEntries | ForEach-Object { $output.Add($_) }
}
Set-Content -LiteralPath $TargetFile -Value ($output -join [Environment]::NewLine) -Encoding utf8
return $true
}
function Update-AgentFile {
param(
[Parameter(Mandatory=$true)]
[string]$TargetFile,
[Parameter(Mandatory=$true)]
[string]$AgentName
)
if (-not $TargetFile -or -not $AgentName) { Write-Err 'Update-AgentFile requires TargetFile and AgentName'; return $false }
Write-Info "Updating $AgentName context file: $TargetFile"
$projectName = Split-Path $REPO_ROOT -Leaf
$date = Get-Date
$dir = Split-Path -Parent $TargetFile
if (-not (Test-Path $dir)) { New-Item -ItemType Directory -Path $dir | Out-Null }
if (-not (Test-Path $TargetFile)) {
if (New-AgentFile -TargetFile $TargetFile -ProjectName $projectName -Date $date) { Write-Success "Created new $AgentName context file" } else { Write-Err 'Failed to create new agent file'; return $false }
} else {
try {
if (Update-ExistingAgentFile -TargetFile $TargetFile -Date $date) { Write-Success "Updated existing $AgentName context file" } else { Write-Err 'Failed to update agent file'; return $false }
} catch {
Write-Err "Cannot access or update existing file: $TargetFile. $_"
return $false
}
}
return $true
}
function Update-SpecificAgent {
param(
[Parameter(Mandatory=$true)]
[string]$Type
)
switch ($Type) {
'claude' { Update-AgentFile -TargetFile $CLAUDE_FILE -AgentName 'Claude Code' }
'gemini' { Update-AgentFile -TargetFile $GEMINI_FILE -AgentName 'Gemini CLI' }
'copilot' { Update-AgentFile -TargetFile $COPILOT_FILE -AgentName 'GitHub Copilot' }
'cursor-agent' { Update-AgentFile -TargetFile $CURSOR_FILE -AgentName 'Cursor IDE' }
'qwen' { Update-AgentFile -TargetFile $QWEN_FILE -AgentName 'Qwen Code' }
'opencode' { Update-AgentFile -TargetFile $AGENTS_FILE -AgentName 'opencode' }
'codex' { Update-AgentFile -TargetFile $AGENTS_FILE -AgentName 'Codex CLI' }
'windsurf' { Update-AgentFile -TargetFile $WINDSURF_FILE -AgentName 'Windsurf' }
'kilocode' { Update-AgentFile -TargetFile $KILOCODE_FILE -AgentName 'Kilo Code' }
'auggie' { Update-AgentFile -TargetFile $AUGGIE_FILE -AgentName 'Auggie CLI' }
'roo' { Update-AgentFile -TargetFile $ROO_FILE -AgentName 'Roo Code' }
'codebuddy' { Update-AgentFile -TargetFile $CODEBUDDY_FILE -AgentName 'CodeBuddy CLI' }
'qodercli' { Update-AgentFile -TargetFile $QODER_FILE -AgentName 'Qoder CLI' }
'amp' { Update-AgentFile -TargetFile $AMP_FILE -AgentName 'Amp' }
'shai' { Update-AgentFile -TargetFile $SHAI_FILE -AgentName 'SHAI' }
'q' { Update-AgentFile -TargetFile $Q_FILE -AgentName 'Amazon Q Developer CLI' }
'agy' { Update-AgentFile -TargetFile $AGY_FILE -AgentName 'Antigravity' }
'bob' { Update-AgentFile -TargetFile $BOB_FILE -AgentName 'IBM Bob' }
'generic' { Write-Info 'Generic agent: no predefined context file. Use the agent-specific update script for your agent.' }
default { Write-Err "Unknown agent type '$Type'"; Write-Err 'Expected: claude|gemini|copilot|cursor-agent|qwen|opencode|codex|windsurf|kilocode|auggie|roo|codebuddy|amp|shai|q|agy|bob|qodercli|generic'; return $false }
}
}
function Update-AllExistingAgents {
$found = $false
$ok = $true
if (Test-Path $CLAUDE_FILE) { if (-not (Update-AgentFile -TargetFile $CLAUDE_FILE -AgentName 'Claude Code')) { $ok = $false }; $found = $true }
if (Test-Path $GEMINI_FILE) { if (-not (Update-AgentFile -TargetFile $GEMINI_FILE -AgentName 'Gemini CLI')) { $ok = $false }; $found = $true }
if (Test-Path $COPILOT_FILE) { if (-not (Update-AgentFile -TargetFile $COPILOT_FILE -AgentName 'GitHub Copilot')) { $ok = $false }; $found = $true }
if (Test-Path $CURSOR_FILE) { if (-not (Update-AgentFile -TargetFile $CURSOR_FILE -AgentName 'Cursor IDE')) { $ok = $false }; $found = $true }
if (Test-Path $QWEN_FILE) { if (-not (Update-AgentFile -TargetFile $QWEN_FILE -AgentName 'Qwen Code')) { $ok = $false }; $found = $true }
if (Test-Path $AGENTS_FILE) { if (-not (Update-AgentFile -TargetFile $AGENTS_FILE -AgentName 'AGENTS.md (opencode/Codex/Amp/Q/Bob)')) { $ok = $false }; $found = $true }
if (Test-Path $WINDSURF_FILE) { if (-not (Update-AgentFile -TargetFile $WINDSURF_FILE -AgentName 'Windsurf')) { $ok = $false }; $found = $true }
if (Test-Path $KILOCODE_FILE) { if (-not (Update-AgentFile -TargetFile $KILOCODE_FILE -AgentName 'Kilo Code')) { $ok = $false }; $found = $true }
if (Test-Path $AUGGIE_FILE) { if (-not (Update-AgentFile -TargetFile $AUGGIE_FILE -AgentName 'Auggie CLI')) { $ok = $false }; $found = $true }
if (Test-Path $ROO_FILE) { if (-not (Update-AgentFile -TargetFile $ROO_FILE -AgentName 'Roo Code')) { $ok = $false }; $found = $true }
if (Test-Path $CODEBUDDY_FILE) { if (-not (Update-AgentFile -TargetFile $CODEBUDDY_FILE -AgentName 'CodeBuddy CLI')) { $ok = $false }; $found = $true }
if (Test-Path $QODER_FILE) { if (-not (Update-AgentFile -TargetFile $QODER_FILE -AgentName 'Qoder CLI')) { $ok = $false }; $found = $true }
if (Test-Path $SHAI_FILE) { if (-not (Update-AgentFile -TargetFile $SHAI_FILE -AgentName 'SHAI')) { $ok = $false }; $found = $true }
# $Q_FILE and $BOB_FILE both point at AGENTS.md, which is already handled above
if (Test-Path $AGY_FILE) { if (-not (Update-AgentFile -TargetFile $AGY_FILE -AgentName 'Antigravity')) { $ok = $false }; $found = $true }
if (-not $found) {
Write-Info 'No existing agent files found, creating default Claude file...'
if (-not (Update-AgentFile -TargetFile $CLAUDE_FILE -AgentName 'Claude Code')) { $ok = $false }
}
return $ok
}
function Print-Summary {
Write-Host ''
Write-Info 'Summary of changes:'
if ($NEW_LANG) { Write-Host " - Added language: $NEW_LANG" }
if ($NEW_FRAMEWORK) { Write-Host " - Added framework: $NEW_FRAMEWORK" }
if ($NEW_DB -and $NEW_DB -ne 'N/A') { Write-Host " - Added database: $NEW_DB" }
Write-Host ''
Write-Info 'Usage: ./update-agent-context.ps1 [-AgentType claude|gemini|copilot|cursor-agent|qwen|opencode|codex|windsurf|kilocode|auggie|roo|codebuddy|amp|shai|q|agy|bob|qodercli|generic]'
}
function Main {
Validate-Environment
Write-Info "=== Updating agent context files for feature $CURRENT_BRANCH ==="
if (-not (Parse-PlanData -PlanFile $NEW_PLAN)) { Write-Err 'Failed to parse plan data'; exit 1 }
$success = $true
if ($AgentType) {
Write-Info "Updating specific agent: $AgentType"
if (-not (Update-SpecificAgent -Type $AgentType)) { $success = $false }
}
else {
Write-Info 'No agent specified, updating all existing agent files...'
if (-not (Update-AllExistingAgents)) { $success = $false }
}
Print-Summary
if ($success) { Write-Success 'Agent context update completed successfully'; exit 0 } else { Write-Err 'Agent context update completed with errors'; exit 1 }
}
Main
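
`Extract-PlanField` and its placeholder filtering define the script's parsing contract for `plan.md`: match lines of the form `**Field**: value`, skip `NEEDS CLARIFICATION` and `N/A`, and take the first real value. A minimal sketch of that contract, written in Python purely for illustration (the sample plan text is an assumption):

```python
import re

PLACEHOLDERS = {"NEEDS CLARIFICATION", "N/A"}

def extract_plan_field(field: str, text: str) -> str:
    """Return the first non-placeholder value for '**<field>**: value', else ''."""
    pattern = re.compile(r"^\*\*" + re.escape(field) + r"\*\*: (.+)$")
    for line in text.splitlines():
        match = pattern.match(line)
        if match:
            value = match.group(1).strip()
            if value not in PLACEHOLDERS:
                return value  # first real value wins, like Select-Object -First 1
    return ""

sample_plan = (
    "**Language/Version**: Python 3.12\n"
    "**Primary Dependencies**: FastAPI\n"
    "**Storage**: N/A\n"
)
print(extract_plan_field("Language/Version", sample_plan))  # Python 3.12
print(extract_plan_field("Storage", sample_plan))           # prints an empty line
```

In the script itself the callers (e.g. `Format-TechnologyStack`) filter placeholder values a second time, so they can never reach the generated agent files.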


@@ -1,28 +0,0 @@
# [PROJECT NAME] Development Guidelines
Auto-generated from all feature plans. Last updated: [DATE]
## Active Technologies
[EXTRACTED FROM ALL PLAN.MD FILES]
## Project Structure
```text
[ACTUAL STRUCTURE FROM PLANS]
```
## Commands
[ONLY COMMANDS FOR ACTIVE TECHNOLOGIES]
## Code Style
[LANGUAGE-SPECIFIC, ONLY FOR LANGUAGES IN USE]
## Recent Changes
[LAST 3 FEATURES AND WHAT THEY ADDED]
<!-- MANUAL ADDITIONS START -->
<!-- MANUAL ADDITIONS END -->


@@ -1,40 +0,0 @@
# [CHECKLIST TYPE] Checklist: [FEATURE NAME]
**Purpose**: [Brief description of what this checklist covers]
**Created**: [DATE]
**Feature**: [Link to spec.md or relevant documentation]
**Note**: This checklist is generated by the `/speckit.checklist` command based on feature context and requirements.
<!--
============================================================================
IMPORTANT: The checklist items below are SAMPLE ITEMS for illustration only.
The /speckit.checklist command MUST replace these with actual items based on:
- User's specific checklist request
- Feature requirements from spec.md
- Technical context from plan.md
- Implementation details from tasks.md
DO NOT keep these sample items in the generated checklist file.
============================================================================
-->
## [Category 1]
- [ ] CHK001 First checklist item with clear action
- [ ] CHK002 Second checklist item
- [ ] CHK003 Third checklist item
## [Category 2]
- [ ] CHK004 Another category item
- [ ] CHK005 Item with specific criteria
- [ ] CHK006 Final item in this category
## Notes
- Check items off as completed: `[x]`
- Add comments or findings inline
- Link to relevant resources or documentation
- Items are numbered sequentially for easy reference


@@ -1,50 +0,0 @@
# [PROJECT_NAME] Constitution
<!-- Example: Spec Constitution, TaskFlow Constitution, etc. -->
## Core Principles
### [PRINCIPLE_1_NAME]
<!-- Example: I. Library-First -->
[PRINCIPLE_1_DESCRIPTION]
<!-- Example: Every feature starts as a standalone library; Libraries must be self-contained, independently testable, documented; Clear purpose required - no organizational-only libraries -->
### [PRINCIPLE_2_NAME]
<!-- Example: II. CLI Interface -->
[PRINCIPLE_2_DESCRIPTION]
<!-- Example: Every library exposes functionality via CLI; Text in/out protocol: stdin/args → stdout, errors → stderr; Support JSON + human-readable formats -->
### [PRINCIPLE_3_NAME]
<!-- Example: III. Test-First (NON-NEGOTIABLE) -->
[PRINCIPLE_3_DESCRIPTION]
<!-- Example: TDD mandatory: Tests written → User approved → Tests fail → Then implement; Red-Green-Refactor cycle strictly enforced -->
### [PRINCIPLE_4_NAME]
<!-- Example: IV. Integration Testing -->
[PRINCIPLE_4_DESCRIPTION]
<!-- Example: Focus areas requiring integration tests: New library contract tests, Contract changes, Inter-service communication, Shared schemas -->
### [PRINCIPLE_5_NAME]
<!-- Example: V. Observability, VI. Versioning & Breaking Changes, VII. Simplicity -->
[PRINCIPLE_5_DESCRIPTION]
<!-- Example: Text I/O ensures debuggability; Structured logging required; Or: MAJOR.MINOR.BUILD format; Or: Start simple, YAGNI principles -->
## [SECTION_2_NAME]
<!-- Example: Additional Constraints, Security Requirements, Performance Standards, etc. -->
[SECTION_2_CONTENT]
<!-- Example: Technology stack requirements, compliance standards, deployment policies, etc. -->
## [SECTION_3_NAME]
<!-- Example: Development Workflow, Review Process, Quality Gates, etc. -->
[SECTION_3_CONTENT]
<!-- Example: Code review requirements, testing gates, deployment approval process, etc. -->
## Governance
<!-- Example: Constitution supersedes all other practices; Amendments require documentation, approval, migration plan -->
[GOVERNANCE_RULES]
<!-- Example: All PRs/reviews must verify compliance; Complexity must be justified; Use [GUIDANCE_FILE] for runtime development guidance -->
**Version**: [CONSTITUTION_VERSION] | **Ratified**: [RATIFICATION_DATE] | **Last Amended**: [LAST_AMENDED_DATE]
<!-- Example: Version: 2.1.1 | Ratified: 2025-06-13 | Last Amended: 2025-07-16 -->


@@ -1,104 +0,0 @@
# Implementation Plan: [FEATURE]
**Branch**: `[###-feature-name]` | **Date**: [DATE] | **Spec**: [link]
**Input**: Feature specification from `/specs/[###-feature-name]/spec.md`
**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/plan-template.md` for the execution workflow.
## Summary
[Extract from feature spec: primary requirement + technical approach from research]
## Technical Context
<!--
ACTION REQUIRED: Replace the content in this section with the technical details
for the project. The structure here is presented in an advisory capacity to guide
the iteration process.
-->
**Language/Version**: [e.g., Python 3.11, Swift 5.9, Rust 1.75 or NEEDS CLARIFICATION]
**Primary Dependencies**: [e.g., FastAPI, UIKit, LLVM or NEEDS CLARIFICATION]
**Storage**: [if applicable, e.g., PostgreSQL, CoreData, files or N/A]
**Testing**: [e.g., pytest, XCTest, cargo test or NEEDS CLARIFICATION]
**Target Platform**: [e.g., Linux server, iOS 15+, WASM or NEEDS CLARIFICATION]
**Project Type**: [e.g., library/cli/web-service/mobile-app/compiler/desktop-app or NEEDS CLARIFICATION]
**Performance Goals**: [domain-specific, e.g., 1000 req/s, 10k lines/sec, 60 fps or NEEDS CLARIFICATION]
**Constraints**: [domain-specific, e.g., <200ms p95, <100MB memory, offline-capable or NEEDS CLARIFICATION]
**Scale/Scope**: [domain-specific, e.g., 10k users, 1M LOC, 50 screens or NEEDS CLARIFICATION]
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
[Gates determined based on constitution file]
## Project Structure
### Documentation (this feature)
```text
specs/[###-feature]/
├── plan.md # This file (/speckit.plan command output)
├── research.md # Phase 0 output (/speckit.plan command)
├── data-model.md # Phase 1 output (/speckit.plan command)
├── quickstart.md # Phase 1 output (/speckit.plan command)
├── contracts/ # Phase 1 output (/speckit.plan command)
└── tasks.md # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```
### Source Code (repository root)
<!--
ACTION REQUIRED: Replace the placeholder tree below with the concrete layout
for this feature. Delete unused options and expand the chosen structure with
real paths (e.g., apps/admin, packages/something). The delivered plan must
not include Option labels.
-->
```text
# [REMOVE IF UNUSED] Option 1: Single project (DEFAULT)
src/
├── models/
├── services/
├── cli/
└── lib/
tests/
├── contract/
├── integration/
└── unit/
# [REMOVE IF UNUSED] Option 2: Web application (when "frontend" + "backend" detected)
backend/
├── src/
│ ├── models/
│ ├── services/
│ └── api/
└── tests/
frontend/
├── src/
│ ├── components/
│ ├── pages/
│ └── services/
└── tests/
# [REMOVE IF UNUSED] Option 3: Mobile + API (when "iOS/Android" detected)
api/
└── [same as backend above]
ios/ or android/
└── [platform-specific structure: feature modules, UI flows, platform tests]
```
**Structure Decision**: [Document the selected structure and reference the real
directories captured above]
## Complexity Tracking
> **Fill ONLY if Constitution Check has violations that must be justified**
| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| [e.g., 4th project] | [current need] | [why 3 projects insufficient] |
| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient] |


@@ -1,115 +0,0 @@
# Feature Specification: [FEATURE NAME]
**Feature Branch**: `[###-feature-name]`
**Created**: [DATE]
**Status**: Draft
**Input**: User description: "$ARGUMENTS"
## User Scenarios & Testing *(mandatory)*
<!--
IMPORTANT: User stories should be PRIORITIZED as user journeys ordered by importance.
Each user story/journey must be INDEPENDENTLY TESTABLE - meaning if you implement just ONE of them,
you should still have a viable MVP (Minimum Viable Product) that delivers value.
Assign priorities (P1, P2, P3, etc.) to each story, where P1 is the most critical.
Think of each story as a standalone slice of functionality that can be:
- Developed independently
- Tested independently
- Deployed independently
- Demonstrated to users independently
-->
### User Story 1 - [Brief Title] (Priority: P1)
[Describe this user journey in plain language]
**Why this priority**: [Explain the value and why it has this priority level]
**Independent Test**: [Describe how this can be tested independently - e.g., "Can be fully tested by [specific action] and delivers [specific value]"]
**Acceptance Scenarios**:
1. **Given** [initial state], **When** [action], **Then** [expected outcome]
2. **Given** [initial state], **When** [action], **Then** [expected outcome]
---
### User Story 2 - [Brief Title] (Priority: P2)
[Describe this user journey in plain language]
**Why this priority**: [Explain the value and why it has this priority level]
**Independent Test**: [Describe how this can be tested independently]
**Acceptance Scenarios**:
1. **Given** [initial state], **When** [action], **Then** [expected outcome]
---
### User Story 3 - [Brief Title] (Priority: P3)
[Describe this user journey in plain language]
**Why this priority**: [Explain the value and why it has this priority level]
**Independent Test**: [Describe how this can be tested independently]
**Acceptance Scenarios**:
1. **Given** [initial state], **When** [action], **Then** [expected outcome]
---
[Add more user stories as needed, each with an assigned priority]
### Edge Cases
<!--
ACTION REQUIRED: The content in this section represents placeholders.
Fill them out with the right edge cases.
-->
- What happens when [boundary condition]?
- How does system handle [error scenario]?
## Requirements *(mandatory)*
<!--
ACTION REQUIRED: The content in this section represents placeholders.
Fill them out with the right functional requirements.
-->
### Functional Requirements
- **FR-001**: System MUST [specific capability, e.g., "allow users to create accounts"]
- **FR-002**: System MUST [specific capability, e.g., "validate email addresses"]
- **FR-003**: Users MUST be able to [key interaction, e.g., "reset their password"]
- **FR-004**: System MUST [data requirement, e.g., "persist user preferences"]
- **FR-005**: System MUST [behavior, e.g., "log all security events"]
*Example of marking unclear requirements:*
- **FR-006**: System MUST authenticate users via [NEEDS CLARIFICATION: auth method not specified - email/password, SSO, OAuth?]
- **FR-007**: System MUST retain user data for [NEEDS CLARIFICATION: retention period not specified]
### Key Entities *(include if feature involves data)*
- **[Entity 1]**: [What it represents, key attributes without implementation]
- **[Entity 2]**: [What it represents, relationships to other entities]
## Success Criteria *(mandatory)*
<!--
ACTION REQUIRED: Define measurable success criteria.
These must be technology-agnostic and measurable.
-->
### Measurable Outcomes
- **SC-001**: [Measurable metric, e.g., "Users can complete account creation in under 2 minutes"]
- **SC-002**: [Measurable metric, e.g., "System handles 1000 concurrent users without degradation"]
- **SC-003**: [User satisfaction metric, e.g., "90% of users successfully complete primary task on first attempt"]
- **SC-004**: [Business metric, e.g., "Reduce support tickets related to [X] by 50%"]


@@ -1,251 +0,0 @@
---
description: "Task list template for feature implementation"
---
# Tasks: [FEATURE NAME]
**Input**: Design documents from `/specs/[###-feature-name]/`
**Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/
**Tests**: The examples below include test tasks. Tests are OPTIONAL - only include them if explicitly requested in the feature specification.
**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions
## Path Conventions
- **Single project**: `src/`, `tests/` at repository root
- **Web app**: `backend/src/`, `frontend/src/`
- **Mobile**: `api/src/`, `ios/src/` or `android/src/`
- Paths shown below assume a single project - adjust based on plan.md structure
<!--
============================================================================
IMPORTANT: The tasks below are SAMPLE TASKS for illustration purposes only.
The /speckit.tasks command MUST replace these with actual tasks based on:
- User stories from spec.md (with their priorities P1, P2, P3...)
- Feature requirements from plan.md
- Entities from data-model.md
- Endpoints from contracts/
Tasks MUST be organized by user story so each story can be:
- Implemented independently
- Tested independently
- Delivered as an MVP increment
DO NOT keep these sample tasks in the generated tasks.md file.
============================================================================
-->
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Project initialization and basic structure
- [ ] T001 Create project structure per implementation plan
- [ ] T002 Initialize [language] project with [framework] dependencies
- [ ] T003 [P] Configure linting and formatting tools
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
Examples of foundational tasks (adjust based on your project):
- [ ] T004 Setup database schema and migrations framework
- [ ] T005 [P] Implement authentication/authorization framework
- [ ] T006 [P] Setup API routing and middleware structure
- [ ] T007 Create base models/entities that all stories depend on
- [ ] T008 Configure error handling and logging infrastructure
- [ ] T009 Setup environment configuration management
**Checkpoint**: Foundation ready - user story implementation can now begin in parallel
---
## Phase 3: User Story 1 - [Title] (Priority: P1) 🎯 MVP
**Goal**: [Brief description of what this story delivers]
**Independent Test**: [How to verify this story works on its own]
### Tests for User Story 1 (OPTIONAL - only if tests requested) ⚠️
> **NOTE: Write these tests FIRST, ensure they FAIL before implementation**
- [ ] T010 [P] [US1] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T011 [P] [US1] Integration test for [user journey] in tests/integration/test_[name].py
### Implementation for User Story 1
- [ ] T012 [P] [US1] Create [Entity1] model in src/models/[entity1].py
- [ ] T013 [P] [US1] Create [Entity2] model in src/models/[entity2].py
- [ ] T014 [US1] Implement [Service] in src/services/[service].py (depends on T012, T013)
- [ ] T015 [US1] Implement [endpoint/feature] in src/[location]/[file].py
- [ ] T016 [US1] Add validation and error handling
- [ ] T017 [US1] Add logging for user story 1 operations
**Checkpoint**: At this point, User Story 1 should be fully functional and testable independently
---
## Phase 4: User Story 2 - [Title] (Priority: P2)
**Goal**: [Brief description of what this story delivers]
**Independent Test**: [How to verify this story works on its own]
### Tests for User Story 2 (OPTIONAL - only if tests requested) ⚠️
- [ ] T018 [P] [US2] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T019 [P] [US2] Integration test for [user journey] in tests/integration/test_[name].py
### Implementation for User Story 2
- [ ] T020 [P] [US2] Create [Entity] model in src/models/[entity].py
- [ ] T021 [US2] Implement [Service] in src/services/[service].py
- [ ] T022 [US2] Implement [endpoint/feature] in src/[location]/[file].py
- [ ] T023 [US2] Integrate with User Story 1 components (if needed)
**Checkpoint**: At this point, User Stories 1 AND 2 should both work independently
---
## Phase 5: User Story 3 - [Title] (Priority: P3)
**Goal**: [Brief description of what this story delivers]
**Independent Test**: [How to verify this story works on its own]
### Tests for User Story 3 (OPTIONAL - only if tests requested) ⚠️
- [ ] T024 [P] [US3] Contract test for [endpoint] in tests/contract/test_[name].py
- [ ] T025 [P] [US3] Integration test for [user journey] in tests/integration/test_[name].py
### Implementation for User Story 3
- [ ] T026 [P] [US3] Create [Entity] model in src/models/[entity].py
- [ ] T027 [US3] Implement [Service] in src/services/[service].py
- [ ] T028 [US3] Implement [endpoint/feature] in src/[location]/[file].py
**Checkpoint**: All user stories should now be independently functional
---
[Add more user story phases as needed, following the same pattern]
---
## Phase N: Polish & Cross-Cutting Concerns
**Purpose**: Improvements that affect multiple user stories
- [ ] TXXX [P] Documentation updates in docs/
- [ ] TXXX Code cleanup and refactoring
- [ ] TXXX Performance optimization across all stories
- [ ] TXXX [P] Additional unit tests (if requested) in tests/unit/
- [ ] TXXX Security hardening
- [ ] TXXX Run quickstart.md validation
---
## Dependencies & Execution Order
### Phase Dependencies
- **Setup (Phase 1)**: No dependencies - can start immediately
- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
- **User Stories (Phase 3+)**: All depend on Foundational phase completion
- User stories can then proceed in parallel (if staffed)
- Or sequentially in priority order (P1 → P2 → P3)
- **Polish (Final Phase)**: Depends on all desired user stories being complete
### User Story Dependencies
- **User Story 1 (P1)**: Can start after Foundational (Phase 2) - No dependencies on other stories
- **User Story 2 (P2)**: Can start after Foundational (Phase 2) - May integrate with US1 but should be independently testable
- **User Story 3 (P3)**: Can start after Foundational (Phase 2) - May integrate with US1/US2 but should be independently testable
### Within Each User Story
- Tests (if included) MUST be written and FAIL before implementation
- Models before services
- Services before endpoints
- Core implementation before integration
- Story complete before moving to next priority
### Parallel Opportunities
- All Setup tasks marked [P] can run in parallel
- All Foundational tasks marked [P] can run in parallel (within Phase 2)
- Once Foundational phase completes, all user stories can start in parallel (if team capacity allows)
- All tests for a user story marked [P] can run in parallel
- Models within a story marked [P] can run in parallel
- Different user stories can be worked on in parallel by different team members
---
## Parallel Example: User Story 1
```bash
# Launch all tests for User Story 1 together (if tests requested):
Task: "Contract test for [endpoint] in tests/contract/test_[name].py"
Task: "Integration test for [user journey] in tests/integration/test_[name].py"
# Launch all models for User Story 1 together:
Task: "Create [Entity1] model in src/models/[entity1].py"
Task: "Create [Entity2] model in src/models/[entity2].py"
```
---
## Implementation Strategy
### MVP First (User Story 1 Only)
1. Complete Phase 1: Setup
2. Complete Phase 2: Foundational (CRITICAL - blocks all stories)
3. Complete Phase 3: User Story 1
4. **STOP and VALIDATE**: Test User Story 1 independently
5. Deploy/demo if ready
### Incremental Delivery
1. Complete Setup + Foundational → Foundation ready
2. Add User Story 1 → Test independently → Deploy/Demo (MVP!)
3. Add User Story 2 → Test independently → Deploy/Demo
4. Add User Story 3 → Test independently → Deploy/Demo
5. Each story adds value without breaking previous stories
### Parallel Team Strategy
With multiple developers:
1. Team completes Setup + Foundational together
2. Once Foundational is done:
- Developer A: User Story 1
- Developer B: User Story 2
- Developer C: User Story 3
3. Stories complete and integrate independently
---
## Notes
- [P] tasks = different files, no dependencies
- [Story] label maps task to specific user story for traceability
- Each user story should be independently completable and testable
- Verify tests fail before implementing
- Commit after each task or logical group
- Stop at any checkpoint to validate story independently
- Avoid: vague tasks, same file conflicts, cross-story dependencies that break independence

CLAUDE.md
@@ -1,289 +0,0 @@
# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
An Actix-web REST API for serving images and videos from a filesystem with automatic thumbnail generation, EXIF extraction, tag organization, and a memories feature for browsing photos by date. Uses SQLite/Diesel ORM for data persistence and ffmpeg for video processing.
## Development Commands
### Building & Running
```bash
# Build for development
cargo build
# Build for release (uses thin LTO optimization)
cargo build --release
# Run the server (requires .env file with DATABASE_URL, BASE_PATH, THUMBNAILS, VIDEO_PATH, BIND_URL, SECRET_KEY)
cargo run
# Run with specific log level
RUST_LOG=debug cargo run
```
### Testing
```bash
# Run all tests (requires BASE_PATH in .env)
cargo test
# Run specific test
cargo test test_name
# Run tests with output
cargo test -- --nocapture
```
### Database Migrations
```bash
# Install diesel CLI (one-time setup)
cargo install diesel_cli --no-default-features --features sqlite
# Create new migration
diesel migration generate migration_name
# Run migrations (also runs automatically on app startup)
diesel migration run
# Revert last migration
diesel migration revert
# Regenerate schema.rs after manual migration changes
diesel print-schema > src/database/schema.rs
```
### Code Quality
```bash
# Format code
cargo fmt
# Run clippy linter
cargo clippy
# Fix automatically fixable issues
cargo fix
```
### Utility Binaries
```bash
# Two-phase cleanup: resolve missing files and validate file types
cargo run --bin cleanup_files -- --base-path /path/to/media --database-url ./database.db
# Batch extract EXIF for existing files
cargo run --bin migrate_exif
```
## Architecture Overview
### Core Components
**Layered Architecture:**
- **HTTP Layer** (`main.rs`): Route handlers for images, videos, metadata, tags, favorites, memories
- **Auth Layer** (`auth.rs`): JWT token validation, Claims extraction via FromRequest trait
- **Service Layer** (`files.rs`, `exif.rs`, `memories.rs`): Business logic for file operations and EXIF extraction
- **DAO Layer** (`database/mod.rs`): Trait-based data access (ExifDao, UserDao, FavoriteDao, TagDao)
- **Database Layer**: Diesel ORM with SQLite, schema in `database/schema.rs`
**Async Actor System (Actix):**
- `StreamActor`: Manages ffmpeg video processing lifecycle
- `VideoPlaylistManager`: Scans directories and queues videos
- `PlaylistGenerator`: Creates HLS playlists for video streaming
### Database Schema & Patterns
**Tables:**
- `users`: Authentication (id, username, password_hash)
- `favorites`: User-specific favorites (userid, path)
- `tags`: Custom labels with timestamps
- `tagged_photo`: Many-to-many photo-tag relationships
- `image_exif`: Rich metadata (file_path + 16 EXIF fields: camera, GPS, dates, exposure settings)
**DAO Pattern:**
All database access goes through trait-based DAOs (e.g., `ExifDao`, `SqliteExifDao`). Connection pooling uses `Arc<Mutex<SqliteConnection>>`. All DB operations are traced with OpenTelemetry in release builds.
**Key DAO Methods:**
- `store_exif()`, `get_exif()`, `get_exif_batch()`: EXIF CRUD operations
- `query_by_exif()`: Complex filtering by camera, GPS bounds, date ranges
- Batch operations minimize DB hits during file watching
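The trait-based DAO layering can be sketched with a toy trait and an in-memory store standing in for SQLite (the struct and its fields below are illustrative, not the project's real types):

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Illustrative record; the real table has ~16 EXIF columns.
#[derive(Clone, Debug, PartialEq)]
struct ExifRecord {
    file_path: String,
    camera_make: Option<String>,
}

// Trait-based DAO: handlers depend on the trait, not on a backend.
trait ExifDao {
    fn store_exif(&self, record: ExifRecord);
    fn get_exif(&self, path: &str) -> Option<ExifRecord>;
}

// Stand-in for SqliteExifDao; shares one connection-like resource
// behind Arc<Mutex<..>>, matching the pooling style described above.
struct InMemoryExifDao {
    store: Arc<Mutex<HashMap<String, ExifRecord>>>,
}

impl ExifDao for InMemoryExifDao {
    fn store_exif(&self, record: ExifRecord) {
        self.store
            .lock()
            .unwrap()
            .insert(record.file_path.clone(), record);
    }
    fn get_exif(&self, path: &str) -> Option<ExifRecord> {
        self.store.lock().unwrap().get(path).cloned()
    }
}

fn main() {
    let dao = InMemoryExifDao { store: Arc::new(Mutex::new(HashMap::new())) };
    dao.store_exif(ExifRecord { file_path: "a.jpg".into(), camera_make: Some("Canon".into()) });
    assert_eq!(dao.get_exif("a.jpg").unwrap().camera_make.as_deref(), Some("Canon"));
    assert!(dao.get_exif("missing.jpg").is_none());
}
```

Code written against the trait can be exercised with the in-memory implementation in tests and the Diesel-backed one in production.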
### File Processing Pipeline
**Thumbnail Generation:**
1. Startup scan: Rayon parallel walk of BASE_PATH
2. Creates 200x200 thumbnails in THUMBNAILS directory (mirrors source structure)
3. Videos: extracts frame at 3-second mark via ffmpeg
4. Images: uses `image` crate for JPEG/PNG processing
**File Watching:**
Runs in a background thread with a two-tier strategy:
- **Quick scan** (default 60s): Recently modified files only
- **Full scan** (default 3600s): Comprehensive directory check
- Batch queries EXIF DB to detect new files
- Configurable via `WATCH_QUICK_INTERVAL_SECONDS` and `WATCH_FULL_INTERVAL_SECONDS`
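A minimal sketch of the two-tier scheduling decision, assuming the watcher tracks elapsed time since each scan kind last ran (the names here are hypothetical, not the project's actual code):

```rust
use std::time::Duration;

// Hypothetical scheduler decision for the two-tier watcher: run a full
// scan when the full interval has elapsed, otherwise a quick scan when
// the quick interval has elapsed, otherwise keep waiting.
#[derive(Debug, PartialEq)]
enum ScanKind { Quick, Full, Wait }

fn next_scan(
    since_quick: Duration,
    since_full: Duration,
    quick_every: Duration,
    full_every: Duration,
) -> ScanKind {
    if since_full >= full_every {
        ScanKind::Full
    } else if since_quick >= quick_every {
        ScanKind::Quick
    } else {
        ScanKind::Wait
    }
}

fn main() {
    let quick = Duration::from_secs(60);
    let full = Duration::from_secs(3600);
    assert_eq!(next_scan(Duration::from_secs(61), Duration::from_secs(61), quick, full), ScanKind::Quick);
    assert_eq!(next_scan(Duration::from_secs(10), Duration::from_secs(3600), quick, full), ScanKind::Full);
    assert_eq!(next_scan(Duration::from_secs(10), Duration::from_secs(10), quick, full), ScanKind::Wait);
}
```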
**EXIF Extraction:**
- Uses `kamadak-exif` crate
- Supports: JPEG, TIFF, RAW (NEF, CR2, CR3), HEIF/HEIC, PNG, WebP
- Extracts: camera make/model, lens, dimensions, GPS coordinates, focal length, aperture, shutter speed, ISO, date taken
- Triggered on upload and during file watching
**File Upload Behavior:**
If the target file already exists, the upload handler appends a Unix timestamp to the filename (`photo_1735124234.jpg`) to preserve history without overwriting.
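The rename behavior can be sketched as a pure helper (the function name is hypothetical; the real handler also checks for existence on disk):

```rust
use std::path::Path;

// Hypothetical helper mirroring the described behavior: when the target
// name is taken, insert a Unix timestamp before the extension.
fn timestamped_name(name: &str, unix_ts: u64) -> String {
    let path = Path::new(name);
    let stem = path.file_stem().and_then(|s| s.to_str()).unwrap_or(name);
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) => format!("{}_{}.{}", stem, unix_ts, ext),
        None => format!("{}_{}", stem, unix_ts),
    }
}

fn main() {
    assert_eq!(timestamped_name("photo.jpg", 1735124234), "photo_1735124234.jpg");
    assert_eq!(timestamped_name("clip", 1735124234), "clip_1735124234");
}
```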
### Authentication Flow
**Login:**
1. POST `/login` with username/password
2. Verify with `bcrypt::verify()` against password_hash
3. Generate JWT with claims: `{ sub: user_id, exp: 5_days_from_now }`
4. Sign with HS256 using `SECRET_KEY` environment variable
**Authorization:**
All protected endpoints extract `Claims` via a `FromRequest` trait implementation. The token is passed in the `Authorization: Bearer <token>` header.
### API Structure
**Key Endpoint Patterns:**
```rust
// Image serving & upload
GET /image?path=...&size=...&format=...
POST /image (multipart file upload)
// Metadata & EXIF
GET /image/metadata?path=...
// Advanced search with filters
GET /photos?path=...&recursive=true&sort=DateTakenDesc&camera_make=Canon&gps_lat=...&gps_lon=...&gps_radius_km=10&date_from=...&date_to=...&tag_ids=1,2,3&media_type=Photo
// Video streaming (HLS)
POST /video/generate (creates .m3u8 playlist + .ts segments)
GET /video/stream?path=... (serves playlist)
// Tags
GET /image/tags/all
POST /image/tags (add tag to file)
DELETE /image/tags (remove tag from file)
POST /image/tags/batch (bulk tag updates)
// Memories (week-based grouping)
GET /memories?path=...&recursive=true
```
**Request Types:**
- `FilesRequest`: Supports complex filtering (tags, EXIF fields, GPS radius, date ranges)
- `SortType`: Shuffle, NameAsc/Desc, TagCountAsc/Desc, DateTakenAsc/Desc
### Important Patterns
**Service Builder Pattern:**
Routes are registered via the composable `ServiceBuilder` trait in `service.rs`, which allows features to be added modularly.
**Path Validation:**
Always use `is_valid_full_path(&base_path, &requested_path, check_exists)` to prevent directory traversal attacks.
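A std-only sketch of the traversal check, assuming lexical resolution of the requested path against the base (the real code uses the `path-absolutize` crate and also verifies existence; this version only illustrates the containment idea):

```rust
use std::path::{Component, Path, PathBuf};

// Lexically resolve a user-supplied relative path against a base
// directory and reject any result that escapes the base.
fn is_within_base(base: &Path, requested: &str) -> bool {
    let mut resolved = PathBuf::from(base);
    for component in Path::new(requested).components() {
        match component {
            Component::Normal(part) => resolved.push(part),
            Component::ParentDir => {
                // Refuse to pop above the base directory.
                if !resolved.pop() || !resolved.starts_with(base) {
                    return false;
                }
            }
            Component::CurDir => {}
            // Absolute roots or prefixes in user input are rejected.
            _ => return false,
        }
    }
    resolved.starts_with(base)
}

fn main() {
    let base = Path::new("/srv/media");
    assert!(is_within_base(base, "2024/vacation/beach.jpg"));
    assert!(!is_within_base(base, "../etc/passwd"));
    assert!(!is_within_base(base, "a/../../etc/passwd"));
    assert!(!is_within_base(base, "/etc/passwd"));
}
```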
**File Type Detection:**
Centralized in `file_types.rs` with constants `IMAGE_EXTENSIONS` and `VIDEO_EXTENSIONS`. Provides both `Path` and `DirEntry` variants for performance.
**OpenTelemetry Tracing:**
All database operations and HTTP handlers are wrapped in spans. Release builds export to the OTLP endpoint configured by `OTLP_OTLS_ENDPOINT`; debug builds use a basic logger.
**Memory Exclusion:**
`PathExcluder` in `memories.rs` filters out directories from memories API via `EXCLUDED_DIRS` environment variable (comma-separated paths or substring patterns).
### Startup Sequence
1. Load `.env` file
2. Run embedded Diesel migrations
3. Spawn file watcher thread
4. Create initial thumbnails (parallel scan)
5. Generate video GIF thumbnails
6. Initialize AppState with Actix actors
7. Set up Prometheus metrics (`imageserver_image_total`, `imageserver_video_total`)
8. Scan directory for videos and queue HLS processing
9. Start HTTP server on `BIND_URL` + localhost:8088
## Testing Patterns
Tests require the `BASE_PATH` environment variable. Many integration tests create temporary directories and files.
When testing database code:
- Use in-memory SQLite: `DATABASE_URL=":memory:"`
- Run migrations in test setup
- Clean up with `DROP TABLE` or use `#[serial]` from `serial_test` crate if parallel tests conflict
## Common Gotchas
**EXIF Date Parsing:**
Multiple formats are supported (EXIF DateTime, ISO 8601, Unix timestamp); a fallback chain tries each parser in turn.
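The fallback-chain idea can be sketched with two simplified parsers chained via `Option::or_else` (the real parsing is far richer; these parser shapes are assumptions for illustration only):

```rust
#[derive(Debug, PartialEq)]
enum ParsedDate {
    Unix(i64),
    ExifLike(String),
}

// First parser in the chain: a bare Unix timestamp.
fn try_unix(s: &str) -> Option<ParsedDate> {
    s.parse::<i64>().ok().map(ParsedDate::Unix)
}

// Second parser: a crude check for the EXIF "YYYY:MM:DD HH:MM:SS" shape.
fn try_exif(s: &str) -> Option<ParsedDate> {
    let b = s.as_bytes();
    if b.len() >= 10 && b[4] == b':' && b[7] == b':' {
        Some(ParsedDate::ExifLike(s.to_string()))
    } else {
        None
    }
}

// The fallback chain: attempt each format in turn, stop at first success.
fn parse_date_taken(s: &str) -> Option<ParsedDate> {
    try_unix(s).or_else(|| try_exif(s))
}

fn main() {
    assert_eq!(parse_date_taken("1735124234"), Some(ParsedDate::Unix(1735124234)));
    assert!(matches!(parse_date_taken("2024:08:15 10:30:00"), Some(ParsedDate::ExifLike(_))));
    assert_eq!(parse_date_taken("not a date"), None);
}
```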
**Video Processing:**
ffmpeg processes run asynchronously via actors. Use `StreamActor` to track completion. HLS segments written to `VIDEO_PATH`.
**File Extensions:**
Extension detection is case-insensitive. Use `file_types.rs` helpers rather than manual string matching.
**Migration Workflow:**
After creating a migration, manually edit the SQL, then regenerate `schema.rs` with `diesel print-schema`. Migrations auto-run on startup via `embedded_migrations!()` macro.
**Path Absolutization:**
Use `path-absolutize` crate's `.absolutize()` method when converting user-provided paths to ensure they're within `BASE_PATH`.
## Required Environment Variables
```bash
DATABASE_URL=./database.db # SQLite database path
BASE_PATH=/path/to/media # Root media directory
THUMBNAILS=/path/to/thumbnails # Thumbnail storage
VIDEO_PATH=/path/to/video/hls # HLS playlist output
GIFS_DIRECTORY=/path/to/gifs # Video GIF thumbnails
BIND_URL=0.0.0.0:8080 # Server binding
CORS_ALLOWED_ORIGINS=http://localhost:3000
SECRET_KEY=your-secret-key-here # JWT signing secret
RUST_LOG=info # Log level
EXCLUDED_DIRS=/private,/archive # Comma-separated paths to exclude from memories
```
Optional:
```bash
WATCH_QUICK_INTERVAL_SECONDS=60 # Quick scan interval
WATCH_FULL_INTERVAL_SECONDS=3600 # Full scan interval
OTLP_OTLS_ENDPOINT=http://... # OpenTelemetry collector (release builds)
# AI Insights Configuration
OLLAMA_PRIMARY_URL=http://desktop:11434 # Primary Ollama server (e.g., desktop)
OLLAMA_FALLBACK_URL=http://server:11434 # Fallback Ollama server (optional, always-on)
OLLAMA_PRIMARY_MODEL=nemotron-3-nano:30b # Model for primary server (default: nemotron-3-nano:30b)
OLLAMA_FALLBACK_MODEL=llama3.2:3b # Model for fallback server (optional, uses primary if not set)
SMS_API_URL=http://localhost:8000 # SMS message API endpoint (default: localhost:8000)
SMS_API_TOKEN=your-api-token # SMS API authentication token (optional)
```
**AI Insights Fallback Behavior:**
- Primary server is tried first with its configured model (5-second connection timeout)
- On connection failure, automatically falls back to secondary server with its model (if configured)
- If `OLLAMA_FALLBACK_MODEL` not set, uses same model as primary server on fallback
- Total request timeout is 120 seconds to accommodate slow LLM inference
- Logs indicate which server and model were used (info level) and record failover attempts (warn level)
- Backwards compatible: `OLLAMA_URL` and `OLLAMA_MODEL` still supported as fallbacks
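The documented resolution order can be sketched as a pure function over the configured values (the function and its signature are assumptions for illustration, not the server's API):

```rust
// Build the ordered list of (url, model) pairs to attempt, following the
// fallback rules above: primary first; fallback only if configured, and
// reusing the primary model when no fallback model is set.
fn attempt_order(
    primary_url: &str,
    primary_model: &str,
    fallback_url: Option<&str>,
    fallback_model: Option<&str>,
) -> Vec<(String, String)> {
    let mut order = vec![(primary_url.to_string(), primary_model.to_string())];
    if let Some(url) = fallback_url {
        // If OLLAMA_FALLBACK_MODEL is not set, reuse the primary model.
        let model = fallback_model.unwrap_or(primary_model);
        order.push((url.to_string(), model.to_string()));
    }
    order
}

fn main() {
    let order = attempt_order(
        "http://desktop:11434",
        "nemotron-3-nano:30b",
        Some("http://server:11434"),
        None,
    );
    assert_eq!(order.len(), 2);
    // Fallback inherits the primary model when none is configured.
    assert_eq!(order[1].1, "nemotron-3-nano:30b");
}
```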
**Model Discovery:**
The `OllamaClient` provides methods to query available models:
- `OllamaClient::list_models(url)` - Returns list of all models on a server
- `OllamaClient::is_model_available(url, model_name)` - Checks if a specific model exists
This allows runtime verification of model availability before generating insights.
## Dependencies of Note
- **actix-web**: HTTP framework
- **diesel**: ORM for SQLite
- **jsonwebtoken**: JWT implementation
- **kamadak-exif**: EXIF parsing
- **image**: Thumbnail generation
- **walkdir**: Directory traversal
- **rayon**: Parallel processing
- **opentelemetry**: Distributed tracing
- **bcrypt**: Password hashing
- **infer**: Magic number file type detection

Cargo.lock (generated)
File diff suppressed because it is too large
@@ -1,57 +1,38 @@
[package]
name = "image-api"
version = "0.5.2"
version = "0.1.0"
authors = ["Cameron Cordes <cameronc.dev@gmail.com>"]
edition = "2024"
edition = "2018"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[profile.release]
lto = "thin"
lto = true
[dependencies]
actix = "0.13.1"
actix-web = "4"
actix-rt = "2.6"
tokio = { version = "1.42.0", features = ["default", "process", "sync", "macros", "rt-multi-thread"] }
actix-files = "0.6"
actix-cors = "0.7"
actix-multipart = "0.7.2"
actix-governor = "0.5"
actix = "0.10"
actix-web = "3"
actix-rt = "1"
actix-files = "0.5"
actix-multipart = "0.3.0"
futures = "0.3.5"
jsonwebtoken = "9.3.0"
jsonwebtoken = "7.2.0"
serde = "1"
serde_json = "1"
diesel = { version = "2.2.10", features = ["sqlite"] }
libsqlite3-sys = { version = "0.35", features = ["bundled"] }
diesel_migrations = "2.2.0"
diesel = { version = "1.4.8", features = ["sqlite"] }
hmac = "0.11"
sha2 = "0.9"
chrono = "0.4"
clap = { version = "4.5", features = ["derive"] }
dotenv = "0.15"
bcrypt = "0.17.1"
image = { version = "0.25.5", default-features = false, features = ["jpeg", "png", "rayon"] }
infer = "0.16"
walkdir = "2.4.0"
bcrypt = "0.9"
image = { version = "0.23", default-features = false, features = ["jpeg", "png", "jpeg_rayon"] }
walkdir = "2"
rayon = "1.5"
path-absolutize = "3.1"
log = "0.4"
env_logger = "0.11.5"
actix-web-prom = "0.9.0"
prometheus = "0.13"
lazy_static = "1.5"
notify = "4.0"
path-absolutize = "3.0"
log="0.4"
env_logger="0.8"
actix-web-prom = "0.5.1"
prometheus = "0.11"
lazy_static = "1.1"
anyhow = "1.0"
rand = "0.8.5"
opentelemetry = { version = "0.31.0", features = ["default", "metrics", "tracing"] }
opentelemetry_sdk = { version = "0.31.0", features = ["default", "rt-tokio-current-thread", "metrics"] }
opentelemetry-otlp = { version = "0.31.0", features = ["default", "metrics", "tracing", "grpc-tonic"] }
opentelemetry-stdout = "0.31.0"
opentelemetry-appender-log = "0.31.0"
tempfile = "3.20.0"
regex = "1.11.1"
exif = { package = "kamadak-exif", version = "0.6.1" }
reqwest = { version = "0.12", features = ["json"] }
urlencoding = "2.1"
zerocopy = "0.8"
ical = "0.11"
scraper = "0.20"
base64 = "0.22"

Jenkinsfile (vendored)
@@ -1,7 +1,7 @@
pipeline {
agent {
docker {
image 'rust:1.59'
image 'rust:1.51'
args '-v "$PWD":/usr/src/image-api'
}
}


@@ -2,68 +2,14 @@
This is an Actix-web server for serving images and videos from a filesystem.
Upon first run it will generate thumbnails for all images and videos at `BASE_PATH`.
## Features
- Automatic thumbnail generation for images and videos
- EXIF data extraction and storage for photos
- File watching with NFS support (polling-based)
- Video streaming with HLS
- Tag-based organization
- Memories API for browsing photos by date
- **Video Wall** - Auto-generated short preview clips for videos, served via a grid view
- **AI-Powered Photo Insights** - Generate contextual insights from photos using LLMs
- **RAG-based Context Retrieval** - Semantic search over daily conversation summaries
- **Automatic Daily Summaries** - LLM-generated summaries of daily conversations with embeddings
## Environment
A handful of environment variables are required for the API to run.
They should be defined in an `.env` file placed alongside the binary or in a directory above it.
You must have `ffmpeg` installed for streaming video and generating video thumbnails.
- `DATABASE_URL` is a path or url to a database (currently only SQLite is tested)
- `BASE_PATH` is the root from which you want to serve images and videos
- `THUMBNAILS` is a path where generated thumbnails should be stored
- `VIDEO_PATH` is a path where HLS playlists and video parts should be stored
- `GIFS_DIRECTORY` is a path where generated video GIF thumbnails should be stored
- `BIND_URL` is the url and port to bind to (typically your own IP address)
- `SECRET_KEY` is the *hopefully* random string to sign Tokens with
- `RUST_LOG` is one of `off, error, warn, info, debug, trace`, from least to most noisy [error is default]
- `EXCLUDED_DIRS` is a comma separated list of directories to exclude from the Memories API
- `PREVIEW_CLIPS_DIRECTORY` (optional) is a path where generated video preview clips should be stored [default: `preview_clips`]
- `WATCH_QUICK_INTERVAL_SECONDS` (optional) is the interval in seconds for quick file scans [default: 60]
- `WATCH_FULL_INTERVAL_SECONDS` (optional) is the interval in seconds for full file scans [default: 3600]
### AI Insights Configuration (Optional)
The following environment variables configure AI-powered photo insights and daily conversation summaries:
#### Ollama Configuration
- `OLLAMA_PRIMARY_URL` - Primary Ollama server URL [default: `http://localhost:11434`]
- Example: `http://desktop:11434` (your main/powerful server)
- `OLLAMA_FALLBACK_URL` - Fallback Ollama server URL (optional)
- Example: `http://server:11434` (always-on backup server)
- `OLLAMA_PRIMARY_MODEL` - Model to use on primary server [default: `nemotron-3-nano:30b`]
- Example: `nemotron-3-nano:30b`, `llama3.2:3b`, etc.
- `OLLAMA_FALLBACK_MODEL` - Model to use on fallback server (optional)
- If not set, uses `OLLAMA_PRIMARY_MODEL` on fallback server
**Legacy Variables** (still supported):
- `OLLAMA_URL` - Used if `OLLAMA_PRIMARY_URL` not set
- `OLLAMA_MODEL` - Used if `OLLAMA_PRIMARY_MODEL` not set
#### SMS API Configuration
- `SMS_API_URL` - URL to SMS message API [default: `http://localhost:8000`]
- Used to fetch conversation data for context in insights
- `SMS_API_TOKEN` - Authentication token for SMS API (optional)
#### Fallback Behavior
- Primary server is tried first with 5-second connection timeout
- On failure, automatically falls back to secondary server (if configured)
- Total request timeout is 120 seconds to accommodate LLM inference
- Logs indicate which server/model was used and any failover attempts
#### Daily Summary Generation
Daily conversation summaries are generated automatically on server startup. Configure in `src/main.rs`:
- Date range for summary generation
- Contacts to process
- Model version used for embeddings: `nomic-embed-text:v1.5`


@@ -1,2 +0,0 @@
DROP INDEX IF EXISTS idx_image_exif_file_path;
DROP TABLE IF EXISTS image_exif;


@@ -1,32 +0,0 @@
CREATE TABLE image_exif (
id INTEGER PRIMARY KEY NOT NULL,
file_path TEXT NOT NULL UNIQUE,
-- Camera Information
camera_make TEXT,
camera_model TEXT,
lens_model TEXT,
-- Image Properties
width INTEGER,
height INTEGER,
orientation INTEGER,
-- GPS Coordinates
gps_latitude REAL,
gps_longitude REAL,
gps_altitude REAL,
-- Capture Settings
focal_length REAL,
aperture REAL,
shutter_speed TEXT,
iso INTEGER,
date_taken BIGINT,
-- Housekeeping
created_time BIGINT NOT NULL,
last_modified BIGINT NOT NULL
);
CREATE INDEX idx_image_exif_file_path ON image_exif(file_path);


@@ -1,9 +0,0 @@
-- Rollback indexes
DROP INDEX IF EXISTS idx_favorites_userid;
DROP INDEX IF EXISTS idx_favorites_path;
DROP INDEX IF EXISTS idx_tags_name;
DROP INDEX IF EXISTS idx_tagged_photo_photo_name;
DROP INDEX IF EXISTS idx_tagged_photo_tag_id;
DROP INDEX IF EXISTS idx_image_exif_camera;
DROP INDEX IF EXISTS idx_image_exif_gps;


@@ -1,17 +0,0 @@
-- Add indexes for improved query performance
-- Favorites table indexes
CREATE INDEX IF NOT EXISTS idx_favorites_userid ON favorites(userid);
CREATE INDEX IF NOT EXISTS idx_favorites_path ON favorites(path);
-- Tags table indexes
CREATE INDEX IF NOT EXISTS idx_tags_name ON tags(name);
-- Tagged photos indexes
CREATE INDEX IF NOT EXISTS idx_tagged_photo_photo_name ON tagged_photo(photo_name);
CREATE INDEX IF NOT EXISTS idx_tagged_photo_tag_id ON tagged_photo(tag_id);
-- EXIF table indexes (date_taken already has index from previous migration)
-- Adding composite index for common EXIF queries
CREATE INDEX IF NOT EXISTS idx_image_exif_camera ON image_exif(camera_make, camera_model);
CREATE INDEX IF NOT EXISTS idx_image_exif_gps ON image_exif(gps_latitude, gps_longitude);


@@ -1,3 +0,0 @@
-- Rollback unique constraint on favorites
DROP INDEX IF EXISTS idx_favorites_unique;


@@ -1,12 +0,0 @@
-- Add unique constraint to prevent duplicate favorites per user
-- First, remove any existing duplicates (keep the oldest one)
DELETE FROM favorites
WHERE rowid NOT IN (
SELECT MIN(rowid)
FROM favorites
GROUP BY userid, path
);
-- Add unique index to enforce constraint
CREATE UNIQUE INDEX idx_favorites_unique ON favorites(userid, path);


@@ -1,2 +0,0 @@
-- Remove date_taken index
DROP INDEX IF EXISTS idx_image_exif_date_taken;


@@ -1,2 +0,0 @@
-- Add index on date_taken for efficient date range queries
CREATE INDEX IF NOT EXISTS idx_image_exif_date_taken ON image_exif(date_taken);


@@ -1,3 +0,0 @@
-- Rollback AI insights table
DROP INDEX IF EXISTS idx_photo_insights_path;
DROP TABLE IF EXISTS photo_insights;


@@ -1,11 +0,0 @@
-- AI-generated insights for individual photos
CREATE TABLE IF NOT EXISTS photo_insights (
id INTEGER PRIMARY KEY NOT NULL,
file_path TEXT NOT NULL UNIQUE, -- Full path to the photo
title TEXT NOT NULL, -- "At the beach with Sarah"
summary TEXT NOT NULL, -- 2-3 sentence description
generated_at BIGINT NOT NULL,
model_version TEXT NOT NULL
);
CREATE INDEX IF NOT EXISTS idx_photo_insights_path ON photo_insights(file_path);


@@ -1 +0,0 @@
DROP TABLE daily_conversation_summaries;


@@ -1,19 +0,0 @@
-- Daily conversation summaries for improved RAG quality
-- Each row = one day's conversation with a contact, summarized by LLM and embedded
CREATE TABLE daily_conversation_summaries (
id INTEGER PRIMARY KEY NOT NULL,
date TEXT NOT NULL, -- ISO date "2024-08-15"
contact TEXT NOT NULL, -- Contact name
summary TEXT NOT NULL, -- LLM-generated 3-5 sentence summary
message_count INTEGER NOT NULL, -- Number of messages in this day
embedding BLOB NOT NULL, -- 768-dim vector of the summary
created_at BIGINT NOT NULL, -- When this summary was generated
model_version TEXT NOT NULL, -- "nomic-embed-text:v1.5"
UNIQUE(date, contact)
);
-- Indexes for efficient querying
CREATE INDEX idx_daily_summaries_date ON daily_conversation_summaries(date);
CREATE INDEX idx_daily_summaries_contact ON daily_conversation_summaries(contact);
CREATE INDEX idx_daily_summaries_date_contact ON daily_conversation_summaries(date, contact);


@@ -1 +0,0 @@
DROP TABLE IF EXISTS calendar_events;


@@ -1,20 +0,0 @@
CREATE TABLE calendar_events (
id INTEGER PRIMARY KEY NOT NULL,
event_uid TEXT,
summary TEXT NOT NULL,
description TEXT,
location TEXT,
start_time BIGINT NOT NULL,
end_time BIGINT NOT NULL,
all_day BOOLEAN NOT NULL DEFAULT 0,
organizer TEXT,
attendees TEXT,
embedding BLOB,
created_at BIGINT NOT NULL,
source_file TEXT,
UNIQUE(event_uid, start_time)
);
CREATE INDEX idx_calendar_start_time ON calendar_events(start_time);
CREATE INDEX idx_calendar_end_time ON calendar_events(end_time);
CREATE INDEX idx_calendar_time_range ON calendar_events(start_time, end_time);


@@ -1 +0,0 @@
DROP TABLE IF EXISTS location_history;


@@ -1,19 +0,0 @@
CREATE TABLE location_history (
id INTEGER PRIMARY KEY NOT NULL,
timestamp BIGINT NOT NULL,
latitude REAL NOT NULL,
longitude REAL NOT NULL,
accuracy INTEGER,
activity TEXT,
activity_confidence INTEGER,
place_name TEXT,
place_category TEXT,
embedding BLOB,
created_at BIGINT NOT NULL,
source_file TEXT,
UNIQUE(timestamp, latitude, longitude)
);
CREATE INDEX idx_location_timestamp ON location_history(timestamp);
CREATE INDEX idx_location_coords ON location_history(latitude, longitude);
CREATE INDEX idx_location_activity ON location_history(activity);


@@ -1 +0,0 @@
DROP TABLE IF EXISTS search_history;


@@ -1,13 +0,0 @@
CREATE TABLE search_history (
id INTEGER PRIMARY KEY NOT NULL,
timestamp BIGINT NOT NULL,
query TEXT NOT NULL,
search_engine TEXT,
embedding BLOB NOT NULL,
created_at BIGINT NOT NULL,
source_file TEXT,
UNIQUE(timestamp, query)
);
CREATE INDEX idx_search_timestamp ON search_history(timestamp);
CREATE INDEX idx_search_query ON search_history(query);


@@ -1,4 +0,0 @@
-- Revert search performance optimization indexes
DROP INDEX IF EXISTS idx_image_exif_date_path;
DROP INDEX IF EXISTS idx_tagged_photo_count;


@@ -1,15 +0,0 @@
-- Add composite indexes for search performance optimization
-- This migration addresses N+1 query issues and enables database-level sorting
-- Covering index for date-sorted queries (supports ORDER BY + pagination)
-- Enables efficient date-based sorting without loading all files into memory
CREATE INDEX IF NOT EXISTS idx_image_exif_date_path
ON image_exif(date_taken DESC, file_path);
-- Optimize batch tag count queries with GROUP BY
-- Reduces N individual queries to a single batch query
CREATE INDEX IF NOT EXISTS idx_tagged_photo_count
ON tagged_photo(photo_name, tag_id);
-- Update query planner statistics to optimize query execution
ANALYZE;


@@ -1 +0,0 @@
DROP TABLE IF EXISTS video_preview_clips;


@@ -1,13 +0,0 @@
CREATE TABLE video_preview_clips (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
file_path TEXT NOT NULL UNIQUE,
status TEXT NOT NULL DEFAULT 'pending',
duration_seconds REAL,
file_size_bytes INTEGER,
error_message TEXT,
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);
CREATE INDEX idx_preview_clips_file_path ON video_preview_clips(file_path);
CREATE INDEX idx_preview_clips_status ON video_preview_clips(status);


@@ -1,36 +0,0 @@
# Specification Quality Checklist: VideoWall
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-02-25
**Feature**: [spec.md](../spec.md)
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- All items pass validation.
- Assumptions section documents reasonable defaults for format choice, column layout interpretation, and infrastructure reuse.
- No [NEEDS CLARIFICATION] markers were needed — the user description was specific enough to make informed decisions for all requirements.


@@ -1,91 +0,0 @@
# API Contracts: VideoWall
## GET /video/preview
Retrieve the preview clip MP4 file for a given video. If the preview is not yet generated, triggers on-demand generation and returns 202.
**Authentication**: Required (Bearer token)
**Query Parameters**:
| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| path | string | yes | Relative path of the source video from BASE_PATH |
**Responses**:
| Status | Content-Type | Body | Description |
|--------|-------------|------|-------------|
| 200 | video/mp4 | MP4 file stream | Preview clip is ready and served |
| 202 | application/json | `{"status": "processing", "path": "<path>"}` | Preview generation has been triggered; client should retry |
| 400 | application/json | `{"error": "Invalid path"}` | Path validation failed |
| 404 | application/json | `{"error": "Video not found"}` | Source video does not exist |
| 500 | application/json | `{"error": "Generation failed: <detail>"}` | Preview generation failed |
**Behavior**:
1. Validate path with `is_valid_full_path()`
2. Check if preview clip exists on disk and status is `complete` → serve MP4 (200)
3. If status is `pending` or no record exists → trigger generation, return 202
4. If status is `processing` → return 202
5. If status is `failed` → return 500 with error detail
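The behavior above reduces to a status-to-response mapping, which can be sketched as follows (illustrative types, not the server's actual handler):

```rust
// Possible states of a preview record; NoRecord covers case 3, where no
// row exists yet and generation is triggered before returning 202.
#[derive(Debug, Clone, Copy, PartialEq)]
enum PreviewStatus { NoRecord, Pending, Processing, Complete, Failed }

fn response_code(status: PreviewStatus) -> u16 {
    match status {
        // Clip exists on disk: serve the MP4.
        PreviewStatus::Complete => 200,
        // Generation triggered or in flight: client should retry.
        PreviewStatus::NoRecord | PreviewStatus::Pending | PreviewStatus::Processing => 202,
        // Generation failed: surface the stored error detail.
        PreviewStatus::Failed => 500,
    }
}

fn main() {
    assert_eq!(response_code(PreviewStatus::Complete), 200);
    assert_eq!(response_code(PreviewStatus::NoRecord), 202);
    assert_eq!(response_code(PreviewStatus::Pending), 202);
    assert_eq!(response_code(PreviewStatus::Failed), 500);
}
```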
---
## POST /video/preview/status
Check the preview generation status for a batch of video paths. Used by the mobile app to determine which previews are ready before requesting them.
**Authentication**: Required (Bearer token)
**Request Body** (application/json):
```json
{
"paths": [
"2024/vacation/beach.mov",
"2024/vacation/sunset.mp4",
"2024/birthday.avi"
]
}
```
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| paths | string[] | yes | Array of relative video paths from BASE_PATH |
**Response** (200, application/json):
```json
{
"previews": [
{
"path": "2024/vacation/beach.mov",
"status": "complete",
"preview_url": "/video/preview?path=2024/vacation/beach.mov"
},
{
"path": "2024/vacation/sunset.mp4",
"status": "processing",
"preview_url": null
},
{
"path": "2024/birthday.avi",
"status": "pending",
"preview_url": null
}
]
}
```
| Field | Type | Description |
|-------|------|-------------|
| previews | object[] | Status for each requested path |
| previews[].path | string | The requested video path |
| previews[].status | string | One of: `pending`, `processing`, `complete`, `failed`, `not_found` |
| previews[].preview_url | string? | Relative URL to fetch the preview (only when status is `complete`) |
**Behavior**:
1. Accept up to 200 paths per request
2. Batch query the `video_preview_clips` table for all paths
3. For paths not in the table, return status `not_found` (video may not exist or hasn't been scanned yet)
4. Return results in the same order as the input paths
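The ordering guarantee in step 4 is worth pinning down: the batch query returns rows in arbitrary order, so the handler has to index them by path and re-walk the input list, filling `not_found` for unknown paths. A sketch with illustrative names (the real handler works with Diesel rows, not this `PreviewRow` stand-in):

```rust
use std::collections::HashMap;

// One row from a hypothetical batch query of `video_preview_clips`.
struct PreviewRow {
    path: String,
    status: String,
}

// Re-assembles results in input order, defaulting to "not_found" and
// attaching a preview URL only when the clip is complete.
fn order_results(paths: &[&str], rows: Vec<PreviewRow>) -> Vec<(String, String, Option<String>)> {
    let by_path: HashMap<String, String> =
        rows.into_iter().map(|r| (r.path, r.status)).collect();
    paths
        .iter()
        .map(|p| {
            let status = by_path.get(*p).cloned().unwrap_or_else(|| "not_found".into());
            let url = (status == "complete").then(|| format!("/video/preview?path={}", p));
            (p.to_string(), status, url)
        })
        .collect()
}
```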

@@ -1,62 +0,0 @@
# Data Model: VideoWall
## Entities
### VideoPreviewClip
Tracks the generation status and metadata of preview clips derived from source videos.
**Table**: `video_preview_clips`
| Field | Type | Constraints | Description |
|-------|------|-------------|-------------|
| id | INTEGER | PRIMARY KEY, AUTOINCREMENT | Unique identifier |
| file_path | TEXT | NOT NULL, UNIQUE | Relative path of the source video from BASE_PATH |
| status | TEXT | NOT NULL, DEFAULT 'pending' | Generation status: `pending`, `processing`, `complete`, `failed` |
| duration_seconds | REAL | NULLABLE | Duration of the generated preview clip (≤10s) |
| file_size_bytes | INTEGER | NULLABLE | Size of the generated MP4 file |
| error_message | TEXT | NULLABLE | Error details if status is `failed` |
| created_at | TEXT | NOT NULL | ISO 8601 timestamp when record was created |
| updated_at | TEXT | NOT NULL | ISO 8601 timestamp when record was last updated |
**Indexes**:
- `idx_preview_clips_file_path` on `file_path` (unique, used for lookups and batch queries)
- `idx_preview_clips_status` on `status` (used by file watcher to find pending/failed clips)
### Relationships
- **VideoPreviewClip → Source Video**: One-to-one via `file_path`. The preview clip file on disk is located at `{PREVIEW_CLIPS_DIRECTORY}/{file_path}` with the source extension replaced by `.mp4` (e.g. `2024/vacation/beach.mov` → `2024/vacation/beach.mp4`), as in the storage layout below.
- **VideoPreviewClip → image_exif**: Implicit relationship via shared `file_path`. No foreign key needed — the EXIF table may not have an entry for every video.
## State Transitions
```
[new video detected] → pending
pending → processing (when generation starts)
processing → complete (when ffmpeg succeeds)
processing → failed (when ffmpeg fails or times out)
failed → pending (on retry / re-scan)
```
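The diagram can be encoded as a guard over `(from, to)` pairs; a sketch, assuming string statuses as stored in the table (the real enforcement would live in the DAO's update methods):

```rust
// Legal transitions from the diagram above; anything else is rejected.
fn is_valid_transition(from: &str, to: &str) -> bool {
    matches!(
        (from, to),
        ("pending", "processing")
            | ("processing", "complete")
            | ("processing", "failed")
            | ("failed", "pending") // retry / re-scan
    )
}
```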
## Validation Rules
- `file_path` must be a valid relative path within BASE_PATH
- `status` must be one of: `pending`, `processing`, `complete`, `failed`
- `duration_seconds` must be > 0 and ≤ 10.0 when status is `complete`
- `file_size_bytes` must be > 0 when status is `complete`
- `error_message` should only be non-null when status is `failed`
## Storage Layout (Filesystem)
```
{PREVIEW_CLIPS_DIRECTORY}/
├── 2024/
│ ├── vacation/
│ │ ├── beach.mp4 # Preview for BASE_PATH/2024/vacation/beach.mov
│ │ └── sunset.mp4 # Preview for BASE_PATH/2024/vacation/sunset.mp4
│ └── birthday.mp4 # Preview for BASE_PATH/2024/birthday.avi
└── 2025/
└── trip.mp4 # Preview for BASE_PATH/2025/trip.mkv
```
All preview clips use `.mp4` extension regardless of source format.
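The mapping implied by this layout is deterministic: mirror the directory structure and swap the source extension for `.mp4`. A sketch following the extension-replacement convention shown in the tree above (the `preview_path` helper is illustrative, not existing code):

```rust
use std::path::{Path, PathBuf};

// Maps a source video path (relative to BASE_PATH) to its preview clip
// location under the preview directory, replacing the extension with `.mp4`.
fn preview_path(preview_dir: &Path, relative: &str) -> PathBuf {
    preview_dir.join(Path::new(relative).with_extension("mp4"))
}
```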

@@ -1,79 +0,0 @@
# Implementation Plan: VideoWall
**Branch**: `001-video-wall` | **Date**: 2026-02-25 | **Spec**: [spec.md](./spec.md)
**Input**: Feature specification from `/specs/001-video-wall/spec.md`
## Summary
Add a VideoWall feature spanning the Rust API backend and React Native mobile app. The backend generates 480p MP4 preview clips (up to 10 seconds, composed of 10 equally spaced 1-second segments) using ffmpeg, extending the existing `OverviewVideo` pattern in `src/video/ffmpeg.rs`. The mobile app adds a VideoWall view using `expo-video` and FlatList to display a responsive 2-3 column grid of simultaneously looping, muted preview clips with audio-on-long-press. Preview clips are cached on disk, served via new API endpoints, and generated proactively by the file watcher.
## Technical Context
**Language/Version**: Rust (stable, Cargo) for backend API; TypeScript / React Native (Expo SDK 52) for mobile app
**Primary Dependencies**: actix-web 4, Diesel 2.2 (SQLite), ffmpeg/ffprobe (CLI), expo-video 3.0, expo-router 6.0, react-native-reanimated 4.1
**Storage**: SQLite (preview clip status tracking), filesystem (MP4 preview clips in `PREVIEW_CLIPS_DIRECTORY`)
**Testing**: `cargo test` for backend; manual testing for mobile app
**Target Platform**: Linux server (API), iOS/Android (mobile app via Expo)
**Project Type**: Mobile app + REST API (two separate repositories)
**Performance Goals**: <3s VideoWall load for 50 pre-generated previews; <30s per clip generation; <5MB per clip; smooth simultaneous playback of 6-12 clips
**Constraints**: Semaphore-limited concurrent ffmpeg processes (existing pattern); 480p resolution to keep bandwidth/CPU manageable; audio track preserved but muted by default
**Scale/Scope**: Hundreds to low thousands of videos per library; single user at a time
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
Constitution is an unfilled template — no project-specific gates defined. **PASS** (no violations possible).
Post-Phase 1 re-check: Still PASS — no gates to evaluate.
## Project Structure
### Documentation (this feature)
```text
specs/001-video-wall/
├── plan.md # This file
├── research.md # Phase 0 output
├── data-model.md # Phase 1 output
├── quickstart.md # Phase 1 output
├── contracts/ # Phase 1 output
│ └── api-endpoints.md
└── tasks.md # Phase 2 output (/speckit.tasks command)
```
### Source Code (repository root)
```text
# Backend (ImageApi - Rust)
src/
├── video/
│ ├── ffmpeg.rs # Add generate_preview_clip() using existing pattern
│ ├── actors.rs # Add PreviewClipGenerator actor (semaphore-limited)
│ └── mod.rs # Add generate_preview_clips() batch function
├── main.rs # Add GET /video/preview, POST /video/preview/status endpoints
│ # Extend file watcher to trigger preview generation
├── database/
│ ├── schema.rs # Add video_preview_clips table
│   ├── models.rs         # Add VideoPreviewClip model
│ └── preview_dao.rs # New DAO for preview clip status tracking
└── data/
└── mod.rs # Add PreviewClipRequest, PreviewStatusRequest types
# Frontend (SynologyFileViewer - React Native)
app/(app)/grid/
├── video-wall.tsx # New VideoWall view (FlatList grid)
└── _layout.tsx # Add video-wall route to stack
components/
└── VideoWallItem.tsx # Single preview clip cell (expo-video player)
hooks/
└── useVideoWall.ts # Preview clip fetching, status polling, audio state
```
**Structure Decision**: Mobile + API pattern. Backend changes extend existing `src/video/` module and `src/main.rs` handlers following established conventions. Frontend adds a new route under the existing grid stack navigator with a dedicated component and hook.
## Complexity Tracking
No constitution violations to justify.

@@ -1,115 +0,0 @@
# Quickstart: VideoWall
## Prerequisites
- Rust toolchain (stable) with `cargo`
- `diesel_cli` installed (`cargo install diesel_cli --no-default-features --features sqlite`)
- ffmpeg and ffprobe available on PATH
- Node.js 18+ and Expo CLI for mobile app
- `.env` file configured with existing variables plus `PREVIEW_CLIPS_DIRECTORY`
## New Environment Variable
Add to `.env`:
```bash
PREVIEW_CLIPS_DIRECTORY=/path/to/preview-clips # Directory for generated preview MP4s
```
## Backend Development
### 1. Create database migration
```bash
cd C:\Users\ccord\RustroverProjects\ImageApi
diesel migration generate create_video_preview_clips
```
Edit the generated `up.sql`:
```sql
CREATE TABLE video_preview_clips (
id INTEGER PRIMARY KEY AUTOINCREMENT NOT NULL,
file_path TEXT NOT NULL UNIQUE,
status TEXT NOT NULL DEFAULT 'pending',
duration_seconds REAL,
file_size_bytes INTEGER,
error_message TEXT,
created_at TEXT NOT NULL,
updated_at TEXT NOT NULL
);
CREATE UNIQUE INDEX idx_preview_clips_file_path ON video_preview_clips(file_path);
CREATE INDEX idx_preview_clips_status ON video_preview_clips(status);
```
Edit `down.sql`:
```sql
DROP TABLE IF EXISTS video_preview_clips;
```
Regenerate schema:
```bash
diesel migration run
diesel print-schema > src/database/schema.rs
```
### 2. Build and test backend
```bash
cargo build
cargo test
cargo run
```
Test preview endpoint:
```bash
# Check preview status
curl -X POST http://localhost:8080/video/preview/status \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"paths": ["some/video.mp4"]}'
# Request preview clip
curl http://localhost:8080/video/preview?path=some/video.mp4 \
-H "Authorization: Bearer <token>" \
-o preview.mp4
```
### 3. Verify preview clip generation
Check that preview clips appear in `PREVIEW_CLIPS_DIRECTORY` with the expected directory structure mirroring `BASE_PATH`.
## Frontend Development
### 1. Start the mobile app
```bash
cd C:\Users\ccord\development\SynologyFileViewer
npx expo start
```
### 2. Navigate to VideoWall
From the grid view of any folder containing videos, switch to VideoWall mode. The view should display a 2-3 column grid of looping preview clips.
## Key Files to Modify
### Backend (ImageApi)
| File | Change |
|------|--------|
| `src/video/ffmpeg.rs` | Add `generate_preview_clip()` function |
| `src/video/actors.rs` | Add `PreviewClipGenerator` actor |
| `src/video/mod.rs` | Add `generate_preview_clips()` batch function |
| `src/main.rs` | Add endpoints, extend file watcher |
| `src/database/schema.rs` | Regenerated by Diesel |
| `src/database/models.rs` | Add `VideoPreviewClip` struct |
| `src/database/preview_dao.rs` | New DAO file |
| `src/data/mod.rs` | Add request/response types |
| `src/state.rs` | Add PreviewClipGenerator to AppState |
### Frontend (SynologyFileViewer)
| File | Change |
|------|--------|
| `app/(app)/grid/video-wall.tsx` | New VideoWall view |
| `app/(app)/grid/_layout.tsx` | Add route |
| `components/VideoWallItem.tsx` | New preview clip cell component |
| `hooks/useVideoWall.ts` | New hook for preview state management |

@@ -1,91 +0,0 @@
# Research: VideoWall
## R1: FFmpeg Preview Clip Generation Strategy
**Decision**: Use ffmpeg's `select` filter with segment-based extraction, extending the existing `OverviewVideo` pattern in `src/video/ffmpeg.rs`.
**Rationale**: The codebase already has a nearly identical pattern at `src/video/ffmpeg.rs` using `select='lt(mod(t,{interval}),1)'` which selects 1-second frames at evenly spaced intervals across the video duration. The existing pattern outputs GIF; we adapt it to output MP4 at 480p with audio.
**Approach**:
1. Use `ffprobe` to get video duration (existing `get_video_duration()` pattern)
2. Calculate interval: `duration / 10` (or fewer segments for short videos)
3. Use ffmpeg with:
- Video filter: `select='lt(mod(t,{interval}),1)',setpts=N/FRAME_RATE/TB,scale=-2:480`
- Audio filter: `aselect='lt(mod(t,{interval}),1)',asetpts=N/SR/TB`
- Output: MP4 with H.264 video + AAC audio
- CRF 28 (lower quality acceptable for previews, reduces file size)
- Preset: `veryfast` (matches existing HLS transcoding pattern)
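The filter assembly described above can be sketched in a few lines (illustrative names; the interval arithmetic also covers the shorter-than-10s case from FR-009 by dropping the segment count):

```rust
// Builds the ffmpeg filter strings for a preview clip. `duration` is the
// source duration from ffprobe. Videos shorter than 10s get one segment
// per full second (FR-009); a minimum of one segment avoids a zero interval.
fn preview_filters(duration: f64) -> (f64, String, String) {
    let segments = (duration.floor() as u32).clamp(1, 10);
    let interval = duration / segments as f64;
    let vf = format!(
        "select='lt(mod(t,{i}),1)',setpts=N/FRAME_RATE/TB,scale=-2:480",
        i = interval
    );
    let af = format!("aselect='lt(mod(t,{i}),1)',asetpts=N/SR/TB", i = interval);
    (interval, vf, af)
}
```

For a 60-second source this yields an interval of 6, i.e. one kept second out of every six, ten segments in total.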
**Alternatives considered**:
- Generating separate segment files and concatenating: More complex, no benefit over select filter
- Using GIF output: Rejected per clarification — MP4 is 5-10x smaller with better quality
- Stream copy (no transcode): Not possible since we're extracting non-contiguous segments
## R2: Preview Clip Storage and Caching
**Decision**: Store preview clips on filesystem in a dedicated `PREVIEW_CLIPS_DIRECTORY` mirroring the source directory structure (same pattern as `THUMBNAILS` and `GIFS_DIRECTORY`).
**Rationale**: The project already uses this directory-mirroring pattern for thumbnails and GIF previews. It's simple, requires no database for file lookup (path is deterministic), and integrates naturally with the existing file watcher cleanup logic.
**Storage path formula**: `{PREVIEW_CLIPS_DIRECTORY}/{relative_path_from_BASE_PATH}` with the source extension replaced by `.mp4`
- Example: Video at `BASE_PATH/2024/vacation.mov` → Preview at `PREVIEW_CLIPS_DIRECTORY/2024/vacation.mp4`
**Alternatives considered**:
- Database BLOBs: Too large, not suited for binary video files
- Content-addressed storage (hash-based): Unnecessary complexity for single-user system
- Flat directory with UUID names: Loses the intuitive mapping that thumbnails/GIFs use
## R3: Preview Generation Status Tracking
**Decision**: Track generation status in SQLite via a new `video_preview_clips` table with Diesel ORM, following the existing DAO pattern.
**Rationale**: The batch status endpoint (FR-004) needs to efficiently check which previews are ready for a list of video paths. A database table is the right tool — it supports batch queries (existing `get_exif_batch()` pattern), survives restarts, and tracks failure states. The file watcher already uses batch DB queries to detect unprocessed files.
**Status values**: `pending`, `processing`, `complete`, `failed`
**Alternatives considered**:
- Filesystem-only (check if .mp4 exists): Cannot track `processing` or `failed` states; race conditions on concurrent requests
- In-memory HashMap: Lost on restart; doesn't support batch queries efficiently across actor boundaries
## R4: Concurrent Generation Limits
**Decision**: Use `Arc<Semaphore>` with a limit of 2 concurrent ffmpeg preview generation processes, matching the existing `PlaylistGenerator` pattern.
**Rationale**: The `PlaylistGenerator` actor in `src/video/actors.rs` already uses this exact pattern to limit concurrent ffmpeg processes. Preview generation is CPU-intensive (transcoding), so limiting concurrency prevents server overload. The semaphore pattern is proven in this codebase.
**Alternatives considered**:
- Unbounded concurrency: Would overwhelm the server with many simultaneous ffmpeg processes
- Queue with single worker: Too slow for batch generation; 2 concurrent is a good balance
- Sharing the existing PlaylistGenerator semaphore: Would cause HLS generation and preview generation to compete for the same slots; better to keep them independent
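The decision itself uses `Arc<tokio::sync::Semaphore>`; the counting-permit idea it relies on can be sketched stdlib-only (this `PermitCounter` is an illustration of the limit, not the actual actor code):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// A minimal non-blocking counting semaphore: at most `limit` ffmpeg
// processes hold a permit at once; further callers must wait and retry.
struct PermitCounter {
    available: AtomicUsize,
}

impl PermitCounter {
    fn new(limit: usize) -> Self {
        Self { available: AtomicUsize::new(limit) }
    }

    // Returns true if a permit was acquired; the caller must call
    // release() once its ffmpeg process exits.
    fn try_acquire(&self) -> bool {
        self.available
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |n| n.checked_sub(1))
            .is_ok()
    }

    fn release(&self) {
        self.available.fetch_add(1, Ordering::AcqRel);
    }
}
```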
## R5: Mobile App Video Playback Strategy
**Decision**: Use `expo-video` `VideoView` components inside FlatList items, with muted autoplay and viewport-based pause/resume.
**Rationale**: The app already uses `expo-video` (v3.0.15) for the single video player in `viewer/video.tsx`. The library supports multiple simultaneous players, `loop` mode, and programmatic mute/unmute. FlatList's `onViewableItemsChanged` callback (configured via `viewabilityConfig`) can be used to pause/resume players based on viewport visibility.
**Key configuration per cell**:
- `player.loop = true`
- `player.muted = true` (default)
- `player.play()` when visible, `player.pause()` when offscreen
- `nativeControls={false}` (no controls needed in grid)
**Audio-on-focus**: On long-press, unmute the pressed player and mute all others. Track the "focused" player ID in hook state.
**Alternatives considered**:
- HLS streaming for previews: Overkill for <10s clips; direct MP4 download is simpler and faster
- Animated GIF display via Image component: Rejected per clarification — MP4 with expo-video is better
- WebView-based player: Poor performance, no native gesture integration
## R6: API Endpoint Design
**Decision**: Two new endpoints — one to serve preview clips, one for batch status checking.
**Rationale**:
- `GET /video/preview?path=...` serves the MP4 file directly (or triggers on-demand generation and returns 202 Accepted). Follows the pattern of `GET /image?path=...` for serving files.
- `POST /video/preview/status` accepts a JSON body with an array of video paths and returns their preview generation status. This allows the mobile app to efficiently determine which previews are ready in a single request (batch pattern from `get_exif_batch()`).
**Alternatives considered**:
- Single endpoint that blocks until generation completes: Bad UX — generation takes up to 30s
- WebSocket for real-time status: Overkill for this use case; polling with batch status is simpler
- Including preview URL in the existing `/photos` response: Would couple the photo listing endpoint to preview generation; better to keep separate

@@ -1,136 +0,0 @@
# Feature Specification: VideoWall
**Feature Branch**: `001-video-wall`
**Created**: 2026-02-25
**Status**: Draft
**Input**: User description: "I would like to implement a new View 'VideoWall' in the React native mobile app, with supporting API/tasks to generate at most 10 second long GIF/Videos that are 10 equally spaced 1 second clips of the original video. This view will display a grid 2/3 columns wide of all these clips playing simultaneously. It should let the user view all videos in the current folder/search results."
## Clarifications
### Session 2026-02-25
- Q: What format should preview clips be generated in (GIF vs video)? → A: MP4 video clips (small files, hardware-accelerated playback, best quality-to-size ratio).
- Q: What resolution should preview clips be generated at? → A: 480p scaled down (sharp in grid cells, small files, smooth simultaneous playback).
- Q: How should audio be handled in preview clips? → A: Audio on focus — muted by default, audio plays when user long-presses on a clip. Audio track is preserved during generation.
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Browse Videos as a Visual Wall (Priority: P1)
A user navigates to a folder containing videos and switches to the VideoWall view. The screen fills with a grid of video previews — short looping clips that give a visual summary of each video. All previews play simultaneously, creating an immersive "wall of motion" that lets the user quickly scan and identify videos of interest without opening each one individually.
**Why this priority**: This is the core experience. Without the visual grid of simultaneously playing previews, the feature has no value. This story delivers the primary browsing capability.
**Independent Test**: Can be fully tested by navigating to any folder with videos, switching to VideoWall view, and confirming that preview clips display in a grid and play simultaneously. Delivers immediate visual browsing value.
**Acceptance Scenarios**:
1. **Given** a user is viewing a folder containing 6 videos, **When** they switch to VideoWall view, **Then** they see a grid of 6 video previews arranged in 2-3 columns, all playing simultaneously in a loop.
2. **Given** a user is viewing a folder containing 20 videos, **When** they switch to VideoWall view, **Then** the grid is scrollable and loads previews progressively as they scroll.
3. **Given** a user is in VideoWall view, **When** they tap on a video preview, **Then** they navigate to the full video player for that video.
4. **Given** a user is in VideoWall view with all clips muted, **When** they long-press on a preview clip, **Then** that clip's audio unmutes and all other clips remain muted.
---
### User Story 2 - Server Generates Preview Clips (Priority: P1)
When preview clips are requested for a video that has not yet been processed, the server generates a short preview clip. The preview is composed of 10 equally spaced 1-second segments extracted from the original video, concatenated into a single clip of at most 10 seconds. Once generated, the preview is cached so subsequent requests are served instantly.
**Why this priority**: The VideoWall view depends entirely on having preview clips available. Without server-side generation, there is nothing to display. This is co-priority with Story 1 as they are interdependent.
**Independent Test**: Can be tested by requesting a preview clip for any video via the API and confirming the response is a playable clip of at most 10 seconds composed of segments from different parts of the original video.
**Acceptance Scenarios**:
1. **Given** a video exists that has no preview clip yet, **When** a preview is requested, **Then** the system generates a clip of at most 10 seconds composed of 10 equally spaced 1-second segments from the original video.
2. **Given** a video is shorter than 10 seconds, **When** a preview is requested, **Then** the system generates a preview using fewer segments (as many 1-second clips as the video duration allows), resulting in a shorter preview.
3. **Given** a preview clip was previously generated for a video, **When** it is requested again, **Then** the cached version is served without re-processing.
4. **Given** a video file no longer exists, **When** a preview is requested, **Then** the system returns an appropriate error indicating the source video is missing.
---
### User Story 3 - VideoWall from Search Results (Priority: P2)
A user performs a search or applies filters (tags, date range, camera, location) and the results include videos. They switch to VideoWall view to see preview clips of all matching videos displayed in the same grid layout, allowing visual browsing of search results.
**Why this priority**: Extends the core VideoWall browsing to work with filtered/search result sets. Important for discoverability but depends on Story 1 and 2 being functional first.
**Independent Test**: Can be tested by performing a search that returns videos, switching to VideoWall view, and confirming that only matching videos appear as previews in the grid.
**Acceptance Scenarios**:
1. **Given** a user has search results containing 8 videos and 12 photos, **When** they switch to VideoWall view, **Then** only the 8 video previews are displayed in the grid.
2. **Given** a user applies a tag filter that matches 3 videos, **When** they view the VideoWall, **Then** exactly 3 video previews are shown.
---
### User Story 4 - Background Preview Generation (Priority: P3)
Preview clips are generated proactively in the background for videos discovered during file watching, so that when a user opens VideoWall, most previews are already available and the experience feels instant.
**Why this priority**: Enhances performance and perceived responsiveness. The feature works without this (on-demand generation), but background processing greatly improves the user experience for large libraries.
**Independent Test**: Can be tested by adding new video files to a monitored folder and confirming that preview clips are generated automatically within the next scan cycle, before any user requests them.
**Acceptance Scenarios**:
1. **Given** a new video is added to the media library, **When** the file watcher detects it, **Then** a preview clip is generated in the background without user intervention.
2. **Given** the system is generating previews in the background, **When** a user opens VideoWall, **Then** already-generated previews display immediately while pending ones show a placeholder.
---
### Edge Cases
- What happens when a video is corrupted or cannot be processed? The system shows a placeholder/error state for that video and does not block other previews from loading.
- What happens when the user scrolls quickly through a large library? Previews outside the visible viewport should pause or not load to conserve resources, and resume when scrolled back into view.
- What happens when a video is extremely long (e.g., 4+ hours)? The same algorithm applies — 10 equally spaced 1-second clips — ensuring the preview still represents the full video.
- What happens when a video is exactly 10 seconds long? Each 1-second segment starts at second 0, 1, 2, ... 9, effectively previewing the entire video.
- What happens when storage for preview clips runs low? Preview clips should be reasonably compressed and sized to minimize storage impact.
- What happens when many previews are requested simultaneously (e.g., opening a folder with 100 videos)? The system should queue generation and serve already-cached previews immediately while others are processed.
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: System MUST generate preview clips for videos as MP4 files scaled to 480p resolution, where each preview is composed of up to 10 equally spaced 1-second segments from the original video, resulting in a clip of at most 10 seconds.
- **FR-002**: System MUST cache generated preview clips so they are only generated once per source video.
- **FR-003**: System MUST provide an endpoint to retrieve a preview clip for a given video path.
- **FR-004**: System MUST provide an endpoint to retrieve preview availability status for a batch of video paths so the client knows which previews are ready.
- **FR-005**: The mobile app MUST display a VideoWall view showing video previews in a grid of 2 columns on smaller screens and 3 columns on larger screens.
- **FR-006**: All visible preview clips in the VideoWall MUST play simultaneously, muted, and loop continuously.
- **FR-006a**: When a user long-presses on a preview clip, the app MUST unmute that clip's audio. Only one clip may have audio at a time.
- **FR-006b**: Preview clips MUST retain their audio track during generation (not stripped) to support audio-on-focus playback.
- **FR-007**: The VideoWall MUST support browsing videos from both folder navigation and search/filter results.
- **FR-008**: Tapping a preview clip in the VideoWall MUST navigate the user to the full video.
- **FR-009**: For videos shorter than 10 seconds, the system MUST generate a preview using as many full 1-second segments as the video duration allows.
- **FR-010**: The system MUST display a placeholder for videos whose preview clips are not yet generated.
- **FR-011**: The system MUST handle unprocessable videos gracefully by showing an error state rather than failing the entire wall.
- **FR-012**: The VideoWall MUST support scrolling through large numbers of videos, loading previews progressively.
- **FR-013**: Preview clips outside the visible viewport SHOULD pause playback to conserve device resources.
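As an illustration only (not part of the spec's requirements), the segment spacing behind FR-001 and FR-009 reduces to simple arithmetic, assuming one segment per full second for short videos:

```rust
// Start times (seconds) of the 1-second segments for a given source
// duration: up to 10 segments spaced evenly across the full video,
// with a minimum of one segment.
fn segment_starts(duration: f64) -> Vec<f64> {
    let n = (duration.floor() as usize).clamp(1, 10);
    let interval = duration / n as f64;
    (0..n).map(|k| k as f64 * interval).collect()
}
```

For a source of exactly 10 seconds this produces starts at seconds 0 through 9, matching the edge-case note above.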
### Key Entities
- **Video Preview Clip**: A short looping MP4 video (at most 10 seconds) scaled to 480p resolution, derived from a source video. Composed of up to 10 equally spaced 1-second segments. Associated with exactly one source video by file path. Has a generation status (pending, processing, complete, failed).
- **VideoWall View**: A scrollable grid layout displaying video preview clips. Adapts column count based on screen size (2 or 3 columns). Operates on a set of videos from a folder or search result context.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: Users can visually browse all videos in a folder within 3 seconds of opening VideoWall (for folders with up to 50 videos with pre-generated previews).
- **SC-002**: Preview clips accurately represent the source video by sampling from evenly distributed points across the full duration.
- **SC-003**: All visible previews play simultaneously without noticeable stuttering on standard mobile devices.
- **SC-004**: Generated preview clips are each under 5 MB in size to keep storage and bandwidth manageable.
- **SC-005**: The VideoWall view correctly filters to show only videos (not photos) from the current folder or search results.
- **SC-006**: Users can identify and select a video of interest from the VideoWall and navigate to it in a single tap.
- **SC-007**: Preview generation for a single video completes within 30 seconds on typical hardware.
## Assumptions
- The existing file watcher and thumbnail generation infrastructure will be extended to also trigger preview clip generation.
- Preview clips will be stored alongside existing thumbnails/GIFs in a designated directory on the server.
- The React Native mobile app already has folder navigation and search/filter capabilities that provide the video list context for VideoWall.
- The server already has ffmpeg available for video processing (used for existing HLS and GIF generation).
- Authentication and authorization follow the existing JWT-based pattern; no new auth requirements.
- "2/3 columns" means a responsive layout: 2 columns on phones (portrait), 3 columns on tablets or landscape orientation.
- Preview clips are generated as MP4 video files for optimal quality-to-size ratio and hardware-accelerated mobile playback.

@@ -1,234 +0,0 @@
# Tasks: VideoWall
**Input**: Design documents from `/specs/001-video-wall/`
**Prerequisites**: plan.md (required), spec.md (required), research.md, data-model.md, contracts/
**Tests**: Not explicitly requested — test tasks omitted.
**Organization**: Tasks grouped by user story. US2 (server generation) comes before US1 (mobile view) because the mobile app depends on the API endpoints existing.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions
## Path Conventions
- **Backend (ImageApi)**: `src/` at `C:\Users\ccord\RustroverProjects\ImageApi`
- **Frontend (SynologyFileViewer)**: `app/`, `components/`, `hooks/` at `C:\Users\ccord\development\SynologyFileViewer`
---
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Database migration, new environment variable, shared types
- [x] T001 Create Diesel migration for `video_preview_clips` table: run `diesel migration generate create_video_preview_clips`, write `up.sql` with table definition (id, file_path UNIQUE, status DEFAULT 'pending', duration_seconds, file_size_bytes, error_message, created_at, updated_at) and indexes (idx_preview_clips_file_path, idx_preview_clips_status), write `down.sql` with DROP TABLE. See `data-model.md` for full schema.
- [x] T002 Run migration and regenerate schema: execute `diesel migration run` then `diesel print-schema > src/database/schema.rs` to add the `video_preview_clips` table to `src/database/schema.rs`
- [x] T003 Add `PREVIEW_CLIPS_DIRECTORY` environment variable: read it in `src/main.rs` startup (alongside existing `GIFS_DIRECTORY`), create the directory if it doesn't exist, and add it to `AppState` or pass it where needed. Follow the pattern used for `GIFS_DIRECTORY` and `THUMBNAILS`.
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Diesel model, DAO, and request/response types that all user stories depend on
**CRITICAL**: No user story work can begin until this phase is complete
- [x] T004 [P] Add `VideoPreviewClip` Diesel model struct in `src/database/models.rs` with fields matching the `video_preview_clips` schema table (Queryable, Insertable derives). Add a `NewVideoPreviewClip` struct for inserts.
- [x] T005 [P] Add `PreviewClipRequest` and `PreviewStatusRequest`/`PreviewStatusResponse` types in `src/data/mod.rs`. `PreviewClipRequest` has `path: String`. `PreviewStatusRequest` has `paths: Vec<String>`. `PreviewStatusResponse` has `previews: Vec<PreviewStatusItem>` where each item has `path`, `status`, `preview_url: Option<String>`. All with Serialize/Deserialize derives.
- [x] T006 Create `PreviewDao` trait and `SqlitePreviewDao` implementation in `src/database/preview_dao.rs`. Methods: `insert_preview(file_path, status) -> Result`, `update_status(file_path, status, duration_seconds?, file_size_bytes?, error_message?) -> Result`, `get_preview(file_path) -> Result<Option<VideoPreviewClip>>`, `get_previews_batch(file_paths: &[String]) -> Result<Vec<VideoPreviewClip>>`, `get_by_status(status) -> Result<Vec<VideoPreviewClip>>`. Follow the `ExifDao`/`SqliteExifDao` pattern with `Arc<Mutex<SqliteConnection>>` and OpenTelemetry tracing spans.
- [x] T007 Register `preview_dao` module in `src/database/mod.rs` and add `PreviewDao` to the database module exports. Wire `SqlitePreviewDao` into `AppState` in `src/state.rs` following the existing DAO pattern (e.g., how `ExifDao` is added).
**Checkpoint**: Foundation ready — DAO, models, and types available for all stories

---
## Phase 3: User Story 2 - Server Generates Preview Clips (Priority: P1) MVP
**Goal**: Backend can generate 480p MP4 preview clips (10 equally spaced 1-second segments) and serve them via API endpoints with on-demand generation and batch status checking.
**Independent Test**: Request `GET /video/preview?path=<video>` for any video — should return an MP4 file of at most 10 seconds. Request `POST /video/preview/status` with video paths — should return status for each.
### Implementation for User Story 2
- [x] T008 [P] [US2] Add `generate_preview_clip()` function in `src/video/ffmpeg.rs`. Takes input video path, output MP4 path, and video duration. Uses ffprobe to get duration (existing pattern). Calculates interval = `duration / 10` (or fewer for short videos per FR-009). Builds ffmpeg command with: video filter `select='lt(mod(t,{interval}),1)',setpts=N/FRAME_RATE/TB,scale=-2:480`, audio filter `aselect='lt(mod(t,{interval}),1)',asetpts=N/SR/TB`, codec H.264 CRF 28 preset veryfast, AAC audio. Output path uses `.mp4` extension. Creates parent directories for output. Returns `Result<(f64, u64)>` with (duration_seconds, file_size_bytes). See `research.md` R1 for full ffmpeg strategy.
- [x] T009 [P] [US2] Create `PreviewClipGenerator` actor in `src/video/actors.rs`. Struct holds `Arc<Semaphore>` (limit 2 concurrent), preview clips directory path, base path, and `Arc<dyn PreviewDao>`. Handles `GeneratePreviewMessage { video_path: String }`: acquires semaphore permit, updates DB status to `processing`, calls `generate_preview_clip()`, updates DB to `complete` with duration/size on success or `failed` with error on failure. Follow the `PlaylistGenerator` actor pattern with `tokio::spawn` for async processing.
- [x] T010 [US2] Add `PreviewClipGenerator` actor to `AppState` in `src/state.rs`. Initialize it during server startup in `src/main.rs` with the `PREVIEW_CLIPS_DIRECTORY`, `BASE_PATH`, and preview DAO reference. Start the actor with `PreviewClipGenerator::new(...).start()`.
- [x] T011 [US2] Implement `GET /video/preview` handler in `src/main.rs`. Validate path with `is_valid_full_path()`. Check preview DAO for status: if `complete` → serve MP4 file with `NamedFile::open()` (200); if `processing` → return 202 JSON; if `pending`/not found → insert/update record as `pending`, send `GeneratePreviewMessage` to actor, return 202 JSON; if `failed` → return 500 with error. See `contracts/api-endpoints.md` for full response contract.
- [x] T012 [US2] Implement `POST /video/preview/status` handler in `src/main.rs`. Accept `PreviewStatusRequest` JSON body. Call `preview_dao.get_previews_batch()` for all paths. Map results: for each path, return status and `preview_url` (only when `complete`). Paths not in DB get status `not_found`. Limit to 200 paths per request. Return `PreviewStatusResponse` JSON.
- [x] T013 [US2] Register both new endpoints in route configuration in `src/main.rs`. Add `web::resource("/video/preview").route(web::get().to(get_video_preview))` and `web::resource("/video/preview/status").route(web::post().to(get_preview_status))`. Both require authentication (Claims extraction).
- [x] T014 [US2] Handle short videos (< 10 seconds) in `generate_preview_clip()` in `src/video/ffmpeg.rs`. When duration < 10s, calculate segment count as `floor(duration)` and interval as `duration / segment_count`. When duration < 1s, use the entire video as the preview (just transcode to 480p MP4). Add this logic to the interval calculation in T008.
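The sampling math from T008 and T014 can be sketched as a pure function. The 480p scaling and `select` filter syntax follow the task text and research.md R1; the function and argument names here are illustrative assumptions.

```rust
// Returns (segment_count, interval_seconds) per T008/T014:
// up to 10 one-second segments, evenly spaced across the video.
fn preview_plan(duration: f64) -> (u32, f64) {
    if duration < 1.0 {
        // Shorter than one second: use the whole video as the preview.
        return (1, duration.max(0.0));
    }
    let segments = (duration.floor() as u32).min(10); // floor(duration), capped at 10
    (segments, duration / segments as f64)
}

// Builds the video filter: pass through one second of frames at the
// start of every interval, fix timestamps, scale to 480p.
fn select_filter(interval: f64) -> String {
    format!(
        "select='lt(mod(t,{i:.3}),1)',setpts=N/FRAME_RATE/TB,scale=-2:480",
        i = interval
    )
}
```

A 100-second video yields 10 segments at 10-second intervals; a 5.5-second video yields 5 segments, so the preview is still at most ~10 seconds of output.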
**Checkpoint**: Backend fully functional — preview clips can be generated, cached, and served via API
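The per-path mapping rule from T012 (every requested path gets an entry, `preview_url` is populated only for `complete` rows, and paths with no row report `not_found`) can be sketched as follows, with the row type simplified to `(path, status)` tuples and the URL format assumed:

```rust
// Maps requested paths to (path, status, preview_url) triples per T012.
fn status_items(
    requested: &[String],
    rows: &[(String, String)], // (file_path, status) from the DAO batch query
) -> Vec<(String, String, Option<String>)> {
    requested
        .iter()
        .map(|path| match rows.iter().find(|(p, _)| p == path) {
            Some((_, status)) if status == "complete" => (
                path.clone(),
                status.clone(),
                Some(format!("/video/preview?path={path}")),
            ),
            Some((_, status)) => (path.clone(), status.clone(), None),
            None => (path.clone(), "not_found".to_string(), None),
        })
        .collect()
}
```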

---
## Phase 4: User Story 1 - Browse Videos as a Visual Wall (Priority: P1) MVP
**Goal**: Mobile app displays a responsive 2-3 column grid of simultaneously looping, muted video previews with long-press audio and tap-to-navigate.
**Independent Test**: Navigate to a folder with videos in the app, switch to VideoWall view, confirm grid displays with playing previews. Long-press to hear audio. Tap to open full video.
### Implementation for User Story 1
- [x] T015 [P] [US1] Create `useVideoWall` hook in `hooks/useVideoWall.ts` (SynologyFileViewer). Accepts array of `GridItem[]` (video items only, filtered from current files context). Calls `POST /video/preview/status` with video paths on mount to get availability. Returns `{ previewStatuses: Map<string, PreviewStatus>, focusedVideoPath: string | null, setFocusedVideo: (path) => void, refreshStatuses: () => void }`. Uses `authenticatedFetch()` from auth hook. Polls status every 5 seconds for any items still in `pending`/`processing` state, stops polling when all are `complete` or `failed`.
- [x] T016 [P] [US1] Create `VideoWallItem` component in `components/VideoWallItem.tsx` (SynologyFileViewer). Renders an `expo-video` `VideoView` for a single preview clip. Props: `videoPath: string, previewStatus: string, isFocused: boolean, onTap: () => void, onLongPress: () => void, isVisible: boolean`. When `previewStatus === 'complete'`: create `useVideoPlayer` with source URL `${baseUrl}/video/preview?path=${videoPath}` and auth headers, set `player.loop = true`, `player.muted = !isFocused`. When `isVisible` is true → `player.play()`, false → `player.pause()`. When status is not complete: show placeholder (thumbnail image from existing `/image?path=&size=thumb` endpoint with a loading indicator overlay). When `failed`: show error icon overlay. Aspect ratio 16:9 with `nativeControls={false}`.
- [x] T017 [US1] Create VideoWall view in `app/(app)/grid/video-wall.tsx` (SynologyFileViewer). Use `FlatList` with `numColumns` calculated as `Math.floor(dimensions.width / 180)` (targeting 2-3 columns). Get video items from `FilesContext` — filter `allItems` or `filteredItems` to only include video extensions (use same detection as existing `isVideo()` check). Pass items to `useVideoWall` hook. Use `viewabilityConfig` with `viewAreaCoveragePercentThreshold: 50` and `onViewableItemsChanged` callback to track visible items, passing `isVisible` to each `VideoWallItem`. Implement `keyExtractor` using video path. Add scroll-to-top FAB button following existing grid pattern.
- [x] T018 [US1] Add `video-wall` route to stack navigator in `app/(app)/grid/_layout.tsx` (SynologyFileViewer). Add `<Stack.Screen name="video-wall" options={{ title: "Video Wall" }} />` to the existing Stack navigator.
- [x] T019 [US1] Add navigation entry point to switch to VideoWall from the grid view. In `app/(app)/grid/[path].tsx` (SynologyFileViewer), add a header button (e.g., a grid/video icon from `@expo/vector-icons`) that calls `router.push("/grid/video-wall")`. Only show the button when the current folder contains at least one video file.
- [x] T020 [US1] Implement long-press audio-on-focus behavior. In `VideoWallItem`, wrap the VideoView in a `Pressable` with `onLongPress` calling `onLongPress` prop. In `video-wall.tsx`, when `onLongPress` fires for an item: call `setFocusedVideo(path)` if different from current, or `setFocusedVideo(null)` to toggle off. The `isFocused` prop drives `player.muted` in `VideoWallItem` — when focused, unmute; all others stay muted.
- [x] T021 [US1] Implement tap-to-navigate to full video player. In `VideoWallItem`, the `onTap` prop triggers navigation. In `video-wall.tsx`, the `onTap` handler sets the `currentIndex` in `FilesContext` to the tapped video's index and calls `router.push("/grid/viewer/video")` following the existing pattern from `[path].tsx` grid item press.
**Checkpoint**: Full VideoWall experience works for folder browsing with simultaneous playback, audio-on-focus, and tap-to-view

---
## Phase 5: User Story 3 - VideoWall from Search Results (Priority: P2)
**Goal**: VideoWall works with search/filter results, showing only matching videos.
**Independent Test**: Perform a search with filters that returns videos, switch to VideoWall, confirm only matching videos appear.
### Implementation for User Story 3
- [x] T022 [US3] Ensure VideoWall uses `filteredItems` when available. In `app/(app)/grid/video-wall.tsx` (SynologyFileViewer), check if `filteredItems` from `FilesContext` is non-empty — if so, use `filteredItems` filtered to videos only; otherwise use `allItems` filtered to videos. This should already work if T017 reads from the context correctly, but verify the logic handles both folder browsing and search result modes.
- [x] T023 [US3] Add VideoWall toggle from search results. In `app/(app)/search.tsx` (SynologyFileViewer), add a button (same icon as T019) that navigates to `/grid/video-wall` when search results contain at least one video. The `filteredItems` in `FilesContext` should already be populated by the search, so VideoWall will pick them up automatically.
**Checkpoint**: VideoWall works with both folder navigation and search/filter results

---
## Phase 6: User Story 4 - Background Preview Generation (Priority: P3)
**Goal**: Preview clips are generated proactively during file watching so most are ready before users open VideoWall.
**Independent Test**: Add new video files to a monitored folder, wait for file watcher scan cycle, confirm preview clips appear in `PREVIEW_CLIPS_DIRECTORY` without any user request.
### Implementation for User Story 4
- [x] T024 [US4] Extend `process_new_files()` in `src/main.rs` to detect videos missing preview clips. After the existing EXIF batch query, add a batch query via `preview_dao.get_previews_batch()` for all discovered video paths. Collect videos that have no record or have `failed` status (for retry).
- [x] T025 [US4] Queue preview generation for new/unprocessed videos in `process_new_files()` in `src/main.rs`. For each video missing a preview, insert a `pending` record via `preview_dao.insert_preview()` (skip if already exists), then send `GeneratePreviewMessage` to the `PreviewClipGenerator` actor. Follow the existing pattern of sending `QueueVideosMessage` to `VideoPlaylistManager`.
- [x] T026 [US4] Add preview clip directory creation to startup scan in `src/main.rs`. During the initial startup thumbnail generation phase, also check for videos missing preview clips and queue them for generation (same logic as T024/T025 but for the initial full scan). Ensure the `PREVIEW_CLIPS_DIRECTORY` is created at startup if it doesn't exist.
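The selection rule in T024 (queue videos with no record at all, or with a `failed` record eligible for retry; skip `pending`/`processing`/`complete`) amounts to a set filter. A sketch, with DAO rows simplified to `(path, status)` tuples:

```rust
use std::collections::HashSet;

// Returns the discovered videos that should be queued for preview generation.
fn videos_needing_previews(
    discovered: &[String],
    existing: &[(String, String)], // (file_path, status) from get_previews_batch
) -> Vec<String> {
    let known: HashSet<&str> = existing.iter().map(|(p, _)| p.as_str()).collect();
    let retry: HashSet<&str> = existing
        .iter()
        .filter(|(_, status)| status == "failed")
        .map(|(p, _)| p.as_str())
        .collect();
    discovered
        .iter()
        .filter(|p| !known.contains(p.as_str()) || retry.contains(p.as_str()))
        .cloned()
        .collect()
}
```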
**Checkpoint**: New videos automatically get preview clips generated during file watcher scans

---
## Phase 7: Polish & Cross-Cutting Concerns
**Purpose**: Error handling, loading states, observability
- [x] T027 [P] Add loading/placeholder state for pending previews in `components/VideoWallItem.tsx` (SynologyFileViewer). Show the existing thumbnail from `/image?path=&size=thumb` with a semi-transparent overlay and a loading spinner when preview status is `pending` or `processing`.
- [x] T028 [P] Add error state for failed previews in `components/VideoWallItem.tsx` (SynologyFileViewer). Show the existing thumbnail with an error icon overlay and optional "Retry" text when preview status is `failed`.
- [x] T029 [P] Add OpenTelemetry tracing spans for preview generation in `src/video/actors.rs` and `src/main.rs` endpoints. Follow the existing pattern of `global_tracer().start("preview_clip_generate")` with status and duration attributes.
- [x] T030 Verify cargo build and cargo clippy pass with all backend changes. Fix any warnings or errors.
- [x] T031 Run quickstart.md validation: test both API endpoints manually with curl, verify preview clip file is generated in correct directory structure, confirm mobile app connects and displays VideoWall.
---
## Dependencies & Execution Order
### Phase Dependencies
- **Setup (Phase 1)**: No dependencies — start immediately
- **Foundational (Phase 2)**: Depends on Phase 1 (migration must run first)
- **US2 - Server Generation (Phase 3)**: Depends on Phase 2 (needs DAO, models, types)
- **US1 - Mobile VideoWall (Phase 4)**: Depends on Phase 3 (needs API endpoints to exist)
- **US3 - Search Results (Phase 5)**: Depends on Phase 4 (extends VideoWall view)
- **US4 - Background Generation (Phase 6)**: Depends on Phase 3 only (backend only, no mobile dependency)
- **Polish (Phase 7)**: Depends on Phases 4 and 6
### User Story Dependencies
- **US2 (P1)**: Can start after Foundational — no other story dependencies
- **US1 (P1)**: Depends on US2 (needs preview API endpoints)
- **US3 (P2)**: Depends on US1 (extends the VideoWall view)
- **US4 (P3)**: Depends on US2 only (extends file watcher with preview generation; independent of mobile app)
### Within Each User Story
- Models/types before services/DAO
- DAO before actors
- Actors before endpoints
- Backend endpoints before mobile app views
- Core view before navigation integration
### Parallel Opportunities
**Phase 2**: T004, T005 can run in parallel (different files)
**Phase 3**: T008, T009 can run in parallel (ffmpeg.rs vs actors.rs)
**Phase 4**: T015, T016 can run in parallel (hook vs component, different files)
**Phase 6**: T024, T025 are sequential (same file) but Phase 6 can run in parallel with Phase 4/5
**Phase 7**: T027, T028, T029 can all run in parallel (different files)

---
## Parallel Example: User Story 2
```bash
# Launch parallelizable tasks together:
Task T008: "Add generate_preview_clip() function in src/video/ffmpeg.rs"
Task T009: "Create PreviewClipGenerator actor in src/video/actors.rs"
# Then sequential tasks (depend on T008+T009):
Task T010: "Add PreviewClipGenerator to AppState in src/state.rs"
Task T011: "Implement GET /video/preview handler in src/main.rs"
Task T012: "Implement POST /video/preview/status handler in src/main.rs"
Task T013: "Register endpoints in route configuration in src/main.rs"
Task T014: "Handle short videos in generate_preview_clip() in src/video/ffmpeg.rs"
```
## Parallel Example: User Story 1
```bash
# Launch parallelizable tasks together:
Task T015: "Create useVideoWall hook in hooks/useVideoWall.ts"
Task T016: "Create VideoWallItem component in components/VideoWallItem.tsx"
# Then sequential tasks (depend on T015+T016):
Task T017: "Create VideoWall view in app/(app)/grid/video-wall.tsx"
Task T018: "Add video-wall route to stack navigator"
Task T019: "Add navigation entry point from grid view"
Task T020: "Implement long-press audio-on-focus"
Task T021: "Implement tap-to-navigate to full video player"
```
---
## Implementation Strategy
### MVP First (US2 + US1)
1. Complete Phase 1: Setup (migration, env var)
2. Complete Phase 2: Foundational (model, DAO, types)
3. Complete Phase 3: US2 — Server generates preview clips
4. **STOP and VALIDATE**: Test API with curl per quickstart.md
5. Complete Phase 4: US1 — Mobile VideoWall view
6. **STOP and VALIDATE**: Test end-to-end on device
7. Deploy/demo — this is the MVP!
### Incremental Delivery
1. Setup + Foundational → Foundation ready
2. US2 (Server Generation) → Backend API working (testable with curl)
3. US1 (Mobile VideoWall) → Full end-to-end MVP (testable on device)
4. US3 (Search Results) → Extended browsing from search (incremental value)
5. US4 (Background Generation) → Performance enhancement (clips pre-generated)
6. Polish → Error states, tracing, validation
### Note on US4 Parallelism
US4 (Background Generation) only depends on US2 (backend), not on the mobile app. It can be developed in parallel with US1 by a second developer, or deferred to after MVP is validated.

---
## Notes
- [P] tasks = different files, no dependencies
- [Story] label maps task to specific user story
- Backend work is in `C:\Users\ccord\RustroverProjects\ImageApi`
- Frontend work is in `C:\Users\ccord\development\SynologyFileViewer`
- Commit after each task or logical group
- Stop at any checkpoint to validate story independently


@@ -1,403 +0,0 @@
use anyhow::Result;
use chrono::{NaiveDate, Utc};
use opentelemetry::KeyValue;
use opentelemetry::trace::{Span, Status, TraceContextExt, Tracer};
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use tokio::time::sleep;
use crate::ai::{OllamaClient, SmsApiClient, SmsMessage};
use crate::database::{DailySummaryDao, InsertDailySummary};
use crate::otel::global_tracer;

/// Strip boilerplate prefixes and common phrases from summaries before embedding.
/// This improves embedding diversity by removing structural similarity.
pub fn strip_summary_boilerplate(summary: &str) -> String {
    let mut text = summary.trim().to_string();

    // Remove markdown headers
    while text.starts_with('#') {
        if let Some(pos) = text.find('\n') {
            text = text[pos..].trim_start().to_string();
        } else {
            // Single line with just headers, try to extract content after #s
            text = text.trim_start_matches('#').trim().to_string();
            break;
        }
    }

    // Remove "Summary:" prefix variations (with optional markdown bold)
    let prefixes = [
        "**Summary:**",
        "**Summary**:",
        "*Summary:*",
        "Summary:",
        "**summary:**",
        "summary:",
    ];
    for prefix in prefixes {
        if text.to_lowercase().starts_with(&prefix.to_lowercase()) {
            text = text[prefix.len()..].trim_start().to_string();
            break;
        }
    }

    // Remove common opening phrases that add no semantic value
    let opening_phrases = [
        "Today, Melissa and I discussed",
        "Today, Amanda and I discussed",
        "Today Melissa and I discussed",
        "Today Amanda and I discussed",
        "Melissa and I discussed",
        "Amanda and I discussed",
        "Today, I discussed",
        "Today I discussed",
        "The conversation covered",
        "This conversation covered",
        "In this conversation,",
        "During this conversation,",
    ];
    for phrase in opening_phrases {
        if text.to_lowercase().starts_with(&phrase.to_lowercase()) {
            text = text[phrase.len()..].trim_start().to_string();
            // Remove leading punctuation/articles after stripping phrase
            text = text
                .trim_start_matches([',', ':', '-'])
                .trim_start()
                .to_string();
            break;
        }
    }

    // Remove any remaining leading markdown bold markers
    if text.starts_with("**")
        && let Some(end) = text[2..].find("**")
    {
        // Keep the content between ** but remove the markers
        let bold_content = &text[2..2 + end];
        text = format!("{}{}", bold_content, &text[4 + end..]);
    }

    text.trim().to_string()
}

/// Generate and embed daily conversation summaries for a date range
/// Default: August 2024 ±30 days (July 1 - September 30, 2024)
pub async fn generate_daily_summaries(
    contact: &str,
    start_date: Option<NaiveDate>,
    end_date: Option<NaiveDate>,
    ollama: &OllamaClient,
    sms_client: &SmsApiClient,
    summary_dao: Arc<Mutex<Box<dyn DailySummaryDao>>>,
) -> Result<()> {
    let tracer = global_tracer();

    // Get current context (empty in background task) and start span with it
    let current_cx = opentelemetry::Context::current();
    let mut span = tracer.start_with_context("ai.daily_summary.generate_batch", &current_cx);
    span.set_attribute(KeyValue::new("contact", contact.to_string()));

    // Create context with this span for child operations
    let parent_cx = current_cx.with_span(span);

    // Default to August 2024 ±30 days
    let start = start_date.unwrap_or_else(|| NaiveDate::from_ymd_opt(2024, 7, 1).unwrap());
    let end = end_date.unwrap_or_else(|| NaiveDate::from_ymd_opt(2024, 9, 30).unwrap());
    parent_cx
        .span()
        .set_attribute(KeyValue::new("start_date", start.to_string()));
    parent_cx
        .span()
        .set_attribute(KeyValue::new("end_date", end.to_string()));
    parent_cx.span().set_attribute(KeyValue::new(
        "date_range_days",
        (end - start).num_days() + 1,
    ));

    log::info!("========================================");
    log::info!("Starting daily summary generation for {}", contact);
    log::info!(
        "Date range: {} to {} ({} days)",
        start,
        end,
        (end - start).num_days() + 1
    );
    log::info!("========================================");

    // Fetch all messages for the contact in the date range
    log::info!("Fetching messages for date range...");
    let _start_timestamp = start.and_hms_opt(0, 0, 0).unwrap().and_utc().timestamp();
    let _end_timestamp = end.and_hms_opt(23, 59, 59).unwrap().and_utc().timestamp();
    let all_messages = sms_client.fetch_all_messages_for_contact(contact).await?;

    // Filter to date range and group by date
    let mut messages_by_date: HashMap<NaiveDate, Vec<SmsMessage>> = HashMap::new();
    for msg in all_messages {
        let msg_dt = chrono::DateTime::from_timestamp(msg.timestamp, 0);
        if let Some(dt) = msg_dt {
            let date = dt.date_naive();
            if date >= start && date <= end {
                messages_by_date.entry(date).or_default().push(msg);
            }
        }
    }

    log::info!(
        "Grouped messages into {} days with activity",
        messages_by_date.len()
    );
    if messages_by_date.is_empty() {
        log::warn!("No messages found in date range");
        return Ok(());
    }

    // Sort dates for ordered processing
    let mut dates: Vec<NaiveDate> = messages_by_date.keys().cloned().collect();
    dates.sort();

    let total_days = dates.len();
    let mut processed = 0;
    let mut skipped = 0;
    let mut failed = 0;

    log::info!("Processing {} days with messages...", total_days);
    for (idx, date) in dates.iter().enumerate() {
        let messages = messages_by_date.get(date).unwrap();
        let date_str = date.format("%Y-%m-%d").to_string();

        // Check if summary already exists
        {
            let mut dao = summary_dao.lock().expect("Unable to lock DailySummaryDao");
            let otel_context = opentelemetry::Context::new();
            if dao
                .summary_exists(&otel_context, &date_str, contact)
                .unwrap_or(false)
            {
                skipped += 1;
                if idx % 10 == 0 {
                    log::info!(
                        "Progress: {}/{} ({} processed, {} skipped)",
                        idx + 1,
                        total_days,
                        processed,
                        skipped
                    );
                }
                continue;
            }
        }

        // Generate summary for this day
        match generate_and_store_daily_summary(
            &parent_cx,
            date,
            contact,
            messages,
            ollama,
            summary_dao.clone(),
        )
        .await
        {
            Ok(_) => {
                processed += 1;
                log::info!(
                    "✓ {}/{}: {} ({} messages)",
                    idx + 1,
                    total_days,
                    date_str,
                    messages.len()
                );
            }
            Err(e) => {
                failed += 1;
                log::error!("✗ Failed to process {}: {:?}", date_str, e);
            }
        }

        // Rate limiting: sleep 500ms between summaries
        if idx < total_days - 1 {
            sleep(std::time::Duration::from_millis(500)).await;
        }

        // Progress logging every 10 days
        if idx % 10 == 0 && idx > 0 {
            log::info!(
                "Progress: {}/{} ({} processed, {} skipped, {} failed)",
                idx + 1,
                total_days,
                processed,
                skipped,
                failed
            );
        }
    }

    log::info!("========================================");
    log::info!("Daily summary generation complete!");
    log::info!(
        "Processed: {}, Skipped: {}, Failed: {}",
        processed,
        skipped,
        failed
    );
    log::info!("========================================");

    // Record final metrics in span
    parent_cx
        .span()
        .set_attribute(KeyValue::new("days_processed", processed as i64));
    parent_cx
        .span()
        .set_attribute(KeyValue::new("days_skipped", skipped as i64));
    parent_cx
        .span()
        .set_attribute(KeyValue::new("days_failed", failed as i64));
    parent_cx
        .span()
        .set_attribute(KeyValue::new("total_days", total_days as i64));
    if failed > 0 {
        parent_cx
            .span()
            .set_status(Status::error(format!("{} days failed to process", failed)));
    } else {
        parent_cx.span().set_status(Status::Ok);
    }

    Ok(())
}

/// Generate and store a single day's summary
async fn generate_and_store_daily_summary(
    parent_cx: &opentelemetry::Context,
    date: &NaiveDate,
    contact: &str,
    messages: &[SmsMessage],
    ollama: &OllamaClient,
    summary_dao: Arc<Mutex<Box<dyn DailySummaryDao>>>,
) -> Result<()> {
    let tracer = global_tracer();
    let mut span = tracer.start_with_context("ai.daily_summary.generate_single", parent_cx);
    span.set_attribute(KeyValue::new("date", date.to_string()));
    span.set_attribute(KeyValue::new("contact", contact.to_string()));
    span.set_attribute(KeyValue::new("message_count", messages.len() as i64));

    // Format messages for LLM
    let messages_text: String = messages
        .iter()
        .take(200) // Limit to 200 messages per day to avoid token overflow
        .map(|m| {
            if m.is_sent {
                format!("Me: {}", m.body)
            } else {
                format!("{}: {}", m.contact, m.body)
            }
        })
        .collect::<Vec<_>>()
        .join("\n");

    let weekday = date.format("%A");
    let prompt = format!(
        r#"Summarize this day's conversation between me and {}.
CRITICAL FORMAT RULES:
- Do NOT start with "Based on the conversation..." or "Here is a summary..." or similar preambles
- Do NOT repeat the date at the beginning
- Start DIRECTLY with the content - begin with a person's name or action
- Write in past tense, as if recording what happened
NARRATIVE (3-5 sentences):
- What specific topics, activities, or events were discussed?
- What places, people, or organizations were mentioned?
- What plans were made or decisions discussed?
- Clearly distinguish between what "I" did versus what {} did
KEYWORDS (comma-separated):
5-10 specific keywords that capture this conversation's unique content:
- Proper nouns (people, places, brands)
- Specific activities ("drum corps audition" not just "music")
- Distinctive terms that make this day unique
Date: {} ({})
Messages:
{}
YOUR RESPONSE (follow this format EXACTLY):
Summary: [Start directly with content, NO preamble]
Keywords: [specific, unique terms]"#,
        contact,
        contact,
        date.format("%B %d, %Y"),
        weekday,
        messages_text
    );

    // Generate summary with LLM
    let summary = ollama
        .generate(
            &prompt,
            Some("You are a conversation summarizer. Create clear, factual summaries with precise subject attribution AND extract distinctive keywords. Focus on specific, unique terms that differentiate this conversation from others."),
        )
        .await?;
    log::debug!(
        "Generated summary for {}: {}",
        date,
        summary.chars().take(100).collect::<String>()
    );
    span.set_attribute(KeyValue::new("summary_length", summary.len() as i64));

    // Strip boilerplate before embedding to improve vector diversity
    let stripped_summary = strip_summary_boilerplate(&summary);
    log::debug!(
        "Stripped summary for embedding: {}",
        stripped_summary.chars().take(100).collect::<String>()
    );

    // Embed the stripped summary (store original summary in DB)
    let embedding = ollama.generate_embedding(&stripped_summary).await?;
    span.set_attribute(KeyValue::new(
        "embedding_dimensions",
        embedding.len() as i64,
    ));

    // Store in database
    let insert = InsertDailySummary {
        date: date.format("%Y-%m-%d").to_string(),
        contact: contact.to_string(),
        summary: summary.trim().to_string(),
        message_count: messages.len() as i32,
        embedding,
        created_at: Utc::now().timestamp(),
        // model_version: "nomic-embed-text:v1.5".to_string(),
        model_version: "mxbai-embed-large:335m".to_string(),
    };

    // Create context from current span for DB operation
    let child_cx = opentelemetry::Context::current_with_span(span);
    let mut dao = summary_dao.lock().expect("Unable to lock DailySummaryDao");
    let result = dao
        .store_summary(&child_cx, insert)
        .map_err(|e| anyhow::anyhow!("Failed to store summary: {:?}", e));
    match &result {
        Ok(_) => child_cx.span().set_status(Status::Ok),
        Err(e) => child_cx.span().set_status(Status::error(e.to_string())),
    }
    result?;

    Ok(())
}


@@ -1,263 +0,0 @@
use actix_web::{HttpRequest, HttpResponse, Responder, delete, get, post, web};
use opentelemetry::KeyValue;
use opentelemetry::trace::{Span, Status, Tracer};
use serde::{Deserialize, Serialize};
use crate::ai::{InsightGenerator, ModelCapabilities, OllamaClient};
use crate::data::Claims;
use crate::database::InsightDao;
use crate::otel::{extract_context_from_request, global_tracer};
use crate::utils::normalize_path;

#[derive(Debug, Deserialize)]
pub struct GeneratePhotoInsightRequest {
    pub file_path: String,
    #[serde(default)]
    pub model: Option<String>,
    #[serde(default)]
    pub system_prompt: Option<String>,
    #[serde(default)]
    pub num_ctx: Option<i32>,
}

#[derive(Debug, Deserialize)]
pub struct GetPhotoInsightQuery {
    pub path: String,
}

#[derive(Debug, Serialize)]
pub struct PhotoInsightResponse {
    pub id: i32,
    pub file_path: String,
    pub title: String,
    pub summary: String,
    pub generated_at: i64,
    pub model_version: String,
}

#[derive(Debug, Serialize)]
pub struct AvailableModelsResponse {
    pub primary: ServerModels,
    #[serde(skip_serializing_if = "Option::is_none")]
    pub fallback: Option<ServerModels>,
}

#[derive(Debug, Serialize)]
pub struct ServerModels {
    pub url: String,
    pub models: Vec<ModelCapabilities>,
    pub default_model: String,
}

/// POST /insights/generate - Generate insight for a specific photo
#[post("/insights/generate")]
pub async fn generate_insight_handler(
    http_request: HttpRequest,
    _claims: Claims,
    request: web::Json<GeneratePhotoInsightRequest>,
    insight_generator: web::Data<InsightGenerator>,
) -> impl Responder {
    let parent_context = extract_context_from_request(&http_request);
    let tracer = global_tracer();
    let mut span = tracer.start_with_context("http.insights.generate", &parent_context);

    let normalized_path = normalize_path(&request.file_path);
    span.set_attribute(KeyValue::new("file_path", normalized_path.clone()));
    if let Some(ref model) = request.model {
        span.set_attribute(KeyValue::new("model", model.clone()));
    }
    if let Some(ref prompt) = request.system_prompt {
        span.set_attribute(KeyValue::new("has_custom_prompt", true));
        span.set_attribute(KeyValue::new("prompt_length", prompt.len() as i64));
    }
    if let Some(ctx) = request.num_ctx {
        span.set_attribute(KeyValue::new("num_ctx", ctx as i64));
    }

    log::info!(
        "Manual insight generation triggered for photo: {} with model: {:?}, custom_prompt: {}, num_ctx: {:?}",
        normalized_path,
        request.model,
        request.system_prompt.is_some(),
        request.num_ctx
    );

    // Generate insight with optional custom model, system prompt, and context size
    let result = insight_generator
        .generate_insight_for_photo_with_config(
            &normalized_path,
            request.model.clone(),
            request.system_prompt.clone(),
            request.num_ctx,
        )
        .await;

    match result {
        Ok(()) => {
            span.set_status(Status::Ok);
            HttpResponse::Ok().json(serde_json::json!({
                "success": true,
                "message": "Insight generated successfully"
            }))
        }
        Err(e) => {
            log::error!("Failed to generate insight: {:?}", e);
            span.set_status(Status::error(e.to_string()));
            HttpResponse::InternalServerError().json(serde_json::json!({
                "error": format!("Failed to generate insight: {:?}", e)
            }))
        }
    }
}

/// GET /insights?path=/path/to/photo.jpg - Fetch insight for specific photo
#[get("/insights")]
pub async fn get_insight_handler(
    _claims: Claims,
    query: web::Query<GetPhotoInsightQuery>,
    insight_dao: web::Data<std::sync::Mutex<Box<dyn InsightDao>>>,
) -> impl Responder {
    let normalized_path = normalize_path(&query.path);
    log::debug!("Fetching insight for {}", normalized_path);

    let otel_context = opentelemetry::Context::new();
    let mut dao = insight_dao.lock().expect("Unable to lock InsightDao");
    match dao.get_insight(&otel_context, &normalized_path) {
        Ok(Some(insight)) => {
            let response = PhotoInsightResponse {
                id: insight.id,
                file_path: insight.file_path,
                title: insight.title,
                summary: insight.summary,
                generated_at: insight.generated_at,
                model_version: insight.model_version,
            };
            HttpResponse::Ok().json(response)
        }
        Ok(None) => HttpResponse::NotFound().json(serde_json::json!({
            "error": "Insight not found"
        })),
        Err(e) => {
            log::error!("Failed to fetch insight ({}): {:?}", &query.path, e);
            HttpResponse::InternalServerError().json(serde_json::json!({
                "error": format!("Failed to fetch insight: {:?}", e)
            }))
        }
    }
}

/// DELETE /insights?path=/path/to/photo.jpg - Remove insight (will regenerate on next request)
#[delete("/insights")]
pub async fn delete_insight_handler(
    _claims: Claims,
    query: web::Query<GetPhotoInsightQuery>,
    insight_dao: web::Data<std::sync::Mutex<Box<dyn InsightDao>>>,
) -> impl Responder {
    let normalized_path = normalize_path(&query.path);
    log::info!("Deleting insight for {}", normalized_path);

    let otel_context = opentelemetry::Context::new();
    let mut dao = insight_dao.lock().expect("Unable to lock InsightDao");
    match dao.delete_insight(&otel_context, &normalized_path) {
        Ok(()) => HttpResponse::Ok().json(serde_json::json!({
            "success": true,
            "message": "Insight deleted successfully"
        })),
        Err(e) => {
            log::error!("Failed to delete insight: {:?}", e);
            HttpResponse::InternalServerError().json(serde_json::json!({
                "error": format!("Failed to delete insight: {:?}", e)
            }))
        }
    }
}
/// GET /insights/all - Get all insights
#[get("/insights/all")]
pub async fn get_all_insights_handler(
_claims: Claims,
insight_dao: web::Data<std::sync::Mutex<Box<dyn InsightDao>>>,
) -> impl Responder {
log::debug!("Fetching all insights");
let otel_context = opentelemetry::Context::new();
let mut dao = insight_dao.lock().expect("Unable to lock InsightDao");
match dao.get_all_insights(&otel_context) {
Ok(insights) => {
let responses: Vec<PhotoInsightResponse> = insights
.into_iter()
.map(|insight| PhotoInsightResponse {
id: insight.id,
file_path: insight.file_path,
title: insight.title,
summary: insight.summary,
generated_at: insight.generated_at,
model_version: insight.model_version,
})
.collect();
HttpResponse::Ok().json(responses)
}
Err(e) => {
log::error!("Failed to fetch all insights: {:?}", e);
HttpResponse::InternalServerError().json(serde_json::json!({
"error": format!("Failed to fetch insights: {:?}", e)
}))
}
}
}
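The insight-to-response mapping is written out field by field in both `get_insight_handler` and `get_all_insights_handler`; a `From` impl would centralize it. A minimal sketch, using assumed local mirrors of the crate's `Insight` and `PhotoInsightResponse` types (the field types here are guesses from the handler code, not confirmed by the source):

```rust
// Hypothetical mirrors of the crate's types, for illustration only.
struct Insight {
    id: i64,
    file_path: String,
    title: String,
    summary: String,
    generated_at: i64,
    model_version: String,
}

#[derive(Debug, PartialEq)]
struct PhotoInsightResponse {
    id: i64,
    file_path: String,
    title: String,
    summary: String,
    generated_at: i64,
    model_version: String,
}

impl From<Insight> for PhotoInsightResponse {
    // One place to maintain the field mapping; both handlers could
    // then call .into() or .map(PhotoInsightResponse::from).
    fn from(i: Insight) -> Self {
        Self {
            id: i.id,
            file_path: i.file_path,
            title: i.title,
            summary: i.summary,
            generated_at: i.generated_at,
            model_version: i.model_version,
        }
    }
}

fn main() {
    let insight = Insight {
        id: 1,
        file_path: "/photos/a.jpg".to_string(),
        title: "T".to_string(),
        summary: "S".to_string(),
        generated_at: 0,
        model_version: "m".to_string(),
    };
    let resp: PhotoInsightResponse = insight.into();
    assert_eq!(resp.file_path, "/photos/a.jpg");
    println!("ok");
}
```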
/// GET /insights/models - List available models from both servers with capabilities
#[get("/insights/models")]
pub async fn get_available_models_handler(
_claims: Claims,
app_state: web::Data<crate::state::AppState>,
) -> impl Responder {
log::debug!("Fetching available models with capabilities");
let ollama_client = &app_state.ollama;
// Fetch models with capabilities from primary server
let primary_models =
match OllamaClient::list_models_with_capabilities(&ollama_client.primary_url).await {
Ok(models) => models,
Err(e) => {
log::warn!("Failed to fetch models from primary server: {:?}", e);
vec![]
}
};
let primary = ServerModels {
url: ollama_client.primary_url.clone(),
models: primary_models,
default_model: ollama_client.primary_model.clone(),
};
// Fetch models with capabilities from fallback server if configured
let fallback = if let Some(fallback_url) = &ollama_client.fallback_url {
match OllamaClient::list_models_with_capabilities(fallback_url).await {
Ok(models) => Some(ServerModels {
url: fallback_url.clone(),
models,
default_model: ollama_client
.fallback_model
.clone()
.unwrap_or_else(|| ollama_client.primary_model.clone()),
}),
Err(e) => {
log::warn!("Failed to fetch models from fallback server: {:?}", e);
None
}
}
} else {
None
};
let response = AvailableModelsResponse { primary, fallback };
HttpResponse::Ok().json(response)
}

File diff suppressed because it is too large


@@ -1,16 +0,0 @@
pub mod daily_summary_job;
pub mod handlers;
pub mod insight_generator;
pub mod ollama;
pub mod sms_client;
// strip_summary_boilerplate is used by binaries (test_daily_summary), not the library
#[allow(unused_imports)]
pub use daily_summary_job::{generate_daily_summaries, strip_summary_boilerplate};
pub use handlers::{
delete_insight_handler, generate_insight_handler, get_all_insights_handler,
get_available_models_handler, get_insight_handler,
};
pub use insight_generator::InsightGenerator;
pub use ollama::{ModelCapabilities, OllamaClient};
pub use sms_client::{SmsApiClient, SmsMessage};


@@ -1,666 +0,0 @@
use anyhow::Result;
use chrono::NaiveDate;
use reqwest::Client;
use serde::{Deserialize, Serialize};
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::time::{Duration, Instant};
// Cache duration: 15 minutes
const CACHE_DURATION_SECS: u64 = 15 * 60;
// Cached entry with timestamp
#[derive(Clone)]
struct CachedEntry<T> {
data: T,
cached_at: Instant,
}
impl<T> CachedEntry<T> {
fn new(data: T) -> Self {
Self {
data,
cached_at: Instant::now(),
}
}
fn is_expired(&self) -> bool {
self.cached_at.elapsed().as_secs() > CACHE_DURATION_SECS
}
}
// Global cache for model lists and capabilities
lazy_static::lazy_static! {
static ref MODEL_LIST_CACHE: Arc<Mutex<HashMap<String, CachedEntry<Vec<String>>>>> =
Arc::new(Mutex::new(HashMap::new()));
static ref MODEL_CAPABILITIES_CACHE: Arc<Mutex<HashMap<String, CachedEntry<Vec<ModelCapabilities>>>>> =
Arc::new(Mutex::new(HashMap::new()));
}
#[derive(Clone)]
pub struct OllamaClient {
client: Client,
pub primary_url: String,
pub fallback_url: Option<String>,
pub primary_model: String,
pub fallback_model: Option<String>,
num_ctx: Option<i32>,
}
impl OllamaClient {
pub fn new(
primary_url: String,
fallback_url: Option<String>,
primary_model: String,
fallback_model: Option<String>,
) -> Self {
Self {
client: Client::builder()
.connect_timeout(Duration::from_secs(5)) // Quick connection timeout
.timeout(Duration::from_secs(120)) // Total request timeout for generation
.build()
.unwrap_or_else(|_| Client::new()),
primary_url,
fallback_url,
primary_model,
fallback_model,
num_ctx: None,
}
}
pub fn set_num_ctx(&mut self, num_ctx: Option<i32>) {
self.num_ctx = num_ctx;
}
/// List available models on an Ollama server (cached for 15 minutes)
pub async fn list_models(url: &str) -> Result<Vec<String>> {
// Check cache first
{
let cache = MODEL_LIST_CACHE.lock().unwrap();
if let Some(entry) = cache.get(url)
&& !entry.is_expired()
{
log::debug!("Returning cached model list for {}", url);
return Ok(entry.data.clone());
}
}
log::debug!("Fetching fresh model list from {}", url);
let client = Client::builder()
.connect_timeout(Duration::from_secs(5))
.timeout(Duration::from_secs(10))
.build()?;
let response = client.get(format!("{}/api/tags", url)).send().await?;
if !response.status().is_success() {
return Err(anyhow::anyhow!("Failed to list models from {}", url));
}
let tags_response: OllamaTagsResponse = response.json().await?;
let models: Vec<String> = tags_response.models.into_iter().map(|m| m.name).collect();
// Store in cache
{
let mut cache = MODEL_LIST_CACHE.lock().unwrap();
cache.insert(url.to_string(), CachedEntry::new(models.clone()));
}
Ok(models)
}
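The locking discipline in `list_models` is worth noting: the cache lock is taken in a short scope for the lookup, dropped while the network fetch runs, then re-taken to store the result, so a slow server never blocks other callers. A standalone sketch of that read-through pattern using only std types (the `get_or_fetch` helper and its fetch closure are illustrative, not part of the crate):

```rust
use std::collections::HashMap;
use std::sync::Mutex;
use std::time::{Duration, Instant};

const TTL: Duration = Duration::from_secs(15 * 60); // same 15-minute window

pub struct Cached {
    data: Vec<String>,
    cached_at: Instant,
}

pub fn get_or_fetch(
    cache: &Mutex<HashMap<String, Cached>>,
    url: &str,
    fetch: impl FnOnce() -> Vec<String>,
) -> Vec<String> {
    {
        let guard = cache.lock().unwrap();
        if let Some(entry) = guard.get(url) {
            if entry.cached_at.elapsed() <= TTL {
                return entry.data.clone(); // cache hit; lock held only briefly
            }
        }
    } // lock dropped before the (potentially slow) fetch
    let data = fetch();
    let mut guard = cache.lock().unwrap();
    guard.insert(
        url.to_string(),
        Cached { data: data.clone(), cached_at: Instant::now() },
    );
    data
}

fn main() {
    let cache = Mutex::new(HashMap::new());
    let mut fetches = 0;
    let first = get_or_fetch(&cache, "http://ollama-a", || {
        fetches += 1;
        vec!["llama3".to_string()]
    });
    let second = get_or_fetch(&cache, "http://ollama-a", || {
        fetches += 1;
        vec!["llama3".to_string()]
    });
    assert_eq!(first, second);
    assert_eq!(fetches, 1); // second call was served from the cache
    println!("ok");
}
```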
/// Check if a model is available on a server
pub async fn is_model_available(url: &str, model_name: &str) -> Result<bool> {
let models = Self::list_models(url).await?;
Ok(models.iter().any(|m| m == model_name))
}
/// Clear the model list cache for a specific URL or all URLs
pub fn clear_model_cache(url: Option<&str>) {
let mut cache = MODEL_LIST_CACHE.lock().unwrap();
if let Some(url) = url {
cache.remove(url);
log::debug!("Cleared model list cache for {}", url);
} else {
cache.clear();
log::debug!("Cleared all model list cache entries");
}
}
/// Clear the model capabilities cache for a specific URL or all URLs
pub fn clear_capabilities_cache(url: Option<&str>) {
let mut cache = MODEL_CAPABILITIES_CACHE.lock().unwrap();
if let Some(url) = url {
cache.remove(url);
log::debug!("Cleared model capabilities cache for {}", url);
} else {
cache.clear();
log::debug!("Cleared all model capabilities cache entries");
}
}
/// Check if a model has vision capabilities using the /api/show endpoint
pub async fn check_model_capabilities(
url: &str,
model_name: &str,
) -> Result<ModelCapabilities> {
let client = Client::builder()
.connect_timeout(Duration::from_secs(5))
.timeout(Duration::from_secs(10))
.build()?;
#[derive(Serialize)]
struct ShowRequest {
model: String,
}
let response = client
.post(format!("{}/api/show", url))
.json(&ShowRequest {
model: model_name.to_string(),
})
.send()
.await?;
if !response.status().is_success() {
return Err(anyhow::anyhow!(
"Failed to get model details for {} from {}",
model_name,
url
));
}
let show_response: OllamaShowResponse = response.json().await?;
// Check if "vision" is in the capabilities array
let has_vision = show_response.capabilities.iter().any(|cap| cap == "vision");
Ok(ModelCapabilities {
name: model_name.to_string(),
has_vision,
})
}
/// List all models with their capabilities from a server (cached for 15 minutes)
pub async fn list_models_with_capabilities(url: &str) -> Result<Vec<ModelCapabilities>> {
// Check cache first
{
let cache = MODEL_CAPABILITIES_CACHE.lock().unwrap();
if let Some(entry) = cache.get(url)
&& !entry.is_expired()
{
log::debug!("Returning cached model capabilities for {}", url);
return Ok(entry.data.clone());
}
}
log::debug!("Fetching fresh model capabilities from {}", url);
let models = Self::list_models(url).await?;
let mut capabilities = Vec::new();
for model_name in models {
match Self::check_model_capabilities(url, &model_name).await {
Ok(cap) => capabilities.push(cap),
Err(e) => {
log::warn!("Failed to get capabilities for model {}: {}", model_name, e);
// Fallback: assume no vision if we can't check
capabilities.push(ModelCapabilities {
name: model_name,
has_vision: false,
});
}
}
}
// Store in cache
{
let mut cache = MODEL_CAPABILITIES_CACHE.lock().unwrap();
cache.insert(url.to_string(), CachedEntry::new(capabilities.clone()));
}
Ok(capabilities)
}
/// Extract final answer from thinking model output
/// Handles <think>...</think> tags and takes everything after
fn extract_final_answer(&self, response: &str) -> String {
let response = response.trim();
// Look for </think> tag and take everything after it
if let Some(pos) = response.find("</think>") {
let answer = response[pos + "</think>".len()..].trim();
if !answer.is_empty() {
return answer.to_string();
}
}
// Fallback: return the whole response trimmed
response.to_string()
}
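The `</think>` stripping can be reproduced as a free function and exercised directly; this standalone copy is for illustration only:

```rust
/// Standalone copy of the extract_final_answer logic above: drop
/// everything through a closing </think> tag, falling back to the
/// whole (trimmed) response when the tag is absent or the remainder
/// is empty.
pub fn extract_final_answer(response: &str) -> String {
    let response = response.trim();
    if let Some(pos) = response.find("</think>") {
        let answer = response[pos + "</think>".len()..].trim();
        if !answer.is_empty() {
            return answer.to_string();
        }
    }
    response.to_string()
}

fn main() {
    assert_eq!(
        extract_final_answer("<think>reasoning...</think>\nA sunny day."),
        "A sunny day."
    );
    // No tag: the response passes through trimmed.
    assert_eq!(extract_final_answer("  plain answer  "), "plain answer");
    // Empty tail after the tag: fall back to the full response.
    assert_eq!(
        extract_final_answer("<think>only thoughts</think>"),
        "<think>only thoughts</think>"
    );
    println!("ok");
}
```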
async fn try_generate(
&self,
url: &str,
model: &str,
prompt: &str,
system: Option<&str>,
images: Option<Vec<String>>,
) -> Result<String> {
let request = OllamaRequest {
model: model.to_string(),
prompt: prompt.to_string(),
stream: false,
system: system.map(|s| s.to_string()),
options: self.num_ctx.map(|ctx| OllamaOptions { num_ctx: ctx }),
images,
};
let response = self
.client
.post(format!("{}/api/generate", url))
.json(&request)
.send()
.await?;
if !response.status().is_success() {
let status = response.status();
let error_body = response.text().await.unwrap_or_default();
return Err(anyhow::anyhow!(
"Ollama request failed: {} - {}",
status,
error_body
));
}
let result: OllamaResponse = response.json().await?;
Ok(result.response)
}
pub async fn generate(&self, prompt: &str, system: Option<&str>) -> Result<String> {
self.generate_with_images(prompt, system, None).await
}
pub async fn generate_with_images(
&self,
prompt: &str,
system: Option<&str>,
images: Option<Vec<String>>,
) -> Result<String> {
log::debug!("=== Ollama Request ===");
log::debug!("Primary model: {}", self.primary_model);
if let Some(sys) = system {
log::debug!("System: {}", sys);
}
log::debug!("Prompt:\n{}", prompt);
if let Some(ref imgs) = images {
log::debug!("Images: {} image(s) included", imgs.len());
}
log::debug!("=====================");
// Try primary server first with primary model
log::info!(
"Attempting to generate with primary server: {} (model: {})",
self.primary_url,
self.primary_model
);
let primary_result = self
.try_generate(
&self.primary_url,
&self.primary_model,
prompt,
system,
images.clone(),
)
.await;
let raw_response = match primary_result {
Ok(response) => {
log::info!("Successfully generated response from primary server");
response
}
Err(e) => {
log::warn!("Primary server failed: {}", e);
// Try fallback server if available
if let Some(fallback_url) = &self.fallback_url {
// Use fallback model if specified, otherwise use primary model
let fallback_model =
self.fallback_model.as_ref().unwrap_or(&self.primary_model);
log::info!(
"Attempting to generate with fallback server: {} (model: {})",
fallback_url,
fallback_model
);
match self
.try_generate(fallback_url, fallback_model, prompt, system, images.clone())
.await
{
Ok(response) => {
log::info!("Successfully generated response from fallback server");
response
}
Err(fallback_e) => {
log::error!("Fallback server also failed: {}", fallback_e);
return Err(anyhow::anyhow!(
"Both primary and fallback servers failed. Primary: {}, Fallback: {}",
e,
fallback_e
));
}
}
} else {
log::error!("No fallback server configured");
return Err(e);
}
}
};
log::debug!("=== Ollama Response ===");
log::debug!("Raw response: {}", raw_response.trim());
log::debug!("=======================");
// Extract final answer from thinking model output
let cleaned = self.extract_final_answer(&raw_response);
log::debug!("=== Cleaned Response ===");
log::debug!("Final answer: {}", cleaned);
log::debug!("========================");
Ok(cleaned)
}
/// Generate a title for a single photo based on its generated summary
pub async fn generate_photo_title(
&self,
summary: &str,
custom_system: Option<&str>,
) -> Result<String> {
let prompt = format!(
r#"Create a short title (maximum 8 words) for the following journal entry:
{}
Capture the key moment or theme. Return ONLY the title, nothing else."#,
summary
);
        let system = custom_system.unwrap_or("You are my long-term memory assistant. Use only the information provided. Do not invent details.");
let title = self
.generate_with_images(&prompt, Some(system), None)
.await?;
Ok(title.trim().trim_matches('"').to_string())
}
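The title post-processing above (`trim`, then `trim_matches('"')`) strips whitespace and any quote characters the model tends to wrap its answer in, from both ends. A quick standalone check:

```rust
/// Mirror of the title cleanup: whitespace first, then surrounding
/// double quotes (trim_matches removes them from both ends).
pub fn clean_title(raw: &str) -> String {
    raw.trim().trim_matches('"').to_string()
}

fn main() {
    assert_eq!(clean_title("  \"Sunset at the Lake\"\n"), "Sunset at the Lake");
    assert_eq!(clean_title("No quotes here"), "No quotes here");
    println!("ok");
}
```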
/// Generate a summary for a single photo based on its context
pub async fn generate_photo_summary(
&self,
date: NaiveDate,
location: Option<&str>,
contact: Option<&str>,
sms_summary: Option<&str>,
custom_system: Option<&str>,
image_base64: Option<String>,
) -> Result<String> {
let location_str = location.unwrap_or("Unknown");
let sms_str = sms_summary.unwrap_or("No messages");
let prompt = if image_base64.is_some() {
if let Some(contact_name) = contact {
format!(
r#"Write a 1-3 paragraph description of this moment by analyzing the image and the available context:
Date: {}
Location: {}
Person/Contact: {}
Messages: {}
Analyze the image and use specific details from both the visual content and the context above. The photo is from a folder for {}, so they are likely in or related to this photo. Mention people's names (especially {}), places, or activities if they appear in either the image or the context. Write in first person as Cameron with the tone of a journal entry. If limited information is available, keep it simple and factual based on what you see and know. If the location is unknown, omit it."#,
date.format("%B %d, %Y"),
location_str,
contact_name,
sms_str,
contact_name,
contact_name
)
} else {
format!(
r#"Write a 1-3 paragraph description of this moment by analyzing the image and the available context:
Date: {}
Location: {}
Messages: {}
Analyze the image and use specific details from both the visual content and the context above. Mention people's names, places, or activities if they appear in either the image or the context. Write in first person as Cameron with the tone of a journal entry. If limited information is available, keep it simple and factual based on what you see and know. If the location is unknown, omit it."#,
date.format("%B %d, %Y"),
location_str,
sms_str
)
}
} else if let Some(contact_name) = contact {
format!(
r#"Write a 1-3 paragraph description of this moment based on the available information:
Date: {}
Location: {}
Person/Contact: {}
Messages: {}
Use only the specific details provided above. The photo is from a folder for {}, so they are likely related to this moment. Mention people's names (especially {}), places, or activities if they appear in the context. Write in first person as Cameron with the tone of a journal entry. If limited information is available, keep it simple and factual. If the location is unknown, omit it."#,
date.format("%B %d, %Y"),
location_str,
contact_name,
sms_str,
contact_name,
contact_name
)
} else {
format!(
r#"Write a 1-3 paragraph description of this moment based on the available information:
Date: {}
Location: {}
Messages: {}
Use only the specific details provided above. Mention people's names, places, or activities if they appear in the context. Write in first person as Cameron with the tone of a journal entry. If limited information is available, keep it simple and factual. If the location is unknown, omit it."#,
date.format("%B %d, %Y"),
location_str,
sms_str
)
};
let system = custom_system.unwrap_or("You are a memory refreshing assistant who is able to provide insights through analyzing past conversations. Use only the information provided. Do not invent details.");
let images = image_base64.map(|img| vec![img]);
self.generate_with_images(&prompt, Some(system), images)
.await
}
/// Generate an embedding vector for text using nomic-embed-text:v1.5
/// Returns a 768-dimensional vector as Vec<f32>
pub async fn generate_embedding(&self, text: &str) -> Result<Vec<f32>> {
let embeddings = self.generate_embeddings(&[text]).await?;
embeddings
.into_iter()
.next()
.ok_or_else(|| anyhow::anyhow!("No embedding returned"))
}
/// Generate embeddings for multiple texts in a single API call (batch mode)
/// Returns a vector of 768-dimensional vectors
/// This is much more efficient than calling generate_embedding multiple times
pub async fn generate_embeddings(&self, texts: &[&str]) -> Result<Vec<Vec<f32>>> {
let embedding_model = "nomic-embed-text:v1.5";
log::debug!("=== Ollama Batch Embedding Request ===");
log::debug!("Model: {}", embedding_model);
log::debug!("Batch size: {} texts", texts.len());
log::debug!("======================================");
// Try primary server first
log::debug!(
"Attempting to generate {} embeddings with primary server: {} (model: {})",
texts.len(),
self.primary_url,
embedding_model
);
let primary_result = self
.try_generate_embeddings(&self.primary_url, embedding_model, texts)
.await;
let embeddings = match primary_result {
Ok(embeddings) => {
log::debug!(
"Successfully generated {} embeddings from primary server",
embeddings.len()
);
embeddings
}
Err(e) => {
log::warn!("Primary server batch embedding failed: {}", e);
// Try fallback server if available
if let Some(fallback_url) = &self.fallback_url {
log::info!(
"Attempting to generate {} embeddings with fallback server: {} (model: {})",
texts.len(),
fallback_url,
embedding_model
);
match self
.try_generate_embeddings(fallback_url, embedding_model, texts)
.await
{
Ok(embeddings) => {
log::info!(
"Successfully generated {} embeddings from fallback server",
embeddings.len()
);
embeddings
}
Err(fallback_e) => {
log::error!(
"Fallback server batch embedding also failed: {}",
fallback_e
);
return Err(anyhow::anyhow!(
"Both primary and fallback servers failed. Primary: {}, Fallback: {}",
e,
fallback_e
));
}
}
} else {
log::error!("No fallback server configured");
return Err(e);
}
}
};
// Validate embedding dimensions (should be 768 for nomic-embed-text:v1.5)
for (i, embedding) in embeddings.iter().enumerate() {
if embedding.len() != 768 {
log::warn!(
"Unexpected embedding dimensions for item {}: {} (expected 768)",
i,
embedding.len()
);
}
}
Ok(embeddings)
}
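The client only produces embedding vectors; what a caller does with them is outside this file. The typical downstream operation is cosine similarity between two 768-dimensional vectors, which can be sketched with plain slices (this comparison helper is illustrative, not part of the crate):

```rust
/// Cosine similarity between two embedding vectors, as a consumer of
/// generate_embeddings might compute it. Returns a value in [-1, 1];
/// zero-norm inputs yield 0.0 rather than NaN.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "embedding dimensions must match");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return 0.0;
    }
    dot / (norm_a * norm_b)
}

fn main() {
    let a = [1.0, 0.0, 0.0];
    let b = [0.0, 1.0, 0.0];
    assert!((cosine_similarity(&a, &a) - 1.0).abs() < 1e-6); // identical -> 1
    assert!(cosine_similarity(&a, &b).abs() < 1e-6); // orthogonal -> 0
    println!("ok");
}
```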
/// Internal helper to try generating embeddings for multiple texts from a specific server
async fn try_generate_embeddings(
&self,
url: &str,
model: &str,
texts: &[&str],
) -> Result<Vec<Vec<f32>>> {
let request = OllamaBatchEmbedRequest {
model: model.to_string(),
input: texts.iter().map(|s| s.to_string()).collect(),
};
let response = self
.client
.post(format!("{}/api/embed", url))
.json(&request)
.send()
.await?;
if !response.status().is_success() {
let status = response.status();
let error_body = response.text().await.unwrap_or_default();
return Err(anyhow::anyhow!(
"Ollama batch embedding request failed: {} - {}",
status,
error_body
));
}
let result: OllamaEmbedResponse = response.json().await?;
Ok(result.embeddings)
}
}
#[derive(Serialize)]
struct OllamaRequest {
model: String,
prompt: String,
stream: bool,
#[serde(skip_serializing_if = "Option::is_none")]
system: Option<String>,
#[serde(skip_serializing_if = "Option::is_none")]
options: Option<OllamaOptions>,
#[serde(skip_serializing_if = "Option::is_none")]
images: Option<Vec<String>>,
}
#[derive(Serialize)]
struct OllamaOptions {
num_ctx: i32,
}
#[derive(Deserialize)]
struct OllamaResponse {
response: String,
}
#[derive(Deserialize)]
struct OllamaTagsResponse {
models: Vec<OllamaModel>,
}
#[derive(Deserialize)]
struct OllamaModel {
name: String,
}
#[derive(Deserialize)]
struct OllamaShowResponse {
#[serde(default)]
capabilities: Vec<String>,
}
#[derive(Serialize, Deserialize, Clone, Debug)]
pub struct ModelCapabilities {
pub name: String,
pub has_vision: bool,
}
#[derive(Serialize)]
struct OllamaBatchEmbedRequest {
model: String,
input: Vec<String>,
}
#[derive(Deserialize)]
struct OllamaEmbedResponse {
embeddings: Vec<Vec<f32>>,
}


@@ -1,316 +0,0 @@
use anyhow::Result;
use reqwest::Client;
use serde::Deserialize;
use super::ollama::OllamaClient;
#[derive(Clone)]
pub struct SmsApiClient {
client: Client,
base_url: String,
token: Option<String>,
}
impl SmsApiClient {
pub fn new(base_url: String, token: Option<String>) -> Self {
Self {
client: Client::new(),
base_url,
token,
}
}
/// Fetch messages for a specific contact within ±4 days of the given timestamp
/// Falls back to all contacts if no messages found for the specific contact
/// Messages are sorted by proximity to the center timestamp
pub async fn fetch_messages_for_contact(
&self,
contact: Option<&str>,
center_timestamp: i64,
) -> Result<Vec<SmsMessage>> {
use chrono::Duration;
// Calculate ±4 days range around the center timestamp
let center_dt = chrono::DateTime::from_timestamp(center_timestamp, 0)
.ok_or_else(|| anyhow::anyhow!("Invalid timestamp"))?;
let start_dt = center_dt - Duration::days(4);
let end_dt = center_dt + Duration::days(4);
let start_ts = start_dt.timestamp();
let end_ts = end_dt.timestamp();
// If contact specified, try fetching for that contact first
if let Some(contact_name) = contact {
log::info!(
"Fetching SMS for contact: {} (±4 days from {})",
contact_name,
center_dt.format("%Y-%m-%d %H:%M:%S")
);
let messages = self
.fetch_messages(start_ts, end_ts, Some(contact_name), Some(center_timestamp))
.await?;
if !messages.is_empty() {
log::info!(
"Found {} messages for contact {}",
messages.len(),
contact_name
);
return Ok(messages);
}
log::info!(
"No messages found for contact {}, falling back to all contacts",
contact_name
);
}
// Fallback to all contacts
log::info!(
"Fetching all SMS messages (±4 days from {})",
center_dt.format("%Y-%m-%d %H:%M:%S")
);
self.fetch_messages(start_ts, end_ts, None, Some(center_timestamp))
.await
}
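The ±4-day window around the photo's timestamp reduces to simple Unix-seconds arithmetic; the real code gets the same result via `chrono::Duration`. A std-only sketch:

```rust
/// The +/- 4-day window around a center timestamp, in Unix seconds.
pub fn window_around(center_ts: i64) -> (i64, i64) {
    const FOUR_DAYS: i64 = 4 * 24 * 60 * 60; // 345_600 s
    (center_ts - FOUR_DAYS, center_ts + FOUR_DAYS)
}

fn main() {
    let (start, end) = window_around(1_700_000_000);
    assert_eq!(start, 1_699_654_400);
    assert_eq!(end, 1_700_345_600);
    assert_eq!(end - start, 8 * 24 * 60 * 60); // full 8-day span
    println!("ok");
}
```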
/// Fetch all messages for a specific contact across all time
/// Used for embedding generation - retrieves complete message history
/// Handles pagination automatically if the API returns a limited number of results
pub async fn fetch_all_messages_for_contact(&self, contact: &str) -> Result<Vec<SmsMessage>> {
let start_ts = chrono::DateTime::parse_from_rfc3339("2000-01-01T00:00:00Z")
.unwrap()
.timestamp();
let end_ts = chrono::Utc::now().timestamp();
log::info!("Fetching all historical messages for contact: {}", contact);
let mut all_messages = Vec::new();
let mut offset = 0;
let limit = 1000; // Fetch in batches of 1000
loop {
log::debug!(
"Fetching batch at offset {} for contact {}",
offset,
contact
);
let batch = self
.fetch_messages_paginated(start_ts, end_ts, Some(contact), None, limit, offset)
.await?;
let batch_size = batch.len();
all_messages.extend(batch);
log::debug!(
"Fetched {} messages (total so far: {})",
batch_size,
all_messages.len()
);
// If we got fewer messages than the limit, we've reached the end
if batch_size < limit {
break;
}
offset += limit;
}
log::info!(
"Fetched {} total messages for contact {}",
all_messages.len(),
contact
);
Ok(all_messages)
}
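The pagination loop's termination rule, a batch shorter than the limit means the server has no more rows, can be tested without HTTP by swapping the fetch for a closure (the `fetch_all` helper below is a sketch of the bookkeeping, not crate code):

```rust
/// Offset-based pagination in the same shape as
/// fetch_all_messages_for_contact: keep requesting pages of `limit`
/// items until a short batch signals the end of the data.
pub fn fetch_all<F>(limit: usize, mut fetch_page: F) -> Vec<u32>
where
    F: FnMut(usize, usize) -> Vec<u32>, // (limit, offset) -> batch
{
    let mut all = Vec::new();
    let mut offset = 0;
    loop {
        let batch = fetch_page(limit, offset);
        let batch_size = batch.len();
        all.extend(batch);
        // Fewer items than requested: we've reached the end.
        if batch_size < limit {
            break;
        }
        offset += limit;
    }
    all
}

fn main() {
    // 2500 fake message ids, served in pages of 1000.
    let data: Vec<u32> = (0..2500).collect();
    let result = fetch_all(1000, |limit, offset| {
        data.iter().skip(offset).take(limit).copied().collect()
    });
    assert_eq!(result.len(), 2500); // three requests: 1000 + 1000 + 500
    assert_eq!(result, data);
    println!("ok");
}
```

Note the edge case: when the total is an exact multiple of `limit`, one extra empty request is issued before the loop exits, matching the original code's behavior.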
/// Internal method to fetch messages with pagination support
async fn fetch_messages_paginated(
&self,
start_ts: i64,
end_ts: i64,
contact: Option<&str>,
center_timestamp: Option<i64>,
limit: usize,
offset: usize,
) -> Result<Vec<SmsMessage>> {
let mut url = format!(
"{}/api/messages/by-date-range/?start_date={}&end_date={}&limit={}&offset={}",
self.base_url, start_ts, end_ts, limit, offset
);
if let Some(contact_name) = contact {
url.push_str(&format!("&contact={}", urlencoding::encode(contact_name)));
}
if let Some(ts) = center_timestamp {
url.push_str(&format!("&timestamp={}", ts));
}
log::debug!("Fetching SMS messages from: {}", url);
let mut request = self.client.get(&url);
if let Some(token) = &self.token {
request = request.header("Authorization", format!("Bearer {}", token));
}
let response = request.send().await?;
log::debug!("SMS API response status: {}", response.status());
if !response.status().is_success() {
let status = response.status();
let error_body = response.text().await.unwrap_or_default();
log::error!("SMS API request failed: {} - {}", status, error_body);
return Err(anyhow::anyhow!(
"SMS API request failed: {} - {}",
status,
error_body
));
}
let data: SmsApiResponse = response.json().await?;
Ok(data
.messages
.into_iter()
.map(|m| SmsMessage {
contact: m.contact_name,
body: m.body,
timestamp: m.date,
                is_sent: m.type_ == 2, // type 2 = sent
})
.collect())
}
/// Internal method to fetch messages with optional contact filter and timestamp sorting
async fn fetch_messages(
&self,
start_ts: i64,
end_ts: i64,
contact: Option<&str>,
center_timestamp: Option<i64>,
) -> Result<Vec<SmsMessage>> {
// Call Django endpoint
let mut url = format!(
"{}/api/messages/by-date-range/?start_date={}&end_date={}",
self.base_url, start_ts, end_ts
);
// Add contact filter if provided
if let Some(contact_name) = contact {
url.push_str(&format!("&contact={}", urlencoding::encode(contact_name)));
}
// Add timestamp for proximity sorting if provided
if let Some(ts) = center_timestamp {
url.push_str(&format!("&timestamp={}", ts));
}
log::debug!("Fetching SMS messages from: {}", url);
let mut request = self.client.get(&url);
// Add authorization header if token exists
if let Some(token) = &self.token {
request = request.header("Authorization", format!("Bearer {}", token));
}
let response = request.send().await?;
log::debug!("SMS API response status: {}", response.status());
if !response.status().is_success() {
let status = response.status();
let error_body = response.text().await.unwrap_or_default();
log::error!("SMS API request failed: {} - {}", status, error_body);
return Err(anyhow::anyhow!(
"SMS API request failed: {} - {}",
status,
error_body
));
}
let data: SmsApiResponse = response.json().await?;
// Convert to internal format
Ok(data
.messages
.into_iter()
.map(|m| SmsMessage {
contact: m.contact_name,
body: m.body,
timestamp: m.date,
is_sent: m.type_ == 2, // type 2 = sent
})
.collect())
}
pub async fn summarize_context(
&self,
messages: &[SmsMessage],
ollama: &OllamaClient,
) -> Result<String> {
if messages.is_empty() {
return Ok(String::from("No messages on this day"));
}
// Create prompt for Ollama with sender/receiver distinction
let messages_text: String = messages
.iter()
.take(60) // Limit to avoid token overflow
.map(|m| {
if m.is_sent {
format!("Me: {}", m.body)
} else {
format!("{}: {}", m.contact, m.body)
}
})
.collect::<Vec<_>>()
.join("\n");
let prompt = format!(
r#"Summarize these messages in up to 4-5 sentences. Focus on key topics, places, people mentioned, and the overall context of the conversations.
Messages:
{}
Summary:"#,
messages_text
);
ollama
.generate(
&prompt,
// Some("You are a summarizer for the purposes of jogging my memory and highlighting events and situations."),
Some("You are the keeper of memories, ingest the context and give me a casual summary of the moment."),
)
.await
}
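The transcript formatting inside `summarize_context`, sent messages rendered as "Me: ...", received ones prefixed with the contact's name, capped at 60 entries, can be pulled out and checked on its own (this standalone copy is illustrative):

```rust
/// Minimal mirror of SmsMessage for the formatting step.
pub struct Msg {
    pub contact: String,
    pub body: String,
    pub is_sent: bool,
}

/// Build the prompt transcript: "Me:" for sent messages, the contact
/// name otherwise, limited to 60 messages to avoid token overflow.
pub fn format_transcript(messages: &[Msg]) -> String {
    messages
        .iter()
        .take(60)
        .map(|m| {
            if m.is_sent {
                format!("Me: {}", m.body)
            } else {
                format!("{}: {}", m.contact, m.body)
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    let msgs = vec![
        Msg { contact: "Alex".to_string(), body: "Lunch?".to_string(), is_sent: false },
        Msg { contact: "Alex".to_string(), body: "Sure, noon".to_string(), is_sent: true },
    ];
    assert_eq!(format_transcript(&msgs), "Alex: Lunch?\nMe: Sure, noon");
    println!("ok");
}
```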
}
#[derive(Debug, Clone)]
pub struct SmsMessage {
pub contact: String,
pub body: String,
pub timestamp: i64,
pub is_sent: bool,
}
#[derive(Deserialize)]
struct SmsApiResponse {
messages: Vec<SmsApiMessage>,
}
#[derive(Deserialize)]
struct SmsApiMessage {
contact_name: String,
body: String,
date: i64,
#[serde(rename = "type")]
type_: i32,
}


@@ -1,90 +1,56 @@
use actix_web::Responder;
use actix_web::{
HttpResponse,
web::{self, Json},
};
use actix_web::web::{self, HttpResponse, Json};
use actix_web::{post, Responder};
use chrono::{Duration, Utc};
use jsonwebtoken::{EncodingKey, Header, encode};
use log::{error, info};
use std::sync::Mutex;
use jsonwebtoken::{encode, EncodingKey, Header};
use log::{debug, error};
use crate::{
data::{Claims, CreateAccountRequest, LoginRequest, Token, secret_key},
data::{secret_key, Claims, CreateAccountRequest, LoginRequest, Token},
database::UserDao,
};
/// Validate password meets security requirements
fn validate_password(password: &str) -> Result<(), String> {
if password.len() < 12 {
return Err("Password must be at least 12 characters".into());
}
if !password.chars().any(|c| c.is_uppercase()) {
return Err("Password must contain at least one uppercase letter".into());
}
if !password.chars().any(|c| c.is_lowercase()) {
return Err("Password must contain at least one lowercase letter".into());
}
if !password.chars().any(|c| c.is_numeric()) {
return Err("Password must contain at least one number".into());
}
if !password.chars().any(|c| !c.is_alphanumeric()) {
return Err("Password must contain at least one special character".into());
}
Ok(())
}
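The validator introduced in this diff enforces length plus four character-class rules. A standalone copy with a few checks (for illustration; the real function lives in the auth module):

```rust
/// Copy of the password rules above: length >= 12 plus at least one
/// uppercase letter, lowercase letter, digit, and special character.
pub fn validate_password(password: &str) -> Result<(), String> {
    if password.len() < 12 {
        return Err("Password must be at least 12 characters".into());
    }
    if !password.chars().any(|c| c.is_uppercase()) {
        return Err("Password must contain at least one uppercase letter".into());
    }
    if !password.chars().any(|c| c.is_lowercase()) {
        return Err("Password must contain at least one lowercase letter".into());
    }
    if !password.chars().any(|c| c.is_numeric()) {
        return Err("Password must contain at least one number".into());
    }
    if !password.chars().any(|c| !c.is_alphanumeric()) {
        return Err("Password must contain at least one special character".into());
    }
    Ok(())
}

fn main() {
    assert!(validate_password("Str0ng&Enough!").is_ok());
    assert!(validate_password("short1A!").is_err()); // under 12 characters
    assert!(validate_password("alllowercase1!aa").is_err()); // no uppercase
    assert!(validate_password("NoDigitsHere!!").is_err()); // no number
    println!("ok");
}
```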
#[allow(dead_code)]
async fn register<D: UserDao>(
#[post("/register")]
async fn register(
user: Json<CreateAccountRequest>,
user_dao: web::Data<Mutex<D>>,
user_dao: web::Data<Box<dyn UserDao>>,
) -> impl Responder {
// Validate password strength
if let Err(msg) = validate_password(&user.password) {
return HttpResponse::BadRequest().body(msg);
}
if !user.username.is_empty() && user.password == user.confirmation {
let mut dao = user_dao.lock().expect("Unable to get UserDao");
if dao.user_exists(&user.username) {
HttpResponse::BadRequest().finish()
} else if let Some(_user) = dao.create_user(&user.username, &user.password) {
HttpResponse::Ok().finish()
if !user.username.is_empty() && user.password.len() > 5 && user.password == user.confirmation {
if user_dao.user_exists(&user.username) {
HttpResponse::BadRequest()
} else if let Some(_user) = user_dao.create_user(&user.username, &user.password) {
HttpResponse::Ok()
} else {
HttpResponse::InternalServerError().finish()
HttpResponse::InternalServerError()
}
} else {
HttpResponse::BadRequest().finish()
HttpResponse::BadRequest()
}
}
pub async fn login<D: UserDao>(
pub async fn login(
creds: Json<LoginRequest>,
user_dao: web::Data<Mutex<D>>,
user_dao: web::Data<Box<dyn UserDao>>,
) -> HttpResponse {
info!("Logging in: {}", creds.username);
let mut user_dao = user_dao.lock().expect("Unable to get UserDao");
debug!("Logging in: {}", creds.username);
if let Some(user) = user_dao.get_user(&creds.username, &creds.password) {
let claims = Claims {
sub: user.id.to_string(),
exp: (Utc::now() + Duration::days(5)).timestamp(),
};
let token = match encode(
let token = encode(
&Header::default(),
&claims,
&EncodingKey::from_secret(secret_key().as_bytes()),
) {
Ok(t) => t,
Err(e) => {
error!("Failed to encode JWT: {}", e);
return HttpResponse::InternalServerError().finish();
}
};
)
.unwrap();
HttpResponse::Ok().json(Token { token: &token })
} else {
error!("Failed login attempt for user: '{}'", creds.username);
error!(
"User not found during login or incorrect password: '{}'",
creds.username
);
HttpResponse::NotFound().finish()
}
}
@@ -96,7 +62,7 @@ mod tests {
#[actix_rt::test]
async fn test_login_reports_200_when_user_exists() {
let mut dao = TestUserDao::new();
let dao = TestUserDao::new();
dao.create_user("user", "pass");
let j = Json(LoginRequest {
@@ -104,14 +70,14 @@ mod tests {
password: "pass".to_string(),
});
let response = login::<TestUserDao>(j, web::Data::new(Mutex::new(dao))).await;
let response = login(j, web::Data::new(Box::new(dao))).await;
assert_eq!(response.status(), 200);
}
#[actix_rt::test]
async fn test_login_returns_token_on_success() {
let mut dao = TestUserDao::new();
let dao = TestUserDao::new();
dao.create_user("user", "password");
let j = Json(LoginRequest {
@@ -119,17 +85,15 @@ mod tests {
password: "password".to_string(),
});
let response = login::<TestUserDao>(j, web::Data::new(Mutex::new(dao))).await;
let response = login(j, web::Data::new(Box::new(dao))).await;
assert_eq!(response.status(), 200);
let response_text: String = response.read_to_str();
assert!(response_text.contains("\"token\""));
assert!(response.body().read_to_str().contains("\"token\""));
}
#[actix_rt::test]
async fn test_login_reports_404_when_user_does_not_exist() {
let mut dao = TestUserDao::new();
let dao = TestUserDao::new();
dao.create_user("user", "password");
let j = Json(LoginRequest {
@@ -137,7 +101,7 @@ mod tests {
password: "password".to_string(),
});
let response = login::<TestUserDao>(j, web::Data::new(Mutex::new(dao))).await;
let response = login(j, web::Data::new(Box::new(dao))).await;
assert_eq!(response.status(), 404);
}


@@ -1,143 +0,0 @@
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use clap::Parser;
use image_api::cleanup::{
CleanupConfig, DatabaseUpdater, resolve_missing_files, validate_file_types,
};
use image_api::database::{SqliteExifDao, SqliteFavoriteDao};
use image_api::tags::SqliteTagDao;
#[derive(Parser, Debug)]
#[command(name = "cleanup_files")]
#[command(about = "File cleanup and fix utility for ImageApi", long_about = None)]
struct Args {
#[arg(long, help = "Preview changes without making them")]
dry_run: bool,
#[arg(long, help = "Auto-fix all issues without prompting")]
auto_fix: bool,
#[arg(long, help = "Skip phase 1 (missing file resolution)")]
skip_phase1: bool,
#[arg(long, help = "Skip phase 2 (file type validation)")]
skip_phase2: bool,
}
fn main() -> anyhow::Result<()> {
// Initialize logging
env_logger::init();
// Load environment variables
dotenv::dotenv()?;
// Parse CLI arguments
let args = Args::parse();
// Get base path from environment
let base_path = dotenv::var("BASE_PATH")?;
let base = PathBuf::from(&base_path);
println!("File Cleanup and Fix Utility");
println!("============================");
println!("Base path: {}", base.display());
println!("Dry run: {}", args.dry_run);
println!("Auto fix: {}", args.auto_fix);
println!();
// Pre-flight checks
if !base.exists() {
eprintln!("Error: Base path does not exist: {}", base.display());
std::process::exit(1);
}
if !base.is_dir() {
eprintln!("Error: Base path is not a directory: {}", base.display());
std::process::exit(1);
}
// Create configuration
let config = CleanupConfig {
base_path: base,
dry_run: args.dry_run,
auto_fix: args.auto_fix,
};
// Create DAOs
println!("Connecting to database...");
let tag_dao: Arc<Mutex<dyn image_api::tags::TagDao>> =
Arc::new(Mutex::new(SqliteTagDao::default()));
let exif_dao: Arc<Mutex<dyn image_api::database::ExifDao>> =
Arc::new(Mutex::new(SqliteExifDao::new()));
let favorites_dao: Arc<Mutex<dyn image_api::database::FavoriteDao>> =
Arc::new(Mutex::new(SqliteFavoriteDao::new()));
// Create database updater
let mut db_updater = DatabaseUpdater::new(tag_dao, exif_dao, favorites_dao);
println!("✓ Database connected\n");
// Track overall statistics
let mut total_issues_found = 0;
let mut total_issues_fixed = 0;
let mut total_errors = Vec::new();
// Phase 1: Missing file resolution
if !args.skip_phase1 {
match resolve_missing_files(&config, &mut db_updater) {
Ok(stats) => {
total_issues_found += stats.issues_found;
total_issues_fixed += stats.issues_fixed;
total_errors.extend(stats.errors);
}
Err(e) => {
eprintln!("Phase 1 failed: {:?}", e);
total_errors.push(format!("Phase 1 error: {}", e));
}
}
} else {
println!("Phase 1: Skipped (--skip-phase1)");
}
// Phase 2: File type validation
if !args.skip_phase2 {
match validate_file_types(&config, &mut db_updater) {
Ok(stats) => {
total_issues_found += stats.issues_found;
total_issues_fixed += stats.issues_fixed;
total_errors.extend(stats.errors);
}
Err(e) => {
eprintln!("Phase 2 failed: {:?}", e);
total_errors.push(format!("Phase 2 error: {}", e));
}
}
} else {
println!("\nPhase 2: Skipped (--skip-phase2)");
}
// Final summary
println!("\n============================");
println!("Cleanup Complete!");
println!("============================");
println!("Total issues found: {}", total_issues_found);
if config.dry_run {
println!("Total issues that would be fixed: {}", total_issues_found);
} else {
println!("Total issues fixed: {}", total_issues_fixed);
}
if !total_errors.is_empty() {
println!("\nErrors encountered:");
for (i, error) in total_errors.iter().enumerate() {
println!(" {}. {}", i + 1, error);
}
println!("\nSome operations failed. Review errors above.");
} else {
println!("\n✓ No errors encountered");
}
Ok(())
}


@@ -1,307 +0,0 @@
use anyhow::Result;
use clap::Parser;
use diesel::prelude::*;
use diesel::sql_query;
use diesel::sqlite::SqliteConnection;
use std::env;
#[derive(Parser, Debug)]
#[command(author, version, about = "Diagnose embedding distribution and identify problematic summaries", long_about = None)]
struct Args {
/// Show detailed per-summary statistics
#[arg(short, long, default_value_t = false)]
verbose: bool,
/// Number of top "central" summaries to show (ones that match everything)
#[arg(short, long, default_value_t = 10)]
top: usize,
/// Test a specific query to see what matches
#[arg(short, long)]
query: Option<String>,
}
#[derive(QueryableByName, Debug)]
struct EmbeddingRow {
#[diesel(sql_type = diesel::sql_types::Integer)]
id: i32,
#[diesel(sql_type = diesel::sql_types::Text)]
date: String,
#[diesel(sql_type = diesel::sql_types::Text)]
contact: String,
#[diesel(sql_type = diesel::sql_types::Text)]
summary: String,
#[diesel(sql_type = diesel::sql_types::Binary)]
embedding: Vec<u8>,
}
fn deserialize_embedding(bytes: &[u8]) -> Result<Vec<f32>> {
if !bytes.len().is_multiple_of(4) {
return Err(anyhow::anyhow!("Invalid embedding byte length"));
}
let count = bytes.len() / 4;
let mut vec = Vec::with_capacity(count);
for chunk in bytes.chunks_exact(4) {
let float = f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
vec.push(float);
}
Ok(vec)
}
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
if a.len() != b.len() {
return 0.0;
}
let dot_product: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
let magnitude_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
let magnitude_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
if magnitude_a == 0.0 || magnitude_b == 0.0 {
return 0.0;
}
dot_product / (magnitude_a * magnitude_b)
}
fn main() -> Result<()> {
dotenv::dotenv().ok();
let args = Args::parse();
let database_url = env::var("DATABASE_URL").unwrap_or_else(|_| "auth.db".to_string());
println!("Connecting to database: {}", database_url);
let mut conn = SqliteConnection::establish(&database_url)?;
// Load all embeddings
println!("\nLoading embeddings from daily_conversation_summaries...");
let rows: Vec<EmbeddingRow> = sql_query(
"SELECT id, date, contact, summary, embedding FROM daily_conversation_summaries ORDER BY date"
)
.load(&mut conn)?;
println!("Found {} summaries with embeddings\n", rows.len());
if rows.is_empty() {
println!("No summaries found!");
return Ok(());
}
// Parse all embeddings
let mut embeddings: Vec<(i32, String, String, String, Vec<f32>)> = Vec::new();
for row in &rows {
match deserialize_embedding(&row.embedding) {
Ok(emb) => {
embeddings.push((
row.id,
row.date.clone(),
row.contact.clone(),
row.summary.clone(),
emb,
));
}
Err(e) => {
println!(
"Warning: Failed to parse embedding for id {}: {}",
row.id, e
);
}
}
}
println!("Successfully parsed {} embeddings\n", embeddings.len());
// Compute embedding statistics
println!("========================================");
println!("EMBEDDING STATISTICS");
println!("========================================\n");
// Check embedding variance (are values clustered or spread out?)
let first_emb = &embeddings[0].4;
let dim = first_emb.len();
println!("Embedding dimensions: {}", dim);
// Calculate mean and std dev per dimension
let mut dim_means: Vec<f32> = vec![0.0; dim];
let mut dim_vars: Vec<f32> = vec![0.0; dim];
for (_, _, _, _, emb) in &embeddings {
for (i, &val) in emb.iter().enumerate() {
dim_means[i] += val;
}
}
for m in &mut dim_means {
*m /= embeddings.len() as f32;
}
for (_, _, _, _, emb) in &embeddings {
for (i, &val) in emb.iter().enumerate() {
let diff = val - dim_means[i];
dim_vars[i] += diff * diff;
}
}
for v in &mut dim_vars {
*v = (*v / embeddings.len() as f32).sqrt();
}
let avg_std_dev: f32 = dim_vars.iter().sum::<f32>() / dim as f32;
let min_std_dev: f32 = dim_vars.iter().cloned().fold(f32::INFINITY, f32::min);
let max_std_dev: f32 = dim_vars.iter().cloned().fold(f32::NEG_INFINITY, f32::max);
println!("Per-dimension standard deviation:");
println!(" Average: {:.6}", avg_std_dev);
println!(" Min: {:.6}", min_std_dev);
println!(" Max: {:.6}", max_std_dev);
println!();
// Compute pairwise similarities
println!("Computing pairwise similarities (this may take a moment)...\n");
let mut all_similarities: Vec<f32> = Vec::new();
let mut per_embedding_avg: Vec<(usize, f32)> = Vec::new();
for i in 0..embeddings.len() {
let mut sum = 0.0;
let mut count = 0;
for j in 0..embeddings.len() {
if i != j {
let sim = cosine_similarity(&embeddings[i].4, &embeddings[j].4);
all_similarities.push(sim);
sum += sim;
count += 1;
}
}
// Guard against division by zero when only a single embedding exists
per_embedding_avg.push((i, sum / count.max(1) as f32));
}
// Sort similarities for percentile analysis
all_similarities.sort_by(|a, b| a.partial_cmp(b).unwrap());
let min_sim = all_similarities.first().copied().unwrap_or(0.0);
let max_sim = all_similarities.last().copied().unwrap_or(0.0);
let median_sim = all_similarities[all_similarities.len() / 2];
let p25 = all_similarities[all_similarities.len() / 4];
let p75 = all_similarities[3 * all_similarities.len() / 4];
let mean_sim: f32 = all_similarities.iter().sum::<f32>() / all_similarities.len() as f32;
println!("========================================");
println!("PAIRWISE SIMILARITY DISTRIBUTION");
println!("========================================\n");
println!("Total pairs analyzed: {}", all_similarities.len());
println!();
println!("Min similarity: {:.4}", min_sim);
println!("25th percentile: {:.4}", p25);
println!("Median similarity: {:.4}", median_sim);
println!("Mean similarity: {:.4}", mean_sim);
println!("75th percentile: {:.4}", p75);
println!("Max similarity: {:.4}", max_sim);
println!();
// Analyze distribution
let count_above_08 = all_similarities.iter().filter(|&&s| s > 0.8).count();
let count_above_07 = all_similarities.iter().filter(|&&s| s > 0.7).count();
let count_above_06 = all_similarities.iter().filter(|&&s| s > 0.6).count();
let count_above_05 = all_similarities.iter().filter(|&&s| s > 0.5).count();
let count_below_03 = all_similarities.iter().filter(|&&s| s < 0.3).count();
println!("Similarity distribution:");
println!(
" > 0.8: {} ({:.1}%)",
count_above_08,
100.0 * count_above_08 as f32 / all_similarities.len() as f32
);
println!(
" > 0.7: {} ({:.1}%)",
count_above_07,
100.0 * count_above_07 as f32 / all_similarities.len() as f32
);
println!(
" > 0.6: {} ({:.1}%)",
count_above_06,
100.0 * count_above_06 as f32 / all_similarities.len() as f32
);
println!(
" > 0.5: {} ({:.1}%)",
count_above_05,
100.0 * count_above_05 as f32 / all_similarities.len() as f32
);
println!(
" < 0.3: {} ({:.1}%)",
count_below_03,
100.0 * count_below_03 as f32 / all_similarities.len() as f32
);
println!();
// Identify "central" embeddings (high average similarity to all others)
per_embedding_avg.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
println!("========================================");
println!("TOP {} MOST 'CENTRAL' SUMMARIES", args.top);
println!("(These match everything with high similarity)");
println!("========================================\n");
for (rank, (idx, avg_sim)) in per_embedding_avg.iter().take(args.top).enumerate() {
let (id, date, contact, summary, _) = &embeddings[*idx];
let preview: String = summary.chars().take(80).collect();
println!("{}. [id={}, avg_sim={:.4}]", rank + 1, id, avg_sim);
println!(" Date: {}, Contact: {}", date, contact);
println!(" Preview: {}...", preview.replace('\n', " "));
println!();
}
// Also show the least central (most unique)
println!("========================================");
println!("TOP {} MOST UNIQUE SUMMARIES", args.top);
println!("(These are most different from others)");
println!("========================================\n");
for (rank, (idx, avg_sim)) in per_embedding_avg.iter().rev().take(args.top).enumerate() {
let (id, date, contact, summary, _) = &embeddings[*idx];
let preview: String = summary.chars().take(80).collect();
println!("{}. [id={}, avg_sim={:.4}]", rank + 1, id, avg_sim);
println!(" Date: {}, Contact: {}", date, contact);
println!(" Preview: {}...", preview.replace('\n', " "));
println!();
}
// Diagnosis
println!("========================================");
println!("DIAGNOSIS");
println!("========================================\n");
if mean_sim > 0.7 {
println!("⚠️ HIGH AVERAGE SIMILARITY ({:.4})", mean_sim);
println!(" All embeddings are very similar to each other.");
println!(" This explains why the same summaries always match.");
println!();
println!(" Possible causes:");
println!(
" 1. Summaries have similar structure/phrasing (e.g., all start with 'Summary:')"
);
println!(" 2. Embedding model isn't capturing semantic differences well");
println!(" 3. Daily conversations have similar topics (e.g., 'good morning', plans)");
println!();
println!(" Recommendations:");
println!(" 1. Try a different embedding model (mxbai-embed-large, bge-large)");
println!(" 2. Improve summary diversity by varying the prompt");
println!(" 3. Extract and embed only keywords/entities, not full summaries");
} else if mean_sim > 0.5 {
println!("⚡ MODERATE AVERAGE SIMILARITY ({:.4})", mean_sim);
println!(" Some clustering in embeddings, but some differentiation exists.");
println!();
println!(" The 'central' summaries above are likely dominating search results.");
println!(" Consider:");
println!(" 1. Filtering out summaries with very high centrality");
println!(" 2. Adding time-based weighting to prefer recent/relevant dates");
println!(" 3. Increasing the similarity threshold from 0.3 to 0.5");
} else {
println!("✅ GOOD EMBEDDING DIVERSITY ({:.4})", mean_sim);
println!(" Embeddings are well-differentiated.");
println!(" If same results keep appearing, the issue may be elsewhere.");
}
Ok(())
}
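The two helpers in the diagnose tool above (little-endian f32 decoding and cosine similarity) can be sanity-checked in isolation. A minimal standalone sketch — the `Option` return here replaces the tool's `anyhow::Result`, an assumption made only to keep the example dependency-free:

```rust
// Decode a buffer of little-endian f32 values, as the diagnose tool does.
fn deserialize_embedding(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

// Cosine similarity between two equal-length vectors; 0.0 on mismatch or zero magnitude.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let mag_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let mag_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if mag_a == 0.0 || mag_b == 0.0 {
        0.0
    } else {
        dot / (mag_a * mag_b)
    }
}

fn main() {
    // Round trip: f32 -> little-endian bytes -> f32 is exact.
    let original = vec![0.25_f32, -1.5, 3.0];
    let bytes: Vec<u8> = original.iter().flat_map(|f| f.to_le_bytes()).collect();
    assert_eq!(deserialize_embedding(&bytes).unwrap(), original);

    // A vector is maximally similar to itself; orthogonal vectors score zero.
    let a = [1.0_f32, 0.0, 0.0];
    let b = [0.0_f32, 1.0, 0.0];
    assert!((cosine_similarity(&a, &a) - 1.0).abs() < 1e-6);
    assert_eq!(cosine_similarity(&a, &b), 0.0);
    println!("ok");
}
```

Identical vectors scoring exactly 1.0 and orthogonal vectors 0.0 is the property the pairwise-similarity statistics above rely on.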


@@ -1,166 +0,0 @@
use anyhow::{Context, Result};
use chrono::Utc;
use clap::Parser;
use image_api::ai::ollama::OllamaClient;
use image_api::database::calendar_dao::{InsertCalendarEvent, SqliteCalendarEventDao};
use image_api::parsers::ical_parser::parse_ics_file;
use log::{error, info};
use std::sync::{Arc, Mutex};
// Import the trait to use its methods
use image_api::database::CalendarEventDao;
#[derive(Parser, Debug)]
#[command(author, version, about = "Import Google Takeout Calendar data", long_about = None)]
struct Args {
/// Path to the .ics calendar file
#[arg(short, long)]
path: String,
/// Generate embeddings for calendar events (slower but enables semantic search)
#[arg(long, default_value = "false")]
generate_embeddings: bool,
/// Skip events that already exist in the database
#[arg(long, default_value = "true")]
skip_existing: bool,
/// Batch size for embedding generation
#[arg(long, default_value = "128")]
batch_size: usize,
}
#[tokio::main]
async fn main() -> Result<()> {
dotenv::dotenv().ok();
env_logger::init();
let args = Args::parse();
info!("Parsing calendar file: {}", args.path);
let events = parse_ics_file(&args.path).context("Failed to parse .ics file")?;
info!("Found {} calendar events", events.len());
let context = opentelemetry::Context::current();
let ollama = if args.generate_embeddings {
let primary_url = dotenv::var("OLLAMA_PRIMARY_URL")
.or_else(|_| dotenv::var("OLLAMA_URL"))
.unwrap_or_else(|_| "http://localhost:11434".to_string());
let fallback_url = dotenv::var("OLLAMA_FALLBACK_URL").ok();
let primary_model = dotenv::var("OLLAMA_PRIMARY_MODEL")
.or_else(|_| dotenv::var("OLLAMA_MODEL"))
.unwrap_or_else(|_| "nomic-embed-text:v1.5".to_string());
let fallback_model = dotenv::var("OLLAMA_FALLBACK_MODEL").ok();
Some(OllamaClient::new(
primary_url,
fallback_url,
primary_model,
fallback_model,
))
} else {
None
};
let inserted_count = Arc::new(Mutex::new(0));
let skipped_count = Arc::new(Mutex::new(0));
let error_count = Arc::new(Mutex::new(0));
// Process events in batches
// Can't use rayon with async, so process sequentially
for event in &events {
let mut dao_instance = SqliteCalendarEventDao::new();
// Check if event exists
if args.skip_existing
&& let Ok(exists) = dao_instance.event_exists(
&context,
event.event_uid.as_deref().unwrap_or(""),
event.start_time,
)
&& exists
{
*skipped_count.lock().unwrap() += 1;
continue;
}
// Generate embedding if requested (blocking call)
let embedding = if let Some(ref ollama_client) = ollama {
let text = format!(
"{} {} {}",
event.summary,
event.description.as_deref().unwrap_or(""),
event.location.as_deref().unwrap_or("")
);
match tokio::task::block_in_place(|| {
tokio::runtime::Handle::current()
.block_on(async { ollama_client.generate_embedding(&text).await })
}) {
Ok(emb) => Some(emb),
Err(e) => {
error!(
"Failed to generate embedding for event '{}': {}",
event.summary, e
);
None
}
}
} else {
None
};
// Insert into database
let insert_event = InsertCalendarEvent {
event_uid: event.event_uid.clone(),
summary: event.summary.clone(),
description: event.description.clone(),
location: event.location.clone(),
start_time: event.start_time,
end_time: event.end_time,
all_day: event.all_day,
organizer: event.organizer.clone(),
attendees: if event.attendees.is_empty() {
None
} else {
Some(serde_json::to_string(&event.attendees).unwrap_or_default())
},
embedding,
created_at: Utc::now().timestamp(),
source_file: Some(args.path.clone()),
};
match dao_instance.store_event(&context, insert_event) {
Ok(_) => {
*inserted_count.lock().unwrap() += 1;
if *inserted_count.lock().unwrap() % 100 == 0 {
info!("Imported {} events...", *inserted_count.lock().unwrap());
}
}
Err(e) => {
error!("Failed to store event '{}': {:?}", event.summary, e);
*error_count.lock().unwrap() += 1;
}
}
}
let final_inserted = *inserted_count.lock().unwrap();
let final_skipped = *skipped_count.lock().unwrap();
let final_errors = *error_count.lock().unwrap();
info!("\n=== Import Summary ===");
info!("Total events found: {}", events.len());
info!("Successfully inserted: {}", final_inserted);
info!("Skipped (already exist): {}", final_skipped);
info!("Errors: {}", final_errors);
if args.generate_embeddings {
info!("Embeddings were generated for semantic search");
} else {
info!("No embeddings generated (use --generate-embeddings to enable semantic search)");
}
Ok(())
}


@@ -1,114 +0,0 @@
use anyhow::{Context, Result};
use chrono::Utc;
use clap::Parser;
use image_api::database::location_dao::{InsertLocationRecord, SqliteLocationHistoryDao};
use image_api::parsers::location_json_parser::parse_location_json;
use log::{error, info};
// Import the trait to use its methods
use image_api::database::LocationHistoryDao;
#[derive(Parser, Debug)]
#[command(author, version, about = "Import Google Takeout Location History data", long_about = None)]
struct Args {
/// Path to the Location History JSON file
#[arg(short, long)]
path: String,
/// Skip locations that already exist in the database
#[arg(long, default_value = "true")]
skip_existing: bool,
/// Batch size for database inserts
#[arg(long, default_value = "1000")]
batch_size: usize,
}
#[tokio::main]
async fn main() -> Result<()> {
dotenv::dotenv().ok();
env_logger::init();
let args = Args::parse();
info!("Parsing location history file: {}", args.path);
let locations =
parse_location_json(&args.path).context("Failed to parse location history JSON")?;
info!("Found {} location records", locations.len());
let context = opentelemetry::Context::current();
let mut inserted_count = 0;
let mut skipped_count = 0;
let mut error_count = 0;
let mut dao_instance = SqliteLocationHistoryDao::new();
let created_at = Utc::now().timestamp();
// Process in batches using batch insert for massive speedup
for (batch_idx, chunk) in locations.chunks(args.batch_size).enumerate() {
info!(
"Processing batch {} ({} records)...",
batch_idx + 1,
chunk.len()
);
// Convert to InsertLocationRecord
let mut batch_inserts = Vec::with_capacity(chunk.len());
for location in chunk {
// Check for existing records if requested (this check makes import much slower)
if args.skip_existing
&& let Ok(exists) = dao_instance.location_exists(
&context,
location.timestamp,
location.latitude,
location.longitude,
)
&& exists
{
skipped_count += 1;
continue;
}
batch_inserts.push(InsertLocationRecord {
timestamp: location.timestamp,
latitude: location.latitude,
longitude: location.longitude,
accuracy: location.accuracy,
activity: location.activity.clone(),
activity_confidence: location.activity_confidence,
place_name: None,
place_category: None,
embedding: None,
created_at,
source_file: Some(args.path.clone()),
});
}
// Batch insert entire chunk in single transaction
if !batch_inserts.is_empty() {
match dao_instance.store_locations_batch(&context, batch_inserts) {
Ok(count) => {
inserted_count += count;
info!(
"Imported {} locations (total: {})...",
count, inserted_count
);
}
Err(e) => {
error!("Failed to store batch: {:?}", e);
error_count += chunk.len();
}
}
}
}
info!("\n=== Import Summary ===");
info!("Total locations found: {}", locations.len());
info!("Successfully inserted: {}", inserted_count);
info!("Skipped (already exist): {}", skipped_count);
info!("Errors: {}", error_count);
Ok(())
}


@@ -1,152 +0,0 @@
use anyhow::{Context, Result};
use chrono::Utc;
use clap::Parser;
use image_api::ai::ollama::OllamaClient;
use image_api::database::search_dao::{InsertSearchRecord, SqliteSearchHistoryDao};
use image_api::parsers::search_html_parser::parse_search_html;
use log::{error, info, warn};
// Import the trait to use its methods
use image_api::database::SearchHistoryDao;
#[derive(Parser, Debug)]
#[command(author, version, about = "Import Google Takeout Search History data", long_about = None)]
struct Args {
/// Path to the search history HTML file
#[arg(short, long)]
path: String,
/// Skip searches that already exist in the database
#[arg(long, default_value = "true")]
skip_existing: bool,
/// Batch size for embedding generation (max 128 recommended)
#[arg(long, default_value = "64")]
batch_size: usize,
}
#[tokio::main]
async fn main() -> Result<()> {
dotenv::dotenv().ok();
env_logger::init();
let args = Args::parse();
info!("Parsing search history file: {}", args.path);
let searches = parse_search_html(&args.path).context("Failed to parse search history HTML")?;
info!("Found {} search records", searches.len());
let primary_url = dotenv::var("OLLAMA_PRIMARY_URL")
.or_else(|_| dotenv::var("OLLAMA_URL"))
.unwrap_or_else(|_| "http://localhost:11434".to_string());
let fallback_url = dotenv::var("OLLAMA_FALLBACK_URL").ok();
let primary_model = dotenv::var("OLLAMA_PRIMARY_MODEL")
.or_else(|_| dotenv::var("OLLAMA_MODEL"))
.unwrap_or_else(|_| "nomic-embed-text:v1.5".to_string());
let fallback_model = dotenv::var("OLLAMA_FALLBACK_MODEL").ok();
let ollama = OllamaClient::new(primary_url, fallback_url, primary_model, fallback_model);
let context = opentelemetry::Context::current();
let mut inserted_count = 0;
let mut skipped_count = 0;
let mut error_count = 0;
let mut dao_instance = SqliteSearchHistoryDao::new();
let created_at = Utc::now().timestamp();
// Process searches in batches (embeddings are REQUIRED for searches)
for (batch_idx, chunk) in searches.chunks(args.batch_size).enumerate() {
info!(
"Processing batch {} ({} searches)...",
batch_idx + 1,
chunk.len()
);
// Generate embeddings for this batch
let queries: Vec<String> = chunk.iter().map(|s| s.query.clone()).collect();
let embeddings_result = tokio::task::spawn({
let ollama_client = ollama.clone();
async move {
// Generate embeddings for the batch (one request at a time within the task)
let mut embeddings = Vec::new();
for query in &queries {
match ollama_client.generate_embedding(query).await {
Ok(emb) => embeddings.push(Some(emb)),
Err(e) => {
warn!("Failed to generate embedding for query '{}': {}", query, e);
embeddings.push(None);
}
}
}
embeddings
}
})
.await
.context("Failed to generate embeddings for batch")?;
// Build batch of searches with embeddings
let mut batch_inserts = Vec::new();
for (search, embedding_opt) in chunk.iter().zip(embeddings_result.iter()) {
// Check if search exists (optional for speed)
if args.skip_existing
&& let Ok(exists) =
dao_instance.search_exists(&context, search.timestamp, &search.query)
&& exists
{
skipped_count += 1;
continue;
}
// Only insert if we have an embedding
if let Some(embedding) = embedding_opt {
batch_inserts.push(InsertSearchRecord {
timestamp: search.timestamp,
query: search.query.clone(),
search_engine: search.search_engine.clone(),
embedding: embedding.clone(),
created_at,
source_file: Some(args.path.clone()),
});
} else {
error!(
"Skipping search '{}' due to missing embedding",
search.query
);
error_count += 1;
}
}
// Batch insert entire chunk in single transaction
if !batch_inserts.is_empty() {
match dao_instance.store_searches_batch(&context, batch_inserts) {
Ok(count) => {
inserted_count += count;
info!("Imported {} searches (total: {})...", count, inserted_count);
}
Err(e) => {
error!("Failed to store batch: {:?}", e);
error_count += chunk.len();
}
}
}
// Rate limiting between batches (skip the sleep after the final batch)
if (batch_idx + 1) * args.batch_size < searches.len() {
info!("Waiting 500ms before next batch...");
tokio::time::sleep(tokio::time::Duration::from_millis(500)).await;
}
}
info!("\n=== Import Summary ===");
info!("Total searches found: {}", searches.len());
info!("Successfully inserted: {}", inserted_count);
info!("Skipped (already exist): {}", skipped_count);
info!("Errors: {}", error_count);
info!("All imported searches have embeddings for semantic search");
Ok(())
}


@@ -1,195 +0,0 @@
use std::path::PathBuf;
use std::sync::{Arc, Mutex};
use chrono::Utc;
use clap::Parser;
use rayon::prelude::*;
use walkdir::WalkDir;
use image_api::database::models::InsertImageExif;
use image_api::database::{ExifDao, SqliteExifDao};
use image_api::exif;
#[derive(Parser, Debug)]
#[command(name = "migrate_exif")]
#[command(about = "Extract and store EXIF data from images", long_about = None)]
struct Args {
#[arg(long, help = "Skip files that already have EXIF data in database")]
skip_existing: bool,
}
fn main() -> anyhow::Result<()> {
env_logger::init();
dotenv::dotenv()?;
let args = Args::parse();
let base_path = dotenv::var("BASE_PATH")?;
let base = PathBuf::from(&base_path);
println!("EXIF Migration Tool");
println!("===================");
println!("Base path: {}", base.display());
if args.skip_existing {
println!("Mode: Skip existing (incremental)");
} else {
println!("Mode: Upsert (insert new, update existing)");
}
println!();
// Collect all image files that support EXIF
println!("Scanning for images...");
let image_files: Vec<PathBuf> = WalkDir::new(&base)
.into_iter()
.filter_map(|e| e.ok())
.filter(|e| e.file_type().is_file())
.filter(|e| exif::supports_exif(e.path()))
.map(|e| e.path().to_path_buf())
.collect();
println!("Found {} images to process", image_files.len());
if image_files.is_empty() {
println!("No EXIF-supporting images found. Exiting.");
return Ok(());
}
println!();
println!("Extracting EXIF data...");
// Create a thread-safe DAO
let dao = Arc::new(Mutex::new(SqliteExifDao::new()));
// Process in parallel using rayon
let results: Vec<_> = image_files
.par_iter()
.map(|path| {
// Create context for this processing iteration
let context = opentelemetry::Context::new();
let relative_path = match path.strip_prefix(&base) {
Ok(p) => p.to_str().unwrap().to_string(),
Err(_) => {
eprintln!(
"Error: Could not create relative path for {}",
path.display()
);
return Err(anyhow::anyhow!("Path error"));
}
};
// Check if EXIF data already exists
let existing = if let Ok(mut dao_lock) = dao.lock() {
dao_lock.get_exif(&context, &relative_path).ok().flatten()
} else {
eprintln!("{} - Failed to acquire database lock", relative_path);
return Err(anyhow::anyhow!("Lock error"));
};
// Skip if exists and skip_existing flag is set
if args.skip_existing && existing.is_some() {
return Ok(("skip".to_string(), relative_path));
}
match exif::extract_exif_from_path(path) {
Ok(exif_data) => {
let timestamp = Utc::now().timestamp();
let insert_exif = InsertImageExif {
file_path: relative_path.clone(),
camera_make: exif_data.camera_make,
camera_model: exif_data.camera_model,
lens_model: exif_data.lens_model,
width: exif_data.width,
height: exif_data.height,
orientation: exif_data.orientation,
gps_latitude: exif_data.gps_latitude.map(|v| v as f32),
gps_longitude: exif_data.gps_longitude.map(|v| v as f32),
gps_altitude: exif_data.gps_altitude.map(|v| v as f32),
focal_length: exif_data.focal_length.map(|v| v as f32),
aperture: exif_data.aperture.map(|v| v as f32),
shutter_speed: exif_data.shutter_speed,
iso: exif_data.iso,
date_taken: exif_data.date_taken,
created_time: existing
.as_ref()
.map(|e| e.created_time)
.unwrap_or(timestamp),
last_modified: timestamp,
};
// Store or update in database
if let Ok(mut dao_lock) = dao.lock() {
let result = if existing.is_some() {
// Update existing record
dao_lock
.update_exif(&context, insert_exif)
.map(|_| "update")
} else {
// Insert new record
dao_lock.store_exif(&context, insert_exif).map(|_| "insert")
};
match result {
Ok(action) => {
if action == "update" {
println!("{} (updated)", relative_path);
} else {
println!("{} (inserted)", relative_path);
}
Ok((action.to_string(), relative_path))
}
Err(e) => {
eprintln!("{} - Database error: {:?}", relative_path, e);
Err(anyhow::anyhow!("Database error"))
}
}
} else {
eprintln!("{} - Failed to acquire database lock", relative_path);
Err(anyhow::anyhow!("Lock error"))
}
}
Err(e) => {
eprintln!("{} - No EXIF data: {:?}", relative_path, e);
Err(e)
}
}
})
.collect();
// Count results
let mut success_count = 0;
let mut inserted_count = 0;
let mut updated_count = 0;
let mut skipped_count = 0;
for (action, _) in results.iter().flatten() {
success_count += 1;
match action.as_str() {
"insert" => inserted_count += 1,
"update" => updated_count += 1,
"skip" => skipped_count += 1,
_ => {}
}
}
// Skipped entries are Ok results, so they are already included in success_count;
// subtracting skipped_count again would undercount errors (and can underflow).
let error_count = results.len() - success_count;
println!();
println!("===================");
println!("Migration complete!");
println!("Total images processed: {}", image_files.len());
if inserted_count > 0 {
println!(" New EXIF records inserted: {}", inserted_count);
}
if updated_count > 0 {
println!(" Existing records updated: {}", updated_count);
}
if skipped_count > 0 {
println!(" Skipped (already exists): {}", skipped_count);
}
if error_count > 0 {
println!(" Errors (no EXIF data or failures): {}", error_count);
}
Ok(())
}


@@ -1,288 +0,0 @@
use anyhow::Result;
use chrono::NaiveDate;
use clap::Parser;
use image_api::ai::{OllamaClient, SmsApiClient, strip_summary_boilerplate};
use image_api::database::{DailySummaryDao, InsertDailySummary, SqliteDailySummaryDao};
use std::env;
use std::sync::{Arc, Mutex};
#[derive(Parser, Debug)]
#[command(author, version, about = "Test daily summary generation with different models and prompts", long_about = None)]
struct Args {
/// Contact name to generate summaries for
#[arg(short, long)]
contact: String,
/// Start date (YYYY-MM-DD)
#[arg(short, long)]
start: String,
/// End date (YYYY-MM-DD)
#[arg(short, long)]
end: String,
/// Optional: Override the model to use (e.g., "qwen2.5:32b", "llama3.1:30b")
#[arg(short, long)]
model: Option<String>,
/// Test mode: Generate but don't save to database (shows output only)
#[arg(short = 't', long, default_value_t = false)]
test_mode: bool,
/// Show message count and preview
#[arg(short, long, default_value_t = false)]
verbose: bool,
}
#[tokio::main]
async fn main() -> Result<()> {
// Load .env file
dotenv::dotenv().ok();
// Initialize logging
env_logger::Builder::from_env(env_logger::Env::default().default_filter_or("info")).init();
let args = Args::parse();
// Parse dates
let start_date = NaiveDate::parse_from_str(&args.start, "%Y-%m-%d")
.expect("Invalid start date format. Use YYYY-MM-DD");
let end_date = NaiveDate::parse_from_str(&args.end, "%Y-%m-%d")
.expect("Invalid end date format. Use YYYY-MM-DD");
println!("========================================");
println!("Daily Summary Generation Test Tool");
println!("========================================");
println!("Contact: {}", args.contact);
println!("Date range: {} to {}", start_date, end_date);
println!("Days: {}", (end_date - start_date).num_days() + 1);
if let Some(ref model) = args.model {
println!("Model: {}", model);
} else {
println!(
"Model: {} (from env)",
env::var("OLLAMA_PRIMARY_MODEL")
.or_else(|_| env::var("OLLAMA_MODEL"))
.unwrap_or_else(|_| "nemotron-3-nano:30b".to_string())
);
}
if args.test_mode {
println!("⚠ TEST MODE: Results will NOT be saved to database");
}
println!("========================================");
println!();
// Initialize AI clients
let ollama_primary_url = env::var("OLLAMA_PRIMARY_URL")
.or_else(|_| env::var("OLLAMA_URL"))
.unwrap_or_else(|_| "http://localhost:11434".to_string());
let ollama_fallback_url = env::var("OLLAMA_FALLBACK_URL").ok();
// Use provided model or fallback to env
let model_to_use = args.model.clone().unwrap_or_else(|| {
env::var("OLLAMA_PRIMARY_MODEL")
.or_else(|_| env::var("OLLAMA_MODEL"))
.unwrap_or_else(|_| "nemotron-3-nano:30b".to_string())
});
let ollama = OllamaClient::new(
ollama_primary_url,
ollama_fallback_url.clone(),
model_to_use.clone(),
Some(model_to_use), // Use same model for fallback
);
let sms_api_url =
env::var("SMS_API_URL").unwrap_or_else(|_| "http://localhost:8000".to_string());
let sms_api_token = env::var("SMS_API_TOKEN").ok();
let sms_client = SmsApiClient::new(sms_api_url, sms_api_token);
// Initialize DAO
let summary_dao: Arc<Mutex<Box<dyn DailySummaryDao>>> =
Arc::new(Mutex::new(Box::new(SqliteDailySummaryDao::new())));
// Fetch messages for contact
println!("Fetching messages for {}...", args.contact);
let all_messages = sms_client
.fetch_all_messages_for_contact(&args.contact)
.await?;
println!(
"Found {} total messages for {}",
all_messages.len(),
args.contact
);
println!();
// Filter to date range and group by date
let mut messages_by_date = std::collections::HashMap::new();
for msg in all_messages {
if let Some(dt) = chrono::DateTime::from_timestamp(msg.timestamp, 0) {
let date = dt.date_naive();
if date >= start_date && date <= end_date {
messages_by_date
.entry(date)
.or_insert_with(Vec::new)
.push(msg);
}
}
}
if messages_by_date.is_empty() {
println!("⚠ No messages found in date range");
return Ok(());
}
println!("Found {} days with messages", messages_by_date.len());
println!();
// Sort dates
let mut dates: Vec<NaiveDate> = messages_by_date.keys().cloned().collect();
dates.sort();
// Process each day
for (idx, date) in dates.iter().enumerate() {
let messages = messages_by_date.get(date).unwrap();
let date_str = date.format("%Y-%m-%d").to_string();
let weekday = date.format("%A");
println!("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
println!(
"Day {}/{}: {} ({}) - {} messages",
idx + 1,
dates.len(),
date_str,
weekday,
messages.len()
);
println!("━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━");
if args.verbose {
println!("\nMessage preview:");
for (i, msg) in messages.iter().take(3).enumerate() {
let sender = if msg.is_sent { "Me" } else { &msg.contact };
let preview = msg.body.chars().take(60).collect::<String>();
let truncated = if msg.body.chars().count() > 60 { "..." } else { "" };
println!("  {}. {}: {}{}", i + 1, sender, preview, truncated);
}
if messages.len() > 3 {
println!(" ... and {} more", messages.len() - 3);
}
println!();
}
// Format messages for LLM
let messages_text: String = messages
.iter()
.take(200)
.map(|m| {
if m.is_sent {
format!("Me: {}", m.body)
} else {
format!("{}: {}", m.contact, m.body)
}
})
.collect::<Vec<_>>()
.join("\n");
let prompt = format!(
r#"Summarize this day's conversation between me and {}.
CRITICAL FORMAT RULES:
- Do NOT start with "Based on the conversation..." or "Here is a summary..." or similar preambles
- Do NOT repeat the date at the beginning
- Start DIRECTLY with the content - begin with a person's name or action
- Write in past tense, as if recording what happened
NARRATIVE (3-5 sentences):
- What specific topics, activities, or events were discussed?
- What places, people, or organizations were mentioned?
- What plans were made or decisions discussed?
- Clearly distinguish between what "I" did versus what {} did
KEYWORDS (comma-separated):
5-10 specific keywords that capture this conversation's unique content:
- Proper nouns (people, places, brands)
- Specific activities ("drum corps audition" not just "music")
- Distinctive terms that make this day unique
Date: {} ({})
Messages:
{}
YOUR RESPONSE (follow this format EXACTLY):
Summary: [Start directly with content, NO preamble]
Keywords: [specific, unique terms]"#,
args.contact,
args.contact,
date.format("%B %d, %Y"),
weekday,
messages_text
);
println!("Generating summary...");
let summary = ollama
.generate(
&prompt,
Some("You are a conversation summarizer. Create clear, factual summaries with precise subject attribution AND extract distinctive keywords. Focus on specific, unique terms that differentiate this conversation from others."),
)
.await?;
println!("\n📝 GENERATED SUMMARY:");
println!("─────────────────────────────────────────");
println!("{}", summary.trim());
println!("─────────────────────────────────────────");
if !args.test_mode {
println!("\nStripping boilerplate for embedding...");
let stripped = strip_summary_boilerplate(&summary);
println!(
"Stripped: {}...",
stripped.chars().take(80).collect::<String>()
);
println!("\nGenerating embedding...");
let embedding = ollama.generate_embedding(&stripped).await?;
println!("✓ Embedding generated ({} dimensions)", embedding.len());
println!("Saving to database...");
let insert = InsertDailySummary {
date: date_str.clone(),
contact: args.contact.clone(),
summary: summary.trim().to_string(),
message_count: messages.len() as i32,
embedding,
created_at: chrono::Utc::now().timestamp(),
// model_version: "nomic-embed-text:v1.5".to_string(),
model_version: "mxbai-embed-large:335m".to_string(),
};
let mut dao = summary_dao.lock().expect("Unable to lock DailySummaryDao");
let context = opentelemetry::Context::new();
match dao.store_summary(&context, insert) {
Ok(_) => println!("✓ Saved to database"),
Err(e) => println!("✗ Database error: {:?}", e),
}
} else {
println!("\n⚠ TEST MODE: Not saved to database");
}
println!();
// Rate limiting between days
if idx < dates.len() - 1 {
tokio::time::sleep(tokio::time::Duration::from_millis(500)).await;
}
}
println!("========================================");
println!("✓ Complete!");
println!("Processed {} days", dates.len());
println!("========================================");
Ok(())
}
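The tool above buckets fetched messages by calendar date before summarizing each day. A minimal, dependency-free sketch of that grouping step, using integer UTC day indices in place of chrono's `NaiveDate` (the `Msg` struct here is a hypothetical stand-in for the real message type):

```rust
use std::collections::HashMap;

// Hypothetical message shape; the real tool derives a NaiveDate via chrono.
struct Msg {
    timestamp: i64, // Unix seconds
    body: String,
}

// Bucket messages by UTC day index (seconds since epoch / 86,400).
// div_euclid keeps pre-1970 timestamps in the correct (negative) bucket.
fn group_by_day(msgs: Vec<Msg>) -> HashMap<i64, Vec<Msg>> {
    let mut by_day: HashMap<i64, Vec<Msg>> = HashMap::new();
    for m in msgs {
        by_day.entry(m.timestamp.div_euclid(86_400)).or_default().push(m);
    }
    by_day
}
```

The date-range filter then reduces to a simple comparison on the day index before the per-day summarization loop.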

View File

@@ -1,154 +0,0 @@
use crate::database::{ExifDao, FavoriteDao};
use crate::tags::TagDao;
use anyhow::Result;
use log::{error, info};
use opentelemetry;
use std::sync::{Arc, Mutex};
pub struct DatabaseUpdater {
tag_dao: Arc<Mutex<dyn TagDao>>,
exif_dao: Arc<Mutex<dyn ExifDao>>,
favorites_dao: Arc<Mutex<dyn FavoriteDao>>,
}
impl DatabaseUpdater {
pub fn new(
tag_dao: Arc<Mutex<dyn TagDao>>,
exif_dao: Arc<Mutex<dyn ExifDao>>,
favorites_dao: Arc<Mutex<dyn FavoriteDao>>,
) -> Self {
Self {
tag_dao,
exif_dao,
favorites_dao,
}
}
/// Update a file path across all three database tables (tagged_photo, image_exif, favorites).
/// Logs and counts per-table failures; returns Ok(()) if at least one table was updated
/// and Err only when every update fails.
pub fn update_file_path(&mut self, old_path: &str, new_path: &str) -> Result<()> {
let context = opentelemetry::Context::current();
let mut success_count = 0;
let mut error_count = 0;
// Update tagged_photo table
if let Ok(mut dao) = self.tag_dao.lock() {
match dao.update_photo_name(old_path, new_path, &context) {
Ok(_) => {
info!("Updated tagged_photo: {} -> {}", old_path, new_path);
success_count += 1;
}
Err(e) => {
error!("Failed to update tagged_photo for {}: {:?}", old_path, e);
error_count += 1;
}
}
} else {
error!("Failed to acquire lock on TagDao");
error_count += 1;
}
// Update image_exif table
if let Ok(mut dao) = self.exif_dao.lock() {
match dao.update_file_path(&context, old_path, new_path) {
Ok(_) => {
info!("Updated image_exif: {} -> {}", old_path, new_path);
success_count += 1;
}
Err(e) => {
error!("Failed to update image_exif for {}: {:?}", old_path, e);
error_count += 1;
}
}
} else {
error!("Failed to acquire lock on ExifDao");
error_count += 1;
}
// Update favorites table
if let Ok(mut dao) = self.favorites_dao.lock() {
match dao.update_path(old_path, new_path) {
Ok(_) => {
info!("Updated favorites: {} -> {}", old_path, new_path);
success_count += 1;
}
Err(e) => {
error!("Failed to update favorites for {}: {:?}", old_path, e);
error_count += 1;
}
}
} else {
error!("Failed to acquire lock on FavoriteDao");
error_count += 1;
}
if success_count > 0 {
info!(
"Updated {}/{} tables for {} -> {}",
success_count,
success_count + error_count,
old_path,
new_path
);
Ok(())
} else {
Err(anyhow::anyhow!(
"Failed to update any tables for {} -> {}",
old_path,
new_path
))
}
}
/// Get all file paths from all three database tables
pub fn get_all_file_paths(&mut self) -> Result<Vec<String>> {
let context = opentelemetry::Context::current();
let mut all_paths = Vec::new();
// Get from tagged_photo
if let Ok(mut dao) = self.tag_dao.lock() {
match dao.get_all_photo_names(&context) {
Ok(paths) => {
info!("Found {} paths in tagged_photo", paths.len());
all_paths.extend(paths);
}
Err(e) => {
error!("Failed to get paths from tagged_photo: {:?}", e);
}
}
}
// Get from image_exif
if let Ok(mut dao) = self.exif_dao.lock() {
match dao.get_all_file_paths(&context) {
Ok(paths) => {
info!("Found {} paths in image_exif", paths.len());
all_paths.extend(paths);
}
Err(e) => {
error!("Failed to get paths from image_exif: {:?}", e);
}
}
}
// Get from favorites
if let Ok(mut dao) = self.favorites_dao.lock() {
match dao.get_all_paths() {
Ok(paths) => {
info!("Found {} paths in favorites", paths.len());
all_paths.extend(paths);
}
Err(e) => {
error!("Failed to get paths from favorites: {:?}", e);
}
}
}
// Deduplicate
all_paths.sort();
all_paths.dedup();
info!("Total unique paths across all tables: {}", all_paths.len());
Ok(all_paths)
}
}
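`DatabaseUpdater::update_file_path` applies the same rename to three stores and succeeds if at least one write lands. The best-effort pattern it uses, stripped of the DAO machinery, can be sketched like this (names hypothetical):

```rust
// Best-effort write across several stores: count successes, collect errors,
// and fail only when every operation fails.
fn best_effort(ops: Vec<Box<dyn FnMut() -> Result<(), String>>>) -> Result<usize, String> {
    let mut succeeded = 0;
    let mut errors = Vec::new();
    for mut op in ops {
        match op() {
            Ok(()) => succeeded += 1,
            Err(e) => errors.push(e),
        }
    }
    if succeeded > 0 {
        Ok(succeeded) // partial success still counts as success
    } else {
        Err(errors.join("; "))
    }
}
```

The trade-off is that a partial failure leaves the stores inconsistent, which is why the real code logs every per-table error even on the Ok path.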

View File

@@ -1,103 +0,0 @@
use anyhow::{Context, Result};
use std::fs::File;
use std::io::Read;
use std::path::Path;
/// Detect the actual file type by reading the magic number (file header)
/// Returns the canonical extension for the detected type, or None if unknown
pub fn detect_file_type(path: &Path) -> Result<Option<String>> {
let mut file = File::open(path).with_context(|| format!("Failed to open file: {:?}", path))?;
// Read first 512 bytes for magic number detection
let mut buffer = vec![0; 512];
let bytes_read = file
.read(&mut buffer)
.with_context(|| format!("Failed to read file: {:?}", path))?;
buffer.truncate(bytes_read);
// Detect type using infer crate
let detected_type = infer::get(&buffer);
Ok(detected_type.map(|t| get_canonical_extension(t.mime_type())))
}
/// Map MIME type to canonical file extension
pub fn get_canonical_extension(mime_type: &str) -> String {
match mime_type {
// Images
"image/jpeg" => "jpg",
"image/png" => "png",
"image/webp" => "webp",
"image/tiff" => "tiff",
"image/heif" | "image/heic" => "heic",
"image/avif" => "avif",
// Videos
"video/mp4" => "mp4",
"video/quicktime" => "mov",
// Fallback: use the last part of MIME type
_ => mime_type.split('/').next_back().unwrap_or("unknown"),
}
.to_string()
}
/// Check if a file should be renamed based on current vs detected extension
/// Treats aliases (jpg/jpeg and tiff/tif) as equivalent, so they never trigger a rename
pub fn should_rename(current_ext: &str, detected_ext: &str) -> bool {
let current = current_ext.to_lowercase();
let detected = detected_ext.to_lowercase();
// Direct match
if current == detected {
return false;
}
// Handle JPEG aliases (jpg and jpeg are equivalent)
if (current == "jpg" || current == "jpeg") && (detected == "jpg" || detected == "jpeg") {
return false;
}
// Handle TIFF aliases (tiff and tif are equivalent)
if (current == "tiff" || current == "tif") && (detected == "tiff" || detected == "tif") {
return false;
}
// Extensions differ and are not aliases
true
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_get_canonical_extension() {
assert_eq!(get_canonical_extension("image/jpeg"), "jpg");
assert_eq!(get_canonical_extension("image/png"), "png");
assert_eq!(get_canonical_extension("image/webp"), "webp");
assert_eq!(get_canonical_extension("video/mp4"), "mp4");
assert_eq!(get_canonical_extension("video/quicktime"), "mov");
}
#[test]
fn test_should_rename() {
// Same extension - no rename
assert!(!should_rename("jpg", "jpg"));
assert!(!should_rename("png", "png"));
// JPEG aliases - no rename
assert!(!should_rename("jpg", "jpeg"));
assert!(!should_rename("jpeg", "jpg"));
assert!(!should_rename("JPG", "jpeg"));
// TIFF aliases - no rename
assert!(!should_rename("tiff", "tif"));
assert!(!should_rename("tif", "tiff"));
// Different types - should rename
assert!(should_rename("png", "jpg"));
assert!(should_rename("jpg", "png"));
assert!(should_rename("webp", "png"));
}
}
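`detect_file_type` above delegates signature sniffing to the `infer` crate. For illustration only, a dependency-free sketch of the same magic-number idea for a few of the formats handled here (the real code covers many more types via `infer::get`):

```rust
// Map a file header to a canonical extension by its magic bytes.
// Covers only a few signatures; this is a sketch, not the crate's full table.
fn sniff_extension(header: &[u8]) -> Option<&'static str> {
    match header {
        // PNG: 0x89 "PNG" \r \n 0x1A \n
        [0x89, b'P', b'N', b'G', 0x0D, 0x0A, 0x1A, 0x0A, ..] => Some("png"),
        // JPEG: FF D8 FF
        [0xFF, 0xD8, 0xFF, ..] => Some("jpg"),
        // WebP: RIFF container with a "WEBP" type tag at offset 8
        [b'R', b'I', b'F', b'F', _, _, _, _, b'W', b'E', b'B', b'P', ..] => Some("webp"),
        _ => None,
    }
}
```

Reading 512 bytes, as the code above does, is more than enough for these fixed-offset signatures.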

View File

@@ -1,11 +0,0 @@
pub mod database_updater;
pub mod file_type_detector;
pub mod phase1;
pub mod phase2;
pub mod types;
pub use database_updater::DatabaseUpdater;
pub use file_type_detector::{detect_file_type, get_canonical_extension, should_rename};
pub use phase1::resolve_missing_files;
pub use phase2::validate_file_types;
pub use types::{CleanupConfig, CleanupStats, FileIssue, IssueType};

View File

@@ -1,147 +0,0 @@
use crate::cleanup::database_updater::DatabaseUpdater;
use crate::cleanup::types::{CleanupConfig, CleanupStats};
use crate::file_types::IMAGE_EXTENSIONS;
use anyhow::Result;
use log::{error, warn};
use std::path::PathBuf;
// All supported image extensions to try
const SUPPORTED_EXTENSIONS: &[&str] = IMAGE_EXTENSIONS;
/// Phase 1: Resolve missing files by searching for alternative extensions
pub fn resolve_missing_files(
config: &CleanupConfig,
db_updater: &mut DatabaseUpdater,
) -> Result<CleanupStats> {
let mut stats = CleanupStats::new();
println!("\nPhase 1: Missing File Resolution");
println!("---------------------------------");
// Get all file paths from database
println!("Scanning database for file references...");
let all_paths = db_updater.get_all_file_paths()?;
println!("Found {} unique file paths\n", all_paths.len());
stats.files_checked = all_paths.len();
println!("Checking file existence...");
let mut missing_count = 0;
let mut resolved_count = 0;
for path_str in all_paths {
let full_path = config.base_path.join(&path_str);
// Check if file exists
if full_path.exists() {
continue;
}
missing_count += 1;
stats.issues_found += 1;
// Try to find the file with different extensions
match find_file_with_alternative_extension(&config.base_path, &path_str) {
Some(new_path_str) => {
println!(
"{} → found as {} {}",
path_str,
new_path_str,
if config.dry_run {
"(dry-run, not updated)"
} else {
""
}
);
if !config.dry_run {
// Update database
match db_updater.update_file_path(&path_str, &new_path_str) {
Ok(_) => {
resolved_count += 1;
stats.issues_fixed += 1;
}
Err(e) => {
error!("Failed to update database for {}: {:?}", path_str, e);
stats.add_error(format!("DB update failed for {}: {}", path_str, e));
}
}
} else {
resolved_count += 1;
}
}
None => {
warn!("✗ {} → not found with any extension", path_str);
}
}
}
println!("\nResults:");
println!("- Files checked: {}", stats.files_checked);
println!("- Missing files: {}", missing_count);
println!("- Resolved: {}", resolved_count);
println!(
"- Still missing: {}",
missing_count - if config.dry_run { 0 } else { resolved_count }
);
if !stats.errors.is_empty() {
println!("- Errors: {}", stats.errors.len());
}
Ok(stats)
}
/// Find a file with an alternative extension
/// Returns the relative path with the new extension if found
fn find_file_with_alternative_extension(
base_path: &PathBuf,
relative_path: &str,
) -> Option<String> {
let full_path = base_path.join(relative_path);
// Get the parent directory and file stem (name without extension)
let parent = full_path.parent()?;
let stem = full_path.file_stem()?.to_str()?;
// Try each supported extension
for ext in SUPPORTED_EXTENSIONS {
let test_path = parent.join(format!("{}.{}", stem, ext));
if test_path.exists() {
// Convert back to relative path
if let Ok(rel) = test_path.strip_prefix(base_path)
&& let Some(rel_str) = rel.to_str()
{
return Some(rel_str.to_string());
}
}
}
None
}
#[cfg(test)]
mod tests {
use super::*;
use std::fs;
use tempfile::TempDir;
#[test]
fn test_find_file_with_alternative_extension() {
let temp_dir = TempDir::new().unwrap();
let base_path = temp_dir.path().to_path_buf();
// Create a test file with .jpeg extension
let test_file = base_path.join("test.jpeg");
fs::write(&test_file, b"test").unwrap();
// Try to find it as .jpg
let result = find_file_with_alternative_extension(&base_path, "test.jpg");
assert!(result.is_some());
assert_eq!(result.unwrap(), "test.jpeg");
// Try to find non-existent file
let result = find_file_with_alternative_extension(&base_path, "nonexistent.jpg");
assert!(result.is_none());
}
}

View File

@@ -1,241 +0,0 @@
use crate::cleanup::database_updater::DatabaseUpdater;
use crate::cleanup::file_type_detector::{detect_file_type, should_rename};
use crate::cleanup::types::{CleanupConfig, CleanupStats};
use anyhow::Result;
use log::{error, warn};
use std::fs;
use std::path::{Path, PathBuf};
use walkdir::WalkDir;
/// Phase 2: Validate file types and rename mismatches
pub fn validate_file_types(
config: &CleanupConfig,
db_updater: &mut DatabaseUpdater,
) -> Result<CleanupStats> {
let mut stats = CleanupStats::new();
let mut auto_fix_all = config.auto_fix;
let mut skip_all = false;
println!("\nPhase 2: File Type Validation");
println!("------------------------------");
// Walk the filesystem
println!("Scanning filesystem...");
let files: Vec<PathBuf> = WalkDir::new(&config.base_path)
.into_iter()
.filter_map(|e| e.ok())
.filter(|e| e.file_type().is_file())
.filter(|e| is_supported_media_file(e.path()))
.map(|e| e.path().to_path_buf())
.collect();
println!("Files found: {}\n", files.len());
stats.files_checked = files.len();
println!("Detecting file types...");
let mut mismatches_found = 0;
let mut files_renamed = 0;
let mut user_skipped = 0;
for file_path in files {
// Get current extension
let current_ext = match file_path.extension() {
Some(ext) => ext.to_str().unwrap_or(""),
None => continue, // Skip files without extensions
};
// Detect actual file type
match detect_file_type(&file_path) {
Ok(Some(detected_ext)) => {
// Check if we should rename
if should_rename(current_ext, &detected_ext) {
mismatches_found += 1;
stats.issues_found += 1;
// Get relative path for display and database
let relative_path = match file_path.strip_prefix(&config.base_path) {
Ok(rel) => rel.to_str().unwrap_or(""),
Err(_) => {
error!("Failed to get relative path for {:?}", file_path);
continue;
}
};
println!("\nFile type mismatch:");
println!(" Path: {}", relative_path);
println!(" Current: .{}", current_ext);
println!(" Actual: .{}", detected_ext);
// Calculate new path
let new_file_path = file_path.with_extension(&detected_ext);
let new_relative_path = match new_file_path.strip_prefix(&config.base_path) {
Ok(rel) => rel.to_str().unwrap_or(""),
Err(_) => {
error!("Failed to get new relative path for {:?}", new_file_path);
continue;
}
};
// Check if destination already exists
if new_file_path.exists() {
warn!("✗ Destination already exists: {}", new_relative_path);
stats.add_error(format!(
"Destination exists for {}: {}",
relative_path, new_relative_path
));
continue;
}
// Determine if we should proceed
let should_proceed = if config.dry_run {
println!(" (dry-run mode - would rename to {})", new_relative_path);
false
} else if skip_all {
println!(" Skipped (skip all)");
user_skipped += 1;
false
} else if auto_fix_all {
true
} else {
// Interactive prompt
match prompt_for_rename(new_relative_path) {
RenameDecision::Yes => true,
RenameDecision::No => {
user_skipped += 1;
false
}
RenameDecision::All => {
auto_fix_all = true;
true
}
RenameDecision::SkipAll => {
skip_all = true;
user_skipped += 1;
false
}
}
};
if should_proceed {
// Rename the file
match fs::rename(&file_path, &new_file_path) {
Ok(_) => {
println!("✓ Renamed file");
// Update database
match db_updater.update_file_path(relative_path, new_relative_path)
{
Ok(_) => {
files_renamed += 1;
stats.issues_fixed += 1;
}
Err(e) => {
error!(
"File renamed but DB update failed for {}: {:?}",
relative_path, e
);
stats.add_error(format!(
"DB update failed for {}: {}",
relative_path, e
));
}
}
}
Err(e) => {
error!("✗ Failed to rename file: {:?}", e);
stats.add_error(format!(
"Rename failed for {}: {}",
relative_path, e
));
}
}
}
}
}
Ok(None) => {
// Could not detect file type - skip
// This is normal for some RAW formats or corrupted files
}
Err(e) => {
warn!("Failed to detect type for {:?}: {:?}", file_path, e);
}
}
}
println!("\nResults:");
println!("- Files scanned: {}", stats.files_checked);
println!("- Mismatches found: {}", mismatches_found);
if config.dry_run {
println!("- Would rename: {}", mismatches_found);
} else {
println!("- Files renamed: {}", files_renamed);
if user_skipped > 0 {
println!("- User skipped: {}", user_skipped);
}
}
if !stats.errors.is_empty() {
println!("- Errors: {}", stats.errors.len());
}
Ok(stats)
}
/// Check if a file is a supported media file based on extension
fn is_supported_media_file(path: &Path) -> bool {
use crate::file_types::is_media_file;
is_media_file(path)
}
#[derive(Debug)]
enum RenameDecision {
Yes,
No,
All,
SkipAll,
}
/// Prompt the user for rename decision
fn prompt_for_rename(new_path: &str) -> RenameDecision {
println!("\nRename to {}?", new_path);
println!(" [y] Yes");
println!(" [n] No (default)");
println!(" [a] Yes to all");
println!(" [s] Skip all remaining");
print!("Choice: ");
// Force flush stdout
use std::io::{self, Write};
let _ = io::stdout().flush();
let mut input = String::new();
match io::stdin().read_line(&mut input) {
Ok(_) => {
let choice = input.trim().to_lowercase();
match choice.as_str() {
"y" | "yes" => RenameDecision::Yes,
"a" | "all" => RenameDecision::All,
"s" | "skip" => RenameDecision::SkipAll,
_ => RenameDecision::No,
}
}
Err(_) => RenameDecision::No,
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_is_supported_media_file() {
assert!(is_supported_media_file(Path::new("test.jpg")));
assert!(is_supported_media_file(Path::new("test.JPG")));
assert!(is_supported_media_file(Path::new("test.png")));
assert!(is_supported_media_file(Path::new("test.webp")));
assert!(is_supported_media_file(Path::new("test.mp4")));
assert!(is_supported_media_file(Path::new("test.mov")));
assert!(!is_supported_media_file(Path::new("test.txt")));
assert!(!is_supported_media_file(Path::new("test")));
}
}

View File

@@ -1,39 +0,0 @@
use std::path::PathBuf;
#[derive(Debug, Clone)]
pub struct CleanupConfig {
pub base_path: PathBuf,
pub dry_run: bool,
pub auto_fix: bool,
}
#[derive(Debug, Clone)]
pub struct FileIssue {
pub current_path: String,
pub issue_type: IssueType,
pub suggested_path: Option<String>,
}
#[derive(Debug, Clone)]
pub enum IssueType {
MissingFile,
ExtensionMismatch { current: String, actual: String },
}
#[derive(Debug, Clone, Default)]
pub struct CleanupStats {
pub files_checked: usize,
pub issues_found: usize,
pub issues_fixed: usize,
pub errors: Vec<String>,
}
impl CleanupStats {
pub fn new() -> Self {
Self::default()
}
pub fn add_error(&mut self, error: String) {
self.errors.push(error);
}
}

View File

@@ -1,15 +1,14 @@
use std::{fs, str::FromStr};
use crate::database::models::ImageExif;
use anyhow::{Context, anyhow};
use anyhow::{anyhow, Context};
use chrono::{DateTime, Utc};
use log::error;
use actix_web::error::ErrorUnauthorized;
use actix_web::{Error, FromRequest, HttpRequest, dev, http::header};
use futures::future::{Ready, err, ok};
use jsonwebtoken::{Algorithm, DecodingKey, Validation, decode};
use actix_web::{dev, http::header, Error, FromRequest, HttpRequest};
use futures::future::{err, ok, Ready};
use jsonwebtoken::{decode, Algorithm, DecodingKey, Validation};
use serde::{Deserialize, Serialize};
#[derive(Serialize)]
@@ -17,27 +16,12 @@ pub struct Token<'a> {
pub token: &'a str,
}
#[derive(Debug, Clone, Deserialize, Serialize)]
#[derive(Debug, Deserialize, Serialize)]
pub struct Claims {
pub sub: String,
pub exp: i64,
}
#[cfg(test)]
pub mod helper {
use super::Claims;
use chrono::{Duration, Utc};
impl Claims {
pub fn valid_user(user_id: String) -> Self {
Claims {
sub: user_id,
exp: (Utc::now() + Duration::minutes(1)).timestamp(),
}
}
}
}
pub fn secret_key() -> String {
if cfg!(test) {
String::from("test_key")
@@ -50,9 +34,7 @@ impl FromStr for Claims {
type Err = jsonwebtoken::errors::Error;
fn from_str(s: &str) -> Result<Self, Self::Err> {
let token = s.strip_prefix("Bearer ").ok_or_else(|| {
jsonwebtoken::errors::Error::from(jsonwebtoken::errors::ErrorKind::InvalidToken)
})?;
let token = *(s.split("Bearer ").collect::<Vec<_>>().last().unwrap_or(&""));
match decode::<Claims>(
token,
@@ -71,6 +53,7 @@ impl FromStr for Claims {
impl FromRequest for Claims {
type Error = Error;
type Future = Ready<Result<Self, Self::Error>>;
type Config = ();
fn from_request(req: &HttpRequest, _payload: &mut dev::Payload) -> Self::Future {
req.headers()
@@ -101,109 +84,12 @@ impl FromRequest for Claims {
pub struct PhotosResponse {
pub photos: Vec<String>,
pub dirs: Vec<String>,
// Pagination metadata (only present when limit is set)
#[serde(skip_serializing_if = "Option::is_none")]
pub total_count: Option<i64>,
#[serde(skip_serializing_if = "Option::is_none")]
pub has_more: Option<bool>,
#[serde(skip_serializing_if = "Option::is_none")]
pub next_offset: Option<i64>,
}
#[derive(Copy, Clone, Deserialize, PartialEq, Debug)]
#[serde(rename_all = "lowercase")]
pub enum SortType {
Shuffle,
NameAsc,
NameDesc,
TagCountAsc,
TagCountDesc,
DateTakenAsc,
DateTakenDesc,
}
#[derive(Deserialize)]
pub struct FilesRequest {
pub path: String,
// comma separated numbers
pub tag_ids: Option<String>,
pub exclude_tag_ids: Option<String>,
pub tag_filter_mode: Option<FilterMode>,
pub recursive: Option<bool>,
pub sort: Option<SortType>,
// EXIF-based search parameters
pub camera_make: Option<String>,
pub camera_model: Option<String>,
pub lens_model: Option<String>,
// GPS location search
pub gps_lat: Option<f64>,
pub gps_lon: Option<f64>,
pub gps_radius_km: Option<f64>,
// Date range filtering (Unix timestamps)
pub date_from: Option<i64>,
pub date_to: Option<i64>,
// Media type filtering
pub media_type: Option<MediaType>,
// Pagination parameters (optional - backward compatible)
pub limit: Option<i64>,
pub offset: Option<i64>,
}
#[derive(Copy, Clone, Deserialize, PartialEq, Debug)]
pub enum FilterMode {
Any,
All,
}
#[derive(Copy, Clone, Deserialize, PartialEq, Debug)]
#[serde(rename_all = "lowercase")]
pub enum MediaType {
Photo,
Video,
All,
}
#[derive(Copy, Clone, Deserialize, PartialEq, Debug)]
#[serde(rename_all = "lowercase")]
pub enum PhotoSize {
Full,
Thumb,
}
#[derive(Debug, Deserialize)]
pub struct ThumbnailRequest {
pub(crate) path: String,
#[allow(dead_code)] // Part of API contract, may be used in future
pub(crate) size: Option<PhotoSize>,
#[serde(default)]
#[allow(dead_code)] // Part of API contract, may be used in future
pub(crate) format: Option<ThumbnailFormat>,
#[serde(default)]
pub(crate) shape: Option<ThumbnailShape>,
}
#[derive(Debug, Deserialize, PartialEq)]
pub enum ThumbnailFormat {
#[serde(rename = "gif")]
Gif,
#[serde(rename = "image")]
Image,
}
#[derive(Debug, Deserialize, PartialEq)]
pub enum ThumbnailShape {
#[serde(rename = "circle")]
Circle,
#[serde(rename = "square")]
Square,
pub path: String,
pub size: Option<String>,
}
#[derive(Deserialize)]
@@ -229,8 +115,6 @@ pub struct MetadataResponse {
pub created: Option<i64>,
pub modified: Option<i64>,
pub size: u64,
pub exif: Option<ExifMetadata>,
pub filename_date: Option<i64>, // Date extracted from filename
}
impl From<fs::Metadata> for MetadataResponse {
@@ -245,103 +129,6 @@ impl From<fs::Metadata> for MetadataResponse {
utc.timestamp()
}),
size: metadata.len(),
exif: None,
filename_date: None, // Will be set in endpoint handler
}
}
}
#[derive(Debug, Serialize)]
pub struct ExifMetadata {
pub camera: Option<CameraInfo>,
pub image_properties: Option<ImageProperties>,
pub gps: Option<GpsCoordinates>,
pub capture_settings: Option<CaptureSettings>,
pub date_taken: Option<i64>,
}
#[derive(Debug, Serialize)]
pub struct CameraInfo {
pub make: Option<String>,
pub model: Option<String>,
pub lens: Option<String>,
}
#[derive(Debug, Serialize)]
pub struct ImageProperties {
pub width: Option<i32>,
pub height: Option<i32>,
pub orientation: Option<i32>,
}
#[derive(Debug, Serialize)]
pub struct GpsCoordinates {
pub latitude: Option<f64>,
pub longitude: Option<f64>,
pub altitude: Option<f64>,
}
#[derive(Debug, Serialize)]
pub struct CaptureSettings {
pub focal_length: Option<f64>,
pub aperture: Option<f64>,
pub shutter_speed: Option<String>,
pub iso: Option<i32>,
}
impl From<ImageExif> for ExifMetadata {
fn from(exif: ImageExif) -> Self {
let has_camera_info =
exif.camera_make.is_some() || exif.camera_model.is_some() || exif.lens_model.is_some();
let has_image_properties =
exif.width.is_some() || exif.height.is_some() || exif.orientation.is_some();
let has_gps = exif.gps_latitude.is_some()
|| exif.gps_longitude.is_some()
|| exif.gps_altitude.is_some();
let has_capture_settings = exif.focal_length.is_some()
|| exif.aperture.is_some()
|| exif.shutter_speed.is_some()
|| exif.iso.is_some();
ExifMetadata {
camera: if has_camera_info {
Some(CameraInfo {
make: exif.camera_make,
model: exif.camera_model,
lens: exif.lens_model,
})
} else {
None
},
image_properties: if has_image_properties {
Some(ImageProperties {
width: exif.width,
height: exif.height,
orientation: exif.orientation,
})
} else {
None
},
gps: if has_gps {
Some(GpsCoordinates {
latitude: exif.gps_latitude.map(|v| v as f64),
longitude: exif.gps_longitude.map(|v| v as f64),
altitude: exif.gps_altitude.map(|v| v as f64),
})
} else {
None
},
capture_settings: if has_capture_settings {
Some(CaptureSettings {
focal_length: exif.focal_length.map(|v| v as f64),
aperture: exif.aperture.map(|v| v as f64),
shutter_speed: exif.shutter_speed,
iso: exif.iso,
})
} else {
None
},
date_taken: exif.date_taken,
}
}
}
@@ -352,48 +139,6 @@ pub struct AddTagRequest {
pub tag_name: String,
}
#[derive(Deserialize)]
pub struct GetTagsRequest {
pub path: Option<String>,
}
#[derive(Debug, Serialize)]
pub struct GpsPhotoSummary {
pub path: String,
pub lat: f64,
pub lon: f64,
pub date_taken: Option<i64>,
}
#[derive(Debug, Serialize)]
pub struct GpsPhotosResponse {
pub photos: Vec<GpsPhotoSummary>,
pub total: usize,
}
#[derive(Deserialize)]
pub struct PreviewClipRequest {
pub path: String,
}
#[derive(Deserialize)]
pub struct PreviewStatusRequest {
pub paths: Vec<String>,
}
#[derive(Serialize)]
pub struct PreviewStatusResponse {
pub previews: Vec<PreviewStatusItem>,
}
#[derive(Serialize)]
pub struct PreviewStatusItem {
pub path: String,
pub status: String,
#[serde(skip_serializing_if = "Option::is_none")]
pub preview_url: Option<String>,
}
#[cfg(test)]
mod tests {
use super::Claims;
@@ -408,7 +153,7 @@ mod tests {
};
let c = Claims::from_str(
"Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiI5IiwiZXhwIjoxNjEzNjE2NDc5MH0.9wwK4l8vhvq55YoueEljMbN_5uVTaAsGLLRPr0AuymE")
"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiI5IiwiZXhwIjoxNjEzNjE2NDc5MH0.9wwK4l8vhvq55YoueEljMbN_5uVTaAsGLLRPr0AuymE")
.unwrap();
assert_eq!(claims.sub, c.sub);
@@ -418,8 +163,7 @@ mod tests {
#[test]
fn test_expired_token() {
let err = Claims::from_str(
"Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiI5IiwiZXhwIjoxNn0.eZnfaNfiD54VMbphIqeBICeG9SzAtwNXntLwtTBihjY",
);
"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiI5IiwiZXhwIjoxNn0.eZnfaNfiD54VMbphIqeBICeG9SzAtwNXntLwtTBihjY");
match err.unwrap_err().into_kind() {
ErrorKind::ExpiredSignature => assert!(true),
@@ -442,54 +186,4 @@ mod tests {
}
}
}
#[test]
fn test_preview_clip_request_deserialize() {
use super::PreviewClipRequest;
let json = r#"{"path":"photos/2024/video.mp4"}"#;
let req: PreviewClipRequest = serde_json::from_str(json).unwrap();
assert_eq!(req.path, "photos/2024/video.mp4");
}
#[test]
fn test_preview_status_request_deserialize() {
use super::PreviewStatusRequest;
let json = r#"{"paths":["a/one.mp4","b/two.mp4","c/three.mp4"]}"#;
let req: PreviewStatusRequest = serde_json::from_str(json).unwrap();
assert_eq!(req.paths.len(), 3);
assert_eq!(req.paths[0], "a/one.mp4");
assert_eq!(req.paths[2], "c/three.mp4");
}
#[test]
fn test_preview_status_response_serialize() {
use super::{PreviewStatusItem, PreviewStatusResponse};
let response = PreviewStatusResponse {
previews: vec![
PreviewStatusItem {
path: "a/one.mp4".to_string(),
status: "complete".to_string(),
preview_url: Some("/video/preview?path=a%2Fone.mp4".to_string()),
},
PreviewStatusItem {
path: "b/two.mp4".to_string(),
status: "pending".to_string(),
preview_url: None,
},
],
};
let json = serde_json::to_value(&response).unwrap();
let previews = json["previews"].as_array().unwrap();
assert_eq!(previews.len(), 2);
// Complete item should have preview_url
assert_eq!(previews[0]["status"], "complete");
assert!(previews[0]["preview_url"].is_string());
// Pending item should not have preview_url (skip_serializing_if)
assert_eq!(previews[1]["status"], "pending");
assert!(previews[1].get("preview_url").is_none());
}
}
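The hunks above drop the `"Bearer "` prefix from the token passed to `Claims::from_str`, implying the auth scheme is now stripped by the caller before parsing. A minimal sketch of that extraction (the helper name is illustrative, not taken from this codebase):

```rust
// Hypothetical helper: split the "Bearer " scheme off an Authorization
// header value, returning only the raw JWT for Claims::from_str.
fn strip_bearer(header_value: &str) -> Option<&str> {
    header_value.strip_prefix("Bearer ").map(str::trim)
}

fn main() {
    assert_eq!(strip_bearer("Bearer abc.def.ghi"), Some("abc.def.ghi"));
    // Non-Bearer schemes are rejected rather than parsed.
    assert_eq!(strip_bearer("Basic abc"), None);
    println!("bearer stripping checks passed");
}
```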

View File

@@ -1,554 +0,0 @@
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use serde::Serialize;
use std::ops::DerefMut;
use std::sync::{Arc, Mutex};
use crate::database::{DbError, DbErrorKind, connect};
use crate::otel::trace_db_call;
/// Represents a calendar event
#[derive(Serialize, Clone, Debug)]
pub struct CalendarEvent {
pub id: i32,
pub event_uid: Option<String>,
pub summary: String,
pub description: Option<String>,
pub location: Option<String>,
pub start_time: i64,
pub end_time: i64,
pub all_day: bool,
pub organizer: Option<String>,
pub attendees: Option<String>, // JSON string
pub created_at: i64,
pub source_file: Option<String>,
}
/// Data for inserting a new calendar event
#[derive(Clone, Debug)]
#[allow(dead_code)]
pub struct InsertCalendarEvent {
pub event_uid: Option<String>,
pub summary: String,
pub description: Option<String>,
pub location: Option<String>,
pub start_time: i64,
pub end_time: i64,
pub all_day: bool,
pub organizer: Option<String>,
pub attendees: Option<String>,
pub embedding: Option<Vec<f32>>, // 768-dim, optional
pub created_at: i64,
pub source_file: Option<String>,
}
pub trait CalendarEventDao: Sync + Send {
/// Store calendar event with optional embedding
fn store_event(
&mut self,
context: &opentelemetry::Context,
event: InsertCalendarEvent,
) -> Result<CalendarEvent, DbError>;
/// Batch insert events (for import efficiency)
fn store_events_batch(
&mut self,
context: &opentelemetry::Context,
events: Vec<InsertCalendarEvent>,
) -> Result<usize, DbError>;
/// Find events in time range (PRIMARY query method)
fn find_events_in_range(
&mut self,
context: &opentelemetry::Context,
start_ts: i64,
end_ts: i64,
) -> Result<Vec<CalendarEvent>, DbError>;
/// Find semantically similar events (SECONDARY - requires embeddings)
fn find_similar_events(
&mut self,
context: &opentelemetry::Context,
query_embedding: &[f32],
limit: usize,
) -> Result<Vec<CalendarEvent>, DbError>;
/// Hybrid: Time-filtered + semantic ranking
/// "Events during photo timestamp ±N days, ranked by similarity to context"
fn find_relevant_events_hybrid(
&mut self,
context: &opentelemetry::Context,
center_timestamp: i64,
time_window_days: i64,
query_embedding: Option<&[f32]>,
limit: usize,
) -> Result<Vec<CalendarEvent>, DbError>;
/// Check if event exists (idempotency)
fn event_exists(
&mut self,
context: &opentelemetry::Context,
event_uid: &str,
start_time: i64,
) -> Result<bool, DbError>;
/// Get count of events
fn get_event_count(&mut self, context: &opentelemetry::Context) -> Result<i64, DbError>;
}
pub struct SqliteCalendarEventDao {
connection: Arc<Mutex<SqliteConnection>>,
}
impl Default for SqliteCalendarEventDao {
fn default() -> Self {
Self::new()
}
}
impl SqliteCalendarEventDao {
pub fn new() -> Self {
SqliteCalendarEventDao {
connection: Arc::new(Mutex::new(connect())),
}
}
fn serialize_vector(vec: &[f32]) -> Vec<u8> {
use zerocopy::IntoBytes;
vec.as_bytes().to_vec()
}
fn deserialize_vector(bytes: &[u8]) -> Result<Vec<f32>, DbError> {
if !bytes.len().is_multiple_of(4) {
return Err(DbError::new(DbErrorKind::QueryError));
}
let count = bytes.len() / 4;
let mut vec = Vec::with_capacity(count);
for chunk in bytes.chunks_exact(4) {
let float = f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
vec.push(float);
}
Ok(vec)
}
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
if a.len() != b.len() {
return 0.0;
}
let dot_product: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
let magnitude_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
let magnitude_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
if magnitude_a == 0.0 || magnitude_b == 0.0 {
return 0.0;
}
dot_product / (magnitude_a * magnitude_b)
}
}
#[derive(QueryableByName)]
struct CalendarEventWithVectorRow {
#[diesel(sql_type = diesel::sql_types::Integer)]
id: i32,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
event_uid: Option<String>,
#[diesel(sql_type = diesel::sql_types::Text)]
summary: String,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
description: Option<String>,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
location: Option<String>,
#[diesel(sql_type = diesel::sql_types::BigInt)]
start_time: i64,
#[diesel(sql_type = diesel::sql_types::BigInt)]
end_time: i64,
#[diesel(sql_type = diesel::sql_types::Bool)]
all_day: bool,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
organizer: Option<String>,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
attendees: Option<String>,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Binary>)]
embedding: Option<Vec<u8>>,
#[diesel(sql_type = diesel::sql_types::BigInt)]
created_at: i64,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
source_file: Option<String>,
}
impl CalendarEventWithVectorRow {
fn to_calendar_event(&self) -> CalendarEvent {
CalendarEvent {
id: self.id,
event_uid: self.event_uid.clone(),
summary: self.summary.clone(),
description: self.description.clone(),
location: self.location.clone(),
start_time: self.start_time,
end_time: self.end_time,
all_day: self.all_day,
organizer: self.organizer.clone(),
attendees: self.attendees.clone(),
created_at: self.created_at,
source_file: self.source_file.clone(),
}
}
}
#[derive(QueryableByName)]
struct LastInsertRowId {
#[diesel(sql_type = diesel::sql_types::Integer)]
id: i32,
}
impl CalendarEventDao for SqliteCalendarEventDao {
fn store_event(
&mut self,
context: &opentelemetry::Context,
event: InsertCalendarEvent,
) -> Result<CalendarEvent, DbError> {
trace_db_call(context, "insert", "store_event", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get CalendarEventDao");
// Validate embedding dimensions if provided
if let Some(ref emb) = event.embedding
&& emb.len() != 768
{
return Err(anyhow::anyhow!(
"Invalid embedding dimensions: {} (expected 768)",
emb.len()
));
}
let embedding_bytes = event.embedding.as_ref().map(|e| Self::serialize_vector(e));
// INSERT OR REPLACE to handle re-imports
diesel::sql_query(
"INSERT OR REPLACE INTO calendar_events
(event_uid, summary, description, location, start_time, end_time, all_day,
organizer, attendees, embedding, created_at, source_file)
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11, ?12)",
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&event.event_uid)
.bind::<diesel::sql_types::Text, _>(&event.summary)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&event.description)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&event.location)
.bind::<diesel::sql_types::BigInt, _>(event.start_time)
.bind::<diesel::sql_types::BigInt, _>(event.end_time)
.bind::<diesel::sql_types::Bool, _>(event.all_day)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&event.organizer)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&event.attendees)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Binary>, _>(&embedding_bytes)
.bind::<diesel::sql_types::BigInt, _>(event.created_at)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&event.source_file)
.execute(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Insert error: {:?}", e))?;
let row_id: i32 = diesel::sql_query("SELECT last_insert_rowid() as id")
.get_result::<LastInsertRowId>(conn.deref_mut())
.map(|r| r.id)
.map_err(|e| anyhow::anyhow!("Failed to get last insert ID: {:?}", e))?;
Ok(CalendarEvent {
id: row_id,
event_uid: event.event_uid,
summary: event.summary,
description: event.description,
location: event.location,
start_time: event.start_time,
end_time: event.end_time,
all_day: event.all_day,
organizer: event.organizer,
attendees: event.attendees,
created_at: event.created_at,
source_file: event.source_file,
})
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn store_events_batch(
&mut self,
context: &opentelemetry::Context,
events: Vec<InsertCalendarEvent>,
) -> Result<usize, DbError> {
trace_db_call(context, "insert", "store_events_batch", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get CalendarEventDao");
let mut inserted = 0;
conn.transaction::<_, anyhow::Error, _>(|conn| {
for event in events {
// Validate embedding if provided
if let Some(ref emb) = event.embedding
&& emb.len() != 768
{
log::warn!(
"Skipping event with invalid embedding dimensions: {}",
emb.len()
);
continue;
}
let embedding_bytes =
event.embedding.as_ref().map(|e| Self::serialize_vector(e));
diesel::sql_query(
"INSERT OR REPLACE INTO calendar_events
(event_uid, summary, description, location, start_time, end_time, all_day,
organizer, attendees, embedding, created_at, source_file)
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11, ?12)",
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&event.event_uid,
)
.bind::<diesel::sql_types::Text, _>(&event.summary)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&event.description,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&event.location,
)
.bind::<diesel::sql_types::BigInt, _>(event.start_time)
.bind::<diesel::sql_types::BigInt, _>(event.end_time)
.bind::<diesel::sql_types::Bool, _>(event.all_day)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&event.organizer,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&event.attendees,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Binary>, _>(
&embedding_bytes,
)
.bind::<diesel::sql_types::BigInt, _>(event.created_at)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&event.source_file,
)
.execute(conn)
.map_err(|e| anyhow::anyhow!("Batch insert error: {:?}", e))?;
inserted += 1;
}
Ok(())
})
.map_err(|e| anyhow::anyhow!("Transaction error: {:?}", e))?;
Ok(inserted)
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn find_events_in_range(
&mut self,
context: &opentelemetry::Context,
start_ts: i64,
end_ts: i64,
) -> Result<Vec<CalendarEvent>, DbError> {
trace_db_call(context, "query", "find_events_in_range", |_span| {
let mut conn = self.connection.lock().expect("Unable to get CalendarEventDao");
diesel::sql_query(
"SELECT id, event_uid, summary, description, location, start_time, end_time, all_day,
organizer, attendees, NULL as embedding, created_at, source_file
FROM calendar_events
WHERE start_time >= ?1 AND start_time <= ?2
ORDER BY start_time ASC"
)
.bind::<diesel::sql_types::BigInt, _>(start_ts)
.bind::<diesel::sql_types::BigInt, _>(end_ts)
.load::<CalendarEventWithVectorRow>(conn.deref_mut())
.map(|rows| rows.into_iter().map(|r| r.to_calendar_event()).collect())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn find_similar_events(
&mut self,
context: &opentelemetry::Context,
query_embedding: &[f32],
limit: usize,
) -> Result<Vec<CalendarEvent>, DbError> {
trace_db_call(context, "query", "find_similar_events", |_span| {
let mut conn = self.connection.lock().expect("Unable to get CalendarEventDao");
if query_embedding.len() != 768 {
return Err(anyhow::anyhow!(
"Invalid query embedding dimensions: {} (expected 768)",
query_embedding.len()
));
}
// Load all events with embeddings
let results = diesel::sql_query(
"SELECT id, event_uid, summary, description, location, start_time, end_time, all_day,
organizer, attendees, embedding, created_at, source_file
FROM calendar_events
WHERE embedding IS NOT NULL"
)
.load::<CalendarEventWithVectorRow>(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
// Compute similarities
let mut scored_events: Vec<(f32, CalendarEvent)> = results
.into_iter()
.filter_map(|row| {
if let Some(ref emb_bytes) = row.embedding {
if let Ok(emb) = Self::deserialize_vector(emb_bytes) {
let similarity = Self::cosine_similarity(query_embedding, &emb);
Some((similarity, row.to_calendar_event()))
} else {
None
}
} else {
None
}
})
.collect();
// Sort by similarity descending
scored_events.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
log::info!("Found {} similar calendar events", scored_events.len());
if !scored_events.is_empty() {
log::info!("Top similarity: {:.4}", scored_events[0].0);
}
Ok(scored_events.into_iter().take(limit).map(|(_, event)| event).collect())
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn find_relevant_events_hybrid(
&mut self,
context: &opentelemetry::Context,
center_timestamp: i64,
time_window_days: i64,
query_embedding: Option<&[f32]>,
limit: usize,
) -> Result<Vec<CalendarEvent>, DbError> {
trace_db_call(context, "query", "find_relevant_events_hybrid", |_span| {
let window_seconds = time_window_days * 86400;
let start_ts = center_timestamp - window_seconds;
let end_ts = center_timestamp + window_seconds;
let mut conn = self.connection.lock().expect("Unable to get CalendarEventDao");
// Step 1: Time-based filter (fast, indexed)
let events_in_range = diesel::sql_query(
"SELECT id, event_uid, summary, description, location, start_time, end_time, all_day,
organizer, attendees, embedding, created_at, source_file
FROM calendar_events
WHERE start_time >= ?1 AND start_time <= ?2"
)
.bind::<diesel::sql_types::BigInt, _>(start_ts)
.bind::<diesel::sql_types::BigInt, _>(end_ts)
.load::<CalendarEventWithVectorRow>(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
// Step 2: If query embedding provided, rank by semantic similarity
if let Some(query_emb) = query_embedding {
if query_emb.len() != 768 {
return Err(anyhow::anyhow!(
"Invalid query embedding dimensions: {} (expected 768)",
query_emb.len()
));
}
let mut scored_events: Vec<(f32, CalendarEvent)> = events_in_range
.into_iter()
.map(|row| {
// Events with embeddings get semantic scoring
let similarity = if let Some(ref emb_bytes) = row.embedding {
if let Ok(emb) = Self::deserialize_vector(emb_bytes) {
Self::cosine_similarity(query_emb, &emb)
} else {
0.5 // Neutral score for deserialization errors
}
} else {
0.5 // Neutral score for events without embeddings
};
(similarity, row.to_calendar_event())
})
.collect();
// Sort by similarity descending
scored_events.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
log::info!("Hybrid query: {} events in time range, ranked by similarity", scored_events.len());
if !scored_events.is_empty() {
log::info!("Top similarity: {:.4}", scored_events[0].0);
}
Ok(scored_events.into_iter().take(limit).map(|(_, event)| event).collect())
} else {
// No semantic ranking, just return time-sorted (limit applied)
log::info!("Time-only query: {} events in range", events_in_range.len());
Ok(events_in_range.into_iter().take(limit).map(|r| r.to_calendar_event()).collect())
}
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn event_exists(
&mut self,
context: &opentelemetry::Context,
event_uid: &str,
start_time: i64,
) -> Result<bool, DbError> {
trace_db_call(context, "query", "event_exists", |_span| {
let mut conn = self.connection.lock().expect("Unable to get CalendarEventDao");
#[derive(QueryableByName)]
struct CountResult {
#[diesel(sql_type = diesel::sql_types::Integer)]
count: i32,
}
let result: CountResult = diesel::sql_query(
"SELECT COUNT(*) as count FROM calendar_events WHERE event_uid = ?1 AND start_time = ?2"
)
.bind::<diesel::sql_types::Text, _>(event_uid)
.bind::<diesel::sql_types::BigInt, _>(start_time)
.get_result(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
Ok(result.count > 0)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_event_count(&mut self, context: &opentelemetry::Context) -> Result<i64, DbError> {
trace_db_call(context, "query", "get_event_count", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get CalendarEventDao");
#[derive(QueryableByName)]
struct CountResult {
#[diesel(sql_type = diesel::sql_types::BigInt)]
count: i64,
}
let result: CountResult =
diesel::sql_query("SELECT COUNT(*) as count FROM calendar_events")
.get_result(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
Ok(result.count)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
}
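The DAO above stores embeddings as little-endian `f32` blobs and ranks them in-process with cosine similarity. A dependency-free sketch of that round-trip and scoring (the DAO uses `zerocopy` for serialization; plain `to_le_bytes` is equivalent for this layout):

```rust
// Pack f32s as consecutive little-endian 4-byte groups, as the DAO does.
fn serialize_vector(vec: &[f32]) -> Vec<u8> {
    vec.iter().flat_map(|f| f.to_le_bytes()).collect()
}

// Reject blobs whose length is not a multiple of 4, mirroring the DAO's
// QueryError path, then rebuild the floats.
fn deserialize_vector(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None;
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

// Same cosine-similarity shape as the DAO: 0.0 for mismatched lengths
// or zero-magnitude vectors, otherwise dot / (|a| * |b|).
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    if a.len() != b.len() {
        return 0.0;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let mag_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let mag_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if mag_a == 0.0 || mag_b == 0.0 {
        return 0.0;
    }
    dot / (mag_a * mag_b)
}

fn main() {
    let v = vec![1.0_f32, 0.0, 2.5];
    let bytes = serialize_vector(&v);
    assert_eq!(deserialize_vector(&bytes).unwrap(), v);
    // Identical vectors score 1.0; orthogonal vectors score 0.0.
    assert!((cosine_similarity(&[1.0, 0.0], &[1.0, 0.0]) - 1.0).abs() < 1e-6);
    assert_eq!(cosine_similarity(&[1.0, 0.0], &[0.0, 1.0]), 0.0);
    println!("round-trip and similarity checks passed");
}
```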

View File

@@ -1,489 +0,0 @@
use chrono::NaiveDate;
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use serde::Serialize;
use std::ops::DerefMut;
use std::sync::{Arc, Mutex};
use crate::database::{DbError, DbErrorKind, connect};
use crate::otel::trace_db_call;
/// Represents a daily conversation summary
#[derive(Serialize, Clone, Debug)]
pub struct DailySummary {
pub id: i32,
pub date: String,
pub contact: String,
pub summary: String,
pub message_count: i32,
pub created_at: i64,
pub model_version: String,
}
/// Data for inserting a new daily summary
#[derive(Clone, Debug)]
pub struct InsertDailySummary {
pub date: String,
pub contact: String,
pub summary: String,
pub message_count: i32,
pub embedding: Vec<f32>,
pub created_at: i64,
pub model_version: String,
}
pub trait DailySummaryDao: Sync + Send {
/// Store a daily summary with its embedding
fn store_summary(
&mut self,
context: &opentelemetry::Context,
summary: InsertDailySummary,
) -> Result<DailySummary, DbError>;
/// Find semantically similar daily summaries using vector similarity
fn find_similar_summaries(
&mut self,
context: &opentelemetry::Context,
query_embedding: &[f32],
limit: usize,
) -> Result<Vec<DailySummary>, DbError>;
/// Find semantically similar daily summaries with time-based weighting
/// Combines cosine similarity with temporal proximity to target_date
/// Final score = similarity * time_weight, where time_weight decays with distance from target_date
fn find_similar_summaries_with_time_weight(
&mut self,
context: &opentelemetry::Context,
query_embedding: &[f32],
target_date: &str,
limit: usize,
) -> Result<Vec<DailySummary>, DbError>;
/// Check if a summary exists for a given date and contact
fn summary_exists(
&mut self,
context: &opentelemetry::Context,
date: &str,
contact: &str,
) -> Result<bool, DbError>;
/// Get count of summaries for a contact
fn get_summary_count(
&mut self,
context: &opentelemetry::Context,
contact: &str,
) -> Result<i64, DbError>;
}
pub struct SqliteDailySummaryDao {
connection: Arc<Mutex<SqliteConnection>>,
}
impl Default for SqliteDailySummaryDao {
fn default() -> Self {
Self::new()
}
}
impl SqliteDailySummaryDao {
pub fn new() -> Self {
SqliteDailySummaryDao {
connection: Arc::new(Mutex::new(connect())),
}
}
fn serialize_vector(vec: &[f32]) -> Vec<u8> {
use zerocopy::IntoBytes;
vec.as_bytes().to_vec()
}
fn deserialize_vector(bytes: &[u8]) -> Result<Vec<f32>, DbError> {
if !bytes.len().is_multiple_of(4) {
return Err(DbError::new(DbErrorKind::QueryError));
}
let count = bytes.len() / 4;
let mut vec = Vec::with_capacity(count);
for chunk in bytes.chunks_exact(4) {
let float = f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
vec.push(float);
}
Ok(vec)
}
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
if a.len() != b.len() {
return 0.0;
}
let dot_product: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
let magnitude_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
let magnitude_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
if magnitude_a == 0.0 || magnitude_b == 0.0 {
return 0.0;
}
dot_product / (magnitude_a * magnitude_b)
}
}
impl DailySummaryDao for SqliteDailySummaryDao {
fn store_summary(
&mut self,
context: &opentelemetry::Context,
summary: InsertDailySummary,
) -> Result<DailySummary, DbError> {
trace_db_call(context, "insert", "store_summary", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get DailySummaryDao");
// Validate embedding dimensions
if summary.embedding.len() != 768 {
return Err(anyhow::anyhow!(
"Invalid embedding dimensions: {} (expected 768)",
summary.embedding.len()
));
}
let embedding_bytes = Self::serialize_vector(&summary.embedding);
// INSERT OR REPLACE to handle updates if summary needs regeneration
diesel::sql_query(
"INSERT OR REPLACE INTO daily_conversation_summaries
(date, contact, summary, message_count, embedding, created_at, model_version)
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7)",
)
.bind::<diesel::sql_types::Text, _>(&summary.date)
.bind::<diesel::sql_types::Text, _>(&summary.contact)
.bind::<diesel::sql_types::Text, _>(&summary.summary)
.bind::<diesel::sql_types::Integer, _>(summary.message_count)
.bind::<diesel::sql_types::Binary, _>(&embedding_bytes)
.bind::<diesel::sql_types::BigInt, _>(summary.created_at)
.bind::<diesel::sql_types::Text, _>(&summary.model_version)
.execute(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Insert error: {:?}", e))?;
let row_id: i32 = diesel::sql_query("SELECT last_insert_rowid() as id")
.get_result::<LastInsertRowId>(conn.deref_mut())
.map(|r| r.id as i32)
.map_err(|e| anyhow::anyhow!("Failed to get last insert ID: {:?}", e))?;
Ok(DailySummary {
id: row_id,
date: summary.date,
contact: summary.contact,
summary: summary.summary,
message_count: summary.message_count,
created_at: summary.created_at,
model_version: summary.model_version,
})
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn find_similar_summaries(
&mut self,
context: &opentelemetry::Context,
query_embedding: &[f32],
limit: usize,
) -> Result<Vec<DailySummary>, DbError> {
trace_db_call(context, "query", "find_similar_summaries", |_span| {
let mut conn = self.connection.lock().expect("Unable to get DailySummaryDao");
if query_embedding.len() != 768 {
return Err(anyhow::anyhow!(
"Invalid query embedding dimensions: {} (expected 768)",
query_embedding.len()
));
}
// Load all summaries with embeddings
let results = diesel::sql_query(
"SELECT id, date, contact, summary, message_count, embedding, created_at, model_version
FROM daily_conversation_summaries"
)
.load::<DailySummaryWithVectorRow>(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
log::info!("Loaded {} daily summaries for similarity comparison", results.len());
// Compute similarity for each summary
let mut scored_summaries: Vec<(f32, DailySummary)> = results
.into_iter()
.filter_map(|row| {
match Self::deserialize_vector(&row.embedding) {
Ok(embedding) => {
let similarity = Self::cosine_similarity(query_embedding, &embedding);
Some((
similarity,
DailySummary {
id: row.id,
date: row.date,
contact: row.contact,
summary: row.summary,
message_count: row.message_count,
created_at: row.created_at,
model_version: row.model_version,
},
))
}
Err(e) => {
log::warn!("Failed to deserialize embedding for summary {}: {:?}", row.id, e);
None
}
}
})
.collect();
// Sort by similarity (highest first)
scored_summaries.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
// Filter out poor matches (similarity < 0.3 is likely noise)
scored_summaries.retain(|(similarity, _)| *similarity >= 0.3);
// Log similarity distribution
if !scored_summaries.is_empty() {
let top_score = scored_summaries.first().map(|(s, _)| *s).unwrap_or(0.0);
let median_score = scored_summaries.get(scored_summaries.len() / 2).map(|(s, _)| *s).unwrap_or(0.0);
log::info!(
"Daily summary similarity - Top: {:.3}, Median: {:.3}, Count: {} (after 0.3 threshold)",
top_score,
median_score,
scored_summaries.len()
);
} else {
log::warn!("No daily summaries met the 0.3 similarity threshold");
}
// Take top N and log matches
let top_results: Vec<DailySummary> = scored_summaries
.into_iter()
.take(limit)
.map(|(similarity, summary)| {
log::info!(
"Summary match: similarity={:.3}, date={}, contact={}, summary=\"{}\"",
similarity,
summary.date,
summary.contact,
summary.summary.chars().take(100).collect::<String>()
);
summary
})
.collect();
Ok(top_results)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn find_similar_summaries_with_time_weight(
&mut self,
context: &opentelemetry::Context,
query_embedding: &[f32],
target_date: &str,
limit: usize,
) -> Result<Vec<DailySummary>, DbError> {
trace_db_call(context, "query", "find_similar_summaries_with_time_weight", |_span| {
let mut conn = self.connection.lock().expect("Unable to get DailySummaryDao");
if query_embedding.len() != 768 {
return Err(anyhow::anyhow!(
"Invalid query embedding dimensions: {} (expected 768)",
query_embedding.len()
));
}
// Parse target date
let target = NaiveDate::parse_from_str(target_date, "%Y-%m-%d")
.map_err(|e| anyhow::anyhow!("Invalid target date: {}", e))?;
// Load all summaries with embeddings
let results = diesel::sql_query(
"SELECT id, date, contact, summary, message_count, embedding, created_at, model_version
FROM daily_conversation_summaries"
)
.load::<DailySummaryWithVectorRow>(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
log::info!("Loaded {} daily summaries for time-weighted similarity (target: {})", results.len(), target_date);
// Compute time-weighted similarity for each summary
// Score = cosine_similarity * time_weight
// time_weight = 1 / (1 + days_distance/30) - decays with ~30 day half-life
let mut scored_summaries: Vec<(f32, f32, i64, DailySummary)> = results
.into_iter()
.filter_map(|row| {
match Self::deserialize_vector(&row.embedding) {
Ok(embedding) => {
let similarity = Self::cosine_similarity(query_embedding, &embedding);
// Calculate time weight
let summary_date = NaiveDate::parse_from_str(&row.date, "%Y-%m-%d").ok()?;
let days_distance = (target - summary_date).num_days().abs();
// Exponential decay with 30-day half-life
// At 0 days: weight = 1.0
// At 30 days: weight = 0.5
// At 60 days: weight = 0.25
                            // At 365 days: weight ~= 0.0002
let time_weight = 0.5_f32.powf(days_distance as f32 / 30.0);
// Combined score - but ensure semantic similarity still matters
// We use sqrt to soften the time weight's impact
let combined_score = similarity * time_weight.sqrt();
Some((
combined_score,
similarity,
days_distance,
DailySummary {
id: row.id,
date: row.date,
contact: row.contact,
summary: row.summary,
message_count: row.message_count,
created_at: row.created_at,
model_version: row.model_version,
},
))
}
Err(e) => {
log::warn!("Failed to deserialize embedding for summary {}: {:?}", row.id, e);
None
}
}
})
.collect();
// Sort by combined score (highest first)
scored_summaries.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
// Filter out poor matches (base similarity < 0.5 - stricter than before since we have time weighting)
scored_summaries.retain(|(_, similarity, _, _)| *similarity >= 0.5);
// Log similarity distribution
if !scored_summaries.is_empty() {
let (top_combined, top_sim, top_days, _) = &scored_summaries[0];
log::info!(
"Time-weighted similarity - Top: combined={:.3} (sim={:.3}, days={}), Count: {} matches",
top_combined,
top_sim,
top_days,
scored_summaries.len()
);
} else {
log::warn!("No daily summaries met the 0.5 similarity threshold");
}
// Take top N and log matches
let top_results: Vec<DailySummary> = scored_summaries
.into_iter()
.take(limit)
.map(|(combined, similarity, days, summary)| {
log::info!(
"Summary match: combined={:.3} (sim={:.3}, days={}), date={}, contact={}, summary=\"{}\"",
combined,
similarity,
days,
summary.date,
summary.contact,
summary.summary.chars().take(80).collect::<String>()
);
summary
})
.collect();
Ok(top_results)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn summary_exists(
&mut self,
context: &opentelemetry::Context,
date: &str,
contact: &str,
) -> Result<bool, DbError> {
trace_db_call(context, "query", "summary_exists", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get DailySummaryDao");
let count = diesel::sql_query(
"SELECT COUNT(*) as count FROM daily_conversation_summaries
WHERE date = ?1 AND contact = ?2",
)
.bind::<diesel::sql_types::Text, _>(date)
.bind::<diesel::sql_types::Text, _>(contact)
.get_result::<CountResult>(conn.deref_mut())
.map(|r| r.count)
.map_err(|e| anyhow::anyhow!("Count query error: {:?}", e))?;
Ok(count > 0)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_summary_count(
&mut self,
context: &opentelemetry::Context,
contact: &str,
) -> Result<i64, DbError> {
trace_db_call(context, "query", "get_summary_count", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get DailySummaryDao");
diesel::sql_query(
"SELECT COUNT(*) as count FROM daily_conversation_summaries WHERE contact = ?1",
)
.bind::<diesel::sql_types::Text, _>(contact)
.get_result::<CountResult>(conn.deref_mut())
.map(|r| r.count)
.map_err(|e| anyhow::anyhow!("Count query error: {:?}", e))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
}
// Helper structs for raw SQL queries
#[derive(QueryableByName)]
struct LastInsertRowId {
#[diesel(sql_type = diesel::sql_types::BigInt)]
id: i64,
}
#[derive(QueryableByName)]
struct DailySummaryWithVectorRow {
#[diesel(sql_type = diesel::sql_types::Integer)]
id: i32,
#[diesel(sql_type = diesel::sql_types::Text)]
date: String,
#[diesel(sql_type = diesel::sql_types::Text)]
contact: String,
#[diesel(sql_type = diesel::sql_types::Text)]
summary: String,
#[diesel(sql_type = diesel::sql_types::Integer)]
message_count: i32,
#[diesel(sql_type = diesel::sql_types::Binary)]
embedding: Vec<u8>,
#[diesel(sql_type = diesel::sql_types::BigInt)]
created_at: i64,
#[diesel(sql_type = diesel::sql_types::Text)]
model_version: String,
}
#[derive(QueryableByName)]
struct CountResult {
#[diesel(sql_type = diesel::sql_types::BigInt)]
count: i64,
}
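The time-weighted scoring in `find_similar_summaries_with_time_weight` can be isolated into a small sketch: exponential decay with a 30-day half-life, softened by `sqrt` so semantic similarity still dominates the combined score.

```rust
// Decay weight from days-distance to the target date:
// 0 days -> 1.0, 30 days -> 0.5, 60 days -> 0.25.
fn time_weight(days_distance: f32) -> f32 {
    0.5_f32.powf(days_distance / 30.0)
}

// Combined score used for ranking; sqrt softens the time penalty so a
// strong semantic match from a month ago can still outrank a weak
// same-day match.
fn combined_score(similarity: f32, days_distance: f32) -> f32 {
    similarity * time_weight(days_distance).sqrt()
}

fn main() {
    assert!((time_weight(0.0) - 1.0).abs() < 1e-6);
    assert!((time_weight(30.0) - 0.5).abs() < 1e-6);
    assert!((time_weight(60.0) - 0.25).abs() < 1e-6);
    // At 30 days, sqrt(0.5) ~ 0.707: an equally similar summary from a
    // month before the target date keeps ~71% of its score.
    let near = combined_score(0.8, 0.0);
    let far = combined_score(0.8, 30.0);
    assert!(far < near);
    println!("near={near:.3} far={far:.3}");
}
```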

View File

@@ -1,133 +0,0 @@
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use std::ops::DerefMut;
use std::sync::{Arc, Mutex};
use crate::database::models::{InsertPhotoInsight, PhotoInsight};
use crate::database::schema;
use crate::database::{DbError, DbErrorKind, connect};
use crate::otel::trace_db_call;
pub trait InsightDao: Sync + Send {
fn store_insight(
&mut self,
context: &opentelemetry::Context,
insight: InsertPhotoInsight,
) -> Result<PhotoInsight, DbError>;
fn get_insight(
&mut self,
context: &opentelemetry::Context,
file_path: &str,
) -> Result<Option<PhotoInsight>, DbError>;
fn delete_insight(
&mut self,
context: &opentelemetry::Context,
file_path: &str,
) -> Result<(), DbError>;
fn get_all_insights(
&mut self,
context: &opentelemetry::Context,
) -> Result<Vec<PhotoInsight>, DbError>;
}
pub struct SqliteInsightDao {
connection: Arc<Mutex<SqliteConnection>>,
}
impl Default for SqliteInsightDao {
fn default() -> Self {
Self::new()
}
}
impl SqliteInsightDao {
pub fn new() -> Self {
SqliteInsightDao {
connection: Arc::new(Mutex::new(connect())),
}
}
}
impl InsightDao for SqliteInsightDao {
fn store_insight(
&mut self,
context: &opentelemetry::Context,
insight: InsertPhotoInsight,
) -> Result<PhotoInsight, DbError> {
trace_db_call(context, "insert", "store_insight", |_span| {
use schema::photo_insights::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get InsightDao");
// Insert or replace on conflict (UNIQUE constraint on file_path)
diesel::replace_into(photo_insights)
.values(&insight)
.execute(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Insert error"))?;
// Retrieve the inserted record
photo_insights
.filter(file_path.eq(&insight.file_path))
.first::<PhotoInsight>(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn get_insight(
&mut self,
context: &opentelemetry::Context,
path: &str,
) -> Result<Option<PhotoInsight>, DbError> {
trace_db_call(context, "query", "get_insight", |_span| {
use schema::photo_insights::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get InsightDao");
photo_insights
.filter(file_path.eq(path))
.first::<PhotoInsight>(connection.deref_mut())
.optional()
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn delete_insight(
&mut self,
context: &opentelemetry::Context,
path: &str,
) -> Result<(), DbError> {
trace_db_call(context, "delete", "delete_insight", |_span| {
use schema::photo_insights::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get InsightDao");
diesel::delete(photo_insights.filter(file_path.eq(path)))
.execute(connection.deref_mut())
.map(|_| ())
.map_err(|_| anyhow::anyhow!("Delete error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_all_insights(
&mut self,
context: &opentelemetry::Context,
) -> Result<Vec<PhotoInsight>, DbError> {
trace_db_call(context, "query", "get_all_insights", |_span| {
use schema::photo_insights::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get InsightDao");
photo_insights
.order(generated_at.desc())
.load::<PhotoInsight>(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
}
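`store_insight` above pairs `diesel::replace_into` with a follow-up read, relying on the `UNIQUE` constraint on `file_path` to turn the insert into a keyed upsert. As a hedged, std-only sketch of that observable behavior (a `HashMap` standing in for the table, with a hypothetical `summary` payload field for illustration):

```rust
use std::collections::HashMap;

// Minimal stand-in for the photo_insights table: file_path is the unique key.
#[derive(Clone, Debug, PartialEq)]
struct PhotoInsight {
    file_path: String,
    summary: String, // hypothetical payload field, not from the real schema
}

// Mirrors the replace-then-read pattern: overwrite any existing row with the
// same key, then fetch the stored record back.
fn store_insight(table: &mut HashMap<String, PhotoInsight>, insight: PhotoInsight) -> PhotoInsight {
    table.insert(insight.file_path.clone(), insight.clone());
    table.get(&insight.file_path).cloned().expect("just inserted")
}

fn main() {
    let mut table = HashMap::new();
    let first = store_insight(&mut table, PhotoInsight {
        file_path: "a.jpg".into(),
        summary: "v1".into(),
    });
    assert_eq!(first.summary, "v1");
    // A second store with the same file_path replaces rather than duplicates.
    let second = store_insight(&mut table, PhotoInsight {
        file_path: "a.jpg".into(),
        summary: "v2".into(),
    });
    assert_eq!(second.summary, "v2");
    assert_eq!(table.len(), 1);
}
```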


@@ -1,528 +0,0 @@
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use serde::Serialize;
use std::ops::DerefMut;
use std::sync::{Arc, Mutex};
use crate::database::{DbError, DbErrorKind, connect};
use crate::otel::trace_db_call;
/// Represents a location history record
#[derive(Serialize, Clone, Debug)]
pub struct LocationRecord {
pub id: i32,
pub timestamp: i64,
pub latitude: f64,
pub longitude: f64,
pub accuracy: Option<i32>,
pub activity: Option<String>,
pub activity_confidence: Option<i32>,
pub place_name: Option<String>,
pub place_category: Option<String>,
pub created_at: i64,
pub source_file: Option<String>,
}
/// Data for inserting a new location record
#[derive(Clone, Debug)]
pub struct InsertLocationRecord {
pub timestamp: i64,
pub latitude: f64,
pub longitude: f64,
pub accuracy: Option<i32>,
pub activity: Option<String>,
pub activity_confidence: Option<i32>,
pub place_name: Option<String>,
pub place_category: Option<String>,
pub embedding: Option<Vec<f32>>, // 768-dim, optional (rarely used)
pub created_at: i64,
pub source_file: Option<String>,
}
pub trait LocationHistoryDao: Sync + Send {
/// Store single location record
fn store_location(
&mut self,
context: &opentelemetry::Context,
location: InsertLocationRecord,
) -> Result<LocationRecord, DbError>;
/// Batch insert locations (Google Takeout has millions of points)
fn store_locations_batch(
&mut self,
context: &opentelemetry::Context,
locations: Vec<InsertLocationRecord>,
) -> Result<usize, DbError>;
/// Find nearest location to timestamp (PRIMARY query)
/// "Where was I at photo timestamp ±N minutes?"
fn find_nearest_location(
&mut self,
context: &opentelemetry::Context,
timestamp: i64,
max_time_diff_seconds: i64,
) -> Result<Option<LocationRecord>, DbError>;
/// Find locations in time range
fn find_locations_in_range(
&mut self,
context: &opentelemetry::Context,
start_ts: i64,
end_ts: i64,
) -> Result<Vec<LocationRecord>, DbError>;
/// Find locations near GPS coordinates (for "photos near this place")
/// Uses approximate bounding box for performance
fn find_locations_near_point(
&mut self,
context: &opentelemetry::Context,
latitude: f64,
longitude: f64,
radius_km: f64,
) -> Result<Vec<LocationRecord>, DbError>;
/// Deduplicate: check if location exists
fn location_exists(
&mut self,
context: &opentelemetry::Context,
timestamp: i64,
latitude: f64,
longitude: f64,
) -> Result<bool, DbError>;
/// Get count of location records
fn get_location_count(&mut self, context: &opentelemetry::Context) -> Result<i64, DbError>;
}
pub struct SqliteLocationHistoryDao {
connection: Arc<Mutex<SqliteConnection>>,
}
impl Default for SqliteLocationHistoryDao {
fn default() -> Self {
Self::new()
}
}
impl SqliteLocationHistoryDao {
pub fn new() -> Self {
SqliteLocationHistoryDao {
connection: Arc::new(Mutex::new(connect())),
}
}
fn serialize_vector(vec: &[f32]) -> Vec<u8> {
use zerocopy::IntoBytes;
vec.as_bytes().to_vec()
}
/// Haversine distance calculation (in kilometers)
/// Used for filtering locations by proximity to a point
fn haversine_distance(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
const R: f64 = 6371.0; // Earth radius in km
let d_lat = (lat2 - lat1).to_radians();
let d_lon = (lon2 - lon1).to_radians();
let a = (d_lat / 2.0).sin().powi(2)
+ lat1.to_radians().cos() * lat2.to_radians().cos() * (d_lon / 2.0).sin().powi(2);
let c = 2.0 * a.sqrt().atan2((1.0 - a).sqrt());
R * c
}
/// Calculate approximate bounding box for spatial queries
/// Returns (min_lat, max_lat, min_lon, max_lon)
/// Note: km_per_degree_lon shrinks toward 0 near the poles, so delta_lon
/// grows without bound there; callers are expected to pass mid-latitude points.
fn bounding_box(lat: f64, lon: f64, radius_km: f64) -> (f64, f64, f64, f64) {
const KM_PER_DEGREE_LAT: f64 = 111.0;
let km_per_degree_lon = 111.0 * lat.to_radians().cos();
let delta_lat = radius_km / KM_PER_DEGREE_LAT;
let delta_lon = radius_km / km_per_degree_lon;
(
lat - delta_lat, // min_lat
lat + delta_lat, // max_lat
lon - delta_lon, // min_lon
lon + delta_lon, // max_lon
)
}
}
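The two geometry helpers above are pure functions, so they can be sanity-checked in isolation. A minimal check, reproducing the formulas with a well-known city pair (Paris to London, roughly 344 km great-circle) as the yardstick:

```rust
const R_KM: f64 = 6371.0; // mean Earth radius, as in the DAO

// Same haversine formula as SqliteLocationHistoryDao::haversine_distance.
fn haversine_distance(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let d_lat = (lat2 - lat1).to_radians();
    let d_lon = (lon2 - lon1).to_radians();
    let a = (d_lat / 2.0).sin().powi(2)
        + lat1.to_radians().cos() * lat2.to_radians().cos() * (d_lon / 2.0).sin().powi(2);
    2.0 * R_KM * a.sqrt().atan2((1.0 - a).sqrt())
}

// Same approximate box as SqliteLocationHistoryDao::bounding_box.
fn bounding_box(lat: f64, lon: f64, radius_km: f64) -> (f64, f64, f64, f64) {
    const KM_PER_DEGREE_LAT: f64 = 111.0;
    let km_per_degree_lon = 111.0 * lat.to_radians().cos();
    let d_lat = radius_km / KM_PER_DEGREE_LAT;
    let d_lon = radius_km / km_per_degree_lon;
    (lat - d_lat, lat + d_lat, lon - d_lon, lon + d_lon)
}

fn main() {
    // Paris (48.8566, 2.3522) -> London (51.5074, -0.1278) is ~344 km.
    let d = haversine_distance(48.8566, 2.3522, 51.5074, -0.1278);
    assert!((d - 344.0).abs() < 5.0, "got {}", d);
    // The box must at least contain its own center.
    let (min_lat, max_lat, min_lon, max_lon) = bounding_box(48.8566, 2.3522, 10.0);
    assert!(min_lat < 48.8566 && 48.8566 < max_lat);
    assert!(min_lon < 2.3522 && 2.3522 < max_lon);
}
```

Note the box over-selects (its corners lie farther than `radius_km` from the center), which is why the DAO post-filters with the exact haversine distance.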
#[derive(QueryableByName)]
struct LocationRecordRow {
#[diesel(sql_type = diesel::sql_types::Integer)]
id: i32,
#[diesel(sql_type = diesel::sql_types::BigInt)]
timestamp: i64,
#[diesel(sql_type = diesel::sql_types::Float)]
latitude: f32,
#[diesel(sql_type = diesel::sql_types::Float)]
longitude: f32,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Integer>)]
accuracy: Option<i32>,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
activity: Option<String>,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Integer>)]
activity_confidence: Option<i32>,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
place_name: Option<String>,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
place_category: Option<String>,
#[diesel(sql_type = diesel::sql_types::BigInt)]
created_at: i64,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
source_file: Option<String>,
}
impl LocationRecordRow {
fn to_location_record(&self) -> LocationRecord {
LocationRecord {
id: self.id,
timestamp: self.timestamp,
latitude: self.latitude as f64,
longitude: self.longitude as f64,
accuracy: self.accuracy,
activity: self.activity.clone(),
activity_confidence: self.activity_confidence,
place_name: self.place_name.clone(),
place_category: self.place_category.clone(),
created_at: self.created_at,
source_file: self.source_file.clone(),
}
}
}
#[derive(QueryableByName)]
struct LastInsertRowId {
#[diesel(sql_type = diesel::sql_types::Integer)]
id: i32,
}
impl LocationHistoryDao for SqliteLocationHistoryDao {
fn store_location(
&mut self,
context: &opentelemetry::Context,
location: InsertLocationRecord,
) -> Result<LocationRecord, DbError> {
trace_db_call(context, "insert", "store_location", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get LocationHistoryDao");
// Validate embedding dimensions if provided (rare for location data)
if let Some(ref emb) = location.embedding
&& emb.len() != 768
{
return Err(anyhow::anyhow!(
"Invalid embedding dimensions: {} (expected 768)",
emb.len()
));
}
let embedding_bytes = location
.embedding
.as_ref()
.map(|e| Self::serialize_vector(e));
// INSERT OR IGNORE to handle re-imports (UNIQUE constraint on timestamp+lat+lon)
diesel::sql_query(
"INSERT OR IGNORE INTO location_history
(timestamp, latitude, longitude, accuracy, activity, activity_confidence,
place_name, place_category, embedding, created_at, source_file)
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11)",
)
.bind::<diesel::sql_types::BigInt, _>(location.timestamp)
.bind::<diesel::sql_types::Float, _>(location.latitude as f32)
.bind::<diesel::sql_types::Float, _>(location.longitude as f32)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Integer>, _>(&location.accuracy)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&location.activity)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Integer>, _>(
&location.activity_confidence,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&location.place_name)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&location.place_category,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Binary>, _>(&embedding_bytes)
.bind::<diesel::sql_types::BigInt, _>(location.created_at)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&location.source_file)
.execute(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Insert error: {:?}", e))?;
// Caveat: if the INSERT was ignored as a duplicate, last_insert_rowid()
// still returns the id of the last successful insert on this connection.
let row_id: i32 = diesel::sql_query("SELECT last_insert_rowid() as id")
.get_result::<LastInsertRowId>(conn.deref_mut())
.map(|r| r.id)
.map_err(|e| anyhow::anyhow!("Failed to get last insert ID: {:?}", e))?;
Ok(LocationRecord {
id: row_id,
timestamp: location.timestamp,
latitude: location.latitude,
longitude: location.longitude,
accuracy: location.accuracy,
activity: location.activity,
activity_confidence: location.activity_confidence,
place_name: location.place_name,
place_category: location.place_category,
created_at: location.created_at,
source_file: location.source_file,
})
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn store_locations_batch(
&mut self,
context: &opentelemetry::Context,
locations: Vec<InsertLocationRecord>,
) -> Result<usize, DbError> {
trace_db_call(context, "insert", "store_locations_batch", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get LocationHistoryDao");
let mut inserted = 0;
conn.transaction::<_, anyhow::Error, _>(|conn| {
for location in locations {
// Validate embedding if provided (rare)
if let Some(ref emb) = location.embedding
&& emb.len() != 768
{
log::warn!(
"Skipping location with invalid embedding dimensions: {}",
emb.len()
);
continue;
}
let embedding_bytes = location
.embedding
.as_ref()
.map(|e| Self::serialize_vector(e));
let rows_affected = diesel::sql_query(
"INSERT OR IGNORE INTO location_history
(timestamp, latitude, longitude, accuracy, activity, activity_confidence,
place_name, place_category, embedding, created_at, source_file)
VALUES (?1, ?2, ?3, ?4, ?5, ?6, ?7, ?8, ?9, ?10, ?11)",
)
.bind::<diesel::sql_types::BigInt, _>(location.timestamp)
.bind::<diesel::sql_types::Float, _>(location.latitude as f32)
.bind::<diesel::sql_types::Float, _>(location.longitude as f32)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Integer>, _>(
&location.accuracy,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&location.activity,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Integer>, _>(
&location.activity_confidence,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&location.place_name,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&location.place_category,
)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Binary>, _>(
&embedding_bytes,
)
.bind::<diesel::sql_types::BigInt, _>(location.created_at)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&location.source_file,
)
.execute(conn)
.map_err(|e| anyhow::anyhow!("Batch insert error: {:?}", e))?;
if rows_affected > 0 {
inserted += 1;
}
}
Ok(())
})
.map_err(|e| anyhow::anyhow!("Transaction error: {:?}", e))?;
Ok(inserted)
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn find_nearest_location(
&mut self,
context: &opentelemetry::Context,
timestamp: i64,
max_time_diff_seconds: i64,
) -> Result<Option<LocationRecord>, DbError> {
trace_db_call(context, "query", "find_nearest_location", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get LocationHistoryDao");
let start_ts = timestamp - max_time_diff_seconds;
let end_ts = timestamp + max_time_diff_seconds;
// Find location closest to target timestamp within window
let results = diesel::sql_query(
"SELECT id, timestamp, latitude, longitude, accuracy, activity, activity_confidence,
place_name, place_category, created_at, source_file
FROM location_history
WHERE timestamp >= ?1 AND timestamp <= ?2
ORDER BY ABS(timestamp - ?3) ASC
LIMIT 1"
)
.bind::<diesel::sql_types::BigInt, _>(start_ts)
.bind::<diesel::sql_types::BigInt, _>(end_ts)
.bind::<diesel::sql_types::BigInt, _>(timestamp)
.load::<LocationRecordRow>(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
Ok(results.into_iter().next().map(|r| r.to_location_record()))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn find_locations_in_range(
&mut self,
context: &opentelemetry::Context,
start_ts: i64,
end_ts: i64,
) -> Result<Vec<LocationRecord>, DbError> {
trace_db_call(context, "query", "find_locations_in_range", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get LocationHistoryDao");
diesel::sql_query(
"SELECT id, timestamp, latitude, longitude, accuracy, activity, activity_confidence,
place_name, place_category, created_at, source_file
FROM location_history
WHERE timestamp >= ?1 AND timestamp <= ?2
ORDER BY timestamp ASC"
)
.bind::<diesel::sql_types::BigInt, _>(start_ts)
.bind::<diesel::sql_types::BigInt, _>(end_ts)
.load::<LocationRecordRow>(conn.deref_mut())
.map(|rows| rows.into_iter().map(|r| r.to_location_record()).collect())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn find_locations_near_point(
&mut self,
context: &opentelemetry::Context,
latitude: f64,
longitude: f64,
radius_km: f64,
) -> Result<Vec<LocationRecord>, DbError> {
trace_db_call(context, "query", "find_locations_near_point", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get LocationHistoryDao");
// Use bounding box for initial filter (fast, indexed)
let (min_lat, max_lat, min_lon, max_lon) =
Self::bounding_box(latitude, longitude, radius_km);
let results = diesel::sql_query(
"SELECT id, timestamp, latitude, longitude, accuracy, activity, activity_confidence,
place_name, place_category, created_at, source_file
FROM location_history
WHERE latitude >= ?1 AND latitude <= ?2
AND longitude >= ?3 AND longitude <= ?4"
)
.bind::<diesel::sql_types::Float, _>(min_lat as f32)
.bind::<diesel::sql_types::Float, _>(max_lat as f32)
.bind::<diesel::sql_types::Float, _>(min_lon as f32)
.bind::<diesel::sql_types::Float, _>(max_lon as f32)
.load::<LocationRecordRow>(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
// Refine with Haversine distance (in-memory, post-filter)
let filtered: Vec<LocationRecord> = results
.into_iter()
.map(|r| r.to_location_record())
.filter(|loc| {
let distance =
Self::haversine_distance(latitude, longitude, loc.latitude, loc.longitude);
distance <= radius_km
})
.collect();
log::info!(
"Found {} locations within {} km of ({}, {})",
filtered.len(),
radius_km,
latitude,
longitude
);
Ok(filtered)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn location_exists(
&mut self,
context: &opentelemetry::Context,
timestamp: i64,
latitude: f64,
longitude: f64,
) -> Result<bool, DbError> {
trace_db_call(context, "query", "location_exists", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get LocationHistoryDao");
#[derive(QueryableByName)]
struct CountResult {
#[diesel(sql_type = diesel::sql_types::Integer)]
count: i32,
}
let result: CountResult = diesel::sql_query(
"SELECT COUNT(*) as count FROM location_history
WHERE timestamp = ?1 AND latitude = ?2 AND longitude = ?3",
)
.bind::<diesel::sql_types::BigInt, _>(timestamp)
.bind::<diesel::sql_types::Float, _>(latitude as f32)
.bind::<diesel::sql_types::Float, _>(longitude as f32)
.get_result(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
Ok(result.count > 0)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_location_count(&mut self, context: &opentelemetry::Context) -> Result<i64, DbError> {
trace_db_call(context, "query", "get_location_count", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get LocationHistoryDao");
#[derive(QueryableByName)]
struct CountResult {
#[diesel(sql_type = diesel::sql_types::BigInt)]
count: i64,
}
let result: CountResult =
diesel::sql_query("SELECT COUNT(*) as count FROM location_history")
.get_result(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
Ok(result.count)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
}
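`find_nearest_location` pushes the ±window filter and `ORDER BY ABS(timestamp - ?) LIMIT 1` into SQL. The selection rule itself is simple enough to state std-only; a sketch over an in-memory slice of Unix-second timestamps:

```rust
// Pick the timestamp closest to `target`, but only if it falls within
// +/- `max_diff` seconds; mirrors the SQL window + ORDER BY ABS() + LIMIT 1.
fn find_nearest(timestamps: &[i64], target: i64, max_diff: i64) -> Option<i64> {
    timestamps
        .iter()
        .copied()
        .filter(|ts| (ts - target).abs() <= max_diff)
        .min_by_key(|ts| (ts - target).abs())
}

fn main() {
    let ts = [100, 250, 400, 1000];
    // 250 is closest to 260 and inside the 60 s window.
    assert_eq!(find_nearest(&ts, 260, 60), Some(250));
    // 700 is 300 s from its nearest neighbors; outside a 60 s window -> None.
    assert_eq!(find_nearest(&ts, 700, 60), None);
}
```

The SQL version additionally benefits from an index on `timestamp` for the range predicate; the `ABS()` ordering then only sorts the rows inside the window.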


@@ -1,46 +1,26 @@
use bcrypt::{DEFAULT_COST, hash, verify};
use bcrypt::{hash, verify, DEFAULT_COST};
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use std::ops::DerefMut;
use std::sync::{Arc, Mutex};
use crate::database::models::{
Favorite, ImageExif, InsertFavorite, InsertImageExif, InsertUser, User,
use std::{
ops::Deref,
sync::{Arc, Mutex},
};
use crate::otel::trace_db_call;
pub mod calendar_dao;
pub mod daily_summary_dao;
pub mod insights_dao;
pub mod location_dao;
use crate::database::models::{Favorite, InsertFavorite, InsertUser, User};
pub mod models;
pub mod preview_dao;
pub mod schema;
pub mod search_dao;
pub use calendar_dao::{CalendarEventDao, SqliteCalendarEventDao};
pub use daily_summary_dao::{DailySummaryDao, InsertDailySummary, SqliteDailySummaryDao};
pub use insights_dao::{InsightDao, SqliteInsightDao};
pub use location_dao::{LocationHistoryDao, SqliteLocationHistoryDao};
pub use preview_dao::{PreviewDao, SqlitePreviewDao};
pub use search_dao::{SearchHistoryDao, SqliteSearchHistoryDao};
pub trait UserDao {
fn create_user(&mut self, user: &str, password: &str) -> Option<User>;
fn get_user(&mut self, user: &str, password: &str) -> Option<User>;
fn user_exists(&mut self, user: &str) -> bool;
fn create_user(&self, user: &str, password: &str) -> Option<User>;
fn get_user(&self, user: &str, password: &str) -> Option<User>;
fn user_exists(&self, user: &str) -> bool;
}
pub struct SqliteUserDao {
connection: SqliteConnection,
}
impl Default for SqliteUserDao {
fn default() -> Self {
Self::new()
}
}
impl SqliteUserDao {
pub fn new() -> Self {
Self {
@@ -49,27 +29,9 @@ impl SqliteUserDao {
}
}
#[cfg(test)]
pub mod test {
use diesel::{Connection, SqliteConnection};
use diesel_migrations::{EmbeddedMigrations, MigrationHarness, embed_migrations};
const DB_MIGRATIONS: EmbeddedMigrations = embed_migrations!();
pub fn in_memory_db_connection() -> SqliteConnection {
let mut connection = SqliteConnection::establish(":memory:")
.expect("Unable to create in-memory db connection");
connection
.run_pending_migrations(DB_MIGRATIONS)
.expect("Failure running DB migrations");
connection
}
}
impl UserDao for SqliteUserDao {
// TODO: Should probably use Result here
fn create_user(&mut self, user: &str, pass: &str) -> Option<User> {
fn create_user(&self, user: &str, pass: &str) -> std::option::Option<User> {
use schema::users::dsl::*;
let hashed = hash(pass, DEFAULT_COST);
@@ -79,12 +41,12 @@ impl UserDao for SqliteUserDao {
username: user,
password: &hash,
})
.execute(&mut self.connection)
.execute(&self.connection)
.unwrap();
users
.filter(username.eq(username))
.load::<User>(&mut self.connection)
.load::<User>(&self.connection)
.unwrap()
.first()
.cloned()
@@ -93,12 +55,12 @@ impl UserDao for SqliteUserDao {
}
}
fn get_user(&mut self, user: &str, pass: &str) -> Option<User> {
fn get_user(&self, user: &str, pass: &str) -> Option<User> {
use schema::users::dsl::*;
match users
.filter(username.eq(user))
.load::<User>(&mut self.connection)
.load::<User>(&self.connection)
.unwrap_or_default()
.first()
{
@@ -107,14 +69,15 @@ impl UserDao for SqliteUserDao {
}
}
fn user_exists(&mut self, user: &str) -> bool {
fn user_exists(&self, user: &str) -> bool {
use schema::users::dsl::*;
!users
users
.filter(username.eq(user))
.load::<User>(&mut self.connection)
.load::<User>(&self.connection)
.unwrap_or_default()
.is_empty()
.first()
.is_some()
}
}
@@ -143,27 +106,18 @@ pub enum DbErrorKind {
AlreadyExists,
InsertError,
QueryError,
UpdateError,
}
pub trait FavoriteDao: Sync + Send {
fn add_favorite(&mut self, user_id: i32, favorite_path: &str) -> Result<usize, DbError>;
fn remove_favorite(&mut self, user_id: i32, favorite_path: String);
fn get_favorites(&mut self, user_id: i32) -> Result<Vec<Favorite>, DbError>;
fn update_path(&mut self, old_path: &str, new_path: &str) -> Result<(), DbError>;
fn get_all_paths(&mut self) -> Result<Vec<String>, DbError>;
fn add_favorite(&self, user_id: i32, favorite_path: &str) -> Result<usize, DbError>;
fn remove_favorite(&self, user_id: i32, favorite_path: String);
fn get_favorites(&self, user_id: i32) -> Result<Vec<Favorite>, DbError>;
}
pub struct SqliteFavoriteDao {
connection: Arc<Mutex<SqliteConnection>>,
}
impl Default for SqliteFavoriteDao {
fn default() -> Self {
Self::new()
}
}
impl SqliteFavoriteDao {
pub fn new() -> Self {
SqliteFavoriteDao {
@@ -173,14 +127,15 @@ impl SqliteFavoriteDao {
}
impl FavoriteDao for SqliteFavoriteDao {
fn add_favorite(&mut self, user_id: i32, favorite_path: &str) -> Result<usize, DbError> {
fn add_favorite(&self, user_id: i32, favorite_path: &str) -> Result<usize, DbError> {
use schema::favorites::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get FavoriteDao");
let connection = self.connection.lock().unwrap();
let connection = connection.deref();
if favorites
.filter(userid.eq(user_id).and(path.eq(&favorite_path)))
.first::<Favorite>(connection.deref_mut())
.first::<Favorite>(connection)
.is_err()
{
diesel::insert_into(favorites)
@@ -188,548 +143,28 @@ impl FavoriteDao for SqliteFavoriteDao {
userid: &user_id,
path: favorite_path,
})
.execute(connection.deref_mut())
.execute(connection)
.map_err(|_| DbError::new(DbErrorKind::InsertError))
} else {
Err(DbError::exists())
}
}
fn remove_favorite(&mut self, user_id: i32, favorite_path: String) {
fn remove_favorite(&self, user_id: i32, favorite_path: String) {
use schema::favorites::dsl::*;
diesel::delete(favorites)
.filter(userid.eq(user_id).and(path.eq(favorite_path)))
.execute(self.connection.lock().unwrap().deref_mut())
.execute(self.connection.lock().unwrap().deref())
.unwrap();
}
fn get_favorites(&mut self, user_id: i32) -> Result<Vec<Favorite>, DbError> {
fn get_favorites(&self, user_id: i32) -> Result<Vec<Favorite>, DbError> {
use schema::favorites::dsl::*;
favorites
.filter(userid.eq(user_id))
.load::<Favorite>(self.connection.lock().unwrap().deref_mut())
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn update_path(&mut self, old_path: &str, new_path: &str) -> Result<(), DbError> {
use schema::favorites::dsl::*;
diesel::update(favorites.filter(path.eq(old_path)))
.set(path.eq(new_path))
.execute(self.connection.lock().unwrap().deref_mut())
.map_err(|_| DbError::new(DbErrorKind::UpdateError))?;
Ok(())
}
fn get_all_paths(&mut self) -> Result<Vec<String>, DbError> {
use schema::favorites::dsl::*;
favorites
.select(path)
.distinct()
.load(self.connection.lock().unwrap().deref_mut())
.load::<Favorite>(self.connection.lock().unwrap().deref())
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
}
pub trait ExifDao: Sync + Send {
fn store_exif(
&mut self,
context: &opentelemetry::Context,
exif_data: InsertImageExif,
) -> Result<ImageExif, DbError>;
fn get_exif(
&mut self,
context: &opentelemetry::Context,
file_path: &str,
) -> Result<Option<ImageExif>, DbError>;
fn update_exif(
&mut self,
context: &opentelemetry::Context,
exif_data: InsertImageExif,
) -> Result<ImageExif, DbError>;
fn delete_exif(
&mut self,
context: &opentelemetry::Context,
file_path: &str,
) -> Result<(), DbError>;
fn get_all_with_date_taken(
&mut self,
context: &opentelemetry::Context,
) -> Result<Vec<(String, i64)>, DbError>;
/// Batch load EXIF data for multiple file paths (single query)
fn get_exif_batch(
&mut self,
context: &opentelemetry::Context,
file_paths: &[String],
) -> Result<Vec<ImageExif>, DbError>;
/// Query files by EXIF criteria with optional filters
fn query_by_exif(
&mut self,
context: &opentelemetry::Context,
camera_make: Option<&str>,
camera_model: Option<&str>,
lens_model: Option<&str>,
gps_bounds: Option<(f64, f64, f64, f64)>, // (min_lat, max_lat, min_lon, max_lon)
date_from: Option<i64>,
date_to: Option<i64>,
) -> Result<Vec<ImageExif>, DbError>;
/// Get distinct camera makes with counts
fn get_camera_makes(
&mut self,
context: &opentelemetry::Context,
) -> Result<Vec<(String, i64)>, DbError>;
/// Update file path in EXIF database
fn update_file_path(
&mut self,
context: &opentelemetry::Context,
old_path: &str,
new_path: &str,
) -> Result<(), DbError>;
/// Get all file paths from EXIF database
fn get_all_file_paths(
&mut self,
context: &opentelemetry::Context,
) -> Result<Vec<String>, DbError>;
/// Get files sorted by date with optional pagination
/// Returns (sorted_file_paths, total_count)
fn get_files_sorted_by_date(
&mut self,
context: &opentelemetry::Context,
file_paths: &[String],
ascending: bool,
limit: Option<i64>,
offset: i64,
) -> Result<(Vec<String>, i64), DbError>;
/// Get all photos with GPS coordinates
/// Returns Vec<(file_path, latitude, longitude, date_taken)>
fn get_all_with_gps(
&mut self,
context: &opentelemetry::Context,
base_path: &str,
recursive: bool,
) -> Result<Vec<(String, f64, f64, Option<i64>)>, DbError>;
}
pub struct SqliteExifDao {
connection: Arc<Mutex<SqliteConnection>>,
}
impl Default for SqliteExifDao {
fn default() -> Self {
Self::new()
}
}
impl SqliteExifDao {
pub fn new() -> Self {
SqliteExifDao {
connection: Arc::new(Mutex::new(connect())),
}
}
}
impl ExifDao for SqliteExifDao {
fn store_exif(
&mut self,
context: &opentelemetry::Context,
exif_data: InsertImageExif,
) -> Result<ImageExif, DbError> {
trace_db_call(context, "insert", "store_exif", |_span| {
use schema::image_exif::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
diesel::insert_into(image_exif)
.values(&exif_data)
.execute(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Insert error"))?;
image_exif
.filter(file_path.eq(&exif_data.file_path))
.first::<ImageExif>(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn get_exif(
&mut self,
context: &opentelemetry::Context,
path: &str,
) -> Result<Option<ImageExif>, DbError> {
trace_db_call(context, "query", "get_exif", |_span| {
use schema::image_exif::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
// Try both normalized (forward slash) and Windows (backslash) paths
// since database may contain either format
let normalized = path.replace('\\', "/");
let windows_path = path.replace('/', "\\");
match image_exif
.filter(file_path.eq(&normalized).or(file_path.eq(&windows_path)))
.first::<ImageExif>(connection.deref_mut())
{
Ok(exif) => Ok(Some(exif)),
Err(diesel::result::Error::NotFound) => Ok(None),
Err(_) => Err(anyhow::anyhow!("Query error")),
}
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
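The dual-format lookup in `get_exif` above matches a requested path against both separator styles, since the table may hold either. The matching rule in isolation, as a std-only sketch:

```rust
// Mirrors get_exif's filter: compare the stored path against both the
// forward-slash and backslash renderings of the requested path.
fn matches_stored(stored: &str, requested: &str) -> bool {
    let normalized = requested.replace('\\', "/");
    let windows = requested.replace('/', "\\");
    stored == normalized || stored == windows
}

fn main() {
    // A Windows-style request finds a normalized stored path, and vice versa.
    assert!(matches_stored("photos/2021/img.jpg", "photos\\2021\\img.jpg"));
    assert!(matches_stored("photos\\2021\\img.jpg", "photos/2021/img.jpg"));
    assert!(!matches_stored("photos/2020/img.jpg", "photos/2021/img.jpg"));
}
```

One caveat the original shares: a stored path with mixed separators (e.g. `photos/2021\img.jpg`) matches neither candidate, since each rendering is all-forward or all-backslash.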
fn update_exif(
&mut self,
context: &opentelemetry::Context,
exif_data: InsertImageExif,
) -> Result<ImageExif, DbError> {
trace_db_call(context, "update", "update_exif", |_span| {
use schema::image_exif::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
diesel::update(image_exif.filter(file_path.eq(&exif_data.file_path)))
.set((
camera_make.eq(&exif_data.camera_make),
camera_model.eq(&exif_data.camera_model),
lens_model.eq(&exif_data.lens_model),
width.eq(&exif_data.width),
height.eq(&exif_data.height),
orientation.eq(&exif_data.orientation),
gps_latitude.eq(&exif_data.gps_latitude),
gps_longitude.eq(&exif_data.gps_longitude),
gps_altitude.eq(&exif_data.gps_altitude),
focal_length.eq(&exif_data.focal_length),
aperture.eq(&exif_data.aperture),
shutter_speed.eq(&exif_data.shutter_speed),
iso.eq(&exif_data.iso),
date_taken.eq(&exif_data.date_taken),
last_modified.eq(&exif_data.last_modified),
))
.execute(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Update error"))?;
image_exif
.filter(file_path.eq(&exif_data.file_path))
.first::<ImageExif>(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::UpdateError))
}
fn delete_exif(&mut self, context: &opentelemetry::Context, path: &str) -> Result<(), DbError> {
trace_db_call(context, "delete", "delete_exif", |_span| {
use schema::image_exif::dsl::*;
diesel::delete(image_exif.filter(file_path.eq(path)))
.execute(self.connection.lock().unwrap().deref_mut())
.map(|_| ())
.map_err(|_| anyhow::anyhow!("Delete error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_all_with_date_taken(
&mut self,
context: &opentelemetry::Context,
) -> Result<Vec<(String, i64)>, DbError> {
trace_db_call(context, "query", "get_all_with_date_taken", |_span| {
use schema::image_exif::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
image_exif
.select((file_path, date_taken))
.filter(date_taken.is_not_null())
.load::<(String, Option<i64>)>(connection.deref_mut())
.map(|records| {
records
.into_iter()
.filter_map(|(path, dt)| dt.map(|ts| (path, ts)))
.collect()
})
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_exif_batch(
&mut self,
context: &opentelemetry::Context,
file_paths: &[String],
) -> Result<Vec<ImageExif>, DbError> {
trace_db_call(context, "query", "get_exif_batch", |_span| {
use schema::image_exif::dsl::*;
if file_paths.is_empty() {
return Ok(Vec::new());
}
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
image_exif
.filter(file_path.eq_any(file_paths))
.load::<ImageExif>(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn query_by_exif(
&mut self,
context: &opentelemetry::Context,
camera_make_filter: Option<&str>,
camera_model_filter: Option<&str>,
lens_model_filter: Option<&str>,
gps_bounds: Option<(f64, f64, f64, f64)>,
date_from: Option<i64>,
date_to: Option<i64>,
) -> Result<Vec<ImageExif>, DbError> {
trace_db_call(context, "query", "query_by_exif", |_span| {
use schema::image_exif::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
let mut query = image_exif.into_boxed();
// Camera filters (case-insensitive partial match)
if let Some(make) = camera_make_filter {
query = query.filter(camera_make.like(format!("%{}%", make)));
}
if let Some(model) = camera_model_filter {
query = query.filter(camera_model.like(format!("%{}%", model)));
}
if let Some(lens) = lens_model_filter {
query = query.filter(lens_model.like(format!("%{}%", lens)));
}
// GPS bounding box
if let Some((min_lat, max_lat, min_lon, max_lon)) = gps_bounds {
query = query
.filter(gps_latitude.between(min_lat as f32, max_lat as f32))
.filter(gps_longitude.between(min_lon as f32, max_lon as f32))
.filter(gps_latitude.is_not_null())
.filter(gps_longitude.is_not_null());
}
// Date range
if let Some(from) = date_from {
query = query.filter(date_taken.ge(from));
}
if let Some(to) = date_to {
query = query.filter(date_taken.le(to));
}
if date_from.is_some() || date_to.is_some() {
query = query.filter(date_taken.is_not_null());
}
query
.load::<ImageExif>(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
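`query_by_exif` accumulates optional filters onto a boxed Diesel query: each `Some` narrows the result set, each `None` is a no-op, and supplying either date bound implicitly requires `date_taken IS NOT NULL`. A hedged std-only sketch of that accumulation over an in-memory slice (field names are illustrative, not the full schema):

```rust
#[derive(Clone)]
struct Exif {
    camera_make: Option<String>,
    date_taken: Option<i64>,
}

// Each Some(filter) narrows the rows; each None passes everything through,
// mirroring the into_boxed() + conditional .filter() chain.
fn query(rows: &[Exif], make: Option<&str>, date_from: Option<i64>, date_to: Option<i64>) -> Vec<Exif> {
    rows.iter()
        .filter(|r| match make {
            // Case-insensitive partial match, like SQL LIKE '%make%'.
            Some(m) => r
                .camera_make
                .as_deref()
                .map(|cm| cm.to_lowercase().contains(&m.to_lowercase()))
                .unwrap_or(false),
            None => true,
        })
        .filter(|r| match (date_from, date_to) {
            (None, None) => true,
            // Any date bound excludes rows with no date_taken at all.
            (from, to) => r.date_taken.map_or(false, |d| {
                from.map_or(true, |f| d >= f) && to.map_or(true, |t| d <= t)
            }),
        })
        .cloned()
        .collect()
}

fn main() {
    let rows = vec![
        Exif { camera_make: Some("Canon".into()), date_taken: Some(100) },
        Exif { camera_make: Some("Nikon".into()), date_taken: None },
        Exif { camera_make: None, date_taken: Some(500) },
    ];
    assert_eq!(query(&rows, Some("can"), None, None).len(), 1);
    // Date filters drop the row whose date_taken is NULL.
    assert_eq!(query(&rows, None, Some(50), None).len(), 2);
    assert_eq!(query(&rows, None, None, None).len(), 3);
}
```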
fn get_camera_makes(
&mut self,
context: &opentelemetry::Context,
) -> Result<Vec<(String, i64)>, DbError> {
trace_db_call(context, "query", "get_camera_makes", |_span| {
use diesel::dsl::count;
use schema::image_exif::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
image_exif
.filter(camera_make.is_not_null())
.group_by(camera_make)
.select((camera_make, count(id)))
.order(count(id).desc())
.load::<(Option<String>, i64)>(connection.deref_mut())
.map(|records| {
records
.into_iter()
.filter_map(|(make, cnt)| make.map(|m| (m, cnt)))
.collect()
})
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn update_file_path(
&mut self,
context: &opentelemetry::Context,
old_path: &str,
new_path: &str,
) -> Result<(), DbError> {
trace_db_call(context, "update", "update_file_path", |_span| {
use schema::image_exif::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
diesel::update(image_exif.filter(file_path.eq(old_path)))
.set(file_path.eq(new_path))
.execute(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Update error"))?;
Ok(())
})
.map_err(|_| DbError::new(DbErrorKind::UpdateError))
}
fn get_all_file_paths(
&mut self,
context: &opentelemetry::Context,
) -> Result<Vec<String>, DbError> {
trace_db_call(context, "query", "get_all_file_paths", |_span| {
use schema::image_exif::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get ExifDao");
image_exif
.select(file_path)
.load(connection.deref_mut())
.map_err(|_| anyhow::anyhow!("Query error"))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_files_sorted_by_date(
&mut self,
context: &opentelemetry::Context,
file_paths: &[String],
ascending: bool,
limit: Option<i64>,
offset: i64,
) -> Result<(Vec<String>, i64), DbError> {
trace_db_call(context, "query", "get_files_sorted_by_date", |span| {
use diesel::dsl::count_star;
use opentelemetry::KeyValue;
use opentelemetry::trace::Span;
use schema::image_exif::dsl::*;
span.set_attributes(vec![
KeyValue::new("file_count", file_paths.len() as i64),
KeyValue::new("ascending", ascending.to_string()),
KeyValue::new("limit", limit.map(|l| l.to_string()).unwrap_or_default()),
KeyValue::new("offset", offset.to_string()),
]);
if file_paths.is_empty() {
return Ok((Vec::new(), 0));
}
let connection = &mut *self.connection.lock().expect("Unable to get ExifDao");
// Get total count of files that have EXIF data
let total_count: i64 = image_exif
.filter(file_path.eq_any(file_paths))
.select(count_star())
.first(connection)
.map_err(|_| anyhow::anyhow!("Count query error"))?;
// Build sorted query
let mut query = image_exif.filter(file_path.eq_any(file_paths)).into_boxed();
// Apply sorting
// Note: NULL ordering varies across databases; SQLite treats NULLs as smallest,
// so they appear first for ASC and last for DESC by default
if ascending {
query = query.order(date_taken.asc());
} else {
query = query.order(date_taken.desc());
}
// Apply pagination if requested
if let Some(limit_val) = limit {
query = query.limit(limit_val).offset(offset);
}
// Execute and extract file paths
let results: Vec<String> = query
.select(file_path)
.load::<String>(connection)
.map_err(|_| anyhow::anyhow!("Query error"))?;
Ok((results, total_count))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_all_with_gps(
&mut self,
context: &opentelemetry::Context,
base_path: &str,
recursive: bool,
) -> Result<Vec<(String, f64, f64, Option<i64>)>, DbError> {
trace_db_call(context, "query", "get_all_with_gps", |span| {
use opentelemetry::KeyValue;
use opentelemetry::trace::Span;
use schema::image_exif::dsl::*;
span.set_attributes(vec![
KeyValue::new("base_path", base_path.to_string()),
KeyValue::new("recursive", recursive.to_string()),
]);
let connection = &mut *self.connection.lock().unwrap();
// Query all photos with non-null GPS coordinates
let mut query = image_exif
.filter(gps_latitude.is_not_null().and(gps_longitude.is_not_null()))
.into_boxed();
// Apply path filtering
// If base_path is empty or "/", return all GPS photos (no filter)
// Otherwise filter by path prefix
if !base_path.is_empty() && base_path != "/" {
// Match base path as prefix (with wildcard)
query = query.filter(file_path.like(format!("{}%", base_path)));
span.set_attribute(KeyValue::new("path_filter_applied", true));
} else {
span.set_attribute(KeyValue::new("path_filter_applied", false));
span.set_attribute(KeyValue::new("returning_all_gps_photos", true));
}
// Load full ImageExif records
let results: Vec<ImageExif> = query
.load::<ImageExif>(connection)
.map_err(|e| anyhow::anyhow!("GPS query error: {}", e))?;
// Convert to tuple format (path, lat, lon, date_taken)
// Filter out any rows where GPS is still None (shouldn't happen due to filter)
// Cast f32 GPS values to f64 for API compatibility
let filtered: Vec<(String, f64, f64, Option<i64>)> = results
.into_iter()
.filter_map(|exif| {
if let (Some(lat_val), Some(lon_val)) = (exif.gps_latitude, exif.gps_longitude)
{
Some((
exif.file_path,
lat_val as f64,
lon_val as f64,
exif.date_taken,
))
} else {
None
}
})
.collect();
span.set_attribute(KeyValue::new("result_count", filtered.len() as i64));
Ok(filtered)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
}
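The sort in `get_files_sorted_by_date` leans on SQLite's default NULL ordering (NULLs first for ASC, last for DESC). Rust's derived `Ord` for `Option<i64>` mirrors the ASC case, which makes the behavior easy to sketch outside the database — a hypothetical illustration, not code from this repo:

```rust
fn main() {
    // date_taken values as they might come back from the DB: some rows lack EXIF dates.
    let mut dates: Vec<Option<i64>> = vec![Some(1_633_000_000), None, Some(1_600_000_000)];

    // Ascending: None < Some(_) under Option's derived Ord, matching SQLite's
    // NULLs-first behavior for ORDER BY date_taken ASC.
    dates.sort();
    assert_eq!(dates, vec![None, Some(1_600_000_000), Some(1_633_000_000)]);

    // Descending: reversing puts None last, matching ORDER BY date_taken DESC.
    dates.reverse();
    assert_eq!(dates, vec![Some(1_633_000_000), Some(1_600_000_000), None]);

    println!("ok");
}
```

Callers that want missing dates in a fixed position regardless of sort direction would need an explicit `ORDER BY date_taken IS NULL` term rather than relying on this default.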

View File

@@ -1,8 +1,8 @@
use crate::database::schema::{favorites, image_exif, photo_insights, users, video_preview_clips};
use crate::database::schema::{favorites, tagged_photo, tags, users};
use serde::Serialize;
#[derive(Insertable)]
#[diesel(table_name = users)]
#[table_name = "users"]
pub struct InsertUser<'a> {
pub username: &'a str,
pub password: &'a str,
@@ -17,7 +17,7 @@ pub struct User {
}
#[derive(Insertable)]
#[diesel(table_name = favorites)]
#[table_name = "favorites"]
pub struct InsertFavorite<'a> {
pub userid: &'a i32,
pub path: &'a str,
@@ -30,87 +30,32 @@ pub struct Favorite {
pub path: String,
}
#[derive(Insertable)]
#[diesel(table_name = image_exif)]
pub struct InsertImageExif {
pub file_path: String,
pub camera_make: Option<String>,
pub camera_model: Option<String>,
pub lens_model: Option<String>,
pub width: Option<i32>,
pub height: Option<i32>,
pub orientation: Option<i32>,
pub gps_latitude: Option<f32>,
pub gps_longitude: Option<f32>,
pub gps_altitude: Option<f32>,
pub focal_length: Option<f32>,
pub aperture: Option<f32>,
pub shutter_speed: Option<String>,
pub iso: Option<i32>,
pub date_taken: Option<i64>,
#[derive(Serialize, Queryable, Clone, Debug)]
pub struct Tag {
pub id: i32,
pub name: String,
pub created_time: i64,
pub last_modified: i64,
}
#[derive(Serialize, Queryable, Clone, Debug)]
pub struct ImageExif {
pub id: i32,
pub file_path: String,
pub camera_make: Option<String>,
pub camera_model: Option<String>,
pub lens_model: Option<String>,
pub width: Option<i32>,
pub height: Option<i32>,
pub orientation: Option<i32>,
pub gps_latitude: Option<f32>,
pub gps_longitude: Option<f32>,
pub gps_altitude: Option<f32>,
pub focal_length: Option<f32>,
pub aperture: Option<f32>,
pub shutter_speed: Option<String>,
pub iso: Option<i32>,
pub date_taken: Option<i64>,
#[derive(Insertable, Clone, Debug)]
#[table_name = "tags"]
pub struct InsertTag {
pub name: String,
pub created_time: i64,
pub last_modified: i64,
}
#[derive(Insertable)]
#[diesel(table_name = photo_insights)]
pub struct InsertPhotoInsight {
pub file_path: String,
pub title: String,
pub summary: String,
pub generated_at: i64,
pub model_version: String,
#[derive(Insertable, Clone, Debug)]
#[table_name = "tagged_photo"]
pub struct InsertTaggedPhoto {
pub tag_id: i32,
pub photo_name: String,
pub created_time: i64,
}
#[derive(Serialize, Queryable, Clone, Debug)]
pub struct PhotoInsight {
#[derive(Queryable, Clone, Debug)]
pub struct TaggedPhoto {
pub id: i32,
pub file_path: String,
pub title: String,
pub summary: String,
pub generated_at: i64,
pub model_version: String,
}
#[derive(Insertable)]
#[diesel(table_name = video_preview_clips)]
pub struct InsertVideoPreviewClip {
pub file_path: String,
pub status: String,
pub created_at: String,
pub updated_at: String,
}
#[derive(Serialize, Queryable, Clone, Debug)]
pub struct VideoPreviewClip {
pub id: i32,
pub file_path: String,
pub status: String,
pub duration_seconds: Option<f32>,
pub file_size_bytes: Option<i32>,
pub error_message: Option<String>,
pub created_at: String,
pub updated_at: String,
pub photo_name: String,
pub tag_id: i32,
pub created_time: i64,
}

View File

@@ -1,354 +0,0 @@
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use std::ops::DerefMut;
use std::sync::{Arc, Mutex};
use crate::database::models::{InsertVideoPreviewClip, VideoPreviewClip};
use crate::database::{connect, DbError, DbErrorKind};
use crate::otel::trace_db_call;
pub trait PreviewDao: Sync + Send {
fn insert_preview(
&mut self,
context: &opentelemetry::Context,
file_path_val: &str,
status_val: &str,
) -> Result<(), DbError>;
fn update_status(
&mut self,
context: &opentelemetry::Context,
file_path_val: &str,
status_val: &str,
duration: Option<f32>,
size: Option<i32>,
error: Option<&str>,
) -> Result<(), DbError>;
fn get_preview(
&mut self,
context: &opentelemetry::Context,
file_path_val: &str,
) -> Result<Option<VideoPreviewClip>, DbError>;
fn get_previews_batch(
&mut self,
context: &opentelemetry::Context,
file_paths: &[String],
) -> Result<Vec<VideoPreviewClip>, DbError>;
fn get_by_status(
&mut self,
context: &opentelemetry::Context,
status_val: &str,
) -> Result<Vec<VideoPreviewClip>, DbError>;
}
pub struct SqlitePreviewDao {
connection: Arc<Mutex<SqliteConnection>>,
}
impl Default for SqlitePreviewDao {
fn default() -> Self {
Self::new()
}
}
impl SqlitePreviewDao {
pub fn new() -> Self {
SqlitePreviewDao {
connection: Arc::new(Mutex::new(connect())),
}
}
#[cfg(test)]
pub fn from_connection(conn: SqliteConnection) -> Self {
SqlitePreviewDao {
connection: Arc::new(Mutex::new(conn)),
}
}
}
impl PreviewDao for SqlitePreviewDao {
fn insert_preview(
&mut self,
context: &opentelemetry::Context,
file_path_val: &str,
status_val: &str,
) -> Result<(), DbError> {
trace_db_call(context, "insert", "insert_preview", |_span| {
use crate::database::schema::video_preview_clips::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get PreviewDao");
let now = chrono::Utc::now().to_rfc3339();
diesel::insert_or_ignore_into(video_preview_clips)
.values(InsertVideoPreviewClip {
file_path: file_path_val.to_string(),
status: status_val.to_string(),
created_at: now.clone(),
updated_at: now,
})
.execute(connection.deref_mut())
.map(|_| ())
.map_err(|e| anyhow::anyhow!("Insert error: {}", e))
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn update_status(
&mut self,
context: &opentelemetry::Context,
file_path_val: &str,
status_val: &str,
duration: Option<f32>,
size: Option<i32>,
error: Option<&str>,
) -> Result<(), DbError> {
trace_db_call(context, "update", "update_preview_status", |_span| {
use crate::database::schema::video_preview_clips::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get PreviewDao");
let now = chrono::Utc::now().to_rfc3339();
diesel::update(video_preview_clips.filter(file_path.eq(file_path_val)))
.set((
status.eq(status_val),
duration_seconds.eq(duration),
file_size_bytes.eq(size),
error_message.eq(error),
updated_at.eq(&now),
))
.execute(connection.deref_mut())
.map(|_| ())
.map_err(|e| anyhow::anyhow!("Update error: {}", e))
})
.map_err(|_| DbError::new(DbErrorKind::UpdateError))
}
fn get_preview(
&mut self,
context: &opentelemetry::Context,
file_path_val: &str,
) -> Result<Option<VideoPreviewClip>, DbError> {
trace_db_call(context, "query", "get_preview", |_span| {
use crate::database::schema::video_preview_clips::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get PreviewDao");
match video_preview_clips
.filter(file_path.eq(file_path_val))
.first::<VideoPreviewClip>(connection.deref_mut())
{
Ok(clip) => Ok(Some(clip)),
Err(diesel::result::Error::NotFound) => Ok(None),
Err(e) => Err(anyhow::anyhow!("Query error: {}", e)),
}
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_previews_batch(
&mut self,
context: &opentelemetry::Context,
file_paths: &[String],
) -> Result<Vec<VideoPreviewClip>, DbError> {
trace_db_call(context, "query", "get_previews_batch", |_span| {
use crate::database::schema::video_preview_clips::dsl::*;
if file_paths.is_empty() {
return Ok(Vec::new());
}
let mut connection = self.connection.lock().expect("Unable to get PreviewDao");
video_preview_clips
.filter(file_path.eq_any(file_paths))
.load::<VideoPreviewClip>(connection.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {}", e))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_by_status(
&mut self,
context: &opentelemetry::Context,
status_val: &str,
) -> Result<Vec<VideoPreviewClip>, DbError> {
trace_db_call(context, "query", "get_previews_by_status", |_span| {
use crate::database::schema::video_preview_clips::dsl::*;
let mut connection = self.connection.lock().expect("Unable to get PreviewDao");
video_preview_clips
.filter(status.eq(status_val))
.load::<VideoPreviewClip>(connection.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {}", e))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
}
#[cfg(test)]
mod tests {
use super::*;
use crate::database::test::in_memory_db_connection;
fn setup_dao() -> SqlitePreviewDao {
SqlitePreviewDao::from_connection(in_memory_db_connection())
}
fn ctx() -> opentelemetry::Context {
opentelemetry::Context::new()
}
#[test]
fn test_insert_and_get_preview() {
let mut dao = setup_dao();
let ctx = ctx();
dao.insert_preview(&ctx, "photos/video.mp4", "pending")
.unwrap();
let result = dao.get_preview(&ctx, "photos/video.mp4").unwrap();
assert!(result.is_some());
let clip = result.unwrap();
assert_eq!(clip.file_path, "photos/video.mp4");
assert_eq!(clip.status, "pending");
assert!(clip.duration_seconds.is_none());
assert!(clip.file_size_bytes.is_none());
assert!(clip.error_message.is_none());
}
#[test]
fn test_insert_duplicate_ignored() {
let mut dao = setup_dao();
let ctx = ctx();
dao.insert_preview(&ctx, "photos/video.mp4", "pending")
.unwrap();
// Second insert with same path should not error (INSERT OR IGNORE)
dao.insert_preview(&ctx, "photos/video.mp4", "processing")
.unwrap();
// Status should remain "pending" from the first insert
let clip = dao
.get_preview(&ctx, "photos/video.mp4")
.unwrap()
.unwrap();
assert_eq!(clip.status, "pending");
}
#[test]
fn test_update_status_to_complete() {
let mut dao = setup_dao();
let ctx = ctx();
dao.insert_preview(&ctx, "photos/video.mp4", "pending")
.unwrap();
dao.update_status(
&ctx,
"photos/video.mp4",
"complete",
Some(9.5),
Some(1024000),
None,
)
.unwrap();
let clip = dao
.get_preview(&ctx, "photos/video.mp4")
.unwrap()
.unwrap();
assert_eq!(clip.status, "complete");
assert_eq!(clip.duration_seconds, Some(9.5));
assert_eq!(clip.file_size_bytes, Some(1024000));
assert!(clip.error_message.is_none());
}
#[test]
fn test_update_status_to_failed() {
let mut dao = setup_dao();
let ctx = ctx();
dao.insert_preview(&ctx, "photos/video.mp4", "pending")
.unwrap();
dao.update_status(
&ctx,
"photos/video.mp4",
"failed",
None,
None,
Some("ffmpeg exited with code 1"),
)
.unwrap();
let clip = dao
.get_preview(&ctx, "photos/video.mp4")
.unwrap()
.unwrap();
assert_eq!(clip.status, "failed");
assert_eq!(
clip.error_message.as_deref(),
Some("ffmpeg exited with code 1")
);
}
#[test]
fn test_get_preview_not_found() {
let mut dao = setup_dao();
let ctx = ctx();
let result = dao.get_preview(&ctx, "nonexistent/path.mp4").unwrap();
assert!(result.is_none());
}
#[test]
fn test_get_previews_batch() {
let mut dao = setup_dao();
let ctx = ctx();
dao.insert_preview(&ctx, "a/one.mp4", "complete").unwrap();
dao.insert_preview(&ctx, "b/two.mp4", "pending").unwrap();
dao.insert_preview(&ctx, "c/three.mp4", "failed").unwrap();
// Query only two of the three
let paths = vec!["a/one.mp4".to_string(), "c/three.mp4".to_string()];
let results = dao.get_previews_batch(&ctx, &paths).unwrap();
assert_eq!(results.len(), 2);
let statuses: Vec<&str> = results.iter().map(|c| c.status.as_str()).collect();
assert!(statuses.contains(&"complete"));
assert!(statuses.contains(&"failed"));
}
#[test]
fn test_get_previews_batch_empty_input() {
let mut dao = setup_dao();
let ctx = ctx();
let results = dao.get_previews_batch(&ctx, &[]).unwrap();
assert!(results.is_empty());
}
#[test]
fn test_get_by_status() {
let mut dao = setup_dao();
let ctx = ctx();
dao.insert_preview(&ctx, "a.mp4", "pending").unwrap();
dao.insert_preview(&ctx, "b.mp4", "complete").unwrap();
dao.insert_preview(&ctx, "c.mp4", "pending").unwrap();
dao.insert_preview(&ctx, "d.mp4", "failed").unwrap();
let pending = dao.get_by_status(&ctx, "pending").unwrap();
assert_eq!(pending.len(), 2);
let complete = dao.get_by_status(&ctx, "complete").unwrap();
assert_eq!(complete.len(), 1);
assert_eq!(complete[0].file_path, "b.mp4");
let processing = dao.get_by_status(&ctx, "processing").unwrap();
assert!(processing.is_empty());
}
}

View File

@@ -1,37 +1,4 @@
// @generated automatically by Diesel CLI.
diesel::table! {
calendar_events (id) {
id -> Integer,
event_uid -> Nullable<Text>,
summary -> Text,
description -> Nullable<Text>,
location -> Nullable<Text>,
start_time -> BigInt,
end_time -> BigInt,
all_day -> Bool,
organizer -> Nullable<Text>,
attendees -> Nullable<Text>,
embedding -> Nullable<Binary>,
created_at -> BigInt,
source_file -> Nullable<Text>,
}
}
diesel::table! {
daily_conversation_summaries (id) {
id -> Integer,
date -> Text,
contact -> Text,
summary -> Text,
message_count -> Integer,
embedding -> Binary,
created_at -> BigInt,
model_version -> Text,
}
}
diesel::table! {
table! {
favorites (id) {
id -> Integer,
userid -> Integer,
@@ -39,95 +6,7 @@ diesel::table! {
}
}
diesel::table! {
image_exif (id) {
id -> Integer,
file_path -> Text,
camera_make -> Nullable<Text>,
camera_model -> Nullable<Text>,
lens_model -> Nullable<Text>,
width -> Nullable<Integer>,
height -> Nullable<Integer>,
orientation -> Nullable<Integer>,
gps_latitude -> Nullable<Float>,
gps_longitude -> Nullable<Float>,
gps_altitude -> Nullable<Float>,
focal_length -> Nullable<Float>,
aperture -> Nullable<Float>,
shutter_speed -> Nullable<Text>,
iso -> Nullable<Integer>,
date_taken -> Nullable<BigInt>,
created_time -> BigInt,
last_modified -> BigInt,
}
}
diesel::table! {
knowledge_embeddings (id) {
id -> Integer,
keyword -> Text,
description -> Text,
category -> Nullable<Text>,
embedding -> Binary,
created_at -> BigInt,
model_version -> Text,
}
}
diesel::table! {
location_history (id) {
id -> Integer,
timestamp -> BigInt,
latitude -> Float,
longitude -> Float,
accuracy -> Nullable<Integer>,
activity -> Nullable<Text>,
activity_confidence -> Nullable<Integer>,
place_name -> Nullable<Text>,
place_category -> Nullable<Text>,
embedding -> Nullable<Binary>,
created_at -> BigInt,
source_file -> Nullable<Text>,
}
}
diesel::table! {
message_embeddings (id) {
id -> Integer,
contact -> Text,
body -> Text,
timestamp -> BigInt,
is_sent -> Bool,
embedding -> Binary,
created_at -> BigInt,
model_version -> Text,
}
}
diesel::table! {
photo_insights (id) {
id -> Integer,
file_path -> Text,
title -> Text,
summary -> Text,
generated_at -> BigInt,
model_version -> Text,
}
}
diesel::table! {
search_history (id) {
id -> Integer,
timestamp -> BigInt,
query -> Text,
search_engine -> Nullable<Text>,
embedding -> Binary,
created_at -> BigInt,
source_file -> Nullable<Text>,
}
}
diesel::table! {
table! {
tagged_photo (id) {
id -> Integer,
photo_name -> Text,
@@ -136,7 +15,7 @@ diesel::table! {
}
}
diesel::table! {
table! {
tags (id) {
id -> Integer,
name -> Text,
@@ -144,7 +23,7 @@ diesel::table! {
}
}
diesel::table! {
table! {
users (id) {
id -> Integer,
username -> Text,
@@ -152,33 +31,11 @@ diesel::table! {
}
}
diesel::table! {
video_preview_clips (id) {
id -> Integer,
file_path -> Text,
status -> Text,
duration_seconds -> Nullable<Float>,
file_size_bytes -> Nullable<Integer>,
error_message -> Nullable<Text>,
created_at -> Text,
updated_at -> Text,
}
}
joinable!(tagged_photo -> tags (tag_id));
diesel::joinable!(tagged_photo -> tags (tag_id));
diesel::allow_tables_to_appear_in_same_query!(
calendar_events,
daily_conversation_summaries,
allow_tables_to_appear_in_same_query!(
favorites,
image_exif,
knowledge_embeddings,
location_history,
message_embeddings,
photo_insights,
search_history,
tagged_photo,
tags,
users,
video_preview_clips,
);

View File

@@ -1,516 +0,0 @@
use diesel::prelude::*;
use diesel::sqlite::SqliteConnection;
use serde::Serialize;
use std::ops::DerefMut;
use std::sync::{Arc, Mutex};
use crate::database::{DbError, DbErrorKind, connect};
use crate::otel::trace_db_call;
/// Represents a search history record
#[derive(Serialize, Clone, Debug)]
pub struct SearchRecord {
pub id: i32,
pub timestamp: i64,
pub query: String,
pub search_engine: Option<String>,
pub created_at: i64,
pub source_file: Option<String>,
}
/// Data for inserting a new search record
#[derive(Clone, Debug)]
pub struct InsertSearchRecord {
pub timestamp: i64,
pub query: String,
pub search_engine: Option<String>,
pub embedding: Vec<f32>, // 768-dim, REQUIRED
pub created_at: i64,
pub source_file: Option<String>,
}
pub trait SearchHistoryDao: Sync + Send {
/// Store search with embedding
fn store_search(
&mut self,
context: &opentelemetry::Context,
search: InsertSearchRecord,
) -> Result<SearchRecord, DbError>;
/// Batch insert searches
fn store_searches_batch(
&mut self,
context: &opentelemetry::Context,
searches: Vec<InsertSearchRecord>,
) -> Result<usize, DbError>;
/// Find searches in time range (for temporal context)
fn find_searches_in_range(
&mut self,
context: &opentelemetry::Context,
start_ts: i64,
end_ts: i64,
) -> Result<Vec<SearchRecord>, DbError>;
/// Find semantically similar searches (PRIMARY - embeddings shine here)
fn find_similar_searches(
&mut self,
context: &opentelemetry::Context,
query_embedding: &[f32],
limit: usize,
) -> Result<Vec<SearchRecord>, DbError>;
/// Hybrid: Time window + semantic ranking
fn find_relevant_searches_hybrid(
&mut self,
context: &opentelemetry::Context,
center_timestamp: i64,
time_window_days: i64,
query_embedding: Option<&[f32]>,
limit: usize,
) -> Result<Vec<SearchRecord>, DbError>;
/// Deduplication check
fn search_exists(
&mut self,
context: &opentelemetry::Context,
timestamp: i64,
query: &str,
) -> Result<bool, DbError>;
/// Get count of search records
fn get_search_count(&mut self, context: &opentelemetry::Context) -> Result<i64, DbError>;
}
pub struct SqliteSearchHistoryDao {
connection: Arc<Mutex<SqliteConnection>>,
}
impl Default for SqliteSearchHistoryDao {
fn default() -> Self {
Self::new()
}
}
impl SqliteSearchHistoryDao {
pub fn new() -> Self {
SqliteSearchHistoryDao {
connection: Arc::new(Mutex::new(connect())),
}
}
fn serialize_vector(vec: &[f32]) -> Vec<u8> {
use zerocopy::IntoBytes;
vec.as_bytes().to_vec()
}
fn deserialize_vector(bytes: &[u8]) -> Result<Vec<f32>, DbError> {
if !bytes.len().is_multiple_of(4) {
return Err(DbError::new(DbErrorKind::QueryError));
}
let count = bytes.len() / 4;
let mut vec = Vec::with_capacity(count);
for chunk in bytes.chunks_exact(4) {
let float = f32::from_le_bytes([chunk[0], chunk[1], chunk[2], chunk[3]]);
vec.push(float);
}
Ok(vec)
}
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
if a.len() != b.len() {
return 0.0;
}
let dot_product: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
let magnitude_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
let magnitude_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();
if magnitude_a == 0.0 || magnitude_b == 0.0 {
return 0.0;
}
dot_product / (magnitude_a * magnitude_b)
}
}
#[derive(QueryableByName)]
struct SearchRecordWithVectorRow {
#[diesel(sql_type = diesel::sql_types::Integer)]
id: i32,
#[diesel(sql_type = diesel::sql_types::BigInt)]
timestamp: i64,
#[diesel(sql_type = diesel::sql_types::Text)]
query: String,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
search_engine: Option<String>,
#[diesel(sql_type = diesel::sql_types::Binary)]
embedding: Vec<u8>,
#[diesel(sql_type = diesel::sql_types::BigInt)]
created_at: i64,
#[diesel(sql_type = diesel::sql_types::Nullable<diesel::sql_types::Text>)]
source_file: Option<String>,
}
impl SearchRecordWithVectorRow {
fn to_search_record(&self) -> SearchRecord {
SearchRecord {
id: self.id,
timestamp: self.timestamp,
query: self.query.clone(),
search_engine: self.search_engine.clone(),
created_at: self.created_at,
source_file: self.source_file.clone(),
}
}
}
#[derive(QueryableByName)]
struct LastInsertRowId {
#[diesel(sql_type = diesel::sql_types::Integer)]
id: i32,
}
impl SearchHistoryDao for SqliteSearchHistoryDao {
fn store_search(
&mut self,
context: &opentelemetry::Context,
search: InsertSearchRecord,
) -> Result<SearchRecord, DbError> {
trace_db_call(context, "insert", "store_search", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get SearchHistoryDao");
// Validate embedding dimensions (REQUIRED for searches)
if search.embedding.len() != 768 {
return Err(anyhow::anyhow!(
"Invalid embedding dimensions: {} (expected 768)",
search.embedding.len()
));
}
let embedding_bytes = Self::serialize_vector(&search.embedding);
// INSERT OR IGNORE to handle re-imports (UNIQUE constraint on timestamp+query)
diesel::sql_query(
"INSERT OR IGNORE INTO search_history
(timestamp, query, search_engine, embedding, created_at, source_file)
VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
)
.bind::<diesel::sql_types::BigInt, _>(search.timestamp)
.bind::<diesel::sql_types::Text, _>(&search.query)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&search.search_engine)
.bind::<diesel::sql_types::Binary, _>(&embedding_bytes)
.bind::<diesel::sql_types::BigInt, _>(search.created_at)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(&search.source_file)
.execute(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Insert error: {:?}", e))?;
let row_id: i32 = diesel::sql_query("SELECT last_insert_rowid() as id")
.get_result::<LastInsertRowId>(conn.deref_mut())
.map(|r| r.id)
.map_err(|e| anyhow::anyhow!("Failed to get last insert ID: {:?}", e))?;
Ok(SearchRecord {
id: row_id,
timestamp: search.timestamp,
query: search.query,
search_engine: search.search_engine,
created_at: search.created_at,
source_file: search.source_file,
})
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn store_searches_batch(
&mut self,
context: &opentelemetry::Context,
searches: Vec<InsertSearchRecord>,
) -> Result<usize, DbError> {
trace_db_call(context, "insert", "store_searches_batch", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get SearchHistoryDao");
let mut inserted = 0;
conn.transaction::<_, anyhow::Error, _>(|conn| {
for search in searches {
// Validate embedding (REQUIRED)
if search.embedding.len() != 768 {
log::warn!(
"Skipping search with invalid embedding dimensions: {}",
search.embedding.len()
);
continue;
}
let embedding_bytes = Self::serialize_vector(&search.embedding);
let rows_affected = diesel::sql_query(
"INSERT OR IGNORE INTO search_history
(timestamp, query, search_engine, embedding, created_at, source_file)
VALUES (?1, ?2, ?3, ?4, ?5, ?6)",
)
.bind::<diesel::sql_types::BigInt, _>(search.timestamp)
.bind::<diesel::sql_types::Text, _>(&search.query)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&search.search_engine,
)
.bind::<diesel::sql_types::Binary, _>(&embedding_bytes)
.bind::<diesel::sql_types::BigInt, _>(search.created_at)
.bind::<diesel::sql_types::Nullable<diesel::sql_types::Text>, _>(
&search.source_file,
)
.execute(conn)
.map_err(|e| anyhow::anyhow!("Batch insert error: {:?}", e))?;
if rows_affected > 0 {
inserted += 1;
}
}
Ok(())
})
.map_err(|e| anyhow::anyhow!("Transaction error: {:?}", e))?;
Ok(inserted)
})
.map_err(|_| DbError::new(DbErrorKind::InsertError))
}
fn find_searches_in_range(
&mut self,
context: &opentelemetry::Context,
start_ts: i64,
end_ts: i64,
) -> Result<Vec<SearchRecord>, DbError> {
trace_db_call(context, "query", "find_searches_in_range", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get SearchHistoryDao");
diesel::sql_query(
"SELECT id, timestamp, query, search_engine, embedding, created_at, source_file
FROM search_history
WHERE timestamp >= ?1 AND timestamp <= ?2
ORDER BY timestamp DESC",
)
.bind::<diesel::sql_types::BigInt, _>(start_ts)
.bind::<diesel::sql_types::BigInt, _>(end_ts)
.load::<SearchRecordWithVectorRow>(conn.deref_mut())
.map(|rows| rows.into_iter().map(|r| r.to_search_record()).collect())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn find_similar_searches(
&mut self,
context: &opentelemetry::Context,
query_embedding: &[f32],
limit: usize,
) -> Result<Vec<SearchRecord>, DbError> {
trace_db_call(context, "query", "find_similar_searches", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get SearchHistoryDao");
if query_embedding.len() != 768 {
return Err(anyhow::anyhow!(
"Invalid query embedding dimensions: {} (expected 768)",
query_embedding.len()
));
}
// Load all searches with embeddings
let results = diesel::sql_query(
"SELECT id, timestamp, query, search_engine, embedding, created_at, source_file
FROM search_history",
)
.load::<SearchRecordWithVectorRow>(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
// Compute similarities
let mut scored_searches: Vec<(f32, SearchRecord)> = results
.into_iter()
.filter_map(|row| {
if let Ok(emb) = Self::deserialize_vector(&row.embedding) {
let similarity = Self::cosine_similarity(query_embedding, &emb);
Some((similarity, row.to_search_record()))
} else {
None
}
})
.collect();
// Sort by similarity descending
scored_searches
.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
log::info!("Found {} similar searches", scored_searches.len());
if !scored_searches.is_empty() {
log::info!(
"Top similarity: {:.4} for query: '{}'",
scored_searches[0].0,
scored_searches[0].1.query
);
}
Ok(scored_searches
.into_iter()
.take(limit)
.map(|(_, search)| search)
.collect())
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn find_relevant_searches_hybrid(
&mut self,
context: &opentelemetry::Context,
center_timestamp: i64,
time_window_days: i64,
query_embedding: Option<&[f32]>,
limit: usize,
) -> Result<Vec<SearchRecord>, DbError> {
trace_db_call(context, "query", "find_relevant_searches_hybrid", |_span| {
let window_seconds = time_window_days * 86400;
let start_ts = center_timestamp - window_seconds;
let end_ts = center_timestamp + window_seconds;
let mut conn = self
.connection
.lock()
.expect("Unable to get SearchHistoryDao");
// Step 1: Time-based filter (fast, indexed)
let searches_in_range = diesel::sql_query(
"SELECT id, timestamp, query, search_engine, embedding, created_at, source_file
FROM search_history
WHERE timestamp >= ?1 AND timestamp <= ?2",
)
.bind::<diesel::sql_types::BigInt, _>(start_ts)
.bind::<diesel::sql_types::BigInt, _>(end_ts)
.load::<SearchRecordWithVectorRow>(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
// Step 2: If query embedding provided, rank by semantic similarity
if let Some(query_emb) = query_embedding {
if query_emb.len() != 768 {
return Err(anyhow::anyhow!(
"Invalid query embedding dimensions: {} (expected 768)",
query_emb.len()
));
}
let mut scored_searches: Vec<(f32, SearchRecord)> = searches_in_range
.into_iter()
.filter_map(|row| {
if let Ok(emb) = Self::deserialize_vector(&row.embedding) {
let similarity = Self::cosine_similarity(query_emb, &emb);
Some((similarity, row.to_search_record()))
} else {
None
}
})
.collect();
// Sort by similarity descending
scored_searches
.sort_by(|a, b| b.0.partial_cmp(&a.0).unwrap_or(std::cmp::Ordering::Equal));
log::info!(
"Hybrid query: {} searches in time range, ranked by similarity",
scored_searches.len()
);
if !scored_searches.is_empty() {
log::info!(
"Top similarity: {:.4} for '{}'",
scored_searches[0].0,
scored_searches[0].1.query
);
}
Ok(scored_searches
.into_iter()
.take(limit)
.map(|(_, search)| search)
.collect())
} else {
// No semantic ranking, just return time-sorted (most recent first)
log::info!(
"Time-only query: {} searches in range",
searches_in_range.len()
);
Ok(searches_in_range
.into_iter()
.take(limit)
.map(|r| r.to_search_record())
.collect())
}
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn search_exists(
&mut self,
context: &opentelemetry::Context,
timestamp: i64,
query: &str,
) -> Result<bool, DbError> {
trace_db_call(context, "query", "search_exists", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get SearchHistoryDao");
#[derive(QueryableByName)]
struct CountResult {
#[diesel(sql_type = diesel::sql_types::Integer)]
count: i32,
}
let result: CountResult = diesel::sql_query(
"SELECT COUNT(*) as count FROM search_history WHERE timestamp = ?1 AND query = ?2",
)
.bind::<diesel::sql_types::BigInt, _>(timestamp)
.bind::<diesel::sql_types::Text, _>(query)
.get_result(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
Ok(result.count > 0)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
fn get_search_count(&mut self, context: &opentelemetry::Context) -> Result<i64, DbError> {
trace_db_call(context, "query", "get_search_count", |_span| {
let mut conn = self
.connection
.lock()
.expect("Unable to get SearchHistoryDao");
#[derive(QueryableByName)]
struct CountResult {
#[diesel(sql_type = diesel::sql_types::BigInt)]
count: i64,
}
let result: CountResult =
diesel::sql_query("SELECT COUNT(*) as count FROM search_history")
.get_result(conn.deref_mut())
.map_err(|e| anyhow::anyhow!("Query error: {:?}", e))?;
Ok(result.count)
})
.map_err(|_| DbError::new(DbErrorKind::QueryError))
}
}


@@ -1,14 +0,0 @@
use actix_web::{error::InternalError, http::StatusCode};
pub trait IntoHttpError<T> {
fn into_http_internal_err(self) -> Result<T, actix_web::Error>;
}
impl<T> IntoHttpError<T> for Result<T, anyhow::Error> {
fn into_http_internal_err(self) -> Result<T, actix_web::Error> {
self.map_err(|e| {
log::error!("Map to err: {:?}", e);
InternalError::new(e, StatusCode::INTERNAL_SERVER_ERROR).into()
})
}
}


@@ -1,319 +0,0 @@
use std::fs::File;
use std::io::BufReader;
use std::path::Path;
use anyhow::{Result, anyhow};
use exif::{In, Reader, Tag, Value};
use log::debug;
use serde::{Deserialize, Serialize};
#[derive(Debug, Clone, Serialize, Deserialize, Default)]
pub struct ExifData {
pub camera_make: Option<String>,
pub camera_model: Option<String>,
pub lens_model: Option<String>,
pub width: Option<i32>,
pub height: Option<i32>,
pub orientation: Option<i32>,
pub gps_latitude: Option<f64>,
pub gps_longitude: Option<f64>,
pub gps_altitude: Option<f64>,
pub focal_length: Option<f64>,
pub aperture: Option<f64>,
pub shutter_speed: Option<String>,
pub iso: Option<i32>,
pub date_taken: Option<i64>,
}
pub fn supports_exif(path: &Path) -> bool {
if let Some(ext) = path.extension() {
let ext_lower = ext.to_string_lossy().to_lowercase();
matches!(
ext_lower.as_str(),
// JPEG formats
"jpg" | "jpeg" |
// TIFF and RAW formats based on TIFF
"tiff" | "tif" | "nef" | "cr2" | "cr3" | "arw" | "dng" | "raf" | "orf" | "rw2" | "pef" | "srw" |
// HEIF and variants
"heif" | "heic" | "avif" |
// PNG
"png" |
// WebP
"webp"
)
} else {
false
}
}
pub fn extract_exif_from_path(path: &Path) -> Result<ExifData> {
debug!("Extracting EXIF from: {:?}", path);
if !supports_exif(path) {
return Err(anyhow!("File type does not support EXIF"));
}
let file = File::open(path)?;
let mut bufreader = BufReader::new(file);
let exifreader = Reader::new();
let exif = exifreader.read_from_container(&mut bufreader)?;
let mut data = ExifData::default();
for field in exif.fields() {
match field.tag {
Tag::Make => {
data.camera_make = get_string_value(field);
}
Tag::Model => {
data.camera_model = get_string_value(field);
}
Tag::LensModel => {
data.lens_model = get_string_value(field);
}
Tag::PixelXDimension | Tag::ImageWidth => {
if data.width.is_none() {
data.width = get_u32_value(field).map(|v| v as i32);
}
}
Tag::PixelYDimension | Tag::ImageLength => {
if data.height.is_none() {
data.height = get_u32_value(field).map(|v| v as i32);
}
}
Tag::Orientation => {
data.orientation = get_u32_value(field).map(|v| v as i32);
}
Tag::FocalLength => {
data.focal_length = get_rational_value(field);
}
Tag::FNumber => {
data.aperture = get_rational_value(field);
}
Tag::ExposureTime => {
data.shutter_speed = get_rational_string(field);
}
Tag::PhotographicSensitivity | Tag::ISOSpeed => {
if data.iso.is_none() {
data.iso = get_u32_value(field).map(|v| v as i32);
}
}
Tag::DateTime | Tag::DateTimeOriginal => {
if data.date_taken.is_none() {
data.date_taken = parse_exif_datetime(field);
}
}
_ => {}
}
}
// Extract GPS coordinates
if let Some(lat) = extract_gps_coordinate(&exif, Tag::GPSLatitude, Tag::GPSLatitudeRef) {
data.gps_latitude = Some(lat);
}
if let Some(lon) = extract_gps_coordinate(&exif, Tag::GPSLongitude, Tag::GPSLongitudeRef) {
data.gps_longitude = Some(lon);
}
if let Some(alt) = extract_gps_altitude(&exif) {
data.gps_altitude = Some(alt);
}
debug!("Extracted EXIF data: {:?}", data);
Ok(data)
}
fn get_string_value(field: &exif::Field) -> Option<String> {
match &field.value {
Value::Ascii(vec) => {
if let Some(bytes) = vec.first() {
String::from_utf8(bytes.to_vec())
.ok()
.map(|s| s.trim_end_matches('\0').to_string())
} else {
None
}
}
_ => {
let display = field.display_value().to_string();
if display.is_empty() {
None
} else {
Some(display)
}
}
}
}
fn get_u32_value(field: &exif::Field) -> Option<u32> {
match &field.value {
Value::Short(vec) => vec.first().map(|&v| v as u32),
Value::Long(vec) => vec.first().copied(),
_ => None,
}
}
fn get_rational_value(field: &exif::Field) -> Option<f64> {
match &field.value {
Value::Rational(vec) => {
if let Some(rational) = vec.first() {
if rational.denom == 0 {
None
} else {
Some(rational.num as f64 / rational.denom as f64)
}
} else {
None
}
}
_ => None,
}
}
fn get_rational_string(field: &exif::Field) -> Option<String> {
match &field.value {
Value::Rational(vec) => {
if let Some(rational) = vec.first() {
if rational.denom == 0 {
None
} else if rational.num < rational.denom {
Some(format!("{}/{}", rational.num, rational.denom))
} else {
let value = rational.num as f64 / rational.denom as f64;
Some(format!("{:.2}", value))
}
} else {
None
}
}
_ => None,
}
}
fn parse_exif_datetime(field: &exif::Field) -> Option<i64> {
if let Some(datetime_str) = get_string_value(field) {
use chrono::NaiveDateTime;
// EXIF datetime format: "YYYY:MM:DD HH:MM:SS"
// Note: EXIF dates are local time without timezone info
// We return the timestamp as if it were UTC, and the client will display it as-is
NaiveDateTime::parse_from_str(&datetime_str, "%Y:%m:%d %H:%M:%S")
.ok()
.map(|dt| dt.and_utc().timestamp())
} else {
None
}
}
fn extract_gps_coordinate(exif: &exif::Exif, coord_tag: Tag, ref_tag: Tag) -> Option<f64> {
let coord_field = exif.get_field(coord_tag, In::PRIMARY)?;
let ref_field = exif.get_field(ref_tag, In::PRIMARY)?;
let coordinates = match &coord_field.value {
Value::Rational(vec) => {
if vec.len() < 3 {
return None;
}
let degrees = vec[0].num as f64 / vec[0].denom as f64;
let minutes = vec[1].num as f64 / vec[1].denom as f64;
let seconds = vec[2].num as f64 / vec[2].denom as f64;
degrees + (minutes / 60.0) + (seconds / 3600.0)
}
_ => return None,
};
let reference = get_string_value(ref_field)?;
let sign = if reference.starts_with('S') || reference.starts_with('W') {
-1.0
} else {
1.0
};
Some(coordinates * sign)
}
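The degrees/minutes/seconds arithmetic and hemisphere sign flip above can be sketched in isolation (this is a hypothetical standalone helper for illustration, not part of the file):

```rust
/// Convert a GPS coordinate from degrees/minutes/seconds to decimal degrees.
/// Southern ('S') and western ('W') hemisphere references negate the result,
/// matching the sign logic used in extract_gps_coordinate.
fn dms_to_decimal(degrees: f64, minutes: f64, seconds: f64, reference: char) -> f64 {
    let decimal = degrees + minutes / 60.0 + seconds / 3600.0;
    if reference == 'S' || reference == 'W' {
        -decimal
    } else {
        decimal
    }
}
```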
fn extract_gps_altitude(exif: &exif::Exif) -> Option<f64> {
let alt_field = exif.get_field(Tag::GPSAltitude, In::PRIMARY)?;
match &alt_field.value {
Value::Rational(vec) => {
if let Some(rational) = vec.first() {
if rational.denom == 0 {
None
} else {
let altitude = rational.num as f64 / rational.denom as f64;
// Check if below sea level
if let Some(ref_field) = exif.get_field(Tag::GPSAltitudeRef, In::PRIMARY)
&& let Some(ref_val) = get_u32_value(ref_field)
&& ref_val == 1
{
return Some(-altitude);
}
Some(altitude)
}
} else {
None
}
}
_ => None,
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_supports_exif_jpeg() {
assert!(supports_exif(Path::new("test.jpg")));
assert!(supports_exif(Path::new("test.jpeg")));
assert!(supports_exif(Path::new("test.JPG")));
}
#[test]
fn test_supports_exif_raw_formats() {
assert!(supports_exif(Path::new("test.nef"))); // Nikon
assert!(supports_exif(Path::new("test.NEF")));
assert!(supports_exif(Path::new("test.cr2"))); // Canon
assert!(supports_exif(Path::new("test.cr3"))); // Canon
assert!(supports_exif(Path::new("test.arw"))); // Sony
assert!(supports_exif(Path::new("test.dng"))); // Adobe DNG
}
#[test]
fn test_supports_exif_tiff() {
assert!(supports_exif(Path::new("test.tiff")));
assert!(supports_exif(Path::new("test.tif")));
assert!(supports_exif(Path::new("test.TIFF")));
}
#[test]
fn test_supports_exif_heif() {
assert!(supports_exif(Path::new("test.heif")));
assert!(supports_exif(Path::new("test.heic")));
assert!(supports_exif(Path::new("test.avif")));
}
#[test]
fn test_supports_exif_png_webp() {
assert!(supports_exif(Path::new("test.png")));
assert!(supports_exif(Path::new("test.PNG")));
assert!(supports_exif(Path::new("test.webp")));
assert!(supports_exif(Path::new("test.WEBP")));
}
#[test]
fn test_supports_exif_unsupported() {
assert!(!supports_exif(Path::new("test.mp4")));
assert!(!supports_exif(Path::new("test.mov")));
assert!(!supports_exif(Path::new("test.txt")));
assert!(!supports_exif(Path::new("test.gif")));
}
#[test]
fn test_supports_exif_no_extension() {
assert!(!supports_exif(Path::new("test")));
}
}


@@ -1,88 +0,0 @@
use std::path::Path;
use walkdir::DirEntry;
/// Supported image file extensions
pub const IMAGE_EXTENSIONS: &[&str] = &[
"jpg", "jpeg", "png", "webp", "tiff", "tif", "heif", "heic", "avif", "nef",
];
/// Supported video file extensions
pub const VIDEO_EXTENSIONS: &[&str] = &["mp4", "mov", "avi", "mkv"];
/// Check if a path has an image extension
pub fn is_image_file(path: &Path) -> bool {
if let Some(ext) = path.extension().and_then(|e| e.to_str()) {
let ext_lower = ext.to_lowercase();
IMAGE_EXTENSIONS.contains(&ext_lower.as_str())
} else {
false
}
}
/// Check if a path has a video extension
pub fn is_video_file(path: &Path) -> bool {
if let Some(ext) = path.extension().and_then(|e| e.to_str()) {
let ext_lower = ext.to_lowercase();
VIDEO_EXTENSIONS.contains(&ext_lower.as_str())
} else {
false
}
}
/// Check if a path has a supported media extension (image or video)
pub fn is_media_file(path: &Path) -> bool {
is_image_file(path) || is_video_file(path)
}
/// Check if a DirEntry is an image file (for walkdir usage)
#[allow(dead_code)]
pub fn direntry_is_image(entry: &DirEntry) -> bool {
is_image_file(entry.path())
}
/// Check if a DirEntry is a video file (for walkdir usage)
#[allow(dead_code)]
pub fn direntry_is_video(entry: &DirEntry) -> bool {
is_video_file(entry.path())
}
/// Check if a DirEntry is a media file (for walkdir usage)
#[allow(dead_code)]
pub fn direntry_is_media(entry: &DirEntry) -> bool {
is_media_file(entry.path())
}
#[cfg(test)]
mod tests {
use super::*;
use std::path::Path;
#[test]
fn test_is_image_file() {
assert!(is_image_file(Path::new("photo.jpg")));
assert!(is_image_file(Path::new("photo.JPG")));
assert!(is_image_file(Path::new("photo.png")));
assert!(is_image_file(Path::new("photo.nef")));
assert!(!is_image_file(Path::new("video.mp4")));
assert!(!is_image_file(Path::new("document.txt")));
}
#[test]
fn test_is_video_file() {
assert!(is_video_file(Path::new("video.mp4")));
assert!(is_video_file(Path::new("video.MP4")));
assert!(is_video_file(Path::new("video.mov")));
assert!(is_video_file(Path::new("video.avi")));
assert!(!is_video_file(Path::new("photo.jpg")));
assert!(!is_video_file(Path::new("document.txt")));
}
#[test]
fn test_is_media_file() {
assert!(is_media_file(Path::new("photo.jpg")));
assert!(is_media_file(Path::new("video.mp4")));
assert!(is_media_file(Path::new("photo.PNG")));
assert!(!is_media_file(Path::new("document.txt")));
assert!(!is_media_file(Path::new("no_extension")));
}
}

File diff suppressed because it is too large


@@ -1,121 +0,0 @@
/// Geographic calculation utilities for GPS-based search
use std::f64;
/// Calculate distance between two GPS coordinates using the Haversine formula.
/// Returns distance in kilometers.
///
/// # Arguments
/// * `lat1` - Latitude of first point in decimal degrees
/// * `lon1` - Longitude of first point in decimal degrees
/// * `lat2` - Latitude of second point in decimal degrees
/// * `lon2` - Longitude of second point in decimal degrees
///
/// # Example
/// ```
/// use image_api::geo::haversine_distance;
/// let distance = haversine_distance(37.7749, -122.4194, 34.0522, -118.2437);
/// // Distance between San Francisco and Los Angeles (~559 km)
/// ```
pub fn haversine_distance(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
const EARTH_RADIUS_KM: f64 = 6371.0;
let lat1_rad = lat1.to_radians();
let lat2_rad = lat2.to_radians();
let delta_lat = (lat2 - lat1).to_radians();
let delta_lon = (lon2 - lon1).to_radians();
let a = (delta_lat / 2.0).sin().powi(2)
+ lat1_rad.cos() * lat2_rad.cos() * (delta_lon / 2.0).sin().powi(2);
let c = 2.0 * a.sqrt().atan2((1.0 - a).sqrt());
EARTH_RADIUS_KM * c
}
/// Calculate bounding box for GPS radius query.
/// Returns (min_lat, max_lat, min_lon, max_lon) that encompasses the search radius.
///
/// This is used as a fast first-pass filter for GPS queries, narrowing down
/// candidates before applying the more expensive Haversine distance calculation.
///
/// # Arguments
/// * `lat` - Center latitude in decimal degrees
/// * `lon` - Center longitude in decimal degrees
/// * `radius_km` - Search radius in kilometers
///
/// # Returns
/// A tuple of (min_lat, max_lat, min_lon, max_lon) in decimal degrees
pub fn gps_bounding_box(lat: f64, lon: f64, radius_km: f64) -> (f64, f64, f64, f64) {
const EARTH_RADIUS_KM: f64 = 6371.0;
// Calculate latitude delta (same at all latitudes)
let lat_delta = (radius_km / EARTH_RADIUS_KM) * (180.0 / f64::consts::PI);
// Calculate longitude delta (varies with latitude)
let lon_delta = lat_delta / lat.to_radians().cos();
(
lat - lat_delta, // min_lat
lat + lat_delta, // max_lat
lon - lon_delta, // min_lon
lon + lon_delta, // max_lon
)
}
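Together these two functions form the two-pass radius filter described in the doc comment: the bounding box cheaply discards most candidates, then Haversine confirms the survivors. A self-contained sketch of that pipeline (the `within_radius` helper and its tuple point type are illustrative, not part of the file; `to_degrees()` is equivalent to the `* 180/π` conversion above):

```rust
const EARTH_RADIUS_KM: f64 = 6371.0;

fn haversine_distance(lat1: f64, lon1: f64, lat2: f64, lon2: f64) -> f64 {
    let (lat1_rad, lat2_rad) = (lat1.to_radians(), lat2.to_radians());
    let (dlat, dlon) = ((lat2 - lat1).to_radians(), (lon2 - lon1).to_radians());
    let a = (dlat / 2.0).sin().powi(2)
        + lat1_rad.cos() * lat2_rad.cos() * (dlon / 2.0).sin().powi(2);
    EARTH_RADIUS_KM * 2.0 * a.sqrt().atan2((1.0 - a).sqrt())
}

/// Two-pass radius filter over (lat, lon) points:
/// pass 1 keeps only points inside the bounding box (cheap comparisons),
/// pass 2 applies the exact Haversine distance check.
fn within_radius(points: &[(f64, f64)], lat: f64, lon: f64, radius_km: f64) -> Vec<(f64, f64)> {
    let lat_delta = (radius_km / EARTH_RADIUS_KM).to_degrees();
    let lon_delta = lat_delta / lat.to_radians().cos();
    points
        .iter()
        .copied()
        .filter(|&(p_lat, p_lon)| {
            // Pass 1: bounding-box prefilter
            (p_lat - lat).abs() <= lat_delta && (p_lon - lon).abs() <= lon_delta
        })
        .filter(|&(p_lat, p_lon)| {
            // Pass 2: exact great-circle distance
            haversine_distance(lat, lon, p_lat, p_lon) <= radius_km
        })
        .collect()
}
```

In the database this prefilter would typically become a `WHERE lat BETWEEN … AND lon BETWEEN …` clause so the expensive trigonometry only runs on rows that pass the index-friendly range check.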
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_haversine_distance_sf_to_la() {
// San Francisco to Los Angeles
let distance = haversine_distance(37.7749, -122.4194, 34.0522, -118.2437);
// Should be approximately 559 km
assert!(
(distance - 559.0).abs() < 10.0,
"Distance should be ~559km, got {}",
distance
);
}
#[test]
fn test_haversine_distance_same_point() {
// Same point should have zero distance
let distance = haversine_distance(37.7749, -122.4194, 37.7749, -122.4194);
assert!(
distance < 0.001,
"Same point should have ~0 distance, got {}",
distance
);
}
#[test]
fn test_gps_bounding_box() {
// Test bounding box calculation for 10km radius around San Francisco
let (min_lat, max_lat, min_lon, max_lon) = gps_bounding_box(37.7749, -122.4194, 10.0);
// Verify the bounds are reasonable
assert!(min_lat < 37.7749, "min_lat should be less than center");
assert!(max_lat > 37.7749, "max_lat should be greater than center");
assert!(min_lon < -122.4194, "min_lon should be less than center");
assert!(max_lon > -122.4194, "max_lon should be greater than center");
// Verify bounds span roughly the right distance
let lat_span = max_lat - min_lat;
assert!(
lat_span > 0.1 && lat_span < 0.3,
"Latitude span should be reasonable for 10km"
);
}
#[test]
fn test_haversine_distance_across_equator() {
// Test across equator
let distance = haversine_distance(1.0, 0.0, -1.0, 0.0);
// Should be approximately 222 km
assert!(
(distance - 222.0).abs() < 5.0,
"Distance should be ~222km, got {}",
distance
);
}
}


@@ -1,45 +0,0 @@
#[macro_use]
extern crate diesel;
pub mod ai;
pub mod auth;
pub mod cleanup;
pub mod data;
pub mod database;
pub mod error;
pub mod exif;
pub mod file_types;
pub mod files;
pub mod geo;
pub mod memories;
pub mod otel;
pub mod parsers;
pub mod service;
pub mod state;
pub mod tags;
#[cfg(test)]
pub mod testhelpers;
pub mod utils;
pub mod video;
// Re-export commonly used types
pub use data::{Claims, ThumbnailRequest};
pub use database::{connect, schema};
pub use state::AppState;
// Stub functions for modules that reference main.rs
// These are not used by cleanup_files binary
use std::path::Path;
use walkdir::DirEntry;
pub fn create_thumbnails() {
// Stub - implemented in main.rs
}
pub fn update_media_counts(_media_dir: &Path) {
// Stub - implemented in main.rs
}
pub fn is_video(entry: &DirEntry) -> bool {
file_types::direntry_is_video(entry)
}

File diff suppressed because it is too large

File diff suppressed because it is too large


@@ -1,112 +0,0 @@
use actix_web::HttpRequest;
use actix_web::http::header::HeaderMap;
use opentelemetry::global::{BoxedSpan, BoxedTracer};
use opentelemetry::propagation::TextMapPropagator;
use opentelemetry::trace::{Span, Status, Tracer};
use opentelemetry::{Context, KeyValue, global};
use opentelemetry_appender_log::OpenTelemetryLogBridge;
use opentelemetry_otlp::WithExportConfig;
use opentelemetry_sdk::Resource;
use opentelemetry_sdk::logs::{BatchLogProcessor, SdkLoggerProvider};
use opentelemetry_sdk::propagation::TraceContextPropagator;
pub fn global_tracer() -> BoxedTracer {
global::tracer("image-server")
}
#[allow(dead_code)]
pub fn init_tracing() {
let resources = Resource::builder()
.with_attributes([
KeyValue::new("service.name", "image-server"),
KeyValue::new("service.version", env!("CARGO_PKG_VERSION")),
])
.build();
let span_exporter = opentelemetry_otlp::SpanExporter::builder()
.with_tonic()
.with_endpoint(std::env::var("OTLP_OTLS_ENDPOINT").unwrap())
.build()
.unwrap();
let tracer_provider = opentelemetry_sdk::trace::SdkTracerProvider::builder()
.with_batch_exporter(span_exporter)
.with_resource(resources)
.build();
global::set_tracer_provider(tracer_provider);
}
#[allow(dead_code)]
pub fn init_logs() {
let otlp_exporter = opentelemetry_otlp::LogExporter::builder()
.with_tonic()
.with_endpoint(std::env::var("OTLP_OTLS_ENDPOINT").unwrap())
.build()
.unwrap();
let exporter = opentelemetry_stdout::LogExporter::default();
let resources = Resource::builder()
.with_attributes([
KeyValue::new("service.name", "image-server"),
KeyValue::new("service.version", env!("CARGO_PKG_VERSION")),
])
.build();
let log_provider = SdkLoggerProvider::builder()
.with_log_processor(BatchLogProcessor::builder(exporter).build())
.with_log_processor(BatchLogProcessor::builder(otlp_exporter).build())
.with_resource(resources)
.build();
let otel_log_appender = OpenTelemetryLogBridge::new(&log_provider);
log::set_boxed_logger(Box::new(otel_log_appender)).expect("Unable to set boxed logger");
//TODO: Still set this with the env? Ideally we still have a clean/simple local logger for local dev
log::set_max_level(log::LevelFilter::Info);
}
struct HeaderExtractor<'a>(&'a HeaderMap);
impl<'a> opentelemetry::propagation::Extractor for HeaderExtractor<'a> {
fn get(&self, key: &str) -> Option<&str> {
self.0.get(key).and_then(|v| v.to_str().ok())
}
fn keys(&self) -> Vec<&str> {
self.0.keys().map(|k| k.as_str()).collect()
}
}
pub fn extract_context_from_request(req: &HttpRequest) -> Context {
let propagator = TraceContextPropagator::new();
propagator.extract(&HeaderExtractor(req.headers()))
}
pub fn trace_db_call<F, O>(
context: &Context,
query_type: &str,
operation: &str,
func: F,
) -> anyhow::Result<O>
where
F: FnOnce(&mut BoxedSpan) -> anyhow::Result<O>,
{
let tracer = global::tracer("db");
let mut span = tracer
.span_builder(format!("db.{}.{}", query_type, operation))
.with_attributes(vec![
KeyValue::new("db.query_type", query_type.to_string()),
KeyValue::new("db.operation", operation.to_string()),
])
.start_with_context(&tracer, context);
let result = func(&mut span);
match &result {
Ok(_) => {
span.set_status(Status::Ok);
}
Err(e) => span.set_status(Status::error(e.to_string())),
}
result
}


@@ -1,183 +0,0 @@
use anyhow::{Context, Result};
use chrono::NaiveDateTime;
use ical::parser::ical::component::IcalCalendar;
use ical::property::Property;
use std::fs::File;
use std::io::BufReader;
#[derive(Debug, Clone)]
pub struct ParsedCalendarEvent {
pub event_uid: Option<String>,
pub summary: String,
pub description: Option<String>,
pub location: Option<String>,
pub start_time: i64,
pub end_time: i64,
pub all_day: bool,
pub organizer: Option<String>,
pub attendees: Vec<String>,
}
pub fn parse_ics_file(path: &str) -> Result<Vec<ParsedCalendarEvent>> {
let file = File::open(path).context("Failed to open .ics file")?;
let reader = BufReader::new(file);
let parser = ical::IcalParser::new(reader);
let mut events = Vec::new();
for calendar_result in parser {
let calendar: IcalCalendar = calendar_result.context("Failed to parse calendar")?;
for event in calendar.events {
// Extract properties
let mut event_uid = None;
let mut summary = None;
let mut description = None;
let mut location = None;
let mut start_time = None;
let mut end_time = None;
let mut all_day = false;
let mut organizer = None;
let mut attendees = Vec::new();
for property in event.properties {
match property.name.as_str() {
"UID" => {
event_uid = property.value;
}
"SUMMARY" => {
summary = property.value;
}
"DESCRIPTION" => {
description = property.value;
}
"LOCATION" => {
location = property.value;
}
"DTSTART" => {
if let Some(ref value) = property.value {
start_time = parse_ical_datetime(value, &property)?;
// Check if it's an all-day event (no time component)
all_day = value.len() == 8; // YYYYMMDD format
}
}
"DTEND" => {
if let Some(ref value) = property.value {
end_time = parse_ical_datetime(value, &property)?;
}
}
"ORGANIZER" => {
organizer = extract_email_from_mailto(property.value.as_deref());
}
"ATTENDEE" => {
if let Some(email) = extract_email_from_mailto(property.value.as_deref()) {
attendees.push(email);
}
}
_ => {}
}
}
// Only include events with required fields
if let (Some(summary_text), Some(start), Some(end)) = (summary, start_time, end_time) {
events.push(ParsedCalendarEvent {
event_uid,
summary: summary_text,
description,
location,
start_time: start,
end_time: end,
all_day,
organizer,
attendees,
});
}
}
}
Ok(events)
}
fn parse_ical_datetime(value: &str, property: &Property) -> Result<Option<i64>> {
// Check for TZID parameter
let _tzid = property.params.as_ref().and_then(|params| {
params
.iter()
.find(|(key, _)| key == "TZID")
.and_then(|(_, values)| values.first())
.cloned()
});
// iCal datetime formats:
// - 20240815T140000Z (UTC)
// - 20240815T140000 (local/TZID)
// - 20240815 (all-day)
let cleaned = value.replace("Z", "").replace("T", "");
// All-day event (YYYYMMDD)
if cleaned.len() == 8 {
let dt = NaiveDateTime::parse_from_str(&format!("{}000000", cleaned), "%Y%m%d%H%M%S")
.context("Failed to parse all-day date")?;
return Ok(Some(dt.and_utc().timestamp()));
}
// DateTime event (YYYYMMDDTHHMMSS)
if cleaned.len() >= 14 {
let dt = NaiveDateTime::parse_from_str(&cleaned[..14], "%Y%m%d%H%M%S")
.context("Failed to parse datetime")?;
// If original had 'Z', it's UTC
let timestamp = if value.ends_with('Z') {
dt.and_utc().timestamp()
} else {
// Treat as UTC for simplicity (proper TZID handling is complex)
dt.and_utc().timestamp()
};
return Ok(Some(timestamp));
}
Ok(None)
}
fn extract_email_from_mailto(value: Option<&str>) -> Option<String> {
value.map(|v| {
// ORGANIZER and ATTENDEE often have format: mailto:user@example.com
if v.starts_with("mailto:") {
v.trim_start_matches("mailto:").to_string()
} else {
v.to_string()
}
})
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_parse_ical_datetime() {
let prop = Property {
name: "DTSTART".to_string(),
params: None,
value: Some("20240815T140000Z".to_string()),
};
let timestamp = parse_ical_datetime("20240815T140000Z", &prop).unwrap();
assert!(timestamp.is_some());
}
#[test]
fn test_extract_email() {
assert_eq!(
extract_email_from_mailto(Some("mailto:user@example.com")),
Some("user@example.com".to_string())
);
assert_eq!(
extract_email_from_mailto(Some("user@example.com")),
Some("user@example.com".to_string())
);
}
}


@@ -1,134 +0,0 @@
use anyhow::{Context, Result};
use chrono::DateTime;
use serde::Deserialize;
use std::fs::File;
use std::io::BufReader;
#[derive(Debug, Clone)]
pub struct ParsedLocationRecord {
pub timestamp: i64,
pub latitude: f64,
pub longitude: f64,
pub accuracy: Option<i32>,
pub activity: Option<String>,
pub activity_confidence: Option<i32>,
}
// Google Takeout Location History JSON structures
#[derive(Debug, Deserialize)]
struct LocationHistory {
locations: Vec<LocationPoint>,
}
#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
struct LocationPoint {
timestamp_ms: Option<String>, // Older format
timestamp: Option<String>, // Newer format (ISO8601)
latitude_e7: Option<i64>,
longitude_e7: Option<i64>,
accuracy: Option<i32>,
activity: Option<Vec<ActivityRecord>>,
}
#[derive(Debug, Deserialize)]
struct ActivityRecord {
activity: Vec<ActivityType>,
#[allow(dead_code)] // Part of JSON structure, may be used in future
timestamp_ms: Option<String>,
}
#[derive(Debug, Deserialize)]
struct ActivityType {
#[serde(rename = "type")]
activity_type: String,
confidence: i32,
}
pub fn parse_location_json(path: &str) -> Result<Vec<ParsedLocationRecord>> {
let file = File::open(path).context("Failed to open location JSON file")?;
let reader = BufReader::new(file);
let history: LocationHistory =
serde_json::from_reader(reader).context("Failed to parse location history JSON")?;
let mut records = Vec::new();
for point in history.locations {
// Parse timestamp (try both formats)
let timestamp = if let Some(ts_ms) = point.timestamp_ms {
// Milliseconds since epoch
ts_ms
.parse::<i64>()
.context("Failed to parse timestamp_ms")?
/ 1000
} else if let Some(ts_iso) = point.timestamp {
// ISO8601 format
DateTime::parse_from_rfc3339(&ts_iso)
.context("Failed to parse ISO8601 timestamp")?
.timestamp()
} else {
continue; // Skip points without timestamp
};
// Convert E7 format to decimal degrees
let latitude = point.latitude_e7.map(|e7| e7 as f64 / 10_000_000.0);
let longitude = point.longitude_e7.map(|e7| e7 as f64 / 10_000_000.0);
// Extract highest-confidence activity
let (activity, activity_confidence) = point
.activity
.as_ref()
.and_then(|activities| activities.first())
.and_then(|record| {
record
.activity
.iter()
.max_by_key(|a| a.confidence)
.map(|a| (a.activity_type.clone(), a.confidence))
})
.unzip();
if let (Some(lat), Some(lon)) = (latitude, longitude) {
records.push(ParsedLocationRecord {
timestamp,
latitude: lat,
longitude: lon,
accuracy: point.accuracy,
activity,
activity_confidence,
});
}
}
Ok(records)
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn test_e7_conversion() {
let lat_e7 = 374228300_i64;
let lat = lat_e7 as f64 / 10_000_000.0;
assert!((lat - 37.42283).abs() < 0.00001);
}
#[test]
fn test_parse_sample_json() {
let json = r#"{
"locations": [
{
"latitudeE7": 374228300,
"longitudeE7": -1221086100,
"accuracy": 20,
"timestampMs": "1692115200000"
}
]
}"#;
let history: LocationHistory = serde_json::from_str(json).unwrap();
assert_eq!(history.locations.len(), 1);
}
}


@@ -1,7 +0,0 @@
pub mod ical_parser;
pub mod location_json_parser;
pub mod search_html_parser;
pub use ical_parser::{ParsedCalendarEvent, parse_ics_file};
pub use location_json_parser::{ParsedLocationRecord, parse_location_json};
pub use search_html_parser::{ParsedSearchRecord, parse_search_html};

Some files were not shown because too many files have changed in this diff