Add Google Takeout data import infrastructure

Implements Phase 1 & 2 of Google Takeout RAG integration:
- Database migrations for calendar_events, location_history, search_history
- DAO implementations with hybrid time + semantic search
- Parsers for .ics, JSON, and HTML Google Takeout formats
- Import utilities with batch insert optimization

Features:
- CalendarEventDao: Hybrid time-range + semantic search for events
- LocationHistoryDao: GPS proximity with Haversine distance calculation
- SearchHistoryDao: Semantic-first search (queries are embedding-rich)
- Batch inserts for performance (1M+ records in minutes vs hours)
- OpenTelemetry tracing for all database operations

Import utilities:
- import_calendar: Parse .ics with optional embedding generation
- import_location_history: High-volume GPS data with batch inserts
- import_search_history: Always generates embeddings for semantic search

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
This commit is contained in:
Cameron
2026-01-05 14:50:49 -05:00
parent bb23e6bb25
commit d86b2c3746
27 changed files with 3129 additions and 55 deletions

View File

@@ -0,0 +1 @@
DROP TABLE IF EXISTS location_history;

View File

@@ -0,0 +1,19 @@
CREATE TABLE location_history (
id INTEGER PRIMARY KEY NOT NULL,
timestamp BIGINT NOT NULL,
latitude REAL NOT NULL,
longitude REAL NOT NULL,
accuracy INTEGER,
activity TEXT,
activity_confidence INTEGER,
place_name TEXT,
place_category TEXT,
embedding BLOB,
created_at BIGINT NOT NULL,
source_file TEXT,
UNIQUE(timestamp, latitude, longitude)
);
CREATE INDEX idx_location_timestamp ON location_history(timestamp);
CREATE INDEX idx_location_coords ON location_history(latitude, longitude);
CREATE INDEX idx_location_activity ON location_history(activity);