feat/raw-thumb-embedded-preview #59
26
CLAUDE.md
@@ -111,6 +111,15 @@ All database access goes through trait-based DAOs (e.g., `ExifDao`, `SqliteExifD
 2. Creates 200x200 thumbnails in THUMBNAILS directory (mirrors source structure)
 3. Videos: extracts frame at 3-second mark via ffmpeg
 4. Images: uses `image` crate for JPEG/PNG processing
+5. RAW formats (NEF/CR2/ARW/DNG/etc.): the `image` crate can't decode RAW
+   pixel data, so the pipeline pulls an embedded JPEG preview instead. Fast
+   path is `exif::read_jpeg_at_ifd` against IFD0 (PRIMARY) and IFD1
+   (THUMBNAIL) — covers most older bodies and DNGs. Slow-path fallback shells
+   out to **`exiftool`** for `PreviewImage` / `JpgFromRaw` / `OtherImage`,
+   which reaches MakerNote / SubIFD-hosted previews kamadak-exif can't see
+   (e.g. Nikon's `PreviewIFD`, where modern Nikon bodies store the full-res
+   review JPEG). All candidates are pooled and the largest valid JPEG wins.
+   See `src/exif.rs::extract_embedded_jpeg_preview`.
 
 **File Watching:**
 Runs in background thread with two-tier strategy:
@@ -364,6 +373,8 @@ Configurable env:
 
 ## Dependencies of Note
 
+### Rust crates
+
 - **actix-web**: HTTP framework
 - **diesel**: ORM for SQLite
 - **jsonwebtoken**: JWT implementation
@@ -374,3 +385,18 @@ Configurable env:
 - **opentelemetry**: Distributed tracing
 - **bcrypt**: Password hashing
 - **infer**: Magic number file type detection
+
+### External binaries (must be on `PATH`)
+
+- **`ffmpeg`** — video thumbnail extraction (`StreamActor`, HLS pipeline) and
+  the HEIF/HEIC/NEF/ARW thumbnail fallback in `generate_image_thumbnail_ffmpeg`.
+  Required for any deploy that holds video or HEIF files.
+- **`exiftool`** — optional but strongly recommended for RAW-heavy libraries.
+  The thumbnail pipeline shells out to it as the slow-path fallback for
+  embedded preview extraction (Nikon MakerNote `PreviewIFD`, Canon SubIFDs,
+  etc. — anything kamadak-exif's IFD0/IFD1 readers can't reach). Without
+  exiftool installed, RAWs whose preview lives outside IFD0/IFD1 will fall
+  through to ffmpeg, which often produces black thumbnails. Install via
+  package manager: `apt install libimage-exiftool-perl`,
+  `brew install exiftool`, `winget install OliverBetz.ExifTool`, or
+  `choco install exiftool`.
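Since both binaries have to be on `PATH`, a deploy could probe for them once at startup and log a warning when one is missing. The following is a minimal sketch, not code from this PR: `binary_available` is a hypothetical helper name, and it assumes the tool accepts a cheap version flag (`-ver` for exiftool, `-version` for ffmpeg).

```rust
use std::process::{Command, Stdio};

/// Hypothetical helper (not part of the PR): returns true when `bin` can be
/// spawned from PATH and exits successfully for the given version flag.
/// A spawn failure (binary missing, not executable) maps to `false`.
fn binary_available(bin: &str, version_flag: &str) -> bool {
    Command::new(bin)
        .arg(version_flag)
        // Discard all I/O; only the exit status matters here.
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        .unwrap_or(false)
}
```

A caller might run `binary_available("exiftool", "-ver")` once and log that RAW previews will use the IFD0/IFD1 fast path only when it returns `false`.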
31
README.md
@@ -28,14 +28,31 @@ Builds used in development: the `gyan.dev` full build on Windows, and distro `ff
 packages on Linux work fine. If HEIC thumbnails silently fail, check
 `ffmpeg -formats | grep heif` to confirm HEIF support.
 
-### RAW photo thumbnails (no extra dependency)
+### RAW photo thumbnails
 RAW formats (ARW, NEF, CR2, CR3, DNG, RAF, ORF, RW2, PEF, SRW, TIFF) are thumbnailed
-by reading the embedded JPEG preview from the TIFF IFD1 using `kamadak-exif`. No
-external RAW decoder (libraw / dcraw) is required. Files without an embedded preview
-fall back to ffmpeg (works for most NEF files), and anything that still can't be
-decoded is marked with a `<thumb>.unsupported` sentinel in the thumbnail directory
-so we don't retry it every scan. Delete those sentinels to force retries after a
-tooling upgrade.
+by reading an embedded JPEG preview out of the TIFF container — no external RAW
+decoder (libraw / dcraw) is involved. The pipeline tries two layers in order and
+keeps the largest valid JPEG:
+
+1. **Fast path (no extra dependency)** — `kamadak-exif` reads
+   `JPEGInterchangeFormat` from IFD0 / IFD1 directly. Covers older bodies and
+   most DNGs.
+2. **`exiftool` fallback (recommended for RAW-heavy libraries)** — shells out
+   to extract `PreviewImage` / `JpgFromRaw` / `OtherImage`, which reaches
+   MakerNote and SubIFD-hosted previews kamadak-exif can't see (e.g. Nikon's
+   `PreviewIFD`, where modern Nikon bodies stash the full-res review JPEG).
+   If `exiftool` isn't on `PATH` this layer is skipped silently and only the
+   fast-path result is used.
+
+Install `exiftool` via your package manager:
+- macOS: `brew install exiftool`
+- Linux (Debian/Ubuntu): `apt install libimage-exiftool-perl`
+- Windows: `winget install OliverBetz.ExifTool` or `choco install exiftool`
+
+Files where neither layer produces a valid preview fall back to ffmpeg. Anything
+that still can't be decoded is marked with a `<thumb>.unsupported` sentinel in
+the thumbnail directory so we don't retry it every scan. Delete those sentinels
+(and any cached black thumbnails) to force retries after a tooling upgrade.
 
 ## Environment
 There are a handful of required environment variables to have the API run.
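The `<thumb>.unsupported` sentinel described in the README could be checked along these lines. This is only a sketch under assumptions: the diff does not show the sentinel code, so the helper names `unsupported_sentinel` and `should_skip` are hypothetical, and so is the choice to append the suffix to the full thumbnail file name (extension included).

```rust
use std::ffi::OsString;
use std::path::{Path, PathBuf};

/// Hypothetical helper: builds the sentinel path by appending `.unsupported`
/// to the thumbnail path. Whether the real code keeps the original extension
/// is an assumption; the diff only documents the `<thumb>.unsupported` shape.
fn unsupported_sentinel(thumb: &Path) -> PathBuf {
    let mut name: OsString = thumb.as_os_str().to_owned();
    name.push(".unsupported");
    PathBuf::from(name)
}

/// Hypothetical helper: a scan would skip files whose sentinel already exists.
fn should_skip(thumb: &Path) -> bool {
    unsupported_sentinel(thumb).exists()
}
```

Under this scheme, deleting the `*.unsupported` files after installing `exiftool` makes `should_skip` return `false` again, so the next scan retries those files.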
81
src/exif.rs
@@ -1,6 +1,7 @@
 use std::fs::File;
 use std::io::{BufReader, Read, Seek, SeekFrom};
 use std::path::Path;
+use std::process::Command;
 
 use anyhow::{Result, anyhow};
 use exif::{In, Reader, Tag, Value};
@@ -70,6 +71,55 @@ fn read_jpeg_at_ifd(exif: &exif::Exif, path: &Path, ifd: In) -> Option<Vec<u8>>
     Some(buf)
 }
 
+/// Tags exiftool exposes for embedded JPEG previews, in priority order. The
+/// largest valid JPEG returned by any of them wins. Different camera makers
+/// stash their largest preview under different names: Nikon's full-res
+/// preview lives under `PreviewImage` in the MakerNote `PreviewIFD`, Canon /
+/// Sony often expose theirs as `JpgFromRaw`, and `OtherImage` is a catch-all
+/// some sub-IFD chains use.
+const EXIFTOOL_PREVIEW_TAGS: &[&str] = &["PreviewImage", "JpgFromRaw", "OtherImage"];
+
+/// Shell out to `exiftool -b -<tag>` for one tag. Returns the response bytes
+/// only if exiftool succeeded AND the bytes start with the JPEG SOI marker
+/// (some MakerNote tags hold TIFF-wrapped previews or other non-JPEG payloads
+/// we can't load).
+fn extract_exiftool_tag(path: &Path, tag: &str) -> Option<Vec<u8>> {
+    let output = Command::new("exiftool")
+        .arg("-b")
+        .arg(format!("-{}", tag))
+        .arg(path)
+        .output()
+        .ok()?;
+
+    if !output.status.success() {
+        return None;
+    }
+    let bytes = output.stdout;
+    if bytes.len() < 2 || bytes[0] != 0xFF || bytes[1] != 0xD8 {
+        return None;
+    }
+    Some(bytes)
+}
+
+/// Try each EXIFTOOL_PREVIEW_TAGS in turn and return the largest valid JPEG.
+/// If `exiftool` isn't on PATH the very first spawn returns `None` and we
+/// silently bail — callers fall back to whatever the IFD0/IFD1 fast path
+/// found.
+fn extract_preview_via_exiftool(path: &Path) -> Option<Vec<u8>> {
+    let mut best: Option<Vec<u8>> = None;
+    for &tag in EXIFTOOL_PREVIEW_TAGS {
+        let Some(bytes) = extract_exiftool_tag(path, tag) else {
+            continue;
+        };
+        match &best {
+            None => best = Some(bytes),
+            Some(b) if b.len() < bytes.len() => best = Some(bytes),
+            _ => {}
+        }
+    }
+    best
+}
+
 /// Returns the bytes of the embedded JPEG preview in a TIFF-based RAW or
 /// TIFF file. Used to thumbnail formats whose RAW pixel data can't be decoded
 /// by our normal tools (e.g. Sony ARW), and to serve a usable full-size
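The SOI check in `extract_exiftool_tag` compares two bytes by index after a length guard. A possible equivalent, sketched here with hypothetical names (`JPEG_SOI` and `looks_like_jpeg` are not in the PR), uses `slice::starts_with`, which handles short inputs without a separate guard. Like the original, it only checks the start marker and does not prove the payload is a decodable JPEG.

```rust
/// JPEG Start Of Image marker: every JPEG stream begins with these two bytes.
const JPEG_SOI: [u8; 2] = [0xFF, 0xD8];

/// Hypothetical helper: true when `bytes` begins with the SOI marker.
/// `starts_with` returns false for slices shorter than the prefix, so the
/// explicit `len() < 2` guard is not needed.
fn looks_like_jpeg(bytes: &[u8]) -> bool {
    bytes.starts_with(&JPEG_SOI)
}
```

With such a helper, the tail of `extract_exiftool_tag` could shrink to a single `looks_like_jpeg(&bytes).then_some(bytes)`, assuming a toolchain where `bool::then_some` is available.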
@@ -77,12 +127,20 @@ fn read_jpeg_at_ifd(exif: &exif::Exif, path: &Path, ifd: In) -> Option<Vec<u8>>
 /// `None` if no preview is present, the file isn't a TIFF container, or the
 /// data doesn't look like a valid JPEG.
 ///
-/// Both IFD0 (PRIMARY) and IFD1 (THUMBNAIL) are checked, preferring the
-/// larger valid JPEG. Conventions vary by camera: most modern Nikon NEFs
-/// expose the larger reduced-resolution preview (~1–2 MP) via IFD0 and a
-/// small chip via IFD1; some bodies leave one or the other empty or zero-
-/// length, and an earlier THUMBNAIL-only implementation produced black
-/// thumbnails for any NEF whose IFD1 thumbnail was missing or corrupted.
+/// Strategy:
+/// 1. Fast path: read `JPEGInterchangeFormat` from IFD0 (PRIMARY) and IFD1
+/// (THUMBNAIL) directly via kamadak-exif. No subprocess, no external
+/// dependency.
+/// 2. Slow path: shell out to `exiftool -b -<tag>` for each of
+/// `PreviewImage` / `JpgFromRaw` / `OtherImage`. kamadak-exif can't
+/// reach SubIFDs or MakerNote sub-IFDs, but most modern Nikon bodies
+/// stash their large preview JPEG in the Nikon MakerNote's PreviewIFD;
+/// Canon / Sony often use `JpgFromRaw` in a SubIFD chain. Skipped
+/// gracefully if exiftool isn't on PATH.
+///
+/// All candidates are pooled and the largest valid JPEG wins, so a deploy
+/// without exiftool degrades to "fast-path only" behavior rather than
+/// breaking outright.
 pub fn extract_embedded_jpeg_preview(path: &Path) -> Option<Vec<u8>> {
     if !is_tiff_raw(path) {
         return None;
@@ -94,13 +152,12 @@ pub fn extract_embedded_jpeg_preview(path: &Path) -> Option<Vec<u8>> {
 
     let primary = read_jpeg_at_ifd(&exif, path, In::PRIMARY);
     let thumbnail = read_jpeg_at_ifd(&exif, path, In::THUMBNAIL);
+    let exiftool = extract_preview_via_exiftool(path);
 
-    match (primary, thumbnail) {
-        (Some(p), Some(t)) => Some(if p.len() >= t.len() { p } else { t }),
-        (Some(p), None) => Some(p),
-        (None, Some(t)) => Some(t),
-        (None, None) => None,
-    }
+    [primary, thumbnail, exiftool]
+        .into_iter()
+        .flatten()
+        .max_by_key(|v| v.len())
 }
 
 pub fn supports_exif(path: &Path) -> bool {
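One behavioral detail of the pooled selection: `Iterator::max_by_key` returns the last element when several are equally maximal, whereas the removed `match` preferred `primary` on a tie (`p.len() >= t.len()`). Equal-length candidates are most likely the same bytes, so this probably does not matter in practice. The sketch below isolates the two behaviors; `pick_largest` and `pick_largest_first_wins` are hypothetical names, and the second function is only one possible way to keep the old tie-break, not something the PR does.

```rust
/// Mirrors the pooled selection from the diff: drop the `None`s and keep the
/// longest buffer. On equal lengths `max_by_key` keeps the LAST candidate.
fn pick_largest(candidates: impl IntoIterator<Item = Option<Vec<u8>>>) -> Option<Vec<u8>> {
    candidates.into_iter().flatten().max_by_key(|v| v.len())
}

/// Hypothetical variant: only replace the current best when the next
/// candidate is strictly longer, so the FIRST candidate wins ties
/// (matching the old `p.len() >= t.len()` preference).
fn pick_largest_first_wins(
    candidates: impl IntoIterator<Item = Option<Vec<u8>>>,
) -> Option<Vec<u8>> {
    candidates
        .into_iter()
        .flatten()
        .reduce(|best, next| if next.len() > best.len() { next } else { best })
}
```

Both functions return `None` when every candidate is `None`, which is the "no preview found" case that sends the file on to the ffmpeg fallback.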