feat/raw-thumb-embedded-preview #59
26
CLAUDE.md
@@ -111,6 +111,15 @@ All database access goes through trait-based DAOs (e.g., `ExifDao`, `SqliteExifD
 2. Creates 200x200 thumbnails in THUMBNAILS directory (mirrors source structure)
 3. Videos: extracts frame at 3-second mark via ffmpeg
 4. Images: uses `image` crate for JPEG/PNG processing
+5. RAW formats (NEF/CR2/ARW/DNG/etc.): the `image` crate can't decode RAW
+   pixel data, so the pipeline pulls an embedded JPEG preview instead. Fast
+   path is `exif::read_jpeg_at_ifd` against IFD0 (PRIMARY) and IFD1
+   (THUMBNAIL) — covers most older bodies and DNGs. Slow-path fallback shells
+   out to **`exiftool`** for `PreviewImage` / `JpgFromRaw` / `OtherImage`,
+   which reaches MakerNote / SubIFD-hosted previews kamadak-exif can't see
+   (e.g. Nikon's `PreviewIFD`, where modern Nikon bodies store the full-res
+   review JPEG). All candidates are pooled and the largest valid JPEG wins.
+   See `src/exif.rs::extract_embedded_jpeg_preview`.
 
 **File Watching:**
 Runs in background thread with two-tier strategy:
@@ -364,6 +373,8 @@ Configurable env:
 
 ## Dependencies of Note
 
+### Rust crates
+
 - **actix-web**: HTTP framework
 - **diesel**: ORM for SQLite
 - **jsonwebtoken**: JWT implementation
@@ -374,3 +385,18 @@ Configurable env:
 - **opentelemetry**: Distributed tracing
 - **bcrypt**: Password hashing
 - **infer**: Magic number file type detection
+
+### External binaries (must be on `PATH`)
+
+- **`ffmpeg`** — video thumbnail extraction (`StreamActor`, HLS pipeline) and
+  the HEIF/HEIC/NEF/ARW thumbnail fallback in `generate_image_thumbnail_ffmpeg`.
+  Required for any deploy that holds video or HEIF files.
+- **`exiftool`** — optional but strongly recommended for RAW-heavy libraries.
+  The thumbnail pipeline shells out to it as the slow-path fallback for
+  embedded preview extraction (Nikon MakerNote `PreviewIFD`, Canon SubIFDs,
+  etc. — anything kamadak-exif's IFD0/IFD1 readers can't reach). Without
+  exiftool installed, RAWs whose preview lives outside IFD0/IFD1 will fall
+  through to ffmpeg, which often produces black thumbnails. Install via
+  package manager: `apt install libimage-exiftool-perl`,
+  `brew install exiftool`, `winget install OliverBetz.ExifTool`, or
+  `choco install exiftool`.
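Because both binaries are resolved from `PATH` at call time, a startup probe could surface a missing tool instead of letting it fail silently. The following is a minimal sketch and not part of this PR: `binary_available` and `missing_tool_warnings` are hypothetical names, and `ffmpeg -version` / `exiftool -ver` are used only as cheap invocations that print a version and exit.

```rust
use std::process::{Command, Stdio};

/// Hypothetical helper (not in this PR): true when `name` can be spawned
/// from PATH and exits successfully when given `version_flag`.
pub fn binary_available(name: &str, version_flag: &str) -> bool {
    Command::new(name)
        .arg(version_flag)
        .stdin(Stdio::null())
        .stdout(Stdio::null())
        .stderr(Stdio::null())
        .status()
        .map(|status| status.success())
        // A spawn error (for example "not found") counts as unavailable.
        .unwrap_or(false)
}

/// Hypothetical helper: one warning string per missing external binary.
pub fn missing_tool_warnings() -> Vec<String> {
    let mut warnings = Vec::new();
    if !binary_available("ffmpeg", "-version") {
        warnings.push("ffmpeg not found: video and HEIF thumbnails will fail".to_string());
    }
    if !binary_available("exiftool", "-ver") {
        warnings.push("exiftool not found: RAW previews limited to IFD0/IFD1".to_string());
    }
    warnings
}
```

Logging the returned strings once at startup would be one way to use this; whether the project wants such a check at all is a design choice this sketch does not settle.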
31
README.md
@@ -28,14 +28,31 @@ Builds used in development: the `gyan.dev` full build on Windows, and distro `ff
 packages on Linux work fine. If HEIC thumbnails silently fail, check
 `ffmpeg -formats | grep heif` to confirm HEIF support.
 
-### RAW photo thumbnails (no extra dependency)
+### RAW photo thumbnails
 RAW formats (ARW, NEF, CR2, CR3, DNG, RAF, ORF, RW2, PEF, SRW, TIFF) are thumbnailed
-by reading the embedded JPEG preview from the TIFF IFD1 using `kamadak-exif`. No
-external RAW decoder (libraw / dcraw) is required. Files without an embedded preview
-fall back to ffmpeg (works for most NEF files), and anything that still can't be
-decoded is marked with a `<thumb>.unsupported` sentinel in the thumbnail directory
-so we don't retry it every scan. Delete those sentinels to force retries after a
-tooling upgrade.
+by reading an embedded JPEG preview out of the TIFF container — no external RAW
+decoder (libraw / dcraw) is involved. The pipeline tries two layers in order and
+keeps the largest valid JPEG:
+
+1. **Fast path (no extra dependency)** — `kamadak-exif` reads
+   `JPEGInterchangeFormat` from IFD0 / IFD1 directly. Covers older bodies and
+   most DNGs.
+2. **`exiftool` fallback (recommended for RAW-heavy libraries)** — shells out
+   to extract `PreviewImage` / `JpgFromRaw` / `OtherImage`, which reaches
+   MakerNote and SubIFD-hosted previews kamadak-exif can't see (e.g. Nikon's
+   `PreviewIFD`, where modern Nikon bodies stash the full-res review JPEG).
+   If `exiftool` isn't on `PATH` this layer is skipped silently and only the
+   fast-path result is used.
+
+Install `exiftool` via your package manager:
+- macOS: `brew install exiftool`
+- Linux (Debian/Ubuntu): `apt install libimage-exiftool-perl`
+- Windows: `winget install OliverBetz.ExifTool` or `choco install exiftool`
+
+Files where neither layer produces a valid preview fall back to ffmpeg. Anything
+that still can't be decoded is marked with a `<thumb>.unsupported` sentinel in
+the thumbnail directory so we don't retry it every scan. Delete those sentinels
+(and any cached black thumbnails) to force retries after a tooling upgrade.
+
 ## Environment
 There are a handful of required environment variables to have the API run.
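The README's "delete those sentinels" step can be scripted. Below is a rough sketch under stated assumptions: sentinels are ordinary files whose names end in `.unsupported` anywhere under the thumbnail directory, and `clear_unsupported_sentinels` is a hypothetical helper that does not exist in this repository. It does not attempt to find cached black thumbnails, which would still need to be removed by hand.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Hypothetical helper: recursively delete every `*.unsupported` file under
/// `thumb_root` and return how many were removed.
pub fn clear_unsupported_sentinels(thumb_root: &Path) -> io::Result<usize> {
    let mut removed = 0;
    for entry in fs::read_dir(thumb_root)? {
        let entry = entry?;
        let path = entry.path();
        if entry.file_type()?.is_dir() {
            // The thumbnail tree mirrors the source tree, so recurse.
            removed += clear_unsupported_sentinels(&path)?;
        } else if path.extension().and_then(|e| e.to_str()) == Some("unsupported") {
            fs::remove_file(&path)?;
            removed += 1;
        }
    }
    Ok(removed)
}
```

After running something like this against the thumbnail directory, the next scan would retry the affected files with whatever tooling is now installed.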
118
src/exif.rs
@@ -1,6 +1,7 @@
 use std::fs::File;
 use std::io::{BufReader, Read, Seek, SeekFrom};
 use std::path::Path;
+use std::process::Command;
 
 use anyhow::{Result, anyhow};
 use exif::{In, Reader, Tag, Value};
@@ -28,7 +29,7 @@ pub struct ExifData {
 
 /// TIFF-based RAW formats where `JPEGInterchangeFormat` offsets are
 /// absolute file offsets (the file itself is a TIFF container).
-fn is_tiff_raw(path: &Path) -> bool {
+pub fn is_tiff_raw(path: &Path) -> bool {
     matches!(
         path.extension()
             .and_then(|e| e.to_str())
@@ -40,26 +41,18 @@ fn is_tiff_raw(path: &Path) -> bool {
     )
 }
 
-/// Returns the bytes of the embedded JPEG thumbnail in a TIFF-based RAW or
-/// TIFF file. Used to thumbnail formats whose RAW pixel data can't be decoded
-/// by our normal tools (e.g. Sony ARW). Returns `None` if no preview is
-/// present, the file isn't a TIFF container, or the data doesn't look like
-/// a valid JPEG.
-pub fn extract_embedded_jpeg_preview(path: &Path) -> Option<Vec<u8>> {
-    if !is_tiff_raw(path) {
-        return None;
-    }
-
-    let file = File::open(path).ok()?;
-    let mut bufreader = BufReader::new(file);
-    let exif = Reader::new().read_from_container(&mut bufreader).ok()?;
-
+/// Read the JPEG bytes pointed to by `JPEGInterchangeFormat` /
+/// `JPEGInterchangeFormatLength` in a single IFD. Returns `None` on any
+/// failure: tags missing, length zero, file read failure, or bytes that
+/// don't start with the JPEG SOI marker (some MakerNote pointers reference
+/// TIFF-wrapped previews or other non-JPEG payloads we can't load).
+fn read_jpeg_at_ifd(exif: &exif::Exif, path: &Path, ifd: In) -> Option<Vec<u8>> {
     let offset = exif
-        .get_field(Tag::JPEGInterchangeFormat, In::THUMBNAIL)?
+        .get_field(Tag::JPEGInterchangeFormat, ifd)?
         .value
         .get_uint(0)?;
     let length = exif
-        .get_field(Tag::JPEGInterchangeFormatLength, In::THUMBNAIL)?
+        .get_field(Tag::JPEGInterchangeFormatLength, ifd)?
         .value
         .get_uint(0)?;
     if length == 0 {
@@ -71,8 +64,6 @@ pub fn extract_embedded_jpeg_preview(path: &Path) -> Option<Vec<u8>> {
     let mut buf = vec![0u8; length as usize];
     file.read_exact(&mut buf).ok()?;
 
-    // JPEG SOI marker sanity check — MakerNote offsets sometimes point at
-    // TIFF-wrapped previews or other non-JPEG data.
     if buf.len() < 2 || buf[0] != 0xFF || buf[1] != 0xD8 {
         return None;
     }
@@ -80,6 +71,95 @@ pub fn extract_embedded_jpeg_preview(path: &Path) -> Option<Vec<u8>> {
     Some(buf)
 }
 
+/// Tags exiftool exposes for embedded JPEG previews, in priority order. The
+/// largest valid JPEG returned by any of them wins. Different camera makers
+/// stash their largest preview under different names: Nikon's full-res
+/// preview lives under `PreviewImage` in the MakerNote `PreviewIFD`, Canon /
+/// Sony often expose theirs as `JpgFromRaw`, and `OtherImage` is a catch-all
+/// some sub-IFD chains use.
+const EXIFTOOL_PREVIEW_TAGS: &[&str] = &["PreviewImage", "JpgFromRaw", "OtherImage"];
+
+/// Shell out to `exiftool -b -<tag>` for one tag. Returns the response bytes
+/// only if exiftool succeeded AND the bytes start with the JPEG SOI marker
+/// (some MakerNote tags hold TIFF-wrapped previews or other non-JPEG payloads
+/// we can't load).
+fn extract_exiftool_tag(path: &Path, tag: &str) -> Option<Vec<u8>> {
+    let output = Command::new("exiftool")
+        .arg("-b")
+        .arg(format!("-{}", tag))
+        .arg(path)
+        .output()
+        .ok()?;
+
+    if !output.status.success() {
+        return None;
+    }
+    let bytes = output.stdout;
+    if bytes.len() < 2 || bytes[0] != 0xFF || bytes[1] != 0xD8 {
+        return None;
+    }
+    Some(bytes)
+}
+
+/// Try each EXIFTOOL_PREVIEW_TAGS in turn and return the largest valid JPEG.
+/// If `exiftool` isn't on PATH the very first spawn returns `None` and we
+/// silently bail — callers fall back to whatever the IFD0/IFD1 fast path
+/// found.
+fn extract_preview_via_exiftool(path: &Path) -> Option<Vec<u8>> {
+    let mut best: Option<Vec<u8>> = None;
+    for &tag in EXIFTOOL_PREVIEW_TAGS {
+        let Some(bytes) = extract_exiftool_tag(path, tag) else {
+            continue;
+        };
+        match &best {
+            None => best = Some(bytes),
+            Some(b) if b.len() < bytes.len() => best = Some(bytes),
+            _ => {}
+        }
+    }
+    best
+}
+
+/// Returns the bytes of the embedded JPEG preview in a TIFF-based RAW or
+/// TIFF file. Used to thumbnail formats whose RAW pixel data can't be decoded
+/// by our normal tools (e.g. Sony ARW), and to serve a usable full-size
+/// image for clients that can't decode the RAW container directly. Returns
+/// `None` if no preview is present, the file isn't a TIFF container, or the
+/// data doesn't look like a valid JPEG.
+///
+/// Strategy:
+/// 1. Fast path: read `JPEGInterchangeFormat` from IFD0 (PRIMARY) and IFD1
+///    (THUMBNAIL) directly via kamadak-exif. No subprocess, no external
+///    dependency.
+/// 2. Slow path: shell out to `exiftool -b -<tag>` for each of
+///    `PreviewImage` / `JpgFromRaw` / `OtherImage`. kamadak-exif can't
+///    reach SubIFDs or MakerNote sub-IFDs, but most modern Nikon bodies
+///    stash their large preview JPEG in the Nikon MakerNote's PreviewIFD;
+///    Canon / Sony often use `JpgFromRaw` in a SubIFD chain. Skipped
+///    gracefully if exiftool isn't on PATH.
+///
+/// All candidates are pooled and the largest valid JPEG wins, so a deploy
+/// without exiftool degrades to "fast-path only" behavior rather than
+/// breaking outright.
+pub fn extract_embedded_jpeg_preview(path: &Path) -> Option<Vec<u8>> {
+    if !is_tiff_raw(path) {
+        return None;
+    }
+
+    let file = File::open(path).ok()?;
+    let mut bufreader = BufReader::new(file);
+    let exif = Reader::new().read_from_container(&mut bufreader).ok()?;
+
+    let primary = read_jpeg_at_ifd(&exif, path, In::PRIMARY);
+    let thumbnail = read_jpeg_at_ifd(&exif, path, In::THUMBNAIL);
+    let exiftool = extract_preview_via_exiftool(path);
+
+    [primary, thumbnail, exiftool]
+        .into_iter()
+        .flatten()
+        .max_by_key(|v| v.len())
+}
+
 pub fn supports_exif(path: &Path) -> bool {
     if let Some(ext) = path.extension() {
         let ext_lower = ext.to_string_lossy().to_lowercase();
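The two ideas this file leans on, the JPEG start-of-image check and "pool the candidates, keep the largest", can be isolated from file I/O and tested on plain byte buffers. The sketch below is an illustration only, not the code in this PR: `starts_with_jpeg_soi` and `largest_valid_jpeg` are hypothetical names, and the SOI test here is applied once at pooling time rather than inside each extractor.

```rust
/// True when `bytes` begins with the JPEG start-of-image marker (0xFF 0xD8).
pub fn starts_with_jpeg_soi(bytes: &[u8]) -> bool {
    bytes.starts_with(&[0xFF, 0xD8])
}

/// Hypothetical helper: drop missing or non-JPEG candidates and return the
/// longest remaining buffer, or `None` if nothing qualifies.
pub fn largest_valid_jpeg<I>(candidates: I) -> Option<Vec<u8>>
where
    I: IntoIterator<Item = Option<Vec<u8>>>,
{
    candidates
        .into_iter()
        .flatten() // skip the `None` entries
        .filter(|bytes| starts_with_jpeg_soi(bytes))
        .max_by_key(|bytes| bytes.len())
}
```

One detail to keep in mind when reading the diff: `Iterator::max_by_key` returns the last element among equally long candidates, whereas the strict `<` comparison in `extract_preview_via_exiftool` keeps the first, so ties are resolved differently in the two places.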
17
src/main.rs
@@ -215,6 +215,23 @@ async fn get_image(
         }
     }
 
+    // Full-size requests for RAW formats (NEF/CR2/ARW/etc.) can't just
+    // NamedFile-stream the original bytes — browsers won't decode the
+    // RAW container, so a `<img src=...>` lands as a broken image. Serve
+    // the embedded JPEG preview instead (typically the camera's in-body
+    // review JPEG, ~1–2 MP). Falls through to NamedFile if no preview is
+    // available, which preserves the historical behavior for callers
+    // that genuinely want the original bytes.
+    if image_size == PhotoSize::Full && exif::is_tiff_raw(&path) {
+        if let Some(preview) = exif::extract_embedded_jpeg_preview(&path) {
+            span.set_status(Status::Ok);
+            return HttpResponse::Ok()
+                .content_type("image/jpeg")
+                .insert_header(("Cache-Control", "public, max-age=3600"))
+                .body(preview);
+        }
+    }
+
     if let Ok(file) = NamedFile::open(&path) {
         span.set_status(Status::Ok);
         // Enable ETag and set cache headers for full images (1 hour cache)
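The fall-through behaviour described in the handler comment can be modelled without actix so it is easy to unit test. This is a sketch under assumptions: `FullSizeBody` and `choose_full_size_body` are hypothetical names, and the two booleans stand in for `image_size == PhotoSize::Full` and `exif::is_tiff_raw(&path)`. Extraction is passed as a closure so it only runs when both conditions hold.

```rust
/// Hypothetical stand-in for what the handler ends up sending.
#[derive(Debug, PartialEq)]
pub enum FullSizeBody {
    /// Serve these bytes as `image/jpeg`.
    EmbeddedJpeg(Vec<u8>),
    /// Stream the original file unchanged.
    OriginalFile,
}

/// Hypothetical helper: prefer the embedded preview for full-size RAW
/// requests, otherwise fall through to the original file.
pub fn choose_full_size_body<F>(
    is_full_request: bool,
    is_raw: bool,
    extract_preview: F,
) -> FullSizeBody
where
    F: FnOnce() -> Option<Vec<u8>>,
{
    if is_full_request && is_raw {
        // Only pay for extraction when it can actually be used.
        if let Some(jpeg) = extract_preview() {
            return FullSizeBody::EmbeddedJpeg(jpeg);
        }
    }
    FullSizeBody::OriginalFile
}
```

A caller that genuinely needs the RAW bytes still gets them whenever no preview can be extracted, which matches the behaviour the comment describes.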