libraries: validate excluded_dirs entries on write

Reject the silent-footgun shapes that PathExcluder would store but
never match. The watcher would still walk past every photo as if the
exclude wasn't there, and the operator would have no signal that
their entry is dead. Caught at PATCH time with a descriptive 422.

Rules:
- Backslash anywhere → "use forward slashes" (catches \photos,
  photos\2024, \\server\share — Windows-typed entries land in the
  component-pattern bucket and never fire).
- Drive-letter prefix (Z:, Z:/...) → "relative to library root" —
  excludes are root-relative, not absolute system paths.
- Multi-segment name with no leading slash (photos/2024) →
  "did you mean /photos/2024?" — the common "I forgot the slash"
  typo, today silently stored as a component pattern that never hits.
- `..` segments in a path entry → "doesn't normalise". base.join()
  doesn't canonicalise, so the resulting prefix never matches.
- Bare "/" → "almost certainly a typo" for the library root.

Trailing slashes on path entries are stripped silently. Eight new
tests cover each rejection plus the trailing-slash normalisation
and the all-or-nothing failure mode of normalize_excluded_dirs_input
(one bad entry aborts the whole patch rather than silently applying
N-1 of N changes).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Cameron Cordes
2026-05-13 09:02:29 -04:00
parent ce9fa94cb4
commit 439532377d

View File

@@ -97,29 +97,118 @@ pub fn parse_excluded_dirs_column(raw: Option<&str>) -> Vec<String> {
}
}
/// Canonicalise an excluded_dirs string for storage: parse → trim →
/// dedupe (preserving insertion order) → comma-join with no inner
/// whitespace. Empty / whitespace-only input → `None` (writes NULL).
/// Validate a single excluded_dirs entry, normalising trivial cosmetic
/// differences and rejecting forms that `PathExcluder` would silently
/// drop. Returns the entry to store, or an error message describing
/// what's wrong with it.
///
/// Rules:
/// - Backslashes are rejected — PathExcluder strips only a leading `/`;
/// a Windows-typed `\photos` or `photos\2024` lands in the
/// component-pattern bucket and never matches anything. Suggest the
/// forward-slash form.
/// - A Windows drive letter prefix (`Z:` etc.) is rejected — excluded
/// entries are *relative to the library root*, not absolute system
/// paths.
/// - A no-leading-slash entry containing `/` is rejected — the
/// component-pattern path matches a single segment only; the user
/// almost certainly meant the leading-slash form.
/// - A `..` segment in a path entry is rejected — `base.join("../x")`
/// doesn't canonicalise, so the resulting prefix never matches and
/// the exclude silently fails.
/// - Trailing slashes on path entries are stripped silently
/// (`/photos/` → `/photos`) — purely cosmetic.
pub fn validate_excluded_dirs_entry(entry: &str) -> Result<String, String> {
let trimmed = entry.trim();
if trimmed.is_empty() {
return Err("empty entry".to_string());
}
if trimmed.contains('\\') {
return Err(format!(
"'{}': use forward slashes — backslash paths never match on the watcher's component-by-component compare",
trimmed
));
}
// Windows drive letter prefix like `Z:` or `Z:/something`. A
// length-2 ASCII-alpha + colon is the canonical form; we don't
// bother with longer multi-letter Windows drive-equivalents
// (`\\?\Volume{…}`) since the backslash check already catches them.
let bytes = trimmed.as_bytes();
if bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':' {
return Err(format!(
"'{}': excluded entries are relative to the library root, not absolute system paths — drop the drive letter",
trimmed
));
}
if let Some(rel) = trimmed.strip_prefix('/') {
// Path form. Reject `..` traversal — `base.join(\"../x\")` doesn't
// canonicalise, so `path.starts_with(...)` never matches.
if rel
.split('/')
.any(|seg| seg == "..")
{
return Err(format!(
"'{}': '..' segments don't normalise — the prefix-match never fires",
trimmed
));
}
// Strip a trailing slash if any (`/photos/` → `/photos`). Purely
// cosmetic; PathBuf::starts_with treats both forms identically.
let stripped = if rel.ends_with('/') {
format!("/{}", rel.trim_end_matches('/'))
} else {
trimmed.to_string()
};
// After stripping, an empty rel ("/" alone) excludes the root —
// certainly a typo.
if stripped == "/" {
return Err("'/': excluding the library root is almost certainly a typo".to_string());
}
Ok(stripped)
} else {
// Component-pattern form: must be a single segment. A `/`
// anywhere here is the common "I forgot the leading slash" typo
// — reject so the user fixes it instead of staring at an
// exclude that does nothing.
if trimmed.contains('/') {
return Err(format!(
"'{}': multi-segment names only match with a leading slash — try '/{}'",
trimmed, trimmed
));
}
Ok(trimmed.to_string())
}
}
/// Canonicalise an excluded_dirs string for storage: validate each
/// entry, then parse → trim → dedupe (preserving insertion order) →
/// comma-join with no inner whitespace. Empty / whitespace-only input
/// → `Ok(None)` (writes NULL). Any entry that fails validation aborts
/// the whole patch with a descriptive error so the operator can fix
/// the typo before retrying.
///
/// Used by `PATCH /libraries/{id}` so two users typing the same entries
/// in different orders / casings / whitespace land on the same stored
/// form, and a typo'd duplicate (`@eaDir, @eaDir`) collapses on save.
/// Round-trip stable: writing the output back through this function
/// yields the same string.
pub fn normalize_excluded_dirs_input(raw: &str) -> Option<String> {
pub fn normalize_excluded_dirs_input(raw: &str) -> Result<Option<String>, String> {
let parsed = parse_excluded_dirs_column(Some(raw));
if parsed.is_empty() {
return None;
return Ok(None);
}
let mut seen = std::collections::HashSet::new();
let deduped: Vec<String> = parsed
.into_iter()
.filter(|s| seen.insert(s.clone()))
.collect();
let mut deduped: Vec<String> = Vec::with_capacity(parsed.len());
for entry in parsed {
let validated = validate_excluded_dirs_entry(&entry)?;
if seen.insert(validated.clone()) {
deduped.push(validated);
}
}
if deduped.is_empty() {
None
Ok(None)
} else {
Some(deduped.join(","))
Ok(Some(deduped.join(",")))
}
}
@@ -454,11 +543,16 @@ pub async fn patch_library(
}
}
if let Some(raw) = body.excluded_dirs.as_deref() {
// Canonicalise on write — trim, dedupe, drop empties — so the DB
// stores a round-trip-stable form regardless of how messy the
// user typed it. Empty / whitespace-only → NULL (matches a
// never-set library, parse_excluded_dirs_column returns []).
let normalised = normalize_excluded_dirs_input(raw);
// Canonicalise on write — trim, dedupe, validate, drop empties —
// so the DB stores a round-trip-stable form regardless of how
// messy the user typed it. Empty / whitespace-only → NULL
// (matches a never-set library). Validation failures (Windows
// backslash paths, drive letters, `..` traversal, etc.) bounce
// back as 422 so the operator can fix the typo.
let normalised = match normalize_excluded_dirs_input(raw) {
Ok(v) => v,
Err(msg) => return HttpResponse::UnprocessableEntity().body(msg),
};
let stored: Option<&str> = normalised.as_deref();
match diesel::update(libraries::table.filter(libraries::id.eq(lib_id)))
.set(libraries::excluded_dirs.eq(stored))
@@ -693,10 +787,10 @@ mod tests {
#[test]
fn normalize_excluded_dirs_input_handles_empty_and_whitespace() {
assert_eq!(normalize_excluded_dirs_input(""), None);
assert_eq!(normalize_excluded_dirs_input(" "), None);
assert_eq!(normalize_excluded_dirs_input(",,,"), None);
assert_eq!(normalize_excluded_dirs_input(" , , "), None);
assert_eq!(normalize_excluded_dirs_input(""), Ok(None));
assert_eq!(normalize_excluded_dirs_input(" "), Ok(None));
assert_eq!(normalize_excluded_dirs_input(",,,"), Ok(None));
assert_eq!(normalize_excluded_dirs_input(" , , "), Ok(None));
}
#[test]
@@ -705,7 +799,7 @@ mod tests {
// spaces. Mirrors how parse_excluded_dirs_column reads it back.
assert_eq!(
normalize_excluded_dirs_input(" @eaDir , /private , .thumbnails "),
Some("@eaDir,/private,.thumbnails".to_string())
Ok(Some("@eaDir,/private,.thumbnails".to_string()))
);
}
@@ -716,7 +810,7 @@ mod tests {
// intent on round-trip).
assert_eq!(
normalize_excluded_dirs_input("@eaDir, /private, @eaDir, /private"),
Some("@eaDir,/private".to_string())
Ok(Some("@eaDir,/private".to_string()))
);
// Whitespace-distinct entries collapse to the same canonical
// form. Case is preserved — `Foo` and `foo` are different keys
@@ -724,7 +818,7 @@ mod tests {
// make that call here).
assert_eq!(
normalize_excluded_dirs_input(" Foo,foo, Foo "),
Some("Foo,foo".to_string())
Ok(Some("Foo,foo".to_string()))
);
}
@@ -734,8 +828,12 @@ mod tests {
// same string. PATCH-clearing edits round-trip cleanly through
// parse_excluded_dirs_column too.
let raw = " /a/b ,, /a/b , c ";
let once = normalize_excluded_dirs_input(raw).expect("not empty");
let twice = normalize_excluded_dirs_input(&once).expect("not empty");
let once = normalize_excluded_dirs_input(raw)
.expect("validation passes")
.expect("not empty");
let twice = normalize_excluded_dirs_input(&once)
.expect("validation passes")
.expect("not empty");
assert_eq!(once, twice);
// Parsing the stored form back gives the deduped Vec.
assert_eq!(
@@ -744,6 +842,102 @@ mod tests {
);
}
#[test]
fn validate_rejects_backslash_paths() {
// Windows-typed entries land in the component-pattern bucket
// and never match — reject so the user gets feedback instead
// of a silent no-op.
assert!(validate_excluded_dirs_entry(r"\photos").is_err());
assert!(validate_excluded_dirs_entry(r"photos\2024").is_err());
assert!(validate_excluded_dirs_entry(r"\\server\share").is_err());
// The error message names the entry and points at the fix.
let err = validate_excluded_dirs_entry(r"\photos").unwrap_err();
assert!(err.contains("forward slashes"), "{}", err);
}
#[test]
fn validate_rejects_windows_drive_letters() {
assert!(validate_excluded_dirs_entry("Z:/photos").is_err());
assert!(validate_excluded_dirs_entry("z:photos").is_err());
// Single-letter alpha + colon is the canonical drive prefix;
// the message should steer toward the relative form.
let err = validate_excluded_dirs_entry("Z:/foo").unwrap_err();
assert!(err.contains("relative to the library root"), "{}", err);
}
#[test]
fn validate_rejects_multi_segment_name_without_leading_slash() {
// The common "I forgot the slash" typo. Today this would store
// a never-matching component pattern; we catch it.
let err = validate_excluded_dirs_entry("photos/2024").unwrap_err();
assert!(err.contains("multi-segment"), "{}", err);
// And the suggestion shows the corrected form.
assert!(err.contains("/photos/2024"), "{}", err);
}
#[test]
fn validate_rejects_parent_dir_traversal_in_path_entries() {
// base.join("../sensitive") doesn't canonicalise, so the
// resulting prefix never starts_with anything the walker sees.
assert!(validate_excluded_dirs_entry("/../secret").is_err());
assert!(validate_excluded_dirs_entry("/photos/../keys").is_err());
// Same string as a non-leading-slash component is fine — it
// just never matches (you'd literally need a directory named
// `..` which is impossible on every filesystem we care about),
// but the validator accepts it because the failure mode isn't
// a silent footgun in that direction.
assert!(validate_excluded_dirs_entry("..").is_ok());
}
#[test]
fn validate_strips_trailing_slash_on_path_entries() {
assert_eq!(
validate_excluded_dirs_entry("/photos/").unwrap(),
"/photos"
);
assert_eq!(
validate_excluded_dirs_entry("/photos//").unwrap(),
"/photos"
);
// Bare "/" is rejected — almost certainly a typo for the
// library root.
assert!(validate_excluded_dirs_entry("/").is_err());
assert!(validate_excluded_dirs_entry("///").is_err());
}
#[test]
fn validate_passes_valid_entries() {
for entry in &[
"/photos",
"/photos/2024",
"/media/raw/private",
"@eaDir",
".thumbnails",
".DS_Store",
"node_modules",
] {
assert!(
validate_excluded_dirs_entry(entry).is_ok(),
"expected {} to pass",
entry
);
}
}
#[test]
fn normalize_aborts_on_invalid_entry() {
// One bad entry kills the whole patch — better to surface the
// problem than to silently apply N-1 of N changes.
let err = normalize_excluded_dirs_input("/photos, photos/2024").unwrap_err();
assert!(err.contains("photos/2024"), "{}", err);
// A valid mix succeeds — the bad-entry test isn't accidentally
// matching the good prefix.
assert_eq!(
normalize_excluded_dirs_input("/photos, @eaDir, /private/"),
Ok(Some("/photos,@eaDir,/private".to_string()))
);
}
fn probe_lib(id: i32, root: String) -> Library {
Library {
id,