test(repair): page-align corruption offset in preflight regression test
Address Copilot review on #1403: the test sought unconditionally to offset 40960 with only `pre_size > 16384` as a guard. If pre_size fell between 16384 and 40960 + 16384 = 57344 (e.g., on a chromadb version that allocated fewer pages on init, or after a future schema change), the 16 KB write would run past EOF. For a file of 40960 bytes or fewer, the seek would extend the file with zero-padding and leave the original pages intact, so quick_check would still pass on the untouched real data and the test would silently fail to detect a preflight-ordering regression.

Compute the offset from pre_size instead, page-aligned, with explicit asserts that the file is large enough to mangle 4 pages without touching the header pages or extending past EOF.
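The failure mode can be reproduced without SQLite. The following is a minimal sketch that uses a scratch file and illustrative sizes rather than the real chroma.sqlite3; the helper name `write_past_eof_demo` is hypothetical and not part of the test suite. It shows that a write at an offset past EOF in r+b mode grows the file and leaves every original byte in place.

```python
import os
import tempfile


def write_past_eof_demo(original_size: int, offset: int, garbage_len: int):
    """Write garbage at `offset` in a scratch file of `original_size` bytes.

    Returns (new_size, original_bytes_intact). Hypothetical helper for
    illustration only.
    """
    original = b"\x01" * original_size
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(original)
        with open(path, "r+b") as f:
            f.seek(offset)  # seeking past EOF is allowed and raises nothing
            f.write(b"\xde\xad\xbe\xef" * (garbage_len // 4))
        with open(path, "rb") as f:
            data = f.read()
        # Compare the leading bytes with what was there before the write.
        return len(data), data[:original_size] == original
    finally:
        os.remove(path)


# A 20 KB file passes the old `pre_size > 16384` guard but ends before 40960.
new_size, intact = write_past_eof_demo(20480, 40960, 16384)
print(new_size, intact)
```

In this sketch the file grows to offset + garbage length while the first 20480 bytes are unchanged, which is the situation the old guard could not rule out.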
+23 -3
@@ -1186,11 +1186,31 @@ def test_rebuild_index_runs_sqlite_preflight_before_chromadb_open(tmp_path, caps

     sqlite_path = palace / "chroma.sqlite3"
     pre_size = sqlite_path.stat().st_size
-    assert pre_size > 16384, "need a multi-page sqlite db to mangle"
+
+    # Compute a page-aligned corruption offset that's always inside the
+    # existing file. SQLite uses 4 KB pages by default; we mangle 4 pages
+    # somewhere in the middle, skipping at least the first 2 pages
+    # (header + root) so the file still opens. Without clamping to the
+    # actual file size, a seek past EOF in r+b mode would silently
+    # extend the file with zero-padding and leave the original pages
+    # intact; quick_check would still pass, and the regression guard
+    # would skip the bug.
+    PAGE = 4096
+    CORRUPT_BYTES = 16384  # 4 pages
+    HEADER_GUARD = PAGE * 2  # leave header + root pages intact
+    assert (
+        pre_size >= HEADER_GUARD + CORRUPT_BYTES
+    ), f"sqlite db too small to mangle without truncating: {pre_size} bytes"
+    # Round (pre_size - CORRUPT_BYTES) down to a page boundary so we
+    # mangle whole pages. Cap at offset 40960 (page 10) for stable
+    # diagnostics across SQLite versions that may grow the file.
+    max_offset = (pre_size - CORRUPT_BYTES) & ~(PAGE - 1)
+    corrupt_offset = min(40960, max_offset)
+    assert corrupt_offset >= HEADER_GUARD, f"corruption offset {corrupt_offset} too close to header"

     with open(sqlite_path, "r+b") as f:
-        f.seek(40960)  # page 10
-        f.write(b"\xde\xad\xbe\xef" * 4096)  # 16 KB of garbage
+        f.seek(corrupt_offset)
+        f.write(b"\xde\xad\xbe\xef" * (CORRUPT_BYTES // 4))

     # No chromadb mocks: rebuild_index must reach sqlite_integrity_errors
     # before any code path that opens a chromadb client. If the preflight
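The offset arithmetic in this hunk can be checked in isolation. The sketch below pulls it into a standalone function so a few file sizes can be worked through by hand; the function name `pick_corrupt_offset` and the constant `PREFERRED_OFFSET` are hypothetical, and it raises `ValueError` where the test uses `assert`.

```python
PAGE = 4096               # SQLite's default page size, as assumed by the test
CORRUPT_BYTES = 4 * PAGE  # 16384 bytes, i.e. 4 pages of garbage
HEADER_GUARD = 2 * PAGE   # leave the first 2 pages intact
PREFERRED_OFFSET = 40960  # the offset the old test hard-coded


def pick_corrupt_offset(pre_size: int) -> int:
    """Return a page-aligned offset where CORRUPT_BYTES fit before EOF.

    Hypothetical helper mirroring the arithmetic in the diff.
    """
    if pre_size < HEADER_GUARD + CORRUPT_BYTES:
        raise ValueError(f"file too small to mangle: {pre_size} bytes")
    # Clearing the low 12 bits rounds down to a multiple of 4096.
    max_offset = (pre_size - CORRUPT_BYTES) & ~(PAGE - 1)
    return min(PREFERRED_OFFSET, max_offset)


for size in (24576, 50000, 57344, 1_000_000):
    offset = pick_corrupt_offset(size)
    # The write must start at or after the guard and end at or before EOF.
    print(size, offset, HEADER_GUARD <= offset, offset + CORRUPT_BYTES <= size)
```

For the smallest accepted size, 24576 bytes, the offset is 8192, which equals `HEADER_GUARD`, so the final assert in the test cannot fail once the size assert has passed. For any file of 57344 bytes or more the offset stays at 40960, matching the previous behaviour.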