fix: purge stale drawers before re-mine to avoid hnswlib segfault (#521)

Delete existing drawers for a file before re-inserting fresh chunks.
Converts re-mines from upsert (hnswlib updatePoint path, thread-unsafe
on macOS ARM + chromadb 0.6.3) into delete+insert (safe addPoint path).
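
The delete+insert pattern described above can be sketched in isolation as
follows. This is a minimal illustration, not the project's code:
FakeCollection is a hypothetical in-memory stand-in for the real collection,
and remine() and its ID scheme are assumptions made for the example. Only the
delete(where={"source_file": ...}) call mirrors this commit.

```python
from __future__ import annotations

from typing import Any


class FakeCollection:
    """Hypothetical in-memory stand-in for a vector-store collection."""

    def __init__(self) -> None:
        self.rows: dict[str, dict[str, Any]] = {}

    def delete(self, where: dict[str, Any]) -> None:
        # Drop every row whose metadata matches all key/value pairs in `where`.
        self.rows = {
            row_id: row
            for row_id, row in self.rows.items()
            if not all(row["metadata"].get(k) == v for k, v in where.items())
        }

    def add(self, ids: list[str], documents: list[str],
            metadatas: list[dict[str, Any]]) -> None:
        # Insert-only: an existing ID is an error, so no update path is taken.
        for row_id, doc, meta in zip(ids, documents, metadatas):
            if row_id in self.rows:
                raise ValueError(f"duplicate id: {row_id}")
            self.rows[row_id] = {"document": doc, "metadata": meta}


def remine(collection: FakeCollection, source_file: str,
           chunks: list[str]) -> int:
    """Replace all rows for `source_file` with fresh chunks (assumed helper)."""
    # Step 1: purge everything previously stored for this file.
    collection.delete(where={"source_file": source_file})
    if not chunks:
        return 0
    # Step 2: insert the fresh chunks under an assumed "<file>#<index>" ID scheme.
    ids = [f"{source_file}#{i}" for i in range(len(chunks))]
    metadatas = [
        {"source_file": source_file, "chunk_index": i}
        for i in range(len(chunks))
    ]
    collection.add(ids=ids, documents=chunks, metadatas=metadatas)
    return len(chunks)
```

In this sketch, re-mining a file that shrank from three chunks to one leaves
only the single fresh row for that file, because the purge runs first and
every later write is a plain insert.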

Credit: @StefanKremen (#523)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
MSL
2026-04-10 09:13:07 -07:00
parent 60bea83e76
commit a868e16eaa
+10
@@ -436,6 +436,16 @@ def process_file(
        print(f" [DRY RUN] {filepath.name} → room:{room} ({len(chunks)} drawers)")
        return len(chunks), room

    # Purge stale drawers for this file before re-inserting the fresh chunks.
    # Converts modified-file re-mines from upsert-over-existing-IDs (which hits
    # hnswlib's thread-unsafe updatePoint path and can segfault on macOS ARM
    # with chromadb 0.6.3) into a clean delete+insert, bypassing the update
    # path entirely.
    try:
        collection.delete(where={"source_file": source_file})
    except Exception:
        pass

    drawers_added = 0
    for chunk in chunks:
        added = add_drawer(