fix: purge stale drawers before re-mine to avoid hnswlib segfault (#521)
Delete existing drawers for a file before re-inserting fresh chunks.
Converts re-mines from upsert (hnswlib updatePoint path, thread-unsafe
on macOS ARM + chromadb 0.6.3) into delete+insert (safe addPoint path).

Credit: @StefanKremen (#523)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@@ -436,6 +436,16 @@ def process_file(
         print(f" [DRY RUN] {filepath.name} → room:{room} ({len(chunks)} drawers)")
         return len(chunks), room
 
+    # Purge stale drawers for this file before re-inserting the fresh chunks.
+    # Converts modified-file re-mines from upsert-over-existing-IDs (which hits
+    # hnswlib's thread-unsafe updatePoint path and can segfault on macOS ARM
+    # with chromadb 0.6.3) into a clean delete+insert, bypassing the update
+    # path entirely.
+    try:
+        collection.delete(where={"source_file": source_file})
+    except Exception:
+        pass
+
     drawers_added = 0
     for chunk in chunks:
         added = add_drawer(
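
The delete+insert pattern in the hunk above can be sketched in isolation. This is a minimal illustration, not the project's code: `FakeCollection` is a hypothetical in-memory stand-in that only mimics the `delete(where=...)` call shape used in the diff, and `remine`, the `notes.md` filename, and the `source_file#index` ID scheme are assumptions made for the example. It also shows a side benefit of purging first: when a file shrinks, no drawer from the old version is left behind.

```python
from dataclasses import dataclass, field


@dataclass
class FakeCollection:
    """Hypothetical in-memory stand-in for a vector-store collection."""

    rows: dict = field(default_factory=dict)  # id -> (document, metadata)

    def delete(self, where):
        # Remove every row whose metadata matches all key/value pairs in `where`.
        doomed = [
            row_id
            for row_id, (_, meta) in self.rows.items()
            if all(meta.get(k) == v for k, v in where.items())
        ]
        for row_id in doomed:
            del self.rows[row_id]

    def add(self, ids, documents, metadatas):
        # Insert-only: refuse to overwrite, so no update path is ever taken.
        for row_id, doc, meta in zip(ids, documents, metadatas):
            if row_id in self.rows:
                raise ValueError(f"duplicate id: {row_id}")
            self.rows[row_id] = (doc, meta)


def remine(collection, source_file, chunks):
    """Purge every drawer for source_file, then insert the fresh chunks."""
    collection.delete(where={"source_file": source_file})
    # Assumed ID scheme for this sketch only: "<source_file>#<chunk index>".
    ids = [f"{source_file}#{i}" for i in range(len(chunks))]
    metadatas = [{"source_file": source_file} for _ in chunks]
    collection.add(ids=ids, documents=list(chunks), metadatas=metadatas)
    return len(chunks)


collection = FakeCollection()
remine(collection, "notes.md", ["a", "b", "c"])  # first mine: three drawers
remine(collection, "notes.md", ["a2", "b2"])     # file shrank: two drawers
print(sorted(collection.rows))  # ['notes.md#0', 'notes.md#1']
```

Because the second `remine` deletes before adding, the insert-only `add` never sees an existing ID, and the third drawer from the first run does not survive.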