What type of indexing are you using for the NIP-50 search? Is it similar to SQLite's FTS5?
No index is used at the moment. The search relies on DuckDB's pattern matching with ILIKE, so there is no index to build or maintain and the results are always fresh. DuckDB's columnar storage makes this approach performant. That said, performance will degrade gracefully as the dataset grows, since every query has to scan the data. In my testing, it handled a fairly large dataset (over 100k) and still maintained good performance.
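For readers who want to picture the index-free approach, here is a minimal sketch, not the project's actual code. The `events` table, its `id` and `content` columns, and both function names are hypothetical. It builds a parameterized ILIKE query and escapes LIKE wildcards so the search term is matched literally.

```python
def build_ilike_query(term: str, table: str = "events", column: str = "content"):
    """Return (sql, params) for a case-insensitive substring search.

    The table and column names are hypothetical placeholders.
    """
    # Escape the escape character first, then the LIKE wildcards,
    # so that user input such as "50%" is matched literally.
    escaped = term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    sql = f"SELECT id, {column} FROM {table} WHERE {column} ILIKE ? ESCAPE '\\'"
    return sql, [f"%{escaped}%"]


def run_demo():
    """Run the query against an in-memory DuckDB database."""
    import duckdb  # third-party package: pip install duckdb

    con = duckdb.connect()  # in-memory database, nothing is persisted
    con.execute("CREATE TABLE events (id VARCHAR, content VARCHAR)")
    con.executemany(
        "INSERT INTO events VALUES (?, ?)",
        [["a", "Hello Nostr"], ["b", "100% unrelated"]],
    )
    sql, params = build_ilike_query("nostr")
    # No index is involved: each call scans the column as it currently is.
    return con.execute(sql, params).fetchall()
```

Because nothing is precomputed, newly inserted rows are visible to the very next query, which is the "always fresh" property described above.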
I am considering introducing BM25 full-text search. However, that approach relies on an index that would need to be rebuilt whenever the data changes in order to stay fresh. My initial idea is to make it optional through configuration and integrate it into the current search function: if the BM25 index exists, it is used; otherwise, the search falls back to pattern matching.
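A rough sketch of how that fallback could look, assuming the index is created with DuckDB's `fts` extension (for example via `PRAGMA create_fts_index('events', 'id', 'content')`), which places its `match_bm25` macro in a schema named `fts_main_<table>`. The `events` table, its columns, and all function names here are hypothetical, and this is only one possible shape for the idea, not a settled design.

```python
def bm25_index_exists(con, table: str = "events") -> bool:
    """Check for the schema the fts extension creates for `table`.

    `con` is assumed to be an open duckdb connection.
    """
    row = con.execute(
        "SELECT count(*) FROM information_schema.schemata WHERE schema_name = ?",
        [f"fts_main_{table}"],
    ).fetchone()
    return row[0] > 0


def build_search_query(term: str, use_bm25: bool, table: str = "events"):
    """Return (sql, params), preferring BM25 ranking when an index is available."""
    if use_bm25:
        # match_bm25 returns NULL for rows that do not match the query.
        sql = (
            "SELECT id, content, score FROM ("
            f"SELECT *, fts_main_{table}.match_bm25(id, ?) AS score FROM {table}"
            ") AS scored WHERE score IS NOT NULL ORDER BY score DESC"
        )
        return sql, [term]
    # Fallback: plain pattern matching (wildcards are not escaped in this sketch).
    sql = f"SELECT id, content FROM {table} WHERE content ILIKE ?"
    return sql, [f"%{term}%"]


def search(con, term: str, table: str = "events"):
    """Use the BM25 index if it exists, otherwise fall back to ILIKE."""
    sql, params = build_search_query(term, bm25_index_exists(con, table), table)
    return con.execute(sql, params).fetchall()
```

One consequence of this design is that the two paths return results in different orders: the BM25 path is ranked by score, while the ILIKE path is unranked, so callers may want to apply their own ordering to the fallback.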