[Project] pgANN: Fast Approximate Nearest Neighbor (ANN) searches with a PostgreSQL database
We are open-sourcing pgANN, an approximate nearest neighbor (ANN) approach with a PostgreSQL backend. The key differences between pgANN and the alternatives (FAISS, Annoy, NearPy, etc.) are:
- it enables “online” learning, i.e. it doesn’t require retraining on every CRUD operation, and
- it works with extremely large datasets, since the data is not held in RAM as it is with the others
We use it internally to QA images and find that it consistently delivers sub-second query performance with a few million rows of vectors on a 32 GB, 8 vCPU Ubuntu box. It can reasonably be expected to scale up with standard PostgreSQL scaling techniques. We invite the community to give it a try and share feedback.
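
For readers wondering what "vectors in PostgreSQL with online updates" can look like in practice, here is a minimal sketch. It is an illustration under assumptions, not necessarily how pgANN is implemented: it assumes PostgreSQL's `cube` extension (whose `<->` operator is Euclidean distance and can be served by a GiST index), a hypothetical table `items` with a hypothetical column `embedding`, and `%s` placeholders in the style of a driver such as psycopg2. The helpers only build SQL and parameters, so nothing here needs a live database.

```python
# Illustrative sketch only: the table and column names are hypothetical, and
# this is not necessarily how pgANN itself stores or queries vectors.

# One-time setup: enable the cube extension and index the vectors with GiST.
SETUP_SQL = """
CREATE EXTENSION IF NOT EXISTS cube;
CREATE TABLE IF NOT EXISTS items (
    id        bigserial PRIMARY KEY,
    embedding cube NOT NULL
);
CREATE INDEX IF NOT EXISTS items_embedding_gist
    ON items USING gist (embedding);
"""


def to_cube_literal(vector):
    """Format a sequence of numbers as the text form of a point cube."""
    return "(" + ", ".join(repr(float(x)) for x in vector) + ")"


def insert_statement(vector):
    """Return (sql, params) for adding one vector.

    This is an ordinary INSERT: the index is maintained by PostgreSQL,
    so there is no separate retraining or rebuild step.
    """
    sql = "INSERT INTO items (embedding) VALUES (%s::cube)"
    return sql, (to_cube_literal(vector),)


def knn_statement(query_vector, k=10):
    """Return (sql, params) for the k nearest rows by Euclidean distance."""
    sql = (
        "SELECT id, embedding <-> %s::cube AS distance "
        "FROM items ORDER BY embedding <-> %s::cube LIMIT %s"
    )
    literal = to_cube_literal(query_vector)
    return sql, (literal, literal, int(k))
```

With a driver connection, each pair would be passed along as `cursor.execute(sql, params)`. Note that the `cube` type has a default limit of 100 dimensions, so larger embeddings would need dimensionality reduction or a different storage approach.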