> Thy are binary vectors with 768 dimensions, which takes up 96 bytes (768 / 8 =...

alexgarcia-xyz · on May 3, 2024

Author here - ya "binary vectors" means quantizing to one bit per dimension. Normally it would be 4 * dimensions bytes of space per vector (where 4=sizeof(float)). Some embedding models, like nomic v1.5[0] and mixedbread's new model[1] are specifically trained to retain quality after binary quantization. Not all models do tho, so results may vary. I think in general for really large vectors, like OpenAI's large embeddings model with 3072 dimensions, it kindof works, even if they didn't specifically train for it.

[0] https://twitter.com/nomic_ai/status/1769837800793243687

[1] https://www.mixedbread.ai/blog/binary-mrl

barakm · on May 3, 2024

Thank you! As you keep posting your progress, and I hope you do, adding these references would probably help warding off crusty fuddy-duddys like me (or at least give them more to research either way) ;)

lsb · on May 3, 2024

Binary, quantize each dimension to +1 or -1

You can try out binary vectors, in comparison to quantize every pair of vectors to one of four values, and a lot more, by using a FAISS index on your data, and using Product Quantization (like PQ768x1 for binary features in this case) https://github.com/facebookresearch/faiss/wiki/The-index-fac...

barakm · on May 3, 2024

Appreciate the link; but would still like to know how well it works.

If you have a link for that, I’d be much obliged

lsb · on May 3, 2024

It depends on your data and your embedding model. For example, I was able to quantize embeddings of English Wikipedia from 384-dimensions down to 48 7-bit dimensions, and the search works great: https://www.leebutterman.com/2023/06/01/offline-realtime-emb...

queuebert · on May 3, 2024

BTW, the "curse of dimensionality" technically refers to the relative sparsity of high-dimensional space and the need for geometrically increasing data to fill it. It has nothing to do with storage. And typically in vector databases the data are compressed/projected into a lower dimensionality space before storage, which actually improves the situation.