[R] Clustering embeddings based on our own chosen attributes
Hi, here’s an example of what I have in mind:
I have 1000 face images. I use a face recognition model to convert each face to a 128D embedding.
I have a hunch that each embedding encodes specific characteristics of a face, such as eye color, jaw type, gender, and lots of other attributes.
Now I want to be able to classify each face (each embedding) by the attributes I choose. For example, I want to find all faces that are male and have green eyes, based ONLY on the embeddings.
One way to do this is to label the faces with my attributes (gender, eye color, jaw type, …) and then train a model on the embeddings to predict these attributes. But this approach takes a lot of time, since I have to either build a labeled dataset of faces with these attributes or download one from somewhere, and then train a model on it.
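For reference, the supervised baseline I mean could look roughly like the sketch below: a small logistic-regression probe trained on labeled embeddings for one binary attribute. The function names, hyperparameters, and the random placeholder data are all made up for illustration; real embeddings and real labels would replace them.

```python
import numpy as np

def train_attribute_probe(embeddings, labels, lr=0.1, epochs=500):
    """Fit a logistic-regression probe for one binary attribute (e.g. green eyes)."""
    X = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        logits = np.clip(X @ w + b, -30, 30)   # clip to keep exp() stable
        probs = 1.0 / (1.0 + np.exp(-logits))
        error = probs - y                       # gradient of the log loss w.r.t. the logits
        w -= lr * (X.T @ error) / len(y)
        b -= lr * error.mean()
    return w, b

def predict_attribute(embeddings, w, b):
    """Return True for embeddings the probe assigns to the attribute."""
    return (np.asarray(embeddings, dtype=float) @ w + b) > 0

# Placeholder data: random vectors stand in for real 128D face embeddings,
# and the labels are synthetic (NOT real attribute annotations).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 128))
labels = embeddings[:, 0] > 0
w, b = train_attribute_probe(embeddings[:200], labels[:200])  # 200 "labeled" faces
predicted = predict_attribute(embeddings, w, b)                # applied to all faces
```

The cost I am trying to avoid is the labeling step: this needs enough labeled faces per attribute to fit the probe.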
I was wondering if there is an unsupervised or semi-supervised approach to cluster the embeddings by the attributes I choose (gender, eye color, jaw type, …): I would select only a few faces that have these attributes, and the model/method would then automatically cluster the rest of the faces accordingly.
Simply finding the nearest neighbor of an embedding isn’t enough. For example I may choose a face that has green eyes (a rare factor) and find the nearest faces of that face’s embedding, but there is no guarantee that the nearest faces all have green eyes since they may have other stronger similarities that the embedding may have encoded (for example the same beards or brows).
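To be concrete, the plain nearest-neighbor lookup I am talking about is roughly the sketch below. It assumes cosine similarity as the metric (the right metric depends on the face recognition model), and the function name and random placeholder data are only for illustration.

```python
import numpy as np

def nearest_neighbors(embeddings, query_index, k=5):
    """Return indices of the k embeddings most cosine-similar to the query face."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = normed @ normed[query_index]
    sims[query_index] = -np.inf  # exclude the query itself
    return np.argsort(-sims)[:k]

# Placeholder data: random vectors stand in for real 128D face embeddings.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 128))
neighbors = nearest_neighbors(embeddings, query_index=0, k=5)
```

The ranking uses all 128 dimensions at once, which is exactly why a rare attribute like eye color can be drowned out by stronger similarities.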
One way to account for this is to average the embeddings of a few faces that have green eyes and then find the faces nearest to that average. But then again, there is no guarantee that the embeddings are even ‘average friendly’, meaning that averaging them would result in the shared attribute becoming stronger.
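A minimal sketch of that averaging idea, again assuming cosine similarity; the function name, the example indices, and the random placeholder data are hypothetical. Whether the averaged vector really emphasizes the shared attribute is exactly the part I am unsure about.

```python
import numpy as np

def prototype_search(embeddings, example_indices, k=10):
    """Rank embeddings by cosine similarity to the mean of a few example embeddings."""
    normed = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    prototype = normed[example_indices].mean(axis=0)
    prototype /= np.linalg.norm(prototype)
    sims = normed @ prototype
    sims[example_indices] = -np.inf  # do not return the hand-picked examples
    return np.argsort(-sims)[:k]

# Placeholder data: random vectors stand in for real 128D face embeddings,
# and the indices below stand in for a few faces I picked as having green eyes.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(1000, 128))
green_eyed_examples = [3, 17, 42]
candidates = prototype_search(embeddings, green_eyed_examples, k=10)
```

If the attribute happens to correspond to a consistent direction in the embedding space, the average should point along it; if not, the result is just faces that look like a blend of the examples.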
So is this even possible, and if so, what is the fastest and highest-quality approach? Thanks.
EDIT: Since I may not have explained my question in full detail, if there are any questions about this post, I’d be glad to explain further.