applicability of relative similarity computation

I've computed the cosine similarity between a and b (= x) and between b and c (= y). With the same embeddings I can also compute the similarity between a and c (call it z).

Now suppose I only have the similarity values x and y. How can I find the similarity between a and c without the original embeddings? If I picture this in a plane, there are infinitely many solutions. Are there any approaches that give some insight into the relation between a and c?

import torch

a = torch.Tensor([1, 2])
b = torch.Tensor([1, -1])
c = torch.Tensor([2, 3])

# Cosine similarity: dot product divided by the product of the norms
sim_ab = torch.dot(a, b) / (torch.sqrt(sum(torch.square(a)) * sum(torch.square(b))))
sim_ac = torch.dot(a, c) / (torch.sqrt(sum(torch.square(a)) * sum(torch.square(c))))
sim_bc = torch.dot(b, c) / (torch.sqrt(sum(torch.square(b)) * sum(torch.square(c))))

print("Actual similarity between b and c: ", sim_bc)

# Convert the a-b and a-c similarities to angles (radians)
x = torch.arccos(sim_ab)
y = torch.arccos(sim_ac)

# Combine the two angles and convert back to a similarity
print("Measured similarity between b and c: ", torch.cos(x - y))

 Actual similarity between b and c:  tensor(-0.1961)
 Measured similarity between b and c:  tensor(-0.1961)

Topic semantic-similarity cosine-distance word-embeddings nlp similarity

Category Data Science


It's often useful to think of simple cases: even in the 2-D (planar) case, you can't determine z. The cosine similarity of two vectors is just the cosine of the angle (at the origin) between them, so, treating $x$ and $y$ as those angles:

  • for a fixed vector $b$, if $a$ has angle $x$ to $b$, then $a$ lies on one of two rays, one on either side of $b$.
  • if another vector $c$ has angle $y$ to $b$, then it also lies on one of two rays, one on either side of $b$.

This already gives 2 different answers for the angle between $a$ and $c$: $x+y$ or $x-y$, depending on whether $a$ and $c$ lie on the "same side" of $b$ or not.
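A minimal sketch of this planar ambiguity, using plain Python rather than torch. The helper names (`cos_sim`, `unit`) and the angles 30 and 50 degrees are assumptions chosen only for illustration:

```python
import math

def cos_sim(u, v):
    """Cosine similarity of two 2-D vectors given as (x, y) tuples."""
    dot = u[0] * v[0] + u[1] * v[1]
    return dot / (math.hypot(*u) * math.hypot(*v))

def unit(angle_deg):
    """Unit vector at the given polar angle, in degrees."""
    t = math.radians(angle_deg)
    return (math.cos(t), math.sin(t))

b = unit(0)
a = unit(30)        # 30 degrees from b
c_same = unit(50)   # 50 degrees from b, on the same side as a
c_opp = unit(-50)   # 50 degrees from b, on the opposite side

# Both candidates for c have the same similarity to b ...
print(cos_sim(b, c_same), cos_sim(b, c_opp))
# ... but their similarity to a differs: cos(20 deg) versus cos(80 deg).
print(cos_sim(a, c_same), cos_sim(a, c_opp))
```

The similarities to $b$ are identical for both candidates, so they alone cannot tell the two configurations apart.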

In higher dimensions, the number of possible angles for $z$ increases, e.g. in 3-D, $a$ and $c$ each lie on a cone around $b$, with a range of possible angles between them.
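To illustrate the 3-D picture, here is a rough sketch that fixes $a$ at 30 degrees from $b$ and sweeps $c$ around a cone at 50 degrees from $b$. The angles, the sweep step and the helper `on_cone` are assumptions made up for this example:

```python
import math

def cos_sim(u, v):
    """Cosine similarity of two vectors given as tuples."""
    dot = sum(p * q for p, q in zip(u, v))
    norm_u = math.sqrt(sum(p * p for p in u))
    norm_v = math.sqrt(sum(q * q for q in v))
    return dot / (norm_u * norm_v)

def on_cone(half_angle_deg, phi_deg):
    """Unit vector at half_angle_deg from the z-axis, rotated phi_deg around it."""
    t, p = math.radians(half_angle_deg), math.radians(phi_deg)
    return (math.sin(t) * math.cos(p), math.sin(t) * math.sin(p), math.cos(t))

b = (0.0, 0.0, 1.0)
a = on_cone(30, 0)  # fixed vector, 30 degrees from b

# Sweep c around its cone; every c is 50 degrees from b.
angles_ac = []
for phi in range(0, 360, 30):
    c = on_cone(50, phi)
    angles_ac.append(math.degrees(math.acos(cos_sim(a, c))))

# Smallest and largest angle between a and c seen during the sweep
print(round(min(angles_ac), 6), round(max(angles_ac), 6))
```

Every position of $c$ in the sweep has the same similarity to $b$, yet the angle to $a$ varies continuously between the two planar extremes.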

This isn't a proof, but I think from this you can bound the angle between $a$ and $c$ between $|x-y|$ and $x+y$ (perhaps useful if one of the angles is very small, perhaps not), but you can't determine z uniquely.
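Expressed in similarities rather than angles, the same bound says that the similarity of $a$ and $c$ lies between $\cos(x+y)$ and $\cos(x-y)$, which expands to $s_{ab}s_{bc} \pm \sqrt{(1-s_{ab}^2)(1-s_{bc}^2)}$ where $s_{ab}=\cos x$ and $s_{bc}=\cos y$. A small sketch, with `similarity_bounds` as a hypothetical helper name and the vectors from the question as test values:

```python
import math

def similarity_bounds(sim_ab, sim_bc):
    """Return (lower, upper) bounds on cos-sim(a, c) given cos-sim(a, b) and cos-sim(b, c)."""
    # max(0.0, ...) guards against tiny negative values caused by rounding
    spread = math.sqrt(max(0.0, (1 - sim_ab**2) * (1 - sim_bc**2)))
    centre = sim_ab * sim_bc
    return centre - spread, centre + spread

# Exact similarities for the question's vectors a=[1,2], b=[1,-1], c=[2,3]
sim_ab = -1 / math.sqrt(10)
sim_bc = -1 / math.sqrt(26)
sim_ac = 8 / math.sqrt(65)

lower, upper = similarity_bounds(sim_ab, sim_bc)
print(lower, sim_ac, upper)
```

For these 2-D vectors the true similarity sits exactly on the upper bound; in general it can be anywhere in the interval, so the bound is only informative when the interval is narrow.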
