Graph embeddings of Wikidata items
I'm trying to use the pre-trained PyTorch-BigGraph embeddings of Wikidata items for disambiguation. The problem is that the results I get from dot-product (or cosine) similarity are not great. For example, the Python programming language comes out more similar to the snake of the same name than to Django (cosine 0.64 vs. 0.41 in the table below). Does anybody know of a Wikidata embedding that gives better similarities? The only alternative I've found is the Webmembedder embeddings, but they are incomplete.
Wikidata item 1 | Wikidata item 2 | Dot product | Cosine similarity |
---|---|---|---|
Q28865 (Python language) | Q271218 (Python snake) | 17.625 | 0.64013671875 |
Q28865 (Python language) | Q10811 (Reptiles) | 8.21875 | 0.300048828125 |
Q28865 (Python language) | Q2407 (C++) | 25.296875 | 0.919921875 |
Q28865 (Python language) | Q842014 (Django Python) | 11.34375 | 0.409912109375 |
Q271218 (Python snake) | Q10811 (Reptiles) | 11.25 | 0.409912109375 |
Q271218 (Python snake) | Q2407 (C++) | 12.5390625 | 0.4599609375 |
Q271218 (Python snake) | Q842014 (Django Python) | 6.05859375 | 0.219970703125 |
Q10811 (Reptiles) | Q2407 (C++) | 4.76171875 | 0.1700439453125 |
Q10811 (Reptiles) | Q842014 (Django Python) | -0.60009765625 | -0.0200042724609375 |
Q2407 (C++) | Q842014 (Django Python) | 11.53125 | 0.419921875 |
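
For reference, here is a minimal sketch of how the two scores in the table can be computed with NumPy. The vectors are made-up toy values, not the real BigGraph embeddings, and the `embeddings` dictionary and the `dot_and_cosine` helper are hypothetical names used only for illustration. It assumes the embedding for each item has already been loaded as a 1-D array.

```python
import numpy as np


def dot_and_cosine(a, b):
    """Return (dot product, cosine similarity) of two 1-D vectors."""
    # Cast to float32 so half-precision inputs do not lose accuracy
    # in the norm computation.
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    dot = float(np.dot(a, b))
    denom = float(np.linalg.norm(a) * np.linalg.norm(b))
    # Guard against zero vectors, for which cosine is undefined.
    cosine = dot / denom if denom > 0.0 else 0.0
    return dot, cosine


# Toy vectors keyed by Wikidata ID (assumed values, for illustration only).
embeddings = {
    "Q28865": np.array([1.0, 2.0, 0.0]),
    "Q2407": np.array([2.0, 4.0, 0.0]),
    "Q10811": np.array([0.0, 0.0, 3.0]),
}

for left, right in [("Q28865", "Q2407"), ("Q28865", "Q10811")]:
    dot, cosine = dot_and_cosine(embeddings[left], embeddings[right])
    print(f"{left} | {right} | {dot:.4f} | {cosine:.4f}")
```

Note that when all vectors have roughly the same norm, the dot product and cosine similarity rank pairs in the same order, so switching between the two would not change which candidate wins.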
Topic knowledge-graph embeddings
Category Data Science