Intuitively, if you normalize the vectors before using them, or if they all end up with almost unit norm after training, then a small $l_1$ distance between two vectors implies that the angle between them is small, hence the cosine similarity will be high. Conversely, almost collinear vectors will have almost equal coordinates, because they all have unit length, and hence a small $l_1$ distance. So if one works well, the other will work well too.
To see this, remember the equivalence of $l_1$ and $l_2$ norms in $\mathbb{R}^n$, in particular that for any $x \in \mathbb{R}^n$ it holds that $||x||_2 \le ||x||_1$. We can use that to prove the first of the statements (the other is left as an exercise ;)
If $||u||_2 = ||v||_2 = 1$ and $||u-v||_1 \le \sqrt{2\epsilon}$, then $\langle u, v \rangle \ge 1 - \epsilon$.
To prove this, expand $||u-v||_2^2 = ||u||_2^2 + ||v||_2^2 - 2\langle u, v \rangle = 2 - 2\langle u, v \rangle$ (using the unit norms) to obtain:
$$\langle u, v \rangle = 1 - \frac{1}{2} ||u-v||_2^2 \ge 1- \frac{1}{2} ||u-v||_1^2 \ge 1 - \epsilon.$$
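If you want to convince yourself numerically, here is a minimal sketch (dimension, perturbation scale, and number of trials are arbitrary choices) that draws pairs of nearby unit vectors and checks the bound with $\epsilon = \frac{1}{2}||u-v||_1^2$, i.e. exactly the $\epsilon$ for which the hypothesis $||u-v||_1 \le \sqrt{2\epsilon}$ holds with equality:

```python
import numpy as np

rng = np.random.default_rng(0)

for _ in range(1000):
    # a random direction and a nearby perturbed direction
    u = rng.normal(size=8)
    v = u + 0.1 * rng.normal(size=8)
    # normalize both to unit l2 norm
    u /= np.linalg.norm(u)
    v /= np.linalg.norm(v)

    l1 = np.abs(u - v).sum()       # ||u - v||_1
    eps = l1**2 / 2                # so that ||u - v||_1 = sqrt(2*eps)
    # the claim: <u, v> >= 1 - eps
    assert u @ v >= 1 - eps
```

The assertion never fires, since the $l_2$ norm is bounded by the $l_1$ norm in every trial.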
So in the end which one you choose is up to you. One reason to prefer the cosine is the differentiability of the scalar product, which, if you assume normalized vectors, is all you need to compute it.