By the principle of parsimony, Gini outperforms entropy in terms of computational ease (a logarithm obviously involves more computation than a plain multiplication at the processor/machine level).
But entropy definitely has an edge in some cases involving highly imbalanced data.
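As a rough illustration of the cost difference, here is a small Python/NumPy micro-benchmark (the array shapes and `timeit` setup are my own choices, and absolute timings will vary by machine):

```python
import timeit
import numpy as np

# 100,000 class-probability vectors of length 10, just as benchmark input.
p = np.random.dirichlet(np.ones(10), size=100_000)

gini_time = timeit.timeit(lambda: 1.0 - np.sum(p ** 2, axis=1), number=100)
entropy_time = timeit.timeit(lambda: -np.sum(p * np.log(p), axis=1), number=100)

# The log-based entropy is typically the slower of the two.
print(f"Gini:    {gini_time:.3f}s")
print(f"Entropy: {entropy_time:.3f}s")
```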
Since entropy takes the log of each probability and multiplies it by the probability of the event, what happens in the background is that the lower probabilities get scaled up.
If your data's probability distribution is exponential or Laplace (as in deep learning, where we need the probability distribution at a sharp point), entropy outperforms Gini.
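For reference, a minimal sketch of the two impurity measures in Python/NumPy (the function names are mine, and I use a base-$10$ logarithm only so the numbers match the example below; base $2$ or $e$ just rescales the result):

```python
import numpy as np

def gini_impurity(p):
    """Gini impurity: 1 - sum(p_i^2). Only squares and a sum, no logs."""
    p = np.asarray(p, dtype=float)
    return 1.0 - np.sum(p ** 2)

def entropy(p, base=10.0):
    """Entropy: -sum(p_i * log(p_i)). Base 10 here to match the example."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]                      # convention: 0 * log(0) = 0
    return -np.sum(p * np.log(p)) / np.log(base)
```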
To give an example, suppose you have $2$ events, one with probability $.01$ and the other with probability $.99$.
With Gini, the sum of squared probabilities is $.01^2 + .99^2 = .0001 + .9801 = .9802$ (impurity $1 - .9802 = .0198$), which means the lower probability plays essentially no role: everything is governed by the majority probability.
Now, in the case of entropy (using $\log_{10}$): $.01\log(.01) + .99\log(.99) = .01(-2) + .99(-.00436) = -.02 - .00432 = -.02432$, so the entropy is $.02432$. In this case it is clearly seen that the lower probability is given much better weightage.
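Plugging the two-event example into the sketch above (or straight into NumPy) confirms the arithmetic and shows each class's contribution:

```python
import numpy as np

p = np.array([0.01, 0.99])

print(1.0 - np.sum(p ** 2))       # Gini impurity: 1 - (0.0001 + 0.9801) = 0.0198
print(-np.sum(p * np.log10(p)))   # entropy (base 10): 0.02 + 0.00432 ≈ 0.0243

# Per-class contributions: the minority class is nearly invisible in the
# squared terms but supplies most of the entropy.
print(p ** 2)                     # [0.0001, 0.9801]
print(-p * np.log10(p))           # [0.02, 0.00432...]
```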