Perplexed by perplexity

I've seen two definitions of the perplexity metric:

$PP = 2^{H(p)}$

and

$PP = 2^{H(p, q)}$
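
For reference, by $H(p)$ I mean the entropy of the true distribution $p$, and by $H(p, q)$ the cross-entropy of the model distribution $q$ against $p$:

$$H(p) = -\sum_x p(x) \log_2 p(x), \qquad H(p, q) = -\sum_x p(x) \log_2 q(x)$$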

If I'm understanding this correctly, the first one only tells us how confident the model is in its own predictions, while the second one reflects how accurate those predictions are against the true distribution. Am I correct?
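
To make the second definition concrete, here is a minimal sketch of how I understand $2^{H(p, q)}$ being computed in practice (the token probabilities below are made-up placeholders, not from a real model):

```python
import math

# Probabilities q(w_i | context) that a model assigns to each token of a
# held-out sequence; these values are hypothetical placeholders.
model_probs = [0.2, 0.5, 0.1, 0.4]

# Cross-entropy H(p, q) in bits: the average negative log2-probability per
# token, with p taken as the empirical distribution of the held-out text.
cross_entropy = -sum(math.log2(q) for q in model_probs) / len(model_probs)

# Perplexity as 2 raised to the cross-entropy.
perplexity = 2 ** cross_entropy
print(f"H(p, q) = {cross_entropy:.3f} bits, PP = {perplexity:.3f}")
```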

Which one do people actually mean when they report in a paper that their language model achieved X perplexity?

Tags: perplexity, nlp
