My answer is not limited to NLP, and I think NLP is no different in this respect from other types of learning.
An interesting technical treatment is offered by "On Discriminative vs. Generative Classifiers: A comparison of logistic regression and naive Bayes" by Andrew Ng and Michael Jordan (NIPS 2001).
Now for a more informal opinion:
Discriminative classifiers attack the learning problem directly: in the end you build a classifier for prediction, which means you build an estimate of $p(y|x)$. Generative models arrive at the same estimate through Bayes' theorem: they estimate the joint probability $p(x, y)$, and the conditional is obtained as a consequence.
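Concretely, the generative route recovers the conditional from the joint via Bayes' theorem:

$$p(y \mid x) = \frac{p(x, y)}{p(x)} = \frac{p(x \mid y)\,p(y)}{\sum_{y'} p(x \mid y')\,p(y')}$$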
Intuitively, generative classifiers require more data, since the space being modeled is usually larger than that of a discriminative model, and more parameters mean more data is needed. Sometimes it is not only the parameters: even the form of the joint distribution can be harder to model than that of the conditional.
But if you have enough data available, it is also to be expected that a generative model will be more robust. Those are intuitions. Vapnik once asked why we should estimate the joint distribution when the problem we actually have to solve is the conditional one. He seems to be right if you are interested only in prediction.
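To make the data-efficiency question concrete, here is a minimal sketch (not from the original answer) comparing a generative classifier (Gaussian naive Bayes) with its discriminative counterpart (logistic regression) as the training set grows; the synthetic dataset and sample sizes are arbitrary illustrative choices:

```python
# Sketch: compare a generative classifier (Gaussian naive Bayes) with a
# discriminative one (logistic regression) at increasing training sizes.
# The synthetic data and the sizes tried are arbitrary illustrative choices.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0)

for n in (20, 50, 100, 500, 1000):
    nb = GaussianNB().fit(X_train[:n], y_train[:n])        # models p(x, y)
    lr = LogisticRegression(max_iter=1000).fit(
        X_train[:n], y_train[:n])                           # models p(y|x)
    print(f"n={n:4d}  naive Bayes={nb.score(X_test, y_test):.3f}  "
          f"logistic regression={lr.score(X_test, y_test):.3f}")
```

Plotting these test accuracies against $n$ gives learning curves of the kind studied in the Ng and Jordan paper cited above.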
My opinion is that many factors influence the choice between a generative model and a conditional one, including the complexity of the formalism, the complexity of the input data, the flexibility to extend the results beyond prediction, and the models themselves. If discriminative models are superior as a function of the available data, it is perhaps by a small margin.