There isn't a well-established way of estimating the number of data points you'll need; it's much more an art than a science. As you gain experience, you'll learn some common-sense lessons (usually in hindsight) along the way. For instance, you should never have more parameters than data points: if you're building random forests, don't grow more trees than you have data points, and if you're doing deep learning, don't use more neurons than you have data points (these are extreme examples). As a rule of thumb, aim for at least 10 times as many data points as features. 300 data points is very small, so you should probably limit yourself to linear models.
As Hobbes mentioned, cross-validation and/or holdout sets are the proper way to judge a model's readiness, irrespective of the data-point-to-feature ratio. If the error on the training set is more than 10% better than the error on the test/validation set, you're probably overfitting and your model is not fit for production. That 10% threshold is very much a rule of thumb.
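To make that check concrete, here's a minimal sketch using scikit-learn's `cross_validate`; the synthetic 300-point dataset and the logistic regression model are illustrative assumptions, not a prescription:

```python
# Sketch: compare mean training vs. validation accuracy across CV folds.
# Dataset, model, and the 10% threshold are placeholder assumptions.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate

X, y = make_classification(n_samples=300, n_features=20, random_state=0)
model = LogisticRegression(max_iter=1000)

# return_train_score=True gives us both sides of the comparison
scores = cross_validate(model, X, y, cv=5, return_train_score=True)
train_acc = scores["train_score"].mean()
val_acc = scores["test_score"].mean()

print(f"train accuracy: {train_acc:.3f}, validation accuracy: {val_acc:.3f}")
if train_acc - val_acc > 0.10:  # the rule-of-thumb threshold from above
    print("Gap exceeds 10% -- probably overfitting, not fit for production.")
```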
Also worth mentioning: once you feel comfortable that your model is fit for production, you should never deploy it and then just walk away. If your model has value, do ongoing post-production monitoring. In the real world, your model's performance will inevitably degrade over time (regardless of how strong/accurate it was at build time) as the environment your model was trained for evolves. This is certainly true in the finance and insurance industries.
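As a rough sketch of what that monitoring could look like (the baseline accuracy, window size, and alert threshold below are all made-up placeholders):

```python
# Sketch: compare live accuracy on recently labeled predictions against
# the accuracy measured at build time, and alert when it drifts too far.
from collections import deque

BASELINE_ACCURACY = 0.85   # holdout accuracy at build time (assumed)
WINDOW = 500               # number of recent predictions to track (assumed)
ALERT_GAP = 0.05           # alert threshold in accuracy points (assumed)

recent_hits = deque(maxlen=WINDOW)

def record_outcome(prediction, actual):
    """Call this once the true label for a served prediction becomes known."""
    recent_hits.append(int(prediction == actual))
    if len(recent_hits) == WINDOW:
        live_accuracy = sum(recent_hits) / WINDOW
        if BASELINE_ACCURACY - live_accuracy > ALERT_GAP:
            print(f"ALERT: live accuracy {live_accuracy:.3f} is well below "
                  f"the build-time baseline {BASELINE_ACCURACY:.3f}")
```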
Production monitoring can also be used to pull the plug if your model performs far worse than you anticipated. If you have a very low opinion of your model's potential stability, deploy it silently (sometimes called a shadow deployment: don't let the predictions flow to downstream systems) and measure how your model would have done had you actually deployed it.
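A hedged sketch of what the silent path might look like; `serve_request` and the logging setup are hypothetical names for illustration:

```python
# Sketch: the shadow model scores live traffic and the results are logged
# for offline evaluation, but nothing is ever returned to downstream systems.
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("shadow_model")

def serve_request(features, model):
    prediction = model.predict([features])[0]
    # Log for later comparison against actual outcomes.
    logger.info("shadow prediction: features=%s prediction=%s",
                features, prediction)
    return None  # downstream systems never see the shadow model's output
```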
A last piece of advice: when you make predictions on new documents, you'll nearly always come across words that your training set didn't account for. You should find a way to build this awareness into your model. For example, you could restrict the predictors to a fixed set of keywords, or you could create indicator variables for new words. Either way, you should not expect the vocabulary size to remain the same.
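As one concrete (assumed) approach using scikit-learn: a `CountVectorizer` fitted on the training corpus silently drops unseen words at prediction time, and you can carry an out-of-vocabulary count alongside it as an extra indicator feature. The tiny corpus and the whitespace tokenization below are simplifications for illustration:

```python
# Sketch: fix the vocabulary at training time and count unseen words
# at prediction time as an explicit out-of-vocabulary (OOV) feature.
from sklearn.feature_extraction.text import CountVectorizer

train_docs = ["the quick brown fox", "jumped over the lazy dog"]
vectorizer = CountVectorizer()
vectorizer.fit(train_docs)
known = set(vectorizer.get_feature_names_out())

def featurize(doc):
    counts = vectorizer.transform([doc])  # unseen words are silently ignored
    oov = sum(1 for w in doc.lower().split() if w not in known)
    return counts, oov                    # OOV count as an extra feature

counts, oov = featurize("the quick purple fox")  # "purple" was never seen
print(oov)                                       # -> 1
```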