How to feed a Knowledge Base into Language Models?

I’m a CS undergrad trying to make my way into NLP research. For some time, I have wanted to incorporate everyday commonsense reasoning into existing state-of-the-art language models, i.e. to make their generated output more reasonable and coherent with our practical world. There are several commonsense knowledge bases, such as ConceptNet (2018), ATOMIC (2019), Open Mind Common Sense (MIT), and Cyc (1984), but they exist in the form of knowledge graphs, ontologies, and taxonomies.

My question is, how can I go about leveraging these knowledge bases within current transformer language models like BERT and GPT-2? How can we fine-tune these models (or perhaps train new ones from scratch) on these knowledge bases, so that they retain their language-modeling capabilities but also gain a commonsense understanding of our physical world?
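For concreteness, here is roughly the kind of fine-tuning setup I have in mind: verbalize knowledge-base triples into plain sentences and train GPT-2 on them with the usual language-modeling objective. This is only a minimal sketch using the Hugging Face transformers library; the relation templates, the `verbalize` helper, and the toy triples are my own placeholders, not ConceptNet's or ATOMIC's actual format.

```python
# Sketch: verbalize knowledge-base triples and fine-tune GPT-2 on the resulting sentences.
# The templates and toy triples below are illustrative placeholders.
import torch
from transformers import GPT2Tokenizer, GPT2LMHeadModel

TEMPLATES = {
    "UsedFor": "{h} is used for {t}.",
    "CausesDesire": "{h} makes someone want {t}.",
    "AtLocation": "you are likely to find {h} in {t}.",
}

def verbalize(head, relation, tail):
    # Fall back to a generic pattern for relations without a hand-written template
    template = TEMPLATES.get(relation, "{h} " + relation + " {t}.")
    return template.format(h=head, t=tail)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5)

# Toy triples standing in for rows of a ConceptNet/ATOMIC dump
triples = [
    ("umbrella", "UsedFor", "staying dry in the rain"),
    ("hunger", "CausesDesire", "to eat food"),
    ("refrigerator", "AtLocation", "the kitchen"),
]

model.train()
for head, rel, tail in triples:
    text = verbalize(head, rel, tail)
    batch = tokenizer(text, return_tensors="pt")
    # Standard causal language-modeling objective: labels are the input ids themselves
    outputs = model(**batch, labels=batch["input_ids"])
    outputs.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```

This is essentially the idea behind COMET-style training on ATOMIC/ConceptNet, but I'm not sure whether this kind of fine-tuning really injects usable commonsense or just teaches the model to parrot the triples.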

If any better possibilities exist other than fine-tuning, I'm open to ideas.

Topic: knowledge-graph, bert, deep-learning, nlp, machine-learning

Category: Data Science


In my opinion this is a very difficult question, and it's not certain that it can be done.

Symbolic methods and statistical methods are hard to combine. In fact, statistical ML methods became mainstream because they solved most problems better than symbolic methods. This is especially true in NLP: the many attempts at rule-based representations of language (in the 80s and 90s) were not only expensive to build, they also never proved capable of covering the full diversity of natural language.

There have been various attempts at hybrid models for specific tasks, but to my knowledge none of these hybrid methods has proved good enough compared to purely statistical methods. What can work, however, is to introduce knowledge from symbolic resources as some of the features used by a statistical model. In this case the model itself is not symbolic at all, but it uses information coming from symbolic resources, as in the sketch below.
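To make that concrete, one simple way to do this with a transformer is to retrieve facts related to the input from the knowledge base and append them as additional input text, letting the statistical model decide what to make of them. This is only a rough sketch: the `KB_FACTS` lookup table, the `retrieve_facts` helper, and the two-label setup are placeholder assumptions, not a real ConceptNet interface.

```python
# Sketch: inject knowledge-base facts as extra input text for a BERT classifier.
# The lookup table below is a stand-in for a real ConceptNet query.
from transformers import BertTokenizer, BertForSequenceClassification

# Hypothetical retrieved facts keyed by surface terms found in the input
KB_FACTS = {
    "kettle": "a kettle is used for boiling water",
    "stove": "a stove is located in the kitchen",
}

def retrieve_facts(sentence):
    # Naive string matching; a real system would link entities to KB nodes
    return [fact for term, fact in KB_FACTS.items() if term in sentence.lower()]

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

sentence = "She put the kettle on the stove."
facts = " ; ".join(retrieve_facts(sentence))

# Sentence in one segment, retrieved commonsense facts in the other:
# [CLS] sentence [SEP] facts [SEP]
inputs = tokenizer(sentence, facts, return_tensors="pt")
logits = model(**inputs).logits  # the classification head is untrained here;
# fine-tuning on a downstream task would follow as usual.
```

The point is that the knowledge base only supplies extra evidence; the model remains a purely statistical one.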

"also get enhanced through a new commonsense understanding of our physical world"

Be careful not to assume that any of these models understands anything at all. Their results can be extremely convincing, but they are not strong AI. Natural language understanding is far from achieved (and may never be). You might be able to use symbolic resources to enhance the output of a model, but making such a model perform actual reasoning about what it's talking about is a whole other story (a sci-fi one, for now at least).
