NLP Interview Coding Task
Please comment on the following NLP interview coding task that I have prepared for candidates applying for a Data Science NLP position. The goal is to check the candidate's understanding of the fundamental role that vector representations of text play in NLP, as well as their coding skills and their ability to optimize computations using the vectorization that NumPy provides.
In particular I need your opinion on:
- Is the task clear?
- Is the task suitable for coding a rough solution from scratch in 20-30 minutes during an online interview?
- What level (Junior, Middle or Senior DS NLP Engineer) would you assign this task to?
Task:
# Write from scratch (you can only use NumPy arrays)
# a very basic and simple algorithm to classify these sentences:
test1 = "cats like meat and fish is best for cats"
test2 = "train your mind reading good fiction, thrillers and other books"
# Use these sentences to train your classifier:
# Class 1
sent1 = "meat is a good food for all dogs and cats , dogs also like apples"
# Class 2
sent2 = "reading fiction books is a good food for mind and some thrillers are not"
To solve this task, the candidate should write a count vectorizer and a cosine similarity function from scratch. Using these functions, the candidate can compute the similarity of each test sentence to classes 1 and 2 and assign it to the more similar class. Normalizing the vectors would be a bonus for the candidate.
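A rough sketch of one possible solution along these lines is shown here. It is not necessarily the only or the intended reference solution: the helper names (`tokenize`, `build_vocab`, `count_vectorize`, `cosine_similarity`) are illustrative, and it assumes lowercase whitespace tokenization with commas stripped and test-sentence words outside the training vocabulary ignored.

```python
import numpy as np

test1 = "cats like meat and fish is best for cats"
test2 = "train your mind reading good fiction, thrillers and other books"
sent1 = "meat is a good food for all dogs and cats , dogs also like apples"
sent2 = "reading fiction books is a good food for mind and some thrillers are not"

def tokenize(sentence):
    # Assumption: lowercase, treat commas as separators, split on whitespace.
    return sentence.lower().replace(",", " ").split()

def build_vocab(sentences):
    # Map every distinct training token to a column index.
    tokens = sorted({tok for s in sentences for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(tokens)}

def count_vectorize(sentences, vocab):
    # One row per sentence, one column per vocabulary token.
    X = np.zeros((len(sentences), len(vocab)))
    for row, s in enumerate(sentences):
        for tok in tokenize(s):
            col = vocab.get(tok)
            if col is not None:  # out-of-vocabulary tokens are skipped
                X[row, col] += 1
    return X

def cosine_similarity(A, B):
    # L2-normalize the rows, then one matrix product yields all pairwise cosines.
    # The small floor avoids division by zero for all-zero rows.
    A = A / np.maximum(np.linalg.norm(A, axis=1, keepdims=True), 1e-12)
    B = B / np.maximum(np.linalg.norm(B, axis=1, keepdims=True), 1e-12)
    return A @ B.T

vocab = build_vocab([sent1, sent2])
train = count_vectorize([sent1, sent2], vocab)   # rows: class 1, class 2
tests = count_vectorize([test1, test2], vocab)   # rows: test1, test2

sims = cosine_similarity(tests, train)           # shape (2 tests, 2 classes)
predicted = np.argmax(sims, axis=1) + 1          # class labels are 1-based
print(sims)
print(predicted)
```

With these assumptions, test1 comes out closest to class 1 and test2 closest to class 2. Normalizing inside `cosine_similarity` keeps the whole comparison a single matrix product, which is the vectorization aspect the task is meant to probe.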
It took me 20 minutes to code, test and describe this task. I am not sure how much time a candidate for an NLP position may need.
Topic cosine-distance classification nlp
Category Data Science