How can I find company descriptions for a long list of companies?

I'm going to train an ML algorithm to qualify potential sales leads based on company descriptions. To do this, I need to find the company descriptions programmatically.

E.g. given a long list of company names, how can I find descriptions for these companies? Here are my current techniques, which work OK, but not great:

  • Using the Google Search API to fetch the summary from the results page (Google normally gives the company website as the first result when searching by company name)
  • Fetching all data from the company's website (this includes an about page and a lot of other, less relevant data)
  • Using the FullContact Company API to get any descriptions the company has on its social profiles (works great if the company has an online presence)
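To illustrate the second technique, here is a minimal sketch of how the website fetch could be narrowed to just a short summary instead of the whole page. It assumes the company homepage exposes a `description`, `og:description`, or `twitter:description` meta tag, which many sites do but not all. The names `MetaDescriptionParser`, `extract_description`, and `fetch_description` are hypothetical helpers written for this example, and only the Python standard library is used.

```python
from html.parser import HTMLParser
from typing import Optional
from urllib.request import Request, urlopen


class MetaDescriptionParser(HTMLParser):
    """Collects the content of description-style <meta> tags."""

    # Assumed list of meta keys that usually carry a short page summary,
    # in order of preference.
    KEYS = ("description", "og:description", "twitter:description")

    def __init__(self):
        super().__init__()
        self.found = {}  # meta key -> first non-empty content seen

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        # Standard tags use name=..., Open Graph tags use property=...
        key = (attr.get("name") or attr.get("property") or "").lower()
        content = attr.get("content", "").strip()
        if key in self.KEYS and content and key not in self.found:
            self.found[key] = content


def extract_description(html: str) -> Optional[str]:
    """Return the preferred meta description in the HTML, or None."""
    parser = MetaDescriptionParser()
    parser.feed(html)
    for key in MetaDescriptionParser.KEYS:
        if key in parser.found:
            return parser.found[key]
    return None


def fetch_description(url: str, timeout: float = 10.0) -> Optional[str]:
    """Download a page and extract its meta description (hypothetical helper)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    return extract_description(html)
```

Calling `fetch_description` on the URL returned by the search step would give one short text per company, and a `None` result marks the companies that still need one of the other techniques.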

