Company categoriser

Assignment for the interview process at demandmatrix.com

-Extracted 520 company categories and their descriptions from g2crowd.com
-Cleaned the text of each company's Wikipedia page and website using NLTK
-Vectorised the cleaned text and used cosine similarity to assign a category to any company, given its name
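
The matching step above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes plain term-count vectors, uses a tiny hand-written stopword set as a stand-in for the NLTK cleaning, and the names `tokenize`, `cosine_similarity`, `categorise` and the two sample categories are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a tiny illustrative stopword set standing in for NLTK's list.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "that", "the", "to", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction, so treat it as unrelated
    return dot / (norm_a * norm_b)


def categorise(company_text, categories):
    """Return the category whose description is most similar to the company text."""
    company_vec = Counter(tokenize(company_text))
    scores = {
        name: cosine_similarity(company_vec, Counter(tokenize(description)))
        for name, description in categories.items()
    }
    return max(scores, key=scores.get)


# Hypothetical category descriptions; the real ones come from g2crowd.com.
CATEGORIES = {
    "CRM": "Software that manages customer relationships, sales pipelines and contacts",
    "Email Marketing": "Tools to send marketing email campaigns and newsletters to subscribers",
}
```

Calling `categorise(text, CATEGORIES)` with the cleaned text of a company's Wikipedia page or website returns the name of the best-matching category. Raw term counts keep the sketch short; a TF-IDF weighting would be a common alternative.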

Getting Started

-Install Python 3.6
-Install the prerequisites
-Run the script

Prerequisites

pip install bs4 wikipedia nltk urllib3
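
To confirm the install worked before running the script, a short check like the one below may help. It is only a sketch: the helper name `missing_modules` is hypothetical, and it assumes each package is imported under the same name used in the `pip install` command.

```python
import importlib.util

# Import names assumed to match the packages in the pip command above.
REQUIRED = ["bs4", "wikipedia", "nltk", "urllib3"]


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All prerequisites are installed.")
```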