BrainEditor/word2vecApp

 
 

Word2Vec application with TED talk transcripts

The Word2Vec app returns words similar to a given word. The similarities come from a machine learning model (word2vec, trained with gensim) built on TED talk transcripts. I load the trained model in Node.js with the npm package word2vec.

Installation

First download or clone the master branch, then set up the server application (install Node.js first if you don't have it):

# Install all packages
npm install
# Start the server
node serve.js

Once the server is running, you can find the application at localhost:3000.

Usage

Type in a word and submit it. You will then see the words the model considers most similar, that is, the words used most often in combination with it in the TED talk transcripts.
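To make the submit-and-lookup step concrete, here is a minimal sketch of how a submitted word could be validated and turned into a response. It is not taken from serve.js: the `handleQuery` helper, the `/similar` route, the `word` query parameter, the lowercasing step and the stub data are all assumptions made for illustration. The `lookup` argument stands in for whatever function returns similar words from the trained model.

```javascript
// Hypothetical helper (not from serve.js): maps a request URL to a response object.
// `lookup(word, n)` is a placeholder for the function that queries the model.
function handleQuery(requestUrl, lookup, limit = 10) {
  // The base URL is only needed so that relative paths can be parsed.
  const url = new URL(requestUrl, 'http://localhost:3000');
  // Assumption: the vocabulary is lowercase, so the input is normalized.
  const word = (url.searchParams.get('word') || '').trim().toLowerCase();

  if (word === '') {
    return { status: 400, body: { error: 'Please enter a word.' } };
  }

  const matches = lookup(word, limit);
  if (!matches || matches.length === 0) {
    return { status: 404, body: { error: `"${word}" is not in the vocabulary.` } };
  }
  return { status: 200, body: { word, matches } };
}

// Stub lookup with made-up data, so the sketch runs without a trained model.
const stubLookup = (word, n) =>
  word === 'sing' ? [{ word: 'song', score: 0.9 }].slice(0, n) : [];

console.log(handleQuery('/similar?word=Sing', stubLookup));
```

Keeping the lookup as a parameter means the same helper can be exercised with stub data, as here, or wired to a real model later.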

Background

This summer I took a class on applied machine learning and had to come up with a project idea. After some research in this field (which is in fact a whole new bag of burritos; I was really astonished and overwhelmed), one specific algorithm caught my eye: Word2Vec.

What is a Word2Vec model?

As input, the algorithm receives a large amount of text data (speeches, book content, dictionaries, crawled text from websites and so on) and maps each word to a vector in space (the dimension of the vector is typically around 50–1000, depending on the data set). These “word vectors” are positioned close to the vectors of other words that are used in a similar context, such as within the same sentence. For example, the word “sing” could be positioned close to “song”. Here's a point cloud of words that shows the result of a trained model and how the words are mapped according to the dataset.
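Closeness between word vectors is commonly measured with cosine similarity. The following is a small sketch of that idea, not the app's actual code: the three-dimensional vectors are invented by hand for illustration (a trained model learns them from text), and the helper names `cosineSimilarity` and `mostSimilar` are chosen here for readability.

```javascript
// Toy 3-dimensional "word vectors", invented for illustration only.
// A real model learns vectors with many more dimensions from the training text.
const vectors = {
  sing:  [0.9, 0.8, 0.1],
  song:  [0.8, 0.9, 0.2],
  music: [0.7, 0.6, 0.3],
  tax:   [0.1, 0.2, 0.9],
};

// Cosine similarity: close to 1 when two vectors point in the same direction,
// close to 0 when they are nearly orthogonal.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank every other word by its similarity to `word` and keep the top `n`.
function mostSimilar(word, n = 2) {
  const target = vectors[word];
  if (!target) return []; // word is not in the toy vocabulary
  return Object.keys(vectors)
    .filter((other) => other !== word)
    .map((other) => ({ word: other, score: cosineSimilarity(target, vectors[other]) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, n);
}

console.log(mostSimilar('sing'));
```

With these made-up vectors, “song” ranks first for “sing” and “tax” ranks last, which mirrors the sing/song example above.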

(Image: point cloud of word vectors from a trained model)

I described the whole process in this Medium article.

Feel free to clone or fork! ♥️
