Training a Japanese Wikipedia Word2Vec Model by Gensim and Mecab

After “Training a Chinese Wikipedia Word2Vec Model by Gensim and Jieba“, we continue “Training a Japanese Wikipedia Word2Vec Model by Gensim and Mecab” with “Wikipedia_Word2vec” related scripts. Still, download the latest Japanese Wikipedia dump data first: https://dumps.wikimedia.org/jawiki/latest/jawiki-latest-pages-articles.xml.bz2. You can use … Continue reading →

Training a Chinese Wikipedia Word2Vec Model by Gensim and Jieba

We have posted two methods for training a word2vec model based on English wikipedia data: “Training Word2Vec Model on English Wikipedia by Gensim” and “Exploiting Wikipedia Word Similarity by Word2Vec“. Based on the pipeline and related scripts: Wikipedia_Word2vec,we can train … Continue reading →