The video is available at: link
Resources you will need.
Related resources covered in this video:
1. jieba, a Chinese word segmentation library:
https://github.com/fxsjy/jieba
2. Word2Vec computation in Spark:
http://spark.apache.org/docs/latest/ml-features.html#word2vec
3. Tencent's open-source word2vec data (8 million entries):
https://ai.tencent.com/ailab/nlp/embedding.html
4. SciPy's cosine distance function, used for similarity computation:
https://docs.scipy.org/doc/scipy-0.14.0/reference/generated/scipy.spatial.distance.cosine.html
5. Locality-sensitive hashing (LSH), an algorithm for speeding up similarity nearest-neighbor search:
http://spark.apache.org/docs/latest/ml-features.html#locality-sensitive-hashing
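As a rough illustration of the similarity computation in item 4, here is a minimal pure-Python sketch of cosine similarity between two word vectors. Note that `scipy.spatial.distance.cosine` returns the cosine distance, that is, 1 minus the cosine similarity. The helper names and the toy vectors below are hypothetical and chosen only for this example; real word2vec vectors would have far more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def cosine_distance(u, v):
    """1 - cosine similarity, the same quantity SciPy's cosine() computes."""
    return 1.0 - cosine_similarity(u, v)


# Hypothetical toy "word vectors" (illustration only, not real embeddings).
vec_a = [1.0, 2.0, 3.0]
vec_b = [2.0, 4.0, 6.0]   # same direction as vec_a
vec_c = [-3.0, 0.0, 1.0]  # orthogonal to vec_a

print(cosine_similarity(vec_a, vec_b))
print(cosine_similarity(vec_a, vec_c))
print(cosine_distance(vec_a, vec_c))
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score close to 0. Comparing every pair this way grows quadratically with the number of items, which is the cost that the LSH approach in item 5 is meant to reduce.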