Add an interface for whoosh #11

Open
terminus318 opened this issue Nov 6, 2012 · 5 comments

Comments

@terminus318

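I have been studying Chinese word segmentation recently and was planning to write my own segmenter, using bidirectional matching for segmentation and an HMM to handle out-of-vocabulary words and resolve ambiguity.
After looking at jieba, though, it seems that many of the details have already been worked out.
I plan to implement a whoosh tokenizer interface for it and use it in my next project.
Could you share any material on jieba's design?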
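
For readers unfamiliar with the bidirectional matching mentioned above, here is a minimal sketch of the general technique. It is not taken from jieba or from any code in this thread: the toy vocabulary, the function names, and the tie-breaking rule (fewer tokens, then fewer single-character tokens, then prefer the backward pass) are all assumptions chosen for illustration, and the HMM step for out-of-vocabulary words is left out.

```python
def forward_max_match(text, vocab, max_len):
    """Greedy left-to-right matching: take the longest vocabulary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Fall back to a single character when no vocabulary word matches.
            if size == 1 or piece in vocab:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, vocab, max_len):
    """Greedy right-to-left matching: take the longest vocabulary word ending at each position."""
    tokens, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                tokens.append(piece)
                j -= size
                break
    tokens.reverse()  # tokens were collected from the end of the text
    return tokens


def bidirectional_match(text, vocab):
    """Run both passes and keep the segmentation with the lower heuristic cost."""
    max_len = max(map(len, vocab), default=1)
    fwd = forward_max_match(text, vocab, max_len)
    bwd = backward_max_match(text, vocab, max_len)

    def cost(tokens):
        # Assumed heuristic: fewer tokens first, then fewer single-character tokens.
        return (len(tokens), sum(1 for t in tokens if len(t) == 1))

    return fwd if cost(fwd) < cost(bwd) else bwd


# Hypothetical toy vocabulary, for illustration only.
VOCAB = {"研究", "研究生", "生命", "起源"}
print(bidirectional_match("研究生命起源", VOCAB))  # ['研究', '生命', '起源']
```

In this example the forward pass greedily takes 研究生 and leaves 命 stranded as a single character, while the backward pass yields three dictionary words, so the heuristic keeps the backward result.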

@oldcai

oldcai commented Nov 6, 2012

The source code is available ;)

@fxsjy
Owner

fxsjy commented Nov 23, 2012

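@terminus318, I have been busy with work lately and have not looked into whoosh yet. While searching online I found that someone has already integrated whoosh with jieba. Noting it here for now: http://blog.csdn.net/wenxuansoft/article/details/8170714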

@fxsjy
Owner

fxsjy commented Jan 5, 2013

@fxsjy
Owner

fxsjy commented Jul 1, 2013

@terminus318, @oldcai, jieba 0.30 now includes a tokenizer interface for Whoosh: ChineseAnalyzer.
Usage: https://github.com/fxsjy/jieba/blob/master/test/test_whoosh.py
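
As a rough illustration of how ChineseAnalyzer might be plugged into a Whoosh schema, here is a minimal sketch. It assumes jieba 0.30 or later and Whoosh are both installed. The helper name `search_with_jieba`, the field names, and the temporary index directory are illustrative choices, not part of either library; the linked test file remains the authoritative usage.

```python
import tempfile


def search_with_jieba(documents, keyword):
    """Index (path, content) pairs with ChineseAnalyzer and return matching contents."""
    # Third-party imports live inside the function so this sketch can be
    # loaded even where jieba or Whoosh is not installed.
    from jieba.analyse import ChineseAnalyzer
    from whoosh.fields import ID, TEXT, Schema
    from whoosh.index import create_in
    from whoosh.qparser import QueryParser

    # The content field is tokenized by jieba instead of Whoosh's default analyzer.
    schema = Schema(
        path=ID(stored=True),
        content=TEXT(stored=True, analyzer=ChineseAnalyzer()),
    )
    with tempfile.TemporaryDirectory() as index_dir:
        ix = create_in(index_dir, schema)
        writer = ix.writer()
        for path, content in documents:
            writer.add_document(path=path, content=content)
        writer.commit()

        with ix.searcher() as searcher:
            # The query string is parsed with the same analyzer as the field.
            query = QueryParser("content", schema=ix.schema).parse(keyword)
            hits = [hit["content"] for hit in searcher.search(query)]
        ix.close()
    return hits
```

Called with a list of `(path, content)` pairs and a keyword such as `"交换机"`, the helper is meant to return the stored content of every document whose jieba tokens match the query. A real application would keep the index in a persistent directory rather than rebuilding it for each search.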

@oldcai

oldcai commented Jul 4, 2013

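Haha, thanks.

Also: the 工信处女干事 (the female clerk at the Industry and Information Office) sure is busy; she has to be brought in for every test.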
