Planning to rewrite the documentation and tutorials with bookdown soon; comments and suggestions are welcome #35

Open
qinwf opened this issue Jun 16, 2016 · 6 comments

Comments

@qinwf
Owner

qinwf commented Jun 16, 2016

No description provided.

@BruceZhaoR

@qinwf

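  1. I think it is enough to document the usage of the current functions with examples; the history of earlier versions could probably be left out?
  2. Could you briefly introduce the underlying principles: strengths, weaknesses, where the bottlenecks are, and so on?
  3. Could the introduction to the dictionaries be more detailed? I was completely lost the first time I came across them, and ended up asking some silly questions.
  4. Which parts can users optimize themselves, for example by training their own IDF dictionary? It would be perfect if you could describe how to train one.
  5. I will put together an FAQ this weekend, summarizing all of the issues; it will probably have 5 to 8 entries. Please fill in anything I miss.
  6. Examples of downstream analysis could be added. I have seen many articles that segment text with jiebaR and then do all sorts of impressive text analysis; linking to them might help people get started faster. I will keep an eye out and add links when I come across them.

Finally, thank you very much for developing this package; it has benefited a great many text analysis enthusiasts!!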

@qinwf
Owner Author

qinwf commented Jun 16, 2016

The new documentation is being updated here: https://jiebaR.qinwf.com/

@BruceZhaoR

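I have a few small questions about the details; could you answer them when you have time? I would like to set up something like this too... I feel like I am going off topic... (runs away) 😵

  1. https://github.com/qinwf/jiebaR_doc/tree/gh-pages has been updated; how did you manage to delete the previous commits?
  2. Where did you buy your domain name?
  3. Do you render on master with bookdown::render and then move the output into gh-pages to produce a bookdown site like this?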

@qinwf
Owner Author

qinwf commented Jun 16, 2016

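See these lines of the Travis file: https://github.com/qinwf/jiebaR_doc/blob/master/.travis.yml#L52-L58

Domain names can be bought from many sites: Alibaba Cloud (阿里云), Tencent Cloud (腾讯云), GoDaddy, and plenty of others; a quick web search will turn them up.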

@Hz-EMW

Hz-EMW commented Jan 11, 2017

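Are there any papers or research reports related to jiebaR? They could be listed in the documentation so that researchers can consult and cite them. What do you think?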

@qinwf
Owner Author

qinwf commented Jan 11, 2017

Thanks for your support. You can call the citation function in R, citation(package = "jiebaR"), to get a BibTeX entry for citing the package. I hope you find it helpful. jiebaR uses CppJieba as a library, and the author of CppJieba is Yanyi Wu.

You can also look at this page to see how other people cite packages on CRAN: http://onlinelibrary.wiley.com/doi/10.1002/9781118763667.oth1/pdf

The methods in this package are common to most Chinese text segmentation tools. They mostly involve the Hidden Markov Model, the Viterbi algorithm, and maximum likelihood decoding. The Viterbi algorithm is a dynamic programming algorithm that finds the most likely sequence of hidden states. In Chinese text segmentation, the hidden states represent the segmentation status of the words.

I will add some material on the internals to the upcoming docs.
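
To make the Viterbi idea concrete, here is a minimal sketch in Python. It is not the code used by jiebaR or CppJieba. It assumes the common four-tag scheme in which every character is labelled B, M, E or S (begin, middle, or end of a multi-character word, or a single-character word), and every probability below is a made-up toy value; a real model would estimate them from a training corpus. The names viterbi, segment, START, TRANS and EMIT are hypothetical.

```python
import math

STATES = ("B", "M", "E", "S")
NEG_INF = float("-inf")      # log-probability of an impossible event
FLOOR = math.log(1e-8)       # fallback for characters a state has never emitted


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {k2: math.log(p) for k2, p in row.items()} for k, row in table.items()}


# Toy parameters, invented for illustration only.
START = {"B": math.log(0.6), "S": math.log(0.4)}   # a sentence cannot start with M or E
TRANS = to_log({
    "B": {"M": 0.2, "E": 0.8},
    "M": {"M": 0.3, "E": 0.7},
    "E": {"B": 0.6, "S": 0.4},
    "S": {"B": 0.5, "S": 0.5},
})
EMIT = to_log({
    "B": {"北": 0.5, "天": 0.3},
    "M": {"安": 0.3},
    "E": {"京": 0.5, "安": 0.2},
    "S": {"我": 0.4, "爱": 0.3},
})


def emit(state, ch):
    return EMIT[state].get(ch, FLOOR)


def viterbi(text):
    """Return the most likely tag sequence for text under the toy model."""
    if not text:
        return []
    # score[t][s]: best log-probability of any tag path that ends in s at position t
    score = [{s: START.get(s, NEG_INF) + emit(s, text[0]) for s in STATES}]
    back = [{}]  # back[t][s]: previous tag on that best path
    for ch in text[1:]:
        prev = score[-1]
        row, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda p: prev[p] + TRANS[p].get(s, NEG_INF))
            row[s] = prev[best] + TRANS[best].get(s, NEG_INF) + emit(s, ch)
            ptr[s] = best
        score.append(row)
        back.append(ptr)
    # A word cannot be left unfinished, so the path must end in E or S.
    last = max(("E", "S"), key=lambda s: score[-1][s])
    path = [last]
    for ptr in reversed(back[1:]):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def segment(text):
    """Cut text after every character tagged E or S."""
    words, current = [], ""
    for ch, tag in zip(text, viterbi(text)):
        current += ch
        if tag in ("E", "S"):
            words.append(current)
            current = ""
    return words
```

With these toy numbers, segment("我爱北京") returns ["我", "爱", "北京"]: the best tag path is S, S, B, E. Working in log space keeps long sentences from underflowing, and disallowed transitions such as B followed by B are simply absent from TRANS, so they score negative infinity and can never appear on the chosen path.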
