#the environment
安装python需要用到的[依赖库](https://www.continuum.io/downloads "悬停显示") 根据系统下载对应的版本(推荐使用python2.7)
install the dependent library of python ,download the edition decided by your system(recommend use python 2.7)
#how to use
1.import this whole project to eclipse
2.open the integration.All.java
3.create a floder,and put DataSource.txt && k.txt in it
4.run the three_in_one function to finish these steps below
first:calculate the TF-IDF matrix
second:change the matrix to CSR matrix ,then run the cluster function,and find the best K
third:divide the DataSource into several txts,and find everyone's keywords and hot questions
this function take one parameter boolean
if true use Silhouette ways to find the best K (datasource<=1W)
if false use Inertia ways to find the best K (datasource > 1W)
5.run the fourth function to finish the tag progress
fourth:tag everyline of DataSource with a main class name and a second class name