```r
# init jieba
library(jiebaR)
seg_local = worker()

# init cluster
library(parallel)
cl = makeCluster(3)

# init args and functions
args = c('abc def', 'abd efg', 'ah gs fhg')
get_seg_local = function(d) segment(d, seg_local)
get_seg_remote = function(d) segment(d, seg_remote)
clusterEvalQ(cl, library(jiebaR))

# ======================
# Define worker() locally and export it
# ======================
clusterExport(cl, 'seg_local')
# clusterExport(cl, 'get_seg_local')
parLapply(cl, args, get_seg_local)
# Error in checkForRemoteErrors(val) :
#   3 nodes produced errors; first error: Please create a new worker after jiebaR is reloaded.

# ========================
# Define worker() remotely on the nodes
# ========================
clusterCall(cl, function(){
  seg_remote = worker()
})
parLapply(cl, args, get_seg_remote)
# Error in checkForRemoteErrors(val) :
#   3 nodes produced errors; first error: object 'seg_remote' not found
```
The error with the locally defined worker is mainly caused by a timestamp mismatch; see line 42 of:
https://github.com/qinwf/jiebaR/blob/master/R/segment.R
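Of course, the error in the second approach is not a jiebaR problem (I looked through quite a lot of related material on my own but never found an answer). I would like to ask whether there is a better way to use jiebaR in parallel computation. Thank you!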
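Hi, I am fairly busy for the next few days, so here is a brief answer for now, with more detail to follow. Parallelism raises many problems, one of which is data races. What you have run into is more or less this problem.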
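Besides coarse-grained parallelism at the R level, I have been implementing parallel segmentation with C++11's concurrency facilities. However, Rtools gcc 4.9 on Windows has a bug: DLL loading has problems on 64-bit, so this feature has not been formally added to the main branch. It will probably have to wait until Rtools gcc 4.9 is stable on Windows.
You could try creating a new cutter inside each node of the parallel cluster. That way there may be no timestamp problem, and data races can also be avoided. For example: 3 nodes, 3 cutters.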
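Thank you for your answer. My second approach was in fact also meant to create a new cutter in each cluster node, but I wrote it incorrectly: it should not have been placed inside an anonymous function. :-(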
clusterCall(cl,function(){ seg_remote=worker() })
should be changed to
clusterEvalQ(cl,{seg_remote=worker()})
Looking forward to jiebaR's new features!
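The difference between the two calls comes down to scoping on the nodes: a variable assigned inside an anonymous function passed to `clusterCall` is local to that call and disappears when it returns, while `clusterEvalQ` evaluates its expression in each node's global environment. The following is a minimal sketch of that behaviour using only the base `parallel` package, so it runs without jiebaR; the names `local_only` and `kept` are hypothetical stand-ins for `seg_remote`.

```r
library(parallel)

cl <- makeCluster(2)

# Assignment inside an anonymous function: the variable lives only in the
# function's own frame on each node and is gone once the call returns.
clusterCall(cl, function() { local_only <- 1; NULL })
found_after_call <- unlist(clusterEvalQ(cl, exists("local_only")))

# clusterEvalQ evaluates the expression in each node's global environment,
# so the variable is still there for later calls such as parLapply.
clusterEvalQ(cl, { kept <- 1; NULL })
found_after_evalq <- unlist(clusterEvalQ(cl, exists("kept")))

stopCluster(cl)

print(found_after_call)
print(found_after_evalq)
```

Under this reading, keeping the anonymous function but assigning with `<<-` or `assign(..., envir = .GlobalEnv)` would presumably also work, though the `clusterEvalQ` form is shorter. Ending the expression with `NULL` is a design choice that avoids sending the created object back to the master.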