Finish Chinese Document
Longda-Feng committed Jul 22, 2016
1 parent 5b5c2db commit 5fe2fbd
Showing 50 changed files with 2,020 additions and 41 deletions.
38 changes: 38 additions & 0 deletions docs/jstorm-doc/Community/Committers.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,41 @@ top-nav-pos: 2

* This will be replaced by the TOC
{:toc}



# Committer
封仲淹([@longdafeng](https://github.com/longdafeng))<br/>
刘键([@bastiliu](https://github.com/bastiliu))<br/>
王逸([@cody](https://github.com/unsleepy22))<br/>
方孝健([@hustfxj](https://github.com/hustfxj))<br/>
伍翀([@wuchong](https://github.com/wuchong))<br/>
冯健([@fengjian](https://github.com/fengjian428))<br/>
李鑫([@tumen](https://github.com/tumen))<br/>
母延年([[email protected]](https://github.com/muyannian))<br/>
周鑫([@zhouxinxust](https://github.com/zhouxinxust))<br/>
罗实([[email protected]](https://github.com/luoshi0801))<br/>
翟玉勇[[email protected]](https://github.com/zhaiyuyong)<br/>
程明磊[[email protected]](https://github.com/BlueSkyChina)<br/>
陈昱([@cycyyy](https://github.com/cycyyy))<br/>



# Contributor
[冯嘉@vongosling](https://github.com/vongosling)<br/>
[赵颖@flyhighzy](http://weibo.com/flyhighzy)<br/>
[谢正清@feilaoda](http://weibo.com/feilaoda)<br/>
[温绍锦@wenshao](https://github.com/wenshao)<br/>
[胡磊@qiyuan4f](https://github.com/qiyuan4f)<br/>
[@situfang](https://github.com/situfang)<br/>
[徐冠鹏@herberteuler](https://github.com/herberteuler)<br/>
[李家宏@Gvain](https://github.com/Gvain)<br/>
[贺小桥@Hexiaoqiao](https://github.com/Hexiaoqiao)<br/>
[姜宪瑶@jxysoft](https://github.com/jxysoft)<br/>
[周小帆@njzhxf](https://github.com/njzhxf)<br/>
[@yaphet](https://github.com/darionyaphet)<br/>
[@bjlindeqiang](https://github.com/L-Donne)<br/>
[@dingjun](https://github.com/dingjun84)<br/>
[@Dollyn](https://github.com/Dollyn)<br/>
[肥老大@feilaoda](https://github.com/feilaoda)<br/>

20 changes: 20 additions & 0 deletions docs/jstorm-doc/Community/Email.md
Expand Up @@ -7,3 +7,23 @@ top-nav-pos: 1

* This will be replaced by the TOC
{:toc}


[jstorm-user](https://groups.google.com/forum/#!forum/jstorm-user)

[jstorm-dev](mailto:[email protected])

QQ Groups: 228374502













32 changes: 32 additions & 0 deletions docs/jstorm-doc/Community/Events-Meetups.md
Expand Up @@ -7,3 +7,35 @@ top-nav-pos: 3

* This will be replaced by the TOC
{:toc}


* 2016-4-21 QCON 2016 Beijing, [阿里巴巴实时计算平台 JStorm Turbo](http://2016.qconbeijing.com/presentation/2852)
- * [PPT](http://2016.qconbeijing.com/presentation/2852)
- * [Video](https://v.qq.com/iframe/preview.html?vid=j0311u986gc&amp;width=500&amp;height=375&amp;auto=0)

* 2016-3-18 CHINA HADOOP SUMMIT 2016 Beijing, [阿里巴巴实时计算平台JStorm Turbo](http://chinahadoop.com/archives/1320)
- * [PPT](http://event.chinahadoop.com/download.php?r_id=1&t=ppt&f=19-pm-81-fengzongyan.pdf)
- * [Audio](http://event.chinahadoop.com/download.php?r_id=1&t=audio&f=19-pm-81-fengzongyan.mp3)

* 2015-10-31 Ctrip BigData Meetup -- 携程大数据沙龙 《JStorm介绍与规划》

* 2015-10-24 Apache China RoadShow, [阿里实时计算-apache 路演](http://www.huodongxing.com/event/9291887966700)
- * [PPT](www.kaiyuanshe.cn/file-download-234-left.html())

* 2015-8-8 [Shanghai Big Data Streaming 1st Meetup](http://www.meetup.com/Shanghai-Big-Data-Streaming-Meetup/events/224418388/)
- * [JStorm@Alibaba](http://files.meetup.com/18743046/JStorm_alibaba_fengzhongyan.pdf)

* 2014-12-27 [Alibaba Tech Meetup 31st - 阿里技术沙龙第31期-大数据实时计算](http://club.alibabatech.org/salon_detail.htm?salonId=60)
- * [JStorm & Storm Performance Tuning -- video](http://player.youku.com/player.php/sid/XODYwMjY0ODY4/v.swf())

* 2014-12-17 [kaiyuanshe Hackathon: BigData RealTime Hackathon based on JStorm -- 大数据实时分析编程黑客松](http://hacking.kaiyuanshe.cn/site/jstorm#)

* 2014-3-18 [Alibaba Tech Meetup 24th - 阿里技术沙龙第24期《JStorm流式计算》](http://v.youku.com/v_show/id_XNjkzMDg5MDky.html?from=s1.8-1-1.2)
- * [JStorm Introduction -- video](http://v.youku.com/v_show/id_XNjkzMDg5MDky.html)







4 changes: 4 additions & 0 deletions docs/jstorm-doc/Community/Issues.md
Expand Up @@ -7,3 +7,7 @@ top-nav-pos: 4

* This will be replaced by the TOC
{:toc}

https://github.com/alibaba/jstorm/issues


8 changes: 8 additions & 0 deletions docs/jstorm-doc/Community/JStormUsers.md
Expand Up @@ -7,3 +7,11 @@ top-nav-pos: 5

* This will be replaced by the TOC
{:toc}



# [users list](https://github.com/alibaba/jstorm/issues/81)

![jstorm-users]({{site.baseurl}}/img/community/jstorm-users.jpg)


10 changes: 0 additions & 10 deletions docs/jstorm-doc/Maintenance/BlobStore.md

This file was deleted.

18 changes: 18 additions & 0 deletions docs/jstorm-doc/Maintenance_cn/ClusterHA.md
Expand Up @@ -11,3 +11,21 @@ top-nav-title: 同城灾备&异地灾备

* This will be replaced by the TOC
{:toc}


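# Overview
The main goal is to keep applications serving traffic when an entire data center loses network connectivity or power.


# Multiple data centers in the same city

To be open-sourced

# Multiple remote data centers with full data

To be open-sourced

# Multiple remote data centers with sharded data

To be open-sourced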

62 changes: 62 additions & 0 deletions docs/jstorm-doc/Maintenance_cn/Configuration.md
Expand Up @@ -11,3 +11,65 @@ top-nav-title: 配置注解

* This will be replaced by the TOC
{:toc}

# Summary

This page does not list every setting. For the complete list, refer to [defaults.yaml](https://github.com/alibaba/jstorm/blob/master/jstorm-core/src/main/resources/defaults.yaml).

Only the frequently used settings are listed here.

```
storm.zookeeper.servers: ZooKeeper addresses.
storm.zookeeper.root: root directory of JStorm in ZooKeeper. Set this option when multiple JStorm clusters share one ZooKeeper ensemble. The default is "/jstorm".
nimbus.host: nimbus IP; this setting is used only by the $JSTORM_HOME/bin/start.sh script.
storm.local.dir: JStorm temporary directory for local binaries and configuration. Make sure the JStorm process has write permission on this directory.
java.library.path: installation directory of the zeromq and Java zeromq libraries. If you use other shared libraries, put them in this directory as well. The default is "/usr/local/lib:/opt/local/lib:/usr/lib".
supervisor.slots.ports: list of ports provided by the supervisor. Be careful not to conflict with other ports. The default is 68xx, whereas Storm uses 67xx.
topology.enable.classloader: false. The classloader is disabled by default. Enable this item if an application jar conflicts with one of the jars JStorm depends on, for example when the application depends on thrift9 but JStorm uses thrift7.
## Send messages in sync or async mode.
## If this setting is true, netty uses sync mode: the client can send the next batch of messages only after receiving the server's response to the previous one.
## In async mode the client can send messages without waiting for the server's response.
storm.messaging.netty.sync.mode: false
## When netty is in async mode and the client channel is unavailable (the server is down or the netty channel buffer is full),
## sending blocks until the channel is ready or the channel is closed.
storm.messaging.netty.async.block: true
# This setting has no effect when netty is in sync mode.
# If it is true and netty is in async mode, netty batches messages.
# If it is false and netty is in async mode, netty sends tuples one by one instead of batching them into one large message.
storm.messaging.netty.transfer.async.batch: true
### Default worker memory size, in bytes
worker.memory.size: 2147483648
# Metrics Monitor
# topology.performance.metrics: switch for performance metrics.
# When it is disabled, timer and histogram metrics data
# are not collected.
topology.performance.metrics: true
# topology.alimonitor.metrics.post: if disabled, metrics data
# is only printed to the log. If enabled, metrics data is also
# posted to alimonitor in addition to being printed to the log.
topology.alimonitor.metrics.post: false
# When the supervisor shuts down, automatically shut down its workers.
# If JStorm runs under another container such as hadoop-yarn,
# this must be set to true.
worker.stop.without.supervisor: false
# Sets how many tuples a spout can have pending at one time.
# For example, if this is set to 100,
# the spout cannot send the 101st tuple until it receives an ack for one of the pending tuples.
topology.max.spout.pending: null
```
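
The `worker.memory.size` value listed above is given in bytes, and 2147483648 corresponds to 2 GiB. As a small worked example, the shell sketch below derives such a value; the helper name `gib_to_bytes` is purely illustrative and is not part of JStorm.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of JStorm): convert a size in GiB to bytes,
# for settings such as worker.memory.size that expect a byte count.
gib_to_bytes() {
    local gib=$1
    echo $(( gib * 1024 * 1024 * 1024 ))
}
```

For instance, `gib_to_bytes 2` prints the default shown in the listing, and `gib_to_bytes 4` would give the value for a 4 GiB worker.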


57 changes: 57 additions & 0 deletions docs/jstorm-doc/Maintenance_cn/ConfigurationAutomacticSync.md
Expand Up @@ -12,3 +12,60 @@ top-nav-title: 自动同步配置文件

* This will be replaced by the TOC
{:toc}

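## Prerequisites

JStorm version 2.2.0 or later

## Background

Currently, changing a JStorm cluster's configuration (storm.yaml) means manually logging in to a jump host, copying storm.yaml, editing it, distributing it to the cluster with batch scp, and restarting the whole cluster. When several clusters must be changed at once, this is too inefficient and too costly.

Global configuration push was developed specifically for batch modification of cluster configuration.

## Implementation

### standalone
In jstorm-core, nimbus gains a ConfigUpdateHandler plugin that allows storm.yaml to be modified dynamically. The default DefaultConfigUpdateHandler does nothing, which is equivalent to having no plugin.

A DiamondConfigUpdateHandler is also provided (in the external/jstorm-configserver module). This plugin performs configuration updates through diamond. When it detects an update, it backs up the current storm.yaml (keeping at most three backups: storm.yaml.1, storm.yaml.2 and storm.yaml.3) and then writes the latest configuration to storm.yaml.

The supervisor runs a configuration update thread (SupervisorRefershConfig) that requests the latest configuration from nimbus roughly every half minute and compares it with the local configuration. If they differ, it backs up the local configuration and overwrites it. This completes the cluster-wide dynamic update.

Overall, configuration updates are pushed from diamond to nimbus, and supervisors then pull them from nimbus.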
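
The backup step described in this section (at most three backups named storm.yaml.1 to storm.yaml.3, then writing the new configuration) could be sketched in shell as follows. This is only an illustration: the real logic lives in the Java plugin, the function name `backup_and_write` is hypothetical, and the ordering (storm.yaml.1 as the newest backup) is an assumption that the text does not confirm.

```bash
#!/usr/bin/env bash
# Sketch only; JStorm implements this in Java (DiamondConfigUpdateHandler).
# Assumption: storm.yaml.1 is the newest backup and storm.yaml.3 the oldest.
backup_and_write() {
    local conf=$1 new_conf=$2
    # Shift existing backups up by one; the previous .3 is overwritten.
    if [ -f "$conf.2" ]; then mv -f "$conf.2" "$conf.3"; fi
    if [ -f "$conf.1" ]; then mv -f "$conf.1" "$conf.2"; fi
    # Back up the current file, then install the new configuration.
    if [ -f "$conf" ]; then cp -f "$conf" "$conf.1"; fi
    cp -f "$new_conf" "$conf"
}
```

Calling `backup_and_write /path/to/storm.yaml /path/to/new.yaml` repeatedly never leaves more than three numbered backups next to storm.yaml.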

### jstorm-on-yarn
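On yarn the situation is slightly different. Some configuration items of a yarn supervisor are generated dynamically, such as storm.local.dir and jstorm.log.dir. Overwriting the supervisor's configuration directly with nimbus's configuration would therefore cause problems.

For yarn, a new configuration item has therefore been added. Its default value is: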

```
jstorm.yarn.conf.blacklist: "jstorm.home,jstorm.log.dir,storm.local.dir,java.library.path"
```

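This item tells the supervisor not to overwrite its local values of these settings with the values from nimbus.

In the implementation, SupervisorRefershConfig saves these configuration items when it is initialized; they are called retainedYarnConfig.
Later comparisons with nimbus's configuration are made with these items filtered out. If the configurations differ, the supervisor takes **nimbus's configuration and appends retainedYarnConfig at the end** to generate a new configuration file, which then overwrites the local storm.yaml.

To tell whether it is running on yarn, a yarn supervisor must be started with the `-Djstorm-on-yarn=true` parameter.

Note that the yarn configuration blacklist currently supports only simple key-value settings. Values in complex formats such as lists or maps are not handled correctly.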
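
The merge described for the yarn case (nimbus's configuration with the blacklisted keys removed, followed by the locally retained values) can be illustrated with a short shell sketch. It assumes flat `key: value` lines, in line with the limitation noted in this section. The function names `filter_conf` and `merge_yarn_conf` are hypothetical, and reading the retained values from the local file on each merge is a simplification of the actual Java logic, which saves them at initialization.

```bash
#!/usr/bin/env bash
# Sketch only; JStorm implements this in Java (SupervisorRefershConfig).
# Assumes flat "key: value" lines, not list or map values.

# Print the lines of a config file whose key is in the comma-separated
# blacklist (mode "keep") or is not in it (mode "drop").
filter_conf() {
    local mode=$1 blacklist=$2 file=$3
    awk -v mode="$mode" -v bl="$blacklist" -F':' '
        BEGIN { n = split(bl, keys, ","); for (i = 1; i <= n; i++) black[keys[i]] = 1 }
        {
            key = $1
            gsub(/^[ \t]+|[ \t]+$/, "", key)
            listed = (key in black)
            if ((mode == "keep" && listed) || (mode == "drop" && !listed)) print
        }' "$file"
}

# Emit the new local config: nimbus settings minus blacklisted keys,
# followed by the locally retained (blacklisted) settings.
merge_yarn_conf() {
    local blacklist=$1 nimbus_conf=$2 local_conf=$3
    filter_conf drop "$blacklist" "$nimbus_conf"
    filter_conf keep "$blacklist" "$local_conf"
}
```

A call such as `merge_yarn_conf "jstorm.home,jstorm.log.dir,storm.local.dir,java.library.path" nimbus.yaml storm.yaml > storm.yaml.new` would produce a candidate file in which the supervisor's own storm.local.dir survives while all other settings follow nimbus.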


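## Operation

1. On the management platform, trigger a configuration update for a specific cluster or for all clusters.

2. The management platform first pulls the current configuration from nimbus, lets the user edit it online, and pushes it once editing is finished.

3. Each cluster's current configuration file is saved in diamond, with the dataId set to the cluster.name from the configuration file. To make sure every environment receives the configuration, koala pushes it to all diamond environments (center + units).

4. nimbus receives the configuration and compares it with its local copy; if they differ, it updates.

5. The supervisor pulls the configuration from nimbus, compares it with its local configuration (taking into account whether it runs on yarn), and then decides whether to update.


## TODO

Currently, when a new configuration is received, RefreshableComponent.refresh is called back to update the configuration dynamically. However, only a limited set of settings can be updated dynamically so far, mainly those related to metrics and logging. Version 2.2.1 needs to support automatic update of more configuration items.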
6 changes: 5 additions & 1 deletion docs/jstorm-doc/Maintenance_cn/DynamicAdjustLog.md
@@ -1,5 +1,5 @@
---
title: "Dynamic Adjust Log"
title: "动态调整日志"
layout: plain_cn

# Top-level navigation
Expand All @@ -11,3 +11,7 @@ top-nav-title: 动态调整日志

* This will be replaced by the TOC
{:toc}


TO BE OPENED

52 changes: 52 additions & 0 deletions docs/jstorm-doc/Maintenance_cn/HealthCheck.md
Expand Up @@ -11,3 +11,55 @@ top-nav-title: Supervisor自检

* This will be replaced by the TOC
{:toc}

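# Overview

JStorm supports health checks for the cluster: it periodically runs check scripts to obtain each machine's health status and then adjusts the cluster dynamically. In other words, based on a machine's health status, JStorm has the supervisor take action on its own and adjust its state accordingly. Supervisor machine health is currently classified into four states: panic, error, warn and info.

```
panic: all workers on the machine are killed first, then the supervisor kills itself, so the machine is removed from the cluster;
error: all workers on the machine are killed and its number of available ports is set to 0, so the machine no longer takes part in cluster scheduling;
warn: the number of available ports is set to 0, so the machine no longer takes part in cluster scheduling;
info: healthy; no action is taken
```
Note: this document currently applies to 2.x versions.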


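# Configuration
Machine health is determined by check scripts, which you can implement yourself according to your needs. Each health state corresponds to one script directory, and these directories are configurable.

```
panic script directory: absolute path, set with the parameter storm.machine.resource.panic.check.dir
error script directory: absolute path, set with the parameter storm.machine.resource.error.check.dir
warn script directory: absolute path, set with the parameter storm.machine.resource.warning.check.dir
```

There is no limit on the number of health check scripts in each directory. As soon as any script detects a machine anomaly, the supervisor picks up the corresponding state. For example, if a script in the panic directory detects an anomaly, the machine state seen by the supervisor is panic; likewise, if any script in the warn directory detects an anomaly, the state is warn.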
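
The directory-to-state rule above can be sketched in shell. This is a hedged illustration, not JStorm's implementation (which runs inside the supervisor, in Java). The function names are hypothetical, the most-severe-first evaluation order is an assumption, matching the marker as a substring is an assumption, and the sketch relies on the GNU coreutils `timeout` command being available.

```bash
#!/usr/bin/env bash
# Sketch only; names, ordering and matching rules are assumptions.
MARKER="check don't passed"

# Return 0 if any executable script in the directory prints the marker.
dir_reports_failure() {
    local dir=$1 timeout_secs=${2:-5} script
    [ -d "$dir" ] || return 1
    for script in "$dir"/*; do
        if [ ! -f "$script" ] || [ ! -x "$script" ]; then continue; fi
        # A script that times out or fails without printing the marker
        # counts as healthy.
        if timeout "$timeout_secs" "$script" 2>/dev/null | grep -qF "$MARKER"; then
            return 0
        fi
    done
    return 1
}

# Assumption: directories are evaluated from most to least severe.
machine_state() {
    local panic_dir=$1 error_dir=$2 warn_dir=$3
    if dir_reports_failure "$panic_dir"; then echo panic
    elif dir_reports_failure "$error_dir"; then echo error
    elif dir_reports_failure "$warn_dir"; then echo warn
    else echo info
    fi
}
```

With three directories passed in, `machine_state` prints one of the four state names; an empty or missing directory simply contributes no failure.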

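## A small requirement for scripts

Because of a restriction built into JStorm, the supervisor decides whether a machine is abnormal from the script's output. If the script outputs check don't passed, the script is considered to have detected an anomaly; in every other case the machine is considered healthy. For example, a CPU check script in the warn directory: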

```
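#!/usr/bin/env bash
# Maximum acceptable CPU utilization, in percent.
MAX_cpu=70
top_command=$(command -v top)
# Add the 2nd and 3rd fields of top's "Cpu(s)" summary line.
cpuInfo=$("$top_command" -b -n 1 | grep "Cpu(s)" | awk '{print $2+$3}')
# Drop the fractional part so the value can be compared as an integer.
Cpu=${cpuInfo%%.*}
if [ "${Cpu:-0}" -gt "$MAX_cpu" ]; then
    echo "check don't passed"
fi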
```
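If the machine's CPU utilization is above 70%, the script outputs check don't passed and the machine is judged to be in the warn state. Any other output, a script error, or a script timeout all result in the supervisor being judged to be in the info state.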
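
Other checks can follow the same contract. As a minimal sketch, a disk-usage check might look like the following. The function names, the choice of mount point and any threshold are illustrative assumptions rather than JStorm defaults; only the output string `check don't passed` comes from the text above.

```bash
#!/usr/bin/env bash
# Illustrative check script; names and thresholds are assumptions.

# Print the used-space percentage (integer, without "%") of a mount point.
disk_usage_percent() {
    df -P "$1" | awk 'NR==2 { sub(/%/, "", $5); print $5 }'
}

# Print the marker string the supervisor looks for when value exceeds limit.
report_if_over() {
    local value=$1 limit=$2
    if [ "$value" -gt "$limit" ]; then
        echo "check don't passed"
    fi
}
```

A script placed in one of the check directories could then end with a line such as `report_if_over "$(disk_usage_percent /)" 90`, printing the marker only when the root file system is more than 90% full.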

## **Other configuration parameters**

```
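supervisor.enable.check: health check switch, disabled by default; this is a supervisor-level switch;
supervisor.frequency.check.secs: interval at which the health check scripts run, 60s by default;
storm.health.check.timeout.ms: script execution timeout, 5s by default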
```
