
Commit

flume+hbase integration testing completed
supermy committed Apr 1, 2015
1 parent 031c830 commit 16667f5
Showing 20 changed files with 208 additions and 32 deletions.
17 changes: 9 additions & 8 deletions common/mycloud/README.md
@@ -45,21 +45,22 @@ Pig and Hive also provide high-level language support for HBase, which makes statistical data processing on HBase very simple
*
* docker run -v /usr/local/bin:/target jpetazzo/nsenter:latest
*
* Initialize the environment: docker-enter cid to enter the hregionserver container; this prepares the hive-hbase environment and creates the log table: cd /home/jamesmo/ && start pre-start-hive.sh
*
* Starting flume-ng: HBase starts slowly, so by the time the flume-ng_hbase image has finished starting, HBase is not yet ready. Watch fig logs until initdb completes, then run fig restart flume1; the startup log should then be normal.
* flume-ng_hbase creates its table automatically; an existing table with the same name is overwritten. Note that the /hbase directory must allow all users (777).
*
* Produce data
* telnet 192.168.59.103 44448
*
* View data - hive (hregionserver-node)
*
* sh /home/jamesmo/start-hive.sh && select * from hive_hbase_log
*
* View data - hbase (hregionserver-node)
*
* echo 'scan "hive_hbase_log"' | hbase shell
*
*
TODO joint testing pending; data is not yet written to the store
> ## flume+kafka example
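Condensed from the notes above, the end-to-end check can be written as a short runbook (a sketch, not a definitive procedure: service names come from this repo's fig.yml, and the host IP assumes the boot2docker address 192.168.59.103 used throughout this README):

```shell
# Bring the stack up; HBase is not ready when flume1 first starts,
# so wait for initdb to finish and then restart flume1.
fig up -d
fig logs initdb          # watch until initialization completes
fig restart flume1

# Produce a few test events against the flume netcat source.
telnet 192.168.59.103 44448

# Verify the events landed, via hive and via the hbase shell.
sh /home/jamesmo/start-hive.sh   # then: select * from hive_hbase_log;
echo 'scan "hive_hbase_log"' | hbase shell
```

These commands only make sense against the running fig stack, so they are meant to be pasted step by step, not run as a script.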
54 changes: 54 additions & 0 deletions common/mycloud/bind_addr.sh
@@ -0,0 +1,54 @@
#!/usr/bin/env bash
# filename: bind_addr.sh
# By default docker uses 'bridge' networking for containers (it picks an
# unused IP from the docker0 subnet). Here we use 'none' so that we can
# configure the container's network by hand.

if [ `id -u` -ne 0 ]; then
    echo 'This script must be run as root'
    exit 1
fi

if [ $# != 2 ]; then
    echo "Usage: $0 <container-name> <IP>"
    exit 1
fi

container_name=$1
bind_ip=$2

container_id=`docker inspect -f '{{.Id}}' $container_name 2> /dev/null`
if [ ! "$container_id" ]; then
    echo "Container does not exist"
    exit 2
fi

bind_ip=`echo $bind_ip | egrep '^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$'`
if [ ! "$bind_ip" ]; then
    echo "Invalid IP address format"
    exit 3
fi

container_minid=`echo $container_id | cut -c 1-10`
container_netmask=`ip addr show docker0 | grep "inet\b" | awk '{print $2}' | cut -d / -f2`
container_gw=`ip addr show docker0 | grep "inet\b" | awk '{print $2}' | cut -d / -f1`

bridge_name="veth_$container_minid"
container_ip=$bind_ip/$container_netmask
pid=`docker inspect -f '{{.State.Pid}}' $container_name 2> /dev/null`
if [ ! "$pid" ]; then
    echo "Failed to get the PID of container $container_name"
    exit 4
fi

if [ ! -d /var/run/netns ]; then
    mkdir -p /var/run/netns
fi

ln -sf /proc/$pid/ns/net /var/run/netns/$pid

# Create a veth pair, attach one end to docker0, then move the other end
# into the container's network namespace and configure it as eth0.
ip link add $bridge_name type veth peer name X
brctl addif docker0 $bridge_name
ip link set $bridge_name up
ip link set X netns $pid
ip netns exec $pid ip link set dev X name eth0
ip netns exec $pid ip link set eth0 up
ip netns exec $pid ip addr add $container_ip dev eth0
ip netns exec $pid ip route add default via $container_gw
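The script's address handling can be exercised on its own. A minimal sketch, using a hypothetical `ip addr show docker0` output line (real values depend on the host's docker0 configuration):

```shell
# Hypothetical sample of the docker0 "inet" line the script parses.
sample_inet="inet 172.17.42.1/16 brd 172.17.255.255 scope global docker0"

# Same IPv4 regex the script applies to its second argument,
# wrapped in a helper that returns an exit status.
valid_ip() {
  echo "$1" | egrep -q '^(([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])\.){3}([0-9]|[1-9][0-9]|1[0-9]{2}|2[0-4][0-9]|25[0-5])$'
}

# Same awk/cut pipeline the script uses to split the bridge CIDR
# into a gateway address and a prefix length.
gw=$(echo "$sample_inet" | awk '{print $2}' | cut -d / -f1)
mask=$(echo "$sample_inet" | awk '{print $2}' | cut -d / -f2)

valid_ip 172.17.0.50 && echo "ip ok"
valid_ip 999.1.1.1 || echo "ip rejected"
echo "gateway=$gw netmask=/$mask"
```

Note the regex anchors each of the four octets to 0-255, so values like 999 or 300 are rejected before any network commands run.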
33 changes: 32 additions & 1 deletion common/mycloud/fig.yml
@@ -100,6 +100,36 @@ rs:


#collect data and forward it to kafka
#flume1:
#  image: myflume_base:latest
#  environment:
#    FLUME_AGENT_NAME: a1
#    FLUME_CONF_DIR: /opt/flume/conf
#    FLUME_CONF_FILE: /var/tmp/flume.conf.hbase
#  ports:
#    - "44448:44444"
#  links:
#    - zk:zk1


initdb:
  image: inithivehbase_base:latest
  links:
    - nn:mynn
    - dn1:mydn1
    - dn2:mydn2
    - zk:zookeeper2
    - hb:hbasemasteripc
  environment:
    HBASEMASTERIPC_SERVICE_HOST: hbasemasteripc
    HBASEMASTERIPC_SERVICE_PORT: 60010
    HBASEMASTERIPC_SERVICE_PORT_SERVICE_PORT: 60000
    HDFSNAMENODERPC_SERVICE_HOST: mynn
    HDFSNAMENODERPC_SERVICE_PORT: 8020
    ZOOKEEPERCLIENT_SERVICE_HOST: zookeeper2
    ZOOKEEPERCLIENT_SERVICE_PORT: 2181
  hostname: hregionserver1

flume1:
  image: myflume_base:latest
  environment:
@@ -109,5 +139,6 @@
  ports:
    - "44448:44444"
  links:
    - hb:hbasemasteripc
    - zk:zookeeper1
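As a quick sanity check of the link aliases above (a sketch under assumptions: `fig run` is available, and the flume image provides `getent`; alias names are taken from this fig.yml):

```shell
# Each link alias should resolve to a container IP inside flume1.
fig run flume1 getent hosts hbasemasteripc zookeeper1
```

If an alias does not resolve, the corresponding service was not up when flume1 was (re)started, which matches the restart note in the README.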

15 changes: 15 additions & 0 deletions common/mycloud/flume/fig.yml
@@ -0,0 +1,15 @@
#hive integrated with hbase
#hive table creation


#collect data and forward it to kafka
flume1:
  image: myflume_base:latest
  environment:
    FLUME_AGENT_NAME: a1
    FLUME_CONF_DIR: /opt/flume/conf
    FLUME_CONF_FILE: /var/tmp/flume.conf.hbase
  ports:
    - "44449:44444"
  hostname: flume1

8 changes: 7 additions & 1 deletion common/myflume/Dockerfile
@@ -9,4 +9,10 @@ RUN ls -hl /opt/flume/lib

RUN ls -hl /var/tmp/


RUN cat /etc/hosts

RUN echo "192.168.59.103 hbasemasteripc" >> /etc/hosts


EXPOSE 44444
6 changes: 3 additions & 3 deletions common/myflume/conf/flume.conf.hbase
@@ -21,8 +21,8 @@ a1.sources.r1.port = 44444

a1.sinks.k1.type = hbase

#***docker link-name zookeeper1 192.168.59.103
a1.sinks.k1.zookeeperQuorum = zookeeper1:2181
a1.sinks.k1.znodeParent=/hbase

a1.sinks.k1.table = hive_hbase_log
@@ -41,7 +41,7 @@ a1.sinks.k2.type = logger
#a1.channels = c1
#a1.sinks = k1
#a1.sinks.k1.type = asynchbase
#a1.sinks.k1.table = hive_hbase_log1
#a1.sinks.k1.columnFamily = log
#a1.sinks.k1.serializer = org.apache.flume.sink.hbase.SimpleAsyncHbaseEventSerializer
#a1.sinks.k1.channel = c1
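One way to confirm the sink is actually writing (a sketch; the table name comes from flume.conf.hbase above, and the commands assume a node with the hbase client installed and the quorum reachable):

```shell
echo "list" | hbase shell                   # hive_hbase_log should be listed
echo "scan 'hive_hbase_log'" | hbase shell  # delivered events appear as rows
```

An empty scan after sending telnet traffic usually points back at the startup-order issue described in the README (flume1 started before HBase was ready).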
Binary file added common/myflume/lib/hadoop-auth-2.5.0-cdh5.3.0.jar
Binary file added common/myflume/lib/htrace-core-2.04.jar
21 changes: 21 additions & 0 deletions mymongodb/README.md
@@ -29,6 +29,9 @@ The MongoDB server runs on Linux, Windows, or OS X, with 32-bit and 64-bit support

MongoDB stores data in files (default path: /data/db); for efficiency it manages them as memory-mapped files.

Load-testing tool
https://github.com/brianfrankcooper/YCSB/tree/master/mongodb


### Common scenario 1: real-time data collection and processing (scripts use the 3.0 engine)

@@ -63,4 +66,22 @@ MongoDB stores data in files (default path: /data/db); for efficiency it uses memory-mapped files
> show collections
> db.events.find()
>
Load testing - running the example
---------------------
### Build the image
> cd into the current directory
> ## fig build
### Run
> cd into the current directory
> ## fig up -d && fig ps
### Watch the logs
> Initialize the data
>
> sh initdb.sh  # **** initialization must complete first; otherwise mongsink cannot create the database automatically.
>
> Download and run the code at: https://github.com/supermy/gs-accessing-data-mongodb
> Monitor mongodb's runtime status: mongostat -h 192.168.59.103 -p 27017
> Check server status: mongo 192.168.59.103:27017 --eval 'printjson(db.serverStatus())'
> ## END
24 changes: 12 additions & 12 deletions mymongodb/fig.yml
@@ -56,7 +56,7 @@
##--noprealloc remove this option when deploying to production
rs11:
  image: base_mongo
  command: mongod --storageEngine=wiredTiger --smallfiles --replSet rs1 --rest --httpinterface
  ports:
    - "27018:27017"
  links:
@@ -69,7 +69,7 @@ rs11:

rs12:
  image: base_mongo
  command: mongod --storageEngine=wiredTiger --smallfiles --replSet rs1 --rest --httpinterface
  ports:
    - "27017"
  links:
@@ -82,7 +82,7 @@ rs12:

rs13:
  image: base_mongo
  command: mongod --storageEngine=wiredTiger --smallfiles --replSet rs1 --rest --httpinterface
  ports:
    - "27017"
  links:
@@ -96,7 +96,7 @@ rs13:
###replica set 2 of the cluster
rs21:
  image: base_mongo
  command: mongod --storageEngine=wiredTiger --smallfiles --replSet rs2 --rest --httpinterface
  ports:
    - "27019:27017"
  links:
@@ -109,7 +109,7 @@ rs21:

rs22:
  image: base_mongo
  command: mongod --storageEngine=wiredTiger --smallfiles --replSet rs2 --rest --httpinterface
  ports:
    - "27017"
  links:
@@ -122,7 +122,7 @@ rs22:

rs23:
  image: base_mongo
  command: mongod --storageEngine=wiredTiger --smallfiles --replSet rs2 --rest --httpinterface
  ports:
    - "27017"
  links:
@@ -134,13 +134,13 @@ rs23:
# - data23

###config server cluster
##docker run --name cfg1 -P -d mydev/mongodb --smallfiles --configsvr --dbpath /data/db --port 27017
##docker run --name cfg2 -P -d mydev/mongodb --smallfiles --configsvr --dbpath /data/db --port 27017
##docker run --name cfg3 -P -d mydev/mongodb --smallfiles --configsvr --dbpath /data/db --port 27017
cfg1:
  image: base_mongo
  hostname: cfg1
  command: mongod --storageEngine=wiredTiger --smallfiles --configsvr --dbpath /data/db --port 27017
  ports:
    - "27017"
#  volumes_from:
@@ -149,7 +149,7 @@ cfg1:
cfg2:
  image: base_mongo
  hostname: cfg2
  command: mongod --storageEngine=wiredTiger --smallfiles --configsvr --dbpath /data/db --port 27017
  ports:
    - "27017"
#  volumes_from:
@@ -158,7 +158,7 @@ cfg2:
cfg3:
  image: base_mongo
  hostname: cfg3
  command: mongod --storageEngine=wiredTiger --smallfiles --configsvr --dbpath /data/db --port 27017
  ports:
    - "27017"
#  volumes_from:
2 changes: 1 addition & 1 deletion mymongodb/initdbi-1.js
@@ -1,3 +1,3 @@
printjson(1);
config={_id: 'rs1', members:[{_id: 0,host:'172.17.0.32:27017'},{_id:1,host:'172.17.0.30:27017'},{_id:2,host:'172.17.0.31:27017'}]}
rs.initiate(config);
2 changes: 1 addition & 1 deletion mymongodb/initdbi-1.jsbak
@@ -1,3 +1,3 @@
printjson(1);
config={_id: 'rs1', members:[{_id: 0,host:'172.17.0.32:27017'},{_id:1,host:'172.17.0.30:27017'},{_id:2,host:'rs13:27017'}]}
rs.initiate(config);
2 changes: 1 addition & 1 deletion mymongodb/initdbi-2.js
@@ -1,3 +1,3 @@
printjson(1);
config={_id: 'rs2', members:[{_id: 0,host:'172.17.0.27:27017'},{_id:1,host:'172.17.0.28:27017'},{_id:2,host:'172.17.0.29:27017'}]}
rs.initiate(config);
2 changes: 1 addition & 1 deletion mymongodb/initdbi-2.jsbak
@@ -1,3 +1,3 @@
printjson(1);
config={_id: 'rs1', members:[{_id: 0,host:'172.17.0.27:27017'},{_id:1,host:'172.17.0.28:27017'},{_id:2,host:'172.17.0.29:27017'}]}
rs.initiate(config);
4 changes: 2 additions & 2 deletions mymongodb/initdbi.js
@@ -1,6 +1,6 @@
printjson(1);
sh.addShard("rs1/172.17.0.32:27017");
sh.addShard("rs2/172.17.0.27:27017");
sh.status();

db.runCommand( { listshards : 1 } );
2 changes: 1 addition & 1 deletion mymongodb/initdbi.jsbak
@@ -1,5 +1,5 @@
printjson(1);
sh.addShard("rs1/172.17.0.32:27017");
sh.addShard("rs2/rs21:27017");
sh.status();

15 changes: 15 additions & 0 deletions mymongodb/test-performance.js
@@ -0,0 +1,15 @@
db = db.getSiblingDB("test");
for (var i = 1; i <= 1000; i++) db.c1.save({id: i, value1: "你好"});
db.c1.stats();

// insert five million documents
for (var i = 0; i < 5000000; i++) {
    db.test1.insert({name: "mongodb_test" + i, seq: i});
}
db.test1.stats();

// insert ten million documents
for (var i = 0; i < 10000000; i++) {
    db.test2.insert({name: "mongodb_test" + i, seq: i});
}
db.test2.stats();
32 changes: 32 additions & 0 deletions mymongodb/test-performance.py
@@ -0,0 +1,32 @@
import time, pymongo, multiprocessing, random, string

class SqlToMongo:
    def m_sql(self, x, y):
        # Each task queries x random user_ids from its own [start, end) slice.
        server = "mongodb://python:[email protected]:27017/syslog"
        conn = pymongo.Connection(server)
        db = conn.syslog
        col = db.thing
        start = x * y
        end = start + x
        for i in xrange(start, end):
            d = random.randint(start, end)
            val = col.find({"user_id": d})
            a = list(val)

def gen_load(x, taskid):
    task = SqlToMongo()
    print "task %s start!" % taskid
    task.m_sql(x, taskid)

if __name__ == "__main__":
    insert_number = 2500000
    pro_pool = multiprocessing.Pool(processes=101)
    print time.strftime('%Y-%m-%d:%H-%M-%S', time.localtime(time.time()))
    start_time = time.time()
    manager = multiprocessing.Manager()
    for i in xrange(40):
        taskid = i
        pro_pool.apply_async(gen_load, args=(insert_number, taskid))
    pro_pool.close()
    pro_pool.join()
    elapsed = time.time() - start_time
    print elapsed
    time.sleep(1)
    print "Sub-process(es) done."
1 change: 1 addition & 0 deletions mymongodb/test-performance.sh
@@ -0,0 +1 @@
time mongo 192.168.59.103:27017/test --quiet test-performance.js
