Vipasyana service built on Apache Storm

To use this system, you must install:

Apache ZooKeeper (used for coordination between the nodes of an Apache Storm cluster)
Apache Storm 2.4.0
OpenJDK 8

You can follow the official installation instructions
(the installation commands are also given below).

You can now use Apache Storm to create your own topology.
To run the Vipasyana service topology, some additional configuration is needed:

Install Miniconda on all nodes
Create a virtual environment for each AI inference service (e.g. hfonly, mionly, vfonly, tf3to12)
Install the corresponding dependencies in each virtual environment
Move xxxStart.sh to /usr/xxxStart.sh and make it executable

Install Zookeeper

Go to the Apache ZooKeeper website to get the download URL,
for example https://dlcdn.apache.org/zookeeper/zookeeper-3.8.2/apache-zookeeper-3.8.2-bin.tar.gz
You can download the file with the command below:

wget https://dlcdn.apache.org/zookeeper/zookeeper-3.8.2/apache-zookeeper-3.8.2-bin.tar.gz

Extract the archive to get ZooKeeper:

tar -xvf apache-zookeeper-3.8.2-bin.tar.gz

Edit the ZooKeeper config file to suit your own setup:

vim /THE/PATH/TO/ZOOKEEPER/conf/zoo.cfg

Suggested config

# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/home/wnlab/zk/zkDataDir
dataLogDir=/home/wnlab/zk/zkLogDir
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
#maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# http://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=20
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=24

## Metrics Providers
#
# https://prometheus.io Metrics Exporter
#metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
#metricsProvider.httpPort=7000
#metricsProvider.exportJvmInfo=true

You can now follow the section "How to Start Up Storm Cluster" to start ZooKeeper.
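
Before starting ZooKeeper, it can help to sanity-check zoo.cfg. The sketch below is only an illustration: parse_zoo_cfg and check_zoo_cfg are hypothetical helper names, not part of ZooKeeper or of this repository, and the checks cover only the settings shown in the suggested config above.

```python
def parse_zoo_cfg(text: str) -> dict:
    """Read key=value lines from zoo.cfg text, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no "="
            settings[key.strip()] = value.strip()
    return settings


def check_zoo_cfg(settings: dict) -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    for key in ("tickTime", "dataDir", "clientPort"):
        if key not in settings:
            problems.append(f"missing {key}")
    # The sample config warns against keeping data under /tmp.
    if settings.get("dataDir", "").startswith("/tmp"):
        problems.append("dataDir is under /tmp")
    return problems
```

To use it, read the file with open("/THE/PATH/TO/ZOOKEEPER/conf/zoo.cfg").read() and pass the text to parse_zoo_cfg.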

Check system Java version

Note: Apache Storm 2.4.0 works fine with openjdk-8-jdk

Check if openjdk-8-jdk is installed

sudo apt list openjdk-8-jdk

If you haven't installed the jdk, just install it

sudo apt install openjdk-8-jdk

List the available Java versions and select openjdk-8-jdk

sudo update-alternatives --config java

Check if the version of java is correct

java -version

After that, you should see output similar to the following

openjdk version "1.8.0_352"
OpenJDK Runtime Environment (build 1.8.0_352-8u352-ga-1~22.04-b08)
OpenJDK 64-Bit Server VM (build 25.352-b08, mixed mode)
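
If you want to check the version from a script instead of reading it by eye, the first line of that output can be parsed. This is a minimal sketch; java_major is a hypothetical helper name, and it assumes the usual version "..." format shown above.

```python
import re


def java_major(version_output: str) -> int:
    """Extract the major Java version from `java -version` output.

    Old-style strings such as "1.8.0_352" mean Java 8; newer ones such as
    "11.0.2" or "17" start with the major version directly.
    """
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    first = int(match.group(1))
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first
```

Note that java -version prints to stderr, so when capturing it with subprocess.run, read the stderr field rather than stdout.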

Storm Config File

The config file is at
$STORM_INSTALLED_DIR/apache-storm-2.4.0/conf/storm.yaml

storm.local.dir: "/home/ytc/storm/local"
storm.zookeeper.servers:
  - "localhost"
#     - "server2"
# 
nimbus.seeds: ["192.168.2.132"]
# 
supervisor.slots.ports:
  - 6700
  - 6701
  - 6702
  - 6703
# 
ui.port: 6750

# Set worker heap size
worker.childopts: "-Xmx2g"

How to Start Up Storm Cluster

Start the zookeeper service

zkServer.sh start

Start nimbus (on the nimbus host)

storm nimbus

Start supervisor (on every worker host)

storm supervisor

Start storm UI

storm ui
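
To confirm that the daemons came up, you can test whether their ports accept connections. The sketch below is an assumption-laden example, not part of this repository: is_listening and cluster_status are hypothetical names, 2181 and 6750 are taken from the config files above, and 6627 is Storm's default nimbus.thrift.port (adjust it if you changed that setting).

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 2181: clientPort in zoo.cfg, 6627: default nimbus.thrift.port (assumed unchanged),
# 6750: ui.port in storm.yaml
SERVICES = {"zookeeper": 2181, "nimbus": 6627, "storm-ui": 6750}


def cluster_status(host: str = "localhost") -> dict:
    """Map each service name to whether its port is accepting connections."""
    return {name: is_listening(host, port) for name, port in SERVICES.items()}
```

Calling cluster_status("192.168.2.132") would check the nimbus host used in the example storm.yaml. An open port only shows that something is listening there, not that the daemon is healthy.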

How to Build and Submit Topology

Note: this is only a brief tutorial.

See the Makefile for how to compile and package Bolts written in Java and Python.
See experiment.sh for how to submit the topology to the Apache Storm nimbus.

Compile YYY.java with the required classpath (the jar files and directories it depends on, joined by ":")

javac -cp xxxx.jar:yyy.jar:. YYY.java

Build the jar file to be submitted to nimbus:

jar cvf test.jar storm-client-2.4.0.jar ExclamationBolt.class WordSpout.class ExclamationTopology.class

If the topology uses any external library, put it in apache-storm-version/extlib (this must be done on ALL hosts)

cp kkk.jar $HOME/apache-storm-2.4.0/extlib/

Submit the topology to the cluster (run from the Apache Storm bin directory)

storm jar ZZZ.jar TopologyClass

Tutorial: About ShellBolt Working directory

Note that with Apache Storm's ShellBolt, the working directory of the Python script is
the resources directory inside the .jar file you submit.
For example, look at the transfer3to12 service (TfEcgBolt.java):

super("/usr/tf3to12Start.sh", "./tf3to12/tfEcgServerBolt.py");

This executes tfEcgServerBolt.py at the path ./resources/./tf3to12/tfEcgServerBolt.py

So every Python file should assume that ./resources is its working directory.
You can see this in resources/tf3to12/tfEcgServerBolt.py:
args_parameter() => ('--modelPath',type=str,default='./tf3to12/step99500.ckpt.meta')

It loads the model from ./tf3to12/XXX.ckpt.meta rather than ./XXX.ckpt.meta, even though the script and the model are in the same directory.
If you load ./XXX.ckpt.meta directly, you will get an error saying that the file cannot be found.

NOTE: When importing other Python files, however, you can import them directly because they are in the same directory as the script
(see resources/tf3to12/tfEcgServerBolt.py):

from torch_model import Model3to12
from dbOperation import MongoDB

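This looks odd, but it is expected: Python resolves imports relative to the script's own directory, while relative file paths are resolved against the working directory.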
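
One way to avoid depending on the working directory is to build file paths from the script's own location. This is a minimal sketch under that assumption; resource_path is a hypothetical helper and is not part of the Vipasyana code.

```python
import os


def resource_path(script_file: str, name: str) -> str:
    """Return the absolute path of `name` located next to `script_file`."""
    script_dir = os.path.dirname(os.path.abspath(script_file))
    return os.path.join(script_dir, name)


# Inside tfEcgServerBolt.py this would typically be called as:
#   model_path = resource_path(__file__, "step99500.ckpt.meta")
# so the model is found no matter which directory ShellBolt starts the script in.
```

With this, the default for --modelPath would no longer need the ./tf3to12/ prefix, at the cost of changing the existing scripts.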

ShellBolt python version (Pyenv) (Deprecated)

This approach is deprecated,
but understanding it helps explain the Miniconda setup that the system uses now.

Note that a ShellBolt declared like this presumably executes /usr/bin/python (a guess),
so switching Python versions with pyenv has no effect on it:

super("python", "myExecutor.py")

How was this found?
Experiment:

Switch the Python version (with pyenv) to 3.9.16, which has no Python libraries installed.
All Python processes still work (no "No module named ..." error occurs).

Next, change the super call as shown below.
Now pyenv global {version}
actually changes the Python version that the bolt runs.

super("/home/ytc/.pyenv/shims/python", "myExecutor.py")

(Now the "No module named ..." error occurs.)

Note that neither of the calls below works:

super("$HOME/.pyenv/shims/python", "myExecutor.py")
super("~/.pyenv/shims/python", "myExecutor.py")

So we write a shell script that executes ~/.pyenv/shims/python.
We can then run a different Python by modifying the content of this shell script.
The content of the file is currently:

~/.pyenv/shims/python $1

Finally, put it at /usr/pystart.sh (remember to chmod +x the file).
The actual call in ShellBolt is now:

super("/usr/pystart.sh", "myExecutor.py")

ShellBolt python version (Miniconda) (In use)

Here too we need a shell script to run Python as a specific user,
because a path such as $HOME/... or ~/... is not valid in the super call.

super("/usr/hfonlyStart.sh", "hfInfServerBolt.py");

The content of such a file looks like the line below.
When Python is run from miniconda3/envs/MYENV/bin/python,
it uses the packages and dependencies installed in that environment.

~/miniconda3/envs/hfonly/bin/python $1
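
Why do $HOME/... and ~/... fail in the super call? A likely explanation (an assumption, not checked against the Storm source here) is that ShellBolt starts the command directly rather than through a shell, so nothing expands ~ or $HOME. The Python sketch below shows the same effect with subprocess; expand_command and can_launch are hypothetical names used only for illustration.

```python
import os
import subprocess


def expand_command(path: str) -> str:
    """Expand ~ and $VARS the way a shell would, returning a plain path."""
    return os.path.expandvars(os.path.expanduser(path))


def can_launch(argv: list) -> bool:
    """Try to start argv directly (no shell) and report whether it could be started."""
    try:
        subprocess.run(argv, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL, check=False)
        return True
    except OSError:
        # Without a shell, "~/..." is looked up literally and is not found.
        return False
```

Inside the wrapper script the shell does perform the expansion, which is why ~/miniconda3/... works there.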

Resource

Slide: https://docs.google.com/presentation/d/1j0hpGSWLX0SCssd0z7tjVW5ohR_3aJ12VJipMIcS5_8/edit#slide=id.g27e31dba315_0_0
Paper: https://drive.google.com/drive/folders/1pBse0Xx0vMjRYdTMQSpXaplGuhp5b2We
