Update readme.md
WeibinMeng authored Apr 30, 2020
1 parent 6581204 commit 861517e
Showing 1 changed file with 9 additions and 9 deletions.
18 changes: 9 additions & 9 deletions readme.md
@@ -62,15 +62,15 @@ File Descriptions

```sh
#Filter variables in the logs
-python preprocessing.py -rawlog ./data/BGL.log
+python code/preprocessing.py -rawlog ./data/BGL.log

-rawlog:raw logs
```

### Antonyms&Synonyms Extraction
```sh
#Extract antonyms and synonyms
-python get_syn_ant.py -logs ./data/BGL_without_variables.log -ant_file ./middle/ants.txt -syn_file ./middle/syns.txt
+python code/get_syn_ant.py -logs ./data/BGL_without_variables.log -ant_file ./middle/ants.txt -syn_file ./middle/syns.txt

-logs: logs
-ant_file: antonyms
@@ -80,28 +80,28 @@ python get_syn_ant.py -logs ./data/BGL_without_variables.log -ant_file ./middle/
### Relation Triple Extraction

```sh
-python get_triplet.py data/BGL_without_variables.log middle/bgl_triplet.txt
+python code/get_triplet.py data/BGL_without_variables.log middle/bgl_triplet.txt

data/BGL_without_variables.log: logs
middle/bgl_triplet.txt: triples
```

```sh
#If -s is added, temporary saving is enabled. By default, results are saved every 10000 entries to a file named "temp_" + output_file
-python get_triplet.py input_file output_file -s
+python code/get_triplet.py input_file output_file -s
```

```sh
#If a number is given after -s, it sets how many entries are processed between saves
-python get_triplet.py input_file output_file -s 50000
+python code/get_triplet.py input_file output_file -s 50000
```
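
The `-s` option described above amounts to periodic checkpointing. The following is a minimal sketch under stated assumptions, not the code in `code/get_triplet.py`: `extract_triple` is a hypothetical placeholder for the real extractor, and `process_with_checkpoints` is an illustrative name. Only the default interval of 10000 and the `"temp_"` prefix come from the readme; applying the prefix to the file name rather than to the whole path is an assumption.

```python
import os


def extract_triple(line):
    """Hypothetical placeholder: the real extraction logic is not shown in the readme."""
    return line.strip()


def _write(path, rows):
    """Overwrite `path` with one result per line."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(rows) + "\n")


def process_with_checkpoints(lines, output_file, save_every=10000):
    """Process `lines`, rewriting a temporary file every `save_every` results."""
    directory, name = os.path.split(output_file)
    # Assumption: the "temp_" prefix is applied to the file name, not the full path.
    temp_file = os.path.join(directory, "temp_" + name)
    results = []
    for count, line in enumerate(lines, start=1):
        results.append(extract_triple(line))
        if count % save_every == 0:
            _write(temp_file, results)  # checkpoint of everything processed so far
    _write(output_file, results)  # final, complete output
    return temp_file
```

With a checkpoint file like this, an interrupted run loses at most `save_every` entries of work; a larger interval such as 50000 trades fewer disk writes for more potential rework.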


### Semantic Word Embedding

```shell
#Convert log file to single line for training
-python getTempLogs.py -input data/BGL_without_variables.log -output middle/BGL_without_variables_for_training.log
+python code/getTempLogs.py -input data/BGL_without_variables.log -output middle/BGL_without_variables_for_training.log
```

```shell
@@ -118,15 +118,15 @@ make #make before you run

```shell
#Read the original vector file
-python mimick/make_dataset.py --vectors middle/bgl_words.model --w2v-format --output middle/bgl_words.pkl
+python code/mimick/make_dataset.py --vectors middle/bgl_words.model --w2v-format --output middle/bgl_words.pkl

--vectors: output of w2v. The first row gives the number of rows and the dimension (it can be omitted); each subsequent row is a word followed by its vector: word d1 d2 ... d32
```
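
The vector file layout described for `--vectors` can be illustrated with a small reader. This is a sketch only, not what `make_dataset.py` does internally: `read_w2v_text` is a hypothetical helper, and it treats a first row made of exactly two integers as the optional header, which would misread a genuine row whose word is a number and whose vector has one dimension.

```python
def read_w2v_text(lines):
    """Parse word2vec-style text rows into a {word: vector} dict.

    An optional first row of the form "<rows> <dimensions>" is skipped.
    """
    rows = [line.split() for line in lines if line.strip()]
    # Heuristic (assumption): two integer tokens on the first row mean a header.
    if rows and len(rows[0]) == 2 and all(tok.isdigit() for tok in rows[0]):
        rows = rows[1:]
    vectors = {}
    for tokens in rows:
        word, values = tokens[0], tokens[1:]
        vectors[word] = [float(v) for v in values]
    return vectors


# Tiny made-up sample with 3 dimensions instead of 32, for illustration only.
sample = ["2 3", "error 0.1 0.2 0.3", "failed -0.4 0.5 0.6"]
print(read_w2v_text(sample))
```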


```shell
#Train the new embedding according to oov
-python mimick/model.py --dataset middle/bgl_words.pkl --vocab middle/testvocab.txt --output middle/oov.vector
+python code/mimick/model.py --dataset middle/bgl_words.pkl --vocab middle/testvocab.txt --output middle/oov.vector

--dataset:Output of the first step
--vocab:New words, you can write multiple words in batches, one word per line
@@ -135,7 +135,7 @@ python mimick/model.py --dataset middle/bgl_words.pkl --vocab middle/testvocab.

### Generate vector for logs
```shell
-python Log2Vec.py -logs ./data/BGL_without_variables.log -word_model ./middle/bgl_words.model -log_vector_file ./middle/bgl_log.vector -dimension 32
+python code/Log2Vec.py -logs ./data/BGL_without_variables.log -word_model ./middle/bgl_words.model -log_vector_file ./middle/bgl_log.vector -dimension 32
```

