
An Extractive-and-Abstractive Framework for Source Code Summarization

wssun/EACS

EACS

Title: An Extractive-and-Abstractive Framework for Source Code Summarization

Requirements

The dependencies can be installed using the following command:

pip install -r requirements.txt

Original Dataset

The original CodeSearchNet dataset can be downloaded from the GitHub repo https://github.com/github/CodeSearchNet, and the cleaned dataset (CodeXGLUE) can be downloaded from https://drive.google.com/open?id=1rd2Tc6oUWBo7JouwexW3ksQ0PaOhUr6h

The JCSD and PCSD datasets can be downloaded from the GitHub repos https://github.com/xing-hu/TL-CodeSum and https://github.com/wanyao1992/code_summarization_public, respectively.

Quick Start

Extractor

Run Extractor_ classifier/make_label/make_dataset_label.py to generate the classification labels used to train the classifier.

For example:

cd Extractor_ classifier/make_label/
python make_dataset_label.py

Then run Extractor_ classifier/train.py to train the classifier model. For example:

cd Extractor_ classifier/
python train.py {language}

The classifier model is saved in Extractor_ classifier/model/. Then run Extractor_ classifier/classifier.py to generate the predicted values for the important sentences:

cd Extractor_ classifier/
python classifier.py {language}

{language} can be one of: java, python, go, php, ruby, javascript, JCSD, PCSD.

The output samples are as follows:

{
    "idx":0, 
    "code_tokens":["private", "int", "currentDepth", ...], 
    "docstring_tokens": ["returns", "a", "0", ...],  
    ...,  
    "ex_labels": [1, 0, 1, ...], 
    "cleaned_seqs_pred": [1, 0, 1, ...] 
}
...
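
The record layout above can be inspected with a few lines of Python. The following is a minimal sketch, assuming each record is a JSON object with the fields shown and that "ex_labels" and "cleaned_seqs_pred" are aligned lists of 0/1 labels. The helper name label_agreement and the inline sample record are hypothetical; how the real output file is named and stored is not specified here.

```python
import json


def label_agreement(record):
    """Fraction of positions where the predicted label equals the reference label.

    Assumes "ex_labels" and "cleaned_seqs_pred" are aligned lists (an assumption
    based on the sample record, not a documented guarantee).
    """
    gold = record["ex_labels"]
    pred = record["cleaned_seqs_pred"]
    if len(gold) != len(pred):
        raise ValueError("label lists differ in length")
    if not gold:
        return 0.0
    matches = sum(1 for g, p in zip(gold, pred) if g == p)
    return matches / len(gold)


# Hypothetical record mirroring the sample fields; the values are made up.
sample = json.loads(
    '{"idx": 0, "ex_labels": [1, 0, 1, 0], "cleaned_seqs_pred": [1, 0, 0, 0]}'
)
print(label_agreement(sample))  # 0.75
```

A low agreement on such records may be a hint that the classifier needs more training, although the threshold that matters depends on the downstream summarization results.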

EACS + CodeBert

To train the EACS CodeBert model:

cd EACS_codeBert/
python train.py {language}

To test and output the EACS CodeBert results:

python test.py {language}

{language} can be one of: java, python, go, php, ruby, javascript, JCSD, PCSD.

EACS + CodeT5

To train the EACS CodeT5 model (this also outputs the results):

cd EACS_codeT5/
python run_gen.py {language}

{language} can be one of: java, python, go, php, ruby, javascript, JCSD, PCSD.
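
The steps above can be chained from a single script. This is a minimal sketch, assuming it is run from the repository root and that the directory and script names are exactly as written in this README. The helpers build_steps and run_steps are hypothetical and not part of the repository.

```python
import subprocess

# Values accepted for {language}, as listed in this README.
LANGUAGES = {"java", "python", "go", "php", "ruby", "javascript", "JCSD", "PCSD"}


def build_steps(language):
    """Return (working directory, command) pairs for the documented steps."""
    if language not in LANGUAGES:
        raise ValueError(f"unsupported language: {language!r}")
    return [
        ("Extractor_ classifier/make_label/", ["python", "make_dataset_label.py"]),
        ("Extractor_ classifier/", ["python", "train.py", language]),
        ("Extractor_ classifier/", ["python", "classifier.py", language]),
        ("EACS_codeBert/", ["python", "train.py", language]),
        ("EACS_codeBert/", ["python", "test.py", language]),
        ("EACS_codeT5/", ["python", "run_gen.py", language]),
    ]


def run_steps(language, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for cwd, cmd in build_steps(language):
        print(f"[{cwd}] {' '.join(cmd)}")
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, cwd=cwd, check=True)


if __name__ == "__main__":
    run_steps("java")  # dry run: only prints the commands
```

Passing the directory through cwd instead of a shell cd means a directory name containing a space needs no quoting.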

Evaluation

After training the EACS + CodeBert and EACS + CodeT5 models, run the evaluation code to output the BLEU, METEOR, and ROUGE-L scores:

(Switch to Python 2.7 first.)

cd Evaluation/
python evaluate.py
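
For intuition about one of the reported metrics, here is a minimal sketch of sentence-level ROUGE-L, computed from the longest common subsequence (LCS) of the generated and reference token lists. It is an illustration only and not the code in Evaluation/evaluate.py. The function names, the example sentences, and the default beta weight are assumptions, so scores from the real script may differ.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """LCS-based F-score; beta weights recall relative to precision."""
    lcs = lcs_length(candidate, reference)
    if lcs == 0:
        return 0.0
    precision = lcs / len(candidate)
    recall = lcs / len(reference)
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


# Made-up example summaries, tokenized on whitespace.
generated = "returns the current depth".split()
reference = "returns the depth of the tree".split()
print(round(rouge_l(generated, reference), 3))
```

Here the LCS is "returns the depth" (3 tokens), giving precision 3/4 and recall 3/6.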
