
How can I understand the format of your dataset and get the multi-view graph matrix of Java code? #5

Open
tmsxk opened this issue Nov 11, 2021 · 1 comment


tmsxk commented Nov 11, 2021

Hi, sorry to bother you. I want to run your model on my own Java dataset, but I don't know how to convert my source code into the same format as your dataset. I have downloaded both your data file data.zip and the original tl-codesum dataset, but the format of data.zip is very different from that of the original tl-codesum dataset. Therefore, I would like to know how you processed your Java dataset to get the format in data.zip. I would be very grateful if you could share the relevant scripts or code. In addition, I read your astruc.py code, but it handles the AST of Python code; I want to know how you get the multi-view graph and the corresponding matrix for a Java code fragment. If you could share this code or these scripts, or provide some guidance on how to perform these operations, it would help me a lot.

gingasan (Owner) commented Mar 2, 2022

Sorry for the late reply.
There are few unified toolkits or platforms for processing code in various languages efficiently. Recently, we have been working on a more general tool in our spare time; you can refer to https://github.com/2001-SamLiu/Jsonl-Generator-for-AST. We haven't yet tested the training performance of models when using this tool.
