
Some details #6

Open
ItGirls opened this issue Aug 30, 2021 · 9 comments

Comments

@ItGirls

ItGirls commented Aug 30, 2021

Hi, I read your code, which is excellent work. Here I list some details and questions about your work, to avoid any misunderstanding on my part:

(1) The indicator function I(r,c) in your paper indicates whether the role r belongs to the type c. But in your code, you actually use the predefined event schema (i.e., ty_args_id, which contains the information given by ty_args.json). Accordingly, the indicator function does not really decide whether the role r belongs to the type c; it just acts as a computed weight coefficient that adjusts the score produced by the sigmoid function.
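For reference, a minimal sketch of the mechanism being discussed: a schema-derived indicator acting as a multiplicative mask over sigmoid role scores. All tensor names and shapes here are hypothetical, not the repository's actual code; if I(r,c) is learned rather than a hard 0/1 mask, the same multiplication makes it a weight coefficient, as the comment points out.

```python
import torch

# Hypothetical sizes for illustration only.
num_types, num_roles, seq_len = 3, 5, 8

# ty_args_mask[c, r] = 1 if role r is defined for event type c (as in ty_args.json), else 0.
ty_args_mask = torch.zeros(num_types, num_roles)
ty_args_mask[0, [0, 2]] = 1.0  # e.g. type 0 only allows roles 0 and 2

logits = torch.randn(num_roles, seq_len)  # raw per-token role logits
scores = torch.sigmoid(logits)            # each score in (0, 1)

event_type = 0
# Broadcasting the (num_roles,) mask over the sequence dimension
# zeroes out every role the schema does not define for this type.
masked = scores * ty_args_mask[event_type].unsqueeze(-1)

assert (masked[1] == 0).all()  # role 1 is not in type 0's schema
```

With a hard 0/1 mask the model can never predict an out-of-schema role; with a learned soft indicator it is only discouraged from doing so.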

(2) Besides the event schema, in prediction you also use other prior information, such as the largest length of a trigger/argument. Are all three pieces of prior information calculated only from the training data?

(3) In the evaluation, I find that the metrics for argument identification and argument classification are missing trigger information, so they are not very strict. If that were added (I mean an argument is correctly identified only if its offsets, related trigger type, and trigger offsets exactly match a reference argument), the performance would decrease.
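To make the distinction concrete, here is a small sketch of the two matching criteria. The tuple layout and function names are hypothetical, not taken from the repository's metric code:

```python
# Hypothetical prediction format:
# (event_type, trig_start, trig_end, arg_start, arg_end, role)

def strict_match(pred, gold):
    """Argument is correct only if its offsets, role, AND the
    trigger's type and offsets all match the reference."""
    return pred == gold

def loose_match(pred, gold):
    """Drops trigger information: compares only the argument
    span offsets and role."""
    return pred[3:] == gold[3:]

pred = ("Attack", 0, 1, 5, 7, "Victim")
gold = ("Transport", 2, 3, 5, 7, "Victim")

assert loose_match(pred, gold)       # counted correct under the looser metric
assert not strict_match(pred, gold)  # rejected under the stricter metric
```

The example shows why the stricter criterion can only lower the reported scores: every strict match is also a loose match, but not vice versa.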

@JiaweiSheng
Owner

Hi, thanks for your comments.

The simple indicator function I(r,c) is usually learned well in our experiments, reflecting the correspondence between event types and argument roles. It helps the model avoid learning redundant roles that a type doesn't have, and thus improves model performance in our experiments.

Besides, the pre-defined event schema and the largest length actually work as additional post-processing steps, which slightly improve the performance. Note that they are obtained only from the training data, and are very easy to compute with data statistics.
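The length-based post-processing described above can be sketched as follows. This is an illustrative reconstruction with hypothetical function names, not the repository's code: the maximum span length is computed from training data alone, then used to discard over-long candidate spans at prediction time.

```python
def max_span_length(train_spans):
    """train_spans: iterable of (start, end) token offsets,
    taken from the training data only."""
    return max(end - start + 1 for start, end in train_spans)

def filter_spans(candidates, max_len):
    """Post-processing: drop any candidate span longer than the
    maximum length observed in training."""
    return [(s, e) for s, e in candidates if e - s + 1 <= max_len]

# Toy statistics from a hypothetical training set of trigger spans.
train_triggers = [(0, 0), (4, 5), (10, 11)]
max_trig_len = max_span_length(train_triggers)  # longest training trigger: 2 tokens

preds = [(3, 3), (6, 9), (12, 13)]
kept = filter_spans(preds, max_trig_len)
assert kept == [(3, 3), (12, 13)]  # the 4-token span (6, 9) is discarded
```

The same filtering would apply to argument spans with their own training-set maximum.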

As for the evaluation metric, we followed the metric code of the previous research reported in the paper. You could also try other metric code for evaluation.

@ItGirls
Author

ItGirls commented Sep 2, 2021


Thank you so much. And the pre-defined event schema and the largest length are obtained only from the training data, not including the evaluation data, right?

@JiaweiSheng
Owner

Yes, correct.

@ItGirls
Author

ItGirls commented Sep 2, 2021


Thank you for your reply. I did the statistics and found that the results computed in three ways (train only, train/dev, or train/dev/test) differ only in the length of the role "way"; all other information in the pre-defined event schema and the largest lengths is the same.

@ItGirls
Author

ItGirls commented Sep 3, 2021


I forgot one important issue: your work/code currently only suits the "no negative samples" setting (every sentence in your data contains an event). If I want to apply your work where negative samples exist in the train/dev/test data (such as the ACE event data), I need to adjust not only the training process but also the model itself. Is that right?

@ChesterXi


I reconstructed ACE2005 and ran the experiment with CasEE. Why are all my results 0.000 or 0.001? Does your ACE experiment run properly?

@ItGirls
Author

ItGirls commented Apr 12, 2022

Yes, I ran it properly.

@jinzhuoran

jinzhuoran commented Apr 16, 2022


May I ask what performance you got on ACE2005? After running it, my F1 score for event type is rather low; is that normal?

@ItGirls
Author

ItGirls commented Apr 20, 2022 via email
