API to add wildcard strings into automata #1709

Open
subhajit-cdot opened this issue Aug 16, 2022 · 3 comments

@subhajit-cdot

Hi,
In nDPI, I can't find any API to add wildcard strings (ABC$, ^ABC, ^A.BC$, etc.) to an automaton. The AC_PATTERN_t structure already has "from_start" and "at_end" fields for each string, but from an application's point of view, without wrapper APIs we have to fill in (and therefore expose) the internal AC_PATTERN_t structure ourselves. ndpi_add_string_to_automa() should handle this before adding the string to the automaton. Any views on this?

Thanks
S

@subhajit-cdot
Author

In my understanding, these fields were provided to add a wildcard feature to automata matching. So if we add ABC$ to the automaton and the string ABCD comes in for matching, it should not match, while AABC should match. Is this understanding correct?
Thanks
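
To make the expected semantics concrete, here is a minimal sketch in plain C of how "from_start" and "at_end" could constrain a substring match. `anchored_match` is a hypothetical helper written for illustration only; it is not an nDPI function and does not use the Aho-Corasick automaton or AC_PATTERN_t. It assumes case-sensitive, NUL-terminated strings.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not part of nDPI): returns true if `pat` occurs in
 * `text` at a position allowed by the two anchor flags.
 *   from_start -> the occurrence must begin at offset 0   (like ^ABC)
 *   at_end     -> the occurrence must end at end of text  (like ABC$)
 */
static bool anchored_match(const char *text, const char *pat,
                           bool from_start, bool at_end)
{
    size_t tlen = strlen(text);
    size_t plen = strlen(pat);

    if (plen == 0 || plen > tlen)
        return false;

    /* Walk over every occurrence of pat in text. */
    for (const char *p = strstr(text, pat); p != NULL; p = strstr(p + 1, pat)) {
        size_t pos = (size_t)(p - text);

        if (from_start && pos != 0)
            break;              /* later occurrences cannot start at 0 */
        if (at_end && pos + plen != tlen)
            continue;           /* try the next occurrence */
        return true;
    }
    return false;
}
```

Under these assumptions, `anchored_match("AABC", "ABC", false, true)` succeeds and `anchored_match("ABCD", "ABC", false, true)` fails, which is the behaviour described above for ABC$.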

@lucaderi
Member

The library used by nDPI implements substring matching, so what you describe is currently done either by checking the returned results or with tricks like ".activision.com" to avoid matching myactivision.com.
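
As a rough illustration of the "check the results" approach, the sketch below post-filters a host name against a domain on a label boundary. `host_in_domain` is a hypothetical helper, not an nDPI function; it assumes both strings are already lowercased and NUL-terminated.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical post-check (not part of nDPI): accept `host` only if it is
 * exactly `domain` or ends with "." followed by `domain`. */
static bool host_in_domain(const char *host, const char *domain)
{
    size_t hlen = strlen(host);
    size_t dlen = strlen(domain);

    if (dlen == 0 || hlen < dlen)
        return false;

    const char *tail = host + (hlen - dlen);
    if (strcmp(tail, domain) != 0)
        return false;           /* host does not end with domain */

    /* Either an exact match, or the suffix starts on a label boundary. */
    return hlen == dlen || tail[-1] == '.';
}
```

With a plain substring search, the pattern ".activision.com" is found in "www.activision.com" but not in "myactivision.com". It is also not found in the bare host "activision.com", which is the case the explicit check above covers.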

@subhajit-cdot
Author

subhajit-cdot commented Aug 18, 2022

Thanks @lucaderi. I have added another API, since nDPI's string matching library already supports "from_start" and "at_end". This API can now distinguish cases like "blabla.ABC.blabla" and "ABC". Tricks like ".ABC." will match "blabla.ABC.blabla" as well, which is not desired.

In host_match[], there are strings to match such as "*.gateway.messenger.live.com". Are we expecting CDNs to send exactly the same content, including the "*."? I ask because I think ahocorasick has no support for a "*.*"-style regex with a match-zero-or-more operator (e.g. abc*.test.*res*).
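
For comparison, this is a minimal sketch of what a match-zero-or-more "*" would mean if it were interpreted as a wildcard. `glob_match` is a hypothetical standalone function, not part of nDPI or its ahocorasick code; it treats "*" as "any run of characters, including none" and every other character literally.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical glob matcher (not part of nDPI): the whole of `text` must
 * match `pat`, where '*' matches zero or more arbitrary characters. */
static bool glob_match(const char *pat, const char *text)
{
    const char *star = NULL;    /* position of the last '*' seen in pat */
    const char *resume = NULL;  /* where in text that '*' started matching */

    while (*text) {
        if (*pat == '*') {
            star = pat++;       /* remember the star, try matching nothing */
            resume = text;
        } else if (*pat == *text) {
            pat++;              /* literal character matches */
            text++;
        } else if (star) {
            pat = star + 1;     /* backtrack: let the star absorb one more */
            text = ++resume;
        } else {
            return false;
        }
    }
    while (*pat == '*')         /* trailing stars can match the empty string */
        pat++;
    return *pat == '\0';
}
```

Under this interpretation, "*.gateway.messenger.live.com" would match "a.gateway.messenger.live.com" but not the bare "gateway.messenger.live.com", and "abc*.test.*res*" would match "abcd.test.xresy".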
