Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Process Hangs Due to Recursive Code Transformations in Rule Graphs #709

Closed
sasidv opened this issue Nov 6, 2024 · 20 comments
Closed

Process Hangs Due to Recursive Code Transformations in Rule Graphs #709

sasidv opened this issue Nov 6, 2024 · 20 comments

Comments

@sasidv
Copy link

sasidv commented Nov 6, 2024

Thank you again for the great work. I have a question to ensure I fully understand the expectations of the rule graphs. What I'm trying to do is the following change:

before code =

from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()

after code =

from tensorflow.keras import layers
sp = layers.builder.config("config3", "1").config("config1", "5").config("config2", "5").getOrCreate()

Objective:

The goal is to add .config("config3", "1") to the config chain only if the import statement from tensorflow.keras import layers exists in the file. The change should not occur if the import is different, for examplefrom pytorch.ll import layers. I created a rule graph with two nodes:

  1. The first node checks if the import exists in the file.
  2. If the import is found, the second rule is executed to add .config("config3", "1") to the chain.

Question 1: Infinite Transformation Loop When Adding .config("config3", "1")

Observation:
The process hangs when I add .config("config3", "1") because it matches the search query again. If I change the replace parameter in the extend_method_chain rule to .config("1"), the code changes successfully.


Question 2: Transformation Applied Regardless of Import Presence

Observation:
The second rule, extend_method_chain, is executed and .config("config3", "1") is added to the method chain even when from tensorflow.keras import layers is not present in the file. What I want is the change to happen only if the import exists. Additionally, I also tried to change the scope of the rule to the File, expecting to fix this error, and got below error
pyo3_runtime.PanicException: Could not create scope query for "File".


Blow is the code that hangs

from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges

def main():
    code = """from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""

    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""(
  (import_from_statement
    module_name: (dotted_name
      (identifier) @module_name_part
      (identifier) @submodule_name_part)
    name: (dotted_name
      (identifier) @imported_name)
  (#eq? @module_name_part "tensorflow")
  (#eq? @submodule_name_part "keras")
  (#eq? @imported_name "layers"))
)""",  # cs indicates we are using concrete syntax
        is_seed_rule=True
    )

    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""(
  (call
    function: (attribute
      object: (attribute
        object: (identifier) @obj
        attribute: (identifier) @builder_method
      )
      attribute: (identifier) @config_method
    )
    arguments: (argument_list
      (string) @config_arg1
      (string) @config_arg2
    )
  )
  (#eq? @obj "layers")
  (#eq? @builder_method "builder")
  (#eq? @config_method "config")
)""",  # cs indicates we are using concrete syntax
        replace_node="config_method",
        replace="config(\"config3\", \"1\").@config_method",
        is_seed_rule=True
    )

    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")

    # Create Piranha arguments
    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge], )
    )

    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)


if __name__ == "__main__":
    main()

Do you have any suggestions or workarounds to fix both errors? Or am I doing something unexpected?

@danieltrt
Copy link
Contributor

danieltrt commented Nov 6, 2024

Alright! My first suggestion is to avoid using tree-sitter queries and instead use "concrete syntax," unless you truly need the full expressiveness of tree-sitter queries (such as for alternations or similar complex patterns).
Concrete syntax is straightforward: you write the exact code pattern you want to match and leave placeholders for parts that can vary.

For example:

:[x].foo(:[args+]) would match var1.foo(1,2,3,4).

:[var] matches one node, :[var+] would match multiple.

Now, regarding the infinite loop issue: Piranha operates by applying transformations iteratively until it reaches a fixpoint, meaning it will continue transforming code until there are no further changes. If the code still matches after a rewrite, the same rule will trigger again. This approach is effective for feature-flash cleanup, but not ideal for migrations. To prevent repeated matches, we use what we call filters. Take a look at your code with these changes:

from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter

def main():
    code = """from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""

    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )

    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""cs :[builder].config(:[args+])""",  # cs indicates we are using concrete syntax
        replace_node="builder",
        replace=':[builder].config("config3", "1")',
        is_seed_rule=True,
        filters={Filter(enclosing_node="(assignment) @assign",
                        not_contains=['cs :[other].config("config3", "1")'])} # here is the filter that says "only apply this rule once per assignment"
    )

    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")

    # Create Piranha arguments
    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge], )
    )

    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)


if __name__ == "__main__":
    main()

@sasidv
Copy link
Author

sasidv commented Nov 6, 2024

Thank you for the suggestions! I'll try using concrete syntaxes for now, but I might need to explore Tree-sitter queries in the future. The looping issue has been resolved with the Filter , thanks, but when I remove the import tensorflow.keras import layers from the code above, the changes are still applied, which shouldn't happen. Do you have an idea?

from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter

def main():
    code = """sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""

    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )

    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""cs :[builder].config(:[args+])""",  # cs indicates we are using concrete syntax
        replace_node="builder",
        replace=':[builder].config("config3", "1")',
        is_seed_rule=True,
        filters={Filter(enclosing_node="(assignment) @assign",
                        not_contains=['cs :[other].config("config3", "1")'])} # here is the filter that says "only apply this rule once per assignment"
    )

    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")

    # Create Piranha arguments
    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge], )
    )

    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)


if __name__ == "__main__":
    main()

Expected output

sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()

Actual output

sp = layers.builder.config("config1", "5").config("config3", "1").config("config2", "5").getOrCreate()

@danieltrt
Copy link
Contributor

The second rule needs to be declared as seed = False

@sasidv
Copy link
Author

sasidv commented Nov 6, 2024

Thank you so much for your help in figuring this out. I tried setting is_seed_rule=False, but when I do that, the rule doesn't execute at all. However, when I set is_seed_rule=True, only the extend_method_chain rule is executed and the change is made. The issue, though, is that when I set it to True, the change is applied regardless of whether from tensorflow.keras import layers is present or not. Do you have any idea?

from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter

def main():
    code = """from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""

    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )

    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""cs :[builder].config(:[args+])""",  # cs indicates we are using concrete syntax
        replace_node="builder",
        replace=':[builder].config("config3", "1")',
        is_seed_rule=False,
        filters={Filter(enclosing_node="(assignment) @assign",
                        not_contains=['cs :[other].config("config3", "1")'])} # here is the filter that says "only apply this rule once per assignment"
    )

    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")

    # Create Piranha arguments
    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge], )
    )

    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)


if __name__ == "__main__":
    main()

@danieltrt
Copy link
Contributor

danieltrt commented Nov 6, 2024

I suspect the reason is the scope. Rules in PolyglotPiranha trigger other rules within scopes. In your code, the rule find_from_tensorflow_keras_import_layers is set to trigger extend_method_chain within the Parent scope (i.e., parent in the AST).

PolyglotPiranha can support multiple scopes (they are user defined for each language). For example, for java, you can find scopes here

Unfortunately, we don't have scopes defined for Python (we would need to add it to the cleanup_rules folder). However, there are two language agnostic scopes you can use right now. Parent and Global. If you switch from Parent to Global in your edge, the rule should now trigger. This isn't exactly what you need, since Global will trigger the rule for the entire codebase, rather than just at the File level. Alternatively, you can set number_of_ancestors_in_parent_scope to a high number such that Parent triggers all the way up to the File Node:

    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge]),
        number_of_ancestors_in_parent_scope=30 # set a big number here
    )

Let me know if this helps!

@sasidv
Copy link
Author

sasidv commented Nov 7, 2024

Thank you again. I tried increasing the value of number_of_ancestors_in_parent_scope to its maximum, but the issue persists, and the rule extend_method_chain still doesn't execute. If there are no other workarounds available, and using the "File" scope is the cleanest solution, I’m happy to implement that feature in Piranha for Python. If you could provide a list of all the changes I need to make in Piranha to implement the "File" scope, I'd be glad to create a PR once I have the details.

from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter

def main():
    code = """from tensorflow.keras import layers
sp = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()"""

    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )

    extend_method_chain = Rule(
        name="extend_method_chain",
        query="""cs :[builder].config(:[args+])""",  # cs indicates we are using concrete syntax
        replace_node="builder",
        replace=':[builder].config("config3", "1")',
        is_seed_rule=False,
        filters={Filter(enclosing_node="(assignment) @assign",
                        not_contains=['cs :[other].config("config3", "1")'])} # here is the filter that says "only apply this rule once per assignment"
    )

    edge = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["extend_method_chain"], scope="Parent")

    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, extend_method_chain], edges=[edge]),
        number_of_ancestors_in_parent_scope=255 # set a big number here
    )

    # Execute Piranha and print the transformed code
    piranha_summary = execute_piranha(piranha_arguments)
    print(piranha_summary[0].content)


if __name__ == "__main__":
    main()

@danieltrt
Copy link
Contributor

You're right; I had forgotten. Unfortunately, parent scope only applies to nodes that are actual parents and not their context recursively, so the current workaround is to set it to global.

As for the required changes, I would be happy to review your PRs, but I currently don’t have write access to this repository, so I can't promise it will be merged. The change is pretty straightforward; you just have to create a scope config file and then add its relative path to the corresponding language, as shown here:

scopes: vec![],

In src/cleanup_rules/python/scope_config.toml

# Scope generator for python files
[[scopes]]
name = "File"
[[scopes.rules]]
enclosing_node = """
(module) @p_m
"""
scope = "(module) @python_module"

You may also add other scopes, like class, function, method etc

@sasidv
Copy link
Author

sasidv commented Nov 10, 2024

@danieltrt Thank you for your help. I created the file and successfully defined File scope, it allows me to execute the changes as expected. To complete this and create the PR, I also added scopes for Function and Class as shown below. File scope works perfectly fine, but I am getting the following error with the Class and Function scopes. Do you have any idea about this? Is there another change I should do, or anything wrong with scope_config.toml Thanks again

thread '<unnamed>' panicked at src/models/scopes.rs:80:5:
Could not create scope query for "Class"
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
Traceback (most recent call last):
  File "/Users/xxx/Documents/yyy/src/transformation1.py", line 19, in <module>
    main()
  File "/Users/xxx/Documents/yyysrc/transformation1.py", line 13, in main
    piranha_summary = execute_piranha(piranha_arguments)
pyo3_runtime.PanicException: Could not create scope query for "Class"

Below is the python/scope_config.toml

# Scope generator for python files
[[scopes]]
name = "Function"
[[scopes.rules]]
enclosing_node = """
(
  (function_definition
      name: (_) @n
      parameters: (parameters) @fp
  ) @xdn
)"""
scope = """
(
  [
    (function_definition
        name: (_) @z
        parameters: (parameters) @tp
    )
    (#eq? @z "@n")
    (#eq? @tp "@fp")
  ]
) @qdn
"""

[[scopes]]
name = "Class"
[[scopes.rules]]
enclosing_node = """
((class_definition name: (_) @n) @c)
"""
scope = """
(
  ((class_definition
      name: (_) @z) @qc)
  (#eq? @z "@n")
)
"""


[[scopes]]
name = "File"
[[scopes.rules]]
enclosing_node = """
(module) @p_m
"""
scope = "(module) @python_module"

@danieltrt
Copy link
Contributor

That seems right to me for your example. PolyglotPiranha cannot find an enclosing class in from tensorflow.keras import layers, hence you get that error.

The function scope seems to have a little problem. This version works for me:

[[scopes]]
name = "Function"
[[scopes.rules]]
enclosing_node = """
(
  (function_definition
      name: (_) @n
      parameters: (parameters) @fp
  ) @xdn
)"""
scope = """
(
  [
    (function_definition
        name: (_) @z
        parameters: (parameters) @tp
    ) @qdn
    (#eq? @z "@n")
    (#eq? @tp "@fp")
  ]
)
"""

Try running it on this code instead:

def test_fn(x):
    from tensorflow.keras import layers
    other = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()
    
first = layers.builder.config("config1", "5").config("config2", "5").getOrCreate()

Hopefully you will observe the expected behavior 😃

@sasidv
Copy link
Author

sasidv commented Nov 11, 2024

It works now—thanks a lot for your help. However, if PolyglotPiranha can’t find an enclosing class or function, should it print a warning or error message instead of raising an exception? Is this the expected behavior?

@danieltrt
Copy link
Contributor

I agree, I think a warning could have been a better solution here ? I don't remember the rationale for this design decision

@maldil
Copy link

maldil commented Nov 12, 2024

The PR has been opened. Thank you once again for your help.

@ketkarameya
Copy link
Contributor

I think we can add a flag that optionally prints a warning rather than an throwing an exception.
Throwing an exception gives a quicker feedback on what graph paths are working on the given codebase (so that the rules can be updated), and it also makes behavior more deterministic. Keeping track of warnings without a proper observability infrastructure is difficult.

@maldil
Copy link

maldil commented Nov 18, 2024

Sure, if the expectation is to apply each rule at least once, that makes sense to me.

@maldil
Copy link

maldil commented Nov 27, 2024

Hi @danieltrt

Thank you again for helping me understand how to run Piranha. I have a follow-up question regarding specifying the enclosing_node in the filter, which you recommended to use as a solution to resolve the process hang issue.

I am trying to make the following change:

input code

from tensorflow.keras import layers
from pandas import data

dt = data.build()
layer = layers.create(dt)

def function1():
    sp = layer.builder.config("config1", "5")

def function2():
    sp = layer.builder.config("config1", "5")

expected code

from tensorflow.keras import layers
from pandas import data

dt = data.build()
layer = layers.create()

def function1():
    sp = layer.builder.config("config1", "5")
    sp.enable("config1")

def function2():
    sp = layer.builder.config("config1", "5")
    sp.enable("config1")

I tried to implement this change as described below. I have a query to identify the imports and another query to identify dt = data.build() and layer = layers.create(dt) (though I removed these for simplicity in the code below). The logic then checks for the existence of sp = layer.builder.config("config1", "5"). If all conditions are satisfied, the modification is to add sp.enable("config1").

I tried using function_definition, block, and even assignment as the enclosing node in the filter. However, it still hangs. If I use the entire module as the enclosing node, it successfully adds sp.enable("config1") but only for function2(), indentation is also incorrect.

below is the output with incorrect indentation, if I use module as enclosing node in the filter,

from tensorflow.keras import layers
from pandas import data

dt = data.build()
layer = layers.create()

def function1():
    sp = layer.builder.config("config1", "5")

def function2():
    sp = layer.builder.config("config1", "5")
sp.enable("config1")

I browsed through all most all the examples in the repository. Since this change involves adding a statement, which is slightly different from most of the examples I found. I came across one example that involve adding import statement, which usually do not require indentation as they are typically located at the top of the file. However, I could not find an example demonstrating how to add other statement within the code with proper indentation.

Do you have any suggestions or ideas on how to resolve this?

Below is the complete code

from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter
from collections import Counter
def main():
    code = """from tensorflow.keras import layers
from pandas import data

dt = data.build()
layer = layers.create()

def function1():
    sp = layer.builder.config("config1", "5")

def function2():
    sp = layer.builder.config("config1", "5")"""

    expected_code = """from tensorflow.keras import layers
from pandas import data

dt = data.build()
layer = layers.create()

def function1():
    sp = layer.builder.config("config1", "5")
    sp.enable("config1")

def function2():
    sp = layer.builder.config("config1", "5")
    sp.enable("config1")"""

    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )

    find_from_pandas_import_data = Rule(
        name="find_from_pandas_import_data",
        query="""rgx from pandas import data""",  # rgx indicates regex match
        is_seed_rule=False
    )


    find_layer_builder_config = Rule(
        name="layer_builder_config",
        query="""(function_definition
  body: (block
    (expression_statement
      (assignment
        left: (identifier) @var_name
        right: (call
          function: (attribute
            object: (attribute
              object: (identifier) @object_name
              attribute: (identifier) @builder_name
            )
            attribute: (identifier) @config_name
          )

        )
      )
  (#eq? @object_name "layer")
  (#eq? @builder_name "builder")
  (#eq? @config_name "config")
    )@ass
  )
)
""",
        replace_node="ass",
        replace="@ass\nsp.enable(\"config1\")",
        is_seed_rule=False,
        # # holes={"ass"},
        filters={Filter(enclosing_node="(module) @call",
                         not_contains=["cs sp.enable(\"config1\")"])}
    )

    edge1 = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["find_from_pandas_import_data"], scope="File")
    edge2 = OutgoingEdges("find_from_pandas_import_data", to=["layer_builder_config"], scope="File")


    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="python",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, find_from_pandas_import_data, find_layer_builder_config], edges=[edge1,edge2]),
    )

    piranha_summary = execute_piranha(piranha_arguments)
    rule_match_counter = Counter([m[0] for m in piranha_summary[0].matches])
    print(rule_match_counter)
    print(piranha_summary[0].content)


if __name__ == "__main__":
    main()

@danieltrt
Copy link
Contributor

danieltrt commented Nov 27, 2024

Hey @maldil,

I believe the issue here is that Piranha isn't handling indentation properly. Unfortunately, this version of Piranha doesn’t natively support indentation. Since Uber primarily uses Piranha for cleanups in Java or other languages where indentation isn’t critical, adding support for Python indentation hasn’t been a priority for them.

To address this, I’ve implemented indentation support in my own custom branch, which you can find here.
My solution is a bit hacky because I didn’t fully adhere to the abstractions used in Piranha's design. Additionally, I’ve only implemented it for concrete syntax. However, it works well enough for my use case (which sounds similar to yours? migrations)

I’ve also made it possible to perform single-pass transformations, which helps avoid looping issues. This is particularly useful for my research, as I primarily use Piranha for migrations rather than cleanup tasks. In this version, you don’t even need filters!
If you’d like to try it out, you can install my version by following the usual installation steps from the source. Here's the process:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
cargo build --no-default-features
maturin build --release

Here's the modified script which works like you want to in my version of piranha

from polyglot_piranha import execute_piranha, PiranhaArguments, Rule, RuleGraph, OutgoingEdges, Filter
from collections import Counter
def main():
    code = """from tensorflow.keras import layers
from pandas import data

dt = data.build()
layer = layers.create()

def function1():
    sp = layer.builder.config("config1", "5")

def function2():
    sp = layer.builder.config("config1", "5")"""

    expected_code = """from tensorflow.keras import layers
from pandas import data

dt = data.build()
layer = layers.create()

def function1():
    sp = layer.builder.config("config1", "5")
    sp.enable("config1")

def function2():
    sp = layer.builder.config("config1", "5")
    sp.enable("config1")"""

    find_from_tensorflow_keras_import_layers = Rule(
        name="find_from_tensorflow_keras_import_layers",
        query="""rgx from tensorflow.keras import layers""",  # rgx indicates regex match
        is_seed_rule=True
    )

    find_from_pandas_import_data = Rule(
        name="find_from_pandas_import_data",
        query="""rgx from pandas import data""",  # rgx indicates regex match
        is_seed_rule=False
    )


    find_layer_builder_config = Rule(
        name="layer_builder_config",
        query="""cs :[var] = layer.builder.config("config1", :[val])""",
        replace_node="*",
        replace=""":[var] = layer.builder.config("config1", :[val])\nsp.enable(\"config1\")""",
        is_seed_rule=False,
    )

    edge1 = OutgoingEdges("find_from_tensorflow_keras_import_layers", to=["find_from_pandas_import_data"], scope="File")
    edge2 = OutgoingEdges("find_from_pandas_import_data", to=["layer_builder_config"], scope="File")


    piranha_arguments = PiranhaArguments(
        code_snippet=code,
        language="py",
        rule_graph=RuleGraph(rules=[find_from_tensorflow_keras_import_layers, find_from_pandas_import_data, find_layer_builder_config], edges=[edge1,edge2]),
    )

    piranha_summary = execute_piranha(piranha_arguments)
    rule_match_counter = Counter([m[0] for m in piranha_summary[0].matches])
    print(rule_match_counter)
    print(piranha_summary[0].content)


if __name__ == "__main__":
    main()
    
Screenshot 2024-11-27 at 9 32 32 PM

I'd be happy to clean this code version if you are interested in using it. I'm not sure how quickly we would be able to merge it to this repository though

@maldil
Copy link

maldil commented Nov 27, 2024

thanks a lot for sharing, let me try this.

@maldil
Copy link

maldil commented Dec 3, 2024

@danieltrt Thanks, it works! I noticed it also prints some debugging-related text, which I believe can be removed.

Since this approach works for concrete syntaxes, do you have any insights into the effort and required changes to adapt it for Tree-sitter queries?

@danieltrt
Copy link
Contributor

I think the logic would be pretty similar. The way it works is by identifying the depth at which the matched code exists and treating it as a "box" structure. For example:

-----
-----
-----
-----  | if something:
       |   x = 1
       | else:
       |   x = 0

I used this concept to extract "unindented" code. During replacements, we just need to check where the tag (e.g., stmt) is and ensure the replacement respects its indentation. For example:

while True:
    if something:
       @stmt

Here we just need to apply the indentation of "stmt" to the captured box block, resulting in

while True:
    if something:
        if condition:
            x = 1
        else:
            x = 0

@sasidv
Copy link
Author

sasidv commented Dec 19, 2024

Thank you for your help. I’m not sure if I should open a feature request to track this, but since the primary issue of the process hang has been resolved, I will go ahead and close this ticket.

@sasidv sasidv closed this as completed Dec 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants