Add sample Python auto generation #205

Draft: wants to merge 2 commits into base: main

DavidKorczynski
Collaborator

Sample for auto-generating an OSS-Fuzz project for a given Python project.

This differs a bit from the existing set up. The approach in this PR relies on cloning the Python repository within an OSS-Fuzz base-builder image. Within this image Fuzz Introspector is cloned, and a fuzz introspector analysis purely based on static analysis is performed, to extract details about the Python library under analysis.
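As a rough illustration of the set-up described above, the sketch below renders a Dockerfile that clones a target repository and Fuzz Introspector inside a base-builder image. The function name `render_dockerfile`, the default image tag, and the clone destinations under `$SRC` are assumptions made for illustration, not the code in this PR.

```python
FUZZ_INTROSPECTOR_REPO = 'https://github.com/ossf/fuzz-introspector'


def render_dockerfile(
        github_repo: str,
        base_image: str = 'gcr.io/oss-fuzz-base/base-builder-python') -> str:
    """Returns Dockerfile text that clones the target and Fuzz Introspector.

    Hypothetical helper: the image tag and directory layout are assumptions.
    """
    # Derive a directory name from the last path component of the repo URL.
    project_name = github_repo.rstrip('/').split('/')[-1]
    if project_name.endswith('.git'):
        project_name = project_name[:-len('.git')]

    lines = [
        f'FROM {base_image}',
        # Clone the Python project under analysis.
        f'RUN git clone --depth 1 {github_repo} $SRC/{project_name}',
        # Clone Fuzz Introspector so static analysis can run in the image.
        f'RUN git clone --depth 1 {FUZZ_INTROSPECTOR_REPO} '
        '$SRC/fuzz-introspector',
        f'WORKDIR $SRC/{project_name}',
    ]
    return '\n'.join(lines) + '\n'
```

The static analysis step itself is left out here, since how Fuzz Introspector is invoked is not shown in this thread.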

It's a bit raw at this stage, and can likely be better integrated into OSS-Fuzz-gen. However, I'm not 100% sure what the smartest steps are here, so am sharing in case there are opinions. It may be smarter to operate on two tracks in parallel and merge them later on once it's better known what works/doesn't work.

Signed-off-by: David Korczynski <[email protected]>
Collaborator

@oliverchang oliverchang left a comment

nice! just initial comments.

I wonder if there's a way to refactor this to make all of this a bit neater in the future. Right now the concept of a benchmark is quite ingrained in these classes, so we need to untie that, or alternatively create a synthetic "benchmark" based on the Python project we're generating targets for.

@@ -0,0 +1,9 @@
# Python auto-gen

nit: can we place this in /languages/python instead?


def create_sample_harness(github_repo: str, func_elem):

prompt_template = """Hi, I'm looking for your help to write a Python fuzzing harness for the %s Python project. The project is located at %s and I would like you to write a harness targeting this module. You should use the Python Atheris framework for writing the fuzzer. Could you please show me the source code for this harness?

Can we put this into a file?
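
One way this could look is sketched below: the prompt text lives in a plain-text file and is filled in at call time. The helper names `load_prompt_template` and `build_prompt` are hypothetical, and the sketch assumes the template keeps exactly two `%s` placeholders (project name, then repository URL), which is all that is visible in the snippet above.

```python
from pathlib import Path


def load_prompt_template(template_path: Path) -> str:
    """Reads the prompt template text from a file on disk."""
    return template_path.read_text(encoding='utf-8')


def build_prompt(template: str, project_name: str, github_repo: str) -> str:
    """Fills the two %s placeholders assumed to be in the template."""
    return template % (project_name, github_repo)
```

Keeping the template in a file means prompt wording can change without touching the Python code. Note that any literal `%` in the template would need to be written as `%%` with this formatting style.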

@@ -119,6 +119,13 @@ def _pre_build_check(self, target_path: str,
return False
return True

def build_and_run_python(self, generated_project: str, target_path: str):

Can we avoid adding language specific methods here? We should ideally make this class language agnostic.

@@ -145,6 +152,36 @@ def build_and_run(self, generated_project: str, target_path: str,
generated_project, benchmark_target_name))
return build_result, run_result

def run_target_local_python(self, generated_project: str, target_name: str,

This is very similar to run_target_local. Is the only difference that we're not passing things from self.benchmark?

We should find some better way to refactor things so there's less duplication here. One simple way I can think of now is to perhaps factor out a general run_oss_fuzz_helper(...) which is called by both run_target_local and python_fuzzgen.
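
A minimal sketch of the factoring suggested above, under assumptions: only the name `run_oss_fuzz_helper` comes from the comment, while the signature, the `runner` parameter (added so the function can be tested without Docker), and the choice to capture output are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence


def run_oss_fuzz_helper(
    oss_fuzz_dir: str,
    helper_args: Sequence[str],
    runner: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> subprocess.CompletedProcess:
    """Runs `python3 infra/helper.py <helper_args>` from an OSS-Fuzz checkout.

    Hypothetical shared helper: callers decide which subcommand and
    arguments to pass, so nothing here depends on a benchmark object.
    """
    command: List[str] = ['python3', 'infra/helper.py', *helper_args]
    # check=False leaves it to the caller to inspect the return code.
    return runner(command,
                  cwd=oss_fuzz_dir,
                  capture_output=True,
                  text=True,
                  check=False)
```

With something like this, both the benchmark-based path and the Python generation path would only differ in the argument list they build, for example `['run_fuzzer', generated_project, target_name]`.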

@@ -199,6 +237,14 @@ def build_target_local(self,
print(f'Failed to build image for {generated_project}')
return False

if language == 'python':
command = 'python3 infra/helper.py build_fuzzers %s' % (generated_project)

Can you use f-strings here to be consistent?
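
For reference, the two spellings produce the same string. The value of `generated_project` below is a made-up example.

```python
generated_project = 'example-project'  # hypothetical value for illustration

# %-formatting, as in the line under review.
command_percent = 'python3 infra/helper.py build_fuzzers %s' % (
    generated_project)

# Equivalent f-string, matching the style used elsewhere in the file.
command_fstring = f'python3 infra/helper.py build_fuzzers {generated_project}'
```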

@@ -199,6 +237,14 @@ def build_target_local(self,
print(f'Failed to build image for {generated_project}')
return False

if language == 'python':

What breaks if we let this run through the existing code from line 248 instead? Is there a way to make this work by changing the env vars being set there instead?
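
A hedged sketch of what driving the difference through environment variables might look like: a per-language table feeds one shared code path instead of an `if language == 'python'` branch. The table contents and the function name `build_fuzzers_args` are hypothetical, the existing code at line 248 is not shown in this thread, and the sketch assumes the helper script accepts `-e VAR=value` for `build_fuzzers`.

```python
from typing import Dict, List

# Hypothetical per-language settings. The real variables would be whatever
# the existing build path already sets.
LANGUAGE_ENV: Dict[str, Dict[str, str]] = {
    'c++': {'FUZZING_LANGUAGE': 'c++'},
    'python': {'FUZZING_LANGUAGE': 'python'},
}


def build_fuzzers_args(generated_project: str, language: str) -> List[str]:
    """Builds helper arguments for build_fuzzers without branching on language."""
    args = ['build_fuzzers']
    # Unknown languages simply get no extra environment variables.
    for name, value in sorted(LANGUAGE_ENV.get(language, {}).items()):
        args += ['-e', f'{name}={value}']
    args.append(generated_project)
    return args
```

Supporting another language would then mean adding a table entry rather than another branch in `build_target_local`.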

@defihook defihook left a comment

use fstrings here to be consistent?

3 participants