- Des Moines, Iowa, USA
- https://benjijang.com
- https://orcid.org/0000-0001-6175-105X
- in/ben-steenhoek
Stars
An agent benchmark with tasks in a simulated software company.
Official repository for our paper "FullStack Bench: Evaluating LLMs as Full Stack Coders"
Anthropic's Interactive Prompt Engineering Tutorial
PurCL / LLMSAN
Forked from chengpeng-wang/LLMSAN
LLMSAN: Sanitizing Large Language Models in Bug Detection with Data-Flow
A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!
Set of tools to assess and improve LLM security.
Goshawk is a static analysis tool to detect memory corruption bugs in C source code. It utilizes NLP to infer custom memory management functions and uses data flow analysis to abstract their behavi…
A Unified Library for Parameter-Efficient and Modular Transfer Learning
OWASP Benchmark is a test suite designed to verify the speed and accuracy of software vulnerability detection tools. A fully runnable web app written in Java, it supports analysis by Static (SAST),…
A test suite designed to verify the speed and accuracy of software vulnerability detection tools
A manually vetted dataset for security vulnerability detection in Java projects
Friends don't let friends make certain types of data visualization - What are they and why are they bad.
Security vulnerability database inclusive of CVEs and GitHub originated security advisories from the world of open source software.
Zero-shot vulnerability discovery using LLMs
A curated list of awesome remote jobs and resources. Inspired by https://github.com/vinta/awesome-python
Create web-based user interfaces with Python. The nice way.
This repo will contain the code of a paper we are publishing.
[NeurIPS'24] SemCoder: Training Code Language Models with Comprehensive Semantics Reasoning
An overview of LLMs for cybersecurity.
Code for "An Empirical Study of Deep Learning Models for Vulnerability Detection", published in ICSE 2023.