Benjamin Steenhoek bstee615

Research Scientist @ Microsoft. Recently completed PhD from Iowa State University. Interests and research: deep learning for software engineering

46 followers · 60 following

Lists (1)

🔮 Future ideas

1 repository

Stars

TheAgentCompany / TheAgentCompany

An agent benchmark with tasks in a simulated software company.

Python 82 7 Updated Dec 20, 2024

fiatjaf / awesome-jq

A curated list of awesome jq tools and resources.

831 42 Updated Dec 14, 2024

DS4SD / docling

Get your documents ready for gen AI

Python 16,800 867 Updated Dec 19, 2024

bytedance / FullStackBench

Official repository for our paper "FullStack Bench: Evaluating LLMs as Full Stack Coders"

Python 51 3 Updated Dec 12, 2024

PurCL / LLMSCAN

Python 13 1 Updated Nov 23, 2024

coinse / autofl

Jupyter Notebook 17 9 Updated Aug 14, 2024

anthropics / prompt-eng-interactive-tutorial

Anthropic's Interactive Prompt Engineering Tutorial

Jupyter Notebook 1,978 209 Updated Jul 11, 2024

PurCL / LLMSAN

Forked from chengpeng-wang/LLMSAN

LLMSAN: Sanitizing Large Language Models in Bug Detection with Data-Flow

Java 1 Updated Oct 6, 2024

lm-sys / RouteLLM

A framework for serving and evaluating LLM routers - save LLM costs without compromising quality!

Python 3,390 253 Updated Aug 10, 2024

meta-llama / PurpleLlama

Set of tools to assess and improve LLM security.

Python 2,807 463 Updated Dec 20, 2024

Yunlongs / Goshawk

Goshawk is a static analysis tool to detect memory corruption bugs in C source code. It utilizes NLP to infer custom memory management functions and uses data flow analysis to abstract their behavi…

C++ 80 15 Updated Dec 18, 2023

dockur / windows

Windows inside a Docker container.

Shell 31,159 2,128 Updated Dec 21, 2024

squaresLab / LLMAO

Java 25 6 Updated Jan 27, 2024

adapter-hub / adapters

A Unified Library for Parameter-Efficient and Modular Transfer Learning

Jupyter Notebook 2,609 354 Updated Dec 25, 2024

OWASP-Benchmark / BenchmarkJava

OWASP Benchmark is a test suite designed to verify the speed and accuracy of software vulnerability detection tools. A fully runnable web app written in Java, it supports analysis by Static (SAST),…

Java 675 1,087 Updated Dec 16, 2024

aisec-dev / BenchmarkJava

Forked from OWASP-Benchmark/BenchmarkJava

A test suite designed to verify the speed and accuracy of software vulnerability detection tools

Java 1 Updated Jul 12, 2024

Python 1 Updated Oct 5, 2024