Open Science @ MPI-SP

I lead my group with an Open Science and Reproducibility Policy, which includes publishing the experimental infrastructure, tools, data, and the scripts to produce tables and figures. Most of the published repositories continue to be actively maintained by the community and receive external contributions. Apart from this group-level policy, I am also actively involved in community efforts to promote preregistration and artifact evaluation as a way to improve the soundness and reproducibility of our empirical evaluations as well as the soundness of our peer review process.

For instance, I am leading a grassroots initiative to introduce a novel preregistration-based publication process for fuzzing research that consists of two main stages. In the first stage, the program committee (PC) evaluates all submissions based on (i) the significance and novelty of the hypotheses or techniques and (ii) the soundness and reproducibility of the methodology specified to validate the claims or hypotheses, but explicitly not based on the strength of the (preliminary) results. These draft registered reports are presented and improved at the FUZZING’22 workshop. After the workshop, the final versions of the registered reports are re-checked and approved by the PC. In the second stage, the PC and the Artifact Evaluation Committee (AEC) check whether the experimental methodology as laid out by the authors was correctly followed. I am excited that the outcome of this stage will be published in the ACM Transactions on Software Engineering and Methodology (TOSEM) via the Preregistration track (which I was invited to establish, assisting Cristian Cadar, and which I now head as Guest Editor-In-Chief).

News

Jun'25: Jing Liu becomes new MPI-SP Open Science Ambassador.
May'25: Organized a Sparkling Science seminar on "Making Open Science Work at MPI-SP".
Apr'25: Created a new MPI-SP Wiki website to discuss Open Science resources at MPG and MPI-SP.
Aug'23: Seongmin Lee becomes inaugural MPI-SP Open Science Ambassador.

Open Source Software and Open Data and Analysis

2026
FSE'26

🧑‍💻	Evaluating LLM-Based Regression Test Generation.
	ACM International Conference on the Foundations of Software Engineering (FSE'26). 23pp.
	🧑‍💻 https://github.com/niMgnoeSeeL/cleverest

FSE'26

🧑‍💻	In Bugs We Trust? On Measuring the Randomness of a Fuzzer Benchmarking Outcome.
	ACM International Conference on the Foundations of Software Engineering (FSE'26). 21pp.
	🧑‍💻 https://github.com/ardier/in_bugs_we_trust/

ICSE'26

🧑‍💻	Scaling Security Testing by Addressing the Reachability Gap.
	48th IEEE/ACM International Conference on Software Engineering (ICSE'26). 11pp.
	🧑‍💻 https://github.com/GPSapia/ReachabilityAgent_ICSE

SP'26

🧑‍💻	Cottontail: LLM-Driven Concolic Execution for Highly Structured Test Input Generation.
	47th IEEE Symposium on Security and Privacy (SP'26). 18pp.
	🧑‍💻 https://github.com/haoxintu/cottontail

NDSS'26

🧑‍💻	Chasing Shadows: Pitfalls in LLM Security Research.
	The Network and Distributed System Security Symposium (NDSS'26). 15pp.
	🧑‍💻 https://github.com/Dormant-Neurons/llm-pitfalls

TOSEM

🧑‍💻	Vital: Vulnerability-Oriented Symbolic Execution via Type-Unsafe Pointer-Guided Monte Carlo Tree Search.
	ACM Transactions on Software Engineering and Methodology. 24pp.
	🧑‍💻 https://github.com/haoxintu/Vital-SE

2025
ISSTA'25

🧑‍💻	Top Score on the Wrong Exam: On Benchmarking in Machine Learning for Vulnerability Detection.
	34th ACM/SIGSOFT International Symposium on Software Testing and Analysis (ISSTA'25). 22pp.
	🧑‍💻 https://github.com/niklasrisse/TopScoreWrongExam

TSE

🧑‍💻	AFLNet Five Years Later: On Coverage-Guided Protocol Fuzzing.
	IEEE Transactions on Software Engineering. 14pp.
	🧑‍💻 https://github.com/aflnet/aflnet

ICLR'25

🧑‍💻	How Much is Unseen Depends Chiefly on Information About the Seen.
	13th International Conference on Learning Representations (ICLR'25). 22pp.
	🧑‍💻 https://github.com/niMgnoeSeeL/UnseenGA

ICSE'25

🧑‍💻	Invivo Fuzzing by Amplifying Actual Executions.
	47th International Conference on Software Engineering (ICSE'25). 13pp.
	🧑‍💻 https://github.com/OctavioGalland/afllive

ICSE'25

🧑‍💻	Accounting for Missing Events in Statistical Information Leakage Analysis.
	47th International Conference on Software Engineering (ICSE'25). 12pp.
	🧑‍💻 https://github.com/niMgnoeSeeL/ChaoMI

FSE'25

🧑‍💻	MendelFuzz: The Return of the Deterministic Stage.
	ACM International Conference on the Foundations of Software Engineering (FSE'25). 21pp.
	🧑‍💻 http://github.com/HexHive/MendelFuzz-Artifact

2024
TOSEM

🧑‍💻	On the Impact of Lower Recall and Precision in Defect Prediction for Guiding Search-based Software Testing.
	ACM Transactions on Software Engineering and Methodology 33(6). 27pp.
	🧑‍💻 https://doi.org/10.6084/m9.figshare.16564146

USENIX Sec'24

🧑‍💻	Uncovering the Limits of Machine Learning for Automatic Vulnerability Detection.
	33rd USENIX Security Symposium (USENIX Sec'24). 19pp.
	🧑‍💻 https://github.com/niklasrisse/USENIX_2024
	https://github.com/niklasrisse/VPP

CCS'24

🧑‍💻	Testing Side-Channel Security of Cryptographic Implementations Against Future Microarchitectures.
	31st ACM Conference on Computer and Communications Security (CCS'24). 16pp.
	🧑‍💻 https://github.com/hw-sw-contracts/leakage-model-testing

ICSE'24

🧑‍💻	Extrapolating Coverage Rate in Greybox Fuzzing.
	46th International Conference on Software Engineering (ICSE'24). 13pp.
	🧑‍💻 https://doi.org/10.5281/zenodo.10460578

NDSS'24

🧑‍💻	Large Language Model guided Protocol Fuzzing.
	Network and Distributed System Security Symposium (NDSS'24). 15pp.
	🧑‍💻 https://zenodo.org/doi/10.5281/zenodo.8373804
	https://github.com/ChatAFLndss/ChatAFL

TSE

🧑‍💻	Human-In-The-Loop Automatic Program Repair.
	IEEE Transactions on Software Engineering. 24pp.
	🧑‍💻 https://github.com/charakageethal/learn2fix-journal-ext/

2023
CACM

🧑‍💻	Boosting Fuzzer Efficiency: An Information Theoretic Perspective.
	Communications of the ACM 66(11). 9pp.
	🧑‍💻 https://doi.org/10.6084/m9.figshare.12415622

ESEC / FSE'23

🧑‍💻	Statistical Reachability Analysis.
	31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC / FSE'23). 12pp.
	🧑‍💻 https://doi.org/10.5281/zenodo.8267404

ASE'23

🧑‍💻	Precise Data-Driven Approximation for Program Analysis via Fuzzing.
	38th IEEE/ACM International Conference on Automated Software Engineering (ASE'23). 12pp.
	🧑‍💻 https://doi.org/10.5281/zenodo.7902214

ICSE'23

🧑‍💻	Reachable Coverage: Estimating Saturation in Fuzzing.
	45th International Conference on Software Engineering (ICSE'23). 13pp.
	🧑‍💻 https://doi.org/10.5281/zenodo.7571359

ICSE'23

🧑‍💻	Evaluating the Impact of Experimental Assumptions in Automated Fault Localization.
	ACM/IEEE 45th International Conference on Software Engineering (ICSE'23). 13pp.
	🧑‍💻 https://figshare.com/articles/conference_contribution/Debugging_Assumptions_Artifact/21786743
	🔗 https://debugging-assumptions.github.io/

ISSTA'23

🧑‍💻	Green Fuzzing: A Saturation-based Stopping Criterion using Vulnerability Prediction.
	32nd ACM/SIGSOFT International Symposium on Software Testing and Analysis (ISSTA'23). 13pp.
	🧑‍💻 https://doi.org/10.5281/zenodo.7944722
	https://github.com/tum-i4/green-fuzzing-artifacts/tree/1.0.0

SBFT'23

🧑‍💻	Continuous Fuzzing: A Study of the Effectiveness and Scalability of Fuzzing in CI/CD Pipelines.
	2023 IEEE/ACM International Workshop on Search-Based and Fuzz Testing (SBFT'23). 13pp.
	🧑‍💻 https://github.com/kloostert/CICDFuzzBench

TSE'23

🧑‍💻	An Experimental Assessment of Using Theoretical Defect Predictors to Guide Search-based Software Testing.
	IEEE Transactions on Software Engineering.
	🧑‍💻 https://github.com/premosa-sbst

2022
ICSE'22

🧑‍💻	On the Reliability of Coverage-based Fuzzer Benchmarking.
	44th International Conference on Software Engineering (ICSE'22). 13pp.
	🧑‍💻 https://doi.org/10.5281/zenodo.6045830
	https://github.com/icse22data/

USENIX SEC'22

🧑‍💻	Stateful Greybox Fuzzing.
	31st USENIX Security Symposium (USENIX SEC'22). 18pp.
	🧑‍💻 https://github.com/bajinsheng/SGFuzz

ISSTA'22

🧑‍💻	Human-in-the-Loop Oracle Learning for Semantic Bugs in String Processing Programs.
	31st ACM/SIGSOFT International Symposium on Software Testing and Analysis (ISSTA'22). 12pp.
	🧑‍💻 https://doi.org/10.5281/zenodo.6530839
	https://github.com/charakageethal/grammar2fix

2021
CCS'21

🧑‍💻	Regression Greybox Fuzzing.
	28th ACM Conference on Computer and Communications Security (CCS'21). 12pp.
	🧑‍💻 https://github.com/aflchurn/aflchurn
	https://www.kaggle.com/marcelbhme/aflchurn-ccs21/code

EMSE'21

🧑‍💻	Locating faults with program slicing: an empirical analysis.
	Empirical Software Engineering 26(3).
	🧑‍💻 https://doi.org/10.6084/m9.figshare.13369400.v1

ESEC / FSE'21

🧑‍💻	Estimating Residual Risk in Greybox Fuzzing.
	15th Joint meeting of the European Software Engineering Conference and the ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC / FSE'21). 12pp.
	🧑‍💻 https://doi.org/10.5281/zenodo.4970239
	https://github.com/Adaptive-Bias/fse21_paper270