Tarun Suresh

I'm a third-year undergraduate student in the Computer Science Department at the University of Illinois Urbana-Champaign, advised by Prof. Sasa Misailovic, Prof. Gagandeep Singh, and Prof. Heng Ji.

Email  /  LinkedIn  /  Google Scholar

Research

I am deeply interested in the intersection of deep learning, systems, and formal methods/programming languages. My research goal is to develop principled techniques that enhance the effectiveness, scalability, and reliability of AI systems, particularly in software development and safety-critical domains.

I'm currently researching:

- Language Models for Program Synthesis, Math, Logic, and Reasoning Tasks

- Embedding Models, Re-Ranking, and Retrieval-Augmented Generation

- Inference-time Scaling of LLMs

- Formal Verification and Robustness of Deep Neural Networks

Papers
CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
Tarun Suresh*, Revanth Gangi Reddy*, Yifei Xu, Zach Nussbaum, Andriy Mulyar, Brandon Duderstadt, Heng Ji
ICLR 2025
[Blog Post][Paper][Code]

We introduce CoRNStack, a large-scale, high-quality contrastive training dataset for code that spans multiple programming languages. We demonstrate that contrastive training of embedding models and LLM re-rankers on CoRNStack leads to state-of-the-art performance across a variety of code retrieval tasks. Notably, our lightweight retriever + re-ranker achieves state-of-the-art repository-level bug localization on SWE-Bench, outperforming leading automated software development frameworks.
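
As an illustration of the retriever + re-ranker pipeline, here is a minimal sketch of two-stage code retrieval. The `embed` and `cross_score` functions are hypothetical placeholders standing in for a trained bi-encoder retriever and an LLM re-ranker, not the CoRNStack-trained models.

```python
# Minimal sketch of a two-stage retrieve-then-rerank pipeline.
# `embed` and `cross_score` are hypothetical placeholders, not the
# CoRNStack-trained retriever/re-ranker.
import numpy as np

def embed(texts):
    # Placeholder bi-encoder: returns one unit-norm vector per text.
    rng = np.random.default_rng(0)
    vecs = rng.normal(size=(len(texts), 256))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def cross_score(query, doc):
    # Placeholder re-ranker: a real system scores (query, doc) jointly.
    return float(len(set(query.split()) & set(doc.split())))

def retrieve_and_rerank(query, corpus, k=10):
    # Stage 1: dense retrieval by cosine similarity (embeddings are unit-norm).
    q_vec = embed([query])[0]
    doc_vecs = embed(corpus)
    sims = doc_vecs @ q_vec
    top_k = np.argsort(-sims)[:k]
    # Stage 2: re-rank only the top-k candidates with the stronger scorer.
    reranked = sorted(top_k, key=lambda i: cross_score(query, corpus[i]),
                      reverse=True)
    return [corpus[i] for i in reranked]
```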

IterGen: Iterative Structured LLM Generation
Shubham Ugare, Rohan Gumaste, Tarun Suresh, Gagandeep Singh, Sasa Misailovic
ICLR 2025
[Paper][PDF]

IterGen is a framework that lets users define and enforce semantic constraints on LLM output: during generation, it efficiently backtracks over grammar symbols and re-samples until the constraints are satisfied. IterGen improves the accuracy of LLM-generated SQL by 18% and eliminates privacy leakage in LLM output. A minimal sketch of this generate-check-backtrack loop is shown below.
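
In the sketch, `generate_next_symbol` and `violates_constraint` are hypothetical stand-ins for a grammar-aware decoder and user-defined semantic checks; this is not the IterGen API.

```python
# Minimal sketch of constraint-guided generation with backtracking over
# grammar symbols. `generate_next_symbol` and `violates_constraint` are
# hypothetical stand-ins, not the IterGen API.
def constrained_generate(prompt, generate_next_symbol, violates_constraint,
                         max_symbols=100, max_retries=5):
    symbols = []          # completed grammar symbols (e.g., SQL clauses)
    retries = 0
    while len(symbols) < max_symbols:
        candidate = generate_next_symbol(prompt, symbols)
        if candidate is None:           # decoder signals end of generation
            break
        if violates_constraint(symbols + [candidate]):
            # Backtrack: discard the offending symbol and re-sample it,
            # rather than restarting the whole generation.
            retries += 1
            if retries > max_retries:
                raise RuntimeError("could not satisfy constraints")
            continue
        symbols.append(candidate)
        retries = 0
    return "".join(symbols)
```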

Relational Verification Leaps Forward with RABBit
Tarun Suresh*, Debangshu Banerjee*, Gagandeep Singh
NeurIPS 2024
[Paper][PDF][Code]

We introduce RABBit, a scalable Branch-and-Bound-based verifier for precisely verifying hyperproperties, such as monotonicity and fairness, that are defined over multiple executions of Deep Neural Networks.

Tamper-Resistant Safeguards for Open-Weight LLMs
Rishub Tamirisa, Bhrugu Bharathi, Long Phan, Andy Zhou, Alice Gatti, Tarun Suresh, Maxwell Lin, Justin Wang, Rowan Wang, Ron Arel, Andy Zou, Dawn Song, Bo Li, Dan Hendrycks, Mantas Mazeika
ICLR 2025
[Paper][PDF]

We develop a method, called TAR, for building tamper-resistant safeguards into open-weight LLMs such that adversaries cannot remove the safeguards even after thousands of steps of fine-tuning. In extensive evaluations and red teaming analyses, we find that our method greatly improves tamper-resistance while preserving benign capabilities.

Incremental Randomized Smoothing Certification
Shubham Ugare, Tarun Suresh, Debangshu Banerjee, Sasa Misailovic, Gagandeep Singh
ICLR 2024
[Paper][PDF][Code]

We present IRS, the first probabilistic approach for robustness re-certification of Deep Neural Networks after model compression (pruning, quantization) or fine-tuning, delivering up to 5x faster certification.
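
For context, below is a minimal sketch of the standard randomized smoothing certificate (in the style of Cohen et al.) that IRS accelerates when re-certifying a modified network. It is not the IRS algorithm itself; `classify` is a hypothetical base classifier, and the confidence bound is simplified.

```python
# Minimal sketch of a randomized smoothing certificate (Cohen et al. style).
# IRS reuses information from the original network's certification to speed up
# re-certifying a pruned/quantized/fine-tuned network; that reuse is not shown.
# `classify` is a hypothetical base classifier returning a class index.
import numpy as np
from scipy.stats import norm

def certify(classify, x, sigma=0.25, n=1000, alpha=0.001):
    # Sample Gaussian perturbations of x and record the predicted classes.
    preds = np.array([classify(x + sigma * np.random.randn(*x.shape))
                      for _ in range(n)])
    top_class = np.bincount(preds).argmax()
    p_hat = (preds == top_class).mean()
    # Simple Hoeffding lower confidence bound on the top-class probability
    # (the original method uses a tighter Clopper-Pearson bound).
    p_lower = p_hat - np.sqrt(np.log(1.0 / alpha) / (2 * n))
    if p_lower <= 0.5:
        return None, 0.0                          # abstain: no certificate
    return top_class, sigma * norm.ppf(p_lower)   # certified L2 radius
```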

Is Watermarking LLM Generated Code Robust?
Tarun Suresh, Shubham Ugare, Gagandeep Singh, Sasa Misailovic
Tiny Papers Track @ ICLR 2024 (Oral Presentation)
[Paper][PDF][Code]

We present the first study of the robustness of existing watermarking techniques on code generated by large language models and propose a parsing-based algorithm that easily removes these watermarks via semantics-preserving transformations of the code.
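
As a concrete example of the kind of semantics-preserving transformation involved (an illustration, not the paper's full algorithm), renaming program-defined identifiers with Python's `ast` module changes the token sequence while leaving behavior unchanged, which is exactly the kind of rewrite a token-level watermark must survive.

```python
# Example of a semantics-preserving code transformation: renaming variables
# via Python's ast module. This illustrates the general idea only; it is not
# the parsing-based watermark-removal algorithm from the paper.
import ast

def rename_locals(src):
    tree = ast.parse(src)
    # Pass 1: collect identifiers that the program itself assigns to.
    assigned = {n.id for n in ast.walk(tree)
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}
    mapping = {name: f"v{i}" for i, name in enumerate(sorted(assigned))}

    class Renamer(ast.NodeTransformer):
        def visit_Name(self, node):
            # Pass 2: rename only program-defined names; leave builtins alone.
            if node.id in mapping:
                node.id = mapping[node.id]
            return node

    return ast.unparse(Renamer().visit(tree))

src = "total = 0\nfor item in range(10):\n    total += item\nprint(total)"
print(rename_locals(src))  # same behavior, different identifiers (Python 3.9+)
```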

SynCode: LLM Generation with Grammar Augmentation
Shubham Ugare, Tarun Suresh, Hangoo Kang, Sasa Misailovic, Gagandeep Singh
Under Review
[Paper][PDF][Code]

SynCode is a novel framework for grammar-guided generation with Large Language Models (LLMs) that scales to general-purpose programming languages and comes with soundness and completeness guarantees. SynCode reduces syntax errors by 96-100% across various languages (JSON, Python, Go) and enables 1.5x-10x faster LLM inference than existing approaches.
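
A minimal sketch of the core idea behind grammar-guided decoding follows: at each step, tokens that would take the output off every valid path in the grammar are masked out before sampling. Here `next_token_logits`, `vocab`, and `is_valid_prefix` are hypothetical stand-ins, not the SynCode implementation (which operates on the grammar's terminals and proves soundness and completeness).

```python
# Minimal sketch of grammar-constrained decoding: mask tokens that cannot
# extend to a valid string in the grammar, then sample from the rest.
# `next_token_logits`, `vocab`, and `is_valid_prefix` are hypothetical
# stand-ins, not the SynCode implementation.
import math, random

def grammar_constrained_decode(next_token_logits, vocab, is_valid_prefix,
                               eos_token, max_tokens=256):
    output = ""
    for _ in range(max_tokens):
        logits = next_token_logits(output)      # one logit per vocab entry
        # Keep only tokens whose concatenation is still a valid prefix.
        allowed = [i for i, tok in enumerate(vocab)
                   if is_valid_prefix(output + tok)]
        if not allowed:
            break
        # Softmax over the allowed tokens only, then sample.
        weights = [math.exp(logits[i]) for i in allowed]
        choice = random.choices(allowed, weights=weights, k=1)[0]
        if vocab[choice] == eos_token:
            break
        output += vocab[choice]
    return output
```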

Two-Step Offline Preference-Based Reinforcement Learning with Constrained Actions
Yinglun Xu, Tarun Suresh, Rohan Gumaste, David Zhu, Ruirui Li, Zhengyang Wang, Haoming Jiang, Xianfeng Tang, Qingyu Yin, Monica Xiao Cheng, Qi Zheng, Chao Zhang, Gagandeep Singh
Under Review
[Paper][PDF]

To address the risk of reward hacking and the complexity of reinforcement learning in preference-based reinforcement learning, we develop a novel two-step learning method called PRC. The high-level idea is to restrict the reinforcement learning agent to a constrained action space that excludes out-of-distribution state-action pairs, which are unreliable and increase the complexity of the learning problem.

Towards Continuous Verification of DNNs
Shubham Ugare, Debangshu Banerjee, Tarun Suresh, Gagandeep Singh, Sasa Misailovic
WFML @ ICML 2023
[Paper][PDF][Code]

We propose efficient deterministic formal verifiers to speed up DNN re-verification after pruning, quantization, or fine-tuning.



Template from here