Tarun Suresh

I'm a third-year undergraduate student in the Computer Science Department at the University of Illinois Urbana-Champaign, advised by Prof. Sasa Misailovic, Prof. Gagandeep Singh, and Prof. Heng Ji.

Email  /  LinkedIn  /  Google Scholar

Research

My research lies at the intersection of deep learning, formal methods, and programming languages. I am broadly interested in improving the capabilities of AI systems on challenging, real-world coding, math, and logical reasoning tasks.

I'm currently researching:

- Deep Learning for Program Synthesis and Code Semantics Understanding

- Tool Use with Language Models and Language Model-driven Software Engineering

- LLM Post-Training (Preference and Reinforcement Finetuning) and Inference (Decoding, Reasoning, Search, Planning) Algorithms for Code

- AI for Formal Methods and Formal Methods for AI

Papers
CRANE: Reasoning with Constrained LLM Generation
Debangshu Banerjee*, Tarun Suresh*, Shubham Ugare, Sasa Misailovic, Gagandeep Singh
VerifAI @ ICLR 2025
[Paper]

We show theoretically that constraining LLM generation to a fixed output grammar can reduce reasoning capabilities, and that augmenting the grammar with additional production rules for chain-of-thought reasoning steps preserves LLM expressivity. Building on these theoretical results, we introduce CRANE, a constrained decoding algorithm that enforces the grammar only when generating final answers and intermediate expressions, while keeping reasoning steps unconstrained. CRANE boosts accuracy by up to 10 percentage points on first-order logic generation and other symbolic reasoning tasks.
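
For intuition, here is a minimal, hypothetical sketch of CRANE-style decoding; `next_token_logits`, `grammar_mask`, and the `<<`/`>>` delimiters are illustrative stand-ins, not the released implementation:

```python
# Hypothetical sketch: the grammar is enforced only inside delimited answer /
# expression spans; chain-of-thought text in between is generated unconstrained.
def crane_decode(prompt, next_token_logits, grammar_mask, detokenize,
                 begin="<<", end=">>", max_tokens=256):
    tokens, constrained = [], False
    for _ in range(max_tokens):
        logits = next_token_logits(prompt, tokens)   # LLM forward pass: {token: score}
        if constrained:
            logits = grammar_mask(tokens, logits)    # drop grammar-violating tokens
        tok = max(logits, key=logits.get)            # greedy decoding, for brevity
        tokens.append(tok)
        text = detokenize(tokens)
        if text.endswith(begin):
            constrained = True                       # entering an answer/expression span
        elif text.endswith(end):
            constrained = False                      # back to free-form reasoning
    return detokenize(tokens)
```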

CoRNStack: High-Quality Contrastive Data for Better Code Retrieval and Reranking
Tarun Suresh*, Revanth Gangi Reddy*, Yifei Xu, Zach Nussbaum, Andriy Mulyar, Brandon Duderstadt, Heng Ji
ICLR 2025
[Blog Post][Paper][Code]

We introduce CoRNStack, a large-scale, high-quality contrastive training dataset for code that spans multiple programming languages. We demonstrate that contrastive training of embedding models and LLM re-rankers on CoRNStack leads to state-of-the-art performance across a variety of code retrieval tasks. Notably, our lightweight retriever + re-ranker pipeline achieves state-of-the-art repository-level bug localization on SWE-Bench, outperforming top automated software development frameworks.
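
The objective in this kind of contrastive training is standard; below is a sketch of an InfoNCE loss with in-batch negatives, illustrating the training signal rather than the CoRNStack pipeline itself:

```python
import torch
import torch.nn.functional as F

def info_nce(query_emb, code_emb, temperature=0.05):
    """query_emb, code_emb: (batch, dim) L2-normalized embeddings of matched
    (query, code) pairs; every other row in the batch serves as a negative."""
    sims = query_emb @ code_emb.T / temperature           # (batch, batch) similarities
    labels = torch.arange(sims.size(0), device=sims.device)
    return F.cross_entropy(sims, labels)                  # diagonal pairs are positives
```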

IterGen: Iterative Semantic-aware Structured LLM Generation with Backtracking
Shubham Ugare, Rohan Gumaste, Tarun Suresh, Gagandeep Singh, Sasa Misailovic
ICLR 2025
[Paper][PDF]

IterGen is a decoding algorithm that leverages grammar-based backtracking and selective rejection sampling to efficiently enforce user-defined semantic constraints on LLM output. IterGen improves the accuracy of LLM-generated SQL by 18% and eliminates privacy leakage in LLM outputs.
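
A minimal sketch of the decoding loop this describes; `gen_next_symbol` and `satisfies` are assumed helpers standing in for IterGen's grammar and constraint machinery:

```python
# Hypothetical sketch: generate one grammar unit at a time, check user-defined
# semantic constraints, and backtrack to the unit boundary to resample on failure.
def itergen_decode(prompt, gen_next_symbol, satisfies, max_retries=8):
    symbols = []                                     # completed units (e.g., SQL clauses)
    while True:
        cand = gen_next_symbol(prompt, symbols)      # sample the next grammar unit
        if cand is None:
            return "".join(symbols)                  # generation finished
        for _ in range(max_retries):
            if satisfies(symbols + [cand]):          # semantic check (e.g., valid column)
                break
            cand = gen_next_symbol(prompt, symbols)  # reject and resample the unit
        symbols.append(cand)
```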

Relational Verification Leaps Forward with RABBit
Tarun Suresh*, Debangshu Banerjee*, Gagandeep Singh
NeurIPS 2024
[Paper][PDF][Code]

We introduce RABBit, a GPU-accelerated, branch-and-bound-based verifier for formally verifying hyperproperties defined over multiple executions of deep neural networks, such as those arising in ensembles, conformal prediction, and robustness analysis. RABBit improves verified accuracy by 8% over state-of-the-art verifiers within the same compute budget.
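
For context, this is the general shape of a branch-and-bound verification loop (the style of algorithm RABBit accelerates on GPUs); `lower_bound` and `split` are assumed helpers, not the RABBit API:

```python
import heapq
import itertools

def branch_and_bound(region, lower_bound, split, budget=10_000):
    """Try to prove property > 0 over `region` by recursively tightening bounds."""
    tie = itertools.count()                          # tie-breaker for the heap
    heap = [(lower_bound(region), next(tie), region)]
    splits = 0
    while heap:
        lb, _, reg = heapq.heappop(heap)
        if lb > 0:
            continue                                 # subregion verified, discard it
        if splits >= budget:
            return False                             # inconclusive within the budget
        splits += 1
        for sub in split(reg):                       # branch, e.g., on an unstable ReLU
            heapq.heappush(heap, (lower_bound(sub), next(tie), sub))
    return True                                      # property holds on all subregions
```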

Tamper-Resistant Safeguards for Open-Weight LLMs
Rishub Tamirisa, Bhrugu Bharathi, Long Phan, Andy Zhou, Alice Gatti, Tarun Suresh, Maxwell Lin, Justin Wang, Rowan Wang, Ron Arel, Andy Zou, Dawn Song, Bo Li, Dan Hendrycks, Mantas Mazeika
ICLR 2025
[Paper][PDF]

We develop a method, called TAR, for building tamper-resistant safeguards into open-weight LLMs such that adversaries cannot remove the safeguards even after thousands of steps of fine-tuning. In extensive evaluations and red teaming analyses, we find that our method greatly improves tamper-resistance while preserving benign capabilities.
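
At a very high level, the structure is adversarial: an inner loop simulates fine-tuning attacks, and an outer objective penalizes the model if the safeguard fails afterward. The sketch below illustrates only that structure with illustrative loss callables; it is not the TAR algorithm, which propagates the outer signal with meta-learning-style approximations:

```python
import copy
import torch

def simulate_attack(model, attack_batches, capability_loss, lr=1e-4):
    adversary = copy.deepcopy(model)                 # the attacker's fine-tuned copy
    opt = torch.optim.SGD(adversary.parameters(), lr=lr)
    for batch in attack_batches:                     # inner loop: fine-tune to restore
        opt.zero_grad()                              # the safeguarded capability
        capability_loss(adversary, batch).backward()
        opt.step()
    return adversary

def outer_objective(model, adversary, retain_batch, safeguard_loss, benign_loss):
    # the safeguard must survive the attack AND benign capabilities must remain
    return safeguard_loss(adversary) + benign_loss(model, retain_batch)
```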

Incremental Randomized Smoothing Certification
Shubham Ugare, Tarun Suresh, Debangshu Banerjee, Sasa Misailovic, Gagandeep Singh
ICLR 2024
[Paper][PDF][Code]

We present IRS, the first probabilistic approach to incremental certification, which re-certifies the robustness of deep neural networks 5x faster after model compression (pruning, quantization) or fine-tuning.
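
For reference, the expensive procedure that IRS makes incremental is standard randomized smoothing certification; below is a minimal Cohen et al.-style sketch (not the IRS code), where `f` returns integer class labels:

```python
import numpy as np
from scipy.stats import norm, binomtest

def certify(f, x, sigma=0.25, n=1000, alpha=0.001):
    """Certified L2 radius for the smoothed classifier argmax_c P(f(x + noise) = c)."""
    noise = np.random.randn(n, *x.shape) * sigma
    preds = np.array([f(x + eps) for eps in noise])        # sample the base classifier
    top = np.bincount(preds).argmax()                      # majority class
    k = int((preds == top).sum())
    p_lo = binomtest(k, n).proportion_ci(1 - alpha).low    # Clopper-Pearson lower bound
    if p_lo <= 0.5:
        return None                                        # abstain: cannot certify
    return top, sigma * norm.ppf(p_lo)                     # class and certified radius
```

The incremental step reuses information from the original network's certification, so the modified network needs far fewer fresh samples than re-running this procedure from scratch.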

Is Watermarking LLM Generated Code Robust?
Tarun Suresh, Shubham Ugare, Gagandeep Singh, Sasa Misailovic
Tiny Papers @ ICLR 2024 (Oral Presentation)
[Paper][PDF][Code]

We present the first study of the robustness of existing watermarking techniques on code generated by large language models and propose a parsing-based algorithm that removes these watermarks via semantics-preserving transformations of the code.
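
A concrete example of such a transformation: parsing a program and consistently renaming identifiers changes the token sequence (and thus the watermark signal) without changing behavior. A naive sketch using Python's `ast` module (a real tool would skip builtins, imports, and attributes):

```python
import ast

class Renamer(ast.NodeTransformer):
    """Naively rename every identifier to v0, v1, ... (illustration only)."""
    def __init__(self):
        self.mapping = {}
    def visit_Name(self, node):
        new = self.mapping.setdefault(node.id, f"v{len(self.mapping)}")
        return ast.copy_location(ast.Name(id=new, ctx=node.ctx), node)

code = "total = 0\nfor item in [1, 2, 3]:\n    total = total + item"
print(ast.unparse(Renamer().visit(ast.parse(code))))
# -> v0 = 0 / for v1 in [1, 2, 3]: v0 = v0 + v1
```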

SynCode: LLM Generation with Grammar Augmentation
Shubham Ugare, Tarun Suresh, Hangoo Kang, Sasa Misailovic, Gagandeep Singh
Under Review
[Paper][PDF][Code]

SynCode is a novel framework for grammar-guided generation with large language models (LLMs) that scales to general-purpose programming languages and provides soundness and completeness guarantees. SynCode reduces syntax errors by 96-100% across various languages (JSON, Python, Go) and enables 1.5x-10x faster LLM inference than existing approaches.
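
The core mechanism in any grammar-guided decoder is per-step token masking; a hypothetical sketch, where `is_valid_prefix` stands in for SynCode's precomputed DFA-based token masks:

```python
import math

def grammar_guided_step(logits, generated, vocab, is_valid_prefix):
    """logits: {token_id: score}; vocab: {token_id: token_string}."""
    masked = dict(logits)
    for tok_id, tok_str in vocab.items():
        if not is_valid_prefix(generated + tok_str):   # token would break the grammar
            masked[tok_id] = -math.inf                 # remove it from the distribution
    return masked                                      # sample or argmax from the rest
```

Checking every vocabulary token against the grammar at every step is the expensive part; making that mask computation fast while keeping it sound and complete is what lets the approach scale.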

Two-Step Offline Preference-Based Reinforcement Learning with Constrained Actions
Yinglun Xu, Tarun Suresh, Rohan Gumaste, David Zhu, Ruirui Li, Zhengyang Wang, Haoming Jiang, Xianfeng Tang, Qingyu Yin, Monica Xiao Cheng, Qi Zheng, Chao Zhang, Gagandeep Singh
Under Review
[Paper][PDF]

To address the risk of reward hacking and the complexity of reinforcement learning in preference-based RL, we develop a novel two-step learning method called PRC. The high-level idea is to restrict the reinforcement learning agent to a constrained action space that excludes out-of-distribution state-actions, which are unreliable and increase the complexity of the reinforcement learning problem.
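
A hypothetical sketch of that two-step structure (function names are illustrative, not the PRC implementation):

```python
def train_prc(pref_pairs, dataset, fit_reward, offline_rl):
    # Step 1: learn a reward model from (better_trajectory, worse_trajectory) pairs.
    reward_model = fit_reward(pref_pairs)
    # Step 2: run offline RL over a constrained action space: only state-actions
    # observed in the dataset are allowed, excluding out-of-distribution actions
    # where the learned reward is unreliable (mitigating reward hacking).
    seen = {(s, a) for (s, a, *_) in dataset}
    def constrained_actions(state, candidates):
        return [a for a in candidates if (state, a) in seen]
    return offline_rl(dataset, reward_model, constrained_actions)
```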

Towards Continuous Verification of DNNs
Shubham Ugare, Debangshu Banerjee, Tarun Suresh, Gagandeep Singh, Sasa Misailovic
WFML @ ICML 2023
[Paper][PDF][Code]

We propose efficient deterministic formal verifiers to speed up DNN re-verification after pruning, quantization, or fine-tuning.



Template from here