Publications
2026
- CGO ’26Thinking Fast and Correct: Automated Rewriting of Numerical Code through Compiler AugmentationSiyuan Brant Qian, Vimarsh Sathia, Ivan R. Ivanov, Jan Hückelheim, Paul Hovland, and William S. MosesIn 22nd ACM/IEEE International Symposium on Code Generation and Optimization, Jan 2026, Sydney, Australia.
Floating-point numbers are finite-precision approximations to real numbers and are ubiquitous in computer applications in nearly every field. Selecting the right floating-point representation that balances performance and numerical accuracy is a difficult task – one that has become even more critical as hardware trends toward high-performance, low-precision operations. Although the common wisdom around changing floating-point precision implies that accuracy and performance are inversely correlated, more advanced techniques can often circumvent this tradeoff. Applying complex numerical optimizations to real-world code, however, is an arduous engineering task that requires expertise in numerical analysis and performance engineering, and application-specific numerical context. While there is a plethora of existing tools that partially automate this process, they are limited in the scope of optimization techniques or still require substantial human intervention. We present Poseidon, a modular and extensible framework that fully automates floating-point optimizations for real-world applications within a production compiler. Our key insight is that a small surrogate profile often reveals sufficient numerical context to drive effective rewrites. Poseidon operates as a two-phase compiler: the first compilation instruments the program to capture numerical context; the second compilation consumes profiled data, generates and evaluates candidate rewrites, and solves for optimal performance/accuracy tradeoffs. Poseidon’s interoperability with standard compiler analyses and optimizations grants it analysis and optimization advantages unavailable to existing source- and binary-level approaches. On multiple large-scale applications, Poseidon leads to outsized benefits in performance without substantially changing accuracy, and outsized accuracy benefits without diminishing performance. On a quaternion differentiator, Poseidon enables a 1.46x speedup with a relative error of 10e-7. On DOE’s LULESH hydrodynamics application, Poseidon improves program accuracy to exactly match a 512-bit simulation run without substantially reducing performance.
2025
- Correctness ’25Data Race Detection through Vibe TranslationJan Hückelheim, Vimarsh Sathia, and Siyuan Brant QianIn 9th International Workshop on Software Correctness for HPC Applications, Jan 2025, St Louis, USA.
We propose a data race detection approach for code written in a source programming language, by means of AI-agent translation to a target language, followed by conventional tool-based detection in the target language. We evaluate this translate-then-check approach by translating the C/Fortran+OpenMP programs in DataRaceBench to the Go programming language, and using the Go data race detector to check for races. The translation is controlled through natural language prompts, similar to approaches popularized as vibe coding. Translate-then-check achieves 92.8% accuracy and 9 false negatives for the C programs in DataRaceBench, compared to 89.9% accuracy and 17 false negatives for Clang with ThreadSanitizer and Archer applied to the original C programs. We discuss the approach and its overall accuracy, and show examples where translate-then-check leads to false negatives or positives due to limitations of the Go data race checker or, in some cases, limitations of the translation.
2024
- LAFI ’24Abstract Interpretation for Automatic DifferentiationIn 10th Workshop on Languages for Inference at POPL, Jan 2024, London, UK.
Despite the fact that Automatic Differentiation (AD) underlies much of machine learning, scientific computing and graphics, static analyses of AD code has been limited. In this position paper, we argue why AD code needs precise, general and scalable abstract interpretation and why this abstract interpretation should leverage the specific structure inherent to AD. Lastly, we demonstrate a new use of abstract interpretation of AD to analyze Coordinate MLPs from Graphics, attaining orders of magnitude more precision than prior work.
2023
- OOPSLA ’23Synthesizing Precise Static Analyzers for Automatic DifferentiationIn 38th ACM SIGPLAN Conference on Object-Oriented Programming, Systems, Languages and Applications, Oct 2023, Cascais, Portugal.
We present Pasado, a technique for synthesizing precise static analyzers for Automatic Differentiation. Our technique allows one to automatically construct a static analyzer specialized for the Chain Rule, Product Rule, and Quotient Rule computations for Automatic Differentiation in a way that abstracts all of the nonlinear operations of each respective rule simultaneously. By directly synthesizing an abstract transformer for the composite expressions of these 3 most common rules of AD, we are able to obtain significant precision improvement compared to prior works which compose standard abstract transformers together suboptimally. We prove our synthesized static analyzers sound and additionally demonstrate the generality of our approach by instantiating these AD static analyzers with different nonlinear functions, different abstract domains (both intervals and zonotopes) and both forward-mode and reverse-mode AD. We evaluate Pasado on multiple case studies, namely soundly computing bounds on a neural network’s local Lipschitz constant, soundly bounding the sensitivities of financial models, certifying monotonicity, and lastly, bounding sensitivities of the solutions of differential equations from climate science and chemistry for verified ranges of initial conditions and parameters. The local Lipschitz constants computed by Pasado on our largest CNN are up to 2750× more precise compared to the existing state-of-the-art zonotope analysis. The bounds obtained on the sensitivities of the climate, chemical, and financial differential equation solutions are between 1.31 − 2.81× more precise (on average) compared to a state-of-the-art zonotope analysis.