Hi, I'm John.

I’m a first-year ECE PhD student at the University of Texas at Austin, co-advised by Dr. Haris Vikalo and Dr. Atlas Wang.

My research focuses on interpretability and AI safety, with additional interests in healthcare and computational biology. Lately I’ve been working on activation engineering: lightweight interventions on model internals that make LLMs more reliable. An early version of this work appears in the updated preprint below.

If you would like to connect, please email me using the contact button above.

GitHub Scholar Twitter

Recent Publications.

Granularity in submission

When Is Rank-1 Steering Cheap? Geometry, Granularity, and Budgeted Search

John T. Robertson, Jianing Zhu, Haris Vikalo, Zhangyang Wang

We formalize rank-1 activation steering as a budget-constrained search over intervention layer and coefficient, and introduce concept granularity: a measure of directional heterogeneity that predicts how hard a concept is to steer. The resulting GRACE workflow enables cheaper, more reliable interventions on LLMs. In submission, 2026.

Paper Site Code

AgingBench in submission

AgingBench: Long-Lived AI Agents Age Too, They Quietly Decay After Deployment

Jianing Zhu, Yeonju Ro, John T. Robertson, Kevin Wang, Junbo Li, Haris Vikalo, Aditya Akella, Zhangyang Wang

We introduce a longitudinal reliability benchmark for memory-enabled LLM agents, organized around four aging mechanisms: compression, interference, revision, and maintenance. Across ~400 runs (including Claude Code) over 14 models, 7 scenarios, and horizons of 8 to 200 sessions, we find that aging is multi-dimensional and often invisible to standard snapshot evaluation. In submission, 2026.

Paper Site Code

NextVir PLOS Comp Bio 2025

NextVir: Enabling Classification of Tumor-Causing Viruses with Genomic Foundation Models

John T. Robertson, Shorya Consul, Haris Vikalo

We adapt genomic foundation models with LoRA fine-tuning for viral mixture separation, achieving state-of-the-art oncoviral DNA classification. PLOS Computational Biology, 2025.

Paper Code

All publications

Selected Experience.

Course Assistant, Probability and Random Processes

2026

University of Texas at Austin (Instructor: Dr. Vivek Telang)

TA for Probability and Random Processes, taught at J. F. Oberlin University in Shinjuku, Japan.

Graduate Researcher

2025 to Present

University of Texas at Austin (Advised by Dr. Haris Vikalo & Dr. Atlas Wang)

Developing interpretable machine learning methods for AI safety, with a focus on activation engineering and applications in computational biology / healthcare.
Leading several early-stage projects in activation steering.

AI Research Intern

2024

Kilby Labs, Texas Instruments (Advised by Dr. Arthur Redfern)

Sole undergraduate intern; developed two patent-pending methods for efficient deep learning on edge devices.
TIedNet: a CNN architecture using shared weights and LoRA-like perturbations for memory-efficient image classification.
Conditional PTQ: a method for post-training static quantization that predicts optimal scales per sample.

Projects.

This section is under construction. Detailed project pages are coming soon.

TWIIRL: Token-Wise Interpretable Interventions via Reinforcement Learning

We reformulate activation steering as a token-level decision problem: a small GRU controller emits per-token coefficients on a fixed diffmeans direction, trained via offline preference-based RL with an explicit KL trust region. On Gemma 2 9B, TWIIRL strictly dominates fixed-coefficient steering on the concept-coherence Pareto frontier at less than 0.01% per-token overhead.
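As a rough illustration of the per-token steering step only (not the GRU controller or its RL training), the sketch below adds a fixed unit direction to each token's hidden state, scaled by that token's coefficient. The function name `steer_per_token`, the list-of-lists layout, and the toy numbers are assumptions made for this example; in TWIIRL the coefficients would come from the controller rather than being set by hand.

```python
import math

def steer_per_token(hidden, direction, coeffs):
    """Return hidden states with coeffs[t] * unit(direction) added at token t.

    hidden:    list of per-token hidden-state vectors (hypothetical layout)
    direction: fixed steering direction, e.g. a difference of class means
    coeffs:    one steering coefficient per token
    """
    if len(hidden) != len(coeffs):
        raise ValueError("need exactly one coefficient per token")
    norm = math.sqrt(sum(x * x for x in direction))
    if norm == 0.0:
        raise ValueError("steering direction must be non-zero")
    unit = [x / norm for x in direction]
    return [
        [h + c * u for h, u in zip(row, unit)]
        for row, c in zip(hidden, coeffs)
    ]

# Toy example: three tokens, a 2-dimensional hidden state.
hidden = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
direction = [3.0, 4.0]       # normalizes to [0.6, 0.8]
coeffs = [0.0, 1.0, -2.0]    # token 0 is left untouched
steered = steer_per_token(hidden, direction, coeffs)
```

Setting every coefficient to the same value recovers ordinary fixed-coefficient steering, which is the baseline the paragraph above compares against.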

DNA-ADLM: Anchored Diffusion for DNA Inpainting

We frame DNA inpainting as constrained generation under an Anchored Diffusion Language Model: observed anchor tokens are pinned while missing positions are iteratively resampled. Built on a masked discrete diffusion backbone (MDLM-style with a DiT denoiser), pretrained on chromosome 11 with 5-mer tokenization.
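The anchoring idea can be sketched in a few lines, under loud assumptions: the 5-mers here are non-overlapping, `MASK` is a made-up mask symbol, and `uniform_proposal` is a placeholder that stands in for a trained denoiser. None of these names come from the project itself; the point is only that observed tokens are never overwritten while missing ones are resampled repeatedly.

```python
import random

MASK = "<mask>"  # hypothetical mask symbol for missing positions

def kmer_tokenize(seq, k=5):
    """Split a DNA string into non-overlapping k-mers (a trailing remainder is dropped)."""
    return [seq[i:i + k] for i in range(0, len(seq) - k + 1, k)]

def uniform_proposal(tokens, position, rng, k=5):
    """Placeholder for a denoiser: draws a uniformly random k-mer, ignoring context."""
    return "".join(rng.choice("ACGT") for _ in range(k))

def anchored_inpaint(tokens, propose, steps=3, seed=0):
    """Resample masked positions for `steps` rounds while anchors stay pinned.

    tokens:  list of k-mers with MASK at the missing positions
    propose: callable (tokens, position, rng) -> k-mer, standing in for the model
    """
    rng = random.Random(seed)
    missing = [i for i, tok in enumerate(tokens) if tok == MASK]
    current = list(tokens)
    for _ in range(steps):
        for i in missing:
            # Only positions that were masked in the input are ever rewritten.
            current[i] = propose(current, i, rng)
    return current

tokens = kmer_tokenize("ACGTACGTACGTACGTACGT")  # four 5-mers
tokens[1] = MASK                                  # pretend the second 5-mer is missing
filled = anchored_inpaint(tokens, uniform_proposal)
```

A real denoiser would condition each proposal on the surrounding anchors; the uniform placeholder ignores them and is used here only to keep the example self-contained.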

Audio Spectrogram Transformer + MIL: Interpretable ALS Severity Classification

Interpretable model for ALS severity classification from speech, combining an audio spectrogram transformer with multiple instance learning. Won second place at the Speech Analysis for Neurodegenerative Diseases Grand Challenge. Accepted to IEEE ICASSP 2026 (oral); unpublished due to attendance conflicts.

All projects