Krishna Srikar Durbha

I am a fourth-year Ph.D. student at the Laboratory of Image and Video Engineering, The University of Texas at Austin, advised by Prof. Alan Bovik. I collaborate with the Meta Video Infrastructure Team on Video Engineering and Perceptual Quality Optimization.

My current research advances generative modeling for vision and Multimodal Large Language Models (MLLMs), with a focus on few-step generation frameworks, MeanFlows, and visual grounding. Building on my prior work in representation learning and perceptual quality assessment (IQA/VQA), I design and train generative and multimodal models that are semantically faithful and perceptually high-fidelity.

Education

  • Ph.D. in Electrical and Computer Engineering
    2022 - Present
    The University of Texas at Austin
    Advisor: Dr. Alan C. Bovik
  • Bachelor of Technology in Electrical Engineering
    2018 - 2022
    Indian Institute of Technology Hyderabad

Research Interests

  • Generative Modeling
  • Image and Video Representation Learning
  • Perceptual Quality Assessment
  • Multimodal Large Language Models
  • Computer Vision
  • Image and Video Processing
  • Video Streaming Optimization

News

Ongoing Research

Preprints & Under Review

Published Papers

Experience

Projects

Time-Constrained Representation Alignment (TC-REPA)

Evaluated restricting the Representation Alignment (REPA) loss to high-noise timesteps when training SiT-XL/2 flow-matching models, and found that the effective axis for relaxing alignment is the training iteration, not the diffusion timestep.

Diffusion Models, Representation Alignment, Flow Matching
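
A minimal sketch of the two gating choices contrasted in the TC-REPA description, not the project's actual training code. It assumes the convention that t = 1 is pure noise and t = 0 is clean data; the function names, the threshold `t_min`, the `cutoff_step`, and the weight `lam` are all hypothetical and chosen only for illustration.

```python
def timestep_gate(t: float, t_min: float) -> float:
    """Return 1.0 when t lies in the high-noise range [t_min, 1], else 0.0.

    Assumes t = 1 is pure noise; t_min is a hypothetical threshold.
    """
    return 1.0 if t >= t_min else 0.0


def iteration_gate(step: int, cutoff_step: int) -> float:
    """Return 1.0 while the training step is before cutoff_step, else 0.0.

    cutoff_step is a hypothetical training iteration after which
    the alignment term is switched off.
    """
    return 1.0 if step < cutoff_step else 0.0


def total_loss(base_loss: float, align_loss: float, lam: float, gate: float) -> float:
    """Combine the flow-matching loss with a gated alignment loss."""
    return base_loss + lam * gate * align_loss


# Gating by diffusion timestep: alignment applies only to noisy samples.
loss_t = total_loss(1.0, 2.0, lam=0.5, gate=timestep_gate(t=0.9, t_min=0.5))

# Gating by training iteration: alignment applies only early in training.
loss_i = total_loss(1.0, 2.0, lam=0.5, gate=iteration_gate(step=500, cutoff_step=1000))
```

Both gates multiply the same alignment term; they differ only in which variable (the sampled timestep or the optimizer step) decides when the term is active.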

Per-Prompt Evaluation of Guidance Strategies in T2I Models

Benchmarked seven CFG-family guidance strategies across SDXL, PixArt-Σ, SD3, and FLUX on T2I-CompBench++, revealing that aggregate FID/IS scores conceal compositional hallucinations exposed only by per-prompt distributional evaluation.

Text-to-Image, Classifier-Free Guidance, Compositional Evaluation
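
For context, the baseline that CFG-family strategies build on is the standard classifier-free guidance combination of an unconditional and a conditional model output. The sketch below shows only that baseline on plain Python lists; the function name `cfg_combine` is hypothetical, and the seven strategies benchmarked in the project are not reproduced here.

```python
from typing import List, Sequence


def cfg_combine(
    pred_uncond: Sequence[float],
    pred_cond: Sequence[float],
    scale: float,
) -> List[float]:
    """Standard classifier-free guidance: uncond + scale * (cond - uncond).

    pred_uncond and pred_cond stand in for the model output (noise or
    velocity prediction) without and with the text prompt.
    """
    if len(pred_uncond) != len(pred_cond):
        raise ValueError("predictions must have the same length")
    return [u + scale * (c - u) for u, c in zip(pred_uncond, pred_cond)]


# scale = 0 ignores the prompt, scale = 1 is the plain conditional
# prediction, and scale > 1 extrapolates past it.
guided = cfg_combine([0.0, 1.0], [1.0, 3.0], scale=2.0)
```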

High-Resolution 4K Synthesis using Few-Step Models

Benchmarked diffusion, flow-matching, consistency, and flow-map-matching models for 4K image synthesis, quantifying the perceptual-quality and inference-efficiency trade-offs between knowledge-distilled and principled few-step techniques.

Few-Step Generation, High-Resolution Synthesis, Diffusion / Flow Matching