Delhi · IN

Learning representations & the structure of signal.

I'm Arnesh — a CS & AI undergraduate at IIIT Delhi. My research sits at the intersection of self-supervised learning, signal processing, and deep-learning theory & interpretability — published at NeurIPS, CVPR, TMLR, INTERSPEECH & ICMR.

Affiliation
IIIT Delhi · SBI Lab
Collaborators
Brown · UCSD
Interests
SSL · Signal · DL Theory
Founder
StarkVision · NivionAI
Portfolio / 2026
Published · ICML · NeurIPS · CVPR · TMLR
8 papers · 2025–26
Published & presented at
NeurIPS · CVPR · TMLR · INTERSPEECH · ICMR · ECCV · ICML · under review
01 / Research

Three threads, one curiosity.

My research lives at the intersection of self-supervised learning, signal processing, and deep-learning theory & interpretability — asking how networks build representations, what invariances they preserve, and where those representations quietly fail.

I · Self-Supervised Learning

Representations that generalize.

Theoretical work on self-supervised learning — relational representation learning, robustness, and the conditions under which invariances transfer across tasks.

II · Signal Processing

Frequency, phase, multi-view.

Spectrogram transformers, Fourier-phase forensics, and multi-view cross-attention for 2D / 3D signals — from birdsong (INTERSPEECH) to spinal MRI (CVPR), dental scans and AI-image detection.

III · Theory & Interpretability

Inside the latent.

Uncovering what intermediate layers actually encode — generalization gaps, representation geometry, and why certain features only surface deep inside the network. Work under review at ICML 2026.

Founder · Lead Researcher

StarkVision

An independent, student-led research community I founded in 2025 — where undergraduates publish at top venues as first authors.

20+ members. Three active projects. Mentorship on research methodology, writing, and publication strategy. Everything below carries student-only authorship.

Output · 2024 — 2025
M-SCAN · Lumbar Stenosis Grading
Multi-view cross-attention for spinal MRI
CVPR '25 · Demo
Melody or Machine · Synthetic Music Detection
Cross-modal contrastive alignment
TMLR
SocialDF · Deepfake Detection
Benchmark + model for harmful synthetic content
ICMR '25 · Workshop
CLAM · AI for Music
Safeguarding authenticity in the music industry
NeurIPS '25 · Workshop
More coming
In progress
02 / Publications

Papers & pre-prints.

Eight works across journals, top-tier conferences and workshops. An asterisk (*) denotes equal contribution.

★ C.0 · 2026
Uncovering the Latent Potential of Deep Intermediate Representations
Arnesh Batra, Arush Gumber, Aniket Khandelwal, Jashn Khemani, Anubha Gupta
ICML 2026 · Spotlight · Accepted
Coming soon
J.1 · 2025
Melody or Machine: Benchmarking and Detecting Synthetic Music via Cross-Modal Contrastive Alignment
Arnesh Batra, Dev Sharma, Krish Thukral, Ruhani Bhatia, Naman Batra, Aditya Gautam
TMLR · Transactions on Machine Learning Research
PDF
C.1 · 2024
Audio Spectrogram Transformer Guided Classification & Information Retrieval for Birds
Yashwardhan Chaudhuri*, Paridhi Mundra*, Arnesh Batra*, Orchid Chetia Phukan, Arun Balaji Buduru
INTERSPEECH 2024 · ISCA
PDF
C.2 · 2025
M-SCAN: A Multistage Framework for Lumbar Spinal Canal Stenosis Grading Using Multi-View Cross Attention
Arnesh Batra, Arush Gumber, Anushk Kumar
CVPR 2025 · Demo Track
arXiv
W.1 · 2025
Relational Representation Learning
Lucas Maes, Ian K Hajra, Arnesh Batra, Hugues Van Assel, Damien Scieur, Randall Balestriero
NeurIPS 2025 Workshop · UniReps — Unifying Representations in Neural Models
PDF
W.2 · 2025
CLAM: Safeguarding Authenticity & Addressing Implications for the Music Industry
Arnesh Batra, Dev Sharma, Krish Thukral, Ruhani Bhatia, Naman Batra, Aditya Gautam
NeurIPS 2025 Workshop · AI for Music
PDF
W.3 · 2025
SocialDF: Benchmark Dataset & Detection Model for Mitigating Harmful Deepfake Content on Social Media
Arnesh Batra, Anushk Kumar, Jashn Khemani, Arush Gumber, Arhan Jain, Somil Gupta
ACM ICMR 2025 Workshop · MAD'25
ACM
W.4 · 2025
DentalNet: Geometric Aware Multi-View Transformer for Occlusion Grade Prediction in Dental 3D Scans
Arnesh Batra, Arush Gumber, Vaibhav Sharma, Peemit, Rinkle Sardana, Tulika Tripathi, Anubha Gupta
NeurIPS 2025 Workshop · Imageomics
PDF
03 / Experience

Labs & collaborators.

Working across academia and industry — from FAIR/Brown to medical-imaging labs at IIIT Delhi and startups building production-grade AI.

May 2025 — present
NivionAI
Co-Founder
Re-imagining industry workflows through AI-driven innovation grounded in state-of-the-art research. Cleared Stage 1 of the AI Grand Challenge organized by NCIIPC, Government of India.
Delhi · IN
May 2025 — Dec 2025
Brown University
Research Collaborator · w/ Prof. Randall Balestriero (FAIR/Brown)
Theoretical research on self-supervised learning frameworks — representation robustness, generalization, mathematical analysis and empirical validation. Work accepted at NeurIPS '25 UniReps.
Remote
May 2025 — present
SBI Lab · IIIT Delhi
Research Intern · w/ Prof. Anubha Gupta
Medical image analysis across 2D and 3D modalities — blood-cell, dental and spinal imaging. Designing hybrid transformer architectures integrating volumetric and multi-view data with SOTA performance on multiple datasets. Work under review at ECCV 2026.
Delhi · IN
Feb 2025 — May 2025
UC San Diego
Research Intern · w/ Prof. Pengtao Xie
Built the first LLM framework for DNA-RNA interaction prediction. Curated a dataset of 100k+ sequences and proposed lightweight genomics architectures with substantial scalability and efficiency gains.
Remote
May 2025 — Jul 2025
Mythyaverse
AI Engineering Intern
Explainable deep-learning models for ECG signal interpretation. Built a real-time sign-language translation system using transformer architectures — 95%+ accuracy on Arabic Sign Language. Shipped LLM + multimodal pipelines into production.
Remote
May 2024 — Oct 2024
HMI Lab · IIIT Delhi
Research Intern · w/ Prof. Jainendra Shukla
Multimodal LLMs for dynamic tasks — live sports commentary and fine-grained video captioning. Designed and benchmarked video encoders and attention-based fusion pipelines.
Delhi · IN
04 / Elsewhere

Projects, awards & leadership.

Selected projects

FOCUS FLOW: Adaptive Learning Platform · 2024
AI-driven learning platform for neurodiverse students with real-time attention tracking via MediaPipe + OpenCV.
Super-Resolution for Gravitational Lensing · 2024
GANs, SRResNet and diffusion-based denoising — AUROC 0.9978 · 98.3% acc · SSIM 0.94.
RE-DACT: Secure Data Redaction · 2024
Transformer-based redactor for text, PDF, images and video with 99%+ accuracy. 2nd place at SIH 2024 Grand Finals.

Leadership

StarkVision · Founder & Lead Researcher · 2025 →
Research community of 20+ members mentoring undergraduate-led research. See the dedicated section above.
Coordinator · Cyborg, IIIT-D · 2023 →
Led a hands-on neural-networks workshop for 40+ students; mentored first-time model implementations.

Honors & awards

Amazon ML Challenge 2025 · Rank 13 / 20,000
Price prediction from product image + description.
Smart India Hackathon 2024 · Top 2 · Finals
RE-DACT selected for grand finale at IIT Kharagpur.
Grand AI Challenge PS-03 · Top 6 / 100+
NCIIPC · Government of India · cleared Stage 1.
F1nalyze · Formula 1 Datathon · 1st place
Kaggle competition.
ML Hackathon · IIT Jodhpur · 4th place
FILL THE VOID() · IIT Jammu · 5th place
JEE Main · 99.72 percentile · AIR 3334
Out of 1.2 million candidates.

Academic service

Reviewer · ACM ICMR 2025
Reviewer · NeurIPS Workshops 2025
Teaching · 50+ students
Neural networks & data analysis fundamentals (NumPy, Pandas).
05 / Off-hours

Away from the terminal.

When I'm not reading papers or training models — I'm usually somewhere with a camera, a racquet, or a window seat.

01 · Photography

Making light sit still.

Street, low-light, and long exposures. I shoot to train my eye on composition — the rule of thirds is basically attention.

@arnesh_photography_
02 · Badminton

Fast, cross-court.

Played since school. It's the one place I don't overthink — reaction, footwork, the satisfying click of a clean smash.

Doubles preferred
03 · Travel

New cities, same notebook.

Somewhere between changing skylines and unfamiliar menus — most of my best ideas arrive on a train.

Next: open