Bio
I am a researcher working on NLP, large language models, and clinical AI, with a focus on interpretability and mental health applications. My research investigates how language models encode and respond to clinically relevant content, particularly how the register in which symptoms are described shapes model behavior and assessments. A central question driving my work is whether LLMs, when applied to mental health contexts, are responding to underlying clinical conditions or to the surface-level language used to express them.
Additionally, I am exploring the deployment of state-of-the-art, open-source multimodal AI models in real-world clinical research scenarios. My applied work includes transforming clinical questionnaires into natural dialogues, developing dialogue-aware text-to-speech systems, and applying automatic speech recognition to niche use cases, such as analyzing interactions between caregivers and children with autism. I am also experimenting with video-based models for gaze tracking and behavioral classification in children.
Research Interests
- Natural Language Processing
- Large Language Models
- Multimodal AI
- Model Interpretability
- Clinical AI & Mental Health Applications
- Text-to-Speech & Automatic Speech Recognition
Publications
2026
- The Scaffold Effect: How Prompt Framing Drives Apparent Multimodal Gains in Clinical VLM Evaluation
Doan Nam Long Vu and Simone Balloccu
Mar 2026
Trustworthy clinical AI requires that performance gains reflect genuine evidence integration rather than surface-level artifacts. We evaluate 12 open-weight vision-language models (VLMs) on binary classification across two clinical neuroimaging cohorts, FOR2107 (affective disorders) and OASIS-3 (cognitive decline). Both datasets come with structural MRI data that carries no reliable individual-level diagnostic signal. Under these conditions, smaller VLMs exhibit gains of up to 58% F1 upon introduction of neuroimaging context, with distilled models becoming competitive with counterparts an order of magnitude larger. A contrastive confidence analysis reveals that merely mentioning MRI availability in the task prompt accounts for 70-80% of this shift, independent of whether imaging data is present, a domain-specific instance of modality collapse we term the scaffold effect. Expert evaluation reveals fabrication of neuroimaging-grounded justifications across all conditions, and preference alignment, while eliminating MRI-referencing behavior, collapses both conditions toward random baseline. Our findings demonstrate that surface evaluations are inadequate indicators of multimodal reasoning, with direct implications for the deployment of VLMs in clinical settings.
2025
- Roleplaying with Structure: Synthetic Therapist-Client Conversation Generation from Questionnaires
Oct 2025
The development of AI for mental health is hindered by a lack of authentic therapy dialogues, due to strict privacy regulations and the fact that clinical sessions were historically rarely recorded. We present an LLM-driven pipeline that generates synthetic counseling dialogues based on structured client profiles and psychological questionnaires. Grounded in the principles of Cognitive Behavioral Therapy (CBT), our method creates synthetic therapeutic conversations for clinical disorders such as anxiety and depression. Our framework, SQPsych (Structured Questionnaire-based Psychotherapy), converts structured psychological input into natural language dialogues through therapist-client simulations. Because data governance policies and privacy restrictions prohibit the transmission of clinical questionnaire data to third-party services, previous methodologies relying on proprietary models are infeasible in our setting. We address this limitation by generating a high-quality corpus using open-weight LLMs, validated through human expert evaluation and LLM-based assessments. Our SQPsychLLM models fine-tuned on SQPsychConv achieve strong performance on counseling benchmarks, surpassing baselines in key therapeutic skills. Our findings highlight the potential of synthetic data to enable scalable, data-secure, and clinically informed AI for mental health support. We will release our code, models, and corpus at this https URL.