ExpNLP | Simone Balloccu

Natural Language Processing For Expert Domains (ExpNLP)

At ExpNLP, we work on AI (and NLP specifically) for expert domains. That means we focus on the real-world impact of AI. We ask ourselves: “How can language technologies support experts’ work in complex, high-stakes settings?” This requires us to measure performance not only by benchmark scores, but also by reliability, transparency, and practical value. We highly value qualitative insights, explanations beyond pure scores, and human evaluation.

Our research interests include:

Evaluating AI in expert domains: in areas like software engineering, healthcare, and industry, errors are costly, and understanding the strengths and limitations of AI is critical.
Integrating multimodal data in Expert AI: experts deal with many forms of information beyond text, including images, structured data, reports, and other domain-specific signals. We investigate how Expert AI can effectively combine such data to support real-world expert workflows.
Expert–AI interaction and collaboration: we aim to design AI systems that assist rather than replace domain experts, enabling powerful partnerships and better decision-making.

People

PhD Students

Doan Nam Long Vu

Topics: Multimodal AI for mental health

I am a researcher working on NLP, large language models, and clinical AI, with a focus on interpretability and mental health applications. My research investigates how language models encode and respond to clinically relevant content, particularly how the register in which symptoms are described shapes model behavior and assessments. A central question driving my work is whether LLMs, when applied to mental health contexts, are responding to underlying clinical conditions or to the surface-level language used to express them. Additionally, I am exploring the deployment of state-of-the-art, open-source multimodal AI models in real-world clinical research scenarios. My applied work includes transforming clinical questionnaires into natural dialogues, developing a dialogue-aware text-to-speech system, and using automatic speech recognition for niche use cases, such as analyzing interactions between caretakers and children with autism. Furthermore, I am experimenting with video-based models for gaze tracking and behavioral classification in children.

Anna Mokhova

Topics: Expert-AI collaboration in coding

A key challenge of AI research today is for AI systems to reflect the knowledge, judgment, and professional standards of domain experts. The central focus of my PhD is to advance AI systems so that they can act as supportive collaborators, drawing on the insights of human experts. This alignment improves trust, reliability, and real-world usefulness, especially in high-stakes areas like software engineering. Beyond human-AI collaboration, as a linguist I am also interested in more traditional NLP topics, such as how to tackle multilinguality or semantic ambiguity. Outside of my research, I enjoy exploring and practicing various forms of art, hiking, and visiting new places.

Ruilong Wang

Topics: Multimodal RAG systems for automotive applications

I am a PhD student working on NLP, LLMs, and knowledge-intensive AI for industrial applications. In collaboration with Volkswagen, I focus on developing AI systems that support production planning and engineering decision-making. My research investigates how expert knowledge from technical documents, standards, historical project data, and human experience can be incorporated into AI systems to make them more reliable, explainable, and effective in real-world industrial settings. In particular, I explore how LLMs can be combined with structured knowledge representations to build AI assistants that reason over domain knowledge and support experts in complex decision processes. My work also includes attribution-aware question answering, especially methods that improve the grounding and evidential support of LLM-generated responses.

Student Theses

Concluded

Supervised by Dr. Simone Balloccu
- Enhancing Natural Language Inference in Biomedical Applications Using Large Language Models (finished) — Kai Zhao

Ongoing

Co-supervised by Ruilong Wang
- Time-Stamped Graphs for Cross-Year Reasoning in Volkswagen Annual Reports (ongoing) — Karim Abdelrahman
- Contrast Set Generation for Evaluating QA Models in Annual Report Reasoning (ongoing) — Cagin Senemoglu
Co-supervised by Doan Nam Long Vu
- Expressive Text-To-Speech Generation and Evaluation for Therapist-Client Dialogues (ongoing) — Marleen Sinsel
- Synthetic Therapist-Client Conversation Generation From Questionnaires (ongoing) — Mohamed Aziz Boudabous

Openings / Collaboration

If you are interested in working with us, feel free to get in touch!

At the moment, we welcome applications from B.Sc. and M.Sc. thesis students who want to work on compatible topics. If you are interested, feel free to contact Dr. Simone Balloccu. If possible, include a grade transcript, an initial topic (or at least some proposals), and a CV.

Applications for internships are currently considered only if the candidate is self-funded (or the internship is virtual). Candidates are welcome to contact Dr. Simone Balloccu with topic proposals. We strongly suggest highlighting the overlap between your proposal and the existing research directions of ExpNLP.