DAS 2026

Our Keynote Speakers

Jordy Van Landeghem

KU Leuven

Talk 1: Parse, Reflect, Retrieve, Compile: An Agent Stack for Enterprise Document AI

Agents are reshaping Document AI — but can they be trusted in production? This tutorial walks through the agent stack for enterprise documents, from parsing to retrieval to reasoning, exposing where current systems silently fail and what it takes to close the gap. Drawing on recent benchmarks (ParseBench, MADQA), open-source tools (DRAG), and lessons from deploying agentic workflows in industry, we examine the tension between flexibility and reliability: agents that self-correct through visual reflection, agents that learn search strategies from experience, and the emerging need for systems that compile learned intelligence into deterministic, auditable pipelines. We conclude with open research problems and an invitation to collaborate.

Speaker's Bio: JORDY VAN LANDEGHEM received an M.A. degree in Linguistics (2015), an M.Sc. degree in Artificial Intelligence (2017), and a Ph.D. degree in Computer Science (2024), all from KU Leuven, Belgium. He completed research internships at Oracle and Nuance Communications and spent seven years as Lead AI Research Engineer at Contract.fit, a European IDP start-up. His doctoral research on "Intelligent Automation for AI-Driven Document Understanding" spans probabilistic deep learning, calibration, uncertainty quantification, and out-of-distribution robustness. He spearheaded the DUDE benchmark and the ICDAR 2023 competition, with further publications at ICML, ICCV, and WACV. Most recently a Senior ML Engineer at Instabase leading GenAI and Agentic AI efforts, while collaborating on MADQA, he now runs an independent global AI/ML consultancy from Belgium (Probably Approximately Human BV).

Nibal Nayef

Talk 2: From Retrieval to Reasoning: The Evolution of RAG-based Systems for Document-Centric AI

Retrieval-Augmented Generation (RAG) has become a key paradigm for enabling question answering over large and heterogeneous document collections. It has evolved from early single-pass retrieval–generation approaches into a broader class of systems that integrate retrieval with reasoning, including emerging approaches often referred to as agentic RAG. This talk provides an overview of this evolution toward more iterative and adaptive methods. We discuss key design principles, including data ingestion, retrieval strategies, context selection, and answer generation. A particular focus is placed on document-centric scenarios, where knowledge is derived from documents with diverse layouts, document images, and other modalities. In these settings, system performance depends not only on retrieval and generation, but also on document understanding and representation techniques. The presentation also discusses key challenges in building RAG systems and outlines directions for future research and development. The goal is to provide a conceptual framework and practical insights for designing next-generation RAG systems.

Speaker's Bio: NIBAL NAYEF is a senior researcher in document analysis and machine learning, working at the intersection of document intelligence and AI systems. She currently consults on data science and AI solutions, including RAG-based assistants for education and enterprise applications. Her work covers deep learning and end-to-end ML system development, from research to production. At MyScript, she developed models for layout analysis of handwritten documents, mathematical expression recognition, gesture recognition, and writer adaptation using GNNs, Transformers, LSTMs, and MLPs. Earlier, at the L3i Laboratory, University of La Rochelle, she worked on document image analysis using CNN- and FCN-based approaches, as well as on document image quality assessment and enhancement. She led the creation of several benchmarks for multilingual document image analysis and quality assessment, including the widely used RRC-MLT-2017 and RRC-MLT-2019 datasets. She received her PhD in computer science from RPTU University of Kaiserslautern-Landau. Her research interests include document-centric AI, NLP, and data-driven solutions for real-world applications.

Brandon Smock

Kensho Technologies

Talk 3: TBC

Brief TBC

Speaker's Bio: BRANDON SMOCK is a Senior Applied Scientist for Document Intelligence at Kensho Technologies, with deep expertise in machine learning and algorithm development. During his tenure at Microsoft as a Principal Applied Scientist, he spearheaded the development of the Table Transformer (TATR), a state-of-the-art deep learning approach to recognizing and extracting data from tables in unstructured documents. The Table Transformer models have since been downloaded over two million times in a single month on Hugging Face, placing them among the most popular object detection models available. His work is characterized by a strong focus on scalable, data-centric machine learning, including automated cleaning of large-scale crowd-sourced data and the creation of realistic synthetic training data. Brandon has presented his work at venues including CVPR and ICDAR, and continues to push the boundaries of document intelligence research.

Sheraz Ahmed

DeepReader GmbH & DFKI Kaiserslautern

Talk 4: TBC

Brief TBC

Speaker's Bio: SHERAZ AHMED has been associated with DFKI for over fifteen years and is currently a Principal Researcher there. He completed his PhD at TU Kaiserslautern on the topic of generic frameworks for information segmentation in document images, and has since become a leading scientific voice in the Smart Data & Knowledge Services research area. He has attracted international attention through his publications on the application of machine learning to document analysis and life sciences, as well as his work on explainable AI systems. He is also the founder and CEO of DeepReader GmbH, a company he established to bridge the gap between academic research and industry applications. His research interests span document understanding, explainable AI, pattern recognition, anomaly detection, genome analysis, and natural language processing. In recognition of his contributions, Sheraz was honored with the prestigious DFKI Research Fellow Award.