MMAI @ IEEE ICDM 2025
Advancing Multimodal AI Research and Applications
The 5th IEEE International Workshop on Multimodal AI (MMAI) aims to advance cutting-edge research and real-world applications in multimodal artificial intelligence. Building on the success of previous MMAI workshops, this year’s edition continues to foster collaboration between researchers and practitioners exploring the latest innovations in multimodal learning, data fusion, and cross-modal applications.
For MMAI 2025, we are introducing a dual-workshop mode:
- In-person workshop at MMAI@IEEE ICDM 2025
- Online workshop at MMAI@IEEE Big Data 2025
Authors are encouraged to select their preferred venue when submitting their papers. Please visit both workshop pages for more details and submission instructions.
Papers accepted by MMAI@IEEE ICDM 2025 are also eligible for complimentary online presentation at MMAI@IEEE Big Data 2025 on Dec. 8, 2025.

In conjunction with IEEE ICDM, the workshop will be held in person. Please see our schedule for details.
About Multimodal AI
Multimodality is the most general form of information representation and delivery in the real world. Humans naturally draw on multimodal data to perceive accurately and make decisions. Our digital world is itself multimodal, combining various data modalities, such as text, audio, images, videos, touch, depth, 3D, animations, biometrics, interactive content, etc. Multimodal data analytics algorithms often outperform single-modality analytics on real-world problems.
Multi-sensor data fusion has also become a topic of great interest in industry. In particular, companies working on automotive, drone vision, surveillance, or robotics applications have grown rapidly. They are attempting to automate processes by using control signals from a wide variety of sources.
With the rapid development of Big Data technology and its remarkable applications to many fields, multimodal Artificial Intelligence (AI) for Big Data is a timely topic. This workshop aims to generate momentum around this topic of growing interest, and to encourage interdisciplinary interaction and collaboration between Natural Language Processing (NLP), computer vision, audio processing, machine learning, multimedia, robotics, Human-Computer Interaction (HCI), social computing, cybersecurity, cloud computing, edge computing, Internet of Things (IoT), and geospatial computing communities. It serves as a forum to bring together active researchers and practitioners from academia and industry to share their recent advances in this promising area.
Topics
This is an open call for papers, which solicits original contributions on recent findings in theory, methodologies, and applications in the field of multimodal AI and Big Data. The list of topics includes, but is not limited to:
- Multimodal representations (language, vision, audio, touch, depth, etc.)
- Multimodal data modeling
- Multimodal data fusion
- Multimodal learning
- Cross-modal learning
- Multimodal big data analytics and visualization
- Multimodal scene understanding
- Multimodal perception and interaction
- Multimodal information tracking, retrieval and identification
- Multimodal big data infrastructure and management
- Multimodal benchmark datasets and evaluations
- Multimodal AI in robotics (robotic vision, NLP in robotics, Human-Robot Interaction (HRI), etc.)
- Multimodal object detection, classification, recognition, and segmentation
- Multimodal AI safety (explainability, interpretability, trustworthiness, etc.)
- Multimodal biometrics
- Multimodal applications (autonomous driving, cybersecurity, smart cities, intelligent transportation systems, industrial inspection, medical diagnosis, healthcare, social media, arts, etc.)
Important Dates:
- Short paper submission: Monday, September 01, 2025
- Poster paper submission: Monday, September 08, 2025
- Notification to authors: Friday, September 19, 2025
- Camera-ready submission of accepted papers: Thursday, September 25, 2025
Check out our CFP for additional details.
Organizers
- Kaiqun Fu, South Dakota State University
- Duoduo Liao, George Mason University
- Yanjia Zhang, Baptist Health South Florida
PC Members
- Zhiqian Chen, Mississippi State University, USA
- Naresh Erukulla, Macy’s Inc., USA
- Maryam Heidari, George Mason University, USA
- Fanchun Jin, Google Inc., USA
- Achin Kulshrestha, Google Inc., USA
- Ge Jin, Purdue University, USA
- Ashwin Kannan, Amazon, USA
- Kevin Lybarger, George Mason University, USA
- Abhimanyu Mukerji, Amazon, USA
- Chen Shen, Google Inc., USA
- Arpit Sood, Meta, USA
- Gregory Joseph Stein, George Mason University, USA
- Alex Wong, Yale University, USA
- Marcos Zampieri, George Mason University, USA