MMAI @ IEEE ICDM 2025

Advancing Multimodal AI Research and Applications

The 5th IEEE International Workshop on Multimodal AI (MMAI) @ IEEE ICDM 2025 focuses on advancing research and applications in multimodal artificial intelligence. Continuing the successful MMAI workshop series, it brings together researchers and practitioners to explore the latest developments in multimodal learning, fusion, and applications.

The workshop will be held in person in December 2025, in conjunction with IEEE ICDM. Please see our schedule for details.

About Multimodal AI

Multimodality is the most general form of information representation and delivery in the real world. Humans naturally rely on multimodal data to make accurate perceptions and decisions. Our digital world is likewise multimodal, combining data modalities such as text, audio, images, video, touch, depth, 3D, animations, biometrics, and interactive content. Multimodal data analytics algorithms often outperform single-modal analytics on many real-world problems.

Multi-sensor data fusion has also become a topic of great interest in industry. In particular, companies working in automotive, drone vision, surveillance, and robotics have grown rapidly, and they seek to automate processes using a wide variety of control signals from diverse sources.

With the rapid development of Big Data technology and its remarkable applications in many fields, multimodal Artificial Intelligence (AI) for Big Data is a timely topic. This workshop aims to build momentum around this topic of growing interest and to encourage interdisciplinary interaction and collaboration among the Natural Language Processing (NLP), computer vision, audio processing, machine learning, multimedia, robotics, Human-Computer Interaction (HCI), social computing, cybersecurity, cloud computing, edge computing, Internet of Things (IoT), and geospatial computing communities. It serves as a forum that brings together active researchers and practitioners from academia and industry to share their recent advances in this promising area.

Topics

This is an open call for papers soliciting original contributions on recent findings in the theory, methodologies, and applications of multimodal AI and Big Data. The list of topics includes, but is not limited to:

  • Multimodal representations (language, vision, audio, touch, depth, etc.)
  • Multimodal data modeling
  • Multimodal data fusion
  • Multimodal learning
  • Cross-modal learning
  • Multimodal big data analytics and visualization
  • Multimodal scene understanding
  • Multimodal perception and interaction
  • Multimodal information tracking, retrieval and identification
  • Multimodal big data infrastructure and management
  • Multimodal benchmark datasets and evaluations
  • Multimodal AI in robotics (robotic vision, NLP in robotics, Human-Robot Interaction (HRI), etc.)
  • Multimodal object detection, classification, recognition, and segmentation
  • Multimodal AI safety (explainability, interpretability, trustworthiness, etc.)
  • Multimodal biometrics
  • Multimodal applications (autonomous driving, cybersecurity, smart cities, intelligent transportation systems, industrial inspection, medical diagnosis, healthcare, social media, arts, etc.)

Confirmed Speakers

  • Speaker 1, Affiliation
  • Speaker 2, Affiliation
  • Speaker 3, Affiliation

Important Dates

  • Paper submission deadline: TBD
  • Notification of acceptance: TBD
  • Camera-ready deadline: TBD

Check out our CFP for additional details.

Organizers

* = main correspondence