Multimodal AI

Multimodal Learning Research Notes - Draft

A collection of research papers and resources on multimodal learning, generation, and editing models.

Papers:

LLMs Meet Multimodal Generation and Editing: A Survey: https://arxiv.org/html/2405.19334v1

Unified Multimodal Understanding and Generation Models: Advances, Challenges, and Opportunities: https://arxiv.org/html/2505.02567v3

Deep Research:

https://docs.google.com/document/d/1mbB_-o_vVIn54ZsHaJyp3opIAh3Kz6clMr4t_uXDkMI/edit?usp=sharing

https://chatgpt.com/s/dr_68943a432e9881918bdc63ed1e5e25cb

https://www.kimi.com/share/d2a3kotf4396rncvlmvg

Article:

https://weaviate.io/blog/multimodal-models