AMC Faculty Advance AI Creativity Research at CVPR 2026 with Novel Image-Video Generation and Editing Frameworks

2026-06-10

The Division of Arts and Machine Creativity (AMC) at The Hong Kong University of Science and Technology (HKUST) is pleased to announce its contributions to the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026, one of the world’s leading conferences in computer vision and artificial intelligence.

This year, AMC faculty members continue to push the boundaries of AI-driven visual creativity, with three research papers accepted and two major workshops co-organized at the conference. These achievements further underscore the Division’s leadership in bridging artistic expression with advanced machine intelligence.

Our warmest congratulations to Prof. Anyi RAO and Prof. Harry YANG, along with their research teams and collaborators, for their outstanding contributions. We look forward to their continued impact in shaping the future of creative AI and visual computing.

Accepted Research Papers

Composing Concepts from Images and Videos via Concept-prompt Binding
Xianghao Kong, Zeyu Zhang, Yuwei Guo, Zhuoran Zhao, Songchun Zhang, Anyi Rao
Introduces Bind & Compose (BiCo), a one-shot framework for flexible visual concept composition from images and videos. It binds visual concepts to textual tokens through a hierarchical design, enabling selective concept combination via prompt manipulation. With improved concept-token alignment and temporal modeling, the method achieves strong concept consistency, prompt fidelity, and motion quality, advancing creative video generation and editing.

Learning Latent Proxies for Controllable Single-Image Relighting
Haoze Zheng, Zihao Wang, Xianfeng Wu, Yajing Bai, Yexin Liu, Yun Li, Xiaogang Xu, Harry Yang
Presents a novel framework for controllable image relighting from a single input image by learning latent proxy representations of lighting conditions. The approach enables flexible and intuitive manipulation of illumination while preserving scene structure and realism, addressing key challenges in inverse rendering and image editing. It demonstrates strong generalization and high-quality relighting results across diverse scenes.

Group Editing: Edit Multiple Images in One Go
Yue Ma, Xinyu Wang, Qianli Ma, Qinghe Wang, Mingzhe Zheng, Xiangpeng Yang, Hao Li, Chongbo Zhao, Jixuan Ying, Harry Yang, Hongyu Liu, Qifeng Chen
Proposes a unified framework for simultaneous multi-image editing, enabling consistent and efficient modifications across a group of images. By modeling shared structures and relationships among images, the method supports coherent edits while reducing redundancy and user effort. The approach improves scalability and consistency over traditional single-image editing pipelines, which handle one image at a time.

Workshop Co-Organization

AI for Creative Visual Content Generation, Editing and Understanding
Organized by Ozgur Kara, Junho Kim, Victor Escorcia, Dong Liu, Fabian Caba Heilbron, Jay Mahajan, Jiaju Ma, Songlin Yang, Rushikesh Zawar, Maneesh Agrawala, Anyi Rao, Alexander Schwing, James M. Rehg
Continuing its role as a key platform for interdisciplinary exchange, this workshop brings together researchers, artists, and practitioners to explore how AI technologies enable new forms of creative visual content production, editing, and interpretation.

Agentic AI for Visual Media
Organized by Jinjin Gu, Lei Sun, Zhendong Li, Zhenfei Yin, Anyi Rao, Yeying Jin, Jing Shao, Enze Xie, He Zhang, Jian Wang, Danda Pani Paudel, Philip Torr, Luc Van Gool
This workshop focuses on the emerging paradigm of agentic AI systems for visual media, examining how autonomous and interactive AI agents can plan, generate, and manipulate visual content. It highlights new research directions at the intersection of generative models, multimodal reasoning, and creative workflows.

Illustration of BiCo, a one-shot method that enables flexible visual concept composition

Showcase of controllable single-image relighting by learning latent proxy representations of lighting conditions

Gallery of Group Editing applying consistent and unified modifications across a set of related images