Licheng Yu - Facebook AI
|
|
The Llama 4 herd: The beginning of a new era of natively multimodal AI innovation Llama team [Blog] (Led 17Bx128 and 17Bx16's text+image reinforcement learning Stage) |
|
|
|
|
|
|
|
|
Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs
Zeiyi Huang, Yuyang Ji, Xiaofang Wang, Nikhil Mehta, Tong Xiao, Donghyun Lee, Sigmund Vanvalkenburgh, Shengxin Zha, Bolin Lai, Licheng Yu, Ning Zhang, Yong Jae Lee, Miao Liu [Paper] |
|
|
|
|
|
The Llama 3 Herd of Models Llama team (Led Llama3.2 Multimodal 11B/90B Pre-training + 11B Post-training) |
|
|
Animated Stickers: Bringing Stickers to Life with Video Diffusion
David Yan, Winnie Zhang, Luxin Zhang, Anmol Kalia, Dingkang Wang, Ankit Ramchandani, Miao Liu, Albert Pumarola, Edgar Schoenfeld, Elliot Blanchard, Krishna Narni, Yaqiao Luo, Lawrence Chen, Guan Pang, Ali Thabet, Peter Vajda, Amy Bearman, Licheng Yu [Paper] |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression
Animesh Sinha, Bo Sun, Anmol Kalia, Arantxa Casanova, Elliot Blanchard, David Yan, Winnie Zhang, Tony Nelli, Jiahui Chen, Hardik Shah, Licheng Yu, Mitesh Kumar Singh, Ankit Ramchandani, Maziar Sanjabi, Sonal Gupta, Amy Bearman, Dhruv Mahajan [Paper] |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Learning and Verification of Task Structure in Instructional Videos
Medhini Narasimhan, Licheng Yu, Sean Bell, Ning Zhang, Trevor Darrell |
|
|
AMELI: Enhancing Multimodal Entity Linking with Fine-Grained Attributes
Barry Menglong Yao, Yu Chen, Qifan Wang, Sijia Wang, Minqian Liu, Zhiyang Xu, Licheng Yu, Lifu Huang [Paper] |
|
|
|
|
|
|
|
|
FaD-VLP: Fashion Vision-and-Language Pre-training towards Unified Retrieval and Captioning
Suvir Mirchandani, Licheng Yu, Mengjiao Wang, Animesh Sinha, Wenwen Jiang, Tao Xiang, Ning Zhang [Paper] |
|
|
|
|
|
|
|
|
|
LOOPITR: Combining Dual and Cross Encoder Architectures for Image-Text Retrieval Jie Lei, Xinlei Chen, Ning Zhang, Mengjiao Wang, Mohit Bansal, Tamara L. Berg, Licheng Yu [Paper] |
|
VALUE: A Multi-Task Benchmark for Video-and-Language Understanding Evaluation
Linjie Li, Jie Lei, Zhe Gan, Licheng Yu, Yen-Chun Chen, Rohit Pillai, Yu Cheng, Luowei Zhou, Xin Eric Wang, William Yang Wang, Tamara L. Berg, Mohit Bansal, Jingjing Liu, Lijuan Wang, Zicheng Liu [Paper][Leaderboard] |
|
|
|
What is More Likely to Happen Next? Video-and-Language Future Event Prediction
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal |
|
HERO: Hierarchical Encoder for Video+Language Omni-representation Pre-training
Linjie Li*, Yen-Chun Chen*, Yu Cheng, Zhe Gan, Licheng Yu, Jingjing Liu |
|
|
|
TVR: A Large-Scale Dataset for Video-Subtitle Moment Retrieval
Jie Lei, Licheng Yu, Tamara L. Berg, Mohit Bansal |
|
UNITER: Learning UNiversal Image-Text Representations
Yen-Chun Chen*, Linjie Li*, Licheng Yu*, Ahmed El Kholy, Faisal Ahmed, Zhe Gan, Yu Cheng, Jingjing Liu
Achieving SOTA on 13 Vision+Language Datasets/Tasks, and
|
|
|
|
|
|
|
|
|
|
Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
Hao Tan, Licheng Yu, Mohit Bansal |
|
TVQA: Localized Compositional Video Question Answering
Jie Lei, Licheng Yu, Mohit Bansal, Tamara L. Berg |
|
|
|
From Image to Language and Back Again Journal of Natural Language Engineering (JNLE), 2018 Anya Belz, Tamara L. Berg, Licheng Yu [Paper] |
|
Physics-Inspired Garment Recovery from a Single-View Image ACM Transactions on Graphics, 2018
Shan Yang, Tanya Ambert, Zherong Pan, Ke Wang, Licheng Yu, Tamara L. Berg, Ming C. Lin |
|
A Unified Framework for Manifold Landmarking IEEE Transactions on Signal Processing, 2018
Hongteng Xu, Licheng Yu, Mark Davenport, Hongyuan Zha [Paper] |
|
Hierarchically-Attentive RNN for Album Summarization and Storytelling Licheng Yu, Mohit Bansal, Tamara L. Berg [Paper] [Dataset API] |
|
|
|
|
|
Visual Madlibs: Fill-in-the-blank Image Description and Question Answering Licheng Yu, Eunbyung Park, Alexander C. Berg, Tamara L. Berg |
|
|
|
|
|
Quaternion-based Sparse Representation of Color Image IEEE International Conference on Multimedia and Expo, ICME 2013 Licheng Yu, Yi Xu, Hongteng Xu, Hao Zhang [Paper][Supplementary File] (Oral presentation) |
|
Single Image Super-resolution via Phase Congruency Analysis IEEE Visual Communications and Image Processing, VCIP 2013 Licheng Yu, Yi Xu, Bo Zhang [Paper] (Oral presentation) |
|
Self-Example Based Super-resolution with Fractal-based Gradient Enhancement IEEE International Conference on Multimedia and Expo, ICME workshop 2013 Licheng Yu, Yi Xu, Hongteng Xu [Paper] |
|
Robust Single Image Super-resolution based on Gradient Enhancement APSIPA Annual Summit and Conference, APSIPA 2012 Licheng Yu, Yi Xu, Hongteng Xu, Xiaokang Yang |