Tai Nguyen
I am Tai (Đức Tài), a research engineer at Apple. I am working on multimodal and on-device language models for Apple Intelligence features. Previously, I got my MS from the University of Pennsylvania, where I got started on research with Eric Wong and Chris Callison-Burch. I was also fortunate to work with Ben Bogin from Ai2. Before that, I helped build an analytics tool to support mainframes at IBM Systems. I studied Economics at the wonderful Haverford College and wrote my undergraduate thesis on the impact of Airbnb on welfare. I grew up in Saigon, Vietnam. 🇻🇳
Email / GitHub / Google Scholar / huggingface / Twitter
Research
(*: equal contribution)
DataDecide: How to Predict Best Pretraining Data with Small Experiments
Ian Magnusson*, Nguyen Tai*, Ben Bogin*, David Heineman, Jena D Hwang, Luca Soldaini, Akshita Bhagia, Jiacheng Liu, Dirk Groeneveld, Oyvind Tafjord, Noah A Smith, Pang Wei Koh, Jesse Dodge
ICML 2025 DataWorld Workshop Oral
arxiv / code / blog / huggingface / press
MMTEB: Massive Multilingual Text Embedding Benchmark
Kenneth Enevoldsen, Isaac Chung, ..., Nguyen Tai, ..., Niklas Muennighoff (82 authors)
ICLR 2025
arxiv / code / website
In-context Example Selection with Influences
Nguyen Tai, Eric Wong
arXiv 2024
arxiv / code / blog
Explanation-based Finetuning Makes Models More Robust to Spurious Cues
Josh Magnus Ludan, Yixuan Meng*, Tai Nguyen*, Saurabh Shah*, Qing Lyu, Marianna Apidianaki, Chris Callison-Burch
ACL 2023
arxiv / code
Software Entity Recognition with Noise-robust Learning
Tai Nguyen, Yifeng Di, Joohan Lee, Muhao Chen, Tianyi Zhang
ASE 2023
arxiv / code / huggingface
Projects
Big Data Bowl
Ryan Brill, Joseph Rudoler, Tai Nguyen, Ryan Gross
2023
writeup / video / code / feature article
One of 5 finalists, winning $15,000. We got to meet the Director of Research of the NFL and had a professional video made.
Underthesea
2022
website / code
Contributed a small amount to an open-source Vietnamese toolkit built by the amazing Anh Vu. This helped me get started on NLP.
STEAM For Vietnam
2022
website
During COVID-19, I volunteered on the data science team of a non-profit that provides free online education for Vietnamese children.