{"id":"bowen-cheng","title":"Bowen Cheng","content":"**Bowen Cheng** (程博文) is an artificial intelligence researcher at [Meta's Superintelligence Lab](https://iq.wiki/wiki/meta-superintelligence-team). He specializes in multimodal foundation models and has contributed to major AI projects, including OpenAI's GPT-4o and Tesla's Full Self-Driving (FSD) software. [\\[1\\]](#cite-id-3Zbb0jnSPl) [\\[2\\]](#cite-id-q6l9jCBPSE)\n\n## Education\n\nCheng received both his Bachelor of Science and his Ph.D. in Electrical and Computer Engineering (ECE) from the University of Illinois Urbana-Champaign (UIUC). His doctoral advisors were Professor Alexander Schwing and Professor Thomas Huang. [\\[1\\]](#cite-id-3Zbb0jnSPl) [\\[2\\]](#cite-id-q6l9jCBPSE) [\\[4\\]](#cite-id-KpffBig6ri)\n\n## Career\n\nAs of 2025, Cheng is a researcher at [Meta's Superintelligence Lab](https://iq.wiki/wiki/meta-superintelligence-team) (MSL). He joined the newly formed group after a tenure at OpenAI, where he worked as a researcher on multimodal understanding and interaction. While at OpenAI, he was part of the post-training team focused on building multimodal models. Prior to OpenAI, Cheng was a Senior Research Scientist at Tesla, where he worked on the Autopilot team. During his academic career, he completed several research internships at prominent technology labs, including Facebook AI Research (FAIR) in both New York City and Menlo Park, Google Research in Los Angeles, Microsoft Research in Redmond, and Microsoft Research Asia in Beijing. [\\[1\\]](#cite-id-3Zbb0jnSPl) [\\[3\\]](#cite-id-0KGECiwWZT) [\\[2\\]](#cite-id-q6l9jCBPSE) [\\[4\\]](#cite-id-KpffBig6ri) [\\[5\\]](#cite-id-hfIvIQ0rvU) [\\[6\\]](#cite-id-SXb2Avx1Sl)\n\nCheng has been a core contributor to several high-profile projects in the field of artificial intelligence. 
His work spans computer vision, autonomous driving, and large-scale multimodal models.\n\nHis notable contributions include:\n\n* **Meta Superintelligence Lab**: Joined as a research scientist in a team assembled to focus on advanced AI research and development. [\\[2\\]](#cite-id-q6l9jCBPSE)\n* **OpenAI**:\n  * **GPT-4o**: Served as a core contributor, focusing on perception and the advanced voice mode, which featured significantly lower latency in audio interaction.\n  * **Thinking with Images**: Initiated research and was a foundational contributor to this project, which he described as a paradigm shift in solving perception problems.\n  * **o3 and o4-mini**: Served as a core contributor to these models.\n  * **GPT-4.1**: Listed as a core contributor.\n  * **OpenAI Audio API**: Contributed research to the next-generation audio models. [\\[1\\]](#cite-id-3Zbb0jnSPl) [\\[3\\]](#cite-id-0KGECiwWZT)\n* **Tesla**:\n  * **FSD v12**: Was a core contributor to the twelfth version of Tesla's Full Self-Driving software. [\\[1\\]](#cite-id-3Zbb0jnSPl) [\\[3\\]](#cite-id-0KGECiwWZT)\n* **Academic Research**:\n  * **Mask2Former**: A universal image segmentation architecture.\n  * **MaskFormer**: A mask-classification architecture for semantic and panoptic segmentation.\n  * **Panoptic-DeepLab**: A bottom-up approach for panoptic segmentation.\n\nThese projects highlight his work in image segmentation and multimodal systems. [\\[1\\]](#cite-id-3Zbb0jnSPl) [\\[5\\]](#cite-id-hfIvIQ0rvU) [\\[6\\]](#cite-id-SXb2Avx1Sl)\n\n## Research Interests\n\nCheng's primary research interest is building real-time multimodal interaction systems. He aims to develop AI that can process streaming audio and video inputs and produce streaming audio and video outputs in real time. His vision for such systems includes an infinite context window for smooth interaction, advanced long-term memory capabilities, and the ability to stay updated with new information while proactively creating content. 
[\\[1\\]](#cite-id-3Zbb0jnSPl) [\\[6\\]](#cite-id-SXb2Avx1Sl) [\\[5\\]](#cite-id-hfIvIQ0rvU)","summary":"Bowen Cheng is an AI researcher at Meta's Superintelligence Lab. He holds a Ph.D. from UIUC and previously worked at OpenAI, contributing to GPT-4o, and at Tesla on the FSD v12 project. His research focuses on multimodal foundation models.","images":[{"id":"QmYHUqLJJaWYC8GDUKPPJkXcnS2KoJDMBCQ4QWA92Vu12m","type":"image/jpeg, image/png"}],"categories":[{"id":"people","title":"people"}],"tags":[{"id":"AI"},{"id":"Developers"},{"id":"PeopleInDeFi"},{"id":"Organizations"}],"media":[],"metadata":[{"id":"references","value":"[{\"id\":\"3Zbb0jnSPl\",\"url\":\"https://bowenc0221.github.io/\",\"description\":\"Bowen Cheng's personal website\",\"timestamp\":1756209277497},{\"id\":\"q6l9jCBPSE\",\"url\":\"https://medium.com/g-able/unpacking-metas-superintelligence-team-a-deep-dive-into-their-ambitious-ai-play-99cc821cffb7\",\"description\":\"Analysis of Meta's Superintelligence team\",\"timestamp\":1756209277497},{\"id\":\"0KGECiwWZT\",\"url\":\"https://x.com/bowenc0221\",\"description\":\"Bowen Cheng's X/Twitter profile\",\"timestamp\":1756209277497},{\"id\":\"KpffBig6ri\",\"description\":\"LinkedIn: Bowen Cheng\",\"timestamp\":1756209387295,\"url\":\"https://www.linkedin.com/in/bowen-cheng/\"},{\"id\":\"hfIvIQ0rvU\",\"description\":\"Google Scholar: Bowen Cheng\",\"timestamp\":1756209472240,\"url\":\"https://scholar.google.com/citations?user=JETJjHoAAAAJ\"},{\"id\":\"SXb2Avx1Sl\",\"description\":\"Bowen Cheng’s research while affiliated with Meta and other 
places\",\"timestamp\":1756209492623,\"url\":\"https://www.researchgate.net/scientific-contributions/Bowen-Cheng-2192364647\"}]"},{"id":"website","value":"https://bowenc0221.github.io/"},{"id":"twitter_profile","value":"https://x.com/bowenc0221"},{"id":"linkedin_profile","value":"https://www.linkedin.com/in/bowen-cheng/"},{"id":"github_profile","value":"https://github.com/bowenc0221"},{"id":"email_url","value":"mailto:bcheng9@illinois.edu"},{"id":"previous_cid","value":"\"https://ipfs.everipedia.org/ipfs/QmVyVrandphaP7YrHxvbSwwwpCws4sLehWXuoQ6f6D8Qhc\""},{"id":"commit-message","value":"\"Republishing wiki for Bowen Cheng\""},{"id":"previous_cid","value":"QmVyVrandphaP7YrHxvbSwwwpCws4sLehWXuoQ6f6D8Qhc"}],"events":[{"id":"806fb882-2d00-4fc2-b0d5-37c9422a1bd1","date":"2020-08","title":"Earned Ph.D. from UIUC","type":"DEFAULT","description":"Completed his Ph.D. in Electrical and Computer Engineering at the University of Illinois Urbana-Champaign, advised by Prof. Alexander Schwing and Prof. 
Thomas Huang (2017-2020).","multiDateStart":null,"multiDateEnd":null},{"id":"b3aee3af-be3d-4eff-9f26-a00391e52b59","date":"2022-09","title":"Joined Tesla Autopilot Team","type":"DEFAULT","description":"Worked as a Senior Research Scientist at Tesla, where he was a core contributor to the Full Self-Driving (FSD) v12 project.","multiDateStart":null,"multiDateEnd":null},{"id":"bf6d74dd-43ca-4bff-b16e-f8f781dc0885","date":"2024-03","title":"Joined OpenAI","type":"DEFAULT","description":"Joined OpenAI's post-training team, contributing to multimodal models including GPT-4o, the 'Thinking with Images' research, and advanced voice mode.","multiDateStart":null,"multiDateEnd":null},{"id":"6e62da7c-347f-401e-ae5e-f5e0c2c79441","date":"2024-05","title":"Joined Meta Superintelligence Lab","type":"DEFAULT","description":"Became a researcher at Meta's newly formed Superintelligence Lab (MSL), focusing on building real-time multimodal interaction systems.","multiDateStart":null,"multiDateEnd":null}],"user":{"id":"0x8af7a19a26d8fbc48defb35aefb15ec8c407f889"},"author":{"id":"0x8af7a19a26d8fbc48defb35aefb15ec8c407f889"},"language":"en","version":1,"linkedWikis":{"blockchains":[],"founders":[],"speakers":[]}}