Hammad Ayyubi

I am a final-year PhD student at the Dept. of Computer Science, Columbia University, advised by Prof. Shih-Fu Chang.

My research focuses on Computer Vision, Natural Language Processing, and Commonsense Reasoning. In particular, I am interested in building systems that can reason about our world in an interpretable, robust, and trustworthy manner. This involves extensive work with LLMs, agents, tools, and instruction-tuning.

I have been fortunate to have worked with some amazing people via internships at Microsoft, Google and Adobe – Jianwei Yang, Oriana Riva, Tianqi Liu, Arsha Nagrani, Mingda Zhang, Anurag Arnab, and Vlad Morariu.

Prior to joining Columbia, I finished my Master’s at UC San Diego, advised by Prof. Gary Cottrell. I also worked with Prof. Manmohan Chandraker and Prof. David Kriegman during that time. I finished my Bachelor’s at the Indian Institute of Technology, BHU (IIT, BHU).

I am on the industry job market. If you are interested, feel free to reach out.

Dept. of Computer Science

Columbia University

New York, NY 10027

news

Oct 15, 2024	One paper on Event Graph based Interpretable VideoQA accepted to NeurIPS MAR Workshop, 2024.
Sep 26, 2024	Our paper on Multimodal Reasoning on Generated Images was accepted at NeurIPS’24. Dataset and code here.
Sep 20, 2024	Our paper on Entity-Aware Video Captioning was accepted at EMNLP’24.
Jul 22, 2024	One paper on Procedure Planning accepted at ECCV’24.
Jul 20, 2024	One paper on insufficient context in Multimodal Reasoning accepted at ACM MM’24.
Jun 1, 2024	Excited to be starting my summer internship at Adobe!
Dec 9, 2023	I am looking for an MS/UG student to work with me on Multimodal Commonsense Reasoning. If you are interested, please reach out.
Dec 9, 2023	LLM Knowledge Enhanced Video Entity-Aware Summarization (arxiv)
Dec 9, 2023	One paper accepted to AAAI 2024!
Nov 1, 2023	One paper accepted to EMNLP Findings 2023!