Scott K. Geng

Hello, I'm Scott! I am a fourth-year undergrad studying math and computer science at Columbia University, where I am very fortunate to be advised by Prof. Junfeng Yang and Prof. Carl Vondrick.

I am broadly interested in teaching machines to robustly reason about open-world (and often multi-modal) data with as few labels as possible. Concretely, my current work leverages vision-language models and self-supervised representation learning techniques to model how humans interact from unlabeled in-the-wild videos. My research is supported by the Rabi Fellowship.

Email  /  CV  /  Google Scholar  /  GitHub


I've been lucky to explore research in several different fields during my time at Columbia. Currently, I work on problems in vision-language reasoning, social intelligence, and few-shot adversarial robustness at the Columbia Computer Vision Lab. Previously, I worked on program representation learning at the Software Systems Lab. Before that, I did research on quantifying movement disorders with the Kuo Lab on Columbia's medical campus.

Affective Faces for Goal-Driven Dyadic Communication
Scott Geng*, Revant Teotia*, Purva Tendulkar, Sachit Menon, Carl Vondrick
In submission
arXiv / project page

We introduce a video framework for modeling goal-conditioned interactions between verbal and non-verbal communication in dyadic conversations. To study this problem, we also introduce the RealTalk video dataset, which contains 100+ hours of unscripted in-the-wild conversations.

Understanding Zero-shot Adversarial Robustness for Large-Scale Models
Chengzhi Mao*, Scott Geng*, Junfeng Yang, Xin Wang, Carl Vondrick
ICLR, 2023

We identify the novel problem of zero-shot adversarial robustness and propose a new text-grounded adversarial training objective that can help make CLIP robust while preserving its ability to generalize.

NeuDep: Neural Binary Memory Dependence Analysis
Kexin Pei, Dongdong She*, Michael Wang*, Scott Geng*, Zhou Xuan, Yaniv David, Junfeng Yang, Suman Jana, Baishakhi Ray
ESEC/FSE, 2022
arXiv / code

Unlike in natural language, the semantic meaning of code is directly measurable as the CPU's memory values during runtime. Inferring these execution traces is a natural self-supervised task, which we leverage to learn a nice representation of binary code.

Cerebellar Oscillations in Familial and Sporadic Essential Tremor
Shi-Bing Wong, Yi-Mei Wang, Chih-Chun Lin, Scott Geng, Nora Vanegas-Arroyave, Seth Pullman, Sheng-Han Kuo, Ming-Kai Pan
The Cerebellum, 2021

Low-frequency brain waves are correlated with symptom severity in sporadic essential tremor but not in familial (i.e., genetically based) essential tremor, which suggests a difference in mechanism.


Course Assistant (Spring 2021, Fall 2021): COMS 4771 Machine Learning

Jon Barron has a very clean website.
Last updated: December 15th, 2022.