Hannah Cyberey

I’m a Postdoctoral Research Associate in the School of Data Science at the University of Virginia, supervised by Alex Gates. My current research studies the functional backbone of AI progress—how AI capabilities emerge, interconnect, and evolve within the broader research ecosystem, as part of UVA’s National Security Data and Policy Institute.

I received my PhD in Computer Science from the University of Virginia, advised by Prof. David Evans and Yangfeng Ji. I’m broadly interested in AI safety and ethics. My PhD research focused on trustworthy natural language processing (NLP), addressing the robustness and fairness of language models. My recent work explores representation engineering methods for mitigating bias and countering censorship.

Email: yc4dx at virginia dot edu

news

Aug 20, 2025	Our paper “Unsupervised Concept Vector Extraction for Bias Control in LLMs” is accepted to EMNLP 2025 (Main Conference)!
Jul 21, 2025	I successfully defended my PhD. I’m officially Dr. Cyberey!
Jul 08, 2025	Our paper “Steering the CensorShip: Uncovering Representation Vectors for LLM ‘Thought’ Control” is accepted to COLM 2025!
Apr 24, 2025	Our paper “Do Prevalent Bias Metrics Capture Allocational Harms from LLMs?” is accepted to the Workshop on Insights from Negative Results in NLP.
May 09, 2024	I passed my PhD dissertation proposal defense!

latest posts

Apr 24, 2025	Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought" Control
Aug 10, 2024	The Mismeasure of Man and Models
Aug 17, 2023	Adjectives Can Reveal Gender Biases Within NLP Models