Integreat Tuesday seminar: Mor Geva

This seminar is part of the seminar series on Integreat's objective Fair, Explainable and Trustworthy Machine Learning.


Title

Knowledge in LLMs: Lack of Dependencies and Sparks of Reasoning

Abstract

Some of the most prominent issues with large language models (LLMs), such as generation of factually incorrect text and logically incorrect reasoning, may be attributed to the way models represent knowledge internally. For example, a model could fail to provide consistent answers to the questions “What is the area code in Oslo?” and “What is the area code in Norway's capital?” because it encodes “Oslo” and “Norway's capital” as two disjoint places. In this talk, I will present new findings on how LLMs represent and utilize knowledge dependencies internally. First, we will analyze the ripple effects of knowledge editing, showing that editing a specific fact in the model does not implicitly induce modification of other facts that depend on it. Next, we will consider a setting of latent two-hop reasoning, showing that a model's consistency in answering questions about an entity (e.g., Oslo) versus a description of it (e.g., Norway's capital) typically does not correlate with its ability to resolve the entity from its description. We will conclude by showing that local interventions on the LLM computation can mitigate these limitations, suggesting that, despite their poor representation of knowledge dependencies, LLMs do exhibit latent reasoning abilities.
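
To make the two-hop setting above concrete, here is a minimal, hypothetical sketch of the kind of consistency probe the abstract describes. Nothing in it is taken from the talk: the ask function is a stand-in for an arbitrary LLM call (stubbed with canned answers so the snippet runs on its own), and the prompts and checks are purely illustrative.

# Hypothetical sketch (not from the talk): probe whether a model answers
# consistently about an entity ("Oslo") and a description of it
# ("Norway's capital"), and separately whether it can resolve the description.

def ask(prompt: str) -> str:
    """Stand-in for any LLM completion call; replace with a real model/API."""
    canned = {
        "What is the area code in Oslo?": "The area code in Oslo is 22.",
        "What is the area code in Norway's capital?": "The area code in Norway's capital is 22.",
        "What is Norway's capital?": "Norway's capital is Oslo.",
    }
    return canned.get(prompt, "I don't know.")

# Two-hop question: answering it requires implicitly resolving
# "Norway's capital" -> "Oslo" and then retrieving the area code.
direct = ask("What is the area code in Oslo?")
two_hop = ask("What is the area code in Norway's capital?")
consistent = ("22" in direct) == ("22" in two_hop)

# First hop on its own: can the model resolve the description at all?
resolves_description = "Oslo" in ask("What is Norway's capital?")

print(f"Consistent answers: {consistent}; resolves description: {resolves_description}")

The point of separating the two checks is that, as the abstract notes, consistency on the paired questions and success on the resolution step need not go together.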

Bio

Mor Geva is an Assistant Professor (Senior Lecturer) at the School of Computer Science at Tel Aviv University and a Visiting Researcher at Google Research. Her research focuses on understanding the inner workings of large language models, to increase their transparency and efficiency, control their operation, and improve their reasoning abilities. Mor completed a Ph.D. in Computer Science and a B.Sc. in Bioinformatics at Tel Aviv University, and was a postdoctoral researcher at Google DeepMind and the Allen Institute for AI. She was nominated as one of the MIT Rising Stars in EECS and is a laureate of the Séphora Berrebi Scholarship in Computer Science. She was awarded the Dan David Prize for graduate students in the field of AI and recently received an Outstanding Paper Award at EACL 2023.

Practical

The Tuesday seminar series is devoted to various topics relevant to Integreat's research focus. Presenters from the Integreat community and beyond have 40 minutes to present, followed by a group discussion.

Seminars are open to everybody.

If you are unable to attend in person, contact the organisers for the attendance link.

Points of contact:
