Juri Opitz

Hi there! I’m a researcher interested in machine learning, statistics, NLP (Natural Language Processing), and computational linguistics.

I received my Ph.D. from Heidelberg University, where I was advised by Anette Frank. I am now based in Switzerland, working at the University of Zurich’s CL department.

Overview of some work and interests 🔍

Meaning representations, Explainability, and Decomposability 🧐

I like to study representations and their ability to meaningfully capture data (e.g., text or images), and to find ways to improve their representational power, efficiency, and interlinks.

Example: Who does what to whom? A meaning representation (MR) tries to express this in a structured and explicit format, such as a graph. In this paper, we refine neural sentence embeddings with MRs to decompose them into different interpretable aspects. This keeps the efficiency and power of the neural sentence embeddings while adding some valuable explainability! Check out this repository for the code.

System evaluation 😵‍💫

Even one of the simplest of all evaluation tasks (classification evaluation) is far from trivial. For an intuitive analytical overview and comparison of classification metrics such as Macro F1, Weighted F1, Kappa, and Matthews Correlation Coefficient (MCC), check out this paper at MIT Press or arXiv. Evaluation issues typically get compounded in tasks where we don’t predict class labels but generate text or other structured predictions, such as semantic graphs. Here’s some work on generation evaluation and semantic parsing evaluation, introducing standardized and fine-grained matching.
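To get a feeling for why the choice of classification metric matters, here is a minimal sketch in plain Python. It is not code from the paper: the toy labels and the helper names (`per_class_f1`, `macro_f1`, `weighted_f1`, `accuracy`) are made up for illustration, and Macro F1 is taken here as the arithmetic mean of per-class F1 scores, which is one common definition.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return a dict mapping each label to its F1 score (0.0 if undefined)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts the same."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1, weighted by how often each class occurs in y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical toy data: 9 examples of class "a", 1 of class "b",
# and a classifier that always predicts the majority class "a".
y_true = ["a"] * 9 + ["b"]
y_pred = ["a"] * 10

print(round(accuracy(y_true, y_pred), 3))     # 0.9
print(round(macro_f1(y_true, y_pred), 3))     # 0.474
print(round(weighted_f1(y_true, y_pred), 3))  # 0.853
```

The same trivial majority-class predictor looks decent under accuracy and Weighted F1, but poor under Macro F1, because the latter gives the rare class the same weight as the frequent one.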

Other interests ✨:

NLP for history / humanities: Nowadays we have huge digitized historical data sets at our fingertips. How can computers help us make sense of tremendous amounts of such data? In a project, we’ve tried automatically reconstructing coordinates and movement patterns for thousands of medieval entities (🤴👸🧑‍🌾…), starting from the time of the Carolingian dynasty (ca. 750 CE) to Maximilian I. (ca. 1500 CE). Of course, “automatic” also means that there’s much room for reducing the error of the reconstructions. If you’ve got a nice idea for reducing the error in such approximations, here’s the code and data.

Selected works 📜

🍄 A Closer Look at Classification Evaluation Metrics and a Critical Reflection of Common Evaluation Practice. Available at: MIT Press, arXiv

🍄 SBERT studies Meaning Representations: Decomposing Sentence Embeddings into Explainable Semantic Features. Available at: ACL Anthology, arXiv

🍄 SMATCH++: Standardized and Extended Evaluation of Semantic Graphs. Available at: ACL Anthology, arXiv

For other publications, see Google Scholar.

Teaching

At Heidelberg University

Lecture on advanced programming with Python
Seminar on self-attention variants in transformer models
Seminar on computational argumentation
Seminar on semantic parsing and generation (two term projects of excellent students resulted in publications: one and two)

At TU Darmstadt

Co-supervised a graduate student project on summary evaluation, resulting in this publication.

Invited talks

Metrics of meaning representations and their interesting applications @DMR workshop at International Conference on Computational Semantics, 2023, Nancy, France.
NLP for scholars – and the role of linguistics @NLP retreat of Data and Web Science Group, 2024, Annweiler-Trifels, Germany.
NLP and linguistics @Text+ Plenary, 2024, Mannheim, Germany.
NLP and linguistics @NLP seminar series at National Research Council, Canada.