Workshop: Using and Developing Software for Keyness Analysis

Have you read about keyness measures and keyword analysis, but still have open questions? Do you know there are multiple keyness measures, but wonder which one is best for your research question? Or do you have experience using keyness, but wonder why different tools implement different measures? Then our workshop on “Using and Developing Software for Keyness Analysis” might be just the right event for you, as it will address these and many other related questions.

The workshop will take place on February 27, 2023 (public event) and February 28, 2023 (closed event). Participation is free of charge and is possible both on-site and online. The workshop is organized within the scope of the project “Zeta and Company” (https://zeta-project.eu/en/), conducted at Trier University, Germany, and funded by the DFG.

The goal of this workshop is to foster exchange among tool developers and scholars from Computational Linguistics and Computational Literary Studies. The developers of stylo (Maciej Eder), Scattertext (Jason Kessler), TXM (Serge Heiden and Bénédicte Pincemin) and TEITOK (Maarten Janssen) will present their corpus analysis tools and the rationale behind their keyness implementation choices. The Zeta team will also present our implementation (pydistinto) and evaluation results. Finally, Stephanie Evert will present a keynote lecture on keyness to close the public programme. 

The workshop will be divided into two parts and will last two half days, roughly from noon on the first day to noon on the second day. The first part will be a public event that anyone can attend, but registration is required. The second part will be a closed event, where the developers will discuss their further goals and ideas.

To register for the workshop, please send an email to Julia Dudar (dudar@uni-trier.de) providing your full name, institution, email address and whether you will participate online or on-site. Registration is open until February 15.

We will be happy to meet you at our workshop.

Key information

Date and time (public event): February 27, 2023, 13:00 to ca. 19:15

The closed part of the event will take place on February 28, from 9:00 to 13:00. 

Location: University of Trier, Germany, Gästeraum der Mensa. Please see the directions below on how to find the venue.

Evening Keynote

Stephanie Evert: “Measuring Keyness”

Preliminary programme

Time  Event
13:00 Meet and Greet
13:30 Welcome / Opening Remarks (Christof Schöch)
13:45 “Computing specificities in TXM: philological, NLP and corpus configuration considerations” (Serge Heiden)
14:15 “Scatterchron: visualizing diachronic or multi-class corpora in whole and parts” (Jason Kessler)
14:45 Coffee break
15:15 “Zeta and Company” (Keli Du, Julia Dudar)
15:45 “Words that (might) matter, or keywords extraction using ‘stylo’” (Maciej Eder)
16:15 Short break
16:30 “Keyness in TEITOK: attempts, problems, and limitations” (Maarten Janssen)
17:00 “Fishing for Keyness. The Specificity Measure in Textometry” (Bénédicte Pincemin)
17:30 Break
17:45 Evening keynote: “Measuring Keyness” by Stephanie Evert (hybrid via Zoom)
18:45 End of the academic programme
Timetable of the workshop

Abstracts

Jason Kessler: “Scatterchron: visualizing diachronic or multi-class corpora in whole and parts”

Serge Heiden: “Computing specificities in TXM: philological, NLP and corpus configuration considerations”

Bénédicte Pincemin: “Fishing for Keyness. The Specificity Measure in Textometry”

Maarten Janssen: “Keyness in TEITOK: attempts, problems, and limitations”

Julia Dudar & Keli Du: “Zeta and Company”

Maciej Eder: “Words that (might) matter, or keywords extraction using ‘stylo’”

Additional workshop

In addition to the Keyness Workshop, we are happy to offer an introduction to Scattertext by Jason Kessler! This additional workshop will take place on Tuesday, February 28, 2023, from 14:00 to 16:00 in the Gästeraum der Mensa (guest room of the canteen). The abstract can be found below:

Jason Kessler: “It was obvious in retrospect: interactive language visualization with Scattertext”

Scattertext is a Python package designed to make it easy to produce interactive visualizations of a corpus. Interactive visualization helps not only to find associated keyphrases but also, by looking at keywords in context, to understand why they are associated.

The problem of visualizing how a foreground corpus differs from a background corpus will be the focus of the first part of the tutorial. This was the original use case of Scattertext, and we will see how simple term frequency plots can yield interesting category-associated terms. We will also look at ways of plotting term distinctiveness measures and include instructions on adding a custom measure.
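To make the idea of a term distinctiveness measure concrete, here is a minimal sketch that is not taken from the tutorial material and does not use the Scattertext API. It assumes two already tokenized, non-empty corpora and scores each term with the two-corpus log-likelihood ratio (G2) commonly used in corpus linguistics; the function name `log_likelihood_keyness` and the toy token lists are hypothetical.

```python
import math
from collections import Counter


def log_likelihood_keyness(foreground_tokens, background_tokens):
    """Return {term: signed G2}; positive means over-represented in the foreground.

    Illustrative sketch only: both token lists are assumed to be non-empty.
    """
    fg = Counter(foreground_tokens)
    bg = Counter(background_tokens)
    n_fg = sum(fg.values())
    n_bg = sum(bg.values())
    scores = {}
    for term in set(fg) | set(bg):
        a, b = fg[term], bg[term]
        # Expected counts if the term were equally frequent in both corpora.
        e_fg = n_fg * (a + b) / (n_fg + n_bg)
        e_bg = n_bg * (a + b) / (n_fg + n_bg)
        g2 = 0.0
        if a > 0:
            g2 += a * math.log(a / e_fg)
        if b > 0:
            g2 += b * math.log(b / e_bg)
        g2 *= 2
        # G2 itself is non-negative, so attach the direction of the difference.
        sign = 1 if a / n_fg >= b / n_bg else -1
        scores[term] = sign * g2
    return scores


# Toy corpora (hypothetical data, for illustration only).
foreground = "whale sea whale ship".split()
background = "love letter sea love".split()
scores = log_likelihood_keyness(foreground, background)

# Print terms from most foreground-specific to most background-specific.
for term, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{term}\t{score:.2f}")
```

Terms that occur equally often in both corpora score zero, while terms concentrated in one corpus move towards the positive or negative end of the ranking. Other keyness measures can be swapped in by replacing the body of the loop.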

We will also discuss how to visualize different kinds of features beyond unigrams, including phrase-like n-grams, lexicons, and phonemes. Special attention will be paid to reducing the set of features to ensure the visualizations render quickly.

We will also address the problem of focusing keyness analysis on terms with similar meanings or usage through embedding techniques such as word2vec or prebuilt embeddings.

The second half of the tutorial will focus on more experimental aspects of Scattertext.

Visualizing keyness for multiple categories of text in the same visualization is a difficult problem. We will look at category-based dispersion plots and use multiple interactive scatterplots to examine two-dimensional projections of category and term embeddings.

Finally, we will explore how to use Scattertext to see changes in a corpus over time. We will see how to combine parallel tag cloud-like visualizations with various types of scatter plots, including correlation, dispersion, and average-time-based plots.

The tutorial will consist of Jupyter notebooks and slides discussing related work. The material will be available at https://github.com/JasonKessler/KeynessToolsTalk as the date of the tutorial approaches.

Itineraries

How to arrive on campus

The easiest way to reach Trier University’s Campus I (main campus) is by public transport (bus). Which bus line to take depends on where you start from. It is also possible to arrive by car and park near the campus. You can find out more about your own directions here.

  • Start point A: Main (train) station (“Hauptbahnhof”)
    • Daytime:
      • Line 3 (direction Kürenz/Lud.-Erhard-Ring): every ten minutes, get out at the stop “Universität”
      • Line 231 & 31 (direction Pluwig/Bonerath): every half hour, get out at the stop “Universität Süd” 
    • Early morning/evening/weekend:
      • Line 83/88 (direction Tarforst): get out at the stop “Universität”
      • Line 81 (direction Tarforst): every half hour, get out at the stop “Universität Süd”
  • Start point B: City centre (“Porta Nigra, Bussteig 2”)
    • Daytime:
      • Line 3 (direction Kürenz/Lud.-Erhard-Ring): every ten minutes, get out at the stop “Universität”
      • Line 6 (direction Tarforst): every ten minutes, get out at the stop “Universität Süd”
    • Early morning/evening/weekend:
      • Line 83/88 (direction Tarforst): get out at the stop “Universität”

How to get from the bus stop to the “Gästeraum der Mensa”

Pro tip: You can enter “Mensa der Universität Trier, University of Trier, 54296 Trier” into Google Maps to get step-by-step directions.

Starting at bus stop “Universität”

To get to the Gästeraum der Mensa (guest room of the canteen), walk straight ahead after getting off the bus (left path next to the bridge) and follow the signs. You will pass the N building on the left side, then the E and D buildings on the right. Then turn left and cross the bridge with the yellow railing. After the bridge, you will pass the library (green building) on the left side and arrive on the Forumsplatte, from where you can directly access the canteen with the Gästeraum.

Starting at bus stop “Universität Süd”

To get to the Gästeraum der Mensa (guest room of the canteen), turn right after getting off the bus and climb the stairs that lead up to a parking lot. The first building you will see is the DM building. Pass it and follow the path until you reach another staircase, which leads up to the V building. In front of the V building, you can choose between several staircases; we recommend the one on your left-hand side (near a little pond), which brings you out almost directly in front of the canteen.