Burrows’ Zeta was first proposed by Burrows (2007) and was originally used for stylometric authorship attribution. Several variants of Zeta have since been proposed, by Craig and Kinney (2009) and by Schöch et al. (2018). Zeta is mathematically very simple and has a bias towards content words, two attributes that make the measure attractive for other application domains in Computational Literary Studies (CLS), such as genre analysis (Schöch 2018) or gender analysis (Hoover 2010). The measure quantifies the degree of dispersion of a feature in each of two corpora and compares the two values. It does so by comparing the document proportion of a target word or feature (that is, the proportion of all documents in which the target word occurs at least once) in the target corpus with its document proportion in the comparison corpus. In our framework, we implemented two variants of Zeta in order to compare their performance: Burrows’ Zeta (Zeta_orig, Burrows 2007) and logarithmic Zeta (Zeta_log, Schöch et al. 2018).
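The comparison of document proportions described above can be illustrated with a minimal Python sketch. It is not the framework's actual code. It assumes that each document is represented as a set of word types, that Zeta_orig is the document proportion in the target corpus minus the document proportion in the comparison corpus, and that Zeta_log takes the same difference after a log2 transformation. The function names and the `offset` parameter (added before taking the logarithm so that a proportion of zero stays defined) are hypothetical choices made for this illustration.

```python
import math


def document_proportion(word, corpus):
    """Share of documents in `corpus` that contain `word` at least once.

    `corpus` is a list of documents; each document is a set of word types.
    """
    if not corpus:
        raise ValueError("corpus must contain at least one document")
    return sum(1 for doc in corpus if word in doc) / len(corpus)


def zeta_orig(word, target, comparison):
    """Difference of document proportions: target minus comparison."""
    return document_proportion(word, target) - document_proportion(word, comparison)


def zeta_log(word, target, comparison, offset=1.0):
    """Difference of log2-transformed document proportions.

    `offset` is an assumed smoothing constant for this sketch; it keeps
    log2 defined when a word is absent from one of the corpora.
    """
    dp_target = document_proportion(word, target)
    dp_comparison = document_proportion(word, comparison)
    return math.log2(dp_target + offset) - math.log2(dp_comparison + offset)


# Toy corpora: two documents each, already reduced to sets of word types.
target = [{"sea", "ship"}, {"sea", "storm"}]
comparison = [{"city", "street"}, {"sea", "city"}]

print(zeta_orig("sea", target, comparison))   # 0.5  (2/2 minus 1/2)
print(zeta_orig("city", target, comparison))  # -1.0 (0/2 minus 2/2)
print(zeta_log("sea", target, comparison))    # positive: "sea" is better dispersed in the target corpus
```

Under these assumptions, positive scores mark words that are more widely dispersed in the target corpus and negative scores mark words that are more widely dispersed in the comparison corpus. Both variants produce scores with the same sign for a given word and differ only in how the scores are scaled.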

References

Rotari, Gabriela, Melina Jander, and Jan Rybicki, ‘The Grimm Brothers: A Stylometric Network Analysis’, Digital Scholarship in the Humanities, 36.1 (2021), 172–86 <https://doi.org/10.1093/llc/fqz088>
Du, Keli, Julia Dudar, Cora Rok, and Christof Schöch, ‘Zeta & Eta: An Exploration and Evaluation of Two Dispersion-Based Measures of Distinctiveness’, in Proceedings Computational Humanities Research 2021, CEUR Workshop Proceedings, 2989 (2021) <http://ceur-ws.org/Vol-2989/>
Rizvi, Pervez, ‘An Improvement to Zeta’, Digital Scholarship in the Humanities, 34.2 (2019), 419–22 <https://doi.org/10.1093/llc/fqy039>
Rizvi, Pervez, ‘The Interpretation of Zeta Test Results’, Digital Scholarship in the Humanities, 34.2 (2019), 401–18 <https://doi.org/10.1093/llc/fqy038>
Rebora, Simone, J. Berenike Herrmann, Gerhard Lauer, and Massimo Salgaro, ‘Robert Musil, a War Journal, and Stylometry: Tackling the Issue of Short Texts in Authorship Attribution’, Digital Scholarship in the Humanities, 34.3 (2019), 582–605 <https://doi.org/10.1093/llc/fqy055>
González, José Eduardo, Montserrat-Fuente Camacho, and Marcus Barbosa, ‘Detecting Modernismo’s Fingerprint: A Digital Humanities Approach to the Turn of the Century Spanish American Novel’, Review: Literature and Arts of the Americas, 51.2 (2018), 195–204 <https://doi.org/10.1080/08905762.2018.1540577>
Weidman, Sean G., and James O’Sullivan, ‘The Limits of Distinctive Words: Re-Evaluating Literature’s Gender Marker Debate’, Digital Scholarship in the Humanities, 33.2 (2018), 374–90 <https://doi.org/10.1093/llc/fqx017>
Schöch, Christof, ‘Zeta für die kontrastive Analyse literarischer Texte. Theorie, Implementierung, Fallstudie’, in Quantitative Ansätze in den Literatur- und Geisteswissenschaften. Systematische und historische Perspektiven, ed. by Toni Bernhart, Sandra Richter, Marcus Lepper, Marcus Willand, and Andrea Albrecht (Berlin: de Gruyter, 2018), pp. 77–94 <https://www.degruyter.com/view/books/9783110523300/9783110523300-004/9783110523300-004.xml>
Schöch, Christof, Daniel Schlör, Albin Zehe, Henning Gebhard, Martin Becker, and Andreas Hotho, ‘Burrows’ Zeta: Exploring and Evaluating Variants and Parameters’, in Book of Abstracts of the Digital Humanities Conference (presented at the Digital Humanities Conference (DH2018), Mexico City: ADHO, 2018) <https://dh2018.adho.org/burrows-zeta-exploring-and-evaluating-variants-and-parameters/>
Hoover, David L., ‘Using the Zeta and Iota Spreadsheet’, 2017 <https://wp.nyu.edu/exceltextanalysis/zetaiotawidespectrum/usingzetaiota/> [accessed 17 September 2019]
Rybicki, Jan, ‘Vive La Différence: Tracing the (Authorial) Gender Signal by Multivariate Analysis of Word Frequencies’, Digital Scholarship in the Humanities, 31.4 (2016), 746–61 <https://doi.org/10.1093/llc/fqv023>
Hoover, David L., ‘The Tutor’s Story: A Case Study of Mixed Authorship’, English Studies, 93.3 (2012), 324–39 <https://doi.org/10.1080/0013838X.2012.668791>
Hoover, David L., ‘DH2010: The Craig Zeta Spreadsheet’, 2010 <http://dh2010.cch.kcl.ac.uk/academic-programme/abstracts/papers/html/ab-659.html> [accessed 17 September 2019]
Craig, Hugh, and Arthur F. Kinney, eds., Shakespeare, Computers, and the Mystery of Authorship, 1st edn (Cambridge University Press, 2009)
Burrows, John, ‘All the Way Through: Testing for Authorship in Different Frequency Strata’, Literary and Linguistic Computing, 22.1 (2007), 27–47 <https://doi.org/10.1093/llc/fqi067>
Jordan, Ellen, Hugh Craig, and Alexis Antonia, ‘The Brontë Sisters and the “Christian Remembrancer”: A Pilot Study in the Use of the “Burrows Method” to Identify the Authorship of Unsigned Articles in the Nineteenth-Century Periodical Press’, Victorian Periodicals Review, 39.1 (2006), 21–45 <https://www.jstor.org/stable/20084107> [accessed 7 September 2019]
Burrows, John, ‘Who Wrote Shamela? Verifying the Authorship of a Parodic Text’, Digital Scholarship in the Humanities, 20.4 (2005), 437–50 <https://doi.org/10.1093/llc/fqi049>
Burrows, John, and Hugh Craig, ‘Lucy Hutchinson and the Authorship of Two Seventeenth-Century Poems: A Computational Approach’, The Seventeenth Century, 16.2 (2001), 259–82 <https://doi.org/10.1080/0268117X.2001.10555493>
Forsyth, R. S., D. I. Holmes, and E. K. Tse, ‘Cicero, Sigonio, and Burrows: Investigating the Authenticity of the Consolatio’, Digital Scholarship in the Humanities, 14.3 (1999), 375–400 <https://doi.org/10.1093/llc/14.3.375>