The Wilcoxon rank sum test, also known as the Mann-Whitney U test, makes no assumption about the statistical distribution of words in a corpus (Wilcoxon 1945, Mann & Whitney 1947). It compares the sums of the ranks of the texts in two text collections. The texts of both collections are pooled and ranked by the frequency of a target word, without regard to which of the two corpora a text belongs to (see Lijffijt et al. 2014). In our implementation, the frequencies are summed per document segment; for this reason, we consider it a dispersion-based rather than a frequency-based measure.
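The ranking procedure can be illustrated with a minimal sketch in plain Python. It assumes that each segment has already been reduced to a single frequency of the target word. The function names (`average_ranks`, `rank_sum_test`) and the sample counts are hypothetical. The p-value uses the normal approximation with tie correction and no continuity correction, which may differ from the implementation described above.

```python
from collections import Counter
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(target, comparison):
    """Return (U, z, two-sided p) for the target sample against the comparison sample."""
    n1, n2 = len(target), len(comparison)
    n = n1 + n2
    pooled = list(target) + list(comparison)

    # Rank all segments together, ignoring which corpus they come from.
    ranks = average_ranks(pooled)
    rank_sum_target = sum(ranks[:n1])
    u = rank_sum_target - n1 * (n1 + 1) / 2

    # Normal approximation with a correction for tied frequencies.
    mean_u = n1 * n2 / 2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:  # every segment has the same frequency
        return u, 0.0, 1.0
    z = (u - mean_u) / sqrt(var_u)
    p = erfc(abs(z) / sqrt(2))
    return u, z, p


# Hypothetical frequencies of one target word in five segments per corpus.
target_counts = [5, 7, 6, 9, 8]
comparison_counts = [1, 2, 0, 3, 2]
u, z, p = rank_sum_test(target_counts, comparison_counts)
print(f"U = {u}, z = {z:.3f}, p = {p:.4f}")
```

Because only the ranks enter the statistic, a single segment with an extremely high count of the target word affects the result no more than any other top-ranked segment.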


Lijffijt, Jefrey, Terttu Nevalainen, Tanja Säily, Panagiotis Papapetrou, Kai Puolamäki, and Heikki Mannila, ‘Significance Testing of Word Frequencies in Corpora’, Digital Scholarship in the Humanities, 31.2 (2014), 374–97
Paquot, Magali, and Yves Bestgen, ‘Distinctive Words in Academic Writing: A Comparison of Three Statistical Tests for Keyword Extraction’, in Corpora: Pragmatics and Discourse, ed. by Andreas H. Jucker, Daniel Schreier, and Marianne Hundt (Brill | Rodopi, 2009)
Woolson, R. F., ‘Wilcoxon Signed-Rank Test’, in Wiley Encyclopedia of Clinical Trials, ed. by Ralph B. D’Agostino, Lisa Sullivan, and Joseph Massaro (Hoboken, NJ, USA: John Wiley & Sons, Inc., 2008), p. eoct979
Zimmerman, Donald W., and Bruno D. Zumbo, ‘Relative Power of the Wilcoxon Test, the Friedman Test, and Repeated-Measures ANOVA on Ranks’, The Journal of Experimental Education, 62.1 (1993), 75–86
Mann, H. B., and D. R. Whitney, ‘On a Test of Whether One of Two Random Variables Is Stochastically Larger than the Other’, The Annals of Mathematical Statistics, 18.1 (1947), 50–60
Wilcoxon, Frank, ‘Individual Comparisons by Ranking Methods’, Biometrics Bulletin, 1.6 (1945), 80