The Wilcoxon rank sum test, also known as the Mann-Whitney U test, makes no assumption about the statistical distribution of words in a corpus (Wilcoxon 1945, Mann & Whitney 1947). It compares the sums of the rank orders of texts in two text collections. The texts of both collections are ranked together according to the frequency of a target word, regardless of which of the two corpora a text belongs to (see Lijffijt 2014). In our implementation, the frequencies are summed per document segment; for this reason, we consider it a dispersion-based rather than a frequency-based measure.
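
The ranking procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not our actual implementation: the function names, the invented per-segment counts, and the choice of a normal approximation without tie correction for the p-value are all hypothetical.

```python
from math import erfc, sqrt


def average_ranks(values):
    """Rank values from 1 to n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def rank_sum_test(target, comparison):
    """Return (U for the target corpus, two-sided p-value).

    The p-value uses a normal approximation without tie correction,
    which is an assumption made for this sketch.
    """
    n1, n2 = len(target), len(comparison)
    # Rank all segments together, ignoring which corpus they come from.
    ranks = average_ranks(list(target) + list(comparison))
    rank_sum_target = sum(ranks[:n1])
    u_target = rank_sum_target - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u_target - mean_u) / sd_u
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the standard normal
    return u_target, p_value


# Invented frequencies of one target word in five segments per corpus.
target_counts = [5, 7, 6, 9, 8]
comparison_counts = [1, 0, 2, 3, 4]

u, p = rank_sum_test(target_counts, comparison_counts)
print(u, p)
```

Because the input is one frequency per segment rather than one total per corpus, a word that is frequent in only a single segment receives one high rank instead of dominating the result. In practice, a library routine such as `scipy.stats.mannwhitneyu` could replace the hand-written functions.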

References

Lijffijt, Jefrey, and others, ‘Significance Testing of Word Frequencies in Corpora’, Digital Scholarship in the Humanities, 31.2 (2014), pp. 374–97, http://doi.org/10.1093/llc/fqu064
Paquot, Magali, and Yves Bestgen, ‘Distinctive Words in Academic Writing: A Comparison of Three Statistical Tests for Keyword Extraction’, in Corpora: Pragmatics and Discourse, ed. by Andreas H. Jucker, Daniel Schreier, and Marianne Hundt (Brill | Rodopi, 2009), http://doi.org/10.1163/9789042029101_014
Woolson, R. F., ‘Wilcoxon Signed-Rank Test’, in Wiley Encyclopedia of Clinical Trials, ed. by Ralph B. D’Agostino, Lisa Sullivan, and Joseph Massaro (John Wiley & Sons, Inc., 2008), p. eoct979, http://doi.org/10.1002/9780471462422.eoct979
Zimmerman, Donald W., and Bruno D. Zumbo, ‘Relative Power of the Wilcoxon Test, the Friedman Test, and Repeated-Measures ANOVA on Ranks’, The Journal of Experimental Education, 62.1 (1993), pp. 75–86, http://doi.org/10.1080/00220973.1993.9943832
Mann, H. B., and D. R. Whitney, ‘On a Test of Whether One of Two Random Variables Is Stochastically Larger than the Other’, The Annals of Mathematical Statistics, 18.1 (1947), pp. 50–60, http://doi.org/10.1214/aoms/1177730491
Wilcoxon, Frank, ‘Individual Comparisons by Ranking Methods’, Biometrics Bulletin, 1.6 (1945), p. 80, http://doi.org/10.2307/3001968