Welch’s t-test, named for its creator, Bernard Lewis Welch, is an adaptation of Student’s t-test. Unlike Student’s t-test, it does not assume that the two populations have equal variances (Welch 1947). Like the chi-squared test and the log-likelihood ratio test, it is based on hypothesis testing, but in contrast to them it takes more than the frequency of a feature into account: the sample mean, the standard deviation, and the sample size all enter the calculation of the t-value. For this reason, the measure copes better with frequent words that occur in only one text, or in only one part of a text, in a given collection.
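To make the role of mean, standard deviation, and sample size concrete, here is a minimal Python sketch of the standard Welch statistic, t = (mean_a − mean_b) / sqrt(var_a/n_a + var_b/n_b), together with the Welch–Satterthwaite approximation for the degrees of freedom. The function name `welch_t` and the per-segment word counts are hypothetical and chosen only for illustration; they do not come from any of the cited studies.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test.

    Each sample is a list of per-segment frequencies of one word.
    Hypothetical helper for illustration only.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    # Unbiased sample variances (n - 1 in the denominator), divided by
    # the sample size: the squared standard error of each mean.
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance; t is undefined")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Made-up counts of one word in four segments of each collection.
reference = [1, 0, 1, 0]   # comparison collection
even = [3, 4, 3, 2]        # 12 occurrences, spread over all segments
bursty = [12, 0, 0, 0]     # 12 occurrences, all in a single segment

t_even, df_even = welch_t(even, reference)
t_bursty, df_bursty = welch_t(bursty, reference)
print(f"even:   t = {t_even:.2f}, df = {df_even:.2f}")
print(f"bursty: t = {t_bursty:.2f}, df = {df_bursty:.2f}")
```

Both target samples contain the word 12 times, so a purely frequency-based measure would treat them alike. Here the evenly spread word gets a t-value of about 5.0, while the word concentrated in one segment gets only about 0.83, because its large variance inflates the denominator. In practice, a library routine such as SciPy's `scipy.stats.ttest_ind(a, b, equal_var=False)` would normally be used, which also returns a p-value.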

## References

Paquot, Magali, and Yves Bestgen, ‘Distinctive Words in Academic Writing: A Comparison of Three Statistical Tests for Keyword Extraction’, in *Corpora: Pragmatics and Discourse*, ed. by Andreas H. Jucker, Daniel Schreier, and Marianne Hundt (Brill | Rodopi, 2009) <https://doi.org/10.1163/9789042029101_014>

Smucker, Mark D., James Allan, and Ben Carterette, ‘A Comparison of Statistical Significance Tests for Information Retrieval Evaluation’, in *Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management*, CIKM ’07 (New York, NY, USA: ACM, 2007), pp. 623–32 <https://doi.org/10.1145/1321440.1321528>

Welch, Bernard Lewis, ‘The Generalization of Student’s Problem When Several Different Population Variances Are Involved’, *Biometrika*, 34.1–2 (1947), 28–35 <https://doi.org/10.1093/biomet/34.1-2.28>