Welch’s T-Test – Zeta and Company

Welch’s t-test, named after its creator Bernard Lewis Welch, is an adaptation of Student’s t-test. Unlike Student’s t-test, it does not assume equal variances in the two populations (Welch 1947). Like the chi-squared test and the log-likelihood ratio test, it is a hypothesis test, but unlike them it does not rely solely on the frequency of a feature: the sample mean, the standard deviation, and the sample size all enter the calculation of the t-value. For this reason, the measure copes better with words that are frequent overall but occur in only one text, or in one part of a text, in a given collection.
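The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by the project: it applies the standard textbook form of the statistic, t = (mean_a - mean_b) / sqrt(var_a/n_a + var_b/n_b), together with the Welch-Satterthwaite approximation for the degrees of freedom. The function name `welch_t` and the per-segment frequency values are hypothetical and chosen only for illustration.

```python
import math
import statistics


def welch_t(sample_a, sample_b):
    """Return (t, df) for Welch's unequal-variance t-test.

    Each sample is a sequence of numbers, e.g. the relative frequency
    of one word in each segment of a corpus (an assumption made for
    this sketch).
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two observations")

    mean_a = statistics.fmean(sample_a)
    mean_b = statistics.fmean(sample_b)
    # Sample variances (n - 1 in the denominator), one per group:
    # the two groups are NOT pooled, which is what distinguishes
    # Welch's test from Student's t-test.
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)

    # Squared standard error contributed by each group.
    se2_a = var_a / n_a
    se2_b = var_b / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")

    t = (mean_a - mean_b) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Invented example data: occurrences per 1,000 words in five segments
# of a target corpus and of a comparison corpus. In the target corpus
# the word is frequent in total but confined to a single segment.
target = [0.0, 0.0, 12.0, 0.0, 0.0]
comparison = [0.0, 1.0, 0.0, 0.0, 1.0]

t_value, dof = welch_t(target, comparison)
print(f"t = {t_value:.3f}, df = {dof:.2f}")
```

Because the single high-frequency segment inflates the variance of the target sample, the standard error grows and the t-value stays modest, whereas a test based only on total frequency would see a large difference between the corpora.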
Bibliography




Paquot, Magali, and Yves Bestgen, ‘Distinctive Words in Academic Writing: A Comparison of Three Statistical Tests for Keyword Extraction’, in Corpora: Pragmatics and Discourse, ed. by Andreas H. Jucker, Daniel Schreier, and Marianne Hundt (Brill | Rodopi, 2009), doi:10.1163/9789042029101_014

Smucker, Mark D., James Allan, and Ben Carterette, ‘A Comparison of Statistical Significance Tests for Information Retrieval Evaluation’, in Proceedings of the Sixteenth ACM Conference on Conference on Information and Knowledge Management, CIKM ’07 (ACM, 2007), pp. 623–32, doi:10.1145/1321440.1321528

Welch, Bernard Lewis, ‘The Generalization of Student’s Problem When Several Different Population Variances Are Involved’, Biometrika, 34.1–2 (1947), pp. 28–35, doi:10.1093/biomet/34.1-2.28