Great Circle Associates

XCIN Mail-list
(December 2000)



Subject: Re: about xcin and addtsi....
From: Chih-Hao Tsai <hao520@yahoo.com>
Organization: Taiwan Linux User Group News Server
Date: Thu, 07 Dec 2000 15:19:42 -0600
To: xcin@tlug.sinica.edu.tw
Reply-To: xcin@linux.org.tw

Kuang-che Wu wrote:
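> What I said earlier was wrong. Right now it simply offers the character
> with the higher character frequency first. What it should offer first is
> "the highest-frequency character among those read with the given sound".
> The approach I have in mind: using the lexicon and the Zhuyin readings
> entered in it, count how often each reading of each character occurs,
> and use that frequency to choose which character to offer.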

Something I put together a little over half a year ago:
Mandarin syllable frequency counts for Chinese characters 
http://www.geocities.com/hao510/syllable/

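Note that "frequency" in these counts means the number of words in which
a given reading of a given character appears. It is not weighted by
character frequency or word frequency. Because every reading of a
polyphonic character is listed after that character, weighting by
frequency would force you to add the same weight to each reading, which
would make the results inaccurate. Leaving the frequency weighting out
actually gives somewhat cleaner results.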
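
The unweighted count described above (for each character, the number of lexicon words in which each of its readings appears) can be sketched roughly as follows. This is only an illustration under assumptions: the lexicon format (a word paired with one Zhuyin reading per character), the names `count_readings` and `rank_chars`, and the six sample entries are all hypothetical, and are taken neither from xcin nor from the syllable-count page linked above.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: (word, one Zhuyin reading per character).
SAMPLE_LEXICON = [
    ("銀行", ["ㄧㄣˊ", "ㄏㄤˊ"]),
    ("行業", ["ㄏㄤˊ", "ㄧㄝˋ"]),
    ("行人", ["ㄒㄧㄥˊ", "ㄖㄣˊ"]),
    ("旅行", ["ㄌㄩˇ", "ㄒㄧㄥˊ"]),
    ("流行", ["ㄌㄧㄡˊ", "ㄒㄧㄥˊ"]),
    ("形狀", ["ㄒㄧㄥˊ", "ㄓㄨㄤˋ"]),
]


def count_readings(lexicon):
    """Map each character to a Counter of {reading: number of words}.

    Every word contributes at most 1 to a (character, reading) pair;
    no character-frequency or word-frequency weighting is applied.
    """
    counts = defaultdict(Counter)
    for word, readings in lexicon:
        if len(word) != len(readings):
            continue  # skip entries whose readings do not line up
        # set() so a pair repeated inside one word is counted once
        for char, reading in set(zip(word, readings)):
            counts[char][reading] += 1
    return counts


def rank_chars(counts, reading):
    """List (character, count) pairs for a reading, highest count first."""
    ranked = [(char, c[reading]) for char, c in counts.items() if reading in c]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


if __name__ == "__main__":
    counts = count_readings(SAMPLE_LEXICON)
    print(rank_chars(counts, "ㄒㄧㄥˊ"))  # [('行', 3), ('形', 1)]
```

With this toy data, 行 is counted in three words as ㄒㄧㄥˊ and in two as ㄏㄤˊ, so each reading of a polyphonic character gets its own count rather than sharing a single per-character weight.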



--
Chih-Hao Tsai | ICQ#5734422 | http://www.geocities.com/hao520

