March 01, 2005
[Ruby] Half-width kana conversion
This Ruby code converts ordinary (full-width) katakana to JIS X 0201 katakana (half-width kana). The code follows.
# Copyright (c) 2005, Mellowtone Inc., ARAI Shunichi
# All rights reserved.
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
# Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
# Neither the name of the Mellowtone Inc. nor the names of its contributors may
# be used to endorse or promote products derived from this software without
# specific prior written permission.
require 'jcode'
# detach dakuten/handakuten marks from composite katakana characters.
# also substitute some characters which are not supported in JIS X 0201.
def kanabreak(str)
  daku = "ガギグゲゴザジズゼゾダヂヅデドバビブベボヴ"
  daku2 = "カキクケコサシスセソタチツテトハヒフヘホウ"
  handaku = "パピプペポ"
  handaku2 = "ハヒフヘホ"
  str.gsub!(/[#{daku}]/) {|c| c + "゛"}
  str.gsub!(/[#{handaku}]/) {|c| c + "゜"}
  str.tr!(daku,daku2)
  str.tr!(handaku,handaku2)
  str.tr!("ヵヶヰヱヮ","カケイエワ")
  return str
end
# convert Japanese katakana characters to JIS X 0201 format.
def kanazenhan(str)
  str = kanabreak(str)
  zenkana = "。「」、・ヲァィゥェォャュョッーアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン゛゜"
  a = zenkana.split(//)
  kanahash = {}
  a.each_index {|i| kanahash[a[i]] = i + 0xA1 }
  str.gsub!(/[#{zenkana}]/) { |c| kanahash[c].chr }
  return str
end
def test
  test = <<TEST
「テストデース、アライ・シュンイチ。」
ァアィイゥウェエォオ カ ガ キ ギ ク
グ ケ ゲ コ ゴ サ ザ シ ジ ス ズ セ ゼ ソ ゾ タ
ダ チ ヂ ッ ツ ヅ テ デ ト ド ナ ニ ヌ ネ ノ ハ
バ パ ヒ ビ ピ フ ブ プ ヘ ベ ペ ホ ボ ポ マ ミ
ム メ モ ャ ヤ ュ ユ ョ ヨ ラ リ ル レ ロ ヮ ワ
ヰ ヱ ヲ ン ヴ ヵ ヶ
TEST
  puts kanabreak(test)
  puts kanazenhan(test)
end
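The listing above depends on the `jcode` library, which only exists in Ruby 1.8, and it emits raw single bytes starting at 0xA1. For readers on a current Ruby, the same idea can be sketched roughly as follows. This is not the original code: the names `HALFWIDTH_TABLE`, `HALFWIDTH_PATTERN`, and `kana_to_halfwidth` are made up for illustration, and the sketch assumes Ruby 2.0 or later with UTF-8 source and strings. Instead of bytes it returns the Unicode half-width forms U+FF61 to U+FF9F, which appear in the same order as the JIS X 0201 bytes 0xA1 to 0xDF.

```ruby
# Hypothetical modern variant of the conversion above (not the original code).
# Assumes Ruby 2.0+ and UTF-8 strings.

# Full-width characters in JIS X 0201 order (0xA1..0xDF).
ZENKANA = "。「」、・ヲァィゥェォャュョッーアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン゛゜"

DAKU         = "ガギグゲゴザジズゼゾダヂヅデドバビブベボヴ"
DAKU_BASE    = "カキクケコサシスセソタチツテトハヒフヘホウ"
HANDAKU      = "パピプペポ"
HANDAKU_BASE = "ハヒフヘホ"

# Maps each full-width character to its half-width replacement string.
HALFWIDTH_TABLE = {}

# Plain characters: position i in ZENKANA maps to code point U+FF61 + i.
ZENKANA.each_char.with_index do |c, i|
  HALFWIDTH_TABLE[c] = [0xFF61 + i].pack("U")
end

# Voiced characters become the base character followed by a half-width dakuten.
DAKU.each_char.with_index do |c, i|
  HALFWIDTH_TABLE[c] = HALFWIDTH_TABLE[DAKU_BASE[i]] + HALFWIDTH_TABLE["゛"]
end

# Semi-voiced characters become the base character plus a half-width handakuten.
HANDAKU.each_char.with_index do |c, i|
  HALFWIDTH_TABLE[c] = HALFWIDTH_TABLE[HANDAKU_BASE[i]] + HALFWIDTH_TABLE["゜"]
end

# Characters with no JIS X 0201 equivalent are replaced by a similar one.
"ヵヶヰヱヮ".chars.zip("カケイエワ".chars).each do |from, to|
  HALFWIDTH_TABLE[from] = HALFWIDTH_TABLE[to]
end

HALFWIDTH_PATTERN = Regexp.union(HALFWIDTH_TABLE.keys)

# Returns a new string; unlike the original, the argument is not modified.
def kana_to_halfwidth(str)
  str.gsub(HALFWIDTH_PATTERN, HALFWIDTH_TABLE)
end
```

If the single-byte output of the original is needed, the result can be transcoded afterwards, for example `kana_to_halfwidth("ガ").encode("Shift_JIS").bytes`, which should give the two bytes 0xB6 and 0xDE.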
Posted by arai at March 1, 2005 12:19 AM