Ruby transliteration using hash
I am trying to make a Cyrillic => Latin transliteration using a hash. I use # encoding: utf-8 and Ruby 1.9.3. I want this code to change the value of file_name. Why does the code leave file_name unchanged?
abc = Hash.new
abc = {"a" => "a", "b" => "б", "v" => "в", 'g' => "г", 'd'=> "д", 'jo' => "ё", 'zh' => "ж", 'th' => "з", 'i' => "и", 'l' => "л", 'm' => "м", 'n' => "н",'p' => "п", 'r' => "р", 's' => "с", 't' => "т", 'u' => "у", 'f' => "ф", 'h' => "х", 'c' => "ц", 'ch' => "ч", 'sh' => "ш", 'sch' => "щ", 'y' => "ы",'u' => "ю", 'ja' => "я"}
file_name.each_char do |c|
  abc.each {|key, value|
    if c == value
      c = key
    end
  }
end
The problem is that the .each_char block variable (c in the question) is not a pointer to the character in the string, so assigning to it does not alter the string in situ. There are ways to make per-character mapping work there (using .map followed by .join, for instance), but they are inefficient compared to .tr! or .gsub! for this purpose, because breaking the string out into an array of characters and reconstructing it involves creating many more Ruby objects.
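To illustrate the point about the block variable, here is a minimal sketch. The two-entry hash and the sample name "бвб" are made up for the demonstration and are not taken from the question.

```ruby
# encoding: utf-8

# Hypothetical two-entry mapping and sample name, for illustration only.
abc = { "b" => "б", "v" => "в" }
file_name = "бвб".dup

# Reassigning the block variable only rebinds the local name c;
# the string itself is never touched.
file_name.each_char do |c|
  abc.each { |key, value| c = key if c == value }
end
unchanged = file_name.dup

# Per-character mapping does work if the block results are collected
# and joined, at the cost of one String object per character.
latin_of = abc.invert
converted = file_name.each_char.map { |c| latin_of.fetch(c, c) }.join
```

Here unchanged still holds the Cyrillic text, while converted holds the Latin version. Note that even the .map/.join form builds a new string rather than modifying file_name.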
I think you need something like
file_name.tr!( 'aбвгдилмнпрстуфхцыю', 'abvgdilmnprstufhcyu' )
which covers the single-letter conversions efficiently. You also have some multi-letter conversions. You can use gsub! for that, with an inverted copy of your hash:
latin_of = {"ё"=>"jo", "ж"=>"zh", "з"=>"th", "ч"=>"ch", "ш"=>"sh", "щ"=>"sch", "я"=>"ja"}
file_name.gsub!( /[ёжзчшщя]/ ) { |cyrillic| latin_of[ cyrillic ] }
Note that, unlike each_char, the return value of the block in .gsub! is used to replace whatever was matched in the original string. The above code uses the inversion of the original hash to find the correct Latin replacement for the matched Cyrillic character.
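Putting the two steps together, a rough sketch might look like the following. The file name "жир.txt" is a hypothetical sample; the tr! strings and the latin_of hash are the ones from this answer.

```ruby
# encoding: utf-8

# Hypothetical sample name; .dup gives a mutable copy to modify in place.
file_name = "жир.txt".dup

# Step 1: single-letter conversions, character for character.
file_name.tr!('aбвгдилмнпрстуфхцыю', 'abvgdilmnprstufhcyu')
after_tr = file_name.dup

# Step 2: multi-letter conversions; the block's return value
# replaces each matched character.
latin_of = {"ё"=>"jo", "ж"=>"zh", "з"=>"th", "ч"=>"ch", "ш"=>"sh", "щ"=>"sch", "я"=>"ja"}
file_name.gsub!(/[ёжзчшщя]/) { |cyrillic| latin_of[cyrillic] }
```

One thing to watch: both tr! and gsub! return nil when they change nothing, so it is safer to call them as separate statements than to chain them.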
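You don't strictly need tr! at all. If you prefer, you can instead use the inversion of the full original hash in a single pass with the second (gsub!) syntax. The cost of using two methods probably means you don't gain much by using .tr!. Still, you should know about the String#tr! method, as it can be handy.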
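A single-pass version could be sketched like this. The trimmed abc hash and the sample name "борщ" are hypothetical, and matching with \p{Cyrillic} is my own choice for the pattern, not something from the question.

```ruby
# encoding: utf-8

# Trimmed, hypothetical version of a Latin => Cyrillic hash.
abc = { "b" => "б", "o" => "о", "r" => "р", "sch" => "щ" }

# Hash#invert swaps keys and values, giving Cyrillic => Latin.
latin_of = abc.invert

file_name = "борщ".dup   # hypothetical sample name

# Match any single Cyrillic letter; a letter with no entry in the
# hash is left as it is (fetch falls back to the matched character).
file_name.gsub!(/\p{Cyrillic}/) { |cyrillic| latin_of.fetch(cyrillic, cyrillic) }
```

If you invert the hash from the question as written, be aware that it lists the key 'u' twice. Ruby keeps only the last pair for a duplicated key, so the inverted hash would have an entry for "ю" but none for "у".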
Edit: As suggested in the comments, .gsub! can do a lot more for you here. Assuming latin_of is the complete hash with Cyrillic keys and Latin values, you can do this:
file_name.gsub!( Regexp.union(latin_of.keys), latin_of )
Two things to note:
Regexp.union(latin_of.keys) takes the array of keys that you want to convert and builds a pattern that ensures gsub will find them, ready for replacement, in the String.
gsub! accepts a hash as the second parameter, and converts each match by looking it up as a key and replacing it with the associated value, which is exactly the behaviour you are looking for.
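As a small end-to-end sketch of the hash form, assuming a trimmed, hypothetical latin_of and the made-up sample name "жук.txt":

```ruby
# encoding: utf-8

# Hypothetical, trimmed Cyrillic => Latin hash.
latin_of = { "ж" => "zh", "у" => "u", "к" => "k", "щ" => "sch" }

# Regexp.union builds one alternation pattern from the array of keys.
pattern = Regexp.union(latin_of.keys)

file_name = "жук.txt".dup   # hypothetical sample name

# Each match is looked up in the hash and replaced by its value.
file_name.gsub!(pattern, latin_of)
```

Characters that are not keys of the hash (here the ".txt" part) never match the pattern, so they pass through unchanged.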