Ruby transliteration using hash


I am trying to make a Cyrillic => Latin transliteration using a hash. I use # encoding: utf-8 and Ruby 1.9.3. I want the code to change the value of file_name. Why does this code leave file_name unchanged?

abc = Hash.new
abc = {"a" => "а", "b" => "б", "v" => "в", 'g' => "г", 'd' => "д", 'jo' => "ё",
       'zh' => "ж", 'th' => "з", 'i' => "и", 'l' => "л", 'm' => "м", 'n' => "н",
       'p' => "п", 'r' => "р", 's' => "с", 't' => "т", 'u' => "у", 'f' => "ф",
       'h' => "х", 'c' => "ц", 'ch' => "ч", 'sh' => "ш", 'sch' => "щ", 'y' => "ы",
       'u' => "ю", 'ja' => "я"}

file_name.each_char do |c|
  abc.each { |key, value| if c == value then c = key end }
end

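The problem is that the .each_char block variable - c in the question - does not point to the character in the string in a way that lets you alter the string in situ. Assigning to c only rebinds the local variable; file_name itself is never touched. There are ways to make a per-character mapping work there (using .map followed by .join, for instance), but they are inefficient compared to .tr! or .gsub! for this purpose, because breaking the string out into an array of characters and reconstructing it involves creating many Ruby objects.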
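To make the difference concrete, here is a minimal sketch. The two-letter LATIN_FOR table and both method names are made up for illustration; they are not from the question.

```ruby
# encoding: utf-8

# Hypothetical two-letter table for illustration (Cyrillic => Latin).
LATIN_FOR = { "д" => "d", "а" => "a" }

# Reassigning the block variable only rebinds the local name c;
# the string that each_char was called on is left untouched.
def rebind_block_variable(name)
  name.each_char do |c|
    c = LATIN_FOR[c] if LATIN_FOR.key?(c)
  end
  name
end

# A per-character mapping that does work: build a new array, then join it.
# Characters missing from the table are passed through unchanged.
def map_and_join(name)
  name.each_char.map { |c| LATIN_FOR.fetch(c, c) }.join
end

puts rebind_block_variable("да")  # still prints "да"
puts map_and_join("да")           # prints "da"
```

The second method gives the right result, but it allocates one small string per character plus an intermediate array, which is the overhead mentioned above.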

I think you need something like

file_name.tr!( 'абвгдилмнпрстуфхцыю', 'abvgdilmnprstufhcyu' )

which covers the single-letter conversions very efficiently. You also have multi-letter conversions. You can use gsub! for those, with an inverted copy of your hash

latin_of = {"ё"=>"jo", "ж"=>"zh", "з"=>"th", "ч"=>"ch",
            "ш"=>"sh", "щ"=>"sch", "я"=>"ja"}
file_name.gsub!( /[ёжзчшщя]/ ) { |cyrillic| latin_of[ cyrillic ] }

Note that, unlike each_char, the return value of the block in .gsub! is used to replace whatever was matched in the original string. The code above uses an inversion of your original hash to find the correct Latin replacement for the matched Cyrillic character.

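You don't need tr! at all. Instead, if you prefer, you could use an inversion of your whole original hash in one pass with the second syntax (gsub! with a block). The cost of calling two methods probably means you don't gain much by using .tr! anyway. You should still know about the String#tr! method, though, as it can be very handy.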
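A minimal sketch of that one-pass variant, assuming a shortened stand-in for the question's hash. The names ABC, LATIN_OF and transliterate! are hypothetical, and the \p{Cyrillic} pattern is my choice here rather than something from the question.

```ruby
# encoding: utf-8

# Shortened stand-in for the question's hash (Latin => Cyrillic).
ABC = { "d" => "д", "a" => "а", "zh" => "ж", "ja" => "я" }

# Hash#invert swaps keys and values, giving Cyrillic => Latin.
LATIN_OF = ABC.invert

def transliterate!(file_name)
  # Match any single Cyrillic letter; the block's return value replaces it.
  # Letters missing from the table are left as they are.
  file_name.gsub!(/\p{Cyrillic}/) { |cyrillic| LATIN_OF.fetch(cyrillic, cyrillic) }
  # gsub! returns nil when nothing matched, so return the string explicitly.
  file_name
end

puts transliterate!("дядя.txt".dup)  # prints "djadja.txt"
```

One thing to watch with invert: if two Latin keys map to the same Cyrillic value, only one of them survives the inversion, so the source hash should be checked for duplicates first.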


Edit: As suggested in the comments, .gsub! can do a lot more for you here. Assuming latin_of is the complete hash with Cyrillic keys and Latin values, you could do this:

file_name.gsub!( Regexp.union(latin_of.keys), latin_of )

Two things to note:

  • Regexp.union(latin_of.keys) takes the array of keys you want to convert and builds a pattern that matches any of them, so gsub! finds them ready for replacement in the string.

  • gsub! accepts a hash as the second parameter, and converts each match by looking it up as a key and replacing it with the associated value - exactly the behaviour you are looking for.
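Putting the two points together, a self-contained sketch might look like this. The table is only a partial one with values taken from the question's hash; the names CYRILLIC_TO_LATIN, CYRILLIC_LETTER and to_latin are hypothetical, and it uses the non-destructive gsub so the input string is left alone.

```ruby
# encoding: utf-8

# Partial Cyrillic => Latin table, values taken from the question's hash.
CYRILLIC_TO_LATIN = {
  "а" => "a", "б" => "b", "в" => "v", "г" => "g", "д" => "d",
  "ё" => "jo", "ж" => "zh", "и" => "i", "л" => "l", "м" => "m",
  "н" => "n", "ч" => "ch", "ш" => "sh", "щ" => "sch", "я" => "ja"
}

# Regexp.union escapes each key and joins them into one alternation.
CYRILLIC_LETTER = Regexp.union(CYRILLIC_TO_LATIN.keys)

def to_latin(file_name)
  # Returns a new string; the hash supplies the replacement for each match.
  file_name.gsub(CYRILLIC_LETTER, CYRILLIC_TO_LATIN)
end

puts to_latin("машина.txt")  # prints "mashina.txt"
```

Because the pattern is built from the hash's own keys, every match is guaranteed to have a replacement. With a hand-written pattern that matched something not in the hash, gsub would replace it with an empty string.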

