Here is my normalization function. It finds accented characters (such as Ő, Ű, ú, í) and replaces them with plain ASCII letters (O, U, u, i). It works for me on Hungarian words. Try it on your own text and post a comment about how it goes.
<?php
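function normalize($string) {
    // Accented characters and their ASCII replacements, matched by position.
    $a = 'ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖŐØÙÚÛÜŰÝÞßàáâãäåæçèéêëìíîïðñòóôõöőøùúûűüýýþÿŔŕ';
    $b = 'AAAAAAACEEEEIIIIDNOOOOOOOUUUUUYbsaaaaaaaceeeeiiiidnooooooouuuuuyybyRr';
    // Split $a per UTF-8 character (not per byte) and pair it with $b, so no
    // utf8_decode()/utf8_encode() round trip is needed. That round trip turned
    // every character outside Latin-1 into "?" and is deprecated since PHP 8.2.
    $map = array_combine(preg_split('//u', $a, -1, PREG_SPLIT_NO_EMPTY), str_split($b));
    return strtr($string, $map);
}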
?>
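If you only need Hungarian, a smaller explicit map may be enough. This is just a sketch, assuming the input is UTF-8; the function name hu_to_ascii is my own invention for this example, not part of the function above. It relies on the array form of strtr(), which matches whole keys and so is safe for multibyte characters.

```php
<?php
// Hypothetical helper (name chosen for this example): strips the accents
// from Hungarian letters only. Assumes the input string is UTF-8.
function hu_to_ascii($string) {
    // The array form of strtr() replaces whole keys, so multibyte
    // characters are matched as units rather than byte by byte.
    $map = array(
        'á' => 'a', 'é' => 'e', 'í' => 'i', 'ó' => 'o', 'ö' => 'o',
        'ő' => 'o', 'ú' => 'u', 'ü' => 'u', 'ű' => 'u',
        'Á' => 'A', 'É' => 'E', 'Í' => 'I', 'Ó' => 'O', 'Ö' => 'O',
        'Ő' => 'O', 'Ú' => 'U', 'Ü' => 'U', 'Ű' => 'U',
    );
    return strtr($string, $map);
}

echo hu_to_ascii('Árvíztűrő tükörfúrógép'), "\n"; // Arvizturo tukorfurogep
```

Anything that is not in the map (digits, punctuation, letters from other languages) passes through unchanged.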