Filtering non-English characters in PHP (when converting ISO-8859-1 to UTF-8)

While building an RSS feed today I ran into trouble again, and once more it was special characters. XML really is picky.

The only way out was to force everything from ISO-8859-1 to UTF-8 and then replace the special characters.
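The conversion step itself is not shown in this post. A minimal sketch of one way to do it, assuming the iconv extension is available: the helper name to_utf8() is made up for illustration, and the "already valid UTF-8" check is a heuristic, not a guarantee.

```php
<?php
// Hypothetical helper (not from the original post): force a string to UTF-8,
// assuming anything that is not already valid UTF-8 is ISO-8859-1.
function to_utf8($text)
{
    // preg_match() with the /u modifier returns 1 only for valid UTF-8 subjects.
    if (preg_match('//u', $text) === 1) {
        return $text;
    }
    // iconv(from, to, string) converts between character encodings.
    return iconv('ISO-8859-1', 'UTF-8', $text);
}

echo bin2hex(to_utf8("caf\xE9")), "\n";     // ISO-8859-1 input, prints 636166c3a9
echo bin2hex(to_utf8("caf\xC3\xA9")), "\n"; // already UTF-8, prints 636166c3a9
```

Skipping strings that already validate as UTF-8 avoids double-encoding text that was converted earlier.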

function cleanname($thename) {
    // Remove every character from "{" (0x7B, just past the letter z) through U+00FF.
    // This will eliminate some non-English letters, along with { | } ~.
    // The /u modifier treats $thename as UTF-8, so convert the string first.
    $patterns = array('/[\x7b-\xff]/u');
    $replacement = '';

    return preg_replace($patterns, $replacement, $thename);
}
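One thing to watch with the /u modifier: preg_replace() returns null when the subject is not valid UTF-8, so an unconverted ISO-8859-1 string silently becomes empty. A rough sketch of a guarded variant follows. The name cleanname_safe() is hypothetical, and treating invalid input as ISO-8859-1 is an assumption that matches this post's situation only.

```php
<?php
// Hypothetical variant of cleanname() (not from the original post).
function cleanname_safe($thename)
{
    $pattern = '/[\x7b-\xff]/u';

    // With /u, preg_replace() returns null if $thename is not valid UTF-8.
    $result = preg_replace($pattern, '', $thename);
    if ($result === null) {
        // Assumption: the invalid input is ISO-8859-1, so convert and retry.
        $result = preg_replace($pattern, '', iconv('ISO-8859-1', 'UTF-8', $thename));
    }
    return $result;
}

echo cleanname_safe("caf\xC3\xA9 {menu}"), "\n"; // UTF-8 input, prints: caf menu
echo cleanname_safe("caf\xE9 {menu}"), "\n";     // ISO-8859-1 input, prints: caf menu
```

Note that the pattern stops at U+00FF, so characters such as smart quotes (U+2018 and above) pass through untouched and still need the replacement table.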

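Below is what some of the special characters that show up after converting ISO-8859-1 to UTF-8 actually mean, each paired with a plain ASCII replacement: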

// Fragment from inside a function that receives the feed text in $text.
// Save the file as UTF-8 so the literals below match the converted text.
$find[] = '“'; // left double smart quote
$find[] = '”'; // right double smart quote
$find[] = '‘'; // left single smart quote
$find[] = '’'; // right single smart quote
$find[] = '…'; // ellipsis
$find[] = '—'; // em dash
$find[] = '–'; // en dash

$replace[] = '"';
$replace[] = '"';
$replace[] = "'";
$replace[] = "'";
$replace[] = "...";
$replace[] = "-";
$replace[] = "-";

return str_replace($find, $replace, $text);
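The table above can also be written with explicit UTF-8 byte sequences, so it keeps working whatever encoding the PHP file is saved in. This is only a sketch: the function name replace_smart_punctuation() is invented here, and strtr() is swapped in for str_replace() because it takes the pairs as a single array.

```php
<?php
// Hypothetical helper (not from the original post).
function replace_smart_punctuation($text)
{
    // Keys are the UTF-8 byte sequences of each character.
    $map = array(
        "\xE2\x80\x9C" => '"',   // U+201C left double quote
        "\xE2\x80\x9D" => '"',   // U+201D right double quote
        "\xE2\x80\x98" => "'",   // U+2018 left single quote
        "\xE2\x80\x99" => "'",   // U+2019 right single quote
        "\xE2\x80\xA6" => '...', // U+2026 ellipsis
        "\xE2\x80\x94" => '-',   // U+2014 em dash
        "\xE2\x80\x93" => '-',   // U+2013 en dash
    );
    // strtr() with an array replaces each key with its value in one pass.
    return strtr($text, $map);
}

$title = "\xE2\x80\x9CHello\xE2\x80\x9D \xE2\x80\x94 it\xE2\x80\x99s done\xE2\x80\xA6";
echo replace_smart_punctuation($title), "\n"; // prints: "Hello" - it's done...
```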

Finally, here is a site for checking whether an RSS feed is valid: http://feedvalidator.org/

2017-06-07T13:59:42+00:00 | April 29th, 2011 | blog, Technology
