Inserting Hebrew text into MySQL using PHP (garbage text)


Problem description

I'm facing a weird problem with inserting Hebrew text into MySQL.

Basically, the problem is this:
I have a PHP script that picks up Hebrew text from a CSV file and sends it to a MySQL database. The character set of the database and of every table field is UTF8, with collation utf8_bin. But when I insert the text, random garbage values appear inside it, which makes it completely useless for output. NOTE: I can still see half of the words appear correctly.

Here is the homework I have done, which might help you understand:
1. As I mentioned, the table charset and collation are utf8.
2. I send header('Content-Type: text/html; charset=utf-8').
3. If I echo the text, it appears perfectly. When I convert it using utf8_encode it gets converted properly (e.g. שי יפת gets converted to ×©× ×פת).
4. When I use utf8_decode on the converted variable and echo it, it still displays perfectly.
5. I've used these after mysql_connect:

mysql_query("SET character_set_client = 'utf8';");
mysql_query("SET character_set_result = 'utf8';");
mysql_query("SET NAMES 'utf8'");
mysql_set_charset('utf8');

and even tried this:
mysql_query("SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'", $con)

  6. Added default_charset = "UTF-8" to my php.ini file.
  7. I don't know which encoding the CSV file was written in, but when I open it with Notepad++ the encoding is reported as UTF-8 without BOM.
  8. Here is a sample of the actual garbage:
    original text: שי יפת
    text after utf8_encode: ×©× ×פת
    text after utf8_decode in the same script: שי יפת (perfect)
    text sent to the MySQL database: ש×? ×?פת (notice the ? in between)
    text if we echo it from MySQL: ש�? �?פת (the output is close)
  9. Used addslashes and stripslashes before utf8_encode (also tried after it, with no luck).
  10. The server is on Windows running XAMPP 1.7.4:
    • Apache 2.2.17
    • MySQL 5.5.8 (Community Server)
    • PHP 5.3.5 (VC6 X86 32bit)
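
The utf8_encode sample in the list above can be reproduced with a small sketch. It assumes the text read from the CSV is already UTF-8 (as Notepad++ reports) and uses `mb_convert_encoding` as a stand-in for `utf8_encode`/`utf8_decode`, since both pairs convert between ISO-8859-1 and UTF-8. It only looks at the first letter of the name and is an illustration, not a diagnosis of the whole script.

```php
<?php
// UTF-8 bytes of the first letter of the sample name (U+05E9).
$utf8 = "\xD7\xA9";

// Same effect as utf8_encode(): every input byte is read as an
// ISO-8859-1 character and re-encoded as UTF-8.
$double = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

// Same effect as utf8_decode(): the reverse conversion.
$restored = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');

echo bin2hex($utf8), "\n";     // d7a9
echo bin2hex($double), "\n";   // c397c2a9
echo bin2hex($restored), "\n"; // d7a9
```

The bytes c3 97 and c2 a9 are the UTF-8 encodings of "×" and "©", which matches the start of the converted text shown in the list. The round trip back explains why the decoded value still echoes correctly inside the same script.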

EDIT 1: Just to clarify, I did search the site for similar questions and implemented the suggestions I found (SET NAMES utf8 and a lot of other options), but they didn't work. So please don't mark this question as a duplicate.
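
For reference, the connection-charset suggestion mentioned in EDIT 1 could look roughly like this with mysqli. This is a sketch only: it swaps in mysqli for the mysql_* functions used in the question (those were removed in PHP 7), `connectUtf8` is a hypothetical helper name, and the host, credentials and database name are placeholders.

```php
<?php
// Opens a connection and sets its character set to utf8 through the API.
// All four arguments are placeholders supplied by the caller (assumptions).
function connectUtf8(string $host, string $user, string $pass, string $db): mysqli
{
    $conn = new mysqli($host, $user, $pass, $db);
    if ($conn->connect_error) {
        throw new RuntimeException('Connect failed: ' . $conn->connect_error);
    }

    // set_charset() changes the connection charset and also tells the
    // client library about it, which a raw "SET NAMES" query does not.
    if (!$conn->set_charset('utf8')) {
        throw new RuntimeException('Could not set charset: ' . $conn->error);
    }

    return $conn;
}
```

A call such as `connectUtf8('localhost', 'root', '', 'csv')` would mirror the settings in the question's script.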

EDIT 2: Here is the full script:

    <?php
header('Content-Type: text/html; charset=utf-8'); 

if (isset($_GET['filename'])==true)
{
$databasehost = "localhost";
$databasename = "what_csv";


$databaseusername="root";
$databasepassword="";
$databasename= "csv";

$fieldseparator = "\n";
$lineseparator = "@contact\n";


$csvfile = $_GET['filename'];
/********************************/


if(!file_exists($csvfile)) {
    echo "File not found. Make sure you specified the correct path.\n";
    exit;
}

$file = fopen($csvfile,"r");

if(!$file) {
    echo "Error opening data file.\n";
    exit;
}

$size = filesize($csvfile);

if(!$size) {
    echo "File is empty.\n";
    exit;
}

$csvcontent = fread($file,$size);

fclose($file);

$con = @mysql_connect($databasehost,$databaseusername,$databasepassword) or die(mysql_error());

mysql_query( "SET NAMES utf8" );
mysql_set_charset('utf8',$con);
/*
mysql_query("SET character_set_client = 'utf8';"); 
mysql_query("SET character_set_result = 'utf8';");

mysql_query("SET NAMES 'utf8'");
mysql_set_charset('utf8');

mysql_query("SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'", $con);
*/

@mysql_select_db($databasename) or die(mysql_error());



$lines = 0;
$queries = "";
$linearray = array();

foreach(explode($lineseparator,$csvcontent) as $line) {

$Name="";
$Landline1="";
$Landline2="";
$Mobile="";
$Address="";
$Email="";
$IMEI="temp";
$got_imei=false;

//echo $line.'<br>';
    $lines++;

    $line = trim($line," \t");

    $line = str_replace("\r","",$line);

    $linearray = explode($fieldseparator,$line);
    //check for values to insert
    foreach($linearray as $field)
    {
    if (is_numeric($field)){ $got_imei=true;$IMEI=trim($field);}
    if (stristr($field, 'Name:')) {$Name=trim(str_replace("Name:", "", $field));}   
    if (stristr($field, 'Landline:')) {$Landline1=trim(str_replace("Landline:", "", $field));}  
    if (stristr($field, 'Landline2:')) {$Landline2=trim(str_replace("Landline2:", "", $field));}    
    if (stristr($field, 'Mobile:')) {$Mobile=trim(str_replace("Mobile:", "", $field));} 
    if (stristr($field, 'Address:')) {$Address=trim(str_replace("Address:", "", $field));}
    if (stristr($field, 'Email:')) {$Email=trim(str_replace("Email:", "", $field));}



    }
    if ($got_imei==true)
    {

    $query = "UPDATE $databasetable SET imei=$IMEI where imei='temp'";
        mysql_query($query);

    }



    else if (($Name=="") &&  ($Landline1=="" ) && ($Landline2=="")  && ($Mobile=="")  && ($Address=="")) {echo "";}
    else
    {
        //$Name = utf8_encode("$Name");
        //$Name = addslashes("$Name");
        $Name = utf8_encode(mysql_real_escape_string("$Name"));

        echo"$Name,$Landline1,$Landline2,$Address,$IMEI<br>";
        $query = "insert into $databasetable (imei, name, landline1, landline2, mobile, address, email) values('$IMEI','$Name', '$Landline1','$Landline2','$Mobile', '$Address', '$Email');";
        mysql_query($query);
        $Name = utf8_decode(($Name));   
        echo $Name."<br>";

    }
}
@mysql_close($con);



echo "Found a total of $lines records in this csv file.\n";

}
?>


<form>
Enter file name <input type="text" name="filename" /><br />
<input type="submit" value="Submit" /><br>
NOTE : File must be present in same directory as this script. Please include full filename, for example filename.csv.
</form>

Here is a sample of the CSV file:

@contact
Name: שי יפת
Mobile: 0547939898

@IMEI
355310042074173

EDIT 3:

If I enter the string directly via cmd, I get this warning:

Warning Code : 1366
Incorrect string value: '\xD7\xA9\xD7\x99 \xD7...' for column 'name' at row 1

Here is something I found on the net that could be related. Any help? http://bugs.mysql.com/bug.php?id=30131
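
As a quick check on the bytes quoted in the warning, the sketch below tests whether the first four of them form well-formed UTF-8 and prints the code points they decode to. It only inspects the byte sequence and says nothing about how the cmd client or the connection is configured.

```php
<?php
// First four bytes quoted in the warning, written as PHP escapes.
$bytes = "\xD7\xA9\xD7\x99";

// True when the byte sequence is well-formed UTF-8.
$isUtf8 = mb_check_encoding($bytes, 'UTF-8');

// Number of characters when the bytes are read as UTF-8.
$length = mb_strlen($bytes, 'UTF-8');

var_dump($isUtf8);               // bool(true)
var_dump($length);               // int(2)
echo json_encode($bytes), "\n";  // "\u05e9\u05d9"
```

U+05E9 and U+05D9 are the first two letters of the name in the sample CSV, so the value in the warning appears to be plain UTF-8 Hebrew at that point.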

Solution

Use TEXT or LONGTEXT instead of VARCHAR for the column, and set its collation to utf8_general_ci.

Hope this helps you, @Ajit.
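
A minimal sketch of the change described in this answer, written as a helper that builds the ALTER TABLE statement. The function name `buildTextColumnSql` and the table name `contacts` are hypothetical (the question never shows the table name); only the `name` column comes from the question's INSERT statement.

```php
<?php
// Builds the ALTER TABLE statement for the column change described above.
// $table and $column are supplied by the caller; nothing is executed here.
function buildTextColumnSql(string $table, string $column): string
{
    // Backtick-quote an identifier, doubling any embedded backtick.
    $quote = function (string $id): string {
        return '`' . str_replace('`', '``', $id) . '`';
    };

    return sprintf(
        'ALTER TABLE %s MODIFY %s TEXT CHARACTER SET utf8 COLLATE utf8_general_ci',
        $quote($table),
        $quote($column)
    );
}

// "contacts" is a placeholder table name.
echo buildTextColumnSql('contacts', 'name'), "\n";
```

The resulting statement can be run once from any MySQL client; swapping TEXT for LONGTEXT in the format string gives the other variant the answer mentions.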
