ISO-8859-1 Character truncates text inserting into utf-8 mysql column


Problem description


So I have a weird truncate issue! Can't find a specific answer on this.

So basically there's an issue with an apparent ISO character ½ that truncates the rest of the text upon insertion into a column with UTF-8 specified.

Let's say that my string is: "You need to add ½ cup of water." MySQL will truncate that to "You need to add".

If I:

print iconv("ISO-8859-1", "UTF-8//IGNORE", $text);

Then it outputs:

½

O_o

OK that doesn't work because I need the 1/2 by itself. If I go to phpMyAdmin and copy and paste the sentence in and submit it, it works like a charm as the whole string is in there with half symbol and remaining text! Something is wrong and I'm puzzled at what it is. I know this will probably affect other characters so the underlying problem needs to be addressed.

The language I'm using is PHP, the file itself is encoded as UTF-8, and the data I'm bringing in has its content type set to ISO-8859-1. The column is utf8_general_ci and all the MySQL character sets are set to UTF-8 in PHP: "SET character_set_results = 'utf8'", etc.

Solution

Something in your code isn't handling the string as UTF8. It could be your PHP/HTML, it could be in your connection to the DB, or it could be the DB itself - everything has to be set as UTF8 consistently, and if anything isn't, the string will get truncated exactly as you see when passing across a UTF8/non-UTF8 boundary.
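To see why a single character can cut the string short, here is a minimal sketch, assuming the incoming text really is ISO-8859-1 as the question describes. The byte 0xBD is ½ in ISO-8859-1 but is not a valid sequence on its own in UTF-8, so a UTF-8 column will typically drop everything from that byte onward. The helper name `to_utf8` is hypothetical, and its check-then-convert heuristic is an assumption for illustration, not part of the original answer.

```php
<?php
// Hypothetical helper: convert to UTF-8 only when the input is not already valid UTF-8.
// Assumption: anything that is not valid UTF-8 is in $from (true for the feed in the
// question, not true in general).
function to_utf8(string $text, string $from = 'ISO-8859-1'): string
{
    if (mb_check_encoding($text, 'UTF-8')) {
        return $text;
    }
    $converted = iconv($from, 'UTF-8', $text);
    if ($converted === false) {
        throw new RuntimeException("Could not convert text from $from to UTF-8");
    }
    return $converted;
}

// "\xBD" is the single ISO-8859-1 byte for the half sign.
$latin1 = "You need to add \xBD cup of water.";

// The raw bytes are not valid UTF-8, which is what trips up the insert.
var_dump(mb_check_encoding($latin1, 'UTF-8'));   // bool(false)

$utf8 = to_utf8($latin1);

// After conversion the half sign is the two-byte sequence C2 BD.
var_dump(mb_check_encoding($utf8, 'UTF-8'));     // bool(true)
echo bin2hex(substr($utf8, 16, 2)), "\n";        // c2bd
```

Passing already-valid UTF-8 through unchanged avoids double-encoding, at the cost of guessing wrong for input in some other single-byte encoding.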

I will assume your DB is UTF8 compliant - that is easiest to check. Note that the collation can be set at the server level, database level, the table level, and the column level within the table. Setting UTF8 collation on the column should override anything else for storage, but the others will still kick in when talking to the DB if they're not also UTF8. If you're not sure, explicitly set the connection to UTF8 after you open it:

$dbh->exec("SET NAMES 'utf8'");
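As a rough sketch, the character set can also be requested in the DSN at the moment the connection is opened, which PDO's MySQL driver honors from PHP 5.3.6 onward. The helper name `utf8_dsn` is hypothetical, and the host, database name, and credentials are placeholders.

```php
<?php
// Hypothetical helper: builds a MySQL DSN that asks for a UTF-8 connection.
function utf8_dsn(string $host, string $dbname, string $charset = 'utf8'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbname, $charset);
}

$dsn = utf8_dsn('127.0.0.1', 'utf8db');
echo $dsn, "\n";   // mysql:host=127.0.0.1;dbname=utf8db;charset=utf8

// Connecting needs a reachable MySQL server and the pdo_mysql extension,
// so it is left commented out here (placeholder credentials):
// $dbh = new PDO($dsn, 'root', 'password', [PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION]);
// print_r($dbh->query("SHOW VARIABLES LIKE 'character_set_%'")->fetchAll(PDO::FETCH_KEY_PAIR));
```

Listing the `character_set_%` variables on the live connection is a quick way to confirm that client, connection, and results all report the same character set.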

Now that your DB & connection are UTF8, make sure your web page is too. Again, this can be set in more than one place (.htaccess, php.ini). If you're not sure / don't have access, just override whatever PHP is picking up as default at the top of your page:

<?php ini_set('default_charset', 'UTF-8'); ?>

Note that you want the above right at the start, before any text is output from your page. Once text gets output, it is potentially too late to try and specify an encoding - you may already be locked into whatever is default on your server. I also then repeat this in my headers (possibly overkill):

<head>
<meta charset="UTF-8">
<meta http-equiv="Content-type" content="text/html; charset=UTF-8">
</head>

And I override it on forms where I'm taking data as well:

<FORM NAME="utf8-test" METHOD="POST" ACTION="utf8-test.php" enctype="multipart/form-data" accept-charset="UTF-8">

To be honest, if you've set the encoding at the top, my understanding is that the other overrides aren't required - but I keep them anyway, because it doesn't break anything either, and I'd rather just state the encoding explicitly, than let the server make assumptions.

Finally, you mentioned that in phpMyAdmin you inserted the string and it looked as expected - are you sure though that the phpMyAdmin pages are UTF8? I don't think they are. When I store UTF8 data from my PHP code, it views like raw 8-bit characters in phpMyAdmin. If I take the same string and store it directly in phpMyAdmin, it looks 'correct'. So I'm guessing phpMyAdmin is using the default character set of my local server, not necessarily UTF8.

For example, the following string stored from my web page:

I can’t wait

Reads like this in my phpMyAdmin:

I can’t wait

So be careful when testing that way, as you don't really know what encoding phpMyAdmin is using for display or DB connection.
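The garbled display can be reproduced without a database. This is a sketch under the assumption that the viewing tool decodes the stored UTF-8 bytes as Windows-1252 (or ISO-8859-1 for the half sign); which encoding phpMyAdmin actually uses depends on its configuration.

```php
<?php
// The right single quotation mark (U+2019) as correctly encoded UTF-8: bytes E2 80 99.
$stored = "I can\u{2019}t wait";

// Assumption: the viewer reads those bytes as Windows-1252 instead of UTF-8.
// Re-encoding under that wrong assumption reproduces the garbled display.
$misread = iconv('CP1252', 'UTF-8', $stored);
echo $misread, "\n";   // I canâ€™t wait

// Same effect for the half sign from the question, misread as ISO-8859-1.
$half = iconv('ISO-8859-1', 'UTF-8', "\u{00BD}");
echo $half, "\n";      // Â½

// The stored bytes were fine all along; only the interpretation differed.
echo bin2hex("\u{2019}"), "\n";   // e28099
```

Comparing the raw bytes (for example with MySQL's HEX() function on the column) is a more reliable check than trusting how any one tool renders them.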

If you're still having issues, try my code below. First I create a table to store the text in UTF8:

CREATE TABLE IF NOT EXISTS `utf8_test` (
  `id` int(11) NOT NULL AUTO_INCREMENT,
  `my_text` varchar(8000) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=MyISAM  DEFAULT CHARSET=utf8 AUTO_INCREMENT=1 ;

And here's some PHP to test it. It basically takes your input on a form, echoes that input back at you, and stores/retrieves the text from the DB. Like I said, if you view the data directly in phpMyAdmin, you might find it doesn't look right there, but through the page below it should always appear as expected, due to the page & db connection both being locked to UTF8.

<?php
  // Override whatever is set in php.ini
  ini_set('default_charset', 'UTF-8');

  // The following should not be required with the above override
  //header('Content-Type:text/html; charset=UTF-8');

  // Open the database
  $dbh = new PDO('mysql:dbname=utf8db;host=127.0.0.1;charset=utf8', 'root', 'password');

  // Set the connection to UTF8 (MYSQL_ATTR_INIT_COMMAND only works as a
  // constructor option, so run the statement directly)
  $dbh->exec("SET NAMES 'utf8'");
  // Tell MySql to do the parameter replacement, not PDO
  $dbh->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
  // Throw exceptions (and break the code) if a query is bad
  $dbh->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

  $id = 0;
  if (isset($_POST["StoreText"]))
  {
    $stmt = $dbh->prepare('INSERT INTO utf8_test (my_text) VALUES (:my_text)');
    $stmt->execute(array(':my_text' => $_POST['my_text']));
    $id = $dbh->lastInsertId();
  }
?>

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta charset="UTF-8">
<meta http-equiv="Content-type" content="text/html; charset=UTF-8">

<title>UTF-8 Test</title>
</head>

<body>

<?php
  // If something was posted, output it
  if (isset($_POST['my_text']))
  {
    echo "POSTED<br>\n";
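    // Escape the user's text so any markup in it is displayed, not interpreted
    echo htmlspecialchars($_POST['my_text'], ENT_QUOTES, 'UTF-8') . "<br>\n";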
  }

  // If something was written to the database, read it back, and output it
  if ($id > 0)
  {
    $stmt = $dbh->prepare('SELECT my_text FROM utf8_test WHERE id = :id');
    $stmt->execute(array(':id' => $id));
    if ($result = $stmt->fetch())
    {
      echo "STORED<br>\n";
      // Escape the stored text before writing it into the page
      echo htmlspecialchars($result['my_text'], ENT_QUOTES, 'UTF-8') . "<br>\n";
    }
  }

  // Create a form to take some user input
  echo "<FORM NAME=\"utf8-test\" METHOD=\"POST\" ACTION=\"utf8-test.php\" enctype=\"multipart/form-data\" accept-charset=\"UTF-8\">";

  echo "<br>";

  echo "<textarea name=\"my_text\" rows=\"20\" cols=\"90\">";

  // If something was posted, include it on the form
  if (isset($_POST['my_text']))
  {
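    // Escape so that text containing "</textarea>" cannot break the form
    echo htmlspecialchars($_POST['my_text'], ENT_QUOTES, 'UTF-8');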
  }

  echo "</textarea>";

  echo "<br>";
  echo "<INPUT TYPE = \"Submit\" Name = \"StoreText\" VALUE=\"Store It\" />";

  echo "</FORM>";
?>
<br>

</body>

</html>
