Remove four byte UTF-8 characters in classic ASP/VBScript (MySQL related)

Question

I've spent about 18 hours trying different things and searching around. I finally give up and have to ask you guys.

Backstory: I am finally migrating an old MS Access database to MySQL (version 5.6.16-log).

Problem: Some Unicode text in the Access database contains characters that take four bytes in UTF-8.

MySQL still has a problem with inserting four-byte UTF-8 characters. This problem is getting old and I was surprised to discover it's not fixed yet: http://bugs.mysql.com/bug.php?id=67297

I'm using the "MySQL ODBC 5.3 Unicode Driver" (the latest beta development release) to transfer data between the databases. No matter what I try, the process freezes when I insert a string containing four-byte UTF-8 characters (the thread uses 100% CPU forever). I have tried all the workarounds suggested everywhere on the Internet, and nothing works.

For now I will just accept the limitations of MySQL: I can't store all Unicode characters.

So I want to remove all four-byte UTF-8 characters from the text before I insert it into the database. But I can't for the life of me find a way to do it in classic ASP.

Can anyone help?

(I can't not use ASP btw, there is way too much code to rewrite it in a different language. Just changing databases is a remarkable feat; there are several of them and it will take days to complete.)

A solution in JScript is also acceptable, since it can be run from ASP pages.

Recommended answer

This should work:

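Function UTF8Filter(strString)
    ' Note: this keeps printable ASCII only, so it drops every non-ASCII
    ' character, not just the ones that need four bytes in UTF-8.
    Dim i, ch, charCode, strResult
    strResult = ""
    For i = 1 To Len(strString)
        ch = Mid(strString, i, 1)
        ' AscW returns a signed 16-bit value; code units above &H7FFF come back negative
        charCode = AscW(ch)
        If charCode >= 32 And charCode <= 127 Then   ' here was OR; 32 (space) is kept
            ' Append valid character to the result instead of overwriting the input
            strResult = strResult & ch
        End If
    Next

    UTF8Filter = strResult
End Function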

Updated function:

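Function Remove4ByteUFT8(strString)
    Dim objRegEx
    Set objRegEx = CreateObject("VBScript.RegExp")
    objRegEx.Global = True
    ' VBScript strings are UTF-16, so a character that needs four bytes in UTF-8
    ' is stored as a surrogate pair. Match that pair rather than UTF-8 lead bytes,
    ' and do not wrap the pattern in /.../ delimiters (VBScript.RegExp has none).
    objRegEx.Pattern = "[\uD800-\uDBFF][\uDC00-\uDFFF]"

    Remove4ByteUFT8 = objRegEx.Replace(strString, "")
End Function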
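
Since the question says a JScript solution is also acceptable, here is a rough sketch of how the filtering could look there. It assumes the text is already held as an ordinary UTF-16 script string, which is how classic ASP strings are stored. In that form, any character that would need four bytes in UTF-8 (a code point above U+FFFF) shows up as a high surrogate followed by a low surrogate. The function name `remove4ByteChars` is made up for illustration and is not part of the original answer.

```javascript
// Hypothetical helper: strips every character that would take four bytes in UTF-8.
// In a UTF-16 string such a character is a high surrogate (U+D800..U+DBFF)
// immediately followed by a low surrogate (U+DC00..U+DFFF).
function remove4ByteChars(text) {
    var surrogatePair = /[\uD800-\uDBFF][\uDC00-\uDFFF]/g;
    // String() guards against non-string input such as null or numbers.
    return String(text).replace(surrogatePair, "");
}

// Example: U+1F600 is written here as its surrogate pair \uD83D\uDE00.
var cleaned = remove4ByteChars("before \uD83D\uDE00 after");
```

Characters that fit in one to three UTF-8 bytes, such as accented letters or the euro sign, are left untouched, so only the text MySQL refuses is removed.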
