Extract text content from cell (with bold, italic, etc.)

Question

I'm trying to extract text content from Excel using a macro. This is my code:

Dim i As Integer, j As Integer
Dim v1 As Variant
Dim Txt As String

v1 = Range("A2:C15")
For i = 1 To UBound(v1)
    For j = 1 To UBound(v1, 2)
        Txt = Txt & v1(i, j)
    Next j
    Txt = Txt & vbCrLf
Next i

MsgBox Txt

But it shows only the raw characters, meaning it carries no formatting information such as bold, italic, or underline.

I want to extract the text along with the formatting information.

Example: This is sample text

Expected output: This is sample text

Actual output: the text only

Can someone explain what's wrong with the code?

Recommended answer

OK, let's make the algorithm from @stucharo a little simpler to extend.

Public Function getHTMLFormattedString(r As Range) As String

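 Dim isBold As Boolean, isItalic As Boolean, isUnderlined As Boolean
 Dim s As String
 Dim cCount As Long, i As Long
 Dim c As Characters

 ' Track which tags are currently open while walking the characters.
 isBold = False
 isItalic = False
 isUnderlined = False
 s = ""
 ' If Characters.Count raises an error, cCount stays 0 and the
 ' cell is handled as a whole in the Else branch below.
 cCount = 0
 On Error Resume Next
 cCount = r.Characters.Count
 On Error GoTo 0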

 If cCount > 0 Then

  For i = 1 To cCount

   Set c = r.Characters(i, 1)

   If isUnderlined And c.Font.Underline = xlUnderlineStyleNone Then
    isUnderlined = False
    s = s & "</u>"
   End If

   If isItalic And Not c.Font.Italic Then
    isItalic = False
    s = s & "</i>"
   End If

   If isBold And Not c.Font.Bold Then
    isBold = False
    s = s & "</b>"
   End If


   ' Open tags in the reverse order of the closing tags above.
   If c.Font.Bold And Not isBold Then
    isBold = True
    s = s & "<b>"
   End If

   If c.Font.Italic And Not isItalic Then
    isItalic = True
    s = s & "<i>"
   End If

   If Not (c.Font.Underline = xlUnderlineStyleNone) And Not isUnderlined Then
    isUnderlined = True
    s = s & "<u>"
   End If

   s = s & c.Text

   If i = cCount Then
    If isUnderlined Then s = s & "</u>"
    If isItalic Then s = s & "</i>"
    If isBold Then s = s & "</b>"
   End If

  Next i

 Else
  s = r.Text
  If r.Font.Bold Then s = "<b>" & s & "</b>"
  If r.Font.Italic Then s = "<i>" & s & "</i>"
  If Not (r.Font.Underline = xlUnderlineStyleNone) Then s = "<u>" & s & "</u>"
 End If

 getHTMLFormattedString = s
End Function
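
One thing getHTMLFormattedString does not handle is cell text that itself contains characters such as "<" or "&", which would be indistinguishable from the generated tags if the result is later treated as HTML. A minimal sketch of an escaping helper follows; the name EscapeHtml is hypothetical and not part of the original answer.

```vba
' Hypothetical helper: escapes the characters that are significant in HTML
' so that cell text cannot be mistaken for the generated <b>/<i>/<u> tags.
Public Function EscapeHtml(ByVal rawText As String) As String
    ' Replace "&" first so the entities added below are not escaped again.
    rawText = Replace(rawText, "&", "&amp;")
    rawText = Replace(rawText, "<", "&lt;")
    rawText = Replace(rawText, ">", "&gt;")
    EscapeHtml = rawText
End Function
```

If this is wanted, the lines that append cell text (s = s & c.Text and s = r.Text) could pass the text through EscapeHtml before the tags are added.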

To be clear, getHTMLFormattedString works only with a range containing a single cell. But it should be easy to call it for each cell in a bigger range and concatenate the returned strings into one.
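
As a quick single-cell check, the sketch below formats part of a cell in bold and prints what the function returns. It assumes getHTMLFormattedString is in the same module, that cell A1 of the active sheet can be overwritten, and that the default cell font is neither bold, italic, nor underlined; the Sub name is hypothetical.

```vba
Sub DemoSingleCell()
    Dim cell As Range

    ' Assumption: A1 on the active sheet is free to use as scratch space.
    Set cell = ActiveSheet.Range("A1")
    cell.Clear
    cell.Value = "This is sample text"

    ' Characters 9 to 14 spell "sample"; make only that word bold.
    cell.Characters(9, 6).Font.Bold = True

    ' Under the assumptions above this prints: This is <b>sample</b> text
    Debug.Print getHTMLFormattedString(cell)
End Sub
```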

Edit by OP:

I called the function with the code below:

Sub ReplaceFormattingTags()

Dim i As Integer, j As Integer
Dim rng As Range
Dim Txt As String

Set rng = Range("A2:C15")
For i = 1 To rng.Rows.Count
    For j = 1 To rng.Columns.Count
        Txt = Txt & getHTMLFormattedString(rng(i, j)) & " "
    Next j
    Txt = Txt & vbCrLf
Next i

Debug.Print Txt

End Sub
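
For comparison with the original attempt: reading a range into a Variant, as in v1 = Range("A2:C15"), yields an array of cell values only, whereas the loop above passes Range objects, whose Font and Characters members expose the formatting. A minimal sketch of the difference, assuming cell A2 of the active sheet holds some text (the Sub name is hypothetical):

```vba
Sub CompareValueAndFormatting()
    Dim cell As Range
    Dim cellValues As Variant

    Set cell = ActiveSheet.Range("A2")

    ' A Variant read from a multi-cell range is a 2-D array of plain values;
    ' no font information travels with it.
    cellValues = ActiveSheet.Range("A2:C15").Value
    Debug.Print TypeName(cellValues(1, 1))

    ' Formatting has to be read from the Range itself ...
    Debug.Print cell.Font.Bold              ' True, False, or Null if mixed

    ' ... or from a Characters sub-range for per-character formatting.
    If Len(cell.Text) > 0 Then
        Debug.Print cell.Characters(1, 1).Font.Bold
    End If
End Sub
```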
