What is the best practice to parse HTML in Swift?

Views: 44 | Published: 2021/12/2 16:03:44 | Tags: html swift parsing


Question

I'm a Swift newbie. I need something like Python's BeautifulSoup in a Swift iOS project. Specifically, I need to get the href of every <a> whose href ends with ".txt". What steps should I take?

Solution

There are several good HTML parsing libraries for Swift and Objective-C, such as the following:


hpple
NDHpple
Kanna (formerly Swift-HTML-Parser)
Fuzi
SwiftSoup
Ji


Take a look at the following examples from the libraries listed above, most of which use an XPath 2.0 query:

  hpple:


// Load the HTML file and hand it to hpple
let data = NSData(contentsOfFile: path)
let doc = TFHpple(htmlData: data)

// Select the href attribute of every <a> whose value ends with ".txt"
if let elements = doc.searchWithXPathQuery("//a/@href[ends-with(.,'.txt')]") as? [TFHppleElement] {
   for element in elements {
       print(element.content)
   }
}



  NDHpple:


// Load the HTML file and decode it as UTF-8 before parsing
let data = NSData(contentsOfFile: path)!
let html = NSString(data: data, encoding: NSUTF8StringEncoding)!
let doc = NDHpple(HTMLData: html)

// Select the href attribute of every <a> whose value ends with ".txt"
if let elements = doc.searchWithXPathQuery("//a/@href[ends-with(.,'.txt')]") {
   for element in elements {
     print(element.children?.first?.content)
   }
}



  Kanna (XPath and CSS Selectors):


let html = "<html><head></head><body><ul><li><input type='image' name='input1' value='string1value' class='abc' /></li><li><input type='image' name='input2' value='string2value' class='def' /></li></ul><span class='spantext'><b>Hello World 1</b></span><span class='spantext'><b>Hello World 2</b></span><a href='example.com'>example(English)</a><a href='example.co.jp'>example(JP)</a></body>"

if let doc = Kanna.HTML(html: html, encoding: NSUTF8StringEncoding) {
   let bodyNode = doc.body

   // Select the href attribute of every <a> whose value ends with ".txt"
   if let inputNodes = bodyNode?.xpath("//a/@href[ends-with(.,'.txt')]") {
      for node in inputNodes {
         print(node.contents)
      }
   }
}



  Fuzi (XPath and CSS Selectors):


let html = "<html><head></head><body><ul><li><input type='image' name='input1' value='string1value' class='abc' /></li><li><input type='image' name='input2' value='string2value' class='def' /></li></ul><span class='spantext'><b>Hello World 1</b></span><span class='spantext'><b>Hello World 2</b></span><a href='example.com'>example(English)</a><a href='example.co.jp'>example(JP)</a></body>"

do {
  // if encoding is omitted, it defaults to NSUTF8StringEncoding
  let doc = try HTMLDocument(string: html, encoding: NSUTF8StringEncoding)

  // XPath queries
  for anchor in doc.xpath("//a/@href[ends-with(.,'.txt')]") {
    print(anchor.stringValue)
  }

} catch let error {
    print(error)
}

The ends-with function is part of XPath 2.0; an engine that implements only XPath 1.0 (libxml2, for example) will reject the query.
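
Where the XPath engine is limited to XPath 1.0, the ends-with test can usually be rewritten with substring and string-length. The sketch below only builds the query string; `xpath1EndsWith` is a hypothetical helper name, not part of any of the libraries above, and it assumes the suffix contains no single quote.

```swift
import Foundation

/// Builds an XPath 1.0 predicate that holds when the context node's string
/// value ends with `suffix` (a stand-in for the XPath 2.0 `ends-with`).
/// Hypothetical helper, not provided by hpple, Kanna, Fuzi, or Ji.
func xpath1EndsWith(_ suffix: String) -> String {
    // XPath counts characters as Unicode code points and positions are
    // 1-based, so the suffix starts at string-length(.) - length + 1.
    let offset = suffix.unicodeScalars.count - 1
    // Assumption: `suffix` has no single quote, which would need escaping.
    return "substring(., string-length(.) - \(offset)) = '\(suffix)'"
}

// Query for the href attribute of every <a> whose value ends with ".txt"
let query = "//a/@href[\(xpath1EndsWith(".txt"))]"
print(query)
```

The resulting string can be passed wherever the examples above pass the ends-with query.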

  SwiftSoup (CSS Selectors):


do {
    let doc: Document = try SwiftSoup.parse("...")
    let links: Elements = try doc.select("a[href]") // a with href
    let pngs: Elements = try doc.select("img[src$=.png]") // img with src ending .png
    let masthead: Element? = try doc.select("div.masthead").first() // div with class=masthead
    let resultLinks: Elements? = try doc.select("h3.r > a") // direct a after h3
} catch Exception.Error(let type, let message) {
    print(message)
} catch {
    print("error")
}



  Ji (XPath):


let jiDoc = Ji(htmlURL: URL(string: "http://www.apple.com/support")!)
let titleNode = jiDoc?.xPath("//head/title")?.first
print("title: \(titleNode?.content)") // title: Optional("Official Apple Support")
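
Whichever library is used, a library-independent fallback is to collect every href (for example with a plain `//a/@href` query) and do the suffix check in Swift. The following is a minimal sketch under that assumption; `txtLinks` is a hypothetical helper, and matching case-insensitively while ignoring query strings is a design choice that goes slightly beyond the original question.

```swift
import Foundation

/// Returns the hrefs whose URL path ends with ".txt".
/// Hypothetical helper; `hrefs` is assumed to hold the raw attribute values
/// already extracted by one of the parsing libraries.
func txtLinks(in hrefs: [String]) -> [String] {
    return hrefs.filter { href in
        // URLComponents separates the path from "?query" and "#fragment";
        // fall back to the raw string if the href cannot be parsed.
        let path = URLComponents(string: href)?.path ?? href
        // Lowercasing makes ".TXT" match as well (an assumption, see above).
        return path.lowercased().hasSuffix(".txt")
    }
}

let hrefs = ["notes.txt", "/files/readme.TXT?download=1", "index.html", "txt"]
print(txtLinks(in: hrefs))
```

Filtering after extraction keeps the query trivial and avoids depending on which XPath or CSS features a given parser supports.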
I hope this helps you.
