R - Write an HTML file from URL/HTML Object/HTML Response

Problem description

I want to save an HTML file from a URL in R. I tried saving the objects returned by GET (from httr) and read_html (from rvest) for the URL of the website whose HTML I want to keep, but that did not save the actual contents of the website.

url = "https://facebook.com"
get_object = httr::GET(url); save(get_object, "file.html")
html_object = rvest::read_html(url); save(html_object, "file.html")

Neither of these saves the correct output, i.e. the HTML content of the webpage, to the .html file.

Recommended answer

Use str(object) to find out what you are working with. In both cases, you were trying to write a non-text object to a text file.
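As a minimal sketch of that inspection step, the snippet below parses a small in-memory page and checks what kind of object comes back. The inline HTML string is an assumption standing in for the real URL so the example runs without network access; the same checks apply to the object returned by read_html(url).

```r
library(xml2)

# Parse a small in-memory page (a stand-in for the real URL).
html_object <- read_html("<html><body><p>Hello</p></body></html>")

# The result is a parsed document object, not a character string.
str(html_object)
class(html_object)

# as.character() serializes the parsed tree back into HTML text,
# which is the form that can be written to a text file.
html_text <- as.character(html_object)
is.character(html_text)
```

The point of the check is that save() stores the R object itself in R's own serialized format, whereas an .html file needs the serialized text of the page.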

Here is how to get the text and write it out with each of your two libraries:

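url = "https://facebook.com"

# httr: GET() returns a response object; extract the body as text, then write it.
library(httr)
get_object = GET(url)
cat(content(get_object, as = "text", encoding = "UTF-8"), file = "temp.html")

# rvest: read_html() returns a parsed document; serialize it with xml2.
# write_html() lives in xml2, which recent rvest versions no longer attach,
# so it is called with an explicit namespace.
library(rvest)
html_object = read_html(url)
xml2::write_html(html_object, file = "temp.html")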
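
If the same steps are needed for several pages, the httr variant could be wrapped in a small function. This is only a sketch: save_page and its arguments are hypothetical names introduced here, and the status check with stop_for_status() is an addition that the original answer does not include.

```r
library(httr)

# Hypothetical helper: fetch `url` and write the response body to `path` as text.
save_page <- function(url, path) {
  response <- GET(url)
  # Raise an R error on HTTP 4xx/5xx instead of silently saving an error page.
  stop_for_status(response)
  # Extract the body as a character string, decoding it as UTF-8.
  html_text <- content(response, as = "text", encoding = "UTF-8")
  # Write the text as-is; useBytes = TRUE avoids re-encoding on output.
  writeLines(html_text, path, useBytes = TRUE)
  invisible(path)
}
```

A call such as save_page("https://facebook.com", "temp.html") would then write the fetched HTML to temp.html, assuming the request succeeds.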
