Convert Firefox bookmarks JSON file to markdown


Problem description

I want to show part of my bookmarks on my Hugo website. Firefox can save bookmarks in JSON format, and that file is my source. The result should represent the nested structure in some way: as a nested list, a treeview or an accordion. The content source files of the website are written in markdown, so I want to generate a markdown file from the JSON input.

Possible solutions I found while searching:

  • treeview or accordion: needs HTML, CSS and JavaScript. I could not nest accordions with the <details> tag. It also seems like overkill at the moment.
  • unordered list: can be done with plain markdown.

I chose to generate an unordered nested list from the JSON, and I would like to do this with R.

Example input: https://gist.github.com/hermanp/c01365b8f4931ea7ff9d1aee1cbbc391

Preferred output (indentation with two spaces):

- Info
  - Python
    - [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
    - [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
    - [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
    - [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
  - Frontend
    - [CodePen](https://codepen.io/)
    - [JavaScript](https://www.javascript.com/)
    - [CSS-Tricks](https://css-tricks.com/)
    - [Butterick’s Practical Typography](https://practicaltypography.com/)
    - [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
    - [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
    - [Client-Side Web Development](https://info340.github.io/)
  - [Stack Overflow](https://stackoverflow.com/)
  - [HUP](https://hup.hu/)
  - [Hope in Source](https://hopeinsource.com/)

Bonus preferred output: show favicons before the links, like below (other suggestions are welcome, such as loading the icons from the website's server instead of linking to them):

  - ![](https://cdn.sstatic.net/Sites/stackoverflow/Img/apple-touch-icon.png?v=c78bd457575a) [Stack Overflow](https://stackoverflow.com/)
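
The bonus format could be produced by a small helper like the one sketched below. This is only an illustration and not part of the original post: the name `md_link_item` is hypothetical, and it assumes the standard Markdown image syntax `![alt](src)` placed directly before the link. When no icon URL is available it falls back to a plain link.

```r
# Hypothetical helper (not from the original post): build one Markdown
# list item, optionally prefixed with the bookmark's favicon.
md_link_item <- function(title, uri, iconuri = NA_character_) {
  link <- paste0("[", title, "](", uri, ")")
  # Skip the image when the icon URL is missing, NA or empty.
  has_icon <- length(iconuri) == 1 && !is.na(iconuri) && nzchar(iconuri)
  if (has_icon) {
    link <- paste0("![](", iconuri, ") ", link)
  }
  paste0("- ", link, "\n")
}

cat(md_link_item("Stack Overflow", "https://stackoverflow.com/",
                 "https://cdn.sstatic.net/Sites/stackoverflow/Img/apple-touch-icon.png"))
```

The image is emitted at whatever size the icon file has; scaling it down would need CSS or a Hugo shortcode, which this sketch does not cover.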

Attempt

generate_md <- function (file) {
  # Encoding problem with tidyjson::read_json
  bmarks_json_lite <- jsonlite::fromJSON(
    txt = paste0("https://gist.githubusercontent.com/hermanp/",
                 "c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
                 "33c21c88dad35145e2792b6258ede9c882c580ec/",
                 "bookmarks-example.json"))
  
  # This is the start point, a data frame
  level1 <- bmarks_json_lite$children$children[[2]]
  
  # Get the name of the variable to modify it.
  # Just felt that some abstraction needed.
  varname <- deparse(substitute(level1))
  varlevel <- as.integer(substr(varname, nchar(varname), nchar(varname)))
  
  # Get through the data frame by its rows.
  for (i in seq_len(nrow(get(varname)))) {
    
    # If the type of the element in the row is "text/x-moz-place",
    # then get its title and create a markdown list element from it.
    if (get(varname)["type"][i] == "text/x-moz-place"){
      
      # The two space indentation shall be multiplied as many times
      # as deeply nested in the lists (minus one).
      md_title <- paste0(strrep("  ", varlevel - 1),
                         "- ",
                         get(varname)["title"][i],
                         "\n")
    
      # Otherwise do this and also get inside the next level.
    } else if (get(varname)["type"][i] == "text/x-moz-place-container") {
      md_title <- paste0(strrep("  ", varlevel - 1),
                         "- ",
                         get(varname)["title"][i],
                         "\n")
      
      # I know this is not good, just want to express my thought.
      # Create the next, deeper level's variable, whose name shall
      # represent the depth in the nest.
      # Otherwise how can I multiply the indentation for the markdown
      # list elements? It depends on the name of this variable.
      varname <- paste0(regmatches(varname, regexpr("[[:alpha:]]+", varname)),
                        varlevel + 1L)
      varlevel <- varlevel + 1L
      assign(varname, get(varname)["children"][[i]])
      
      # The same goes on as seen at the higher level.
      for (j in seq_len(nrow(get(varname)))){
        if (get(varname)["type"][i] == "text/x-moz-place"){
          md_title <- paste0(strrep("  ", varlevel - 1),
                             "- ",
                             get(varname)["title"][i],
                             "\n")
        } else if (get(varname)["type"][i] == "text/x-moz-place-container") {
          md_title <- paste0(strrep("  ", varlevel - 1),
                             "- ",
                             get(varname)["title"][i],
                             "\n")
          
          varname <- paste0(regmatches(varname, regexpr("[[:alpha:]]+", varname)),
                            varlevel + 1L)
          varlevel <- varlevel + 1L
          assign(varname, get(varname)["children"][[i]])
          
          for (k in seq_len(nrow(get(varname)))){
            # I don't know where this goes...
            # Also I need to paste somewhere the md_title strings to get the 
            # final markdown output...
          }
        }
      }
    }
  }
}

Question

How can I recursively extract and paste strings from this JSON file? I searched for tips on recursion, but it is quite a hard topic. Any suggestions, packages, functions or links are welcome!

Recommended answer

After watching a few videos on recursion and looking at a few code examples, I stepped through the code manually and managed to solve it with recursion. The solution does not depend on how deeply the bookmarks are nested, so it should work as a general solution for other bookmark files too.

Note: all the bookmarks were in the Bookmarks Toolbar in Firefox. This is hard-coded in the generate_md function (see the comment there), so that is the place to adjust it. If I improve the answer later, I will make it more general.

library(jsonlite)

# This function recursively converts the bookmark titles to unordered
# list items.
recursive_func <- function (level) {
  md_result <- character()
  
  # Iterate through the current data frame, which may have a children
  # column nested with other data frames.
  for (i in seq_len(nrow(level))) {
    # If this element is a bookmark and not a folder, then grab
    # the title and construct a list item from it.
    if (level[i, "type"] == "text/x-moz-place"){
      md_title <- level[i, "title"]
      md_uri <- level[i, "uri"]
      md_iconuri <- level[i, "iconuri"]
      # Assumption: every URL has a scheme (http or https) part.
      # If not, host_url will be a zero-length character vector.
      host_url <- regmatches(x = md_uri,
                             m = regexpr(pattern = "(?<=://)[[:alnum:].-]+",
                                         text = md_uri,
                                         perl = TRUE))
      
      md_link <- paste0("[", md_title, "]", "(", md_uri, ")")
      md_listitem <- paste0("- ", md_link, "\n")
      
      # If this element is a folder, then descend into it by calling
      # this function on its children. Insert two spaces (for
      # indentation) in the generated string before every list item
      # marker, then paste the result after the folder's list item.
    } else if (level[i, "type"] == "text/x-moz-place-container") {
      md_title <- level[i, "title"]
      md_listitem <- paste0("- ", md_title, "\n")
      md_recurs <- recursive_func(level = level[i, "children"][[1]])
      md_recurs <- gsub("(?<!(\\w ))-(?= )", "  -", md_recurs, perl = TRUE)
      md_listitem <- paste0(md_listitem, md_recurs)
    }
    
    # Collect and paste the list items of the current data frame.
    md_result <- paste0(md_result, md_listitem)
  }
  
  # Return the (sub)list of the data frame.
  return(md_result)
}

generate_md <- function (jsonfile) {
  # Encoding problem with tidyjson::read_json
  bmarks_json_lite <- fromJSON(txt = jsonfile)
  
  # This is the start point, a data frame. It represents the
  # elements inside the Bookmarks Toolbar in Firefox.
  level1 <- bmarks_json_lite$children$children[[2]]
  
  # Do not know how to make it prettier, but it works.
  markdown_result <- recursive_func(level = level1)
  
  return(markdown_result)
}
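
The hard-coded `[[2]]` in generate_md assumes that the Bookmarks Toolbar is the second top-level folder. One possible way to make this less fragile is to look the folder up by label instead of by position. The sketch below assumes (this is not confirmed in the post) that each top-level folder in the Firefox export carries a `root` field such as "toolbarFolder". `find_root_children` is a hypothetical helper, and the small hand-built data frame only imitates the structure that fromJSON returns.

```r
# Hypothetical helper: return the children of a top-level folder by its
# `root` label instead of by position. The "toolbarFolder" label is an
# assumption about the Firefox export format.
find_root_children <- function(bmarks, root = "toolbarFolder") {
  top <- bmarks$children
  idx <- which(top$root == root)
  if (length(idx) != 1) {
    stop("Expected exactly one top-level folder with root = '", root, "'")
  }
  top$children[[idx]]
}

# Hand-built stand-in for the parsed JSON (two top-level folders).
toolbar <- data.frame(title = c("Stack Overflow", "HUP"),
                      uri = c("https://stackoverflow.com/", "https://hup.hu/"),
                      type = "text/x-moz-place",
                      stringsAsFactors = FALSE)
top <- data.frame(root = c("bookmarksMenuFolder", "toolbarFolder"),
                  stringsAsFactors = FALSE)
top$children <- list(NULL, toolbar)  # list column, one entry per folder
bmarks <- list(children = top)

level1 <- find_root_children(bmarks)
print(level1$title)
```

If the assumption about `root` holds for your file, `level1 <- find_root_children(bmarks_json_lite)` could replace the indexed line in generate_md.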

You can run the generate_md function on the example:

generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
                   "c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
                   "33c21c88dad35145e2792b6258ede9c882c580ec/",
                   "bookmarks-example.json"))

# Output
[1] "- Info\n  - Python\n    - [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)\n    - [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)\n    - [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)\n    - [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)\n  - Frontend\n    - [CodePen](https://codepen.io/)\n    - [JavaScript](https://www.javascript.com/)\n    - [CSS-Tricks](https://css-tricks.com/)\n    - [Butterick’s Practical Typography](https://practicaltypography.com/)\n    - [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)\n    - [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)\n    - [Client-Side Web Development](https://info340.github.io/)\n  - [Stack Overflow](https://stackoverflow.com/)\n  - [HUP](https://hup.hu/)\n  - [Hope in Source](https://hopeinsource.com/)\n"

You can print the result with cat, or write it to a file with writeLines. But beware! On Windows you probably need to set useBytes = TRUE to get the correct characters in the file. Reference: UTF-8 file output in R

cat(generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
                       "c01365b8f4931ea7ff9d1aee1cbbc391/raw/",
                       "33c21c88dad35145e2792b6258ede9c882c580ec/",
                       "bookmarks-example.json")))
# Output
- Info
  - Python
    - [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
    - [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
    - [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
    - [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
  - Frontend
    - [CodePen](https://codepen.io/)
    - [JavaScript](https://www.javascript.com/)
    - [CSS-Tricks](https://css-tricks.com/)
    - [Butterick’s Practical Typography](https://practicaltypography.com/)
    - [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
    - [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
    - [Client-Side Web Development](https://info340.github.io/)
  - [Stack Overflow](https://stackoverflow.com/)
  - [HUP](https://hup.hu/)
  - [Hope in Source](https://hopeinsource.com/)
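
Writing the result to a file could look like the sketch below. It assumes the Markdown string is already available in a variable (here a short literal named `md`, chosen for illustration) and writes to a temporary file. Converting with enc2utf8 and passing useBytes = TRUE follows the advice above; whether both are needed depends on your platform and locale.

```r
# Illustrative Markdown string; in practice this would be the value
# returned by generate_md().
md <- "- [Butterick\u2019s Practical Typography](https://practicaltypography.com/)\n"

out_file <- tempfile(fileext = ".md")

# enc2utf8() converts the string to UTF-8; useBytes = TRUE stops
# writeLines() from re-encoding it to the native encoding on output.
# sep = "" because the string already ends with a newline.
writeLines(enc2utf8(md), con = out_file, sep = "", useBytes = TRUE)

# Read the file back as UTF-8 to check the round trip.
back <- readLines(out_file, encoding = "UTF-8")
print(back)
```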

There was a problem with the regex part. If a bookmark title contains a hyphen surrounded by spaces, as in "some - title" (space, hyphen, space), that hyphen will also be "indented" as if it were a list item marker.

# Input JSON
https://gist.github.com/hermanp/381eaf9f2bf5f2b9cdf22f5295e73eb5

cat(generate_md(paste0("https://gist.githubusercontent.com/hermanp/",
                       "381eaf9f2bf5f2b9cdf22f5295e73eb5/raw/",
                       "76b74b2c3b5e34c2410e99a3f1b6ef06977b2ec7/",
                       "bookmarks-example-hyphen.json")))

# Output (two space indentation) markdown:
- Info
  - Python
    - [The Ultimate Python Beginner's Handbook](https://www.freecodecamp.org/news/the-python-guide-for-beginners/)
    - [Python Like You Mean It](https://www.pythonlikeyoumeanit.com/index.html)
    - [Automate the Boring Stuff with Python](https://automatetheboringstuff.com/)
    - [Data science Python notebooks](https://github.com/donnemartin/data-science-ipython-notebooks)
  - Frontend
    - [CodePen](https://codepen.io/)
    - [JavaScript - Wikipedia](https://en.wikipedia.org/wiki/JavaScript)  # correct
    - [CSS-Tricks](https://css-tricks.com/)
    - [Butterick’s Practical Typography](https://practicaltypography.com/)
    - [Front-end Developer Handbook 2019](https://frontendmasters.com/books/front-end-handbook/2019/)
    - [Using Ethics In Web Design](https://www.smashingmagazine.com/2018/03/using-ethics-in-web-design/)
    - [Client-Side Web Development](https://info340.github.io/)
  - [Stack Overflow](https://stackoverflow.com/)
  - [HUP](https://hup.hu/)
  - [Hope in Source](https://hopeinsource.com/)

I posted another question about this problem. After some hints and a few tries, I answered my own question.
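
An alternative that avoids the regex re-indentation altogether is to pass the nesting depth into the recursive call and build the indentation directly. The sketch below is not the code from the answer: `bookmarks_to_md` is a hypothetical function that works on plain nested lists (the shape that `jsonlite::fromJSON(..., simplifyVector = FALSE)` would return), and the tiny input is made up for illustration.

```r
# Hypothetical alternative: carry the depth through the recursion, so
# titles are never touched by a regular expression.
bookmarks_to_md <- function(nodes, depth = 0) {
  indent <- strrep("  ", depth)
  lines <- character()
  for (node in nodes) {
    if (identical(node$type, "text/x-moz-place")) {
      # Bookmark: emit a link item at the current depth.
      lines <- c(lines,
                 paste0(indent, "- [", node$title, "](", node$uri, ")"))
    } else if (identical(node$type, "text/x-moz-place-container")) {
      # Folder: emit its title, then its children one level deeper.
      lines <- c(lines,
                 paste0(indent, "- ", node$title),
                 bookmarks_to_md(node$children, depth + 1))
    }
    # Any other node type is skipped.
  }
  lines
}

# Made-up input with a " - " inside a title.
nodes <- list(
  list(type = "text/x-moz-place-container", title = "Frontend",
       children = list(
         list(type = "text/x-moz-place", title = "JavaScript - Wikipedia",
              uri = "https://en.wikipedia.org/wiki/JavaScript"))),
  list(type = "text/x-moz-place", title = "HUP", uri = "https://hup.hu/")
)

md_lines <- bookmarks_to_md(nodes)
cat(md_lines, sep = "\n")
```

Because the indentation is computed from `depth`, hyphens inside titles are left alone regardless of what surrounds them. Returning one string per line also makes the result easy to hand to writeLines.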
