Detecting all links in markdown files in Python and replacing them with the output of a string function
Question
I have a Python function f(foo: str) -> str. I'm not giving the details of the function because it could change.
I need to get all links from a markdown file and replace them with the result of that function.
Example: this text
This is a text and this [is first link](http://example.com "Example Title") and
this [is a second](#example) link.
would be replaced with
This is a text and this [is first link](result1 "Example Title") and
this [is a second](result2) link.
where f(http://example.com) = result1 and f(#example) = result2. That is, result1 is the output of f(http://example.com) and result2 is the output of f(#example).
Can this be done with Python regular expressions, or with some specific package that handles markdown files?
Recommended answer
Modifying mreinhart's answer to this question, this could be done as follows:
    import re


    def find_md_links(md):
        """Return a dict of links found in markdown:
        'regular': [foo](some.url)
        'footnotes': [foo][3]

        [3]: some.url
        """
        # https://stackoverflow.com/a/30738268/2755116
        # The URL group stops at the first whitespace, so an optional title
        # such as "Example Title" is not captured as part of the URL.
        INLINE_LINK_RE = re.compile(r'\[([^\]]+)\]\(([^)\s]+)(?:\s+[^)]*)?\)')
        FOOTNOTE_LINK_TEXT_RE = re.compile(r'\[([^\]]+)\]\[(\d+)\]')
        FOOTNOTE_LINK_URL_RE = re.compile(r'\[(\d+)\]:\s+(\S+)')

        links = list(INLINE_LINK_RE.findall(md))
        footnote_links = dict(FOOTNOTE_LINK_TEXT_RE.findall(md))
        footnote_urls = dict(FOOTNOTE_LINK_URL_RE.findall(md))

        footnotes_linking = []
        for number in footnote_links.values():
            # Skip references whose "[n]: url" definition is missing.
            if number in footnote_urls:
                footnotes_linking.append((number, footnote_urls[number]))

        return {'regular': links, 'footnotes': footnotes_linking}
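    def replace_md_links(md, f):
        """Replace each link URL in md with f(url)."""
        links = find_md_links(md)
        newmd = md

        # Note: str.replace substitutes every occurrence of the matched text,
        # including occurrences outside link syntax.
        for r in links['regular']:
            newmd = newmd.replace(r[1], f(r[1]))

        for r in links['footnotes']:
            newmd = newmd.replace(r[1], f(r[1]))

        return newmd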
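The str.replace calls in replace_md_links substitute the URL text wherever it appears in the document, not only inside links. As a possible alternative, here is a minimal sketch that uses re.sub with a callback so that only the URL part of each link is rewritten. The names replace_md_links_in_place, INLINE_RE, and DEFINITION_RE are made up for this illustration, and the patterns cover only simple inline links and numbered reference definitions, not the full markdown grammar.

```python
import re

# Inline links: [text](url) or [text](url "title"); only the URL is rewritten.
INLINE_RE = re.compile(r'(\[[^\]]+\]\()([^)\s]+)((?:\s+[^)]*)?\))')
# Numbered reference definitions at the start of a line: [3]: url
DEFINITION_RE = re.compile(r'^( {0,3}\[\d+\]:\s+)(\S+)', re.MULTILINE)


def replace_md_links_in_place(md, f):
    """Return md with each link URL replaced by f(url)."""
    def rewrite(match):
        # Keep the text before and after the URL exactly as it was.
        return match.group(1) + f(match.group(2)) + match.group(3)

    md = INLINE_RE.sub(rewrite, md)
    return DEFINITION_RE.sub(lambda m: m.group(1) + f(m.group(2)), md)


text = (
    'This is a text and this [is first link](http://example.com "Example Title") and\n'
    'this [is a second](#example) link.\n'
)
# Stand-in for f, using the result names from the question.
results = {'http://example.com': 'result1', '#example': 'result2'}
print(replace_md_links_in_place(text, lambda url: results.get(url, url)))
```

Because the substitution happens at the position of each match, a URL that also appears in ordinary prose is left untouched, and the title after the URL is preserved.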
f is a function. For example, I pass the following function to replace_md_links; it only changes links that start with #:
    import urllib.parse

    import slugify  # third-party; presumably the python-slugify package


    def mychange(s, prefix="/static/entrades/", suffix=".md.html"):
        """Change links from tiddlywiki syntax [foo](#something) to [foo](prefix + something + suffix)"""
        if s.startswith('#'):
            return prefix + slugify.slugify(urllib.parse.unquote(s.replace('#', '', 1))) + suffix
        else:
            return s
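To try the idea without installing a slugify package, a rough standard-library stand-in might look like the sketch below. Both simple_slug and mychange_stdlib are hypothetical names introduced here. The slug rule is a simplification that handles ASCII only: any other character becomes a hyphen, which will differ from what a real slugify library produces for accented or non-Latin text.

```python
import re
import urllib.parse


def simple_slug(text):
    """Hypothetical stand-in for slugify.slugify: lowercase ASCII, hyphen-separated."""
    return re.sub(r'[^a-z0-9]+', '-', text.lower()).strip('-')


def mychange_stdlib(s, prefix="/static/entrades/", suffix=".md.html"):
    """Rewrite '#something' to prefix + slug + suffix; leave other links alone."""
    if s.startswith('#'):
        # Drop the leading '#', decode %-escapes, then build the slug.
        return prefix + simple_slug(urllib.parse.unquote(s[1:])) + suffix
    return s


print(mychange_stdlib('#My%20First%20Note'))
print(mychange_stdlib('http://example.com'))
```

Links that do not start with # are returned unchanged, so the function can be passed as f without affecting external URLs.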