IPython notebook: read string from raw text cell

Question

I have a raw text cell in my IPython notebook project.

Is there a way to get the text as a string with a built-in function or something similar?

Recommended answer

My (possibly unsatisfactory) answer is in two parts. It is based on a personal investigation of IPython structures, and it's entirely possible I've missed something that answers the question more directly.

The raw text of code cells entered during the current session is available within a notebook through the list In.

So the raw text of the current cell can be returned by the following expression within the cell:

In[len(In)-1]

For example, evaluating a cell containing this code:

print "hello world"
three = 1+2
In[len(In)-1]

produces the corresponding Out[] value:

u'print "hello world"\nthree = 1+2\nIn[len(In)-1]'

So, within an active notebook session, you can access the raw text of a cell as In[n], where n is the displayed index of the required cell.
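As a rough illustration of the indexing described above, here is a minimal sketch that uses an ordinary Python list as a stand-in for IPython's In list, so it runs outside a notebook. The helper name input_text and the sample history are hypothetical, not part of IPython.

```python
def input_text(history, n=None):
    """Return the raw text of input number n, or of the latest input if n is None."""
    if n is None:
        return history[-1]  # same entry as history[len(history) - 1]
    return history[n]


# Stand-in for IPython's In list; In[0] is an empty placeholder string.
fake_history = ["", "x = 1", "y = x + 1"]

latest = input_text(fake_history)      # text of the most recent input
second = input_text(fake_history, 1)   # text of the cell shown as In [1]
```

Inside a live session, passing the real In list instead of fake_history should behave the same way.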

But if the cell was entered during a previous notebook session that has since been closed and reopened, this no longer works.

Also, only code cells seem to be included in the In list, so this approach wouldn't work for a raw text cell.

In my research, the only way I found to get the raw text from previous sessions was to read the original notebook file. The documentation page "Importing IPython Notebooks as Modules" describes how to do this. The key code is in In[4]:

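    import io
    from IPython.nbformat import current  # legacy API from older IPython releases

    # load the notebook object (path is the notebook file's location)
    with io.open(path, 'r', encoding='utf-8') as f:
        nb = current.read(f, 'json')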

The returned notebook object is accessed as a nested structure of dictionaries and lists, e.g.:

    for cell in nb.worksheets[0].cells:
        ...

The cell objects thus enumerated have two key fields for the purpose of this question:

  1. cell.cell_type is the type of the cell ("code", "markdown", "raw", etc.).

  2. cell.input is the raw text content of a code cell; in the saved JSON it is a list of strings, with an entry for each line of text. Non-code cells, including raw cells, appear to store their text in cell.source instead.
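To tie the two fields back to the original question, here is a minimal sketch that reads a saved notebook with the standard json module and collects the text of its raw cells. It is written under several assumptions: that raw cells keep their text under a "source" key, that older files nest cells under "worksheets" while newer files have a top-level "cells" list, and that the helper names raw_cell_texts and read_raw_cells are purely illustrative.

```python
import json


def raw_cell_texts(nb):
    """Return the text of every raw cell in a notebook dict as a list of strings."""
    # Assumption: newer files keep cells at the top level,
    # older files nest them inside a list of worksheets.
    if "cells" in nb:
        cells = nb["cells"]
    else:
        cells = [c for ws in nb.get("worksheets", []) for c in ws.get("cells", [])]

    texts = []
    for cell in cells:
        if cell.get("cell_type") != "raw":
            continue
        source = cell.get("source", "")
        # The JSON may store the text as one string or as a list of lines.
        if isinstance(source, list):
            source = "".join(source)
        texts.append(source)
    return texts


def read_raw_cells(path):
    """Load a notebook file from path and return the text of its raw cells."""
    with open(path, encoding="utf-8") as f:
        return raw_cell_texts(json.load(f))


# Small in-memory example in the older, worksheet-based layout.
sample = {
    "worksheets": [
        {
            "cells": [
                {"cell_type": "raw", "source": ["line one\n", "line two"]},
                {"cell_type": "code", "input": ["1+2"]},
            ]
        }
    ]
}
raw_texts = raw_cell_texts(sample)
```

Because this reads the file on disk, it only sees what was last saved, not unsaved edits in the open notebook.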

Much of this can be seen by looking at the JSON data that constitutes a saved IPython notebook.

Apart from the "prompt number" fields in a notebook, which seem to change whenever the cell is re-evaluated, I could find no way to create a stable reference to a notebook cell.

I couldn't find an easy answer to the original question. What I found is covered above. Without knowing the motivation behind the original question, I can't know if it's enough.

What I looked for, but was unable to identify, was a way to reference the current notebook from within the notebook itself (e.g. via a function like get_ipython()). That doesn't mean it doesn't exist.

The other missing piece in my response is any kind of stable way to refer to a specific cell. (For example, looking at the notebook file format, raw text cells consist solely of a cell type ("raw") and the raw text itself, though it appears that cell metadata might also be included.) This suggests the only way to directly reference a cell is through its position in the notebook, but that is subject to change when the notebook is edited.
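If cell metadata is indeed saved with raw cells, one possible workaround is to label a cell in its metadata and look it up by that label rather than by position. The sketch below assumes each cell is a dict with an optional "metadata" dict; the key "name", the label "config", and the helper find_cell_by_name are hypothetical choices for illustration, not something the notebook format is confirmed here to guarantee.

```python
def find_cell_by_name(cells, name, key="name"):
    """Return the first cell whose metadata[key] equals name, or None if absent."""
    for cell in cells:
        if cell.get("metadata", {}).get(key) == name:
            return cell
    return None


# Hypothetical cells, as they might appear after loading a notebook's JSON.
cells = [
    {"cell_type": "code", "input": ["1+2"], "metadata": {}},
    {"cell_type": "raw", "source": "some raw text", "metadata": {"name": "config"}},
]

match = find_cell_by_name(cells, "config")
missing = find_cell_by_name(cells, "no-such-label")
```

A label like this survives reordering of cells, but it still has to be added by hand and the notebook saved before the file can be read.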

(Researched and answered as part of the Oxford participation in http://aaronswartzhackathon.org)
