列表展平 [英] List Flatten

查看:90
本文介绍了列表展平的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

这是我的第一个Python程序(它是Luther Blissett的递归

版本的改进)。给出这样的列表:

[[" a",2,[],[3,5,(2,1),[4]]]]

它产生扁平版本:

[" a",2,3,5,(2,1),4]


我认为此操作非常重要,它类似于内置的Mathematica Flatten功能:

l = {{" a",2,{},{3,5,{ 4}}}}

Flatten [l]

=> {" a",2,3,5,4}


此程序也适用于非常深的列表,而不会像递归那样给出堆栈

溢出版本(它比

递归版本更快)。


有一个额外的循环来加速已经相当平坦的列表(我认为这是最常见的
但是这会让一些非常深的列表变得扁平化。我不知道这对于普通情况是否是最好的

妥协。

我已经尝试了将堆栈作为字典的版本,但时间

几乎是相同的,它使用了更多的内存(我已经用

列表和100万个嵌套级别进行了尝试)。


def listflatten(l):

""" Flattens a list,examples:

listflatten(" a")=> a

listflatten([])=> []

listflatten([[1,[2,[]," a"]])=> [1,2,a]

"""

如果类型(l)!= list:

return l

else:

result = []

stack = []

stack.append((l, 0))

而len(堆栈)!= 0:

序列,j = stack.pop(-1)

而j< ; len(序列):

如果是type(sequence [j])!= list:

k,j,lens = j,j + 1,len(sequence)

而j< len(序列)和\

(类型(序列[j])!= list):

j + = 1

result.extend (序列[k:j])

else:

stack.append((sequence,j + 1))

sequence,j =序列[j],0

返回结果


我认为这个程序基本上(几乎)是O(n)因为序列

list在插入堆栈时并没有真正被复制,并且在
中我发现一些测试我看到列表尾部的推送弹出几乎是

O(1)(但是列表头部中元素的追加是~O(n))。

错误:对于非常大的列表,此程序可能会因内存崩溃溢出。

我还是不知道如何解决这个问题。特殊情况MemoryError

即使在HD发生大量破坏时也不会出现......这可能是操作系统的内存管理问题...... -_ -


评论赞赏,

熊拥抱,

bearophile

This is my first Python program (it''s an improvement of a recursive
version by Luther Blissett). Given a list like this:
[["a", 2, [], [3, 5, (2,1), [4]]]]

It produces the "flatted" version:
["a", 2, 3, 5, (2, 1), 4]

I think this operation is quite important, and it''s similar to the
built-in Mathematica Flatten function:
l = {{"a", 2, {}, {3, 5, {4}}}}
Flatten[l]
=> {"a", 2, 3, 5, 4}

This program works for very deep lists too, without giving Stack
overflow like the recursive version (and it''s faster than the
recursive version).

There''s an added loop to speed up the already quite flat lists (I
think they are the most common) but this slows down a bit the
flattening of very deep lists. I don''t know if this is the best
compromise for average situations.
I''ve tried a version with the stack as a dictionary, but the timings
are almost the same and it uses a LOT more memory (I''ve tried it with
lists with a million nested levels).

def listflatten(l):
"""Flattens a list, examples:
listflatten("a") => "a"
listflatten([]) => []
listflatten([[1,[2,[],"a"]]) => [1,2,"a"]
"""
if type(l) != list:
return l
else:
result = []
stack = []
stack.append((l,0))
while len(stack) != 0:
sequence, j = stack.pop(-1)
while j < len(sequence):
if type(sequence[j]) != list:
k, j, lens = j, j+1, len(sequence)
while j < len(sequence) and \
(type(sequence[j]) != list):
j += 1
result.extend(sequence[k:j])
else:
stack.append((sequence, j+1))
sequence, j = sequence[j], 0
return result

I think this program is essentially (almost) O(n) because the sequence
list isn''t really copied when it''s inserted into the stack, and in
some tests I''ve seen that the pushes-pops in the lists tail are almost
O(1) (but the Append of an element in the head of a list is ~O(n)).
Bugs: for really big lists this program can crash for memory overflow.
I still don''t know how to fix this problem. The exception MemoryError
doesn''t come even when the HD trashes a lot... This is probably a
memory management problem of the Operative System... -_-

Comments are appreciated,
bear hugs,
bearophile

推荐答案

bearophile写道:
bearophile wrote:
def listflatten(l):
""" Flattens?* a?* list,?* examples :
listflatten(" a")?* =>?*" a"
listflatten([])?* =>?* []
listflatten([[1 ,[2,[]," a"]])?* =>?* [1,2," a"]
"""
if?* type( l)?*!=?*清单:
返回?* l
否则:
结果?* =?* []
堆栈?* =?* []
stack.append((l,0))
while?* len(stack)?*!=?* 0:
sequence,?* j?* =?* stack.pop( - 1)
而?* j?*<?* len(序列):
if?* type(sequence [j])?*!=?* list:
k,? * j,?*镜头?* =?* j,?* j + 1,?* len(序列)
而?* j?*<?* len(序列)?*和?* \
(类型(sequence [j])?*!=?* list):
j?* + =?* 1
result.extend(sequence [k:j])
否则:
stack.append((序列,?* j + 1))
序列,?* j?* =?* sequence [j],?* 0
返回? *结果
def listflatten(l):
"""Flattens?*a?*list,?*examples:
listflatten("a")?*=>?*"a"
listflatten([])?*=>?*[]
listflatten([[1,[2,[],"a"]])?*=>?*[1,2,"a"]
"""
if?*type(l)?*!=?*list:
return?*l
else:
result?*=?*[]
stack?*=?*[]
stack.append((l,0))
while?*len(stack)?*!=?*0:
sequence,?*j?*=?*stack.pop(-1)
while?*j?*<?*len(sequence):
if?*type(sequence[j])?*!=?*list:
k,?*j,?*lens?*=?*j,?*j+1,?*len(sequence)
while?*j?*<?*len(sequence)?*and?*\
(type(sequence[j])?*!=?*list):
j?*+=?*1
result.extend(sequence[k:j])
else:
stack.append((sequence,?*j+1))
sequence,?*j?*=?*sequence[j],?*0
return?*result




我看到我的新闻阅读器对于flatten的含义有不同的概念

be :-)


您是否考虑过发电机?


def flatten(项目):

如果不是isinstance(项目,清单):

收益项

stack = [iter(items)]

叠加:

for stack in [[1]:

if isinstance(item,list):

stack.append(iter(item))

break

产品项目

否则:

stack.pop()


断言列表(flatten([[[[]]], 0,1,[[[2,[3,[4]]]],[],[5,6],7,[8,9]],

10]))= =范围(11)


我还没有计时,我只是更喜欢发电机的清晰度(提供

它是正确的,我还没有测试了它)。这主要是通过将索引与迭代器对象中的基础列表一起隐藏来实现的。

生成器也可以为非常大的数据结构节省大量内存

因为你可以迭代你的嵌套列表的平_view_而不必_ $
必须_create_大单位列表。


Peter



I see my newsreader has a different notion about what flatten is meant to
be :-)

Have you considered generators?

def flatten(items):
if not isinstance(items, list):
yield items
stack = [iter(items)]
while stack:
for item in stack[-1]:
if isinstance(item, list):
stack.append(iter(item))
break
yield item
else:
stack.pop()

assert list(flatten([[[[]]], 0, 1, [[[2,[3, [4]]]], [],[5, 6], 7, [8, 9]],
10])) == range(11)

I have not timed it, I just prefer the generator for its clarity (provided
it is correct, I haven''t tested it either). This is achieved mostly by
hiding the indices together with the underlying list in iterator objects.
Generators may also save you a lot of memory for very large data structures
as you can iterate over a flat _view_ of your nested lists without ever
having to _create_ the large flat list.

Peter


Peter Otten写道:
Peter Otten wrote:
生成器也可以为非常大的数据节省大量内存
结构,你可以迭代你的嵌套列表的平面视图
,而无需创建大的平面列表。
Generators may also save you a lot of memory for very large data
structures as you can iterate over a flat view of your nested lists
without ever having to create the large flat list.




特别是如果你避免建立临时"堆叠"列表,通过使用

的递归:

def deep_iter(嵌套):

for x in nested:

if isinstance(x,(list,tuple)):

for y in deep_iter(x):

yield y

:否则:

收益率x

上面的例子可以用来压扁嵌套列表或元组。


Jeffrey



Particularly if you avoid building the interim "stack" list, by making use
of recursion:

def deep_iter(nested):
for x in nested:
if isinstance(x, (list, tuple)):
for y in deep_iter(x):
yield y
else:
yield x
The example above should work to flatten nested lists or tuples.

Jeffrey


On Sun,2004年10月17日07:21:56 -0700,Jeffrey Froman写道:
On Sun, 17 Oct 2004 07:21:56 -0700, Jeffrey Froman wrote:
特别是如果你避免建立临时堆栈。列表,通过使用递归:
Particularly if you avoid building the interim "stack" list, by making use
of recursion:




原始海报明确提到不使用递归因为

想要展平事物比调用堆栈更深入。



The original poster explicitly mentioned not using recursion due to
wanting to flatten things deeper than the call stack can go.


这篇关于列表展平的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆