Python: KeyError: 0 - function that creates a list from subsets of data frame accessed via a loop
Problem description
I have a loop that creates subsets of a larger data frame, which are then used as input to an analysis function. This function returns a list. I added print statements to see where it stops. On the first iteration of the loop I can see the subset and the list returned by the function. On the second iteration I can see the second subset, but then inside the function I get the following error:
<ipython-input-11-8f6203e297e3> in ssd(x, y)
8
9 for i in range(x.shape[0]):
---> 10 spread_cumdiff += (x[i] - y[i]) **2
11
12 return spread_cumdiff
Note that the block above is the last frame of my own code before the pandas-internal part of the traceback below. The full traceback has two more similar blocks above it: function a) calls function b), which calls the block shown above.
~/anaconda3/envs/thesis/lib/python3.5/site-packages/pandas/core/series.py in __getitem__(self, key)
621 key = com._apply_if_callable(key, self)
622 try:
--> 623 result = self.index.get_value(self, key)
624
625 if not is_scalar(result):
~/anaconda3/envs/thesis/lib/python3.5/site-packages/pandas/core/indexes/base.py in get_value(self, series, key)
2558 try:
2559 return self._engine.get_value(s, k,
-> 2560 tz=getattr(series.dtype, 'tz', None))
2561 except KeyError as e1:
2562 if len(self) > 0 and self.inferred_type in ['integer', 'boolean']:
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_value()
pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Int64HashTable.get_item()
KeyError: 0
My code is roughly as follows:
import numpy as np
import pandas as pd

df_test = pd.DataFrame(np.random.randint(low=0, high=10, size=(5, 5)),
                       columns=['a', 'b', 'c', 'd', 'e'],
                       index=['20100101', '20100102', '20100103', '20100104', '20100105'])
dfs = []
N = 3
# Slide a window of N rows over the frame: rows 0..2, then 1..3, then 2..4.
for x in np.arange(len(df_test) + 1)[N:]:
    df1 = df_test.iloc[np.arange(x - N, x)]
    test_list = myfunc(df1)  # it takes in df1, makes some computation and returns a
                             # list of 2-element tuples, i.e. [('a', 'b'), ('d', 'e')]
See the function below:
def ssd(x, y):
    spread_cumdiff = 0
    for i in range(x.shape[0]):
        #print("x_i", x[i])
        #print("y_i", y[i])
        spread_cumdiff += (x[i] - y[i]) **2
    return spread_cumdiff
I tried adding print statements inside the function, but on the second iteration of the loop execution does not even get that far.
import itertools
from scipy import stats

def pairs_match(df, p):
    df_norm = df.assign(**df.drop('datetime', 1).pipe(lambda d: d.div(d.shift().bfill()).cumprod()))
    df_norm = df_norm.replace([np.inf, -np.inf], np.nan)
    df_norm.fillna(method = 'ffill', inplace = True)
    df_norm.fillna(method = 'bfill', inplace = True)
    ticker = df_norm.columns.values.tolist()
    ticker.pop(0)
    ticker_list = pd.DataFrame({'ticker': ticker})
    # to be implemented: if length of list list <2, then skip the entire run!
    all_pairs = list(itertools.permutations(ticker_list.ticker, 2))
    squared = []
    presel_pairs = []
    for i in all_pairs:
        # train_win is not defined in this snippet
        squared.append(ssd(df_norm[i[0]].head(n = train_win), df_norm[i[1]].head(n = train_win))) # ssd(x,y) function from above
    tbl_dist = pd.DataFrame({'Pair' : all_pairs, 'SSD' : squared})
    ssd_perctl = p
    ssd_thresh = stats.scoreatpercentile(tbl_dist['SSD'], ssd_perctl)
    presel_pairs = tbl_dist[tbl_dist['SSD'] <= ssd_thresh]
    presel_pairs_list = presel_pairs['Pair']
    presel_pairs_list = presel_pairs_list.reset_index(drop = True)
    return presel_pairs_list
pairs_match(df, p) returns a list, which is then used in another function.
Recommended answer
Try printing x[i] and y[i] separately so you know which of the two causes the KeyError. Also, please post the function, because without it we have no clue what's going on.
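For what it's worth, the Int64HashTable frames in the traceback suggest the Series has an integer index, in which case x[i] is a lookup by label rather than by position. A window taken with iloc keeps its original labels, so from the second window onward there may simply be no label 0. The following is only a minimal sketch of that likely cause and one possible workaround; the data frame, the column names, and the helper ssd_positional are made up for illustration and are not the asker's code.

```python
import numpy as np
import pandas as pd


def ssd_positional(x, y):
    """Sum of squared differences, pairing elements by position, not by label."""
    x_vals = np.asarray(x, dtype=float)
    y_vals = np.asarray(y, dtype=float)
    return float(((x_vals - y_vals) ** 2).sum())


# Hypothetical data with a default integer index (labels 0..4).
df = pd.DataFrame({"a": [1.0, 2.0, 3.0, 4.0, 5.0],
                   "b": [1.5, 2.5, 2.0, 4.0, 7.0]})

# Second window of a rolling loop: positions 1..3, which keep labels 1, 2, 3.
window = df.iloc[1:4]

try:
    window["a"][0]            # label lookup: this window has no label 0
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

first_by_position = window["a"].iloc[0]   # positional lookup ignores the labels
result = ssd_positional(window["a"], window["b"])
```

If this is indeed the cause, replacing x[i] and y[i] with x.iloc[i] and y.iloc[i] inside ssd, or calling reset_index(drop=True) on each subset before passing it in, should behave the same way on every iteration.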