How do I fix a sliding window program that does not place data in all windows?

Views: 474 Posted: 2018/4/16 16:35:12 Tags: python text iterator full-text-search sliding-window

Problem description

Note: The generator is definitely working. It's not the problem.

I am working with a large text file that contains the decimal digits of pi in the format shown below. Note that the header consists only of numbers; it contains no strings.

Header format: Number_of_sequences Total_Pi_Digits File_Version_Number

550 10000 5

*Pi Sequence Part 1
1415926535897932384
*Pi Sequence Part 2
6264338327950288419
*Pi Sequence Part 3
1693993751058209749

I need to make a sliding window that crops the file using three arguments (window_size, step_size, and lastwindow_start). lastwindow_start is the position where the last window starts.

The number of output files is determined by dividing Total_Pi_Digits by window_size.

If the file had 99 Total_Pi_Digits, a window_size of 10, and no overlap between windows (step_size equal to window_size), there would be a total of 10 windows, since 99 // 10 = 9 full windows and 99 % 10 leaves 9 digits for window 10.

I guess lastwindow_start should be 90 for this example. I am not sure that I need lastwindow_start at all.
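
The window arithmetic for the 99-digit example can be checked with a short sketch. It assumes non-overlapping windows (step_size equal to window_size) and is written for Python 3; window_bounds is a hypothetical helper, not part of the original program.

```python
def window_bounds(total, window_size, step_size):
    """Yield (start, end) pairs; the last window may be shorter than window_size."""
    for start in range(0, total, step_size):
        # Clamp the end so the final window stops at the last item.
        yield (start, min(start + window_size, total))
        # Once a window reaches the end of the data, later starts add nothing new.
        if start + window_size >= total:
            break

windows = list(window_bounds(99, 10, 10))
print(len(windows))   # 10
print(windows[-1])    # (90, 99)
```

This gives 9 full windows plus one partial window that starts at 90. Note that range(0, 90, 10) stops at 80, so a generator that uses lastwindow_start as an exclusive upper bound never emits the window that starts at 90.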

For each window, a file will be created with the name PiSubsection# where # is the window number.

Every output file should start with a new header in the same format: Number_of_sequences Total_Pi_Digits File_Version_Number.

Number_of_sequences and Total_Pi_Digits will change based on window_size and step_size, but File_Version_Number must not change.

My problem is that my program does not fill all windows with digits of pi. All files after window 7 contain only the header and NO digits of pi. So only half of the text file ends up in the windows.

The problem has to do with how islice takes window[0] and window[1]. For some reason, the positions islice uses for window[0] and window[1] differ from the window[0] and window[1] produced by my generator.

Why are islice's window[0] and window[1] different from the window[0] and window[1] produced by my generator? How do I fix this?

inputFileName = "sample.txt"

import itertools
import linecache

def sliding_window(windows_size, step_size, lastwindow_start):
    for i in xrange(0, lastwindow_start, step_size):
        yield (i, i + windows_size)

def PiCrop(windows_size, step_size):

    f = open(inputFileName, 'r')
    first_line = f.readline().split()

    Total_Pi_Digits = int(first_line[0])

    lastwindow_start = Total_Pi_Digits-(Total_Pi_Digits%windows_size)

    lastcounter = (Total_Pi_Digits//windows_size)*(windows_size/step_size)

    flags = [False for i in range(lastcounter)]

    first_line[0] = str(windows_size)

    second_line = f.readline().split()

    offset = int(round(float(second_line[0].strip('\n'))))

    first_line = " ".join(first_line)

    f.close()

    with open(inputFileName, 'r') as input:
        for line in input:
            for counter, window in enumerate(sliding_window(windows_size,step_size,lastwindow_start)):
                with open('PiSubsection_{}.txt'.format(counter), 'w+') as output:
                    if (flags[counter] == False):
                        flags[counter] = True
                        headerline = float(linecache.getline(inputFileName, window[1]+1)) - offset
                        output.write(str(windows_size) + " " + str(headerline) + " " + 'L' + '\n')

                    for xline in itertools.islice(input, window[0], window[1], None):
                        newline = str("{0:.4f}".format(float(xline.strip('\n'))-offset))
                        output.write(str(newline) + '\n')
                        input

Solution

I worked on the problem for a few hours. With the help of some friends, I decided to store the text file in memory as one large list and to loop over slices of it, data[window[0]:window[1]], instead of using islice.

Removing islice fixed the problem.
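
A likely explanation for the empty files, offered as a sketch rather than a diagnosis of the exact run: itertools.islice does not seek. Its start and stop count items from wherever the underlying iterator currently is, and every item it skips or yields is consumed. Calling it repeatedly on the same open file with absolute window positions therefore reads further and further ahead until the file is exhausted. The example below is Python 3, uses io.StringIO as a stand-in for the file, and the data is made up.

```python
import io
import itertools

# Stand-in for the data file: 20 lines numbered 0..19 (made-up data).
source = io.StringIO("".join("{}\n".format(n) for n in range(20)))

# Windows expressed as absolute positions, as the generator produces them.
windows = [(0, 10), (5, 15), (10, 20)]

# Repeated islice on the SAME iterator: positions are relative to the
# current read position, and skipped items are consumed.
shared = [[int(x) for x in itertools.islice(source, a, b)] for a, b in windows]

# Slicing a list held in memory: positions are absolute every time.
lines = list(range(20))
sliced = [lines[a:b] for a, b in windows]

print(shared)
print(sliced)
```

With the shared iterator, the first window gets lines 0 to 9, the second gets only lines 15 to 19, and the third is empty. With list slicing, every window gets the lines its positions describe.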

inputFileName = "sample.txt"

import itertools
import linecache

# Python 2 code: it relies on xrange and on str.translate(None, chars).

def sliding_window(window_size, step_size, lastwindow_start):
    # Yield the (start, end) positions of each window.
    for i in xrange(0, lastwindow_start, step_size):
        yield (i, i + window_size)

def PiCrop(window_size, step_size):

    f = open(inputFileName, 'r')

    # Read the header and derive the window bookkeeping from it.
    first_line = f.readline().split()

    Total_Pi_Digits = int(first_line[0])

    lastwindow_start = Total_Pi_Digits-(Total_Pi_Digits%window_size)

    lastcounter = (Total_Pi_Digits//window_size)*(window_size/step_size)

    # One flag per window, so each header is written only once.
    flags = [False for i in range(lastcounter)]

    first_line[0] = str(window_size)
    second_line = f.readline().split()
    offset = int(round(float(second_line[0].strip('\n'))))
    first_line = " ".join(first_line)

    f.close()

    with open(inputFileName, 'r') as f:
        header = f.readline()
        # Hold every data line in memory so windows can be taken by slicing.
        data = [line.strip().split(',') for line in f.readlines()]

        for counter, window in enumerate(sliding_window(window_size,step_size,lastwindow_start)):
            # Absolute positions: no iterator state is shared between windows.
            chunk = data[window[0]:window[1]]

            with open('PiCrop_{}.txt'.format(counter), 'w') as output:

                if (flags[counter] == False):
                    flags[counter] = True

                    headerline = float(linecache.getline(inputFileName, window[1]+1)) - offset
                    output.write(str(window_size) + " " + str("{0:.4f}".format(headerline)) + " " + 'L' + '\n')

                for item in chunk:
                    # item is a one-element list; strip the list punctuation from its string form.
                    newline = str("{0:.4f}".format(float(str(item).translate(None, "[]'"))-offset))
                    output.write(str(newline) + '\n')

PiCrop(1000,500)
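
For readers on Python 3, the same slicing idea can be sketched as follows. This is a simplified illustration, not a port of the program above: crop is a hypothetical helper, the header and offset handling are left out, the window count is taken from the number of lines rather than from the file header, and the data is made up.

```python
import os
import tempfile

def sliding_window(window_size, step_size, last_window_start):
    # Same generator as above, with range() in place of Python 2's xrange().
    for i in range(0, last_window_start, step_size):
        yield (i, i + window_size)

def crop(lines, window_size, step_size, out_dir):
    """Write one file per window and return the paths that were written."""
    last_window_start = len(lines) - (len(lines) % window_size)
    paths = []
    for counter, (start, end) in enumerate(
            sliding_window(window_size, step_size, last_window_start)):
        path = os.path.join(out_dir, "PiCrop_{}.txt".format(counter))
        with open(path, "w") as output:
            # Slicing uses absolute positions, so no window depends on the previous one.
            output.write("\n".join(lines[start:end]) + "\n")
        paths.append(path)
    return paths

# Made-up data: 20 short lines standing in for the pi sequence lines.
demo_lines = [str(n) for n in range(20)]
with tempfile.TemporaryDirectory() as tmp:
    written = crop(demo_lines, 10, 5, tmp)
    sizes = []
    for p in written:
        with open(p) as fh:
            sizes.append(len(fh.read().split()))
print(sizes)
```

Every output file receives data here, including the last one, which is shorter because its window runs past the end of the list.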
