How to divide my data into distinct mini-batches randomly [JULIA]

Question

I have a data vector of 100000 examples. The values are -1 and 1. I want to randomly draw 16 distinct mini-batches from this data, each of 6250 examples.

Here is my code to generate the vector of 100000 examples, which is stored in a file.

The question of how to divide my data into different parts was answered by Dan.

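Now I want to store the parts [X[p] for p in parts] in separate files, one file per part. That is, if I have 3 parts, I want to create 3 files and store one part in each. How can I do that?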

using JLD, HDF5   # Pkg.add("JLD"), Pkg.add("HDF5"): needed to store vectors in .jld files
using Printf      # @sprintf (Julia 0.7 and later)
using Random      # randperm (Julia 0.7 and later)
#import HTreeRBM

# Convert each random number in s to 1.0 (above level) or -1.0, in place,
# then save the resulting vector to a file.
function float_to_binary(s, level=0.4)
    for i = 1:length(s)
        s[i] = s[i] > level ? 1.0 : -1.0
    end
    file = jldopen("/home/anelmad/Desktop/stage-inria/code/HTreeRBM.jl/artificial_data/mydata.jld", "w")
    write(file, "s", s)  # alternatively, say "@write file s"
    close(file)
    return s
end

# Split a random permutation of 1:l into k parts of near-equal size.
function kfoldperm(l, k)
    n, r = divrem(l, k)
    b = collect(1:n:l+1)
    for i in 1:length(b)
        b[i] += i > r ? r : i-1
    end
    p = randperm(l)
    return [p[r] for r in [b[i]:b[i+1]-1 for i=1:k]]
end

# m is the length of the vector (for instance m = 100000)
# and k is the number of partitions (say k = 16).
function gen_random(m, k)
    s = rand(m)
    X = float_to_binary(s)
    parts = kfoldperm(length(X), k)
    for p in 1:length(parts)
        file = jldopen(@sprintf("my path to file/mini_batch%d.jld", p), "w")
        write(file, "X", X[parts[p]])  # store only part p in file p
        close(file)
    end
    return [X[p] for p in parts]
end

Recommended answer

Define kfoldperm by running:

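using Random  # randperm lives in the Random standard library since Julia 0.7

function kfoldperm(N,k)
    n,r = divrem(N,k)        # base part size n, with r elements left over
    b = collect(1:n:N+1)     # evenly spaced start positions
    for i in 1:length(b)
        b[i] += i > r ? r : i-1   # shift the starts so the first r parts get one extra element
    end
    p = randperm(N)          # random ordering of the indices 1:N
    return [p[r] for r in [b[i]:b[i+1]-1 for i=1:k]]
end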
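
To see how the boundary adjustment in kfoldperm works, here is a minimal sketch that isolates the range computation. The helper name fold_ranges is hypothetical and not part of the answer; it only repeats the lines that build b and returns the index ranges before any shuffling.

```julia
# Hypothetical helper (not in the original answer): returns the contiguous
# ranges that kfoldperm cuts out of the random permutation.
function fold_ranges(N, k)
    n, r = divrem(N, k)            # base part size and leftover count
    b = collect(1:n:N+1)           # evenly spaced start positions
    for i in 1:length(b)
        b[i] += i > r ? r : i - 1  # first r parts absorb one extra element each
    end
    return [b[i]:b[i+1]-1 for i in 1:k]
end

ranges = fold_ranges(10, 3)
println(ranges)
println(length.(ranges))
```

For N = 10 and k = 3, divrem gives n = 3 and r = 1, so the ranges are 1:4, 5:7 and 8:10: the first r parts receive one extra element. For N = 100000 and k = 16 the remainder is 0, so every part has exactly 6250 elements.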

Now

v = rand(10)
parts = kfoldperm(10,3)
[v[p] for p in parts]

This gives you a partition of v into 3 parts.
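
Applied to the setting in the question (100000 values of -1 and 1, 16 mini-batches), the same idea can be sketched as below, with each part written to its own file. This is an illustration under stated assumptions, not the asker's exact setup: it swaps JLD for the standard-library Serialization module so that it runs without extra packages, and the helper save_batches, the file names mini_batch<i>.bin and the temporary directory are all hypothetical. With JLD, the serialize call would be replaced by the jldopen / write / close sequence from the question, writing X[idx] for the current part only.

```julia
using Random          # randperm
using Serialization   # serialize / deserialize (standard library, Julia 1.1+)

# kfoldperm as defined in the answer
function kfoldperm(N, k)
    n, r = divrem(N, k)
    b = collect(1:n:N+1)
    for i in 1:length(b)
        b[i] += i > r ? r : i - 1
    end
    p = randperm(N)
    return [p[r] for r in [b[i]:b[i+1]-1 for i = 1:k]]
end

# Hypothetical helper: split X into k random parts and write part i to file i.
function save_batches(X, k, dir)
    parts = kfoldperm(length(X), k)
    paths = String[]
    for (i, idx) in enumerate(parts)
        path = joinpath(dir, "mini_batch$(i).bin")  # assumed naming scheme
        serialize(path, X[idx])                     # only part i goes into file i
        push!(paths, path)
    end
    return parts, paths
end

# Data as in the question: values above 0.4 become 1.0, the rest -1.0.
X = [rand() > 0.4 ? 1.0 : -1.0 for _ in 1:100_000]
dir = mktempdir()                                   # assumed output location
parts, paths = save_batches(X, 16, dir)

# Each batch has 6250 indices, and together they cover 1:100000 exactly once.
@assert all(length(p) == 6250 for p in parts)
@assert sort(vcat(parts...)) == collect(1:100_000)
```

Reading a batch back is then deserialize(paths[i]), which returns the same values as X[parts[i]].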
