Discrepancy between R's Keras and Python's Keras -- Accuracy bug?


Question

I'm experimenting with a 2D CNN in Keras to predict Bike Sharing Demand.

R performs very poorly compared with Python, which reaches good accuracy easily. I thought the cause was the array shapes (and some differences between R and Python), so I experimented with that for a while and ultimately tried all possible shapes.
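To make the role of the reshape order concrete, here is a minimal NumPy sketch. The 2 x 6 toy matrix is a made-up stand-in for the real 8887 x 6 feature matrix; it only illustrates how order='C' and order='F' arrange the same six features of one sample differently.

```python
import numpy as np

# Hypothetical miniature of the feature matrix: 2 samples x 6 features
# (the real matrix in the question has 8887 rows).
X = np.arange(12, dtype=np.float32).reshape(2, 6)

# Row-major ("C") order: the last index varies fastest.
x_c = X.reshape((2, 3, 2, 1), order="C")

# Column-major ("F") order: the first index varies fastest.
x_f = X.reshape((2, 3, 2, 1), order="F")

# Compare the 3 x 2 "image" built for the first sample in each case.
print(x_c[0, :, :, 0])
print(x_f[0, :, :, 0])
```

In both orders the first sample keeps its own six features; only their placement inside the 3 x 2 grid changes, and that placement is what the convolution kernels see.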

I created the CombinationGrid object elsewhere; it looks like this:

+------+------+------+------+-------+
| Dim1 | Dim2 | Dim3 | Dim4 | Order |
+------+------+------+------+-------+
| 8887 |    3 |    2 |    1 | F     |
|    3 | 8887 |    2 |    1 | F     |
| 8887 |    2 |    3 |    1 | C     |
|    2 | 8887 |    3 |    1 | C     |
+------+------+------+------+-------+

It is a table of dimension combinations for 4-dimensional arrays (it is used in the code below, where its role will be clearer). Here is the full R code, for reproducibility:


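#Load libraries (read_delim comes from readr; the model functions come from keras)
library(readr)
library(keras)

#Read data
TrainDF=read_delim(file='train.csv', delim=',')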

#Subset
X_Train=TrainDF[2000:nrow(TrainDF),c('temp', 'atemp', 'humidity', 'windspeed', 'casual', 'registered')]
Y_Train=as.matrix(TrainDF[2000:nrow(TrainDF),c('count')])

#YVal
YVal=as.matrix(Y_Train)

#For loop and try all combinations
Results=list()
for(i in 1:nrow(CombinationGrid)){

  #Reshape using all possible combinations
  XVal=array_reshape(x=as.matrix(X_Train), dim=CombinationGrid[i,1:4], order=CombinationGrid[i,]$Order)

  #Keras Model
  model=keras_model_sequential() 
  model %>% 
    layer_conv_2d(filters=10, kernel_size=c(2,2), padding='same', activation='relu') %>%
    layer_conv_2d(filters=15, kernel_size=c(2,2), padding='same', activation='relu') %>%
    layer_conv_2d(filters=20, kernel_size=c(3,3), padding='same') %>%
    layer_max_pooling_2d(pool_size=c(2,2), strides=1) %>%
    layer_flatten() %>%
    layer_dense(units=30, activation='relu') %>%
    layer_dense(units=20, activation='relu') %>%
    layer_dense(units=10, activation='relu') %>%
    layer_dense(units=1)

  #Compile model
  model %>% compile(
    loss = 'mse',
    optimizer = optimizer_adam(),
    metrics = c('accuracy'))

  #Train model
  Hist=tryCatch({
    model %>% fit(XVal, YVal, epochs = 100)
  },error=function(e){
    Hist=list('metrics'=list('loss'=NA, 'acc'=NA))
  })

  #Save results
  Results[[i]]=list('Loss'=Hist$metrics$loss[length(Hist$metrics$loss)], 'Acc'=Hist$metrics$acc[length(Hist$metrics$acc)])

}

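And here is the Python code:

import numpy as np
import pandas as pd
import keras  #standalone Keras, as used when these results were produced

#Read Combination Grid
CombinationGrid=pd.read_table('CombinationGrid.txt')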

#Read Dataset
TrainDF = pd.read_csv('train.csv', parse_dates=["datetime"])

#Subset training data
X_Train= TrainDF[1999:]

#Create response variable
YVal = X_Train[['count']]

#Turn into numpy array
YVal=np.array(YVal)

#Select only useful parameters
X_Train = X_Train[['temp', 'atemp', 'humidity', 'windspeed', 'casual', 'registered']]

#For loop to try all combinations
Results=[]
for i in range(0,CombinationGrid.shape[0]):
    XVal = np.array(X_Train, dtype=np.float32).reshape(tuple(CombinationGrid.iloc[i,])[0:4], order=tuple(CombinationGrid.iloc[i,])[4])

    model=keras.Sequential()
    model.add(keras.layers.Conv2D(filters=10, kernel_size=[2,2], padding='same', activation='relu'))
    model.add(keras.layers.Conv2D(filters=15, kernel_size=[2,2], padding='same', activation='relu'))
    model.add(keras.layers.Conv2D(filters=20, kernel_size=[3,3], padding='same'))
    model.add(keras.layers.MaxPooling2D(pool_size=[2,2], strides=1))
    model.add(keras.layers.Flatten())
    model.add(keras.layers.Dense(units=30, activation='relu'))
    model.add(keras.layers.Dense(units=20, activation='relu'))
    model.add(keras.layers.Dense(units=10, activation='relu'))
    model.add(keras.layers.Dense(units=1))

    model.compile(optimizer='adam', loss='mse', metrics=['accuracy'])

    #Save results (last-epoch loss and accuracy, or NaN if training fails for this shape)
    try:
        Hist=model.fit(XVal, YVal, epochs=100)
        Results.append((Hist.history['loss'][-1], Hist.history['accuracy'][-1]))
    except Exception:
        Results.append((np.nan, np.nan))


Results:

I saved both the R and the Python results; here they are. All the other array shapes failed in both Python and R (probably because the shape of Y did not match the predictors):

+------+------+------+------+-------+-------------+-------------+-------------+-------------+
| Dim1 | Dim2 | Dim3 | Dim4 | Order |   R Loss    |    R Acc    | Python Loss |  Python Acc |
+------+------+------+------+-------+-------------+-------------+-------------+-------------+
| 8887 |    3 |    2 |    1 | F     | 0.257986314 | 0.004726004 | 0.264519099 |  0.86125803 |
| 8887 |    2 |    3 |    1 | F     | 1.922012638 | 0.004726004 | 0.375910975 | 0.780578375 |
| 8887 |    3 |    2 |    1 | C     | 0.062438282 | 0.004726004 |  4.27717965 | 0.700686395 |
| 8887 |    2 |    3 |    1 | C     | 0.171041382 | 0.004726004 | 0.054061489 |  0.95262742 |
+------+------+------+------+-------+-------------+-------------+-------------+-------------+

As you can see, the final losses look similar, but the last recorded accuracy differs hugely between the two. I know my understanding of dimensions and shapes in R and Python, and of how they differ, has some gaps, but after trying every possible shape without getting similar results, it looks odd. Also, the Keras accuracy in R never seems to change!

I couldn't find more information on the matter, only another post describing the opposite situation.

So something is going on, and it may be my fault, but I don't know why, with the same data, I can't get as good a score with Keras in R as I do in Python. Any ideas?

Recommended answer

As Skeydan explained to me in the issue I opened, the difference in accuracy comes down to the Keras version used.

In the Python code, changing import keras to import tensorflow.keras as keras makes the accuracy match between R and Python.
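The answer does not spell out why the version matters. One plausible reading, offered here as an assumption rather than something the answer states, is that the two Keras versions define "accuracy" differently for a single-unit output: one compares rounded predictions with the targets, the other binarizes predictions at a threshold first. The NumPy sketch below uses two hypothetical helper functions (not library code) and made-up numbers to show how far apart the two definitions can be on integer count targets.

```python
import numpy as np

def accuracy_by_rounding(y_true, y_pred):
    """Hypothetical helper: share of predictions equal to the target after rounding."""
    return float(np.mean(np.round(y_pred) == y_true))

def accuracy_by_threshold(y_true, y_pred, threshold=0.5):
    """Hypothetical helper: binarize predictions at `threshold`, then compare with the target."""
    binarized = (y_pred > threshold).astype(y_true.dtype)
    return float(np.mean(binarized == y_true))

# Made-up count targets and near-perfect regression predictions.
y_true = np.array([1.0, 5.0, 12.0, 0.0, 30.0])
y_pred = np.array([1.2, 4.8, 12.4, 0.1, 29.6])

print(accuracy_by_rounding(y_true, y_pred))   # every rounded prediction matches
print(accuracy_by_threshold(y_true, y_pred))  # only targets equal to 0 or 1 can match
```

Under the threshold definition the score depends almost entirely on how many targets are exactly 0 or 1, which would be consistent with an accuracy that never changes during training. Either way, accuracy is a classification metric, so the loss is the more meaningful number for this regression model.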

I found more information about this here and here.
