Python Google Sheets API


Question

I am pulling data from a Google Sheet through the Google Sheets API and running a KS test on it. However, I only want to run the KS test on the numbers, and each string contains words as well. For instance:

2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,
2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,
2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,
2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,
2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,
2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,

If I have this as a string, how would I run a KS test on just the last number of each line? For instance, I only want to run the KS test on -.51, -.75, -1.23, -1.23, -.94, -1.16.

Here is a screenshot of my Google sheet: [screenshot not included]

Here is some of my code:

from scipy import stats
import numpy as np
import gspread
from oauth2client.service_account import  ServiceAccountCredentials
import re


np.seterr(divide='ignore', invalid='ignore')
def estimate_cdf (col,bins=10,):
    print (col)
    # 'col'
    # 'bins'

    hist, edges = np.histogram(col)
    csum = np.cumsum(hist)



    return csum/csum[-1], edges
    print (csum)



scope = ["https://spreadsheets.google.com/feeds",'https://www.googleapis.com/auth/spreadsheets',"https://www.googleapis.com/auth/drive.file","https://www.googleapis.com/auth/drive"]
creds = ServiceAccountCredentials.from_json_keyfile_name("creds.json", scope)

client = gspread.authorize(creds)

sheet = client.open("sheet1").sheet1  # Opens the spreadsheet

data = sheet.get_all_records()


row = sheet.row_values(3)  # Grab a specific row






number_regex = r'^-?\d+\.?\d*$'





col = sheet.col_values(3)  # Get a specific column print (col)

col2= sheet.col_values(4)
dolphin= estimate_cdf(adjusted := [float(i) for i in col if re.match(i, number_regex)], len(adjusted))



print(col)
print(col2)




shtest =stats.shapiro(col)
print(shtest)




#thelight= sheet.update_cell(5,6,col)
#print(thelight)

k2test =stats.ks_2samp(col, col2, alternative='two-sided', mode='auto')
print(k2test)

And here is part of my error message:

temperature,64.79599999999999,65.03830769230765', '2020-09-25 11:38:51,metsense,htu21d,temperature,64.85,65.01338461538458', '2020-09-25 11:39:16,metsense,htu21d,temperature,64.994,64.99538461538458', '2020-09-25 11:39:42,metsense,htu21d,temperature,65.066,64.98015384615381', '2020-09-25 11:40:06,metsense,htu21d,temperature,64.94,64.95799999999996', '2020-09-25 11:40:31,metsense,htu21d,temperature,64.976,64.93861538461535', '2020-09-25 11:40:57,metsense,htu21d,temperature,65.066,64.93307692307688', '2020-09-25 11:41:22,metsense,htu21d,temperature,65.048,64.93584615384611', '2020-09-25 11:41:48,metsense,htu21d,temperature,64.994,64.92753846153843', '2020-09-25 11:42:12,metsense,htu21d,temperature,64.976,64.93169230769227', '2020-09-25 11:42:37,metsense,htu21d,temperature,64.94,64.9441538461538', '2020-09-25 11:43:03,metsense,htu21d,temperature,64.994,64.95523076923072', '2020-09-25 11:43:28,metsense,htu21d,temperature,64.9']
Traceback (most recent call last):
  File "C:/Users/james/PycharmProjectsfreshproj/shapiro wilks.py", line 60, in <module>
    shtest =stats.shapiro(col)
  File "C:\Users\james\PycharmProjectsfreshproj\venv\lib\site-packages\scipy\stats\morestats.py", line 1676, in shapiro
    a, w, pw, ifault = statlib.swilk(y, a[:N//2], init)
ValueError: could not convert string to float: ',,,,,'

Process finished with exit code 1

Recommended answer

Problem

Given strings coming from the Google Sheets API, run kstest on the last number of each string.

A better way would be to get the numbers straight from the Google Sheets API, store them, and feed them to stats.kstest.
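
As a minimal sketch of that idea, assuming the readings sit in a column of their own and that the cell values arrive as strings (as they typically do from gspread's sheet.col_values used in the question), the numeric cells can be filtered with the question's regex and converted to floats. The helper name numeric_cells and the sample column are hypothetical. Note that the re functions take the pattern first and the text second, so the question's re.match(i, number_regex) has its arguments swapped.

```python
import re

# Pattern from the question: optional sign, digits, optional decimal part.
NUMBER_RE = re.compile(r'-?\d+\.?\d*')


def numeric_cells(cells):
    """Return the cells that look like plain numbers, converted to float."""
    # fullmatch(text) checks the whole cell against the pattern,
    # so headers and blank cells are skipped.
    return [float(c) for c in cells if NUMBER_RE.fullmatch(c.strip())]


# Hypothetical column contents, including a header and a blank cell.
column = ['value', '-0.51058', '-0.75889', '', '-1.23385']
print(numeric_cells(column))
```

The resulting list of floats is what stats.kstest or stats.shapiro expects; passing the raw strings is what triggers the ValueError in the question.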

You can split the string using str.split, then convert the number to float.

>>> s = '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,'

>>> s.split(',')
['2020-09-15 00:05:43', 'chemsense', 'co', 'concentration', '-0.75889', '']

>>> s.split(',')[4] # get the number (5th item in the list)
'-0.75889'

>>> float(s.split(',')[4]) # convert to float type
-0.75889

>>> round(float(s.split(',')[4]), 2) # round to 2 decimal places
-0.76
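
The fixed index 4 works for the sample rows, but the error message in the question shows rows such as ',,,,,' with no number at all. A slightly more defensive sketch, assuming the goal is the last non-empty field of each row, could look like this. The helper name last_number is hypothetical, and the standard csv module is used here in place of str.split so that quoted fields would also be handled.

```python
import csv


def last_number(line):
    """Return the last non-empty comma-separated field as a float, or None."""
    fields = next(csv.reader([line]), [])
    for field in reversed(fields):
        text = field.strip()
        if text:
            try:
                return float(text)
            except ValueError:
                return None  # the last non-empty field is not numeric
    return None  # every field was empty, e.g. ',,,,,'


rows = [
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    ',,,,,',
]
# Keep only the rows that actually yielded a number.
values = [v for v in map(last_number, rows) if v is not None]
print(values)
```

For rows that carry two numbers, such as the temperature rows in the error message, this picks the final one, whereas index 4 picks the first; which of the two is wanted depends on the data.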

from scipy import stats

# Assuming the strings coming back from the API are in a list
lines = [
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,',
    '2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,',
    '2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,',
    '2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,',
    '2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,',
]

# Take the 5th comma-separated field of each string and convert it to float
x = [float(s.split(',')[4]) for s in lines]

# One-sample KS test against the standard normal distribution
print(stats.kstest(x, 'norm'))
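
To make clear what that call measures: with 'norm' and no extra arguments, the sample is compared against the standard normal distribution (mean 0, standard deviation 1). The following is a rough pure-Python sketch of the one-sample KS statistic, not SciPy's implementation; the function name ks_statistic_norm is hypothetical, and no p-value is computed.

```python
from statistics import NormalDist


def ks_statistic_norm(sample):
    """One-sample KS statistic of `sample` against the standard normal N(0, 1)."""
    xs = sorted(sample)
    n = len(xs)
    cdf = NormalDist().cdf
    # Largest gap where the empirical CDF lies above the normal CDF ...
    d_plus = max((i + 1) / n - cdf(x) for i, x in enumerate(xs))
    # ... and where it lies below it.
    d_minus = max(cdf(x) - i / n for i, x in enumerate(xs))
    return max(d_plus, d_minus)


x = [-0.51058, -0.75889, -1.23385, -1.23191, -0.94495, -1.16024]
print(round(ks_statistic_norm(x), 3))
```

For the six sample readings the statistic comes out large (about 0.7), because every value is negative while the standard normal is centred at zero. If the question is whether the data are normal with some other mean and spread, the reference distribution's parameters need to be supplied, for example through the args parameter of stats.kstest.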
