Python Google Sheets API
Problem description
I am pulling data through the Google Sheets API and running a KS test on it. However, I only want to run the KS test on the numeric values, and each string contains words as well. For instance:
2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,
2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,
2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,
2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,
2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,
2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,
If I have this as a string, how would I run a KS test on just the last number of each line? For instance, I only want to run the KS test on -.51, -.75, -1.23, -1.23, -.94, -1.16.
Here is a screenshot of my Google sheet:
Here is some of my code:
from scipy import stats
import numpy as np
import gspread
from oauth2client.service_account import ServiceAccountCredentials
import re
np.seterr(divide='ignore', invalid='ignore')
def estimate_cdf (col,bins=10,):
    print (col)
    # 'col'
    # 'bins'
    hist, edges = np.histogram(col)
    csum = np.cumsum(hist)
    return csum/csum[-1], edges
    print (csum)
scope = ["https://spreadsheets.google.com/feeds",'https://www.googleapis.com/auth/spreadsheets',"https://www.googleapis.com/auth/drive.file","https://www.googleapis.com/auth/drive"]
creds = ServiceAccountCredentials.from_json_keyfile_name("creds.json", scope)
client = gspread.authorize(creds)
sheet = client.open("sheet1").sheet1 # Opens the spreadhseet
data = sheet.get_all_records()
row = sheet.row_values(3) # Grab a specific row
number_regex = r'^-?\d+\.?\d*$'
col = sheet.col_values(3) # Get a specific column print (col)
col2= sheet.col_values(4)
dolphin= estimate_cdf(adjusted := [float(i) for i in col if re.match(i, number_regex)], len(adjusted))
print(col)
print(col2)
shtest =stats.shapiro(col)
print(shtest)
#thelight= sheet.update_cell(5,6,col)
#print(thelight)
k2test =stats.ks_2samp(col, col2, alternative='two-sided', mode='auto')
print(k2test)
And here is part of my error message:
temperature,64.79599999999999,65.03830769230765', '2020-09-25 11:38:51,metsense,htu21d,temperature,64.85,65.01338461538458', '2020-09-25 11:39:16,metsense,htu21d,temperature,64.994,64.99538461538458', '2020-09-25 11:39:42,metsense,htu21d,temperature,65.066,64.98015384615381', '2020-09-25 11:40:06,metsense,htu21d,temperature,64.94,64.95799999999996', '2020-09-25 11:40:31,metsense,htu21d,temperature,64.976,64.93861538461535', '2020-09-25 11:40:57,metsense,htu21d,temperature,65.066,64.93307692307688', '2020-09-25 11:41:22,metsense,htu21d,temperature,65.048,64.93584615384611', '2020-09-25 11:41:48,metsense,htu21d,temperature,64.994,64.92753846153843', '2020-09-25 11:42:12,metsense,htu21d,temperature,64.976,64.93169230769227', '2020-09-25 11:42:37,metsense,htu21d,temperature,64.94,64.9441538461538', '2020-09-25 11:43:03,metsense,htu21d,temperature,64.994,64.95523076923072', '2020-09-25 11:43:28,metsense,htu21d,temperature,64.9']
Traceback (most recent call last):
  File "C:/Users/james/PycharmProjectsfreshproj/shapiro wilks.py", line 60, in <module>
    shtest =stats.shapiro(col)
  File "C:\Users\james\PycharmProjectsfreshproj\venv\lib\site-packages\scipy\stats\morestats.py", line 1676, in shapiro
    a, w, pw, ifault = statlib.swilk(y, a[:N//2], init)
ValueError: could not convert string to float: ',,,,,'
Process finished with exit code 1
Recommended answer
Problem
Given strings coming from the Google Sheets API, run kstest on the last number of each string.
A better way would be to get the numbers straight from the Google Sheets API, store them, and feed them to stats.kstest.
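As a rough sketch of that idea, suppose a column of the sheet held one value per cell, so that something like sheet.col_values(...) from the question returned a list of cell strings (that layout is an assumption; the printed output in the question suggests each cell currently holds a whole comma-separated record). The helper name numeric_cells and the sample cells below are made up for illustration. Note that re.match and re.fullmatch take the pattern first and the string second, the reverse of the order used in the question's filter.

```python
import re

# Same pattern as the question's number_regex; fullmatch makes the ^ and $ anchors unnecessary.
NUMBER_RE = re.compile(r'-?\d+\.?\d*')

def numeric_cells(cells):
    """Return the cells that look like plain decimal numbers, converted to float."""
    return [float(c) for c in cells if NUMBER_RE.fullmatch(c.strip())]

# Hypothetical cell values: a header, two readings, an empty cell, and a timestamp.
cells = ['concentration', '-0.51058', '', ' -0.75889 ', '2020-09-15 00:06:09']
print(numeric_cells(cells))  # [-0.51058, -0.75889]
```

The resulting list of floats is the kind of input stats.kstest expects, whereas the raw cell strings are what triggered the ValueError in the question.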
You can split the string using str.split, then convert the resulting field to float.
>>> s = '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,'
>>> s.split(',')
['2020-09-15 00:05:43', 'chemsense', 'co', 'concentration', '-0.75889', '']
>>> s.split(',')[4] # get the number (5th item in the list)
'-0.75889'
>>> float(s.split(',')[4]) # convert to float type
-0.75889
>>> round(float(s.split(',')[4]), 2) # round to 2 decimal places
-0.76
from scipy import stats

# Assuming the strings coming back from the API are in a list
# (named lines rather than str, to avoid shadowing the built-in type)
lines = [
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,',
    '2020-09-15 00:06:09,chemsense,co,concentration,-1.23385,',
    '2020-09-15 00:06:33,chemsense,co,concentration,-1.23191,',
    '2020-09-15 00:06:58,chemsense,co,concentration,-0.94495,',
    '2020-09-15 00:07:23,chemsense,co,concentration,-1.16024,'
]

# The reading is the 5th comma-separated field (index 4) of each line
x = [float(s.split(',')[4]) for s in lines]

# One-sample KS test against the standard normal distribution
print(stats.kstest(x, 'norm'))
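The error message in the question ends with a row that is only commas (',,,,,'), which a fixed index followed by float() would choke on. One possible way to guard against such rows, sketched here with a hypothetical helper named last_number, is to take the last non-empty field of each line and skip any line where that field is not a number:

```python
def last_number(line):
    """Return the last non-empty comma-separated field as a float, or None if there is none."""
    fields = [f.strip() for f in line.split(',') if f.strip()]
    if not fields:
        return None  # blank row such as ',,,,,'
    try:
        return float(fields[-1])
    except ValueError:
        return None  # last field is text, not a number

rows = [
    '2020-09-15 00:05:13,chemsense,co,concentration,-0.51058,',
    ',,,,,',
    '2020-09-15 00:05:43,chemsense,co,concentration,-0.75889,',
]
values = [v for v in map(last_number, rows) if v is not None]
print(values)  # [-0.51058, -0.75889]
```

The values list can then be passed to stats.kstest in the same way as x above. For rows that end in two numeric fields, like the temperature rows in the error output, this picks the second of the two, so a fixed index may be the better choice there.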