Syntax error when creating statsmodels Model from formula


Problem description

I have the following code for linear regression:

# building a base model

# INSTANTIATING a model type
lm_practice = smf.ols(formula = """  Open_AAL ~
                                        High_AAL +
                                        Low_AAL +
                                        Close_AAL +
                                        Adj Close_AAL +
                                        Volume_AAL +
                                        Open_SP +
                                        High_SP +
                                        Low_SP +
                                        Close_SP +
                                        Adj Close_SP+
                                        Volume_SP
                                        """,
                                     data = fin)

# telling Python to FIT the data to the blueprint
results = lm_practice.fit()

# printing a summary of the results
print(results.summary())

But it raises a syntax error:

Traceback (most recent call last):
  File "C:\Users\Home\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 3326, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-24-0a39fdb04edd>", line 17, in <module>
    data = fin)
  File "C:\Users\Home\Anaconda3\lib\site-packages\statsmodels\base\model.py", line 159, in from_formula
    missing=missing)
  File "C:\Users\Home\Anaconda3\lib\site-packages\statsmodels\formula\formulatools.py", line 65, in handle_formula_data
    NA_action=na_action)
  File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\highlevel.py", line 310, in dmatrices
    NA_action, return_type)
  File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\highlevel.py", line 165, in _do_highlevel_design
    NA_action)
  File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\highlevel.py", line 70, in _try_incr_builders
    NA_action)
  File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\build.py", line 689, in design_matrix_builders
    factor_states = _factors_memorize(all_factors, data_iter_maker, eval_env)
  File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\build.py", line 354, in _factors_memorize
    which_pass = factor.memorize_passes_needed(state, eval_env)
  File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\eval.py", line 474, in memorize_passes_needed
    subset_names = [name for name in ast_names(self.code)
  File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\eval.py", line 474, in <listcomp>
    subset_names = [name for name in ast_names(self.code)
  File "C:\Users\Home\Anaconda3\lib\site-packages\patsy\eval.py", line 105, in ast_names
    for node in ast.walk(ast.parse(code)):
  File "C:\Users\Home\Anaconda3\lib\ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)
  File "<unknown>", line 1
    Adj Close_SP
               ^
SyntaxError: invalid syntax

What is wrong and how can I fix this?

Recommended answer

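As the traceback shows, patsy (which statsmodels uses to process the formula) parses each term as Python code, so a column name containing a space, such as Adj Close_SP, is not a valid name and raises the SyntaxError. You need to replace the spaces in your column names, and use the updated column names in the formula, for example: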

import statsmodels.formula.api as smf
import pandas as pd
import numpy as np

fin = pd.DataFrame({'Open_AAL':np.random.uniform(0,1,100),
                    'Adj Close_AAL':np.random.uniform(0,1,100),
                   'High_AAL':np.random.uniform(0,1,100)})

    Open_AAL    Adj Close_AAL   High_AAL
0   0.260162    0.515144    0.995558
1   0.381395    0.187687    0.106275
2   0.016885    0.381614    0.797739
3   0.772720    0.388308    0.856932

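# replace spaces in the column names so each name is a valid Python identifier
fin.columns = fin.columns.str.replace(" ", "_")

fin.columns
Index(['Open_AAL', 'Adj_Close_AAL', 'High_AAL'], dtype='object')

# build the model using the renamed columns in the formula
lm_practice = smf.ols("Open_AAL ~ Adj_Close_AAL + High_AAL", data=fin)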
results = lm_practice.fit()
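
To see why the space matters, the failing step can be reproduced with the standard library alone. The traceback ends in ast.parse being called on a single formula term, so the sketch below calls it the same way. The helper name is_valid_term and the shortened column list are illustrative assumptions, not part of patsy or statsmodels.

```python
import ast


def is_valid_term(term):
    """Return True if `term` parses as Python code (hypothetical helper)."""
    try:
        ast.parse(term)
        return True
    except SyntaxError:
        return False


# A few column names taken from the question
columns = ["High_AAL", "Adj Close_AAL", "Volume_AAL", "Adj Close_SP"]

# Names that would break the formula if written as bare terms
bad_columns = [name for name in columns if not is_valid_term(name)]
print(bad_columns)

# After replacing spaces with underscores every name parses cleanly
fixed = [name.replace(" ", "_") for name in columns]
print(all(is_valid_term(name) for name in fixed))
```

Running a check like this on fin.columns before writing the formula shows which names need renaming.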

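If renaming the columns is not desirable, an alternative is patsy's Q() function, which quotes a column name inside the formula so that names with spaces can be used as they are. The following is a minimal sketch on the same kind of made-up random data as above. The seeded generator rng is an assumption added only so the example is repeatable.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Made-up data; the column names mirror the example above, including the space
rng = np.random.default_rng(0)
fin = pd.DataFrame({
    "Open_AAL": rng.uniform(0, 1, 100),
    "Adj Close_AAL": rng.uniform(0, 1, 100),
    "High_AAL": rng.uniform(0, 1, 100),
})

# Q("...") tells patsy to look the quoted string up as a column name
lm_practice = smf.ols('Open_AAL ~ Q("Adj Close_AAL") + High_AAL', data=fin)
results = lm_practice.fit()

print(results.params)
```

The trade-off is that the Q(...) wrapper also appears in the coefficient labels of the summary, so renaming the columns usually gives cleaner output.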