Effects of Education Spending on Crime, multi-level mixed-model structure


Problem description

I'm looking at the effect of education expenditure per school district on the crime rate within the cities and towns those school districts serve, over a fifteen-year period. (The dependent variable has 1,676,191 observations of city/town crime data over those fifteen years.)

Cities are technically crossed with school districts, in that one city might be served by multiple school districts. This means that one city could have multiple values for expenditure per student. School districts, however, overlap with counties.

Cities are nested within counties, but given that each city/town has a distinct PLACE_ID, my understanding is that this could be represented as (1|PLACE_ID) + (1|COUNTY_ID) or (1|PLACE_ID/COUNTY_ID).
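One way to check the nesting claim is to count how many counties each place ID appears in. The sketch below uses base R on a tiny data frame whose IDs are copied from the data snippet in this question; the helper name `is_nested` is hypothetical and introduced only for illustration.

```r
# Toy data: three places in two counties (IDs copied from the dput snippet).
d <- data.frame(
  PLACE_ID  = c("0101345496", "0101345496", "0101501852",
                "0101501852", "0101529992"),
  COUNTY_ID = c("01013", "01013", "01015", "01015", "01015"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE if every value of `inner` occurs within
# exactly one value of `outer`.
is_nested <- function(inner, outer) {
  pairs <- unique(data.frame(inner = inner, outer = outer))
  !any(duplicated(pairs$inner))
}

nested <- is_nested(d$PLACE_ID, d$COUNTY_ID)
print(nested)
```

If this returns TRUE on the full data, the place IDs are unique across counties, so the nesting is already implicit in the coding.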

I'm fairly familiar with mixed-effects models, and I've read clear and informative posts such as this one: https://stats.stackexchange.com/questions/228800/crossed-vs-nested-random-effects-how-do-they-differ-and-how-are-they-specified ; however, I'm still unsure whether I can fit a crossed-effects model like the one below:

glmer.total <- glmer(
  CRIME_TOTAL ~ cent.log.pop + cent.log.pop.dens + year +
    cent.log.unemployment_rate + cent.schooldist.prop5.17.pov +
    cent.log.per.cap + diff.dem + cent.log.enforcement + cent.EXP_STUDENT +
    (year|PLACE_ID/COUNTY_ID) + (year|full_district_id) + (1|STATE),
  family = "poisson",
  control = glmerControl(optimizer = "nloptwrap", calc.derivs = FALSE),
  REML = FALSE, total.years, na.action = "na.omit")

Variables are centered and logged: population per city, population density per city, year, unemployment rate per county, proportion of children living in poverty per school district, per capita income per county, difference in those who voted Democrat in presidential elections per county, log enforcement per city/town, and centered expenditure per student / 1000 (per school district). PLACE_ID corresponds to cities and towns, COUNTY_ID to counties, full_district_id to school districts, and STATE to states.
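For concreteness, the centering and logging described above might look like the following base R sketch. The three rows are copied from the data snippet in this question. The order of operations (log first, then subtract the mean) and the choice to leave expenditure unlogged are assumptions inferred from the variable names, not something the question states.

```r
# A few rows copied from the dput snippet.
d <- data.frame(
  POPULATION.EST          = c(548, 23470, 32),
  EXPENDITURE_PER_STUDENT = c(7593, 8240, 8758)
)

# Log-transform, then subtract the mean (assumed order of operations).
log_pop <- log(d$POPULATION.EST)
d$cent.log.pop <- log_pop - mean(log_pop)

# Expenditure in thousands of dollars, centered but not logged (assumption).
exp_k <- d$EXPENDITURE_PER_STUDENT / 1000
d$cent.EXP_STUDENT <- exp_k - mean(exp_k)

# Both centered columns should now have a mean of (numerically) zero.
print(round(colMeans(d[, c("cent.log.pop", "cent.EXP_STUDENT")]), 10))
```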

  • Do I need to average expenditure per city per year, or is the code above legitimate? I receive the following error:
extra argument(s) ‘REML’ disregarded
Error in pwrssUpdate(pp, resp, tol = tolPwrss, GQmat = GHrule(0L), compDev = compDev,  : 
  (maxstephalfit) PIRLS step-halvings failed to reduce deviance in pwrssUpdate

I've provided a brief snippet of the data below as dput output (with only a few variables selected, to keep it small).

Thanks!

structure(list(STATE = c("alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama", "alabama", "alabama", 
"alabama", "alabama", "alabama", "alabama"), state = c("AL", 
"AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", 
"AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", 
"AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", 
"AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", 
"AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", 
"AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", "AL", 
"AL"), full_district_id = c("0100510", "0100510", "0100510", 
"0100510", "0100090", "0100540", "0101860", "0102635", "0102760", 
"0100090", "0100540", "0101860", "0102635", "0102760", "0100090", 
"0100540", "0101860", "0102635", "0102760", "0100090", "0100540", 
"0101860", "0102635", "0102760", "0100090", "0100540", "0101860", 
"0102635", "0102760", "0100090", "0100540", "0101860", "0102635", 
"0102760", "0100090", "0100540", "0101860", "0102635", "0102760", 
"0100090", "0100540", "0101860", "0102635", "0102760", "0100090", 
"0100540", "0101860", "0102635", "0102760", "0100090", "0100540", 
"0101860", "0102635", "0102760", "0100090", "0100540", "0101860", 
"0102635", "0102760", "0100090", "0100540", "0101860", "0102635", 
"0102760", "0100090", "0100540", "0101860", "0102635"), SCHOOL_DISTRICT.x = c("butler county school district", 
"butler county school district", "butler county school district", 
"butler county school district", "anniston city school district", 
"calhoun county school district", "jacksonville city school district", 
"oxford city school district", "piedmont city school district", 
"anniston city school district", "calhoun county school district", 
"jacksonville city school district", "oxford city school district", 
"piedmont city school district", "anniston city school district", 
"calhoun county school district", "jacksonville city school district", 
"oxford city school district", "piedmont city school district", 
"anniston city school district", "calhoun county school district", 
"jacksonville city school district", "oxford city school district", 
"piedmont city school district", "anniston city school district", 
"calhoun county school district", "jacksonville city school district", 
"oxford city school district", "piedmont city school district", 
"anniston city school district", "calhoun county school district", 
"jacksonville city school district", "oxford city school district", 
"piedmont city school district", "anniston city school district", 
"calhoun county school district", "jacksonville city school district", 
"oxford city school district", "piedmont city school district", 
"anniston city school district", "calhoun county school district", 
"jacksonville city school district", "oxford city school district", 
"piedmont city school district", "anniston city school district", 
"calhoun county school district", "jacksonville city school district", 
"oxford city school district", "piedmont city school district", 
"anniston city school district", "calhoun county school district", 
"jacksonville city school district", "oxford city school district", 
"piedmont city school district", "anniston city school district", 
"calhoun county school district", "jacksonville city school district", 
"oxford city school district", "piedmont city school district", 
"anniston city school district", "calhoun county school district", 
"jacksonville city school district", "oxford city school district", 
"piedmont city school district", "anniston city school district", 
"calhoun county school district", "jacksonville city school district", 
"oxford city school district"), COUNTY = c("butler ", "butler ", 
"butler ", "butler ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", 
"calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun ", "calhoun "
), year = c(2006, 2007, 2008, 2009, 2006, 2006, 2006, 2006, 2006, 
2007, 2007, 2007, 2007, 2007, 2008, 2008, 2008, 2008, 2008, 2009, 
2009, 2009, 2009, 2009, 2010, 2010, 2010, 2010, 2010, 2011, 2011, 
2011, 2011, 2011, 2012, 2012, 2012, 2012, 2012, 2013, 2013, 2013, 
2013, 2013, 2014, 2014, 2014, 2014, 2014, 2015, 2015, 2015, 2015, 
2015, 2016, 2016, 2016, 2016, 2016, 2017, 2017, 2017, 2017, 2017, 
2007, 2007, 2007, 2007), COUNTY_ID = c("01013", "01013", "01013", 
"01013", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015", "01015", "01015", "01015", "01015", "01015", 
"01015", "01015"), SCHOOL_DISTRICT.y = c("BUTLER CO SCH DIST", 
"BUTLER CO SCH DIST", "BUTLER COUNTY SCHOOL DISTRICT", "BUTLER COUNTY SCHOOL DISTRICT", 
"ANNISTON CTY SCH DST", "CALHOUN CO SCH DIST", "JACKSONVILLE CITY SCHOOL DISTRICT", 
"OXFORD CITY SCH DIST", "PIEDMONT CTY SCH DIST", "ANNISTON CTY SCH DST", 
"CALHOUN CO SCH DIST", "JACKSONVILLE CITY SCHOOL DISTRICT", "OXFORD CITY SCH DIST", 
"PIEDMONT CTY SCH DIST", "ANNISTON CTY SCH DST", "CALHOUN COUNTY SCHOOL DISTRICT", 
"JACKSONVILLE CITY SCHOOL DISTRICT", "OXFORD CITY SCHOOL DISTRICT", 
"PIEDMONT CITY SCHOOL DISTRICT", "ANNISTON CITY SCHOOL DISTRICT", 
"CALHOUN COUNTY SCHOOL DISTRICT", "JACKSONVILLE CITY SCHOOL DISTRICT", 
"OXFORD CITY SCHOOL DISTRICT", "PIEDMONT CITY SCHOOL DISTRICT", 
"ANNISTON CITY SCHOOL DISTRICT", "CALHOUN COUNTY SCHOOL DISTRICT", 
"JACKSONVILLE CITY SCHOOL DISTRICT", "OXFORD CITY SCHOOL DISTRICT", 
"PIEDMONT CITY SCHOOL DISTRICT", "ANNISTON CITY SCHOOL DISTRICT", 
"CALHOUN COUNTY SCHOOL DISTRICT", "JACKSONVILLE CITY SCHOOL DISTRICT", 
"OXFORD CITY SCHOOL DISTRICT", "PIEDMONT CITY SCHOOL DISTRICT", 
"ANNISTON CITY SCHOOL DISTRICT", "CALHOUN COUNTY SCHOOL DISTRICT", 
"JACKSONVILLE CITY SCHOOL DISTRICT", "OXFORD CITY SCHOOL DISTRICT", 
"PIEDMONT CITY SCHOOL DISTRICT", "ANNISTON CITY SCHOOL DISTRICT", 
"CALHOUN COUNTY SCHOOL DISTRICT", "JACKSONVILLE CITY SCHOOL DISTRICT", 
"OXFORD CITY SCHOOL DISTRICT", "PIEDMONT CITY SCHOOL DISTRICT", 
"ANNISTON CITY SCHOOL DISTRICT", "CALHOUN COUNTY SCHOOL DISTRICT", 
"JACKSONVILLE CITY SCHOOL DISTRICT", "OXFORD CITY SCHOOL DISTRICT", 
"PIEDMONT CITY SCHOOL DISTRICT", "ANNISTON CITY SCHOOL DISTRICT", 
"CALHOUN COUNTY SCHOOL DISTRICT", "JACKSONVILLE CITY SCHOOL DISTRICT", 
"OXFORD CITY SCHOOL DISTRICT", "PIEDMONT CITY SCHOOL DISTRICT", 
"ANNISTON CITY SCHOOL DISTRICT", "CALHOUN COUNTY SCHOOL DISTRICT", 
"JACKSONVILLE CITY SCHOOL DISTRICT", "OXFORD CITY SCHOOL DISTRICT", 
"PIEDMONT CITY SCHOOL DISTRICT", "ANNISTON CITY SCHOOL DISTRICT", 
"CALHOUN COUNTY SCHOOL DISTRICT", "JACKSONVILLE CITY SCHOOL DISTRICT", 
"OXFORD CITY SCHOOL DISTRICT", "PIEDMONT CITY SCHOOL DISTRICT", 
"ANNISTON CTY SCH DST", "CALHOUN CO SCH DIST", "JACKSONVILLE CITY SCHOOL DISTRICT", 
"OXFORD CITY SCH DIST"), EXPENDITURE_PER_STUDENT = c(7593, 8334, 
9281, 9147, 8240, 7216, 6822, 7554, 7155, 8758, 8186, 7323, 8380, 
7710, 9070, 8707, 8070, 8853, 8054, 9364, 8212, 7787, 8560, 7760, 
10006, 8264, 7964, 8771, 8442, 10154, 8209, 7955, 8967, 7911, 
10661, 8157, 8096, 9097, 7660, 11480, 8415, 9351, 8829, 8102, 
12057, 8479, 8708, 8965, 8443, 10988, 8930, 8932, 9118, 8706, 
11277, 9134, 9223, 9347, 8524, 11277, 9134, 9223, 9347, 8524, 
8758, 8186, 7323, 8380), PLACE_ID = c("0101345496", "0101345496", 
"0101345496", "0101345496", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101501852", "0101501852", "0101501852", 
"0101501852", "0101501852", "0101529992", "0101529992", "0101529992", 
"0101529992"), CITY = c("mckenzie", "mckenzie", "mckenzie", "mckenzie", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"anniston", "anniston", "anniston", "anniston", "anniston", "anniston", 
"glencoe", "glencoe", "glencoe", "glencoe"), POPULATION.EST = c(548, 
542, 536, 526, 23470, 23470, 23470, 23470, 23470, 23360, 23360, 
23360, 23360, 23360, 23313, 23313, 23313, 23313, 23313, 23262, 
23262, 23262, 23262, 23262, 23106, 23106, 23106, 23106, 23106, 
22849, 22849, 22849, 22849, 22849, 22644, 22644, 22644, 22644, 
22644, 22457, 22457, 22457, 22457, 22457, 22280, 22280, 22280, 
22280, 22280, 22107, 22107, 22107, 22107, 22107, 21926, 21926, 
21926, 21926, 21926, 21770, 21770, 21770, 21770, 21770, 32, 32, 
32, 32), CRIME_VIOLENT = c(1, 0, 2, 1, 521, 521, 521, 521, 521, 
541, 541, 541, 541, 541, 572, 572, 572, 572, 572, 584, 584, 584, 
584, 584, 0, 0, 0, 0, 0, 411, 411, 411, 411, 411, 504, 504, 504, 
504, 504, 461, 461, 461, 461, 461, 536, 536, 536, 536, 536, 607, 
607, 607, 607, 607, 735, 735, 735, 735, 735, 754, 754, 754, 754, 
754, 9, 9, 9, 9), CRIME_PROPERTY = c(6, 9, 9, 7, 3044, 3044, 
3044, 3044, 3044, 2912, 2912, 2912, 2912, 2912, 2429, 2429, 2429, 
2429, 2429, 2379, 2379, 2379, 2379, 2379, 2038, 2038, 2038, 2038, 
2038, 2323, 2323, 2323, 2323, 2323, 2484, 2484, 2484, 2484, 2484, 
1988, 1988, 1988, 1988, 1988, 1711, 1711, 1711, 1711, 1711, 1645, 
1645, 1645, 1645, 1645, 1712, 1712, 1712, 1712, 1712, 1352, 1352, 
1352, 1352, 1352, 106, 106, 106, 106), CRIME_TOTAL = c(7, 9, 
11, 8, 3565, 3565, 3565, 3565, 3565, 3453, 3453, 3453, 3453, 
3453, 3001, 3001, 3001, 3001, 3001, 2963, 2963, 2963, 2963, 2963, 
2038, 2038, 2038, 2038, 2038, 2734, 2734, 2734, 2734, 2734, 2988, 
2988, 2988, 2988, 2988, 2449, 2449, 2449, 2449, 2449, 2247, 2247, 
2247, 2247, 2247, 2252, 2252, 2252, 2252, 2252, 2447, 2447, 2447, 
2447, 2447, 2106, 2106, 2106, 2106, 2106, 115, 115, 115, 115), 
    prop.5.17.pov = c(30.2, 33.1, 28.1, 33.4, 25.6, 25.6, 25.6, 
    25.6, 25.6, 24, 24, 24, 24, 24, 20.1, 20.1, 20.1, 20.1, 20.1, 
    23.7, 23.7, 23.7, 23.7, 23.7, 30.8, 30.8, 30.8, 30.8, 30.8, 
    31.1, 31.1, 31.1, 31.1, 31.1, 30.5, 30.5, 30.5, 30.5, 30.5, 
    27, 27, 27, 27, 27, 25.4, 25.4, 25.4, 25.4, 25.4, 29.8, 29.8, 
    29.8, 29.8, 29.8, 24.2, 24.2, 24.2, 24.2, 24.2, NA, NA, NA, 
    NA, NA, 24, 24, 24, 24)), row.names = c(NA, -68L), class = c("tbl_df", 
"tbl", "data.frame"))

Recommended answer

A very partial answer:

  • (year|PLACE_ID/COUNTY_ID) looks backwards (the nesting syntax is larger/smaller), but given your description of the coding, either (year|PLACE_ID) + (year|COUNTY_ID) or (year|COUNTY_ID/PLACE_ID) would work. (Or (year|STATE/COUNTY_ID/PLACE_ID)?)
  • Crossing districts with states/counties seems reasonable, although I can imagine things getting a bit weird if some PLACE_IDs constitute a single school district ...
  • It should be fine to give expenditure per district.
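The first point can be illustrated without fitting anything. When place IDs are unique across counties, the inner grouping implied by COUNTY_ID/PLACE_ID (the county-by-place combination) splits the rows exactly as PLACE_ID alone does. The base R sketch below checks this on a toy data frame with IDs copied from the question's snippet; it shows only that the grouping structures coincide, not that the fitted models would be numerically identical.

```r
# The two specifications from the answer, written as one-sided formulas.
f_separate <- ~ (year | PLACE_ID) + (year | COUNTY_ID)
f_nested   <- ~ (year | COUNTY_ID/PLACE_ID)

d <- data.frame(
  PLACE_ID  = factor(c("0101345496", "0101345496", "0101501852",
                       "0101501852", "0101529992")),
  COUNTY_ID = factor(c("01013", "01013", "01015", "01015", "01015"))
)

# Inner grouping implied by the nested term: county-by-place combinations.
inner_nested <- interaction(d$COUNTY_ID, d$PLACE_ID, drop = TRUE)

# The groupings partition the rows identically if the cross-tabulation
# has exactly one non-zero cell in every row and every column.
tab <- table(inner_nested, d$PLACE_ID)
same_partition <- all(rowSums(tab > 0) == 1) && all(colSums(tab > 0) == 1)
print(same_partition)
```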

It's hard to say what's going wrong with the full model. How small a subsample of the data can you take and still replicate the problem? Does the model work as a linear mixed model? If so, does the fitted model give you any hints about what might be weird with the data/model?
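A minimal sketch of that debugging route, assuming the lme4 package is installed. The data here are simulated stand-ins with hypothetical values, not the asker's data. The random-effects structure is deliberately simplified to random intercepts, and log(count + 1) is just one rough way to get a Gaussian response for the diagnostic fit.

```r
library(lme4)

set.seed(1)
# Simulated stand-in data (hypothetical): 40 places, 4 per county, 10 years.
n_place <- 40
n_year  <- 10
sim <- expand.grid(PLACE_ID = factor(seq_len(n_place)), year = seq_len(n_year))
place_idx <- as.integer(sim$PLACE_ID)
county_idx <- (place_idx - 1) %/% 4 + 1
sim$COUNTY_ID <- factor(county_idx)
place_eff  <- rnorm(n_place, sd = 0.5)
county_eff <- rnorm(max(county_idx), sd = 0.3)
sim$cent.EXP_STUDENT <- rnorm(nrow(sim))
sim$CRIME_TOTAL <- rpois(
  nrow(sim),
  exp(3 + place_eff[place_idx] + county_eff[county_idx] -
        0.1 * sim$cent.EXP_STUDENT)
)

# Step 1: subsample whole places, so each kept group retains all its years.
keep <- sample(levels(sim$PLACE_ID), 20)
sub  <- droplevels(sim[sim$PLACE_ID %in% keep, ])

# Step 2: fit a linear mixed model on log counts as a rough diagnostic.
fit_lmm <- lmer(
  log(CRIME_TOTAL + 1) ~ cent.EXP_STUDENT + (1 | COUNTY_ID/PLACE_ID),
  data = sub, REML = FALSE
)
print(VarCorr(fit_lmm))
```

Variance components estimated at or near zero in such a fit would be one hint that the random-effects structure is richer than the data support. Note that REML is an argument of lmer() but not of glmer(), which is consistent with the "extra argument(s) 'REML' disregarded" message in the question.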
