Anyone here experienced with analyzing NHANES data? I have a question about sample weighting

穿林夜雨 · posted 2021-2-4 11:07:48


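I need to combine five cycles of data, including demographic, questionnaire, and laboratory data, and analyze them together. However, NHANES provides two weight variables, interview and MEC (mobile examination center), and some laboratory measures also come with their own subsample weights (sample weight A, B, and so on). The values of these weight variables all differ slightly. If I analyze the combined data, which weight variable should I use? Most published NHANES analyses I have looked at hardly ever state which weight variable they used.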


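穿林夜雨 · posted 2021-2-4 11:14:14

Sometimes the same laboratory measure carries differently named weights in different years, for example WTSA2YR, WTSB2YR, and WTSC2YR. In other words, the measure may have been taken on subsample A, B, or C in those three periods. Do I really have to weight each one separately before merging them into the dataset??? (Sorry, I have confused myself too...)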

穿林夜雨 · posted 2021-2-4 11:19:10

"National Health and Nutrition Examination Survey: Analytic Guidelines, 2011-2014 and 2015-2016" (December 14, 2018) states:
Various sample weights are available on the data release files. Use of the correct sample weight for NHANES analyses depends on the variables being used. A good rule of thumb is to use “the least common denominator” where the variable of interest that was collected on the smallest number of respondents is the “least common denominator.” The sample weight that applies to that variable is the appropriate one to use for that particular analysis.
Sampled participants who completed the interview and were eligible for the examination, but did not respond, were assigned non-zero interview weights and examination weights of zero. Records with a zero examination weight should be treated as missing when the exam data are analyzed. For example, if all variables come from the interview and exam, then the sample used in the analysis should reflect only those with non-zero exam weights and exam weights should be used in the analysis. Similarly, if any variable used comes from a specific subsample, then the sample used in the analysis should only represent those with a non-zero subsample weight and the subsample weights should be used in the analysis.
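So, even though I have merged demographic, questionnaire, and laboratory data, since I am analyzing everything as a whole, do I just need to apply one weight that every record has, WTINT2YR or WTMEC2YR?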
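
To make the quoted rule about zero weights concrete, here is a minimal sketch. It assumes each merged record is a plain dict keyed by NHANES variable names; the helper name `analytic_sample` and the toy numbers are hypothetical, not taken from any NHANES file.

```python
def analytic_sample(records, weight_var):
    """Keep only records whose chosen weight is present and non-zero."""
    kept = []
    for rec in records:
        w = rec.get(weight_var)
        # A zero or missing weight means the person is not part of
        # the (sub)sample that this weight represents.
        if w is not None and w > 0:
            kept.append(rec)
    return kept


# Toy records with made-up weights (illustration only).
records = [
    {"SEQN": 1, "WTINT2YR": 1200.0, "WTMEC2YR": 1300.0},
    {"SEQN": 2, "WTINT2YR": 900.0, "WTMEC2YR": 0.0},  # interviewed, not examined
    {"SEQN": 3, "WTINT2YR": 1500.0, "WTMEC2YR": 1600.0},
]

# An analysis that uses any exam variable keeps only non-zero exam weights.
exam_sample = analytic_sample(records, "WTMEC2YR")
exam_ids = [rec["SEQN"] for rec in exam_sample]
print(exam_ids)
```

The same filter would apply with a subsample weight name if any variable in the analysis comes from a subsample.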

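DY1225 · posted 2021-2-10 09:59:16

Generally you weight by whichever variable has the smallest sample. For example, a given laboratory measure is usually tested in only about 1/3 of the full sample; if you use that measure, use the weight that comes with it.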
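
A rough sketch of that "smallest sample" rule, assuming the merged data is a list of dicts and that you supply a mapping from each analysis variable to the weight released with it. The function name `pick_weight`, the mapping, and the toy values are assumptions for illustration; check the variable and weight names against the documentation of the files you actually use.

```python
def pick_weight(records, var_to_weight):
    """Return the weight belonging to the variable with the fewest observed values."""

    def n_observed(var):
        # Count records where this variable was actually collected.
        return sum(1 for rec in records if rec.get(var) is not None)

    rarest_var = min(var_to_weight, key=n_observed)
    return var_to_weight[rarest_var]


# Hypothetical mapping: which weight goes with which analysis variable.
var_to_weight = {
    "RIDAGEYR": "WTINT2YR",  # interview variable
    "BMXBMI": "WTMEC2YR",    # examination variable
    "LBXPFOA": "WTSC2YR",    # lab variable measured on a subsample
}

# Toy data: everyone has age, most have BMI, few have the lab measure.
records = [
    {"RIDAGEYR": 40, "BMXBMI": 24.1, "LBXPFOA": 2.3},
    {"RIDAGEYR": 55, "BMXBMI": 30.2, "LBXPFOA": None},
    {"RIDAGEYR": 31, "BMXBMI": None, "LBXPFOA": None},
    {"RIDAGEYR": 62, "BMXBMI": 27.8, "LBXPFOA": 1.1},
    {"RIDAGEYR": 47, "BMXBMI": 22.5, "LBXPFOA": None},
]

chosen = pick_weight(records, var_to_weight)
print(chosen)
```

Counting observed values is only a proxy for "collected on the smallest number of respondents"; the file documentation remains the authority on which weight belongs to which variable.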

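穿林夜雨 · posted 2021-2-18 17:07:21

DY1225 posted 2021-2-10 09:59
Generally you weight by whichever variable has the smallest sample. For example, a given laboratory measure is usually tested in only about 1/3 of the full sample; if you use that measure ...

Thanks for the answer. Today I happened to come across a paper (10.1016/j.envpol.2020.115489) that confirms what you said. I still have one question, though. Suppose a measure X is used across several combined cycles, but the subsamples differ (sample A versus sample B): in the 2007-2008 data for measure X the weight variable is labeled sample A, while in 2009-2010 it becomes sample B. Can these two weight variables be merged directly?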

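穿林夜雨 · posted 2021-2-18 17:17:38

DY1225 posted 2021-2-10 09:59
Generally you weight by whichever variable has the smallest sample. For example, a given laboratory measure is usually tested in only about 1/3 of the full sample; if you use that measure ...

For example, the weight variables for perfluoroalkyl substances across cycles look like this:
2007-2008 variable name: WTSC2YR (label: Two-year MEC weights of subsample C);
2007-2008 variable name: WTSC2YR (label: Two-year MEC weights of subsample C);
2009-2010 variable name: WTSC2YR (label: Subsample A weights);
2009-2010 variable name: WTSC2YR (label: Subsample B weights).
Can they be merged directly into a single variable?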

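DY1225 · posted 2021-2-18 17:36:24

穿林夜雨 posted 2021-2-18 17:07
Thanks for the answer. Today I happened to come across a paper (10.1016/j.envpol.2020.115489) that confirms what you said. I still have one ...

I think this is just a naming issue. I am not sure exactly how they drew the samples; my guess is that subsamples A and B were both drawn and then either A or B was used for the measure. Either way, each is a 1/3 subsample weighted up to the full population, so it should be fine to merge them.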
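
If they are merged, one possible way to do it is sketched below. It assumes each record carries exactly one of the subsample weights (whichever name its cycle used) and follows the NCHS analytic guidance that a weight for combined two-year cycles is the two-year weight divided by the number of cycles. The helper name `combined_weight`, the output key `WT_COMBINED`, and the toy numbers are all hypothetical.

```python
SUBSAMPLE_WEIGHT_NAMES = ("WTSA2YR", "WTSB2YR", "WTSC2YR")


def combined_weight(rec, n_cycles):
    """Coalesce the one subsample weight on a record and rescale it for pooled cycles."""
    present = [rec[name] for name in SUBSAMPLE_WEIGHT_NAMES
               if rec.get(name) is not None]
    if len(present) != 1:
        # Zero or several candidate weights: do not guess which one applies.
        raise ValueError("expected exactly one subsample weight per record")
    return present[0] / n_cycles


# Toy records from two cycles that use different weight names (made-up values).
records = [
    {"SEQN": 1, "cycle": "2007-2008", "WTSC2YR": 3000.0},
    {"SEQN": 2, "cycle": "2009-2010", "WTSB2YR": 4000.0},
]

n_cycles = len({rec["cycle"] for rec in records})
for rec in records:
    rec["WT_COMBINED"] = combined_weight(rec, n_cycles)

print([rec["WT_COMBINED"] for rec in records])
```

With five cycles pooled, as in the original question, the divisor would be 5.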

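DY1225 · posted 2021-2-18 17:38:13

穿林夜雨 posted 2021-2-18 17:17
For example, the weight variables for perfluoroalkyl substances across cycles look like this:
2007-2008 variable name: WTSC2YR (label: Two-year MEC wei ...

MEC weights and subsample weights probably do not mean the same thing. MEC weights generally apply to everyone who was examined (the full examined sample, a fraction of 1), whereas subsample weights apply to the roughly 1/3 of people in the subsample.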

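穿林夜雨 · posted 2021-2-18 20:03:07

DY1225 posted 2021-2-18 17:38
MEC weights and subsample weights probably do not mean the same thing. MEC weights generally apply to everyone who was examined, whereas subsample weights apply to the roughly 1/3 ...

Thanks a lot, cousin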

DY1225 · posted 2021-2-19 08:32:43

穿林夜雨 posted 2021-2-18 20:03
Thanks a lot, cousin

Tsk, my status sure dropped fast
