公卫人

[Share] winsor2: handling outliers

epiman posted on 2016-2-11 18:56:36

winsor
winsor2    winsor2 can winsorize a varlist, operate with the by prefix, and offers a replace option.
trimplot
trimmean

Winsorizing is not equivalent to simply excluding data, which is a simpler procedure, called trimming or truncation.  In a trimmed estimator, the extreme values are discarded;
in a Winsorized estimator, the extreme values are instead replaced by certain percentiles, specified by option cuts(# #).

  . sysuse nlsw88, clear
  . sum wage, detail

Winsorizing

    By default, winsor2 winsorizes wage at the 1st and 99th percentiles (cuts(1 99) below just spells out that default),

        . winsor2 wage, replace cuts(1 99)

    which can also be done by hand:

        . replace wage=1.930993 if wage<1.930993
        . replace wage=38.70926 if wage>38.70926 & !missing(wage)

    Note that values smaller than the 1st percentile are replaced by the 1st percentile, and values greater than the 99th percentile are likewise replaced by the 99th percentile. The !missing(wage) guard is there because Stata treats missing values as larger than any number; wage in nlsw88 has no missing values, so here it changes nothing.
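
The hand-typed cut points above are read off the summarize output. A minimal sketch of the same by-hand winsorization that takes the percentiles from the stored results r(p1) and r(p99) instead of retyping them. The names wage_manual, cut_lo and cut_hi are made up for this illustration, and the result goes into a new variable so wage itself stays untouched:

```stata
* By-hand winsorization of wage at the 1st and 99th percentiles.
* wage_manual, cut_lo and cut_hi are illustrative names, not part of winsor2.
sysuse nlsw88, clear
quietly summarize wage, detail
scalar cut_lo = r(p1)     // 1st percentile of wage
scalar cut_hi = r(p99)    // 99th percentile of wage

generate double wage_manual = wage
replace wage_manual = scalar(cut_lo) if wage < scalar(cut_lo)
* Missing values compare as larger than any number in Stata, so exclude them
replace wage_manual = scalar(cut_hi) if wage > scalar(cut_hi) & !missing(wage)

* Compare the raw and the winsorized variable
summarize wage wage_manual
```

This should give the same values as winsor2 when both use the same percentile definition, but that equivalence is an assumption here, not something checked against the winsor2 source.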

Trimming

    Things change when the trim option is specified:

        . winsor2 wage, replace cuts(1 99) trim

    which can also be done by hand:

        . replace wage=. if wage<1.930993
        . replace wage=. if wage>38.70926

    In this case, values smaller than the 1st percentile or greater than the 99th percentile are discarded (set to missing). This is trimming.
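
To see the difference side by side, the two variants can be generated as new variables instead of using replace. A short sketch, assuming winsor2 is installed and that the default new-variable names are wage_w and wage_tr (the "_w" and "_tr" suffixes described in this post):

```stata
* Requires winsor2 (ssc install winsor2).
sysuse nlsw88, clear
winsor2 wage, cuts(1 99)         // winsorized copy, expected name: wage_w
winsor2 wage, cuts(1 99) trim    // trimmed copy, expected name: wage_tr

* Winsorizing keeps every observation; trimming sets the tails to missing
summarize wage wage_w wage_tr

* Number of observations dropped by trimming
count if missing(wage_tr) & !missing(wage)
```

The observation counts in the summarize output are the quickest check: wage_w should have as many non-missing values as wage, while wage_tr should have fewer.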

Summary: winsor2 winsorizes or trims (if the trim option is specified) the variables in varlist at the percentiles specified by the cuts(# #) option. By default, new variables are generated with the suffix "_w" or "_tr", which can be changed with the suffix() option. The replace option overwrites the original variables with their winsorized or trimmed versions.
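
A small sketch of the three naming behaviours just described, assuming winsor2 is installed. The suffix text _p5 and the cuts(5 95) values are arbitrary choices for the example:

```stata
* Requires winsor2 (ssc install winsor2).
sysuse nlsw88, clear

* Default: a new variable with the "_w" suffix (wage_w)
winsor2 wage

* Custom suffix: a new variable named wage_p5 (suffix chosen for this example)
winsor2 wage, cuts(5 95) suffix(_p5)

* replace: wage itself is overwritten, no new variable is created
winsor2 wage, cuts(5 95) replace

* List the resulting variables
describe wage*
```

Because replace overwrites the data in memory, it is usually safer to keep the suffixed copies until the results have been inspected.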

Improvements over the winsor command:
(1) it can process several variables in one call;
(2) it can trim as well as winsorize;
(3) it adds a by() option, so winsorizing or trimming can be done within groups;
(4) it adds a replace option, so the original variables can be overwritten directly without generating new ones.

Download:
ssc install winsor2, replace



异香菲 posted on 2019-10-7 10:10:02
Wow, this is great!