Rank regression inferences on doubly interval-censored data
Korean J Appl Stat 2024;37(6):769-782
Published online December 31, 2024
© 2024 The Korean Statistical Society.

Seohyeon Park (a), Sangbum Choi (1, a)

(a) Department of Statistics, Korea University
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MSIT) (Grant No. 2022M3J6A1063595, 2022R1A2C1008514).
(1) Corresponding author: Department of Statistics, Korea University, 145 Anam-ro, Seongbuk-gu, Seoul 02841, Korea.
E-mail: choisang@korea.ac.kr
Received August 23, 2024; Revised October 7, 2024; Accepted October 9, 2024.
Abstract
In many biomedical fields, especially in studies of disease progression, we frequently encounter two related sequential events, both of which are often interval-censored because of periodic examinations. Such data are said to be doubly interval-censored (DIC), and our primary interest is the elapsed time between the two consecutive events. In this paper, we propose a weighted rank regression approach for DIC data under the semiparametric accelerated failure time model. After transforming the DIC data into simple interval-censored data, with intervals that contain the true elapsed times, we develop an estimation procedure with a Gehan-type weight by gathering all comparable pairs of observed residuals from the transformed data. We then generalize this approach to data-dependent weights, such as log-rank weights, and extend it to clustered DIC data, where the cluster size is potentially informative, using an inverse cluster-size weighting strategy. An efficient technique for variance estimation is considered as an alternative to resampling techniques. We establish asymptotic properties and conduct numerical studies to demonstrate finite-sample performance. Finally, we illustrate our method with a real dataset that has a clustered DIC structure.
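The two steps named in the abstract, reducing DIC observations to a single interval for the elapsed time and then comparing residuals over comparable pairs, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dic_to_interval` and `gehan_loss` are hypothetical, the interval bounds are the natural ones implied by the two observation windows, and the loss is one plausible Gehan-type form, assuming the accelerated failure time model log T = x'beta + error.

```python
import math

def dic_to_interval(l1, r1, l2, r2):
    """Bounds on the elapsed time T = S2 - S1 when the first event S1 lies in
    [l1, r1] and the second event S2 lies in [l2, r2] (assumed reduction)."""
    lower = max(float(l2) - float(r1), 0.0)
    upper = float(r2) - float(l1)  # stays infinite if r2 is infinite
    return lower, upper

def gehan_loss(beta, xs, intervals):
    """Gehan-type loss over pairs (i, j): a pair contributes only when the
    right-endpoint residual of i lies below the left-endpoint residual of j,
    so that subject i is known to have the smaller residual."""
    e_left, e_right = [], []
    for x, (lo, hi) in zip(xs, intervals):
        eta = sum(b * v for b, v in zip(beta, x))
        # Endpoints at 0 or infinity carry no ordering information here.
        e_left.append(math.log(lo) - eta if lo > 0 else None)
        e_right.append(math.log(hi) - eta if 0 < hi < math.inf else None)
    total = 0.0
    for er in e_right:
        if er is None:
            continue
        for el in e_left:
            if el is not None:
                total += max(el - er, 0.0)
    return total / len(xs) ** 2

# Hypothetical data: two subjects, one covariate.
windows = [(0, 2, 3, 4), (0, 1, 5, 9)]
intervals = [dic_to_interval(*w) for w in windows]
loss_at_zero = gehan_loss([0.0], [[0.0], [1.0]], intervals)
```

The loss is piecewise linear and convex in `beta`, so in practice it would be minimized numerically. The cluster-size weighting and variance estimation mentioned in the abstract are not covered by this sketch.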
Keywords: accelerated lifetime, clustered data, doubly interval-censoring, Gehan statistic, rank regression

