Open Access Research article

Improved responsiveness and reduced sample size requirements of PROMIS physical function scales with item response theory

James F Fries1*, Eswar Krishnan1, Matthias Rose2, Bharathi Lingala1 and Bonnie Bruce1

Author affiliations

1 Department of Medicine, Stanford University School of Medicine, 1000 Welch Road, Suite 203, Palo Alto, CA 94304, USA

2 University of Massachusetts Boston, 100 Morrissey Blvd., Boston, MA 02125-3393, USA

Citation and License

Arthritis Research & Therapy 2011, 13:R147 doi:10.1186/ar3461

Published: 14 September 2011

Abstract

Introduction

The Health Assessment Questionnaire Disability Index (HAQ) and the SF-36 PF-10, among other instruments, yield sensitive and valid Disability (Physical Function) endpoints. Modern techniques, such as Item Response Theory (IRT), now enable development of more precise instruments using improved items. The NIH Patient Reported Outcomes Measurement Information System (PROMIS) is charged with developing improved IRT-based tools. We compared the ability to detect change in physical function using original (Legacy) instruments with Item-Improved and PROMIS IRT-based instruments.

Methods

We studied two Legacy (original) Physical Function/Disability instruments (HAQ, PF-10), their item-improved derivatives (Item-Improved HAQ and PF-10), and the IRT-based PROMIS Physical Function 10- (PROMIS PF 10) and 20-item (PROMIS PF 20) instruments. We compared sensitivity to detect 12-month changes in physical function in 451 rheumatoid arthritis (RA) patients and assessed relative responsiveness using P-values, effect sizes (ES), and sample size requirements.
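The link between effect size and sample size requirements can be illustrated with a minimal sketch. The abstract does not state which formulas the authors used, so everything here is an assumption: the effect size is taken as the standardized response mean (mean change divided by the standard deviation of change), and the required sample size uses the usual normal approximation for a two-sided paired comparison. The function names `effect_size` and `required_n` are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def effect_size(baseline, followup):
    """Standardized response mean: mean change / SD of change.

    Assumption: this is one common definition of effect size for
    responsiveness; the abstract does not say which one was used.
    """
    changes = [f - b for b, f in zip(baseline, followup)]
    return mean(changes) / stdev(changes)


def required_n(es, alpha=0.05, power=0.80):
    """Approximate number of patients needed to detect a change of
    standardized size `es` (two-sided test, normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for alpha
    z_beta = NormalDist().inv_cdf(power)           # quantile for desired power
    return math.ceil(((z_alpha + z_beta) / abs(es)) ** 2)


# Illustrative effect sizes only, not values from the study.
print(required_n(0.50))  # 32
print(required_n(0.25))  # 126
```

Under this approximation the required sample size scales with the inverse square of the effect size, so halving the effect size roughly quadruples the number of patients needed.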

Results

The study sample was 81% female and 87% Caucasian, averaged 65 years of age and 14 years of education, and had moderate baseline disability. All instruments were sensitive to detecting change (P < 0.05) in physical function over one year. The most responsive instruments in these patients were the Item-Improved HAQ and the PROMIS PF 20. IRT-improved instruments could detect a 1.2% difference with 80% power, while reference instruments could detect only a 2.3% difference (P < 0.01). The best IRT-based instruments required only one-quarter of the sample size of the Legacy (PF-10) comparator (95 versus 427). The HAQ outperformed the PF-10 in more impaired populations; the reverse was true in more normal populations. Considering especially the range of severity measured, the PROMIS PF 20 appears to be the most responsive instrument.

Conclusions

Physical Function scales using item-improved or IRT-based items can result in greater responsiveness and precision across a broader range of physical function. This can reduce sample size requirements and thus study costs.