New England Journal of Medicine, March, 2009
STUDY 1: Mortality Results from a Randomized Prostate-Cancer Screening Trial (US)
Background The effect of screening with prostate-specific–antigen (PSA) testing and digital rectal examination on the rate of death from prostate cancer is unknown.
This is the first report from the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial on prostate-cancer mortality.
Methods From 1993 through 2001, we randomly assigned 76,693 men at 10 U.S. study centers to receive either annual screening (38,343 subjects) or usual care as the control (38,350 subjects). Men in the screening group were offered annual PSA testing for 6 years and digital rectal examination for 4 years. The subjects and health care providers received the results and decided on the type of follow-up evaluation. Usual care sometimes included screening, as some organizations have recommended. The numbers of all cancers and deaths and causes of death were ascertained.
Results In the screening group, rates of compliance were 85% for PSA testing and 86% for digital rectal examination. Rates of screening in the control group increased from 40% in the first year to 52% in the sixth year for PSA testing and ranged from 41 to 46% for digital rectal examination. After 7 years of follow-up, the incidence of prostate cancer per 10,000 person-years was 116 (2820 cancers) in the screening group and 95 (2322 cancers) in the control group (rate ratio, 1.22; 95% confidence interval [CI], 1.16 to 1.29). The incidence of death per 10,000 person-years was 2.0 (50 deaths) in the screening group and 1.7 (44 deaths) in the control group (rate ratio, 1.13; 95% CI, 0.75 to 1.70). The data at 10 years were 67% complete and consistent with these overall findings.
Conclusions After 7 to 10 years of follow-up, the rate of death from prostate cancer was very low and did not differ significantly between the two study groups. (ClinicalTrials.gov number, NCT00002540.)
The benefit of screening for prostate cancer with serum prostate-specific–antigen (PSA) testing, digital rectal examination, or any other screening test is unknown. There has been no comprehensive assessment of the trade-offs between benefits and risks. Despite these uncertainties, PSA screening has been adopted by many patients and physicians in the United States and other countries. The use of PSA testing as a screening tool has increased dramatically in the United States since 1988.1
Numerous observational studies have reported conflicting findings regarding the benefit of screening.2 As a result, the screening recommendations of various organizations differ. The American Urological Association and the American Cancer Society recommend offering annual PSA testing and digital rectal examination beginning at the age of 50 years to men with a normal risk of prostate cancer and beginning at an earlier age to men at high risk.3,4
The National Comprehensive Cancer Network recommends a risk-based screening algorithm, including family history, race, and age.5 In contrast, the U.S. Preventive Services Task Force recently concluded that there was insufficient evidence in men under the age of 75 years to assess the balance between benefits and side effects associated with screening, and the panel recommended against screening men over the age of 75 years.6
Evidence from randomized trials would be of great assistance in making decisions about whether to pursue prostate-cancer screening. One randomized trial of PSA-based screening reported a benefit, but the results have been generally discounted because of serious methodologic concerns, including a lack of intention-to-screen analysis.7
Two ongoing randomized, controlled trials of prostate-cancer screening are being conducted to determine the effect of screening on prostate-cancer mortality: the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial in the United States and the European Randomized Study of Screening for Prostate Cancer (ERSPC).8,9 In the United Kingdom, another ongoing trial, the Comparison Arm for the PROTECT (Prostate Testing for Cancer and Treatment) study (CAP), combines the assessment of screening and treatment.10
The prostate component of the PLCO trial was designed to determine the effect of annual PSA testing and digital rectal examination on mortality from prostate cancer.11 Previous reports have described the results of the baseline round and three later rounds of screening12,13 and the characteristics of men undergoing biopsy14 in the intervention group. This report provides information on prostate-cancer incidence, staging, and mortality in both study groups during the first 7 to 10 years of the study.
The design of the PLCO trial has been described previously.11 From 1993 through 2001, men and women between the ages of 55 and 74 years were enrolled at 10 study centers across the United States. Each institution obtained annual approval from its institutional review board to carry out the study, and all subjects provided written informed consent. Individual randomization was performed within blocks stratified according to center, age, and sex. The primary exclusion criteria at study entry were a history of a PLCO cancer, current cancer treatment, and, starting in 1995, having had more than one PSA blood test in the previous 3 years.
Subjects who were assigned to the screening group were offered annual PSA testing for 6 years and annual digital rectal examination for 4 years. PSA tests were analyzed with the Tandem-R PSA assay until January 1, 2004, and with the Access Hybritech PSA after that date (both assays were manufactured by Beckman Coulter). All tests were performed at a single laboratory.
As was standard in the United States at the time of the trial’s initiation, a serum PSA level of more than 4.0 ng per milliliter was considered to be positive for prostate cancer. Digital rectal examinations were performed by physicians, qualified nurses, or physician assistants. The results of the examinations were deemed to be suspicious for cancer if there was nodularity or induration of the prostate or if the examiner judged the prostate to be suspicious for cancer on the basis of other criteria, including asymmetry. At study entry, subjects completed a baseline questionnaire that inquired about demographic characteristics and medical and screening histories. In addition, a biorepository for the collection and storage of blood and tissue samples was an integral component of the trial.15
All men who underwent screening and their health care providers were notified of the PSA value and the results of the digital rectal examination. Men with positive results for the PSA test or suspicious findings on the digital rectal examination were advised to seek diagnostic evaluation. In accordance with standard U.S. practice, diagnostic evaluation was decided by the patients and their primary physicians. Staff members at the PLCO study centers obtained medical records related to diagnostic follow-up of positive screening results, and medical-record abstractors recorded information on relevant diagnostic procedures.
The rate of compliance with screening was calculated as the number of subjects who were screened divided by the number of those who were expected to be screened. Screening outside the trial protocol in the control group was assessed through random surveys. The reasons for and frequency of use of various procedures, including the screening tests under evaluation in the trial, were queried every 1 to 2 years. In each survey, a new random sample of 1% of subjects was chosen.
Two groups were identified from responses on the baseline questionnaire: those who had undergone repeated prostate screening in the 3 years before trial entry and those who had not. For the latter, the proportion who reported having had a PSA test as part of a routine physical examination in the previous year was computed. Those who had had repeated PSA screenings, who made up 9.8% of the control group, did not receive the annual surveys during the PLCO study years of screening; for them, screening was assumed to persist at 100% each year. A weighted average of these two percentages was calculated to provide an estimated overall “contamination” rate for subjects in the control group who underwent screening.
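As a rough illustration of the weighted average described above, the following sketch combines the two subgroup rates. It assumes that the 9.8% of control subjects with repeated baseline screening are assigned a rate of 100% and that the remaining 90.2% are assigned the surveyed PSA-testing rates of 33% (first year) and 46% (sixth year) given in the Results. The function name and the pairing of these inputs are illustrative assumptions, not the trial's published computation.

```python
def contamination_rate(share_repeat, rate_repeat, rate_other):
    """Weighted average of screening rates across the two baseline subgroups."""
    share_other = 1.0 - share_repeat
    return share_repeat * rate_repeat + share_other * rate_other

# Inputs taken from the text; treating the rounded percentages as exact is an assumption.
SHARE_REPEAT = 0.098   # control subjects with repeated PSA tests before entry
RATE_REPEAT = 1.00     # screening assumed to persist at 100% each year

year1 = contamination_rate(SHARE_REPEAT, RATE_REPEAT, 0.33)
year6 = contamination_rate(SHARE_REPEAT, RATE_REPEAT, 0.46)
print(f"year 1: {year1:.1%}, year 6: {year6:.1%}")
```

With these rounded inputs the sketch gives about 40% and 51%, close to the overall control-group rates of 40% and 52% reported in the Results. The small gap in the sixth year presumably reflects rounding of the published percentages.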
Primary and Secondary End Points
Cause-specific mortality for each of the PLCO cancers was the primary end point. In addition, data on PLCO cancer incidence, staging, and survival were collected and monitored as secondary end points. All diagnosed cancers, both PLCO and non-PLCO, and all deaths occurring during the trial were ascertained, primarily by means of a mailed annual questionnaire, which asked about the type of cancer and the date of diagnosis in the previous year. Subjects who did not return the questionnaire were contacted by repeat mailing or telephone.
This active follow-up was supplemented by periodic linkage to the National Death Index to enhance completeness of end-point ascertainment. Clinical stage was determined with the use of the tumor–node–metastasis staging system and categorized according to the fifth edition of the AJCC [American Joint Committee on Cancer] Cancer Staging Manual.16 Death certificates were obtained to confirm the death and to provisionally determine the underlying cause.
Since the true underlying cause may not always be evident or accurately recorded on the death certificate, the trial used a special end-point adjudication process to assign the cause of death in a uniform and unbiased manner.17 All deaths from causes that were potentially related to one of the PLCO cancers were reviewed, including any cause of death in which the subject had a PLCO cancer or a possible metastasis from a PLCO cancer and all deaths of unknown or uncertain cause. Reviewers of these deaths were unaware of study-group assignments for deceased subjects.
The primary analysis was an intention-to-screen comparison of prostate-cancer mortality between the two study groups. Event rates were defined as the ratio of the number of events (cancer diagnoses or deaths) in a given time period to the person-years at risk for the event. Person-years were measured from randomization to the date of diagnosis, death, or data censoring (whichever came first) for incidence rates and to the date of death or censoring (whichever came first) for death rates. Confidence intervals for rate ratios for incidence and mortality were calculated with the use of asymptotic methods, assuming a normal distribution for the logarithm of the ratio and a Poisson distribution for the number of events.18
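The interval construction described above can be sketched as follows. This is a minimal example that assumes the standard large-sample result for Poisson counts, in which the variance of the log rate ratio is approximately the sum of the reciprocals of the two event counts. The person-years used here are hypothetical placeholders (the text does not report them), chosen equal in both groups, so the point estimate differs slightly from the published 1.13.

```python
import math

def rate_ratio_ci(events_a, py_a, events_b, py_b, z=1.96):
    """Rate ratio (group a vs. group b) with a normal-theory interval on the log scale.

    Assumes Poisson-distributed event counts, so that
    Var[log RR] is approximately 1/events_a + 1/events_b.
    """
    rr = (events_a / py_a) / (events_b / py_b)
    se_log = math.sqrt(1.0 / events_a + 1.0 / events_b)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper

# 7-year prostate-cancer deaths from the text: 50 (screening) vs. 44 (control).
# HYPOTHETICAL person-years: not given in the text, so equal exposure is assumed.
rr, lower, upper = rate_ratio_ci(50, 250_000, 44, 250_000)
print(f"rate ratio {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Under the equal-exposure assumption the sketch yields roughly 1.14 (0.76 to 1.70), which is close to the reported 7-year result of 1.13 (0.75 to 1.70). The width of the interval is driven almost entirely by the small number of deaths.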
From the initiation of the trial, an independent data and safety monitoring board considered reports every 6 months and reviewed the accumulating data. In November 2008, the board unanimously recommended that the current results on prostate-cancer mortality be reported, after notification of study investigators and subjects, on the basis of data showing a continuing lack of a significant difference in the death rate between the two study groups at 10 years (with complete follow-up at 7 years) and information suggesting harm from screening.
This recommendation was not the result of crossing a statistical futility boundary but, rather, was triggered by concern that men and their physicians were making decisions on screening on the basis of inadequate information, that the data available from the trial were complete up to 7 years and consistent up to at least 10 years, and that public health considerations dictated that the available results should be made known. However, the monitoring board also supported follow-up of the subjects until all of them had reached at least 13 years of follow-up.
The baseline characteristics of the subjects were virtually identical in the two study groups (Table 1). At 7 years, vital status was known for 98% of the men in the two groups (see the Supplementary Appendix, available with the full text of this article at NEJM.org). At 10 years, vital status was known for 67% of the subjects, although 23% had not been enrolled for 10 years. The median duration of follow-up was 11.5 years (range, 7.2 to 14.8) in the two groups.
Compliance with the screening protocol overall was 85% for PSA testing and 86% for digital rectal examination. These findings are similar to the design estimates of 90% for each test. Screening results for the first four rounds were reported previously.13 In the control group, the rate of PSA testing was 40% in the first year and increased to 52% in the sixth year; for subjects who reported having undergone no more than one PSA test at baseline (89% of subjects), the rate of PSA testing was 33% in the first year and 46% in the sixth year. The rate of screening by digital rectal examination in the control group ranged from 41 to 46%.
Figure 1A shows the accumulation of cases of prostate cancer in the two study groups. At 7 years, 2 years after the cessation of screening, prostate cancer had been diagnosed in more subjects in the screening group (2820) than in the control group (2322) (rate ratio, 1.22; 95% confidence interval [CI], 1.16 to 1.29). At 10 years, with follow-up complete for 67% of subjects, the excess in the screening group persisted, with 3452 subjects versus 2974 subjects (rate ratio, 1.17; 95% CI, 1.11 to 1.22).
Table 2 shows the characteristics of subjects with prostate cancer in each group, according to the circumstances of detection, through 10 years of follow-up. The large majority of prostate cancers were stage II at diagnosis, regardless of the mode of detection in the screening group; nearly all were adenocarcinomas, and more than 50% had a Gleason score of 5 to 6 (on a scale from 2 to 10, with higher scores indicating more aggressive disease).
Overall, the numbers of subjects with advanced (stage III or IV) tumors were similar in the two groups, with 122 in the screening group and 135 in the control group, though the number of subjects with a Gleason score of 8 to 10 was higher in the control group (341 subjects) than in the screening group (289 subjects).
The treatment distributions were similar in the two groups within each tumor stage. For example, among subjects with stage II tumors, as their primary treatment, 44% of the screening group and 40% of the control group underwent prostatectomy, 22% of the screening group and 21% of the control group underwent irradiation alone, and 18% and 21%, respectively, underwent irradiation and hormonal therapy.
Among subjects with stage III tumors, 24% of the screening group and 16% of the control group underwent irradiation alone, and 47% and 52%, respectively, underwent irradiation plus hormone therapy. Among subjects with stage IV tumors, 75% of the screening group and 72% of the control group received hormone therapy only. Overall, nearly 11% of the subjects in the screening group and 10% of those in the control group did not undergo any known treatment.
At 7 years, there were 50 deaths attributed to prostate cancer in the screening group and 44 in the control group (rate ratio, 1.13; 95% CI, 0.75 to 1.70) (Figure 1B and Table 3).
Through year 10, with follow-up complete for 67% of the subjects, the numbers of prostate-cancer deaths were 92 in the screening group and 82 in the control group (rate ratio, 1.11; 95% CI, 0.83 to 1.50). At 10 years, the median follow-up time for subjects with prostate cancer was 6.3 years in the screening group and 5.2 years in the control group.
There was little difference between the two groups in terms of the proportion of deaths according to tumor stage. In the screening group, 60% of the subjects had stage I or II tumors, 2% had stage III tumors, and 36% had stage IV tumors; in the control group, 52% of the subjects had stage I or II tumors, 4% had stage III tumors, and 39% had stage IV tumors.
Analyses within strata according to the screening status at baseline showed no indication of any reduction in prostate-cancer mortality in the screening group, as compared with the control group, in any of the subgroups.
Thus, at 7 years, among the 34,755 men in the screening group and 34,590 in the control group who reported having undergone no more than one PSA test at baseline, there were 48 prostate-cancer deaths in the screening group and 41 deaths in the control group (rate ratio, 1.16; 95% CI, 0.76 to 1.76); at 10 years, there were 83 deaths in the screening group and 75 in the control group (rate ratio, 1.09; 95% CI, 0.80 to 1.50).
Similarly, among 3588 men in the screening group and 3760 men in the control group who reported having had two or more PSA tests in the previous 3 years at baseline, there were two deaths in the screening group and three deaths in the control group at 7 years (rate ratio, 0.70; 95% CI, 0.12 to 4.17) and nine deaths in the screening group and seven in the control group at 10 years (rate ratio, 1.34; 95% CI, 0.50 to 3.59).
At 7 years, the total numbers of deaths (excluding those from prostate, lung, or colorectal cancers) were 2544 in the screening group and 2596 in the control group (rate ratio, 0.98; 95% CI, 0.92 to 1.03); at 10 years, the numbers of such deaths were 3953 and 4058, respectively (rate ratio, 0.97; 95% CI, 0.93 to 1.01). The distribution of the causes of death was similar in the two groups (Table 4).
Risks incurred from a screening process can result from the screening itself or from downstream diagnostic or treatment interventions. In the screening group, the complications associated with screening were mild and infrequent.
Digital rectal examination led to very few episodes of bleeding or pain, at a rate of 0.3 per 10,000 screenings. The PSA test led to complications at a rate of 26.2 per 10,000 screenings (primarily dizziness, bruising, and hematoma) and included three episodes of fainting per 10,000 screenings.
Medical complications from the diagnostic process occurred in 68 of 10,000 diagnostic evaluations after positive results on screening. These complications were primarily infection, bleeding, clot formation, and urinary difficulties. Treatment-related complications, which are generally more serious, include infection, incontinence, impotence, and other disorders. Such complications are now being catalogued in a quality-of-life study and are particularly pertinent in cases of overdiagnosis.
We are reporting here for the first time on the PLCO trial with respect to prostate-cancer mortality. At 7 years, screening was associated with a relative increase of 22% in the rate of prostate-cancer diagnosis, as compared with the control group.
This increase occurred even though the rate of compliance in screening (85%) was slightly below the level we anticipated in the study design (90%) and there was more-than-expected screening in the control group.
Screening was associated with no reduction in prostate-cancer mortality during the first 7 years of the trial (rate ratio, 1.13), with similar results through 10 years, at which time 67% of the data were complete.
However, the confidence intervals around these estimates are wide. The results at 7 years were consistent with a reduction in mortality of up to 25% or an increase in mortality of up to 70%; at 10 years, those rates were 17% and 50%, respectively.
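The percentages quoted above follow directly from the bounds of the reported confidence intervals. A small sketch, with a hypothetical helper name, makes the conversion explicit. It only applies when the interval spans 1, as both intervals here do.

```python
def ci_as_percent_change(lower, upper):
    """Convert rate-ratio CI bounds into the largest compatible
    percent reduction (from the lower bound) and increase (from the upper bound)."""
    max_reduction = round((1.0 - lower) * 100)
    max_increase = round((upper - 1.0) * 100)
    return max_reduction, max_increase

seven_year = ci_as_percent_change(0.75, 1.70)  # 7-year interval from the text
ten_year = ci_as_percent_change(0.83, 1.50)    # 10-year interval from the text
print(seven_year, ten_year)
```

The results, (25, 70) and (17, 50), match the reductions and increases stated in the text.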
There was little difference between the two study groups in the number of deaths from other causes. However, among men with prostate cancer at 10 years, 312 in the screening group and 225 in the control group died from causes other than prostate cancer, and the excess in the screening group was possibly associated with overdiagnosis of prostate cancer.
There are several possible explanations for the lack of a reduction in mortality so far in this trial. First, annual screening with the PSA test using the standard U.S. threshold of 4 ng per milliliter and digital rectal examination to trigger diagnostic evaluation may not be effective.
In the ERSPC trial, a PSA cutoff level of 3 ng per milliliter was used, with potentially increased sensitivity but reduced specificity. In our trial, a lower cutoff level might have resulted in the diagnosis of more prostate cancers earlier by screening. It has been shown that cancers that are detected by PSA screening at a level of less than 4 ng per milliliter have a favorable prognosis.9
Since increased detection of more of such good-prognosis tumors might have increased the rate of overdiagnosis, such a change probably would have had little or no effect on the rate of death from prostate cancer.
Second, the level of screening in the control group could have been substantial enough to dilute any modest effect of annual screening in the screening group.
Although the estimated rate of screening in the control group was higher than the original design estimate of 20%, it was similar to the 38% level anticipated in the protocol revision in 1998.11
To be included in our definition of “PSA contamination,” a subject in the control group needed to have had a PSA test within the past year as part of a routine physical examination. It was thought that such a situation would most closely represent the experience of PSA screening among compliant men in the screening group.
However, this definition could be overly restrictive, since PSA testing that occurred outside these measures could still have had an effect on prostate-cancer incidence and mortality in the control group.
Nonetheless, in the early years of the study, the level of testing in the screening group was substantially higher than that in the control group, and although the difference lessened later, testing levels remained distinctly higher in the screening group. The screening that occurred in the control group was not enough to eliminate the expected effects of annual screening — such as earlier diagnosis and a persistent excess of cases, largely due to overdiagnosis — in the screening group.
Third, approximately 44% of the men in each study group had undergone one or more PSA tests at baseline, which would have eliminated some cancers detectable on screening from the randomized population, especially in health-conscious men (who tend to be screened more often, a form of selection bias); thus, the cumulative death rate from prostate cancer at 10 years in the two groups combined was 25% lower in those who had undergone two or more PSA tests at baseline than in those who had not been tested.
Fourth, and potentially most important, improvement in therapy for prostate cancer during the course of the trial probably resulted in fewer prostate-cancer deaths in the two study groups, which blunted any potential benefits of screening.19,20 It is important to note that our policy of not mandating specific therapies after cancer detection on screening resulted in substantial similarities in treatment according to tumor stage between the two study groups.
Finally, the follow-up may not yet be long enough for benefit from the earlier detection of an increased number of prostate cancers in the screening group to emerge. Data are accruing on the natural history of screen-detected prostate cancer.
Thus, a report from the Rotterdam component of the ERSPC trial suggests a lead time of 12.3 years at the age of 55 years and 6 years at the age of 75 years, with estimated overdiagnosis rates of 27% and 56%, respectively.21
Wider application of improvements in prostate-cancer treatment is probably at least in part responsible for declining death rates from prostate cancer in most countries.22 For example, if a patient’s life is prolonged by the use of hormone therapy, the opportunities for competing causes of death increase, especially among older men.
Computations of lead time provide little information on prognosis, except to the extent that patients with long lead times are likely to have a better prognosis than those with short lead times.
In our study, the average lead time achieved by increased early diagnosis through screening was approximately 2 years (Figure 1A). At 7 years, 73% of prostate cancers had been screen-detected in the screening group. In addition, the possibly emerging reduction in the incidence of tumors with a Gleason score of 8 to 10 in the screening group might portend a future reduction in mortality.
However, we now know that prostate-cancer screening provided no reduction in death rates at 7 years and that no indication of a benefit appeared with 67% of the subjects having completed 10 years of follow-up. Thus, our results support the validity of the recent recommendations of the U.S. Preventive Services Task Force, especially against screening all men over the age of 75 years.6
Risks incurred by screening, diagnosis,23,24 and resulting treatment25-31 of prostate cancer are both substantial and well documented in the literature.
To the extent that overdiagnosis occurs with prostate-cancer screening, many of these risks occur in men in whom prostate cancer would not have been detected in their lifetime had it not been for screening.
The effect of screening on quality of life is a subject of an ongoing substudy and should be completed within the next several years. Follow-up in the PLCO trial is planned to continue until all subjects reach at least 13 years. A final report will be presented once the planned duration of follow-up is completed.
STUDY 2: Screening and Prostate-Cancer Mortality in a Randomized European Study
ABSTRACT: Background The European Randomized Study of Screening for Prostate Cancer was initiated in the early 1990s to evaluate the effect of screening with prostate-specific–antigen (PSA) testing on death rates from prostate cancer.
Methods We identified 182,000 men between the ages of 50 and 74 years through registries in seven European countries for inclusion in our study. The men were randomly assigned to a group that was offered PSA screening at an average of once every 4 years or to a control group that did not receive such screening.
The predefined core age group for this study included 162,243 men between the ages of 55 and 69 years. The primary outcome was the rate of death from prostate cancer. Mortality follow-up was identical for the two study groups and ended on December 31, 2006.
Results In the screening group, 82% of men accepted at least one offer of screening. During a median follow-up of 9 years, the cumulative incidence of prostate cancer was 8.2% in the screening group and 4.8% in the control group.
The rate ratio for death from prostate cancer in the screening group, as compared with the control group, was 0.80 (95% confidence interval [CI], 0.65 to 0.98; adjusted P=0.04). The absolute risk difference was 0.71 death per 1000 men. This means that 1410 men would need to be screened and 48 additional cases of prostate cancer would need to be treated to prevent one death from prostate cancer.
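The two numbers in the last sentence can be approximately recovered from figures in this abstract. The sketch below assumes that the number needed to screen is the reciprocal of the absolute risk difference, and that the additional cases to be treated equal that number multiplied by the excess cumulative incidence (8.2% minus 4.8%). Applying the abstract's incidence figures in this way is an assumption for illustration, and the variable names are hypothetical.

```python
ARD_PER_1000 = 0.71           # absolute risk difference, deaths per 1000 men
INCIDENCE_SCREENING = 0.082   # cumulative incidence, screening group
INCIDENCE_CONTROL = 0.048     # cumulative incidence, control group

def number_needed_to_screen(ard_per_1000):
    """Reciprocal of the absolute risk difference, expressed per man."""
    return 1000.0 / ard_per_1000

nns = number_needed_to_screen(ARD_PER_1000)
excess_incidence = INCIDENCE_SCREENING - INCIDENCE_CONTROL
extra_cases = nns * excess_incidence
print(f"NNS ~ {nns:.0f}, additional cases per death prevented ~ {extra_cases:.0f}")
```

The sketch gives about 1408 men screened and about 48 additional cases, consistent with the reported 1410 and 48 once the published rounding is taken into account.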
The analysis of men who were actually screened during the first round (excluding subjects with noncompliance) provided a rate ratio for death from prostate cancer of 0.73 (95% CI, 0.56 to 0.90).
Conclusions PSA-based screening reduced the rate of death from prostate cancer by 20% but was associated with a high risk of overdiagnosis. (Current Controlled Trials number, ISRCTN49127736.)
Measurement of serum prostate-specific antigen (PSA), a biomarker for prostate cancer,1 is useful for the detection of early prostate cancer.2 Nevertheless, the effect of PSA-based screening on prostate-cancer mortality remains unclear.3 The European Randomized Study of Screening for Prostate Cancer (ERSPC) was initiated in the early 1990s to determine whether a reduction of 25% in prostate-cancer mortality could be achieved by PSA-based screening.4 Preliminary data from this study have been published and can be accessed at www.erspc.org. Another randomized screening trial in the United States, the Prostate, Lung, Colorectal, and Ovarian (PLCO) Cancer Screening Trial, was initiated around the same time, and interim results are also reported in this issue of the Journal.5
We designed the ERSPC as a randomized, multicenter trial of screening for prostate cancer, with the rate of death from prostate cancer as the primary outcome.
An independent data and safety monitoring committee reviewed the trial, and interim analyses were carried out according to a monitoring and evaluation plan in which the outcome of the trial was to be presented to the research group once a statistically significant result corrected for interim analyses was reached.6,7
The study’s protocol was reviewed by local and governmental ethics committees (for details, see Supplementary Appendix 4, available with the full text of this article at NEJM.org).
Recruitment and randomization procedures differed among countries and were developed in accordance with national regulations.
In Finland, Sweden, and Italy, the trial subjects were identified from population registries and underwent randomization before written informed consent was provided (population-based effectiveness trial).
In the Netherlands, Belgium, Switzerland, and Spain, the target population was also identified from population lists, but when the men were invited to participate in the trial, only those who provided consent underwent randomization (efficacy trial).
The results of analyses from two participating countries were not included in this analysis: investigators in Portugal discontinued their participation in October 2000 because they were unable to provide the necessary data, and investigators in France decided to participate in 2001, so data from their analyses were not included because of the short duration of follow-up.
Men in whom prostate cancer had been diagnosed (according to data from questionnaires or registries) were ineligible. Within each country, men were assigned to either the screening group or the control group on the basis of random-number generators, without the use of blocks of numbers or stratification (Figure 1).
At all study centers, the core age group included men between the ages of 55 and 69 years at entry.
In addition, in Sweden, study investigators included men between the ages of 50 and 54 years, and investigators in the Netherlands, Italy, Belgium, and Spain included men up to the age of 74 years at entry.
In Switzerland, men between the ages of 55 and 69 years were included, with screening up to the age of 75 years. In Finland, men were recruited at the ages of 55, 59, 63, and 67 years and were screened until the age of 71 years.
Screening was discontinued in all other centers when the chosen upper age limit was reached. The validity of randomization was determined by comparing the age distributions and the rates of death from any cause in the two study groups.
At centers in all countries except Finland, subjects were randomly assigned in a 1:1 ratio to the screening group or the control group. In Finland, the size of the screening group was fixed at 32,000 subjects. Because the whole birth cohort underwent randomization, this led to a ratio, for the screening group to the control group, of approximately 1:1.5.
Each center reported data on recruitment, screening, and mortality twice a year to a central data center. Several task forces and working groups were responsible for quality assurance, including an epidemiology committee, a quality-control committee, a pathology committee, and a PSA committee.7
The data and safety monitoring committee had oversight of the trial, with a mandate to stop the trial on demonstrating a significant difference between the groups or adverse effects of screening. The monitoring committee received reports on the progress of the trial, including prostate-cancer mortality.
Causes of death, which were obtained from registries and individual chart review, were assigned according to definitions and procedures developed for the trial. A committee that analyzed causes of death was formed at each center, and an international committee coordinated the work of these national committees.8,9
Screening Tests and Indications for Biopsy
Total PSA was measured with the use of Hybritech assay systems (Beckman Coulter). From 1994 through 2000, the Tandem E assay was used, and thereafter the Access assay, with the original Hybritech calibration always applied.10
Most centers used a PSA cutoff value of 3.0 ng per milliliter as an indication for biopsy.
In Finland, a PSA value of 4.0 ng per milliliter or more was defined as positive and the men were referred for biopsy; those with a value of 3.0 to 3.9 ng per milliliter underwent an ancillary test (digital rectal examination until 1998, and calculation of the ratio of the free PSA value to the total PSA value, with a cutoff value of 0.16, starting in 1999) and were referred for biopsy if the test was positive.
In Italy, a PSA value of 4.0 ng per milliliter or more was defined as positive, but men with a PSA value of 2.5 to 3.9 ng per milliliter also underwent ancillary tests (digital rectal examination and transrectal ultrasonography).
In the Dutch and Belgian centers, up to February 1997, a combination of digital rectal examination, transrectal ultrasonography, and PSA testing (with a cutoff value of 4.0 ng per milliliter) was used for screening; in 1997, this combination was replaced by PSA testing only.7,11,12
In Belgium, where the results of a pilot study (from 1991 to 1994) were included in the final data set up to 1995, a PSA cutoff value of 10.0 ng per milliliter was used initially.
Most centers used sextant biopsies guided by transrectal ultrasonography. As of June 1996, lateralized sextant biopsies were recommended.13
In Italy, transperineal sextant biopsies were used. In Finland, a biopsy procedure with 10 to 12 biopsy cores was adopted in 2002 as a general policy for the two study groups.
The screening interval at six of the seven centers was 4 years (accounting for 87% of the subjects); Sweden used a 2-year interval. In Belgium, the interval between the first and second rounds of screening was 7 years because of an interruption in funding.
The primary evaluation of specimens from biopsies and radical prostatectomies was performed by local pathologists. Central review of the pathological analyses was not carried out. However, standardization of procedures was coordinated and achieved by the work of the international pathology committee. (For details on the committee and its functions, see Supplementary Appendix 3.)
The treatment of prostate cancer was performed according to local policies and guidelines. The equality of distribution of treatments that were applied to the screening group and the control group has been evaluated, with little indication of differences between the two study groups after adjustment for disease stage, tumor grade, and age (data not shown).14
Follow-up for mortality analyses began at randomization and ended at death, emigration, or a uniform censoring date (December 31, 2006), with identical follow-up in the two study groups.
Causes of death were evaluated in a blinded fashion and according to a standard algorithm9 or, after validation, on the basis of official causes of death. The causes were classified by the independent committees as definite prostate cancer, causes related to screening, probable or possible prostate cancer, and other intercurrent causes (with or without prostate cancer as a contributory factor).
Decision points that were used for determining the cause of death have been described previously.9 For this analysis, we have combined the categories of definite and probable prostate cancer and the category of causes related to screening.
Aspects of quality of life were evaluated in several study centers. A complete evaluation of all the steps of screening was conducted in the Netherlands (data not shown).15,16,17,18,19,20,21
The statistical analysis was based on the core age group (including men between the ages of 55 and 69 years at randomization) and on the intention-to-screen principle.
Overall mortality was studied to evaluate the correctness of randomization. Poisson regression analysis was used to estimate the ratio of mortality in the intervention group to mortality in the control group, stratified according to study center and age group at randomization. The Nelson–Aalen method was used for the calculation of cumulative hazard.22
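As a rough illustration of the cumulative-hazard calculation, the Nelson–Aalen estimator adds, at each observed event time, the number of deaths divided by the number of subjects still at risk. The sketch below is a minimal Python version; the function name and the toy follow-up data are hypothetical and are not taken from the trial.

```python
from collections import Counter

def nelson_aalen(times, events):
    """Return [(event_time, cumulative_hazard)] for right-censored data.

    times  -- follow-up time for each subject
    events -- 1 if the subject died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    cumulative = 0.0
    curve = []
    for t in sorted(deaths):
        at_risk = sum(1 for u in times if u >= t)  # still under follow-up at t
        cumulative += deaths[t] / at_risk
        curve.append((t, cumulative))
    return curve

# Hypothetical toy data (years of follow-up), not trial data.
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
toy_curve = nelson_aalen(toy_times, toy_events)
```

Plotting such a curve separately for each study group gives the kind of display described for cumulative hazard.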
All P values are two-sided. Interim analyses were conducted for follow-up in 2002, 2004, and 2006, with an alpha spending curve with a division of uneven weights.23
A preliminary analysis included men who had actually undergone screening in the first round (with adjustment for noncompliance). The number that would need to be screened to prevent one death from prostate cancer was calculated as the inverse of the absolute difference in cumulative mortality from prostate cancer between the two study groups.
The study had a power of 86% to show a statistically significant difference of 25% or more in prostate-cancer mortality at a significance level of 0.05 among men who underwent screening, on the basis of follow-up through 2008.4
The sample-size calculation, which was part of the power calculation, took into account noncompliance in the screening group in each study center and the use of PSA tests outside the protocol assignment in the control group (termed contamination of the control group).
On the basis of an overall level of compliance of 82% and 20% contamination in the control group, a 25% reduction in prostate-cancer mortality among men who underwent screening would be equivalent to a 14% reduction in an intention-to-treat analysis. This assumes that men who were screened and those who were not screened had the same underlying risk and that screening in the control group was as effective as that in the screening group.
Figure 1 shows trial enrollment, study-group assignments, and follow-up of all subjects and of the core age group. A total of 162,387 men in the core age group underwent randomization; of these men, 72,952 were assigned to the screening group and 89,435 to the control group. A total of 62 men in the screening group and 82 men in the control group died between identification and randomization.
Table 1 summarizes the characteristics of the subjects according to the center and the results of screening. The mean age at randomization was 60.8 years (range, 59.6 to 63.0), with little variation among the seven countries.
In total, 82.2% of the men in the screening group were screened at least once. Compliance was higher in study centers that obtained consent before randomization (88 to 100%) than in those in which subjects underwent randomization before providing consent (62 to 68%) (for details concerning all age groups, see Table 1A in Supplementary Appendix 5).
During the trial, 126,462 PSA-based tests were performed, an average of 2.1 per subject who underwent screening. Overall, 16.2% of all tests were positive, with a range of 11.1 to 22.3% among the centers. The average rate of compliance with biopsy recommendations was 85.8% (range, 65.4 to 90.3). Of the men who underwent biopsy for an elevated PSA value, 13,308 (75.9%) had a false positive result.
We detected 5990 prostate cancers in the screening group and 4307 in the control group. These numbers correspond to a cumulative incidence of 8.2% and 4.8%, respectively.
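The cumulative incidences can be checked with a short calculation. The sketch below assumes that the denominators are the numbers of men in the core age group who underwent randomization (72,952 and 89,435); the variable names are illustrative only.

```python
# Cancers detected and men randomized in the core age group (figures from the text).
cancers = {"screening": 5990, "control": 4307}
randomized = {"screening": 72952, "control": 89435}  # assumed denominators

cumulative_incidence = {
    group: 100 * cancers[group] / randomized[group] for group in cancers
}
# Excess diagnoses in the screening group, expressed per 1000 men.
excess_per_1000 = 10 * (
    cumulative_incidence["screening"] - cumulative_incidence["control"]
)

for group, value in cumulative_incidence.items():
    print(f"{group}: {value:.1f}%")
```

Under this assumption the two percentages round to the reported values, and the difference between them is about 34 additional diagnoses per 1000 men.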
The positive predictive value of a biopsy (the number of cancers detected on screening divided by the number of biopsies expressed as a percentage) was on average 24.1% (range, 18.6 to 29.6).
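Because a biopsy prompted by an elevated PSA value either finds cancer or is a false positive, the positive predictive value and the false positive share should sum to 100%. The sketch below back-calculates the approximate number of biopsied men from the figures in the text; the derived counts are estimates for illustration, not reported values.

```python
false_positives = 13308        # from the text
false_positive_share = 0.759   # from the text

# Derived estimates (not reported in the text).
biopsied = false_positives / false_positive_share
true_positives = biopsied - false_positives
ppv_percent = 100 * true_positives / biopsied

print(f"about {biopsied:.0f} biopsied, PPV {ppv_percent:.1f}%")
```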
The cumulative incidence of local prostate cancer was higher in the screening group than in the control group (for details about tumor stage, grade distribution, and treatment, see Supplementary Appendixes 6 and 7), whereas indicators of advanced disease were less common. For example, the number of men with positive results on a bone scan (or a PSA value of more than 100 ng per milliliter in those without bone-scan results) was 0.23 per 1000 person-years in the screening group, as compared with 0.39 per 1000 person-years in the control group, a 41% reduction in the screening group (P<0.001).
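The 41% figure follows directly from the two rates. A minimal check, with illustrative variable names:

```python
rate_screening = 0.23  # per 1000 person-years, from the text
rate_control = 0.39    # per 1000 person-years, from the text

# Relative reduction = 1 minus the ratio of the two rates.
relative_reduction = 1 - rate_screening / rate_control
print(f"{relative_reduction:.0%}")
```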
The proportions of men who had a Gleason score of 6 or less were 72.2% in the screening group and 54.8% in the control group, and the proportions with a Gleason score of 7 or more were 27.8% in the screening group and 45.2% in the control group.
As of December 31, 2006, with average and median follow-up times of 8.8 and 9.0 years, respectively, there were 214 prostate-cancer deaths in the screening group and 326 in the control group in the core age group.
Deaths that were associated with prostate-cancer–related interventions were categorized as deaths from prostate cancer. The unadjusted rate ratio for death from prostate cancer in the screening group was 0.80 (95% confidence interval [CI], 0.67 to 0.95; P=0.01); after adjustment for sequential testing with alpha spending due to two previous interim analyses (based on Poisson regression analysis), the rate ratio was 0.80 (95% CI, 0.65 to 0.98; P=0.04). The rates of death in the two study groups began to diverge after 7 to 8 years and continued to diverge further over time (Figure 2).
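The unadjusted rate ratio can be approximated from the death counts alone. The sketch below is a crude check, not the trial's analysis: it assumes the numbers of men randomized in the core age group (72,952 and 89,435) as denominators instead of person-years, ignores stratification by center and age, and uses the usual large-sample standard error for the log of a ratio of rare-event counts. The reported values came from stratified Poisson regression, so only approximate agreement should be expected.

```python
import math

deaths = {"screening": 214, "control": 326}          # from the text
randomized = {"screening": 72952, "control": 89435}  # assumed denominators

risk = {group: deaths[group] / randomized[group] for group in deaths}
crude_ratio = risk["screening"] / risk["control"]

# Approximate standard error of log(ratio) when events are rare.
se_log = math.sqrt(1 / deaths["screening"] + 1 / deaths["control"])
ci_low = math.exp(math.log(crude_ratio) - 1.96 * se_log)
ci_high = math.exp(math.log(crude_ratio) + 1.96 * se_log)

print(f"{crude_ratio:.2f} ({ci_low:.2f} to {ci_high:.2f})")
```

Under these assumptions the crude ratio rounds to 0.80, with an interval within about 0.01 of the reported unadjusted limits.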
Figure 2. Cumulative Risk of Death from Prostate Cancer.
As of December 31, 2006, with an average follow-up time of 8.8 years, there were 214 prostate-cancer deaths in the screening group and 326 in the control group. Deaths that were associated with interventions were categorized as being due to prostate cancer. The adjusted rate ratio for death from prostate cancer in the screening group was 0.80 (95% CI, 0.65 to 0.98; P=0.04). The Nelson–Aalen method was used for the calculation of cumulative hazard.
In the intention-to-screen analysis, the absolute difference between the screening group and the control group was 0.71 prostate-cancer death per 1000 men. This means that in order to prevent one prostate-cancer death, the number of men who would need to be screened would be 1410 (95% CI, 1142 to 1721), with an average of 1.7 screening visits per subject during a 9-year period.
The additional prostate cancers diagnosed by screening resulted in an increase in cumulative incidence of 34 per 1000 men, as compared with the control group. In other words, 48 additional subjects (1410÷1000×34) would need to be treated to prevent one death from prostate cancer.
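The number needed to screen and the number needed to treat reported above can be reproduced from the absolute differences. The sketch below uses only figures from the text; the variable names are illustrative.

```python
absolute_difference_per_1000 = 0.71  # prostate-cancer deaths averted per 1000 men
excess_cases_per_1000 = 34           # additional cancers diagnosed per 1000 men

# Number needed to screen: inverse of the absolute risk difference.
nns = 1000 / absolute_difference_per_1000  # about 1408; reported as 1410

# Number needed to treat: excess diagnoses among the men screened per death averted.
nnt_from_reported_nns = 1410 / 1000 * excess_cases_per_1000

print(round(nns), round(nnt_from_reported_nns))
```

The small gap between 1408 and the reported 1410 is consistent with rounding of the 0.71 figure.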
In an analysis of men who were actually screened during the first round (which was adjusted for noncompliance), the rate ratio for prostate-cancer death after 9 years was 0.73 (95% CI, 0.56 to 0.90), which meant that 1068 men would need to be screened and 48 would need to be treated to prevent one death from prostate cancer. The number of men who would need to be treated (48) remained unchanged in the per-protocol analysis because the same number of deaths were prevented and the same number of additional cases were diagnosed in men who actually underwent screening.
Effect of Age on Mortality
In an exploratory analysis of mortality according to age group, there was no evidence of heterogeneity among age groups (Table 2). Among men between the ages of 50 and 54 years at baseline, the number of events was small, with no obvious screening effect.
Heterogeneity of Rate Ratios
In an exploratory analysis of heterogeneity according to study center (which was carried out in accordance with the monitoring plan6), the decrease in the rate of death from prostate cancer in the screening group could not be attributed to any single center, as evidenced by rate ratios ranging between 0.74 and 0.84 after the exclusion of each center, one at a time. There was no significant difference in overall mortality (Table 3).
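The exclusion of each center, one at a time, is a leave-one-out sensitivity analysis. The sketch below shows the idea with a crude ratio of deaths per person-year; the center names, counts, and person-years are hypothetical toy values, and the trial itself used stratified Poisson regression and not this simplified ratio.

```python
def rate_ratio(centers):
    """Crude screening-to-control ratio of deaths per person-year."""
    d_s = sum(c["deaths_s"] for c in centers)
    py_s = sum(c["py_s"] for c in centers)
    d_c = sum(c["deaths_c"] for c in centers)
    py_c = sum(c["py_c"] for c in centers)
    return (d_s / py_s) / (d_c / py_c)

def leave_one_out(centers):
    """Recompute the ratio after dropping each center in turn."""
    return {
        dropped["name"]: rate_ratio([c for c in centers if c is not dropped])
        for dropped in centers
    }

# Hypothetical centers: deaths and person-years per study group (not trial data).
toy_centers = [
    {"name": "A", "deaths_s": 40, "py_s": 100_000, "deaths_c": 50, "py_c": 100_000},
    {"name": "B", "deaths_s": 30, "py_s": 80_000, "deaths_c": 40, "py_c": 80_000},
    {"name": "C", "deaths_s": 10, "py_s": 20_000, "deaths_c": 10, "py_c": 20_000},
]
toy_ratios = leave_one_out(toy_centers)
```

If the ratios stay within a narrow band whichever center is dropped, no single center is driving the pooled estimate.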
No deaths were reported as a direct complication (e.g., septicemia or bleeding) associated with a biopsy procedure. Complications associated with screening procedures (including prostate biopsy) have been reported previously.24,25
In an intention-to-screen analysis of data from seven European centers, PSA screening was associated with a significant absolute reduction of 0.71 prostate-cancer death per 1000 men after an average follow-up of 8.8 years (median, 9.0).
This finding corresponds to a relative reduction of 20% in the rate of death from prostate cancer among men between the ages of 55 and 69 years at study entry, given an average screening interval of 4 years and with 82% of the men in the screening group screened at least once (rate ratio, 0.80; adjusted P=0.04).
To prevent one prostate-cancer death, 1410 men (or 1068 men who actually underwent screening) would have to be screened, and an additional 48 men would have to be treated.
The high number of men who would need to be treated could be reduced by avoiding the diagnosis and treatment of indolent cancers during screening or by improving treatment in the remaining men with cancer. The number needed to screen in our study is similar to that in studies of mammographic screening for breast cancer and fecal occult-blood testing for colorectal cancer.26,27
Our analysis shows that the results were generally similar in all participating study centers considered individually (Table 3). The trial was not powered to evaluate mortality differences between centers or for age subgroups. The results were based on a combined analysis of data from centers sharing a common core protocol, which defined the minimal criteria for inclusion and the scope of the primary analysis but allowed wider age ranges or shorter screening intervals.
Because of various recruitment approaches, the estimate of a 20% reduction in prostate-cancer mortality does not represent the effect of a screening program at the population level or the effect on individual subjects but instead represents a mixture of such estimates. Despite some variation in screening procedures, the results from each center were compatible with the main result: a lowering of the death rate from prostate cancer associated with screening.
The screening interval of 4 years was chosen on the basis of the mean lead time of 5 to 10 years in PSA-based screening.28,29 However, the lead time of aggressive cancers, which may be the most important target of screening, is likely to be much shorter.
The benefit of screening was restricted to the core age group of subjects who were between the ages of 55 and 69 years at the time of randomization. The results that were seen in other age groups are preliminary and inconclusive.
Our findings are early results of the trial, and continued follow-up will provide further information. Adjustment for noncompliance resulted in a greater effect among men who actually underwent screening, and after adjustment for both noncompliance and contamination, the effect of screening in the intention-to-screen analysis is likely to be further enhanced.
The rate of overdiagnosis of prostate cancer (defined as the diagnosis in men who would not have clinical symptoms during their lifetime) has been estimated to be as high as 50% in the screening group.30
Consistent estimates of overdiagnosis (a third of cancers detected on screening) have also been obtained by identifying potentially indolent prostate cancers on the basis of clinical and pathological characteristics.31,32,33
Overdiagnosis and overtreatment are probably the most important adverse effects of prostate-cancer screening and are vastly more common than in screening for breast, colorectal, or cervical cancer.34
Although the results of our trial indicate a reduction in prostate-cancer mortality associated with PSA screening, the introduction of population-based screening must take into account population coverage, overdiagnosis, overtreatment, quality of life, cost, and cost-effectiveness.
The ratio of benefits to risks that is achievable with more frequent screening or a lower PSA threshold than we used remains unknown. Further analyses are needed to determine the optimal screening interval in consideration of the PSA value at the first screening and of previously negative results on biopsy.35,36,37,38