Where can I find this paper?
What is this paper about (what is the research question)?
How do the test characteristics of US and CT change according to the duration of abdominal pain in the diagnosis of appendicitis in children?
Summary of the Paper
Design: secondary analysis of a multicentre observational study
Outcome: presence or absence of appendicitis as determined by pathologist’s report of findings at surgery or composite telephone/medical record follow-up where surgery was not undertaken
Primary objective: to determine test performance characteristics of CT and ultrasonography according to the duration of abdominal pain in children being assessed for appendicitis
Population: ED patients aged 3-18 years presenting with acute abdominal pain of <96hrs duration
- Inclusion: “possible appendicitis”, defined as patients who had blood tests, radiological studies (CT or USS or both), or surgical consultation for the purpose of diagnosing appendicitis
- Exclusion: pregnancy, previous abdominal surgery, chronic GI conditions, severe developmental delay, CT or USS prior to ED assessment, pain >72h, no radiological examination performed.
Results: 2,349 patients in parent study, 1,810 in subgroup (1,216 had CT, 832 had USS).
38% had appendicitis (n=680).
With equivocal cases (radiology) removed:
- OR of trend in CT: sensitivity 0.98, specificity 0.91, PPV 1.02, NPV 0.88
- OR of trend in USS: sensitivity 1.40, specificity 1.15, PPV 1.26, NPV 1.26
With equivocal cases (radiology) included as positive:
- OR of trend in CT: sensitivity 0.96, specificity 1.07, PPV 1.19, NPV 0.88
- OR of trend in USS: sensitivity 1.39, specificity 1.10, PPV 1.19, NPV 1.26
The sensitivity and negative predictive value of ultrasonography increase with the duration of pain, and CT is less likely to be indeterminate with a longer duration of pain.
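To make the two ways of handling equivocal radiology reports concrete, here is a minimal sketch of how the four test characteristics fall out of a 2x2 table. The patient counts are invented purely for illustration and are not the study's data; the function names `two_by_two` and `characteristics` are likewise hypothetical.

```python
from typing import Dict, Iterable, Optional, Tuple

Case = Tuple[str, bool]  # (imaging result, appendicitis on reference standard)


def two_by_two(cases: Iterable[Case], equivocal_as: Optional[str]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn).

    equivocal_as=None drops equivocal scans from the table;
    equivocal_as="positive" counts them as positive scans.
    """
    tp = fp = fn = tn = 0
    for imaging, disease in cases:
        if imaging == "equivocal":
            if equivocal_as is None:
                continue
            imaging = equivocal_as
        if imaging == "positive":
            if disease:
                tp += 1
            else:
                fp += 1
        else:  # "normal"
            if disease:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def characteristics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Sensitivity, specificity, PPV and NPV from 2x2 counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Hypothetical cohort (NOT the study data): 55 with appendicitis, 95 without.
cases = (
    [("positive", True)] * 40
    + [("normal", True)] * 10
    + [("equivocal", True)] * 5
    + [("positive", False)] * 5
    + [("normal", False)] * 80
    + [("equivocal", False)] * 10
)

for label, policy in [("equivocal removed", None), ("equivocal as positive", "positive")]:
    result = characteristics(*two_by_two(cases, policy))
    print(label, {name: round(value, 3) for name, value in result.items()})
```

In this made-up cohort, counting equivocal scans as positive nudges sensitivity up but costs specificity and PPV, which is why the paper reports both analyses.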
On the study design
This is a secondary analysis; this means the data used in the study was originally collected as part of another study, and is being analysed in new ways to answer different clinical questions. This approach is not uncommon; large studies generate a lot of data and it may be possible to identify related clinical patterns by subgroup and secondary analysis. Just remember that this is not the purpose for which these patients were recruited.
Duration of symptoms was recorded on a standardised form before the CT or US results were known. Duration of pain is a subjective measure, so a standardised form helps to increase objectivity. The kappa score (for inter-rater reliability) was 0.73 (95% CI 0.67-0.78); not brilliant, particularly if we consider that the “true” kappa could be as low as 0.67. Remember that kappa is agreement corrected for chance, so a kappa of 0.67 is not the same thing as 67% raw agreement.
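Kappa corrects the observed agreement between two raters for the agreement expected by chance alone, so it is usually lower than the raw percentage agreement. A minimal sketch of Cohen's kappa follows; the agreement table is hypothetical (two raters sorting 100 patients into two duration categories) and is not taken from the paper.

```python
from typing import List, Tuple


def cohen_kappa(table: List[List[int]]) -> Tuple[float, float]:
    """Return (observed agreement, kappa).

    table[i][j] = number of patients placed in category i by rater A
    and in category j by rater B.
    """
    k = len(table)
    n = sum(sum(row) for row in table)
    observed = sum(table[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected if the two raters were independent.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa


# Hypothetical ratings: the raters agree on 80 of 100 patients.
agreement_table = [
    [40, 10],
    [10, 40],
]

observed, kappa = cohen_kappa(agreement_table)
print(f"raw agreement = {observed:.2f}, kappa = {kappa:.2f}")
```

Here the raters agree on 80% of patients, yet half of that agreement would be expected by chance, so kappa comes out noticeably lower than the raw figure.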
Abstraction rules were generated to help code US and CT findings as “normal”, “positive” or “equivocal”. The same is not true of the outcome measures: there is no mention of whether or how uncertainty in the reports of the pathologist or surgeon was managed. Ambiguity in the reference standard would affect the validity of the study. It is also unclear whether the pathologist and surgeon were blinded to the pre-operative radiological findings; if they were not, this might introduce observer bias.
What were the results and what does this mean?
The authors give us various sensitivity, specificity, positive and negative predictive values for both CT and US at different time points, with odds ratios to express the relationship between increasing duration of pain and each test characteristic.
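An odds ratio acts on odds, not directly on the percentage, which can make it hard to picture. The sketch below converts a probability to odds, applies an odds ratio, and converts back. It assumes the odds ratio of trend applies once per step up in pain-duration category (the summary above does not state the increment), and the baseline sensitivity of 0.80 is invented for illustration; only the 1.40 comes from the results listed earlier.

```python
def apply_odds_ratio(probability: float, odds_ratio: float, steps: int = 1) -> float:
    """Shift a probability by an odds ratio applied `steps` times."""
    odds = probability / (1 - probability)
    new_odds = odds * odds_ratio ** steps
    return new_odds / (1 + new_odds)


baseline_sensitivity = 0.80  # hypothetical starting value, not from the paper
us_sensitivity_or = 1.40     # OR of trend for US sensitivity (equivocal cases removed)

for step in range(4):
    shifted = apply_odds_ratio(baseline_sensitivity, us_sensitivity_or, step)
    print(f"{step} step(s) later: sensitivity ~ {shifted:.3f}")
```

An odds ratio of 1 leaves the test characteristic unchanged, values above 1 mean it improves with longer pain, and values below 1 mean it worsens.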
Table 2 shows us that as the duration of symptoms (pain) increases, whether equivocal cases are excluded or presumed positive, the sensitivity, specificity and negative predictive value of US increase. For CT, there is an improvement in positive predictive value when equivocal cases are included. But look at the confidence intervals. We can only be 95% confident that the true odds ratio is greater than 1 (i.e. that the test characteristic improves as pain duration increases) for sensitivity and NPV in US.
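Why the confidence interval matters more than the point estimate can be shown with a standard Wald interval for a simple two-group odds ratio. This is a sketch under stated assumptions: it is not the trend model used in the paper, and the counts (appendicitis cases detected or missed by imaging in a "longer pain" and a "shorter pain" group) are invented.

```python
import math
from typing import Tuple


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Odds ratio with a Wald 95% CI.

    a, b = detected / missed in group 1; c, d = detected / missed in group 2.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts: same proportions, small versus large sample.
small = odds_ratio_ci(45, 5, 40, 10)
large = odds_ratio_ci(450, 50, 400, 100)

for label, (estimate, lower, upper) in [("small sample", small), ("large sample", large)]:
    verdict = "excludes 1" if lower > 1 or upper < 1 else "includes 1"
    print(f"{label}: OR {estimate:.2f} (95% CI {lower:.2f}-{upper:.2f}), {verdict}")
```

Both samples give the same point estimate, but only the larger one has an interval that stays above 1. A point estimate above 1 with an interval that crosses 1 is compatible with no real change in the test characteristic.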
What can we take from this paper into clinical practice?
Serial US might be useful in equivocal cases, where clinical signs do not immediately necessitate surgery. The longer the history of symptoms (but <72h), the better a negative ultrasound is at ruling appendicitis out. However, CT seems to offer more acceptable test characteristics regardless of duration of symptoms.
More questions to ask
- What are the test characteristics when prospectively ascertained for clinically equivocal cases (since these are the patients we would not immediately take to theatre)?
- How do these US findings compare to ED US?
Follow us on Twitter: @PEMLit