Critical Appraisal of a Paper

What are WEE Waiting for? The Quick-Wee Method for Faster Clean Catch Urine Collection

Where can I find this paper?

Please note this paper is OPEN ACCESS. You are strongly advised to read the original paper before reading any further.

What is this paper about (what is the research question)?

Does suprapubic cutaneous stimulation with cold fluid-soaked gauze (the “Quick-Wee” method) reduce the amount of time spent waiting for clean catch urine?

Summary of the Paper

Design: single centre, randomised, prospective non-blinded trial

Objective: to evaluate the efficacy of the Quick-Wee method

Outcome of interest: voiding of urine within five minutes (binary outcome)

Intervention: genital cleaning for 10 seconds with sterile water at room temperature, followed by continued rubbing of the suprapubic area in a circular pattern with gauze (soaked in cold saline) held by forceps

Reference standard: genital cleaning for 10 seconds with sterile water at room temperature (standard practice)

Participants: patients presenting to an Australian paediatric emergency department between September 2015 and April 2016

  • Inclusions: pre-continent infants aged 1-12 months in whom a clean catch urine sample was required
  • Exclusions: neonates (defined as <1 month of age); infants with anatomical or neurological abnormalities affecting voiding of urine or sensation; patients needing an immediate sample by an invasive method

Results: 354 subjects were recruited of whom 344 participated in the analysis; 175 in the control group and 179 in the intervention group (5 patients were excluded from each group after randomisation, giving 170 in the control group and 174 in the intervention group).

54/174 (31%) of patients voided within five minutes in the Quick-Wee group

20/170 (12%) of patients voided within five minutes in the control group

The difference in proportions was 19% (95% confidence interval for difference 11-28%).

This gave an NNT of 4.7 to successfully catch one additional sample within five minutes (95% confidence interval 3.4-7.7).
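
The arithmetic behind these figures can be checked in a few lines. The sketch below is not taken from the paper: it assumes a simple Wald (normal approximation) interval, and the function name `risk_difference` is illustrative. Run on the voiding counts it reproduces the 19% difference (11-28%). The quoted NNT of 4.7 (3.4-7.7) appears to match the successful-catch counts (52/174 vs 15/170, given in the results discussion) rather than the voiding counts, for which the reciprocal of the difference would be about 5.2.

```python
from math import sqrt
from statistics import NormalDist

def risk_difference(events_a, n_a, events_b, n_b, level=0.95):
    """Difference in proportions (a minus b) with a Wald confidence interval."""
    p_a, p_b = events_a / n_a, events_b / n_b
    diff = p_a - p_b
    se = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se

# Voiding within five minutes: Quick-Wee 54/174 vs control 20/170
void_diff, void_lo, void_hi = risk_difference(54, 174, 20, 170)

# Voiding with successful catch: Quick-Wee 52/174 vs control 15/170
catch_diff, catch_lo, catch_hi = risk_difference(52, 174, 15, 170)

# NNT is the reciprocal of the absolute difference; its interval comes from
# inverting the limits of the difference (upper limit gives the lower NNT).
nnt, nnt_lo, nnt_hi = 1 / catch_diff, 1 / catch_hi, 1 / catch_lo

print(f"voiding difference {void_diff:.0%} ({void_lo:.0%} to {void_hi:.0%})")
print(f"catch NNT {nnt:.1f} ({nnt_lo:.1f} to {nnt_hi:.1f})")
```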

Authors’ Conclusions:

The Quick-Wee method requires minimal resources and is a simple way to trigger faster voiding for clean catch urine from infants in the acute care setting.

On the study design

Firstly it is important to note that this was a single-centre study in which trained clinicians identified and recruited potential subjects as well as performing the intervention. This introduces a potential for innovation or novelty bias, whereby new treatments or procedures are viewed more favourably (or possibly less favourably) than traditional treatments or methods. This could be exacerbated by the lack of blinding in this study, although it would be practically impossible to blind subjects to the treatment they are receiving in this particular case. In an ideal world, the clinicians recruiting and randomising patients would be different from those performing the procedure, and the results would be interpreted by people blinded to the groups to which patients were randomised – but research rarely (if ever) occurs under ideal circumstances.

That said, a considerable effort has been made to overcome this through randomisation, which was carried out in a 1:1 ratio of consecutive patients using random permuted blocks of different sizes, with allocation concealment (opaque envelopes selected sequentially).

The Quick-Wee procedure itself was well standardised; teaching was delivered through face-to-face intervention and written instruction and standardised packs were used for the initial cleaning phase. A separate pack was prepared for the Quick-Wee intervention itself.

Several secondary outcomes were considered, including successful catch of the specimen, contamination of the sample, and parental and clinician satisfaction with the method.

A sample size calculation was performed, requiring 322 patients (161 in each group) to achieve 80% power to detect a difference in the primary outcome; based on pilot study data, the expected change in proportions was 15% with a baseline expected proportion of 21% in the control group and therefore 35% in the intervention arm (a small inconsistency in these percentages is likely due to rounding). The authors performed an intention-to-treat analysis and planned to recruit an additional 10% of subjects beyond the sample size calculation to account for anticipated attrition.
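
As a rough check on the sample size, the sketch below applies one common textbook formula for comparing two proportions. The method the authors actually used is not described here, so the two-sided 5% significance level and the formula itself are assumptions, and `n_per_group` is an illustrative name. With 21% vs 35% and 80% power it gives 161 per group, in line with the 322 quoted.

```python
from math import ceil, sqrt
from statistics import NormalDist

def n_per_group(p_control, p_intervention, alpha=0.05, power=0.80):
    """Patients per arm needed to detect p_control vs p_intervention (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_intervention) / 2
    # Variance term under the null (pooled) and under the alternative (unpooled)
    null_term = z_alpha * sqrt(2 * p_bar * (1 - p_bar))
    alt_term = z_beta * sqrt(p_control * (1 - p_control)
                             + p_intervention * (1 - p_intervention))
    return ceil(((null_term + alt_term) / (p_control - p_intervention)) ** 2)

per_group = n_per_group(0.21, 0.35)
with_attrition = round(2 * per_group * 1.10)  # planned 10% over-recruitment
print(per_group, 2 * per_group, with_attrition)
```

Adding the planned 10% to 322 gives 354, which is the number of subjects recruited.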

What were the results and what does this mean?

The study achieved the required sample size due in part to the forethought of including 10% more patients to account for attrition.

The 344 subjects analysed were divided into control (170 patients) and intervention (174 patients) groups and in each case successful voiding was determined if it occurred within five minutes of the initial cleaning step. The data collection section mentions paper case record forms but it is not clear whether these were standardised for the research study or the usual clinical documentation. In addition, interobserver reliability is inferred through the use of a timer but in practice there is an opportunity for bias here if the observer is not independent of the clinician carrying out the procedure (forgetting to press “start” and adding a few extra seconds, for example).

The results are certainly impressive; 54/174 patients voided within five minutes with the Quick-Wee method (31% – 95% confidence interval 24%-39%) compared with 20/170 in the control group (12% – 95% confidence interval 7%-18%). The difference in proportions was 19% with a 95% confidence interval of 11%-28% and a P value of <0.001 using the χ2 test.
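
For readers who want to see where the P value comes from, here is a minimal sketch of a Pearson χ2 test on the 2×2 table of voided/not voided by group. It assumes no continuity correction (the review does not say whether one was applied) and uses the identity that, for one degree of freedom, the upper-tail probability equals erfc(√(χ2/2)). The name `chi2_2x2` is illustrative.

```python
from math import erfc, sqrt

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared statistic and P value for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = erfc(sqrt(stat / 2))  # upper tail of chi-squared with 1 df
    return stat, p_value

# Rows: Quick-Wee, control. Columns: voided within five minutes, did not.
stat, p_value = chi2_2x2(54, 174 - 54, 20, 170 - 20)
print(f"chi2 = {stat:.1f}, P = {p_value:.6f}")
```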

The use of binary data here certainly makes for simpler analysis rather than looking at specific timings for each subject; five minutes is not an unreasonable amount of time to wait for a sample but it should be recalled that there is a member of staff tied up in undertaking the Quick-Wee method for potentially the entire five minute duration – this might prove challenging in busy Emergency Departments.

The authors also looked at voiding with successful catch and found similar proportions (Quick-Wee 52/174 [30%: 95% confidence interval 23%-37%]; control 15/170 [9%: 95% confidence interval 5%-14%]). Does the Quick-Wee method make missed voids less likely? Perhaps, due to increased attention focused on the relevant anatomical area..!

The difference in rates of contamination was not statistically significant (27% in the Quick-Wee group [95% confidence interval 15%-43%], 46% in the control group [95% confidence interval 17%-77%]) – this could be an area for further work in a larger sample, given high contamination rates in both groups.

Finally, the satisfaction scores of both parents and clinicians were better in the Quick-Wee group. The data is given in a slightly counter-intuitive way (the Likert scale runs from 1=very satisfied to 5=very unsatisfied) which they have called “higher rate” of satisfaction – it is worth noting that this does not correspond to a higher number! In the Quick-Wee group, median parental and clinician satisfaction was 2, while in the control group the median for both was 3.

What can we take from this paper into clinical practice?

This pragmatic and robust study suggests the method is reliable. It is appealing as a first-line technique over invasive methods such as suprapubic aspiration or catheterisation, and seems worthy of adoption into clinical practice provided you can spare the staff.

More questions to ask

  • Would this technique work in older children, given its theoretical basis in the neonatal cutaneous voiding reflex?
  • Would warmer water work as reliably?
  • Would time be further reduced with a pre-emptive feed (or oral hydration) as in the study by Herreros et al?
  • Could this method also reduce contamination rates?


Probing Questions: Lung Ultrasound in Diagnosis and Management of Bronchiolitis

Thanks to Casey Parker of Broomedocs for this guest contribution – his review is cross-posted here.

Where can I find this paper?

What is this paper about (what is the research question)?

This paper aimed to correlate sonographic lung findings with clinically diagnosed bronchiolitis in infants.  The authors also attempted to provide some prognostic information [the need for oxygen support] based on sonographic lung features.

Summary of the Paper

The subjects were infants admitted for clinically suspected bronchiolitis.  There was also a cohort of “normal controls” used as a comparison.  The children underwent a clinical scoring by the treating Paediatrician and lung ultrasound by both a radiologist and Paediatrician sonographer.  The scans were all completed by two of the authors.

Design: Single-centre, observational cohort study conducted in an Italian Paediatric unit.

Objective: to evaluate the accuracy of lung ultrasonography in the diagnosis and management of bronchiolitis in infants.

Outcome of interest:  correlation between clinical and sonographic lung findings in bronchiolitic infants.  Can LUS findings be used to predict the need for supplemental oxygen requirements?

Participants: One hundred six infants, aged from 9 to 239 days old were enrolled.

  • Inclusions: clinically “suspected bronchiolitis” in infants.  Unclear as to whether these were consecutive cases – only 106 over a 3 year study period.
  • Exclusions: radiological pneumonia, other “concomitant pathology” or the unavailability of the study sonographer.

Results: There was a high level [~90%] of agreement between the clinician’s severity rating and the predetermined sonographic severity scores. There was also a high level of agreement between the two sonographers’ scoring of the LUS findings (K = 89.6%). The lung US scoring predicted the need for oxygen supplementation with good accuracy [sensitivity 96.6%, specificity 98.7%] although there were wide confidence intervals as a result of the small numbers in this study.

Authors’ Conclusions:

In summary, this pilot study demonstrates that the use of LUS in bronchiolitis can be considered as an extension of the clinical evaluation and could be incorporated into clinical algorithms to aid decision-making. Our promising data needs to be confirmed in larger cohort studies also involving critical patients.

On the study design

This study design is typical of many pilot ultrasound papers: small numbers of patients in whom sonography is compared to a gold standard that may not itself be entirely accurate. Bronchiolitis is a clinical diagnosis, with no truly objective diagnostic standard. The use of just 2 experienced Paediatric sonographers in a single centre does raise questions about the external validity of the results and there is a high likelihood of bias here. The clinicians were blinded to the sonographic findings – and therefore the risk of bias here was removed. The use of the “normal cohort” and the “RSV swabs” in the study design was a little confusing and doesn’t really add to the results.

What were the results and what does this mean?

The results suggest that clinically diagnosed bronchiolitis looks like…. sonographic bronchiolitis as per the defined criteria used in this paper.  The protocol used did identify infants with more severe lung disease.  The need for supplemental oxygen was consistent with more severe LUS changes.  However, given the “standard” was clinical examination it is unclear exactly what LUS would add to the prognostication by paediatricians.  The high degree of agreement between the two study sonographers is difficult to extrapolate given they are both highly skilled, ultrasound enthusiasts – a larger mix of observers would be needed to draw any conclusions about our ability to utilise LUS in small kids.

What can we take from this paper into clinical practice?

Lung ultrasound for the diagnosis and severity scoring of bronchiolitis is reasonably accurate.  Does it add anything?  Probably not, unless you are currently using CXR to ‘diagnose’ bronchiolitis.  This paper does provide some useful descriptions of the spectrum of disease and their sonographic appearance.

I think this paper is interesting in that it describes the sonographic spectrum of a common disease of infants.  The study is not really large enough, nor does it have the external validity to make it a “practice changer”.   This pilot can help inform us about the appearance of bronchiolitis – and in the future this may become a more commonplace part of our clinical assessment of children – but for now I am not sure it adds to our quiver.

More questions to ask

  • Can ultrasound reliably differentiate bronchiolitis from important differential diagnoses in infants (e.g. pneumonia, heart failure, upper airway obstruction…)?
  • Are the sonographic findings in bronchiolitis consistent when obtained by sonographers of various experience?
  • Previous papers have compared LUS to conventional CXR for the diagnosis of bronchiolitis – and LUS was favourable.  It would be nice to see a paper looking at children with severe disease in which clinicians often turn to CXR to “reconfirm the working diagnosis” in order to ascertain its utility at that end of the spectrum.

Follow us on twitter: @PEMLit

Bouncing Back: Repeated ED Visits Among Children With Meningitis or Septicaemia

Where can I find this paper?

What is this paper about (what is the research question)?

How often have children, subsequently diagnosed with meningitis or septicaemia, attended an ED and been discharged in the preceding five days?

Summary of the Paper

Design: retrospective cohort study using pan-Toronto hospital database

Objective: to ascertain the proportion of children with an ultimate diagnosis of meningitis and septicaemia who had attended an Emergency Department in the five preceding days

Outcome of interest: proportion of reattendances; ED factors in the group with preceding attendance compared with those admitted at first attendance

Participants: children (aged 30 days to 5 years) with a diagnosis of meningitis or septicaemia with linked data regarding prior attendances in the period 06/04/2005-01/03/2010.

  • Inclusions: children with an ultimate diagnosis of meningitis or septicaemia and a minimum inpatient stay of 4 days (or death in hospital)
  • Exclusions: length of stay <4 days, patients discharged within the preceding 14 days of admission with meningitis/septicaemia

Results: 521 children were admitted with a final diagnosis of meningitis/septicaemia during the study period. 125 had attended an ED in the preceding 5 days with 114 attending with apparent infection. Those with repeated visits had similar lengths of stay, critical care use and 30-day mortality.

Authors’ Conclusions:

Our study reveals that despite the imperative to provide early diagnosis and treatment to children and infants with critical infections, current practices differ markedly from this goal, with 1 in 5 children having repeated ED presentations before admission with meningitis or septicaemia.

On the study design

This was a retrospective cohort study which depended on ICD-10 reporting of diagnoses and database correlation to link admissions with meningitis or septicaemia with prior ED attendances. As with all such studies, findings are dependent on the quality of data recorded, even more so when the analysis is performed on retrospective data.

Nonetheless the study asks a valid question about how good we are at identifying serious bacterial illness the first time around.

What were the results and what does this mean?


The low prevalence of serious bacterial infection is interesting; there is no data given about the number of ED attendances for children who were not given a diagnosis of meningitis or septicaemia, so this reinforces the “needle-in-a-haystack” feeling we have in the UK. These diseases are thankfully rare but identifying them early is a clinical priority.

That 125 children reattended (after not being admitted at first attendance) does not resonate with me in the same way as it does with the authors. I feel this rather reflects my experience that patients who have severe illness do not always suddenly present acutely unwell but rather at a time point along a clinical trajectory, at which reliable clinical signs may or may not be present. Notably children who reattended had lower acuity scores at first presentation, which supports this.

Unfortunately much of the analysis is focused on whether attending a department with dedicated paediatric consultants made a difference. I suspect that this is association rather than causation and would be difficult to prove. In any case we would need to see the background rates of paediatric attendances to each unit to determine whether these district general hospitals were genuinely outliers. There may also be a parental tendency to reattend at a “specialist” hospital or a clinician tendency to admit more patients at a specialist hospital due to a higher acuity presenting there – the paper does not answer this question.

What can we take from this paper into clinical practice?

What this study seems to tell us is that diagnosis is tricky and that time and observation are valuable – and that we should not only make the most of opportunities to observe and review patients but also safety-net properly. Any child with any apparently benign illness may re-present with a deterioration in condition and we must ensure that parents feel confident in returning to us if that occurs.

More questions to ask

  • How on earth can we identify serious bacterial illness in children? Answers on a postcard for a Nobel prize… 🙂

Talking Heads: S100B For Detection of Intracranial Injury in Mild Head Trauma in Children

Where can I find this paper?

What is this paper about (what is the research question)?

Does S100B, a calcium-binding protein located in the cytoplasm and nucleus of astrocytes and Schwann cells, have a role in predicting intracranial injury (or its absence) for mild head trauma in children?

Summary of the Paper

Design: multicentre prospective cohort study

Objective: to determine the test characteristics for S100B in mild head trauma in children with determination of a cutoff to provide diagnostic utility.

Outcome of interest: diagnostic/predictive performance of S100B biomarker for intracranial injury in children with mild head trauma

Reference Standard: presence of intracranial injury (any collection of blood within the cranial vault or cerebral oedema) on CT scan

Participants: children aged <16 years presenting to one of three Swiss paediatric EDs between January 2009 and December 2011

  • Inclusions: patients with mild head injury (acute head trauma with confusion or LOC <30mins or amnesia or transient neurological abnormality) for whom a CT was performed and blood obtained for S100B assay.
  • Exclusions: children arriving >6h after head trauma, children with Down syndrome, patients with a history of seizure in the preceding 28 days

Results: 80 children were enrolled of whom 73 were included in the analysis. 20 (27.4%) had evidence of intracranial injury on CT although none required surgical intervention.

The area under the Receiver Operating Characteristic (ROC) curve for S100B was 0.73 (95% CI 0.60-0.86) which improved to 0.77 (95% CI 0.65-0.89) when under 2s were excluded.

Using a cutoff of 0.14 micrograms/L gave a sensitivity of 95% (95% CI 77%-100%) for all children [100% (95% CI 81%-100%) with under 2s excluded] and specificity 34.0% (95% CI 27%-36%).

Authors’ Conclusions:

The biomarker S100B is a valuable tool to help the physician decide whether head CT is indicated for children aged <16 years with mild head trauma. Its excellent sensitivity indicates that it could be an accurate tool to “rule out” an intracranial injury.

On the study design

This was a small prospective study in which blood samples were taken from children presenting with mild head injury deemed by clinicians to require CT scan and analysed independently of the CT findings to permit calculation of test characteristics for the biomarker S100B.

The authors included patients under 16 presenting to one of three Swiss paediatric EDs with mild head injury (acute head trauma with confusion or LOC <30mins or amnesia or transient neurological abnormality) for whom a CT was requested; these subjects also had a venous blood sample taken for S100B level, the result of which was not available until after the CTs had been reported. They then determined test characteristics for S100B in the context of CT findings. The sample size was pretty small – 80 children were enrolled of whom 7 were excluded, either because they didn’t have the blood test (at all or within 6h) or because they didn’t have the CT scan. This affects the applicability of the study.

Performing bloods on children in the ED is a tricky one; children with major trauma presentations frequently have blood tests taken but these children might not. It’s worth considering how many additional blood tests we might be performing if S100B is adopted into everyday practice.

The other interesting thing is the classification of “mild head injury”. These children were selected because they were having CT head (the reference standard for determining the presence or absence of intracranial injury) but the population does not completely correlate with those head injured children who would have a CT indicated according to the NICE head injury guidelines – which is going to affect whether we can directly extrapolate the results to our ED head injured population as there may be some children we would want to CT who would not have been included in this study.

What were the results and what does this mean?

Only 73/80 were included in the analysis, of whom 20 had an intracranial injury. No surgical interventions were required in any case so we may be missing this proportion of severely head injured patients which, combined with the inclusion of only “mild head injuries” means that we have really only looked at a slice of our PED head injury population.

The ROC curve for S100B had an AUC of 0.73 (95% CI 0.60-0.86) which improved to 0.77 (95% CI 0.65-0.89) when under 2s were excluded.

Using a cutoff of 0.14micrograms/L gave a sensitivity of 95% (95% CI 77%-100%) for all children (100% (95% CI 81%-100%) with under 2s excluded) and specificity 34.0% (95% CI 27%-36%). This looks good, but look at the width of those confidence intervals, reflective of the small sample size. If the true sensitivity is 77% that’s no good at all – so we definitely need confirmation with a bigger study and ideally wider inclusion, so we can apply the findings to all our head injured patients.
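
To make the trade-off concrete, the sketch below rebuilds an approximate 2×2 table from the reported figures. The cell counts are back-calculated assumptions (19 of 20 injured children above the cutoff, 18 of 53 uninjured children below it), not numbers quoted in the paper summary, and the variable names are illustrative.

```python
# Assumed counts, inferred from sensitivity 95% of 20 and specificity 34.0% of 53
true_pos, false_neg = 19, 1    # children with intracranial injury on CT
true_neg, false_pos = 18, 35   # children without intracranial injury on CT

total = true_pos + false_neg + true_neg + false_pos
sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)

# Children below the cutoff are the ones who might be spared a CT
below_cutoff = true_neg + false_neg
ct_avoided_fraction = below_cutoff / total

# What the lower confidence limit for sensitivity would mean in practice
missed_per_100_at_lower_limit = round((1 - 0.77) * 100)

print(f"sensitivity {sensitivity:.0%}, specificity {specificity:.1%}")
print(f"CTs potentially avoided: {below_cutoff}/{total} ({ct_avoided_fraction:.0%})")
print(f"missed per 100 injured if sensitivity were 77%: {missed_per_100_at_lower_limit}")
```

On these assumed counts, 19 of 73 children (about a quarter) fall below the cutoff, at the cost of one missed injury.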

What can we take from this paper into clinical practice?

There’s definitely potential for S100B to be used as a lesser evil compared with radiation exposure on the developing brain. However the evidence (and, in all likelihood, the assays in your laboratory) isn’t there yet. Watch this space… I suspect there is more to come on S100B.

More questions to ask

  • How would S100B perform for all head injured children in a bigger study?
  • Do we need to exclude the under-2s to improve test characteristics – and what should we do with those children?
  • What is the level of sensitivity we will accept at the cost of specificity?

Clinician Suspicion in Blunt Torso Trauma – Place Your Bets

Where can I find this paper?

What is this paper about (what is the research question)?

Are clinicians better at predicting intra-abdominal injuries in children with blunt torso trauma than a derived clinical prediction rule?

Summary of the Paper

Design: Secondary analysis of some existing PECARN group data from a prospective cohort study of children with blunt torso trauma

Objective: to compare the test characteristics of clinician suspicion with a derived clinical prediction rule to identify children at very low risk of intra-abdominal injuries undergoing acute intervention

Outcome: test characteristics for clinician suspicion, measured against presence or absence of need for acute intervention for intra-abdominal injury.

Comparison: test characteristics of a derived clinical prediction rule from the same population.

Participants: 12044 patients recruited between May 2007 and January 2010 and eligible to participate in the parent study underwent secondary analysis.

  • Inclusions: children <18 years old with blunt torso trauma presenting to participating PECARN Emergency Departments
  • Exclusions: injury >24h prior to attendance; pre-existing neurological disorders affecting examination findings; pregnancy; transfer from another institution.


Results: 3016/9252 deemed low risk (<1%) for clinician suspicion had CT abdomen performed; 35 patients subsequently had acute intervention. Of the remaining patients with clinician suspicion ≥1%, 168/2667 had an acute intervention.

Negative clinician suspicion had the following test characteristics:

  • sensitivity 82.8% (95% CI 77.0-87.3%)
  • specificity 78.7% (95% CI 77.9-79.4%)
  • NPV 99.6% (95% CI 99.5-99.7%)
  • LR- 0.2 (95% CI 0.2-0.3)

Low risk on the prediction rule had the following test characteristics:

  • sensitivity 97.0% (95% CI 93.7-98.6%)
  • specificity 42.5% (95% CI 41.6-43.4%)
  • NPV 99.9% (95% CI 99.7-99.9%)
  • LR- 0.1 (95% CI 0.0-0.2)

Authors’ Conclusions:

A clinical prediction rule had a significantly higher sensitivity for identifying intra-abdominal injury undergoing acute intervention, but a lower specificity. The higher specificity of clinician suspicion did not translate into clinical practice as clinicians frequently obtained abdominal CT scans in patients they considered to be at very low risk.

On the study design


This was a secondary analysis of data collected as part of an original PECARN study on abdominal trauma in children. It’s always worth remembering that while secondary analysis can reveal some very useful information and trends, this was not the original purpose for which the study group was recruited or the study powered (although the authors tell us this study was preplanned, and the standardised data collection forms used to collect information about clinician decision making supports this).

The study has an issue in that the “gold standard” abdominal CT was not applied to all patients, only those deemed to be at risk of injury. This means there is a large portion of patients who had no imaging and no intervention who may still have had intra-abdominal injury although without a need for clinical intervention the significance of this is doubtful.

Good attempts were made to follow subjects up to ensure no clinically important outcomes were omitted.

What were the results and what does this mean?

There is an important distinction in this paper between the presence of an abdominal injury and one requiring intervention (specified as death, therapeutic intervention at laparotomy, angiographic embolisation, blood transfusion for anaemia or administration of intravenous fluids for at least two nights). This composite reference standard is pragmatic but we could argue about whether intra-abdominal injuries not requiring intervention are also clinically relevant or not, considering the comparative risks of radiation exposure with abdominal CT.

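It is worth noting that not all of the 12044 subjects enrolled had CT abdomen performed. Clinician suspicion was recorded for 11919 of them (the 9252 deemed low risk plus the 2667 with suspicion ≥1%), and only 3016 of the 9252 low-risk patients were scanned. We cannot be sure that the unscanned patients were free of injury, given the fact that neither clinician suspicion nor clinical prediction rule achieved 100% sensitivity.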

The study found that in patients with intra-abdominal injury requiring intervention, the clinician correctly identified the risk as ≥1% in 82.8% (95% CI 77.0-87.3) of cases, and in patients who did not have intra-abdominal injury requiring intervention, the clinician correctly identified that the risk was <1% in 78.7% (95% CI 77.9-79.4%) of cases. Unfortunately this shows that clinician judgement alone is neither sensitive nor specific enough to support decision making in isolation. This is borne out in a high CT abdomen rate in the population, despite a high proportion of low risk patients.
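
These figures can be rebuilt from the counts in the summary. The sketch below assumes the 2×2 table implied by those counts (35 interventions among 9252 low-suspicion patients, 168 among 2667 with suspicion ≥1%); the class and property names are illustrative.

```python
from dataclasses import dataclass

@dataclass
class TwoByTwo:
    true_pos: int   # suspicion >=1%, acute intervention
    false_pos: int  # suspicion >=1%, no intervention
    false_neg: int  # suspicion <1%, acute intervention
    true_neg: int   # suspicion <1%, no intervention

    @property
    def sensitivity(self):
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self):
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def npv(self):
        return self.true_neg / (self.true_neg + self.false_neg)

    @property
    def negative_lr(self):
        return (1 - self.sensitivity) / self.specificity

suspicion = TwoByTwo(true_pos=168, false_pos=2667 - 168,
                     false_neg=35, true_neg=9252 - 35)
print(f"sensitivity {suspicion.sensitivity:.1%}, specificity {suspicion.specificity:.1%}")
print(f"NPV {suspicion.npv:.1%}, LR- {suspicion.negative_lr:.1f}")
```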

The decision rule, which determined risk as “low” only if all of the following applied (and “not low” if any one did not):

  • no evidence of abdominal wall trauma or seat belt sign
  • GCS >13
  • no abdominal tenderness
  • no evidence of thoracic wall trauma
  • no complaints of abdominal pain
  • no decreased breath sounds
  • no history of vomiting after the injury

had better sensitivity (so the absence of these signs performs better as a predictor of the lack of need for CT and intervention) but poorer specificity (i.e. the presence of any sign does not accurately predict a need for intervention).
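
The logic of the rule amounts to an all-or-nothing check: a child is low risk only when every criterion in the list is met. A minimal sketch of that logic follows; the field names and the `is_low_risk` function are illustrative inventions, and this shows the structure of the rule only, not a validated tool.

```python
from dataclasses import dataclass

@dataclass
class TorsoTraumaFindings:
    abdominal_wall_trauma_or_seat_belt_sign: bool
    gcs: int
    abdominal_tenderness: bool
    thoracic_wall_trauma: bool
    abdominal_pain: bool
    decreased_breath_sounds: bool
    vomiting_after_injury: bool

def is_low_risk(f: TorsoTraumaFindings) -> bool:
    """True only if every low-risk criterion is met; any single finding fails it."""
    return (
        not f.abdominal_wall_trauma_or_seat_belt_sign
        and f.gcs > 13
        and not f.abdominal_tenderness
        and not f.thoracic_wall_trauma
        and not f.abdominal_pain
        and not f.decreased_breath_sounds
        and not f.vomiting_after_injury
    )

# Two hypothetical patients: one with no findings, one with tenderness only
well_child = TorsoTraumaFindings(False, 15, False, False, False, False, False)
tender_child = TorsoTraumaFindings(False, 15, True, False, False, False, False)
print(is_low_risk(well_child), is_low_risk(tender_child))
```

Because a single positive finding is enough to leave the low-risk group, the rule trades specificity for sensitivity, which is the pattern seen in the results.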

Of note there were three patients whose injuries were not identified by clinician prediction or derived clinical prediction rule, so neither predictor achieved 100% sensitivity.

What can we take from this paper into clinical practice?

We as clinicians rely a lot on clinical judgement but that alone is a poor predictor of the need for intervention for intra-abdominal injury, especially when compared with this non-validated derived prediction rule. Following validation the prediction rule may have some diagnostic utility, especially when combined with observation.

More questions to ask

  • How will this decision rule perform when validated?
  • How would the rule perform if the specificity of clinician judgement was incorporated?

See Also:

St.Emlyn’s – RCR Guidelines on imaging in paediatric trauma

Oxygen Saturation Targets in Bronchiolitis – Magic Numbers?

Where can I find this paper? – this paper is currently open access

What is this paper about (what is the research question)?

Is a target oxygen saturation of 90% or higher equivalent to 94% or higher for resolution of illness in acute viral bronchiolitis?

Summary of the Paper

Design: multicentre, parallel group, randomised controlled equivalence trial with allocation concealment.

Objective: to determine whether accepting a reduced lower limit target oxygen saturation in infants with viral bronchiolitis affected time to resolution of illness

Primary outcome measure: time to resolution of cough (parental reporting)

Intervention: subjects were randomised following decision to admit, to either standard SpO2 monitoring or a modified oximeter which skewed the reading such that SpO2 90% read as 94%. All other care was standard.

Participants: 615 subjects randomised between 03/10/2011-30/03/2012 and 01/10/2012-29/03/2013. 308 randomised to standard group, 307 to modified oximeter group

  • Inclusions: infants aged 6 weeks to 12 months (corrected gestational age) with clinically diagnosed bronchiolitis admitted to hospital for supportive care following presentation to the Emergency Department or Acute Assessment Area
  • Exclusions: preterm (<37 weeks) who had received oxygen in past 4 weeks; cyanotic or haemodynamically significant heart disease; CF or interstitial lung disease; documented immunodeficiency; direct admission to HDU/ICU; previously randomised

Results: Median time to cough resolution was 15.0 days in both groups with a median difference of 1.0 days (95% CI -1 to 2). This fell between the prespecified equivalence limits of plus and minus two days.
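
Equivalence here is judged on the confidence interval rather than a P value: the whole interval for the difference has to sit inside the prespecified limits. A minimal sketch of that check follows, with some caveats. The figures are as reported (rounded to whole days), the upper bound sits exactly on the limit, and treating the limits as inclusive is an assumption. The name `within_equivalence_limits` is illustrative.

```python
def within_equivalence_limits(ci_lower, ci_upper, margin):
    """True if the whole confidence interval lies within +/- margin (inclusive)."""
    return -margin <= ci_lower and ci_upper <= margin

# Median difference in days to cough resolution: 95% CI -1 to 2, limits +/- 2
equivalent = within_equivalence_limits(-1, 2, margin=2)

# A hypothetical wider interval, for contrast, would fail the check
not_equivalent = within_equivalence_limits(-1, 3, margin=2)
print(equivalent, not_equivalent)
```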


Authors’ Conclusions:

In children with acute viral bronchiolitis, the time taken for symptoms to resolve was the same whether they were managed to a target oxygen saturation of 90% or 94%.

On the study design


This study used eight centres to recruit a sample with 80% power to detect non-equivalence of greater than two days in time to resolution of cough. Cough resolution was determined by parents at pre-determined follow-up phonecalls (7, 14, 28 days and 6 months). Some allowances were made for inaccurate recording of this data using random selection of a date between the last time the cough was known to be present and the first date it was noted to be absent (if available). This method of reporting does still leave the outcome open to some parental bias and accuracy of reporting cannot be guaranteed.

Allocation to a group was concealed until definite enrolment, and the allocation was masked to study staff, hospital staff and parents. It’s not clear why the authors have chosen to use the word “masking” rather than “blinding”.

Several interesting secondary outcomes were also recorded, although it is always worth remembering that studies are designed and powered to detect differences in the primary outcome and may be underpowered to detect differences in secondary outcomes. The authors decided in advance to statistically analyse time until “fit for discharge” and actual discharge date for both groups, along with parental anxiety scores and whether the child was fit to attend daycare.

What were the results and what does this mean?


Following some loss to follow-up and protocol violations, 293 subjects were analysed in the standard group at 6 months and 291 in the modified oximeter group. This still reflects a study population greater than that determined by the power calculation. There was no difference in the median time to cough resolution which was 15.0 days in both groups.

The authors addressed both intention to treat analysis (analysing those subjects with protocol violations – being given the wrong oximeter probe – according to their original allocated group) and per-protocol analysis (analysing them only if they fulfilled the allocation from start to finish) and found this did not affect the results.

The modified group also had quicker return to adequate feeding and “back to normal” time. Patients in the modified group, predictably, received supplemental oxygen in fewer cases, for a shorter period, were considered fit for discharge sooner and were discharged sooner. There were fewer serious adverse events and adverse events in the modified group (35 SAEs in 32 infants in the standard group vs 25 SAEs in 24 infants in the modified group). The modified group had increased HDU admissions (13 episodes in the modified group vs 8 in the standard group) but fewer reattendances (26 in the standard group vs 12 in the modified group).

The authors postulate that having a higher target oxygen saturation influences decisions about fitness for discharge and that the increased use of oxygen in the standard group might have adversely affected feeding through drying of nasal passages, reflected in the time to adequate feeding. They also suggest that increased time in hospital in the standard group might expose these infants to nosocomial infection, causing the increased readmission rate – but of course this is all speculation 🙂

What can we take from this paper into clinical practice?

It seems that infants subjectively recover from bronchiolitis at the same rate even if we target SpO2 90% or above instead of 94% or above. However this was a population for whom a need for admission to hospital had already been identified and the extrapolation of this to the Emergency Department population is not wholly appropriate. We can be reasonably relaxed about SpO2 90-94% in these patients but until further work is done to reflect our undifferentiated population we should probably be careful about assuming we can safely discharge these infants.

More questions to ask

  • Would we see the same resolution and patterns of return to normal behaviour/complications in the undifferentiated ED population of infants with bronchiolitis?

See Also:

Don’t Forget the Bubbles – Tessa Davis reviews a JAMA paper on oxygen saturations in admission decision-making in patients with bronchiolitis –


Follow us on twitter: @PEMLit

15th May: Should We Cool Children Following Out-Of-Hospital Cardiac Arrest?




Where can I find this paper?

What is this paper about (what is the research question)?

Does therapeutic hypothermia increase the proportion of patients surviving at one year with good functional status following paediatric out-of-hospital cardiac arrest?

Summary of the Paper

Design: single-blinded, multicentre randomised controlled trial

Objective: to determine whether therapeutic hypothermia after out-of-hospital cardiac arrest confers a benefit in children

Primary outcome measure: survival at 12 months with good neurological function (defined as an age-corrected standard score of 70 or more on the Vineland Adaptive Behaviour Scales (VABS-II))

Intervention: subjects were randomly assigned in a 1:1 ratio (permuted blocks stratified by age) to therapeutic hypothermia (target temperature 33°C) for 48h then normothermia (target temperature 36.8°C) for 72h, or normothermia for 120h. Active temperature control was undertaken in either case to achieve the target temperature.

Participants: 295 patients randomised between September 2009 – December 2012. 155 were randomised to hypothermia, 140 to normothermia.

  • Inclusions: patients aged 48hrs-18 years presenting following out-of-hospital cardiac arrest to one of 38 sites in the US and Canada, having required chest compressions for at least two minutes and with an ongoing requirement for mechanical ventilation after return of spontaneous circulation (ROSC)
  • Exclusions: inability to undergo randomisation within 6h, score of 5-6 on the motor component of the Glasgow Coma Scale, decision to withhold aggressive treatment, major trauma as cause of arrest, patients with pre-existing VABS-II score <70


Survivors at 12 months with VABS-II score of 70 or more

Hypothermia 27/138 (20%)

Normothermia 15/122 (12%)

Risk difference 7.3 (95% confidence interval -1.5 to 16.1)

Relative likelihood 1.54 (95% confidence interval 0.86 to 2.76, P=0.14)
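For anyone wanting to see where a risk difference and its confidence interval come from, here is a minimal sketch using the counts quoted above. It uses a simple Wald (normal approximation) interval, which is my assumption; the paper may have used a different method. The function name is hypothetical.

```python
import math


def risk_difference(events_a, n_a, events_b, n_b, z=1.96):
    """Difference in proportions (a minus b) with a Wald 95% confidence interval."""
    p_a = events_a / n_a
    p_b = events_b / n_b
    diff = p_a - p_b
    # Standard error of the difference between two independent proportions
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Hypothermia 27/138 vs normothermia 15/122, as reported in the summary.
diff, low, high = risk_difference(27, 138, 15, 122)
print(f"Risk difference {100 * diff:.1f} points "
      f"(95% CI {100 * low:.1f} to {100 * high:.1f})")
```

The interval includes zero, which is why the result is reported as not statistically significant. The relative likelihood is not recomputed here because the published figure may have been adjusted for stratification.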

Authors’ conclusions

In comatose children who survive out-of-hospital cardiac arrest, therapeutic hypothermia, as compared with therapeutic normothermia, did not confer a significant benefit with respect to survival with good functional outcome at one year.

On the study design

The study utilised multicentre collaboration to recruit a sample with 85% power to detect a 15-20% difference in the primary outcome between treatment groups. This was a pragmatic design; although the subjects and those providing care to the patients could not be blinded to the intervention, reasonable steps were taken to ensure that the investigators recording the primary outcome were a) independent from those delivering care and b) blinded to the arm of the study to which subjects had been randomised.

Unlike other studies, the normothermia in this case was also an active decision; the patients’ temperature was actively controlled according to the group to which they were randomised.

The authors tell us that other than the temperature targeted, care between the groups was identical, although they later state that “all other aspects of care were determined by the clinical teams.” This does leave us to wonder whether and how knowledge of the treatment arm and expectation of its efficacy (or otherwise) might have influenced those treating clinicians.

What were the results and what does this mean?

There was no statistically significant difference in survival with a good neurological outcome at 12 months between the two groups. In the secondary outcomes, there was no difference in absolute survival between the groups, nor in the reduction in neurological performance score. However, there was an increased incidence of hypokalaemia and thrombocytopenia in the hypothermia group and an increased requirement for renal replacement therapy in the normothermia group.

Results were analysed using intention to treat analysis, which includes subjects in the final analysis of the arm to which they were randomised irrespective of whether they dropped out of the study or received an alternative treatment in the end. This is a conservative approach which can help to ameliorate the effects of unpleasant side effects of treatments; there’s a nice explanation of intention to treat here. It helps give us a realistic expectation of the results we might see in clinical practice.

What can we take from this paper into clinical practice?

In this study the null hypothesis was no difference between the groups. The study doesn’t prove that hypothermia is harmful or not beneficial; there is simply insufficient evidence to reject the null hypothesis of no difference, based on this study. We should continue to follow local protocols in terms of cooling but this paper does give clinicians a little additional confidence in deviating from protocols if indicated.

More questions to ask

  • Is there evidence for cooling patients following in-hospital cardiac arrest?
  • Would a larger sample size demonstrate a benefit and is this feasible?

See Also:

This post at St Emlyns: JC: Getting Chilly Quickly 4. Doing It For The Kids

This post at Academic Life in Emergency Medicine: Therapeutic Hypothermia After Paediatric Cardiac Arrest Out-Of-Hospital

This post at Resus.Me: Post Arrest Hypothermia in Children Did Not Improve Outcome


5th July 2013: Comparison of cosmetic outcomes of absorbable versus nonabsorbable sutures in pediatric facial lacerations


Where can I find this paper?

What is this paper about (what is the research question)?

Do non-absorbable and absorbable sutures give comparable cosmetic results for repair of simple facial wounds in kids?

Summary of the Paper

Design: multicentre, randomised controlled, single blinded trial with allocation concealment

Objective: to compare long-term cosmetic outcomes of absorbable versus non-absorbable sutures based on physician scoring of facial lacerations in the paediatric population

Outcomes: primary – visual analogue scale assessment of wound acceptability made by physicians, blinded to suture material, at 3 months. Secondary – caregiver completion of same visual analogue scale plus completion of satisfaction questionnaire

Intervention: closure of wound by standard approach using 5.0 fast-absorbing surgical gut (FAC) without removal of sutures

Reference Standard: closure of wound by standard approach using 5.0 non-absorbable suture (NYL) with removal of sutures at 4-7 days

Participants: patients presenting to two urban paediatric EDs in Philadelphia April 2008-April 2010

Inclusion – English speaking patients aged 1-18 years with isolated, non-contaminated linear facial wounds between 1-5cm in length assessed by clinicians as requiring closure by suture

Exclusion – irregular or contaminated wounds/bites, wounds>8h old, patients with complex wounds, immunodeficiency, bleeding/clotting disorder, pregnancy, diabetes, renal dysfunction, or allergy to local anaesthetic

Results: 98 patients were recruited of whom 49 had closure with FAC and 49 with NYL. 85 were followed up at 4-7 days (42 FAC, 43 NYL) and 76 at 3 months in person or by telephone (FAC 37, NYL 39). Telephone follow-up did not include VAS score.

61 patients had completed VAS scores at 3/12 (FAC 29, NYL 32)

Mean VAS scores by physicians:

FAC 57.6, NYL 67.6

Difference in means -10 (95% CI for difference in means -19.6 to -0.4) 

Authors’ conclusions

We are not yet able to conclude that absorbable sutures are equivalent to nonabsorbable sutures with respect to cosmetic outcomes of facial lacerations in children.

On the study design

There is little information on how patients were recruited but, other than the restriction to English-speaking patients, the inclusion and exclusion criteria seem sensible.

The allocation concealment and blinding are helpful in reducing bias, but I would question whether leaving absorbable sutures until completely absorbed is standard practice – it isn’t mine, and therefore this impacts the external validity of the study.

The plan for follow-up at 3/12 seems sensible and is rationalised by the authors but this seems early to fully assess the “long-term” impact of wound closure.

While the exact suture material does not necessarily replicate standard UK practice it is reasonable to assume little difference between non-absorbable and absorbable suture material around the globe.

What were the results and what does this mean?

The trial is a non-inferiority trial – the aim is to show that using absorbable suture material does not give a perceptibly inferior cosmetic result. The visual analogue scoring undertaken by blinded physicians (and averaged between three scorers) showed not only lower VAS satisfaction scores for the absorbable suture group but a 95% confidence interval which did not cross zero, suggesting the study was unable to demonstrate non-inferiority. The validity of the VAS has been assessed elsewhere but there is a considerable difference between physician and caregiver scores.
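Strictly speaking, non-inferiority is judged against a prespecified margin rather than against zero, so it is worth sketching how that check works. The margins below are purely hypothetical values chosen for illustration (the margin the authors used is not quoted in this post); only the lower confidence bound of -19.6 comes from the results above.

```python
def non_inferior(ci_lower, margin):
    """Non-inferiority is shown only if the lower CI bound stays above -margin."""
    return ci_lower > -margin


# Lower bound of the 95% CI for the difference in mean VAS (FAC minus NYL).
reported_ci_lower = -19.6

# Hypothetical margins, for illustration only.
margin_checks = {m: non_inferior(reported_ci_lower, m) for m in (10, 15, 20)}
for margin, shown in margin_checks.items():
    print(f"margin {margin} points: non-inferiority shown = {shown}")
```

The point of the sketch is that the conclusion depends entirely on where the margin was set in advance, which is why it matters that the margin is stated in the methods.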

It is also important to remember that despite sample size calculations which predicted attrition of 40%, only 61/98 recruited patients actually completed the full study protocol and had photographs for assessment by VAS – so the study was insufficiently powered.

What can we take from this paper into clinical practice?

It appears that if we use absorbable sutures and don’t remove them, there are noticeable differences in wound healing at 3/12; there’s insufficient evidence in this paper to convince us that not removing sutures provides a comparable cosmetic result in the first three months.

More questions to ask

  • Are there benefits to using absorbable sutures and then removing them (in the same timeframe as we would normally remove non-absorbable sutures)?
  • Would we see non-inferiority at a later review – 18 months after closure perhaps?
  • Would we see non-inferiority in an appropriately powered study?


12th April 2013: Electrolyte Profile of Paediatric Patients with Hypertrophic Pyloric Stenosis


Where can I find this paper?

What is this paper about (what is the research question)?

In paediatric patients with hypertrophic pyloric stenosis (HPS), what is the prevalence of abnormal laboratory results?  Are these results related to the duration of illness (by duration of vomiting), and is there any time trend in these results?

Summary of the Paper

Design: Retrospective chart review

Objective: To investigate the incidence and prevalence of abnormal laboratory results in patients with a radiological and operative diagnosis of HPS


Primary – prevalence of high, low and normal CO2, K and Cl in HPS cases

Secondary – trend in prevalence of metabolic alkalosis and acidosis in HPS cases over the study period

Tertiary – association between days of vomiting and abnormal CO2, K and Cl

Reference Standard: Normal range laboratory results for the facility

Participants: Patients younger than 6 months, with HPS confirmed on ultrasound or Upper GI series, who underwent pyloromyotomy at a tertiary regional paediatric centre from 2000-2009.


205 patients were included in the study. Their age varied from 1.4 to 13.9 weeks (SD 2.2), with a weight range of 2.1 to 4.9kg (SD 0.5). 88.3% were male and 74.3% were of non-Hispanic ethnicity. By race, 80.5% were white, 1.5% African-American, 1.5% Asian and 16.5% other.

Serum CO2 was normal in 62% of HPS cases, low in 20% and high in 18%. Potassium was normal in 57%, low in 8% and high in 35% of cases. Chloride was normal in 69%, low in 25% and high in 6% of cases.

Logistic regression analysis of the proportion of normal, low and high CO2 over the study period showed an increase in the prevalence of metabolic alkalosis (p=0.009) and a decrease in metabolic acidosis (p=0.002).

Advancing age was associated with presence of metabolic alkalosis on presentation with HPS (data not provided).

There was no correlation between the number of days of vomiting and abnormalities in electrolytes in this study population.

Authors’ conclusions

We observed that normal laboratory values are the most common finding in HPS and that metabolic alkalosis was found more commonly in the latter part of the decade and in older infants.

On the study design

This was a retrospective chart review for a 10 year period from 2000-2009.  Data from 2000-2002 was combined to increase power because the case numbers in single years were “small and unstable.”  It’s not clear what they mean by “unstable” as the raw data is not provided.

The authors do not comment on the total number of presentations over the study period, so it’s unclear if any cases were excluded, and reasons for any such exclusions.

There is demographic data missing with respect to birth weight (138/205), days of vomiting (196/205), heart rate at presentation (203/205) and weight at presentation (204/205).  The latter categories are unlikely to have been affected by this, and it is unclear whether additional data on duration of vomiting would have changed the analysis.

Prospective studies have the advantage of more complete data sets and the potential for further variables to be included, but they can introduce observation/measurement bias.

What were the results and what does this mean?

Normal laboratory values are the most common finding in HPS and therefore serum electrolytes are a poor marker for the presence or absence of HPS.

  • CO2: normal 62%, low 20%, high 18%
  • K: normal 57%, low 8%, high 35%
  • Cl: normal 69%, low 25%, high 6%

The incidence of metabolic alkalosis increased over the study period, and its prevalence is higher in older infants. 

They have no explanation for the increase in metabolic alkalosis over the decade of the study.

The authors postulate that the latter finding may demonstrate that advanced age at diagnosis serves as a marker for the duration and severity of stenosis.

What can we take from this paper into clinical practice?

This paper agrees with previous studies that the “typical metabolic picture” of hypochloraemic hypokalaemic metabolic alkalosis in paediatric HPS is no longer seen in the majority of presentations.

For us this means that we cannot rely on laboratory results as a marker for hypertrophic pyloric stenosis in infants.  We must continue to have a high index of suspicion for this condition in infants presenting with persistent vomiting and proceed to ultrasound for diagnosis.

Although laboratory results don’t help us decide which children need ultrasound, it is important to look for metabolic derangements and correct them as indicated.

What this study adds is that, contrary to previous beliefs, there is no relationship between the duration of illness (and particularly of vomiting) and the severity of metabolic derangements in these children. This seems counter-intuitive, and perhaps the more important factor is not the duration of vomiting, but whether the infants are able to keep down an adequate amount of fluids – i.e. the severity of dehydration.

Unfortunately there was insufficient data in the patient charts to enable analysis of trends between dehydration, vomiting and abnormal laboratory results. Only 43/205 (21%) charts mentioned hydration status; however, 42% of the patients whose charts noted dehydration (36/205) had metabolic alkalosis at presentation, compared to 44% with normal CO2.

More questions to ask

  • Was there a higher proportion of males in this group than in other populations?
  • Why was delayed presentation (60 days of vomiting in one case) not associated with more severe illness?
  • An insufficient number of charts contained information about hydration status – is this more relevant for laboratory abnormalities than days of vomiting?


5th April 2013: Prospective Pilot Derivation of a Decision Tool for Children at Low Risk for Testicular Torsion


Where can I find this paper?

What is this paper about (what is the research question)?

Is it possible to exclude a diagnosis of testicular torsion on the basis of history and examination alone?

Summary of the Paper

Design: prospective cohort study for derivation of a clinical decision rule

Objective: to derive a pilot clinical decision tool with 100% NPV for testicular torsion

Outcome: Proposed low-risk decision tree determined by recursive partitioning based on historical and examination variables recorded prior to ultrasonographic or specialist assessment

Reference Standard: presence of testicular torsion defined by: diminished blood flow on testicular doppler US (read by paediatric radiologist), or ischaemic/infarcted testicle at operative assessment (by paediatric surgeon or urologist), or presence of testicular atrophy at 1- to 3-month follow-up (contralateral difference in testicular size as measured by orchidometer)

Participants: Convenience sample of male patients aged 0-21 years with acute (<72h) testicular pain presenting to a tertiary children’s ED between July 2005-February 2008

Results: 228 patients (of 552 eligible patients) were enrolled. 55 eligible patients (10%) were diagnosed with testicular torsion, of whom 21 were among those recruited into the study (9.2% of the 228 enrolled).

Odds ratios:

  • Horizontal/inguinal testicular lie OR=18.17 (95%CI 6.2-53.2)
  • Unilaterally or bilaterally absent cremasteric reflex OR=11.01 (95%CI 3.14-38.64)
  • Nausea or vomiting OR=5.63 (95%CI 2.08-15.22)
  • Age 11-21 years OR=3.9 (95%CI 1.27-11.97)
  • Scrotal oedema OR=3.42 (95%CI 1.21-9.69)

Authors’ Conclusions:

Patients with normal testicular lie, without nausea or vomiting, and between the ages of 0-10 years are at low risk for having testicular torsion despite the presence of acute testicular pain. Thus, patients who do not meet all three of these criteria should be considered at risk for possible testicular torsion and should undergo subsequent emergent evaluation.

On the study design

The inclusion and exclusion criteria seem sensible; patients aged 0-21 years with testicular pain of <72h duration were included, and subsequently excluded if they had prior ipsilateral inguinal or urological surgery, definite hydrocoele or inguinal hernia, or a known diagnosis at initial evaluation. The authors have tried to maximise their awareness of the patient population by using database searches during the study period to identify “missed” participants.

Unfortunately the convenience sample meant that more than half of patients presenting during the study period who were diagnosed with testicular torsion were not included in the data collection. This means the study was underpowered for the question it intended to ask. Convenience sampling is often significantly cheaper and easier than a 24-hr recruiting presence in the ED but as this paper demonstrates it can have a profound effect on the numbers recruited, particularly in conditions which are relatively rare.

Various measures have been utilised to minimise the effect of bias; standardised data collection forms are always helpful in this regard. The initial ED assessments were made prior to ultrasound or speciality assessment which acts as a blind assessment, although surgeons and radiologists determining the outcome were not blinded. The authors argue that clinical information is essential in patient care, but many studies use blinded radiological assessment after the event and this could certainly have been undertaken in this case even if the surgeons could not be blinded.

In the UK, it is likely that testicular tissue would be sent for histological diagnosis; arguably, this is a more definitive outcome and could certainly be blinded.

The decision to follow up at 1-3 months with orchidometer measurements when baseline measurements were not taken is an odd one; surely this invites all manner of confounders? Thankfully this did not actually involve any subjects but it seems a strange choice – perhaps an afterthought?

What were the results and what does this mean?

Odds ratios for the various examination and historical findings were given in table 2. These variables were formulated into a decision rule using recursive partitioning.

050413 Table 2

The most strongly predictive finding was abnormal testicular lie, with an odds ratio of 18.17 but a very wide confidence interval (95%CI 6.2-53.2) reflecting the small study numbers.

The decision rule in itself had the following test characteristics:

  • NPV 100% (95%CI 98-100%)
  • Sensitivity 100% (95%CI 98-100%)
  • Specificity 44% (95%CI 38-50%)
  • PPV 15% (95%CI 11-21%)
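To see how these four figures relate to each other, here is a minimal sketch that derives them from a 2x2 table. The cell counts are my own reconstruction, back-calculated from the reported percentages and the 21 torsion cases among 228 enrolled patients; they are an assumption and are not taken from the paper's tables.

```python
def rule_characteristics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from the four cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),  # torsion cases flagged by the rule
        "specificity": tn / (tn + fp),  # non-torsion cases the rule calls low risk
        "ppv": tp / (tp + fp),          # flagged patients who truly have torsion
        "npv": tn / (tn + fn),          # low-risk patients who truly have no torsion
    }


# Assumed counts: 21 torsion (all flagged), 207 without torsion (91 called low risk).
rule = rule_characteristics(tp=21, fp=116, fn=0, tn=91)
for name, value in rule.items():
    print(f"{name}: {100 * value:.0f}%")
```

On these assumed counts the rule flags 137 of 228 patients for further assessment in order to catch 21 cases, which is the practical meaning of the low specificity and PPV.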

Obviously an NPV of 100% and sensitivity of 100% is impressive and important in a rule-out tool such as this, but the specificity and positive predictive value are very low. This would ordinarily expose a large number of patients to further examination and assessment, but as these patients have not yet had doppler examination it may not be unworkable.

However, this rather raises the question – if I saw a 7-year-old patient with testicular pain and vomiting, would I really need this decision rule to tell me that he needed further assessment to exclude testicular torsion?

What can we take from this paper into clinical practice?

I don’t think that at this stage we can rely fully on the absence of abnormal lie, nausea/vomiting and age <10 years to exclude testicular torsion as a diagnosis in patients with acute testicular pain in the ED, but it will be interesting to see how the proposed decision tool performs in external validation.

However, taking a step back, we are able to see that what this paper is trying to do is formalise the process of diagnostic suspicion of testicular torsion. We have little information about the skill and experience levels of the ED physicians performing the initial assessment. Does this paper tell us anything we don’t already know as clinicians?

Well, maybe yes – it looks as though we can be a little reassured by the group of patients aged <10 without abnormal lie or nausea/vomiting. The use of sensitivity analysis adds to this – the authors have included patients lost to follow-up and assumed that they had torsion, finding that the decision rule performed just as well.

However, we really need to see how the rule performs in a fresh setting when applied to all patients rather than a convenience sample.

More questions to ask

  • How would this rule perform in a different setting – an external ED or even in general practice?
  • Does this decision process reduce our referrals for expert assessment/doppler US or does the low specificity/PPV represent a potential increase in referral, time and cost?
