
DOUBLE READING IN NORWEGIAN HOSPITAL RADIOLOGY DEPARTMENTS

 

               

PETER MÆHRE LAURITZEN

 

     

   

       

       

     

 

Department of Diagnostic Imaging and HØKH Research Unit, Akershus University Hospital

Institute of Clinical Medicine, Faculty of Medicine, University of Oslo

Norway


© Peter Mæhre Lauritzen, 2016

Series of dissertations submitted to the Faculty of Medicine, University of Oslo

ISBN 978-82-8333-180-6 ISSN 1501-8962

All rights reserved. No part of this publication may be reproduced or transmitted, in any form or by any means, without permission.

Cover: Hanne Baadsgaard Utigard

Printed in Norway: 07 Media AS – www.07.no


 

“There is an art, it says, or rather, a knack to flying. The knack lies in learning how to throw yourself at the ground and miss.”

Douglas Adams, Life, the universe and everything, 1982

                                                                       

To  my  family.  You  are  everything  to  me.  

 

 


Preface  

The starting point of this project was a series of conversations with my senior colleague Gunnar Sandbæk while on call at Aker University Hospital. Gunnar is a man who gets things done, and I liked to believe that so was I. These conversations often touched on questions of efficiency, responsibility and purposeful workflow.

 

We discussed how radiology could best provide the clinicians with what they need: timely and clear answers to their diagnostic questions. We were sceptical of the submission of reports and examinations for a second opinion (double reading), which appeared to be the default at some departments. We wondered whether this practice improved quality, or simply squandered resources and caused delays.

 

The  question  was  simple  enough.  After  exploring  this  subject  for  the  better  part  of   five  years,  I  can  state  with  great  conviction  that:  “It  is  complicated”.    

 For  every  step  in  the  diagnostic  imaging  process,  judgement  is  used  repeatedly:  the   decision  to  refer,  what  to  include  in  the  referral,  how  to  image  the  patient,  the   interpretation  of  the  images,  what  to  emphasize  or  omit  in  the  report,  how  to   understand  the  report  in  the  clinical  context,  and  ultimately  how  to  manage  the   patient.  

 

For  every  step  rights  and  wrongs  exist,  but  in  between  the  two  is  a  grey  area  of   judgement.  What  happens  in  this  grey  area  is  not  always  recorded,  and  it  is  often   difficult  to  measure.  Even  the  rights  and  wrongs  are  evasive,  as  most  imaging   findings  are  not  followed  by  an  undisputable  diagnosis  –  the  gold  standard.  

 

In the intersection between radiology and quality improvement I have come to realize that it isn't all about getting it done and getting it right. We also have to learn from our inevitable mistakes and improve our systems to prevent recurrences.

Although my perspective has changed since I started, I still believe that our use of judgement is an important field of enquiry. It is complicated, but neither impossible nor futile.

 

 


Acknowledgements  

First I would like to sincerely thank my supervisor Pål Gulbrandsen and my co-supervisors Gunnar Sandbæk and Petter Hurlen. They have always given me all the support I needed, and without them I would never have left the starting blocks.

Gunnar  gave  me  the  opportunity  to  turn  a  question  into  a  dissertation  by  giving  me   my  first  research  position.  His  support  as  an  experienced  radiologist,  researcher  and   manager  has  been  invaluable.  I  envy  Gunnar  his  positive  attitude,  contagious   enthusiasm,  and  his  habit  of  giving  compliments.  

Petter’s  combined  knowledge  of  management  and  computer  programming  has  been   priceless.  He  gives  precise  feedback,  asks  crucial  questions,  and  has  taught  me  to   limit  my  scope.  I  admire  his  ability  to  cut  to  the  chase.  

It is impossible to list the things Pål has taught me and the things I can thank him for. Pål has provided daily guidance in all questions big and small. It is inspiring that despite being rich in both knowledge and wisdom, Pål possesses such curiosity about everything.

Jack Gunnar Andersen, Mali Victoria Stokke, and Anne Lise Tennstrand, my collaborators at Ullevål, Drammen, and Bærum, deserve great praise. The work they have done is above and beyond the call of duty, and the project could not have been carried out without them.

By contributing their clinical experience, Thomas Hegglund, Rolf Aamodt, and Andreas Ødegaard at the department of abdominal surgery and Gisle Bjerke, Knut Stavem, and Vidar Søyseth at the department of pulmonary medicine have provided the basis for the very end point in two of the studies. Knut Stavem's extended involvement has further improved our product, and was much appreciated.

A special thanks to Fredrik Dahl for contributing continuing statistical support throughout this project and to fellow statisticians Jurate Saltyte Benth and Jonas Christoffer Lindstrøm for valuable help with calculations and spreadsheets.

Thanks to my dear colleagues Heidi Eggesbø and Erik Rud for feedback on several drafts of the questionnaires. The feedback from Ingrid Nermoen and Christofer Lundquist on the first draft of the clinical rating scale was a great help in preparing the pilot study. Thanks to Ellen Deilkås for sharing her insight into quality culture and approaches to quality improvement. Thanks to Haldor Husby for retrieving data from Ahus and for support with the document comparison software. Thanks to all my colleagues at the Health Services Research Centre for maintaining an inclusive, supportive, and diverse work environment. Thanks to everyone at the libraries at Akershus University Hospital and Oslo University Hospital for always providing such an excellent service.

Thanks  to  the  Norwegian  Medical  Association  for  access  to  data  from  the  SERUS   system  and  for  funding  from  the  fund  for  Quality  Improvement  and  Patient  Safety.  

Thanks  to  the  Norwegian  Society  of  Radiology  for  funding  and  for  allowing  our   survey  to  be  distributed  to  all  its  members.  Thanks  to  all  who  said  yes  when  they   could  have  said  no.  

 

 


Funding  

This project was funded by the Department of Diagnostic Imaging at Akershus University Hospital, the Department of Radiology and Nuclear Medicine at Oslo University Hospital, the Norwegian Medical Association's fund for Quality Improvement and Patient Safety, and the Norwegian Society of Radiology.

   


Contents  

PREFACE
ACKNOWLEDGEMENTS
FUNDING
CONTENTS
ANNOTATIONS & ABBREVIATIONS
ANNOTATIONS
ABBREVIATIONS
1 LIST OF PAPERS
2 SUMMARY
2.1 BACKGROUND
2.2 OBJECTIVES
2.3 MATERIAL AND METHODS
2.4 SUMMARY OF PUBLISHED RESULTS
2.5 CONCLUSION
3 INTRODUCTION
3.1 UNANSWERED QUESTIONS
4 OBJECTIVES
5 MATERIAL AND METHODS
5.1 TWO NATIONAL SURVEYS
5.1.1 SURVEY DESIGN AND ITEMS
5.1.2 PARTICIPANTS AND RECRUITMENT
5.1.3 ANALYSIS
5.2 TWO RETROSPECTIVE CROSS SECTIONAL MULTICENTRE STUDIES
5.2.1 STUDY DESIGN
5.2.2 RECRUITMENT OF DEPARTMENTS
5.2.3 POWER CALCULATION
5.2.4 FORMAL APPROVAL
5.2.5 INCLUSION CRITERIA
5.2.6 DATA RETRIEVAL
5.2.7 PATIENT, EXAMINATION AND READER DATA
5.2.8 DATA PROCESSING
5.2.9 DATA ANALYSIS
6 UNPUBLISHED RESULTS
6.1 PAPER I: UNPUBLISHED RESULTS
6.2 PAPERS II & III: UNPUBLISHED RESULTS
7 GENERAL DISCUSSION
7.1 DISCUSSION OF DESIGN AND METHODS
7.1.1 PAPER I
7.1.2 PAPERS II & III
7.2 DISCUSSION OF RESULTS
7.2.1 SAFE & EFFECTIVE
7.2.2 TIMELY & EFFICIENT
7.2.3 EQUITABLE
8 GENERAL CONCLUSIONS
9 REFERENCES
PAPER I
PAPER II
PAPER III
APPENDIX 1: QUESTIONNAIRE TO RADIOLOGY DEPARTMENT MANAGERS
APPENDIX 2: QUESTIONNAIRE TO CONSULTANT RADIOLOGISTS

 

   


Annotations  &  abbreviations  

Annotations  

The terms “discrepancy” and “discordance” may be used interchangeably in the description of differences between interpreters in reporting their findings. Although some discrepancies constitute “errors”, this is not always the case. The term “discrepancy” is used when reporting our own findings because it is a more accurate description of the variation between readers than “error”. Also, our focus was the clinical consequence of such variations and not the blameworthiness of the readers involved.

 

The terms “quality assurance” and “quality improvement” are generally used interchangeably. However, “quality assurance” refers more specifically to the detection and correction of individual errors, whereas “quality improvement” refers more generally to any endeavour to improve the quality of services.

   

Radiologists are also referred to as readers when it concerns their role as interpreters of examinations. Consultant radiologist refers to any radiologist employed in a senior position (Overlege). This includes acting or appointed consultants (Konstituert overlege) that have not received specialist approval. Radiologists employed in training positions (Lege i spesialisering) are referred to as residents.

 

The  terms  “radiological  examination”  and  “imaging  examination”  are  used   interchangeably.  

 

“Radiology  conferences”  refer  to  any  meeting  in  which  radiologists  and  clinicians   meet  to  discuss  patient  management  in  light  of  medical  imaging  results.  This  also   includes  Multidisciplinary  team  meetings  (MDTM)  and  daily  radiology  rounds.    

   


Abbreviations  

- ACR (American College of Radiology)
- Ahus (Akershus University Hospital)
- BI-RADS (Breast Imaging Reporting and Data System)
- CT (Computed Tomography)
- CTPA (Computed Tomographic Pulmonary Angiography)
- EPR (Electronic Patient Record)
- ESR (European Society of Radiology)
- FTE (Full Time Equivalent)
- JCAHO (Joint Commission on Accreditation of Healthcare Organizations)
- MRI (Magnetic Resonance Imaging)
- NCRP (Norwegian Classification of Radiological Procedures)
- NORAKO (Norsk Radiologisk Kode)
- PACS (Picture Archiving and Communication System)
- PET-CT (Positron Emission Tomography CT)
- RCR (Royal College of Radiologists)
- RIS (Radiology Information System)
- SERUS (System for Electronic Reporting of Educational Activity in Hospital Departments)
- SQL (Structured Query Language)

   

1  List  of  papers  

I. Lauritzen PM, Hurlen P, Sandbæk G, Gulbrandsen P. Double reading rates and quality assurance practices in Norwegian hospital radiology departments: two parallel national surveys. Acta Radiol. 2015;56(1):78-86.

II. Lauritzen PM, Stavem K, Andersen JG, Stokke MV, Tennstrand AL, Bjerke G, Hurlen P, Sandbæk G, Dahl FA, Gulbrandsen P. Double reading of current chest CT examinations: clinical importance of changes to radiology reports. Accepted for publication in European Journal of Radiology.

III. Lauritzen PM, Andersen JG, Stokke MV, Tennstrand AL, Aamodt R, Heggelund T, Hurlen P, Sandbæk G, Dahl FA, Gulbrandsen P. Prospective double reading of abdominal computed tomography: the clinical importance of changes to radiology reports. Under review.

 

 

 


2  Summary  

2.1  Background  

Diagnostic  information  is  extracted  from  imaging  examinations  through  a  process  of   interpretation.  This  process  is  carried  out  by  humans  and  involves  professional   judgement  and  decisions  made  under  conditions  of  uncertainty.  Therefore  variations   and  even  errors  will  occur.    

 

These variations, often referred to as discrepancies, can be uncovered by double reading. This involves having two different readers interpret the same examination. It can be done retrospectively, which is often the case with peer review systems and audits. Applied prospectively, double reading can be used for quality assurance of radiology reports, and has been shown to reduce errors and increase sensitivity.

 

Double reading is routine in the training of resident radiologists. In Norwegian hospitals there is a tradition of double reading, even of radiological examinations read by consultants. The consultant may usually choose whether to finalize the report directly or submit the examination for a second reading by a colleague. The second reader reviews the examination and the preliminary report, and corrects it if necessary before it is finalized.

 

The topic of this dissertation is this sequential, non-independent double reading, in which both readers are consultant radiologists. This practice varies considerably between departments, and the necessity and feasibility of such double reading are much debated in the Norwegian radiological community. Special emphasis has been placed on the delays caused and resources consumed.

 

There are few reports of the extent to which double reading is practiced internationally, and none that estimate the working hours consumed. Our knowledge of the effect of the practice stems from related but not identical practices such as peer review, audits, independent double reading in mammography screening and double reading of resident radiologists.

   


2.2  Objectives  

The  objectives  of  this  study  were:  

• To  investigate  the  rates  of  double  reading  in  Norwegian  hospital  radiology   departments.  

• To identify department characteristics associated with the rates of double reading.

• To  investigate  possible  associations  between  double  reading  rates  and  other   quality  assurance  practices.  

• To  estimate  the  proportion  of  radiology  reports  that  were  changed  during   double  reading  of  Computed  Tomography  (CT)  examinations  of  chest  and   abdomen  respectively  and  to  assess  the  potential  clinical  impact  of  these   changes.  

• To  explore  whether  characteristics  of  examinations  or  radiologists  were   associated  with  a  higher  proportion  of  clinically  important  changes.  

2.3  Material  and  methods  

Quality  assurance  practices  and  rates  of  double  reading  in  Norwegian  hospital   radiology  departments  were  explored  in  two  parallel  nationwide  surveys  issued  to   department  management  and  consultant  radiologists  respectively  (paper  I).  Both   surveys  covered  practice  of  double  reading,  department  guidelines  and  quality   improvement  work.  Management  also  reported  staffing  and  perceived  resource   situation.  

 

The responses of consultant radiologists grouped according to workplace were used to validate management responses about working hours consumed by double reading. Departments were categorized according to teaching status. Responses regarding different aspects of department quality improvement work were organized into three quality indices. The items in “Person index” concerned monitoring of personal performance and feedback to individuals. The items in “System index” concerned systems performance monitoring and collective feedback. The items in “Appropriateness index” concerned assurance of appropriateness of investigations.

Comparisons of departments were done with the Kruskal-Wallis test. Linear regression was used to assess whether differences in double reading rates remained significant when adjusting for size. All correlations were tested with Spearman's rank correlation.
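As an illustration of the rank-based correlation named above, the following minimal sketch computes Spearman's rho in plain Python. It is not the analysis code used in paper I; the function names and the five department values are hypothetical and serve only to show the mechanics.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho, computed as the Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values for five departments (NOT data from the study).
double_reading_rate = [10, 25, 30, 45, 60]
quality_index_score = [40, 55, 35, 70, 50]
rho = spearman_rho(double_reading_rate, quality_index_score)
print(f"Spearman's rho = {rho:.2f}")
```

Because the coefficient depends only on ranks, it is insensitive to skewed distributions, which may be one reason a rank-based measure suits percentage-type survey responses.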

 

The clinical importance of changes to radiology reports was estimated in two retrospective, cross-sectional, multicentre studies (papers II and III). In paper II we focused on chest CT examinations of patients from the departments of internal medicine. In paper III we focused on abdominal CT of surgical patients. In each study we collected pairs of preliminary and final reports from more than 1,000 consecutive double read examinations. We used document comparison software to compare the preliminary and final reports. Experienced clinicians in relevant specialties rated the clinical importance of all changes in content.

 


We  reported  classifications  of  clinical  importance  of  report  changes  as  percentages   with  binomial  95%  confidence  intervals.  Exploratory  analysis  of  associations  between   clinically  important  changes  and  characteristics  of  patients,  examinations,  and   readers  was  performed  with  multivariate  logistic  regression.  We  also  constructed   two  random  effects  models  to  test  for  clustering  of  clinically  important  report   changes  in  separate  examinations  read  by  the  same  radiologist.  
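A 95% confidence interval for a proportion can be sketched as follows. The text does not state which binomial interval was used, so the Wilson score interval here is an assumption chosen for illustration (`wilson_interval` is a hypothetical helper name), and the bounds it yields may differ slightly from those reported in papers II and III. The counts are those given in the summary of published results.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


# Clinically important changes: 91 of 1,023 reports (paper II, chest CT)
# and 146 of 1,071 reports (paper III, abdominal CT).
chest_lo, chest_hi = wilson_interval(91, 1023)
abdomen_lo, abdomen_hi = wilson_interval(146, 1071)

print(f"Chest CT:     {91 / 1023:.1%} ({chest_lo:.1%} to {chest_hi:.1%})")
print(f"Abdominal CT: {146 / 1071:.1%} ({abdomen_lo:.1%} to {abdomen_hi:.1%})")
```

With samples of about 1,000 reports, the choice between the Wilson, exact, and normal-approximation intervals should change the bounds only marginally.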

2.4  Summary  of  published  results  

In paper I we found a mean double reading rate of 33% of all exams read by consultants, consuming an estimated 20-25% of consultant working hours. The double reading rates were highest in university hospital departments (59%), intermediate in other teaching departments (30%), and lowest in non-teaching departments (11%). By modality double reading rates were highest for Magnetic Resonance Imaging (MRI) (47%) and CT (33%), intermediate for X-ray (24%) and fluoroscopy (23%), and lowest for ultrasonography (16%) and intervention (16%). Among the three quality indices, mean scores were the highest on the “appropriateness index” (68%), intermediate on the “person index” (56%), and lowest on the “system index” (37%). There were no correlations between double reading rates and scores on any of the three quality indices.

 

In paper II changes were classified as clinically important in 91 (9%) of 1,023 reports. Of these, 3 were critical, 15 were major, and 73 were intermediate. Clinically important changes were made more often to urgent examinations and less often when the first reader was female. Chest radiologists made more clinically important changes than other second readers. The severity of the radiological findings was increased in 73 (80%) of the clinically important changes.

 

In paper III changes were classified as clinically important in 146 (14%) of 1,071 reports. Of these, 3 were critical, 35 were major, and 108 were intermediate. Important changes were made less frequently when abdominal radiologists were first readers, and more frequently when they were second readers and when examinations were urgent. The severity of the radiological findings was increased in 118 (81%) of the clinically important changes.
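The counts reported above for papers II and III can be cross-checked with a few lines of arithmetic. The sketch below is only a consistency check on the published figures; the dictionary layout and the `summarise` helper are hypothetical.

```python
# Counts transcribed from the summary of published results (papers II and III).
papers = {
    "II": {"reports": 1023, "critical": 3, "major": 15,
           "intermediate": 73, "severity_increased": 73},
    "III": {"reports": 1071, "critical": 3, "major": 35,
            "intermediate": 108, "severity_increased": 118},
}


def summarise(counts):
    """Derive the total and the rounded percentages quoted in the text."""
    important = counts["critical"] + counts["major"] + counts["intermediate"]
    return {
        "important": important,
        "pct_of_reports": round(100 * important / counts["reports"]),
        "pct_severity_increased": round(100 * counts["severity_increased"] / important),
    }


summary = {name: summarise(counts) for name, counts in papers.items()}
for name, row in summary.items():
    print(f"Paper {name}: {row}")
```

The severity categories sum to the stated totals of clinically important changes, and the rounded percentages agree with those quoted for both papers.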

2.5  Conclusion  

The practice of double reading in Norwegian hospital radiology departments is extensive, but there are large variations between departments with different teaching status and between modalities. Double reading has a major impact on workflow and output directly by consuming working hours, and probably also indirectly by generating more investigations. The rates of clinically important changes to radiology reports following double reading indicate that some quality assurance of radiological interpretation is warranted. A higher yield of discrepant interpretations may be achieved by targeting a selection of urgent examinations and examinations read by inexperienced radiologists, and using subspecialist second readers.

 


3  Introduction  

The purpose of most radiological examinations is to provide diagnostic information, improving the basis for clinical decision-making. Radiological examinations are subject to interpretation by humans. The process of interpretation involves judgement and decision making under conditions of uncertainty. Inevitably, there are variations. The results will not be identical if the same exam is interpreted by different readers or at different times. When such variation affects the conveyed result of the interpretation, it may be referred to as a discrepancy.

 

When a discrepancy becomes an error is subject to opinion. Although not all discrepancies constitute errors, it is vital to acknowledge that both discrepancies and errors do occur in the practice of clinical radiology. An autopsy study of patients dying in hospital estimated that radiological misinterpretation caused 8% and contributed to another 33% of diagnostic errors in patients with relevant imaging [1].

 

The reports “To Err is Human, Building a safer health system”, followed shortly after by “An organization with a memory”, made clear the massive scale and consequences of preventable medical errors, which were estimated to cause more deaths than motor-vehicle accidents, breast cancer and AIDS. The reports raised awareness of quality issues and patient safety, and emphasized the need for a culture and system for error reporting in order to prevent recurrences by learning and system improvement [2, 3].

 

Double  reading  is  a  quality  assurance  practice  in  which  two  different  readers   interpret  an  imaging  examination.  Internationally  it  is  routine  in  the  training  of   resident  radiologists,  who  submit  virtually  all  their  preliminary  reports  and   examinations  for  a  second  reading  by  a  consultant.  If  necessary,  the  consultant   corrects  the  report  before  it  is  finalized.  

 

In Norwegian hospitals there is a tradition of double reading, even of radiological examinations read by consultants, who may usually choose whether to finalize their report directly or submit the examination for a second reading by a colleague. This practice predates the digital transformation of radiology departments that occurred around the turn of the millennium with the implementation of Radiology Information Systems (RIS) and Picture Archiving and Communication Systems (PACS) [4]. Prior to this, images were read on film alternators, and written preliminary reports were placed next to the images. The second reading was often conducted in conjunction with a radiology conference the next morning. Although the implementation of RIS and PACS led to substantial workflow changes, in many departments this form of sequential, non-independent double reading was retained.

 

The topic of this dissertation is this prospectively applied, sequential, and non-independent form of double reading, in which both readers are consultants. Unless otherwise specified, “double reading” refers to this practice, and not double reading of residents or other related practices.


 

The practice of double reading has been much debated in the Norwegian radiological community, and it has varied considerably between departments [5]. There have been concerns that double reading causes limitations on output and unacceptable waits, and that final reports are unduly delayed [5, 6]. It has been argued that a beneficial effect is not well established, and that radiologists become less vigilant, and more prone to error, when colleagues check their work [7]. Some have worried that radiologists are less willing to make decisions and assume responsibility for them, and that routine submission of examinations for double reading represents a disclaimer of that responsibility [5].

 

Our  knowledge  of  the  discrepancies  of  interpretation  uncovered  by  double  reading   stems  mainly  from  three  sources:  screening  programmes,  retrospective  audits  or   peer-­‐review,  and  evaluation  of  resident  on-­‐call  performance.  

 

Some breast cancer screening programmes use independent double reading, in which the second reader is blinded to the interpretation of the first [8-10]. Feasibility of independent double reading depends on a limited number of options for interpretations or ideally a categorization system such as the “Breast Imaging Reporting and Data System” (BI-RADS) used for reporting mammograms. Therefore independent double reading is seldom used in clinical radiology. For screening mammograms the reported rates of discrepant interpretations are in the order of 5% [11]. There are several reasons why these results are not necessarily valid for clinical radiology. The population consists of screening subjects, and not patients referred for investigation of a condition or symptom. The frequency of pathology is quite low, with concordant positive interpretations in 2.1% [11]. The examination is aimed at the detection or exclusion of one diagnosis, cancer.

 

Information about interpretational discrepancy rates is also found in reports from peer review and audits. Peer review is a continuous process in which all colleagues (peers) take part in review of each other’s examinations, and rate their agreement or disagreement with the interpretation in the report. This is usually done retrospectively by reviewing previous examinations when they are compared to the current ones being interpreted. Audits also involve retrospective review of examinations and reports, but are usually one-time or periodical, not continuous, and may involve external expert reviewers as opposed to colleagues. In both cases the goal is performance measurement of departments and radiologists, and improvement through shared learning, rather than quality assurance of individual radiology reports.

 

The  use  of  peer  review  is  widespread  in  the  United  States  of  America,  where  a   continuous  random  review  of  5%  of  cases  is  required  for  credentialing  by  the  Joint   Commission  on  Accreditation  of  Healthcare  Organizations  (JCAHO)  [12].  The  most   prevalent  peer  review  system  is  RADPEER,  which  was  introduced  by  the  American   College  of  Radiology  (ACR)  in  2002  in  response  to  the  report  “To  Err  is  Human”  [2,   13].  The  reported  rates  of  interpretational  discrepancies  from  audits  and  peer  


peer review data have the strength that they usually involve a large number of examinations and readers. However, critics have raised concerns over sampling issues and underreporting, and a more recent survey reports radiologists’ conscious manipulation of review data through biased sampling and reporting [17, 18].

 

The third major source of interpretational discrepancy rates is the abundance of studies evaluating resident performance. These studies are heterogeneous with regard to both design and results. Agostini et al reported that the attending or resident radiologists missed one or more lesions in 71% of whole body CT examinations of polytrauma patients and that 37% of all lesions were missed [19]. Reported discrepancy rates for emergency angiograms of the head and neck are 10.4% for MRI and 13.6% for computed tomography [20, 21]. For plain chest radiographs the reported discrepancy rates are 1% for the presence of pneumonia and 1.9% for the presence of congestive heart failure [22, 23].

 

These studies show a large variation in discrepancy rates between modalities, settings, and probably also incidence of pathology. The results, however, are not necessarily valid in the context of quality assurance of consultant interpretation. There are several factors that may contribute to higher discrepancy rates. The level of experience of the readers has been shown to influence discrepancy rates [21, 24]. Reported discrepancy rates are higher in positive than negative examinations, and one might expect a higher frequency of pathology in emergency after-hours examinations [25, 26]. Furthermore, long shifts, increasing caseload and interruptions in the form of telephone calls may all negatively affect diagnostic accuracy [24, 27, 28].

Internationally, there are few reports of the extent to which double reading is practiced. One Swedish university hospital reported double reading all examinations in 2008, while in 1991 the JCAHO requirement of reviewing 5% of cases was met by 74% of imaging groups in the USA [29, 30]. Husby et al reported that 41% of imaging examinations in Norway were double read in 2008¹ [31].

 

In prospective double reading potential errors are corrected one by one. In contrast, performance measurement, as accomplished by retrospective peer review or audit, does not in itself improve quality [17]. Therefore it is vital to couple performance measurement with a quality improvement initiative. Peer review is usually coupled with “discrepancy meetings” in which colleagues discuss discrepancies for the purpose of shared learning, but also in order to reach a consensus on the final rating of the discrepancies. Little is known about how Norwegian radiology departments couple double reading with performance evaluations and initiatives to promote shared learning.

   

                                                                                                               

1  At  the  time  this  report  was  published,  work  with  the  present  survey  was  already   well  under  way.  


3.1  Unanswered  questions  

Although  much  debated,  many  aspects  of  double  reading  practices  in  Norwegian   hospitals  have  not  previously  been  reported,  and  it  was  the  purpose  of  this  study  to   address  and  explore  some  of  these  issues.  

 

There are no estimates of the working hours consumed by double reading. The factors that contribute to variation in double reading rates between departments are not fully described.

There is little or no data on the application of other quality assurance practices and their possible association with double reading rates. The effects of prospective double reading of examinations that are read and selected for double reading by consultants are also unknown. We have estimates neither of the proportion of radiology reports that are changed following double reading nor of the proportion of such changes that are clinically important.

 

Double reading has many similarities with practices such as independent double reading, peer review, audit and over-reading of residents, and our knowledge of the effect of double reading stems from these related practices. However, considerable differences in patient populations, reader experience, reading conditions, workflow and data collection limit the validity of results originating from other settings. These differences also make comparison particularly interesting, and offer opportunities for mutual learning and practice improvement.

4  Objectives  

The  objectives  of  this  study  were:  

• To  investigate  the  rates  of  double  reading  in  Norwegian  hospital  radiology   departments.  

• To identify department characteristics associated with the rates of double reading.

• To  investigate  possible  associations  between  double  reading  rates  and  other   quality  assurance  practices.  

• To  estimate  the  proportion  of  radiology  reports  that  were  changed  during   double  reading  of  CT  examinations  of  chest  and  abdomen  respectively,  and   to  assess  the  potential  clinical  impact  of  these  changes.  

• To  explore  whether  characteristics  of  examinations  or  radiologists  were   associated  with  a  higher  proportion  of  clinically  important  changes.  

 

 


5  Material  and  methods  

The  two  main  topics  covered  in  this  dissertation  are  the  extent  of  double  reading   and  quality  assurance  practices  in  Norwegian  hospital  radiology  departments  (paper   I)  and  the  clinical  importance  of  changes  to  radiology  reports  following  double   reading  (papers  II  &  III).  The  former  was  approached  by  two  surveys  while  the  latter   was  assessed  in  two  retrospective,  cross  sectional,  multicentre  studies.  

5.1 Two national surveys

5.1.1 Survey design and items

Quality assurance practices and rates of double reading in Norwegian hospital radiology departments were explored in two parallel nationwide surveys conducted between 27 March and 27 May 2012. The two electronic surveys were issued to department management and consultant radiologists respectively. The management survey covered staffing, perceived resource situation, practice of double reading, department guidelines and department quality improvement work (cf. appendix 1). The radiologist survey covered practice of double reading, department guidelines, and department quality improvement work (cf. appendix 2).

Some  of  the  items  in  the  two  surveys  were  identical  in  order  to  obtain  similar   information  from  separate  sources.  

5.1.2  Participants  and  recruitment  

The  Norwegian  Hospital  Reform  in  2002  transformed  approximately  70  county   owned  hospitals  to  28  government  owned  health  corporations  [32].  Many  of  these   corporations  operate  at  several  separate  locations,  because  merged  neighbouring   hospitals  were  not  always  collocated.  Department  management  may  be  separate  on   each  location  or  merged  and  located  mainly  on  one  location.  We  decided  to  define  a   department  as  the  smallest  unit  with  an  immediate  supervisor  to  the  radiologists.  By   this  definition  there  were  45  separate  hospital  radiology  departments  in  Norway.  

 

Although private imaging centres accounted for 23% of all imaging examinations performed in Norway (2008), we decided not to include them, since a previous study showed that they perform double reading to a limited degree compared with hospital radiology departments [31, 33]. We decided to include the radiology departments of private hospitals owned by non-profit trusts, such as Martina Hansens Hospital, Lovisenberg Diakonale Sykehus, Diakonhjemmet Sykehus, Haraldsplass Diakonale Sykehus, and Betanien Sykehus, since they provide imaging for inpatients and constitute an integral part of public health care in Norway. For-profit providers that serve mainly outpatients, such as Unilabs, Curato, and Aleris, were excluded.

 

The target population for the management survey was the management at the 45 hospital radiology departments. We defined management as the chief medical officer and/or the head of the department. All management invitees were contacted by telephone prior to receiving the survey. Non-responders were reminded by e-mail and again by telephone before closing the survey. At departments where the head of the department and the chief medical officer were not the same person, they chose to submit either joint or separate responses. Separate responses were merged into one department response (cf. paper I, Fig 1). When merging discrepant responses to individual survey items, priority was given according to the responsibilities of the management representatives for each survey item.

 

The  target  population  of  the  radiologist  survey  was  consultant  radiologists  working   in  hospital  radiology  departments  (excluding  management).  In  order  to  reach  this   population,  all  726  members  of  the  Norwegian  Society  of  Radiology  were  invited  by   e-­‐mail  to  participate  in  the  survey.  Non-­‐responders  were  reminded  by  e-­‐mail.  We   decided  to  exclude  responders  not  working  mainly  or  exclusively  in  a  hospital   department  as  not  belonging  to  the  target  population.    

5.1.3  Analysis  

The  analysis  of  the  surveys  was  conducted  in  an  exploratory  manner.  

5.1.3.1  Validation  of  responses  

Department staffing information was also acquired for all teaching departments from the SERUS² reporting system, and served to validate management responses on staffing. The validated staffing information was used to calculate the size of the target population in the radiologist survey and the proportion of residents in the radiologist staff. The responses of consultant radiologists grouped according to workplace were used to validate management responses about working hours consumed by double reading.

5.1.3.2  Characterization  of  departments  

We considered several characteristics of departments including size, teaching status, regional health authority affiliation, and presence of subdivisions. Categorization according to affiliation with North-, Middle-, West- or South-East Regional Health Authorities was abandoned when analysis showed that the variation of practice was similar within and between regions. This indicated that regional affiliation did not represent a unifying factor with regard to double reading and quality assurance practices. Categorization according to presence of subdivisions was also abandoned, as it proved difficult to construct exhaustive and mutually exclusive categories.

When  present,  divides  were  based  on  modality,  anatomic  regions,  referring   departments,  or  a  mix  according  to  local  conditions.  Some  departments  reported   being  “partially  subdivided”.    

 

Teaching status provided distinct and formal categories. In Norway, there are two levels of formal accreditation for teaching departments: Group I departments are usually large, specialized units conducting scientific research. Residents are required to serve at least 1.5 years of their residency at a group I department. Since all group I departments but one (Drammen) are located at University Hospitals, we designated this category “University hospital department” for the benefit of international readers not familiar with the Norwegian system. Group II departments are as a rule smaller units that demonstrate a capability for resident training. Residents may serve up to 3.5 years of their residency in such units. We called this group “other teaching departments”. The remaining departments are not licensed to train residents and were designated “non-teaching” departments. This categorisation might reasonably reflect real differences between departments, and was used in the analysis.

It was natural to include a more direct measure of department size in the analysis, and we considered staffing and output to be the most relevant measures. Staffing was covered in the survey, and SERUS provided data both on staffing and on department output. However, the reported output was in different and sometimes undisclosed units (NORAKO³-codes, NCRP⁴-codes, number of referrals, number of examinations), and the data was only available for teaching departments. We decided to measure size as staffing in the form of consultant radiologist full time equivalents (FTEs).

5.1.3.3  Construction  of  three  quality  indices  

In  addition  to  requesting  rates  of  double  reading,  the  management  survey  covered   quality-­‐directed  activities  and  guidelines.  Explored  separately  these  items  produce   fragments  of  data  that  are  not  easily  interpreted  or  conveyed,  and  a  thematic   synthesis  in  some  shape  or  form  seemed  appropriate  and  necessary.  Among  the   candidate  items  we  identified  20  separate  items  that  could  be  thematically   organized  into  three  groups,  and  we  constructed  our  quality  indices  from  these   groups.  

The  department  score  on  each  index  was  the  percentage  of  affirmative  responses  on   index  items.  The  first  group  comprised  four  items  relating  to  monitoring  of  personal   performance  and  feedback  to  individuals  –  “Person  index”.  The  second  group   comprised  nine  items  concerning  systems  performance  monitoring  and  collective   feedback  –  “System  index”.  The  third  group  comprised  seven  items  regarding   assurance  of  appropriateness  of  investigations  –  “Appropriateness  index”  (cf.  paper   I,  table  2).    
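The scoring rule described above can be illustrated with a minimal Python sketch. It assumes that every index item was answered as a plain yes or no; how unanswered items were handled is not stated here, so the sketch does not attempt it. The function name `index_score` and the example department are hypothetical and serve only as illustration.

```python
def index_score(responses):
    """Return the percentage of affirmative answers among the index items."""
    if not responses:
        raise ValueError("an index needs at least one item")
    return 100.0 * sum(1 for answer in responses if answer) / len(responses)


# Hypothetical department: yes/no answers to the 4 Person, 9 System
# and 7 Appropriateness items (20 items in total).
department = {
    "person": [True, False, True, True],
    "system": [True, True, False, False, True, False, True, True, False],
    "appropriateness": [False, True, True, False, False, True, True],
}

scores = {name: index_score(items) for name, items in department.items()}
```

With these made-up answers the department scores 75% on the Person index, 5 of 9 on the System index and 4 of 7 on the Appropriateness index.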

5.1.3.4  Statistical  analysis  

Management reported double reading rates for each modality in the following categories: 0%, 1–33%, 34–66%, 67–99%, or 100% (cf. appendix 1). For purposes of statistical analysis the intervals 1–33%, 34–66%, and 67–99% were converted to their middle values 17%, 50%, and 83% respectively. Comparisons of departments according to teaching status were made with the Kruskal-Wallis test. Linear regression was used to assess whether differences in rates of double reading remained significant when adjusting for size. All correlations were tested with Spearman correlation. The data were analysed using IBM SPSS Statistics⁵. All p values are two-sided, and p < 0.05 indicates statistical significance.
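The conversion of reported categories to numeric rates can be sketched as follows. This is an illustrative Python snippet only, since the analysis itself was run in SPSS. The names `CATEGORY_BOUNDS` and `to_numeric_rate` are hypothetical, as are the modalities other than CT and the example responses.

```python
# Survey categories and their bounds in percent. The closed categories
# (0% and 100%) keep their value; intervals are replaced by the midpoint.
CATEGORY_BOUNDS = {
    "0%": (0, 0),
    "1-33%": (1, 33),
    "34-66%": (34, 66),
    "67-99%": (67, 99),
    "100%": (100, 100),
}


def to_numeric_rate(category):
    """Map a reported category to the middle value of its interval."""
    low, high = CATEGORY_BOUNDS[category]
    return (low + high) / 2


# Hypothetical responses from one department, one per modality.
reported = {"CT": "67-99%", "MRI": "34-66%", "Ultrasound": "0%"}
numeric = {modality: to_numeric_rate(cat) for modality, cat in reported.items()}
```

The midpoints reproduce the values 17%, 50%, and 83% used in the analysis.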

3 Norsk Radiologisk Kode (NORAKO): A Norwegian system for classification of radiological examinations and procedures.

4  Norwegian  Classification  of  Radiological  Procedures  (NCRP):  A  similar  classification   system,  which  replaced  NORAKO  31  December  2011.  

5  IBM  SPSS  Statistics,  Version  22,  IBM  corp,  Somers,  NY.  


5.2 Two retrospective cross sectional multicentre studies

5.2.1 Study design

In  order  to  estimate  the  clinical  importance  of  changes  to  radiology  reports  following   double  reading,  we  conducted  two  retrospective  cross  sectional  multicentre  studies.  

In these studies we compared preliminary and final reports from double read examinations for changes, and clinicians rated the clinical importance of these changes. In paper II we focused on changes to reports from chest CT examinations of patients from the departments of internal medicine, while we focused on changes to reports from abdominal CT examinations of surgical patients in paper III. Except for the differences in examination type and department affiliation of the patients, the two studies were similar.

5.2.2  Recruitment  of  departments  

Candidate  departments  for  collaboration  were  identified  using  data  from  the   management  survey.  A  prerequisite  for  the  comparison  of  preliminary  and  final   reports  was  that  all  versions  are  saved  in  the  RIS  or  the  Electronic  Patient  Record   (EPR).  More  than  half  of  the  departments  reported  in  the  survey  that  this  was  not   the  case,  and  were  thereby  excluded  as  collaborators.  Another  prerequisite  was  the   double  reading  of  a  sufficient  proportion  of  examinations.  To  safeguard  against   possibly  exaggerated  self-­‐reported  double  reading  rates,  we  set  an  arbitrary  lower   limit  of  20%  reported  double  reading.    

 

From  these  criteria  six  potential  collaborators  were  identified,  five  of  which  were   affiliated  with  the  South-­‐East  Regional  health  authority:  Akershus  University  Hospital   (Ahus),  Oslo  University  Hospital  -­‐  Ullevål  and  the  Vestre  Viken  hospitals  in  Bærum,   Drammen  and  Ringerike.  The  sixth  hospital  (St.  Olavs  Hospital)  was  not  approached   since  there  were  already  three  group  I  hospitals  in  the  selection,  and  because  the   distance  might  have  complicated  collaboration.  Characteristics  of  the  departments   of  surgery  and  internal  medicine  at  these  five  hospitals  are  shown  in  table  1,  and   characteristics  of  the  radiology  departments  are  shown  in  table  2.  

 

   


Table 1: Characteristics of participating hospitals.

Hospital    Catchment population      No of beds          Annual dep. output¹
            Medicine     Surgery      Medicine  Surgery   Medicine  Surgery
Ahus        471,661      471,661      293       130       32,612    18,152
Ullevål     244,676*     244,676*     283       144       41,872    18,093
Drammen     156,076**    209,072**    104       75        9,688     5,930
Bærum       170,936      170,936      76        46        7,110     3,600
Ringerike   77,836       77,836       50        17        5,822     1,913
Sum         1,121,185    1,174,181    806       412       97,104    47,688

¹ Diagnosis Related Group (DRG)-weighted output (no of admissions × DRG-index).

* Regional functions for a population of 2.7 million.

** Regional functions for a population of 457,844.

 


Table 2: Characteristics and contribution of participating departments

Radiology    Annual no of     Annual no of   No of          No of involved       No of radiology      Percentage of double
department   examinations,    CT             consultant     consultant           reports collected    read examinations
             all modalities¹  examinations¹  radiologists²  radiologists
                                                            Paper II  Paper III  Paper II  Paper III  Paper II³  Paper III⁴
Ahus         207,365          42,878         36             26        24         319       354        59         42
Ullevål      209,796          43,584         54             34        31         405       414        20         33
Drammen      91,349           13,006         22.5           14        18         185       194        55         31
Bærum        67,284           12,431         7              14        12         71        66         15         12
Ringerike    43,274           5,862          5              5         6          43        43         45         47
Sum          619,068          117,761        124.5          91*       90*        1,023     1,071      N/A        N/A

¹ Norwegian Classification of Radiological Procedures (NCRP), 2012.

² Full Time Equivalent (FTE).

³ Chest computed tomography referred from department of internal medicine, double read by consultants.

⁴ Abdominal computed tomography referred from the department of surgery, double read by consultants.

* Two radiologists worked in two of the hospitals during this time.


 

5.2.3  Power  calculation  

Previous studies report varying discrepancy rates. Many of them involve on-call residents as first readers, and reported discrepancy rates span from 37% for polytrauma to 2.8% for Computed Tomography Pulmonary Angiograms (CTPA) [19, 21, 25, 34, 35]. For brain CT examinations a rate of 1.3% discrepant interpretations has been reported between specialists and neuroradiology subspecialists [36]. RADPEER data show rates of discrepancies between consultants reading CT of 1.7% for misinterpretations and 5.5% for disagreements in difficult cases [15]. We assumed that the reported discrepancy rates for consultants were most appropriate and expected changes to be made to only a small proportion of the reports.

 

We aimed at detecting a hypothesized rate of clinically important changes of 2.5%, with 95% confidence that the true population rate lies between 0.5% and 4.5%. We believed that a considerably wider confidence interval would limit the clinical relevance of our findings.

 

Individual examinations read by the same radiologist were not considered independent observations. We did not have data to estimate expected Intra Class Correlations (ICC)⁶ quantifying the degree of clustering due to repeated observations, and assumed it to be 0.25. We expected approximately 100 radiologists to be involved. The number of independent observations (N*) needed to achieve a 95% confidence interval of ±2%, given a rate of 2.5%, is estimated by the equation:

 

$$N^{*} = \frac{p\,(1-p)\,\left(z_{\beta} + z_{\alpha/2}\right)^{2}}{\Delta^{2}},$$

where p is the rate (0.025), Δ = 0.02 (2%), $z_{\beta}$ = 0.84 for 80% power, and $z_{\alpha/2}$ = 1.96 represents a level of significance of 5%. The resulting estimated N* of 478 has to be adjusted by multiplication by the so-called Variance Inflation Factor (VIF), estimated by the equation:

 

VIF = 1 + (m − 1) × ICC = 1 + (4.78 − 1) × 0.25 = 1.945,

where the mean cluster size (m) is the number of independent observations divided by the number of clusters (radiologists): 478/100 = 4.78.

N = N* × VIF = 478 × 1.945 ≈ 930

By  our  estimates  we  would  need  at  least  930  observations  to  achieve  our  aim.  
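The arithmetic of this power calculation can be retraced with a short Python sketch. It is only a check of the numbers given above, not the software used for the original calculation. The function names are hypothetical, and the inputs (p = 0.025, Δ = 0.02, z values of 0.84 and 1.96, ICC = 0.25, 100 radiologists) are taken from the text.

```python
def independent_sample_size(p, delta, z_beta=0.84, z_alpha_half=1.96):
    """Independent observations needed for rate p with half-width delta."""
    return p * (1 - p) * (z_beta + z_alpha_half) ** 2 / delta ** 2


def variance_inflation_factor(mean_cluster_size, icc):
    """Inflation of the sample size caused by clustering within readers."""
    return 1 + (mean_cluster_size - 1) * icc


n_star = round(independent_sample_size(p=0.025, delta=0.02))  # N*
mean_cluster_size = n_star / 100                              # m, 100 radiologists
vif = variance_inflation_factor(mean_cluster_size, icc=0.25)  # VIF
n_total = round(n_star * vif)                                 # N
```

Under these assumptions the sketch gives N* = 478, VIF = 1.945 and N = 930. A larger assumed ICC, or fewer radiologists than expected, would increase the required number of observations.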

Because of the hierarchical structure, with departments, radiologists, and individual examinations representing three levels, we collected from each department a number of examinations in relative proportion to the number of consultant FTEs. (Cf. table 2, section 5.2.2.)

6 The proportion of the variance that could be attributed to the identity of the reader.

5.2.4 Formal approval

The South-East Regional committee for medical research ethics approved the study and granted a waiver of informed consent on 18 January 2013. The study was approved by the data protection officer at Ahus on 30 January 2013, and at Oslo University Hospital and Vestre Viken on 28 March 2013. Data processor agreements with Vestre Viken and Oslo University Hospital were signed on 2 and 3 April 2013 respectively.

5.2.5  Inclusion  criteria  

• Patient age: 18 years or older.

• Department affiliation of patient:

    • Department of internal medicine⁷ (paper II).

    • Department of surgery⁸ (paper III).

• Examinations:

    • Chest CT (paper II): including standard contrast enhanced CT, non-enhanced CT, whole body CT⁹, and CTPA. Excluding high resolution CT and systemic arterial CT angiography.

    • Abdominopelvic CT (paper III): including standard contrast enhanced CT, non-enhanced CT and whole body CT¹⁰. Excluding systemic arterial CT angiography, CT colonography, CT ventriculography, CT enteroclysis, CT urography, low-dose urolithiasis CT and isolated upper abdominal CT examinations.

• Primary report composed by a consultant¹¹, and finalized by another consultant (examinations read by residents were excluded).

• If there were several double read examinations of the same patient, the first one would be selected.

• Addendums made after finalizing the report were disregarded.
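Some of the criteria above lend themselves to a simple filter, sketched here in Python. This is an illustration under stated assumptions and not the extraction procedure actually used. The `Exam` record and all of its field names are hypothetical, "first" is taken to mean earliest by examination date, and the criteria on department affiliation and examination type are left out for brevity.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Exam:
    # Hypothetical record layout; real RIS exports will differ.
    patient_id: str
    exam_date: date
    patient_age: int
    first_reader: str
    first_reader_is_consultant: bool
    second_reader: str
    second_reader_is_consultant: bool


def eligible(exam):
    """Adult patient, report written by one consultant and finalized by another."""
    return (
        exam.patient_age >= 18
        and exam.first_reader_is_consultant
        and exam.second_reader_is_consultant
        and exam.first_reader != exam.second_reader
    )


def select_exams(exams):
    """Keep the earliest eligible double read examination of each patient."""
    selected = {}
    for exam in sorted(exams, key=lambda e: e.exam_date):
        if eligible(exam) and exam.patient_id not in selected:
            selected[exam.patient_id] = exam
    return list(selected.values())
```

Selecting one examination per patient keeps a single patient from contributing several correlated observations to the sample.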

                                                                                                               

7  Internal  medicine:  At  the  Vestre  Viken  Hospitals  the  department  of  internal   medicine  was  not  formally  subdivided,  and  patients  in  all  disciplines  were  eligible  for   inclusion.  At  Ullevål  and  Ahus  the  departments  of  internal  medicine  are  subdivided,   and  patients  from  all  disciplines  of  internal  medicine  were  eligible  except  oncology,   preventive  medicine,  physical  therapy  and  rehabilitation.  

8  Surgery:  At  the  Vestre  Viken  hospitals  the  department  of  surgery  was  not  formally   subdivided,  and  patients  in  all  disciplines  except  orthopaedic  surgery  were  eligible   for  inclusion.  At  Ullevål  and  Ahus  the  departments  of  surgery  are  subdivided,  and   only  patients  from  the  department  of  abdominal  surgery  were  eligible  for  inclusion.  

9  Whole  body  CT:  Chest  CT  examinations  including  other  anatomic  regions  such  as   abdomen  (with  or  without  the  pelvis),  head,  neck  and  extremities.  

10  Whole  body  CT:  abdominopelvic  CT  examinations  including  other  anatomic   regions  such  as  chest,  head,  neck  and  extremities.  
