Interrater reliability of schizoaffective disorder compared with schizophrenia, bipolar disorder, and unipolar depression – A systematic review and meta-analysis
Introduction
Schizoaffective disorder is a prevalent diagnosis in both clinical and epidemiological samples. For example, in an Australian epidemiological survey, 16.1% of all patients who screened positive for psychosis eventually received a diagnosis of schizoaffective disorder (Morgan et al., 2012), and a European population-based study estimated its prevalence to be 1.1% (Scully et al., 2004). In addition, a study of Medicaid claims found almost half as many patients with diagnoses of schizoaffective disorder as patients with schizophrenia diagnoses (42%) (Olfson et al., 2009).
Despite its prevalence, the diagnosis of schizoaffective disorder has been critically debated for decades. Some authors recommend abandoning the diagnosis entirely (Lake and Hurwitz, 2007, Maier, 2006, Malhi et al., 2008), whereas others emphasize its usefulness (Marneros, 2007). The recent revision of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) retained schizoaffective disorder as a diagnostic entity (American Psychiatric Association, 2013), and ICD-11, scheduled to appear in 2017, will also provide diagnostic criteria for schizoaffective disorder (Gaebel, 2012).
Diagnostic manuals aim to improve the reliability of diagnoses, a key issue for clinical practice and research and a long-standing problem in psychiatry. In an earlier meta-analysis, we showed that the test-retest reliability of schizoaffective disorder is moderate and that it is consistently, statistically significantly, and considerably lower than the test-retest reliability of its main differential diagnoses: schizophrenia, bipolar disorder, and unipolar depression (Santelmann et al., 2015). We are, however, not aware of any systematic and quantitative attempt to summarize the interrater reliability of schizoaffective disorder. From a clinical viewpoint, interrater reliability is particularly consequential because it measures the degree to which two doctors assign the same diagnosis to the same patient.
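To make the notion of interrater agreement concrete, the following is a minimal sketch of Cohen's kappa for two raters, a common chance-corrected agreement statistic. It is an illustration only and not necessarily the computation used in the included studies. The patient labels (SZ, SAD, BD, UD), the example ratings, and the function names `cohens_kappa` and `landis_koch_label` are hypothetical; the verbal bands follow the commonly cited Landis and Koch (1977) interpretation.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each assign one label per patient."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of patients")
    n = len(rater_a)
    # Observed agreement: share of patients given the same diagnosis by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: sum over labels of the product of the raters' marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when both raters use one identical label")
    return (p_observed - p_chance) / (1.0 - p_chance)


def landis_koch_label(kappa):
    """Verbal interpretation of kappa using the Landis and Koch (1977) bands."""
    if kappa < 0:
        return "poor"
    bands = [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
             (0.80, "substantial"), (1.00, "almost perfect")]
    for upper, label in bands:
        if kappa <= upper:
            return label
    return "almost perfect"


# Hypothetical ratings of ten patients by two doctors (illustrative data only).
rater_a = ["SZ", "SZ", "SZ", "SAD", "SAD", "SAD", "BD", "BD", "UD", "UD"]
rater_b = ["SZ", "SZ", "SAD", "SAD", "SZ", "BD", "BD", "BD", "UD", "BD"]

kappa = cohens_kappa(rater_a, rater_b)
print(f"kappa = {kappa:.2f} ({landis_koch_label(kappa)})")
```

In this made-up example the two doctors agree on 6 of 10 patients (0.60), while agreement expected by chance from their marginal label frequencies is 0.25, giving kappa = (0.60 - 0.25) / (1 - 0.25), roughly 0.47. A reliability estimate for a single diagnosis can be obtained the same way by first recoding each rating as "schizoaffective" versus "other".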
As a consequence, we conducted a systematic review and meta-analysis of studies investigating the interrater reliability of schizoaffective disorder relative to other functional psychoses. We hypothesized lower interrater reliability in schizoaffective disorder than in schizophrenia, bipolar disorder, and unipolar depression.
Methods
This is a systematic review and meta-analysis of diagnostic interrater reliability studies comparing schizoaffective disorder with schizophrenia, bipolar disorder, and unipolar depression. The analysis is part of a research project on the diagnostic reliability of schizoaffective disorder (registered on PROSPERO: CRD42013006713; www.crd.york.ac.uk/prospero). Earlier results have been published on the test-retest reliability (Santelmann et al., 2015) and on the diagnostic shift seen in patients
Results
Out of 4126 articles screened, 345 were assessed for eligibility at full-text level, and 23 articles on 25 studies were included in the analysis (see PRISMA flowchart in Fig. 1). The studies were published between 1974 and 2012 and, in total, reported on 7912 patients (range: 24–3493; median: 100). Table 1 provides a breakdown of the key characteristics of the 25 studies included.
Discussion
We consider three results of this study as particularly important: First, with an estimated kappa of 0.57 the interrater reliability of schizoaffective disorder is only moderate according to the interpretation of Landis and Koch (1977). Second, in direct comparisons, the interrater reliability of schizoaffective disorder turned out to be substantially lower than that of schizophrenia, bipolar disorder, and unipolar depression. Third, while heterogeneity measures showed high levels of
Role of funding source
No funding body agreements.
Contributors
C. Baethge had the idea for this research and supervised it all the way through. H. Santelmann carried out the literature search and screened all relevant abstracts. H. Santelmann and J. Bußhoff independently extracted data from the relevant full-text articles. J. Franklin advised on the statistical analysis, which was carried out by H. Santelmann. C. Baethge and H. Santelmann drafted this paper, which was corrected by J. Franklin and J. Bußhoff.
Conflict of interest
The authors reported no conflict of interest with respect to this work.
References (56)
- et al. The reliability of the standard for clinicians' interview in psychiatry (SCIP): a clinician-administered tool with categorical, dimensional and numeric output. Schizophr. Res. (2014)
- et al. Prophylaxis of schizoaffective disorder with lithium or carbamazepine: outcome after long-term follow-up. J. Affect. Disord. (2004)
- et al. High agreement but low kappa: II. Resolving the paradoxes. J. Clin. Epidemiol. (1990)
- et al. Interrater reliability of the structured clinical interview for DSM-III-R, Axis II: schizophrenia spectrum and affective spectrum disorders. Psychiatry Res. (1991)
- et al. Korean version of the diagnostic interview for genetic studies: validity and reliability. Compr. Psychiatry (2004)
- et al. The misdiagnosis of bipolar disorder as a psychotic disorder: some of its causes and their influence on therapy. J. Affect. Disord. (2009)
- et al. Characteristics and heterogeneity of schizoaffective disorder compared with unipolar depression and schizophrenia - a systematic literature review and meta-analysis. J. Affect. Disord. (2016)
- et al. Diagnostic Interview for Genetic Studies (DIGS): inter-rater and test-retest reliability and validity in a Spanish population. European Psychiatry (2007)
- et al. Schizophrenia, schizoaffective and bipolar disorder within an epidemiologically complete, homogeneous population in rural Ireland: small area variation in rate. Schizophr. Res. (2004)
- Diagnostic and Statistical Manual of Mental Disorders DSM-5 (2013)
- The Comprehensive Assessment of Symptoms and History (CASH). An instrument for assessing diagnosis and psychopathology. Arch. Gen. Psychiatry
- Long-term treatment of schizoaffective disorder: review and recommendations. Pharmacopsychiatry
- Substantial agreement of referee recommendations at a general medical journal – a peer review evaluation at Deutsches Ärzteblatt International. PLoS One
- Understanding heterogeneity in meta-analysis: the role of meta-regression. Int. J. Clin. Pract.
- Definitions of depression: concordance and prediction of outcome. Am. J. Psychiatry
- Diagnostic reliability and validity of the PSE CATEGO-system. Arch. Psychiatr. Nervenkr.
- A twin study of schizoaffective-mania, schizoaffective-depression, and other psychotic syndromes. Am. J. Med. Genet. B Neuropsychiatr. Genet.
- The diagnoses of schizophrenia, schizoaffective disorder, bipolar disorder and unipolar depression: interrater reliability and congruence between DSM-IV and ICD-10. Psychopathology
- A coefficient of agreement for nominal scales. Educ. Psychol. Meas.
- Trim and fill: a simple funnel-plot–based method of testing and adjusting for publication bias in meta-analysis. Biometrics
- Bias in meta-analysis detected by a simple, graphical test. BMJ
- DSM-5 schizoaffective disorder: will clinical utility be enhanced? Soc. Psychiatry Psychiatr. Epidemiol.
- DSM-IV field trial for schizophrenia and other psychotic disorders
- Measuring nominal scale agreement among many raters. Psychol. Bull.
- Testing ICD-10: results of a multicentric field trial in German speaking countries. Nervenarzt
- Diagnostic stability of ICD/DSM first episode psychosis diagnoses: meta-analysis. Schizophr. Bull.
- Status of psychotic disorders in ICD-11. Schizophr. Bull.
- Computing inter-rater reliability and its variance in the presence of high agreement. Br. J. Math. Stat. Psychol.