Psychosomatics
Psychosomatics 39:318-328, August 1998
© 1998 The Academy of Psychosomatic Medicine

Monash Interview for Liaison Psychiatry (MILP)

Development, Reliability, and Procedural Validity

David M. Clarke, M.B.B.S., Ph.D., F.R.A.N.Z.C.P., Graeme C. Smith, M.D., F.R.A.N.Z.C.P., Helen E. Herrman, M.D., F.R.A.N.Z.C.P., and Dean P. McKenzie, B.A.

Received January 16, 1998; accepted January 30, 1998. From the Consultation-Liaison Psychiatry Research Unit, Monash University Department of Psychological Medicine, Monash Medical Centre, Melbourne, Australia. Dr. Herrman is from the University of Melbourne, and Hospital and Community Psychiatry Service, St. Vincent's Hospital, Melbourne. Address reprint requests to Dr. Clarke, Monash University, Department of Psychological Medicine, Monash Medical Centre, 246 Clayton Road, Clayton, Victoria 3168, Australia. e-mail: david.clarke{at}med.monash.edu.au


  ABSTRACT

 
 
The Monash Interview for Liaison Psychiatry (MILP) is a structured interview designed for use with patients who have physical and psychiatric comorbidity. Linked to a computerized diagnostic algorithm, the MILP is able to establish diagnoses according to DSM-III-R, International Classification of Diseases–10th Edition (ICD–10), and DSM-IV criteria, as well as a range of other criteria relevant to consultation-liaison psychiatry. Interrater reliability was assessed with 54 joint interviews, in which the mean kappa for agreement of items was 0.83 and of diagnoses was 0.68. Comparative procedural validity was tested against DSM-III-R decision-tree diagnoses, ICD–10 checklist diagnoses, and Structured Clinical Interview for DSM-III-R interview diagnoses on another sample of 54 patients. Mean kappas for these comparisons were 0.61, 0.56, and 0.31, respectively. As predicted, the MILP more fully covered the spectrum of somatizing disorders, compared with the other methods for establishing diagnoses.

Key Words: Monash Interview • Diagnosis • Interview • Patient Assessment


  INTRODUCTION

 
 
Descriptive psychopathology has been the cornerstone of twentieth-century psychiatry, and classification its preoccupation in recent decades. However, the emphasis on measurement and reliability of diagnosis has tended to overshadow the importance of validity and utility of classifications.1,2 This emphasis is particularly evident in the field of consultation-liaison psychiatry, in which current systems of classification poorly describe the nature, range, and etiological understanding of the psychiatric pathology observed. In depressed patients with a physical illness, for example, there is always a psychosocial stressor, and "organicity" frequently cannot be excluded. In this situation, etiological statements are often overly simplistic and reductionist.3,4 Further, the range of depressive phenomena seen in the context of physical illness is different from that seen in general psychiatric practice and subthreshold diagnoses are significant.5,6 Despite anxiety being extremely common in the physically ill, diagnoses of anxiety disorders are made infrequently,7 due to the difficulty of defining what is pathological8 and to the tradition of hierarchy.9 Somatization, a common phenomenon,10 is poorly represented by standard classifications,11 and the attribution of cause and meaning to physical symptoms is always difficult. For all these reasons, there is a need to examine the psychopathology of physical–psychiatric comorbidity without the assumptions made by classifications of mental disorders developed in other populations. Because most structured interviews to date have been developed around the current classificatory systems, these interviews are limited in their usefulness for this task.

We describe a new structured interview, the Monash Interview for Liaison Psychiatry (MILP), designed for the study of psychiatric disorders in the physically ill. The validity of the interview is based on its inquiry into a range of phenomena appropriate to the setting and its emphasis on a careful consideration of the attribution of cause. Compared with the Diagnostic Interview Schedule (DIS)12 and its descendants, the Composite International Diagnostic Interview13 and Schedules for Clinical Assessment in Neuropsychiatry,14 the MILP covers a wider range of mood disorders and somatoform disorders and is administered by interviewers competent to make the clinical judgments required. Compared with the Structured Clinical Interview for DSM-III-R (SCID),15 the MILP is more comprehensive in inquiry, in that it minimizes the use of screening questions and "skips," and covers a wider range of disorders.


  DESCRIPTION OF THE MILP

 
 
The MILP is a structured interview that takes between 60 and 90 minutes to administer. It includes a systematic symptom inquiry, with the emphasis on current state, defined as being during the past month, although the duration of any symptom is recorded for the application of diagnostic criteria. For symptoms present, a judgment of attribution of cause is coded as one or more of the following: physical illness or injury, medication, drugs or alcohol, psychogenic or "unexplained." Guided by the interview protocol, the interviewer makes this judgment after asking the subject a number of questions, inquiring of the medical staff if necessary. With the assistance of a computerized data entry and scoring algorithm,16 the interview generates diagnoses according to DSM-III-R,17 DSM-IV,18 and International Classification of Diseases (ICD)–10th Edition19 classifications. In addition, the MILP allows inclusion of a range of other diagnoses relevant to consultation-liaison psychiatry. These include the Stewart et al. criteria for depression20; the Endicott criteria21; the concepts of atypical depression22; Newcastle endogenous depression23; grief24; and diagnoses of "subthreshold" disorders, such as the DSM disorders "not otherwise specified (NOS)," the ICD disorders "unspecified," and "abridged somatization."25 The interview has been acceptable to patients with physical illness but cannot be administered to patients with significant cognitive impairment. Diagnoses of organic mental disorders are not made.
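The attribution coding described above can be pictured as a small record per symptom. The sketch below is only an illustration: the class names, field names, and the choice to treat "psychogenic" and "unexplained" as separate codes are assumptions, not the MILP's actual data format or its PROLOG implementation.

```python
from dataclasses import dataclass
from enum import Flag
from typing import Optional


class Attribution(Flag):
    """Hypothetical codes for the judged cause of a symptom (one or more may apply)."""
    PHYSICAL = 1      # physical illness or injury
    MEDICATION = 2
    SUBSTANCE = 4     # drugs or alcohol
    PSYCHOGENIC = 8
    UNEXPLAINED = 16  # assumed here to be a separate code from PSYCHOGENIC


@dataclass(frozen=True)
class SymptomRating:
    """One interview item as it might be recorded (illustrative only)."""
    item: str
    present: bool
    duration_days: Optional[int] = None          # duration is recorded for diagnostic criteria
    attribution: Attribution = Attribution(0)    # empty until a judgment is coded

    def __post_init__(self) -> None:
        # The text states that symptoms present receive one or more attribution codes.
        if self.present and not self.attribution:
            raise ValueError("a present symptom needs at least one attribution code")


rating = SymptomRating(
    item="fatigue",
    present=True,
    duration_days=21,
    attribution=Attribution.PHYSICAL | Attribution.MEDICATION,
)
```

A flag type is used so that several causes can be recorded for one symptom, matching the "one or more" wording.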


  RELIABILITY

 
 
Methods
To examine interrater reliability, an observed interview model was chosen, as used in other similar studies.26–28 This method removes information and occasion variance but requires a commitment on the part of raters to remain independent in their assessments.29,30 Fifty-four joint interviews were conducted by the first author (DMC), a psychiatrist, together with one of two psychologists, raters alternating between interviewing and observing. The sample size was chosen to be consistent with other studies.26,28,31–33 Patients were selected from the medical and surgical wards of Monash Medical Centre, a university-affiliated, suburban general hospital. During the period of study (1992–1993), all available patients were screened upon admission with the 36-item General Health Questionnaire34 scored in the "chronic" manner,35 using a 20/21 cutoff.36 To create a "clinical" sample, the patients scoring above the cutoff were invited to participate until the desired sample size was achieved. The mean age ± standard deviation (SD) of the patients interviewed was 46.0 ± 18.0 years; 57% were male. Following the interview, data were entered, and the diagnoses were generated by computer. Agreement between the raters was determined for both diagnoses and items.
The primary measure of agreement used was the kappa statistic, which gives a chance-corrected measure of agreement between two raters.37–39 Because kappa is unstable in the presence of low base rates, it is not reported in instances in which the base rate was less than 10% for items (set conservatively) and 5% for diagnostic categories (to include a wider range).40 Overall agreement, which is not chance-corrected, is also reported, defined as agreed cases plus agreed noncases expressed as a percentage of total subjects.41 In line with suggested practice, kappas above 0.40 were taken to indicate at least "moderate" agreement, whereas kappas below 0.40 were considered unsatisfactory.42 All statistics were calculated by a FORTRAN program written by one of the authors (DMcK).
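The two agreement measures described above can be sketched for a single yes/no item or diagnosis rated by two raters. This is a minimal Python illustration of Cohen's kappa and overall agreement, not the authors' FORTRAN program; the function name and the example counts are hypothetical.

```python
from typing import NamedTuple


class Agreement(NamedTuple):
    overall: float  # agreed cases plus agreed noncases, as a proportion of subjects
    kappa: float    # chance-corrected agreement (Cohen's kappa)


def agreement_2x2(both_yes: int, a_only: int, b_only: int, both_no: int) -> Agreement:
    """Overall agreement and Cohen's kappa from the four cells of a 2x2 table."""
    n = both_yes + a_only + b_only + both_no
    if n == 0:
        raise ValueError("no subjects")
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n   # rater A's base rate
    b_yes = (both_yes + b_only) / n   # rater B's base rate
    # Agreement expected by chance if the two raters were independent.
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return Agreement(observed, (observed - expected) / (1 - expected))


# Hypothetical counts for 54 jointly interviewed patients (not study data).
result = agreement_2x2(both_yes=10, a_only=2, b_only=2, both_no=40)
print(f"overall agreement = {result.overall:.1%}, kappa = {result.kappa:.2f}")
```

With these made-up counts, overall agreement is 50/54 while kappa is lower, because a large share of the raw agreement is expected by chance.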

Results
Reliability measures for interview items are summarized in Table 1. The mean kappa for all items was 0.83. One symptom item, "rapport with grief," had a kappa less than 0.40; this item involves subjective clinical judgment. Two "attribution" items had kappas less than 0.40. Both these items had a large proportion of empty cells in the analysis table (94% and 92%), a situation that makes kappa unstable in a way similar to a low base rate.43 The judgments of whether depressive and anxiety reactions were "in excess of what might be expected" (a criterion required for the diagnosis of adjustment disorder) also had kappas less than 0.40.
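The sensitivity of kappa to skewed tables can be shown with a small worked comparison. The counts below are hypothetical and chosen only to illustrate the low-base-rate effect; they are not taken from Table 1.

```python
def kappa_and_agreement(both_yes: int, a_only: int, b_only: int, both_no: int):
    """Return (overall agreement, Cohen's kappa) for a 2x2 rater table."""
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n
    a_yes = (both_yes + a_only) / n
    b_yes = (both_yes + b_only) / n
    expected = a_yes * b_yes + (1 - a_yes) * (1 - b_yes)
    return observed, (observed - expected) / (1 - expected)


# Two hypothetical items, each with 51 of 54 ratings in agreement.
rare_overall, rare_kappa = kappa_and_agreement(1, 2, 1, 50)      # item rarely rated present
common_overall, common_kappa = kappa_and_agreement(25, 2, 1, 26)  # item rated present about half the time

print(f"rare item:   agreement {rare_overall:.1%}, kappa {rare_kappa:.2f}")
print(f"common item: agreement {common_overall:.1%}, kappa {common_kappa:.2f}")
```

Both items have the same raw agreement, yet the rarely endorsed item falls below the 0.40 threshold, which is why kappa was not reported below the stated base-rate limits.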


 

TABLE 1.



Mean kappa for agreement on DSM-III-R diagnoses was 0.68 (see Table 2A and Table 2B). One diagnosis, anxiety disorder NOS, had a kappa less than 0.40. Mean kappa for DSM-IV diagnoses was 0.63. Anxiety disorder NOS and pain disorder had kappas less than 0.40. Mean kappa for ICD–10 diagnoses was 0.66. Depressive disorder unspecified and adjustment disorders with anxiety had kappas less than 0.40. Mean kappa for diagnoses not included in these three systems was 0.91. Overall mean kappa for diagnostic agreement was 0.68.


 

TABLE 2A.




 

TABLE 2B.




  PROCEDURAL VALIDITY

 
 
The concern here is not the validity of the underlying constructs or diagnoses (construct validity, predictive validity, etc.), but procedural validity, defined as "the extent to which the new diagnostic procedure yields results similar to the results of an established diagnostic procedure that is used as a criterion" (p. 595).44 Because there is no established "gold standard," comparisons were made with a number of other diagnostic procedures. What is being measured is, in a sense, a comparative procedural validity. Three comparisons were made: 1) MILP vs. a DSM-III-R decision-tree diagnosis, 2) MILP vs. an ICD–10 checklist, and 3) MILP vs. the SCID.

Methods
Patients were screened and recruited, as for the reliability study. One day per week, one patient was randomly selected from those already interviewed with the MILP. A SCID interview was performed within 48 hours of the MILP by a "blind" psychologist trained and satisfactorily rated in the use of the SCID but not otherwise involved in the study. Fifty-four interviews were completed, and this group of patients became the sample for each of the three comparisons. The mean age ± SD of this group was 46.8 ± 16.7 years; 69% were male.

The data were entered into a computer, and DSM-III-R and ICD–10 diagnoses were generated. In addition, and before generating the MILP diagnoses, the interviewer created DSM-III-R diagnoses, aided by a computerized decision-tree program,45,46 and ICD–10 diagnoses by using the ICD–10 Research Criteria Checklist provided by the World Health Organization (Janca A, personal communication, 1992). Because no assumptions were made about the "gold standard," the measures of agreement used were kappa and overall agreement as previously described.39 Because reliability assumes that error is random,47 exact McNemar's tests were used to look for systematic bias between ratings—specifically to determine the presence of any significant difference in base rates between the two methods of diagnosis under consideration.43 Exact probabilities, based on the binomial distribution, were calculated by using the algorithm of Berry et al.48 Results are only considered for the diagnostic categories that mostly had base rates greater than 5%. All diagnoses are "current."
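The exact McNemar's test mentioned above depends only on the discordant pairs, that is, the patients diagnosed by one method but not the other. The sketch below computes the two-sided exact probability directly from the binomial distribution with Python's `math.comb`; it stands in for, and is not, the Berry et al. algorithm used in the study, and the counts are hypothetical.

```python
from math import comb


def exact_mcnemar(only_method_1: int, only_method_2: int) -> float:
    """Two-sided exact McNemar p-value from the two discordant counts.

    Under the null hypothesis of equal base rates, each discordant pair is
    equally likely to fall in either cell, so the smaller count follows a
    Binomial(n, 0.5) distribution.
    """
    n = only_method_1 + only_method_2
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of bias
    k = min(only_method_1, only_method_2)
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


# Hypothetical example: 8 patients diagnosed only by method 1, 1 only by method 2.
p_value = exact_mcnemar(8, 1)
print(f"exact McNemar p = {p_value:.4f}")
```

A small p-value here would point to a systematic difference in base rates between the two methods rather than random disagreement.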

Results
The results are presented in Table 3, Table 4, and Table 5. Table 3 shows good agreement for the MILP vs. DSM-III-R decision-tree diagnoses for all groups except the somatizing disorders, for which there was a tendency (McNemar's P<0.10, NS) for the MILP to yield more diagnoses. Anxiety disorder NOS also had a kappa less than 0.40.


 

TABLE 3.




 

TABLE 4.




 

TABLE 5.



In the MILP vs. ICD checklist comparison, there was a similarly less than satisfactory agreement for the somatizing disorders (Table 4), with significantly different base rates evident.

In the MILP vs. SCID analyses (Table 5), there were four groups with overall poor agreement and significant base rate differences.


  DISCUSSION AND CONCLUSIONS

 
 
Reliability
As expected with a structured interview, interrater reliability results overall are satisfactory, influenced by the joint-interview study design, which minimized occasion and information variance. The mean kappa for the presence or absence of symptoms was above 0.80, and there was only one symptom in which there was poor agreement—rapport with grief. This item requires no question of the patient, and this assessment is based solely on the judgment made by the interviewer and observer. This lack of agreement may be partly explained by the differences in training and experience among the raters. The judgments about the attribution of symptoms were also mostly satisfactory, with only two items having kappas less than 0.40, and both of these have high numbers of "empty cells."

For the diagnoses, measures of agreement are mostly moderate or better. Only 4 out of 49 categories had kappas less than 0.40, and these were all the less specific categories: three NOS categories and an adjustment disorder. These diagnoses have few criteria and are dependent mostly on the exclusion of other diagnoses. It is possible that disagreements in the other diagnoses of a group (for example, the specific anxiety disorders) had a cumulative effect on the residual category, in this case anxiety disorder NOS. This theory is suggested by the satisfactory agreement for the ICD–10 diagnosis of anxiety disorder unspecified, which has exactly the same symptom criteria as its DSM-III-R and DSM-IV counterparts.

Procedural Validity
Because there is no true gold standard, the aim in the validity studies was to compare the MILP with a variety of known standards, thus placing the MILP in relation to these standards, in a form of "triangulation"—a term borrowed from "orienteering."49 In these studies, there was close agreement between the DSM-III-R diagnoses of the MILP made by the computerized algorithm compared with the decision tree in all groups, except the somatizing disorders, in which the MILP yielded more diagnoses. There was a similar result in the comparison of ICD-10 diagnoses made by the MILP and the ICD–10 checklist.

Greater disagreements occurred in the MILP vs. SCID comparison. The SCID yielded no diagnoses of somatizing disorders, which is not surprising. The SCID has a small range of somatic diagnoses, only proceeds with the somatic section of the interview if the interviewer judges it to be relevant after screening questions, and is noted to make somatoform diagnoses infrequently.50 These features contrast with the exhaustive symptom inquiry of the MILP. Similarly, with other drug use disorders, the MILP yielded 13 diagnoses, compared with the SCID's one. Anxiety disorders and minor mood disorders each had "fair" agreement,42 with kappas of 0.37 and 0.30, respectively. Although no significant bias was recorded and the actual number of cases was small, generalized anxiety disorder, phobias, and obsessive-compulsive disorder were all diagnosed more often with the MILP than with the SCID. Four patients diagnosed by the MILP as having minor mood disorder were diagnosed by the SCID as having major depression, consistent with the lower rate of diagnosis of major depression by the MILP compared with the other standards. However, seven of the patients given a diagnosis of minor mood disorder by the MILP were given no diagnosis by the SCID. This difference suggests that the MILP has a lower threshold for diagnosis of minor mood disorders. With the exception of major depression, the MILP tends to be overinclusive compared with the SCID.
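The verbal labels attached to kappa values in this paper ("fair," "moderate," "almost perfect") follow the benchmarks of Landis and Koch.42 A minimal sketch of that mapping is given below; the function name is hypothetical, and the handling of values that fall exactly on a boundary is an assumption.

```python
def landis_koch_label(kappa: float) -> str:
    """Map a kappa value to the Landis and Koch descriptive benchmark."""
    if kappa < 0.0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Kappas reported in the text: MILP vs. SCID anxiety and minor mood disorders,
# and the mean kappas for diagnoses and for items in the reliability study.
for value in (0.37, 0.30, 0.68, 0.83):
    print(value, landis_koch_label(value))
```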


  CONCLUSIONS

 
 
The studies of comparison described here examined different possible sources of variance. The comparison with the SCID is the most susceptible to error. Although unstructured assessments have been used as the standard for validity studies,51 this method was not used here, as it provides no standardization of the process. The use of a structured interview such as the SCID minimizes information and interpretation variance of the standard; however, since the SCID was conducted at a time separate from the MILP, the possibility of occasion variance exists. This possibility was minimized by conducting the SCID interview very soon after the MILP.

There are limitations in the study and in the interpretation of results. Because the interview is so broad, leading to a large number of diagnoses, the base rate for some items and diagnoses was small, affecting the kappa statistic adversely. The study sample was "enriched" by screening and was thus similar to clinical populations seen in the consultation-liaison setting. Still some diagnoses were rarely made. Validity of these diagnoses will need to be tested in clinical samples with higher prevalences for these disorders. In line with principles established in studies of the DIS,52 it was considered important to test the interview in the type of population in which it is to be used and to examine the disorders commonly encountered in this setting.

Because there is no true "gold standard," what is particularly important is that the interview has adequate reliability. The results demonstrate that it does. However, compared with other diagnostic methods, the results show less than moderate agreement for some diagnostic groups. These results are mostly explained by systematic bias reflecting different thresholds for diagnosis. As Carey and Gottesman47 have explained, "When two raters are rating the same phenomenon, using two different thresholds for doing so, the crucial issue is to determine which of the raters has a better, more valid, threshold" (p. 1458). If the MILP overdiagnoses somatizing disorders compared with other standards, the validity of these ratings will need to be tested in some further way, such as an examination of predictive or criterion validity. Such an approach will be a movement away from establishing the validity of the procedure toward establishing the validity of the constructs.

Nevertheless, it is important to note the differences between the MILP and other diagnostic methods. The MILP appears to be stricter than other measures in diagnosing major depression. Reliability of this diagnosis is almost perfect. It may be that the MILP is "tougher" in its judgment concerning the attribution of cause and the exclusion of items judged to be of organic origin. In the area of somatizing disorders and drug use disorders, the inquiry of the MILP is more intensive than that of the SCID. In addition, however, the comparisons of the MILP with the DSM decision tree and the ICD checklist suggest there may be problems with the procedural validity with respect to the somatizing disorders. Low kappas for these comparisons are only partly explained by systematic bias. Again, this may be related to difficulties in the process of attribution of cause. This will need further examination.

It is interesting to compare these results with the reliability and validity studies of other instruments. The DIS has probably had the most investigation. The earliest study by Robins et al. yielded the best results (kappas ranging from 0.40 to 0.86), but the study was done with a psychiatric population in which the base rates for diagnoses were high.53 A similar study by Helzer et al. of a community sample showed weaker agreements (kappa 0.24 to 0.68).51 In both these studies, the standard was a second DIS interview administered by a psychiatrist independently of the first lay interview, a design close to being one of test–retest reliability. Although one would expect almost perfect agreement in this situation, the relatively modest agreement achieved (mean kappa 0.69),53 despite using a structured interview and computer-generated diagnoses, demonstrates how error can be introduced by the interviewer through observation and interpretation variance. In a similar vein, during the DSM-IV trials, McGorry et al. demonstrated significant disagreements between diagnostic procedures using identical diagnostic criteria.54 DIS studies using checklists and clinical interviews as comparisons have also produced particularly poor agreements.51,52,55

The SCID does not yet have published studies of procedural validity, but there have been a number of reports of reliability. Two studies of interrater reliability, one using audiotaped interviews of patients with a range of disorders31 and the other joint interviews of depressed patients,56 found mean kappas for agreement on diagnosis of 0.76 and 0.68, respectively. Examination of test–retest reliability of the SCID found a mean kappa of 0.61 in a patient sample and 0.37 in a nonpatient sample in which the base rates of diagnoses were lower.50 Interestingly, the base rates for the somatizing disorders in both samples were so low (from 0% to 2%) that no measures of agreement were calculated. The Standardized Polyvalent Psychiatric Interview,28 a composite interview built around the Clinical Interview Schedule,57 found kappas for interrater reliability mostly in the range of 0.70 to 0.90.

The MILP has been designed specifically for use in the physically ill and has been acceptable to patients and interviewers. Its inquiry is more extensive than that of other similar instruments in the areas of somatizing disorders, drug use disorders, and subthreshold disorders. The reliability and procedural validity of the MILP are comparable to those of other structured interviews, although the results highlight the different thresholds for diagnosis compared with the SCID, and potential difficulties with some of the subthreshold disorders, those on the boundary of normality, and those requiring a degree of judgment. As with the SCID, the importance of paying "careful attention to adequate training of interviewers" (p. 636)50 is affirmed. The results confirm the need for more study of the classification of psychiatric disorders in the physically ill and the testing of construct and predictive validity of diagnostic categories.


  ACKNOWLEDGMENTS

 
The authors thank Anne Silbereisen, Kevan Pitcher, and Lisa Henry, who conducted the interviews; Paul Low for data management; and Roland Yap for writing the PROLOG software for the diagnostic algorithm. The study was supported by the National Health and Medical Research Council of Australia.


  REFERENCES

 
 

  1. Vaillant GE: The disadvantages of DSM-III outweigh its advantages. Am J Psychiatry 1984; 141:542–545
  2. Berrios GE: Phenomenology and psychopathology: was there ever a relationship? Compr Psychiatry 1993; 34:213–220
  3. Leigh H, Price L, Ciarcia J, et al: DSM-III and consultation-liaison psychiatry: towards a comprehensive medical model of the patient. Gen Hosp Psychiatry 1982; 4:283–289
  4. Lipowski ZJ: Is "organic" obsolete? Psychosomatics 1990; 31:342–344
  5. Snaith RP: The concepts of mild depression. Br J Psychiatry 1987; 150:387–393
  6. Wells KB, Stewart A, Hays RD, et al: The functioning and well-being of depressed patients: results from the Medical Outcomes Study. JAMA 1989; 262:914–919
  7. McKegney FP, McMahon T, King J: The use of DSM-III in a general hospital consultation-liaison service. Gen Hosp Psychiatry 1983; 5:115–121
  8. Creed F: Anxiety in general medical patients, in Handbook of Anxiety, Vol. 2: Classification, Etiological Factors and Associated Disturbances, edited by Noyes R, Roth M, Burrow GD. Amsterdam, The Netherlands, Elsevier Science Publishers, 1988, pp. 239–268
  9. Foulds GA, Bedford A: Hierarchy of classes of personal illness. Psychol Med 1975; 5:181–192
  10. Katon W, Lin E, VonKorff M, et al: Somatization: a spectrum of severity. Am J Psychiatry 1991; 148:34–40
  11. DeGruy F, Crider J, Hashimi DK, et al: Somatization disorder in a university hospital. J Fam Pract 1987; 25:579–584
  12. Robins LN, Helzer JE, Croughan J, et al: National Institute of Mental Health Diagnostic Interview Schedule: its history, characteristics, and validity. Arch Gen Psychiatry 1981; 38:381–389
  13. Robins LN, Wing J, Wittchen H-U, et al: The Composite International Diagnostic Interview: an epidemiologic instrument suitable for use in conjunction with different diagnostic systems and in different cultures. Arch Gen Psychiatry 1988; 45:1069–1077
  14. Janca A, Ustun TB, Sartorius N: New versions of World Health Organization instruments for the assessment of mental disorders. Acta Psychiatr Scand 1994; 90:73–83
  15. Spitzer RL, Williams JBW, Gibbon M, et al: The Structured Clinical Interview for DSM-III-R (SCID) I: history, rationale and description. Arch Gen Psychiatry 1992; 49:624–629
  16. Yap RHC, Clarke DM: An expert system for psychiatric diagnosis using the DSM-III-R, DSM-IV and ICD-10 classifications. Paper published in the Proceedings of the Annual Fall Symposium of the American Medical Informatics Association, Washington DC, October 1996, pp. 229–233
  17. American Psychiatric Association: Diagnostic and Statistical Manual of Mental Disorders, 3rd Edition, Revised. Washington, DC, American Psychiatric Association, 1987
  18. American Psychiatric Association: Diagnostic and Statistical Manual of Mental Disorders, 4th Edition. Washington, DC, American Psychiatric Association, 1994
  19. World Health Organization: The ICD–10 classification of mental and behavioural disorders. Diagnostic criteria for research. Geneva, Switzerland, World Health Organization, 1993
  20. Stewart MA, Drake F, Winokur G: Depression among medically ill patients. Dis Nerv Syst 1965; 26:479–485
  21. Endicott J: Measurement of depression in patients with cancer. Cancer 1984; 53:2243–2247
  22. Davidson JRT, Miller RD, Turnbull CD, et al: Atypical depression. Arch Gen Psychiatry 1982; 39:527–534
  23. Carney MWP, Roth M, Garside RF: The diagnosis of depressive syndromes and the prediction of ECT response. Br J Psychiatry 1965; 111:659–674
  24. Vargas LA, Loya F, Hodde-Vargas J: Exploring the multidimensional aspects of grief reactions. Am J Psychiatry 1989; 146:1484–1488
  25. Escobar JI, Canino G: Unexplained physical complaints: psychopathology and epidemiological correlates. Br J Psychiatry 1989; 154:24–27
  26. McGorry PD, Singh B, Copolov DL, et al: Royal Park Multidiagnostic Instrument for Psychosis: Part II. Development, reliability and validity. Schizophr Bull 1990; 16:517–536
  27. Wittchen H-U, Robins LN, Cottler LB, et al: Cross-cultural feasibility, reliability and sources of variance of the Composite International Diagnostic Interview (CIDI). Br J Psychiatry 1991; 159:645–653
  28. Lobo A, Campos R, Perez-Echeverria M-J, et al: A new interview for the multiaxial assessment of psychiatric morbidity in medical settings. Psychol Med 1993; 23:505–510
  29. Helzer JE, Robins LN, Taibleson M, et al: Reliability of psychiatric diagnosis, I: A methodological review. Arch Gen Psychiatry 1977; 34:129–133
  30. Grove WM, Andreasen NC, McDonald-Scott P, et al: Reliability studies of psychiatric diagnosis. Arch Gen Psychiatry 1981; 38:408–413
  31. Skre I, Onstad S, Torgersen S, et al: High interrater reliability for the Structured Clinical Interview for DSM-III-R Axis I (SCID-I). Acta Psychiatr Scand 1991; 84:167–173
  32. Janca A, Robins LN, Bucholz KK, et al: Comparison of Composite International Diagnostic Interview and clinical DSM-III-R criteria checklist diagnoses. Acta Psychiatr Scand 1991; 84:167–173
  33. Janca A, Robins LN, Cottler LB, et al: Clinical observation of assessment using the Composite International Diagnostic Interview (CIDI): an analysis of the CIDI field trials—wave II at St. Louis site. Br J Psychiatry 1992; 160: 815–818
  34. Goldberg DP: The detection of psychiatric illness by questionnaire, Maudsley Monograph No. 21. London, UK, Oxford University Press, 1972
  35. Goodchild ME, Duncan-Jones P: Chronicity and the General Health Questionnaire. Br J Psychiatry 1985; 146:55–61[Abstract/Free Full Text]
  36. Clarke DM, Smith GC, Hermann HE. A comparative study of screening instruments for mental disorders in general hospital patients. Int J Psychiatry Med 1993; 23:323–337[Medline]
  37. Cohen J: A coefficient of agreement for nominal scales. Educational Psychological Measure 1960; 20:37–46
  38. Streiner DL: Learning how to differ: agreement and reliability statistics in psychiatry. Can J Psychiatry 1995; 40:60–66[Medline]
  39. Langenbucher J, Labouvie E, Morgenstern J: Measuring diagnostic agreement. J Consult Clin Psychol 1996; 64:1285–1289[Medline]
  40. Spitznagel EL, Helzer JE: A proposed solution to the base rate problem in the kappa statistic. Arch Gen Psychiatry 1985; 42:725–728[Abstract]
  41. Baldessarini RJ, Finklestein S, Arana GW: The predictive power of diagnostic tests and the effect of prevalence of illness. Arch Gen Psychiatry 1983; 40:569–573[Abstract]
  42. Landis JR, Koch GG: The measurement of observer agreement for categorical data. Biometrics 1977; 33:159–174[Medline]
  43. Fleiss JL: Statistical Methods for Rates and Proportions, 2nd Edition. New York, Wiley, 1981
  44. Spitzer RL, Williams JBW: Classification of Mental Disorders and DSM-III, in Comprehensive Textbook of Psychiatry, 4th Edition, edited by Kaplan HI, Sadock BJ. Baltimore, MD, Williams & Wilkins, 1985, pp. 591–613
  45. First MB, Williams JBW, Spitzer RL: DTREE: the electronic DSM-III-R (computer software, user's guide, case workbook). Washington, DC, American Psychiatric Press, Inc., 1989
  46. First MB, Opler LA, Hamilton RM, et al: Evaluation in an inpatient setting of DTREE: a computer-assisted diagnostic assessment procedure. Compr Psychiatry 1993; 34:171–175
  47. Carey G, Gottesman II: Reliability and validity in binary ratings. Arch Gen Psychiatry 1978; 35:1454–1459
  48. Berry KJ, Mielke PW, Helmericks SG: An algorithm to generate discrete probability distributions: binomial, hypergeometric, negative binomial, inverse hypergeometric and Poisson. Behavior Research Methods, Instruments, and Computers 1994; 26:366–367
  49. Campbell DT, Fiske DW: Convergent and discriminant validation by the multitrait-multimethod matrix. Psychol Bull 1959; 56:81–105
  50. Williams JBW, Gibbon M, First MB, et al: The Structured Clinical Interview for DSM-III-R (SCID) II: Multisite test–retest reliability. Arch Gen Psychiatry 1992; 49:630–636
  51. Helzer JE, Robins LN, McEvoy LT, et al: A comparison of clinical and Diagnostic Interview Schedule diagnoses: Physician re-examination of lay-interviewed cases in the general population. Arch Gen Psychiatry 1985; 42:657–666
  52. Anthony JC, Folstein M, Romanoski AJ, et al: Comparison of the lay Diagnostic Interview Schedule and a standardized psychiatric diagnosis. Arch Gen Psychiatry 1985; 42:667–675
  53. Robins LN, Helzer JE, Ratcliff KS, et al: Validity of the Diagnostic Interview Schedule, Version II: DSM-III diagnoses. Psychol Med 1982; 12:855–870
  54. McGorry PD, Mihalopoulos C, Henry L, et al: Spurious precision: procedural validity of diagnostic assessment in psychotic disorders. Am J Psychiatry 1995; 152:220–223
  55. Burnam MA, Karno M, Hough RL, et al: The Spanish Diagnostic Interview Schedule. Arch Gen Psychiatry 1983; 40:1189–1196
  56. Weiss MG, Raguram R, Channabasavanna SN: Cultural dimensions of psychiatric diagnosis: a comparison of DSM-III-R and illness explanatory models in South India. Br J Psychiatry 1995; 166:353–359
  57. Goldberg DP, Cooper B, Eastwood MR, et al: A standardized psychiatric interview for use in community surveys. Br J Prev Soc Med 1970; 24:18–23



Copyright © 1998 Academy of Psychosomatic Medicine. All rights reserved.
