Translation, cross-cultural adaptation, validation, and reliability of the Arabic version of diagnostic infant preschool assessment (DIPA) scale

The diagnostic infant and preschool assessment (DIPA) is one of the few available instruments developed to assess young children up to 6 years old. The present study translated, validated, and cross-culturally adapted the DIPA from English to Arabic. Forward translation, expert panel evaluation, and back translation of the DIPA were conducted, followed by assessment of cultural relevance and content validity. Validation was performed on a clinical sample of 30 children, through agreement between the DIPA and the Arabic version of the DSM-based Child Behavior Checklist (CBCL). Validity of the categorical variables of the translated DIPA showed substantial kappa (0.61-0.80) for conduct disorder; moderate kappa (0.41-0.60) for depressive disorder, post-traumatic stress disorder, generalized anxiety disorder, oppositional defiant disorder, and sleep disorders; and poor kappa (0-0.40) for separation anxiety disorder, attention deficit hyperactivity disorder, and reactive attachment disorder. Test-retest reliability showed almost perfect agreement for all disorders (kappa > 0.81). The current study shows encouraging psychometric properties for a new Arabic-translated and culturally validated assessment tool for psychiatric disorders in young Egyptian children. This instrument is useful in examining DSM-IV disorders in young children. Future studies need to include a larger sample size, children younger than 1.5 years, and patients from specialty clinics.

on interviews of their caregivers [3], by using the Child Behavior Checklist 1.5-5 years (CBCL) [4], Infant-Toddler Social and Emotional Assessment [5], and Preschool Age Psychiatric Assessment (PAPA) [6]. However, these instruments do not provide full coverage of the symptoms of childhood psychiatric disorders according to the Diagnostic and Statistical Manual, 5th Edition (DSM-5), and they cannot identify disorder-specific functional impairment, which is essential for making diagnoses [3].
The diagnostic infant and preschool assessment (DIPA) was developed with various characteristics to compensate for the limitations in other instruments. DIPA is an interview of caregivers about their children under 6 years of age. It is updated for the Diagnostic and Statistical Manual, 5th Edition (DSM-5) including 16 psychiatric disorders with all their symptoms. The DIPA also assesses functional impairment for each specific disorder and it does not require a clinically experienced interviewer [3].
Conceptions of health and illness in general vary according to cultural, social, and linguistic factors. Meticulous consideration should therefore be given to the validation and cultural adaptation of psychometric instruments before they are used in non-Western and non-English-speaking countries.
The main goal of the present study was to contribute to the development of a linguistically and culturally appropriate DIPA instrument for use in the early detection of childhood mental disorders among Arab children.

Methods
This study consisted of two parts: (1) translation and cross-cultural adaptation of the original English version of the DIPA instrument into the Arabic language and (2) testing of the psychometric properties of the Arabic-language version of the DIPA (DIPA-A). The translation and cross-cultural adaptation of the DIPA to Arabic followed the World Health Organization (WHO) guidelines for the process of translation and adaptation of instruments [7].

Ethical consideration
The authors' approval was obtained before starting the translation and cultural adaptation process. Written informed consent was obtained from the parents of children who agreed to participate in the study. The study was conducted in accordance with the guidelines of the Research and Ethics Committee of Okasha Institute of Psychiatry, Ain Shams University.

Study design
This study is a cross-sectional study.

Site of the study
The study was conducted in Cairo, Egypt. Cases were recruited over a 2-month period, from 15 January 2018 to 15 March 2018, from two child mental health clinics:
1) The child psychiatry outpatient clinic at Al-Abbassia Mental Health Hospital, Cairo, Egypt.
2) The child psychiatry outpatient clinic at Okasha Institute of Psychiatry, Ain Shams University, Cairo, Egypt.

Translation and cultural adaptation of the diagnostic infant and preschooler assessment (DIPA)
This process took about a year, from November 2016 to December 2017. We followed the WHO recommended steps for translation and adaptation of instruments. Implementation of this method includes the following steps [7]:

Forward translation
The WHO guidelines recommend that translators preferably be health professionals, so the DIPA was translated from English into Arabic and back to English by psychiatrists specializing in childhood disorders with experience in using rating scales in clinical and research contexts. They were Egyptians by birth, proficient in the English language, and aware of the purpose of the DIPA tool.

Expert panel
A committee of specialists (two psychologists and three psychiatrists) evaluated the translation to identify and resolve inadequate expressions and concepts. The panel reviewed the entire translation, noted the difficulties they experienced in using the scale, and suggested some changes. All suggestions were discussed and incorporated into the final version of the DIPA. For example, the expression "driven by a motor" in question A14 is not used in our culture, so it was modified to "Does not feel comfortable in the stability for long periods, such as sitting in a restaurant." In addition, the word "time" was added to questions B2, B3, B4, B5, B6, B7, and B8 to clarify that the symptoms occur over a duration of time.

Back translation
The final Arabic-language version was back translated by a bilingual psychiatrist who did not have access to the original English version. The expert panel evaluated this version again, in order to compare it with the original version. Two words were changed to be more accurate in questions Q3 and A13.

Preliminary study of DIPA (pre-testing and cognitive interviewing)
So that the scale could be used in an easy and systematic way, a psychiatry specialist received training on applying the instrument. She then conducted a pre-test with 10 patients to assess the clarity of the instructions, the understandability of the content, and the assessment of each item in the scale.
Regarding the application of the instrument, the time participants needed to complete the questionnaire ranged from 25 to 55 min. During this step, participants were asked to give their opinion on the instrument in general and on each of its items. Participants were unanimous in considering the questionnaire easy to understand.

Test-retest reliability between two settings

Selection of the sample

Sample size
For reliability analysis, the standard advice is to have at least 10 participants per item on the scale. However, this should be regarded as the bare minimum. Many authors have studied the power of different sample sizes to detect a given alpha. Based on these studies [8,9], Samuels recommended against running reliability analysis with fewer than 30 participants [10].
Over a 2-month period, parents of 76 children of both sexes, aged 1.5 to 5 years, attending the outpatient clinics of Al-Abbasia Psychiatric Hospital and the child psychiatry outpatient clinic at the Institute of Psychiatry, Cairo, Egypt, were invited to participate in the study. Children were excluded if their IQ was less than 90 or if they had a neurological condition, any other medical condition, or other neuro-developmental disorders (e.g., autism and schizophrenia), as confirmed by the routine data sheet used at Ain Shams University Institute of Psychiatry (ASUIP). Sixteen parents refused to participate. Twenty-three children were excluded due to the presence of one of the above exclusion criteria. During the study, seven participants missed their appointments. After these drop-outs, the sample that completed all sessions comprised 30 patients.
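The participant flow described above can be checked with simple arithmetic. The short Python sketch below only restates the counts reported in the text; the variable names are illustrative and are not part of the study.

```python
# Participant flow, using only the counts reported in the text.
invited = 76    # parents invited to participate
refused = 16    # parents who refused to participate
excluded = 23   # children meeting at least one exclusion criterion
missed = 7      # participants who missed their appointments

completed = invited - refused - excluded - missed
print(completed)  # 30
```

The result matches the final sample of 30 patients who completed both settings.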

Tools
- Participants were interviewed twice by the trained psychiatry specialist. In the first setting, the diagnostic infant preschool assessment (DIPA) and the Child Behavior Checklist (CBCL) were used, while only the DIPA was used in the second setting. The mean duration between the two settings was 9.8 days; the range was 7-21 days. Test-retest reliability was evaluated by comparing the results of the two settings.
- Categorical tests were performed for each disorder on two types of outcomes: (1) diagnosed disorder (fulfilled DSM-5 diagnostic criteria and functional impairment present), (2) subclinical cases (symptoms present but did not fulfill the DSM-5 criteria and functional impairment may or may not be present).

Diagnostic infant preschool assessment (DIPA)
The DIPA is an interview of parents about their children from the first year of life through 6 years. It is updated for the Diagnostic and Statistical Manual, 5th Edition (DSM-5).
It covers 16 disorders. The time frame of the interview requires that a symptom or behavior has been present within the last 4 weeks.
The DIPA evaluates functional impairment in a disorder-specific fashion by enquiring about impairment at the end of each disorder. Five areas of role functioning are covered: with parents, with siblings, with peers, at school/day care, and in public [11].

Child Behavior Checklist (CBCL) the 1.5-5 years version
The 1.5 to 5 years version of the CBCL is one of the most widely used rating-scale screening measures for preschool child psychopathology currently available [12]. The test-retest reliabilities for scale creation ranged from 0.78 to 0.88 (Pearson r overall mean = 0.83). The DSM-oriented scales were derived through an empirical process in which an international panel rated the CBCL items for nine DSM disorders [13]. These DSM-oriented scales subsequently showed significant phi correlations with diagnoses derived from DISC interviews (ADHD scale with ADHD diagnosis 0.65, ODD scale with ODD diagnosis 0.42, affective scale with MDD diagnosis 0.57, anxiety scale with SAD diagnosis 0.37), except that the anxiety scale did not correlate significantly with GAD diagnosis (0.29) [13].

Validity through agreement between the translated DIPA and Child Behavior Checklist scales
The results of the first setting for every disorder in the diagnostic infant and preschool assessment (DIPA) were tested separately against the Child Behavior Checklist (CBCL) scores. Attention deficit hyperactivity disorder (ADHD) diagnosis was compared with the attention deficit hyperactivity problems scale; oppositional defiant disorder (ODD) diagnosis with the oppositional defiant problems scale; major depressive disorder (MDD) diagnosis with the depressive problems scale; sleep disorders diagnosis with the sleep problems scale; and conduct disorder (CD) diagnosis with the aggressive behavior scale. Post-traumatic stress disorder (PTSD), separation anxiety disorder (SAD), generalized anxiety disorder (GAD), reactive attachment disorder (RAD), and obsessive-compulsive disorder (OCD) diagnoses were each compared separately with the anxiety problems scale.

Statistical analysis
Results were tabulated, grouped, and statistically analyzed using the Statistical Package for the Social Sciences (SPSS), version 15 (2007). Numerical data were expressed as mean, standard deviation, and range. Qualitative data were expressed as frequency and percentage. Test-retest reliability for categorical data was assessed using Cohen's kappa to evaluate agreement between the two settings. Reliability assessment was based on the accepted ranges of Cohen's kappa: poor 0-0.4, fair to good 0.4-0.6, substantial 0.6-0.8, and excellent 0.8-1.0 [14]. All tests were two-tailed. A P value < 0.05 was considered significant. Validity of the DIPA subscales on categorical variables against the CBCL was also assessed using Cohen's kappa.
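As an illustration of the agreement statistic described above, the following is a minimal Python sketch of Cohen's kappa, computed as (observed agreement - chance agreement) / (1 - chance agreement), together with the interpretation ranges quoted in the text. It is not the SPSS procedure used in the study. The function names, the handling of values that fall exactly on a range boundary (assigned here to the lower range), and the example ratings are all assumptions made for illustration and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    if len(ratings_a) != len(ratings_b) or len(ratings_a) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed proportion of cases on which the two ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance, from the marginal category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    categories = set(counts_a) | set(counts_b)
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        # Both ratings place every case in one category (e.g. no diagnosed
        # cases at all), so kappa cannot be computed.
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def kappa_band(kappa):
    """Map kappa to the ranges quoted in the text.

    Assumption: a boundary value belongs to the lower range, and negative
    values are also labelled "poor".
    """
    if kappa <= 0.4:
        return "poor"
    if kappa <= 0.6:
        return "fair to good"
    if kappa <= 0.8:
        return "substantial"
    return "excellent"


# Hypothetical example (not study data): 30 children, 1 = diagnosed, 0 = not.
first_setting = [1] * 10 + [0] * 20
second_setting = [1] * 9 + [0] * 21  # one child changes from 1 to 0

kappa = cohen_kappa(first_setting, second_setting)
print(round(kappa, 3), kappa_band(kappa))  # 0.923 excellent
```

The undefined case in `cohen_kappa` corresponds to a disorder with no diagnosed cases in either setting, for which agreement beyond chance cannot be estimated.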

Sociodemographic characteristics of the sample
The current study examined 30 children. Their mean age was 4 years, with a standard deviation of 0.6 and a range of 2.9 to 4.8 years. Boys predominated, as 24 children (80%) were male.

Clinical characteristics of the study participants by CBCL
Eight participants (26.7%) were above the internalizing 60th percentile cutoff, and four children (13.3%) exceeded the externalizing 60th percentile cutoff. On categorical disorders, 56.6% were above one or more 70th percentile cutoffs, denoting that this was a symptomatic group, as would be expected of a clinical population. The mean duration between interviews was 9.8 days (range 7-21 days) (Tables 1 and 2).

Test-retest reliability
Categorical tests were conducted between the results of the two settings for each disorder on two types of outcomes: (1) diagnosed disorder (symptoms and functional impairment were present) and (2) subclinical cases (symptoms present but did not fulfill the DSM-5 criteria and functional impairment may or may not be present). Agreement was almost perfect (kappa > 0.81) for all disorders, with significant P values. No cases of bipolar disorder, OCD, or phobias were diagnosed among the study sample, so kappa could not be computed for these disorders (Table 3).
For validity against the CBCL on categorical variables, kappas were substantial (kappa 0.61-0.80) for one disorder (CD), moderate (kappa 0.41-0.60) for five disorders (PTSD, GAD, MDD, ODD, sleep), and poor (0-0.40) for three disorders (SAD, ADHD, and RAD). No cases of bipolar disorder, OCD, or phobias were diagnosed, so the validity of the DIPA for these disorders could not be tested (Table 4).

Discussion
Recent studies have suggested that psychopathology may be as prevalent in preschoolers as in school-age children [15]. It is therefore necessary to have a diagnostic instrument made specifically for infants and preschoolers, such as the diagnostic infant preschool assessment (DIPA). Given the need for such an instrument in clinical practice and research in Egypt and the Arab countries, the aim of this study was to translate the DIPA and evaluate its reliability and validity on an Egyptian sample.
This is the first Egyptian study to translate and cross-culturally adapt the DIPA, in accordance with international guidelines to ensure the quality of the results. The final Arabic version of the translated and adapted DIPA showed high levels of acceptance and verbal understanding.
Translation is not a single process leading from a starting point (ST = source text) to a target point (TT = target text), but a more complicated and recursive process that comprises an infinite number of feedback loops, in which it is possible to return to earlier stages of the analysis [16]. The success of instrument translation is primarily determined by the professional knowledge, cultural experience, and linguistic competence of the translators, as well as their familiarity with the study objectives. They must also be aware of the aim of the tool so that the meanings of terms agree with the context [17,18]. In our study, both the translators and the back translator met these criteria, and the back translator was fully blind to the original version of the DIPA to avoid any bias in the correction of the translation.
The current study used an expert panel discussion in the process of translating the DIPA. This helped to improve the quality of the translation through the experts' constructive feedback and discussion about the use of culturally, psychologically, and religiously sensitive translated words. After developing the final version, we measured the test-retest reliability and validity of the Arabic instrument on a sample of 30 children.
In terms of test-retest reliability, categorical tests were conducted between the results of the two settings for each disorder on two types of outcomes: diagnosed and subclinical cases.
Our findings show satisfactory results for test-retest reliability, as agreement was almost perfect (kappa > 0.81) for all disorders, with significant P values. This result is slightly higher than the results of the DIPA 2010 version, for which kappa was substantial (kappa 0.6-0.8) for one disorder (MDD), fair to good (kappa 0.4-0.6) for four disorders (ADHD-inattentive, ADHD-hyperactive, PTSD-AA, and SAD), and poor (kappa 0-0.4) for one (ODD). This may be because the DIPA 2017 version uses Likert-style answers on a 0-4 scale instead of the yes/no answers of the DIPA 2010 version, which allows a greater range of sensitivity.
This finding is comparable with those of several other studies that used other tools for the assessment of preschool children, such as the Schedule for Affective Disorders and Schizophrenia for School-Age Children (K-SADS-PL), which is one of the most used instruments in child psychiatry. The kappas for all K-SADS-PL positive screening symptoms were between 0.70 and 0.86 (all P values < 0.01) [19]. Test-retest reliability kappas for the Preschool Age Psychiatric Assessment (PAPA) ranged from 0.36 to 0.79 [6], and in the Child and Adolescent Psychiatric Assessment (CAPA) the overall reliability of diagnosis ranged from kappa = 0.55 (conduct disorder) to 1.0 (substance abuse or dependence) [20]. Test-retest agreement of the diagnostic interview for children and adolescents for parents of preschool and young children (DICA-PPYC), with a mean interval of 8.8 days, ranged from slight to excellent (kappa from 0.39 to 1) for DSM-IV-TR diagnoses and from fair to good (kappa from 0.49 to 0.77) for research diagnostic criteria-preschool age diagnoses [21].
One limitation of this study is that our sample did not include children under 1.5 years. Further Egyptian studies on children below this age are required to determine the lower age limit for which a diagnostic instrument is valid.
The Arabic version of the DIPA 2017 revealed acceptable criterion validity when compared to the CBCL. For categorical variables, kappas were substantial (kappa 0.61-0.80) for one disorder (CD), moderate (kappa 0.41-0.60) for five disorders (PTSD, GAD, MDD, ODD, sleep), and poor (kappa 0-0.4) for three disorders (SAD, RAD, and ADHD). In addition, the P value was significant for all disorders except SAD and RAD.
These findings are comparable with the validation of the DIPA 2010 version, in which kappas for disorders with impairment were fair to good for one disorder (SAD) for clinicians and for three disorders (ADHD-hyperactive, ODD, and PTSD-AA) for RAs. Kappas were poor for five disorders (ADHD-inattentive, MDD, PTSD-DSM-IV, GAD, and OCD) for both clinicians and RAs, for two more disorders (ADHD-hyperactive and PTSD-AA) for clinicians, and for one more (SAD) for RAs [3].
In addition, the validity of the Arabic DIPA 2017 was slightly higher than that of the DIPA 2010 [3]. In the validation study of the DIPA 2010 version, there were no cases of GAD, and DIPA validity was not measured for conduct disorder (CD), reactive attachment disorder (RAD), or sleep disorders. In contrast, our study had cases of GAD, CD, RAD, and sleep disorders. However, there were no bipolar or OCD cases in either study. The lack of these disorders is consistent with the fact that they are rare in this age group [22].
Another study limitation is that the size and character of the sample limited the ability to examine some psychiatric disorders, such as bipolar disorder and OCD, and to some extent RAD. In our study, internalizing disorders were generally less prevalent, and there were too few symptoms of these disorders to draw reliable conclusions. Previous studies investigating less prevalent psychiatric disorders (such as OCD and bipolar affective disorder) concluded that samples need to be recruited from a specialty clinic in order to find enough symptomatic patients [22].
Despite this limitation, our study is considered the first attempt to translate and validate the DIPA instrument. In addition, its sample size is still larger than that of previous studies of some other instruments developed for older children, including the diagnostic interview for children and adolescents (n = 27) [23] and the schedule for affective disorders and schizophrenia for school-aged children (n = 20) [19].

Conclusion
The current study shows encouraging psychometric properties for a new Arabic-translated and culturally validated assessment tool for psychiatric disorders in young Egyptian children. This instrument is useful in examining DSM-IV disorders in young children. It thus supports the global analysis of symptoms and helps in the early diagnosis and management of children in this age group. Future studies need to include a larger sample size, children younger than 1.5 years, and patients from specialty clinics.