An analysis of the quality of studies that evaluate potentially inappropriate drug therapy

s (2 = 0.647). At all stages, the intervention of a third evaluator was necessary to resolve disagreements between the two primary researchers (Figure 1). Of the selected studies, 40.3% were performed in Europe, 32.7% in North America, and 4.2% did not indicate the country where the research was conducted. The samples observed were heterogeneous; some 316 Afr. J. Pharm. Pharmacol. studies evaluated individual patients, while other studies evaluated prescriptions in databases. Thus, the sample size varied from 30 patients in the study by Stuij et al. (2008) to 33,830,599 prescriptions in the study by Lai et al. (2009). When we grouped the different samples used in “number of reviews,” the average number of reviews (patients or prescriptions) across studies was 1,223. Three studies did not specify the sample size. The duration of the studies varied from 1 month to 9 years. Notably, 79% of the studies did not indicate or specify a study duration. Among the studies, 32.7% were cross-sectional, 19.3% were cohort, and 19.3% did not report which methodological design was used in the study. In addition, 22.6% did not provide a complete description of the methodological design. As for the study scenario, the most frequent were hospitals or outpatient clinics, which accounted for 38.6% of the studies. In 8.4% of the studies, retirement, social security, and health plan databases were used for data collection. Only two studies were undertaken using more than one study scenario (Crotty et al., 2004; Miquel et al., 2010). Additionally, 94.9% of the studies were written in English, and 15.9% of the articles did not mention their limitations in the text. Regarding the fulfillment of the items proposed by STROBE, 49 articles met between 60 and 100% of the 34 items recommended by the initiative (Table 1).


INTRODUCTION
The aging process produces physiological and pathological alterations that increase the predisposition to chronic diseases and the consequent use of various medications. This increased consumption of medication raises the odds of the elderly population using five or more drugs (polypharmacy), which increases the occurrence of problems related to the use of medication (Ribeiro et al., 2005; Soares et al., 2011). For this reason, pharmacotherapy for the elderly is challenging, especially if potentially inappropriate drug therapy (PIDT) is prescribed, as this increases health risks (Gallagher et al., 2007).
Medication is potentially inappropriate when its risks outweigh its benefits (Beers et al., 1991; Gallagher et al., 2008). Notably, elderly patients consume three times more medications than young adults in industrialized countries. According to Brekke et al. (2008), 10 to 20% of hospital admissions among elderly people are due to PIDT use. This is because elderly persons using PIDT are 1.8 to 1.9 times more likely to be hospitalized (Albert et al., 2010).
Additionally, there is worldwide discussion about whether the standards used in the prescription of pharmacotherapy in older people are inappropriate (Iyer et al., 2008). For example, a study conducted in the south of Ireland with 1,329 patients over 65 years of age, with an average of five drugs per patient, identified 632 prescriptions containing PIDT (Albert et al., 2010). Laroche et al. (2007) showed that the incidence of damage caused by medication was 20.4% among patients with PIDT, compared to 16.4% for patients who used only medications appropriate for the elderly.
Concerns regarding the harmful effects of the use of medication by the elderly led health professionals, such as pharmacists and physicians, to develop and implement various methods and tools to identify PIDT prescription patterns (Ribeiro et al., 2005). Therefore, the adequacy of these techniques should be evaluated by explicit and implicit methods, and the tools validated to reduce PIDT prescription (Iyer et al., 2008; Forsetlund et al., 2011). Some reviews discuss these instruments, but there are few published systematic reviews assessing the quality of studies that use tools to evaluate PIDT in various practice scenarios (Guaraldo et al., 2011; Dimitrow et al., 2011). The purpose of this review was to analyze research that uses tools to assess PIDT through the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) initiative.

METHODOLOGY
A review of the scientific literature was performed to identify studies involving inappropriate prescriptions for elderly patients. The Literatura Latino-Americana e do Caribe em Ciências da Saúde (LILACS), PubMed, Scopus, and Web of Science databases were searched (up to January 2013). The search strategy included the following keyword terms in various combinations: in English, "aged," "elderly," "inappropriate prescribing," and "drug utilization"; in Spanish, "anciano," "utilización de medicamentos," and "prescripción inadecuada"; and in Portuguese, "idoso," "medicamento inapropriado," "medicamento inadequado," and "uso de medicamento". The search strategies were adapted according to the protocols of each database. The keywords were defined using the National Library of Medicine's controlled vocabulary thesaurus (MeSH), which consists of sets of descriptors arranged in a hierarchical structure that permits searching at various levels of specificity. In addition to the MeSH terms, other non-standard terms were used to expand the search strategy. The study design followed the guidelines of the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA).
The subsequent screening process was performed in three stages (title, abstract, and full text screening) by two researchers (APALS and DTS); when there was disagreement, a third researcher (DPLJ) analyzed and judged the discrepancy. The measure of agreement between the two reviewers, defined as Cohen's kappa (κ), was calculated with a 95% confidence interval. The titles and abstracts were compared using the following predefined inclusion criteria to determine the relevance of the theme: (i) the study involved the use of potentially inappropriate medication for elderly patients, and (ii) the study used a validated tool to make such an assessment.
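As an illustration of the agreement measure described above, the following is a minimal sketch of how Cohen's kappa can be computed from two reviewers' screening decisions. The decision lists are hypothetical and are not data from this review. The `agreement_label` helper assumes the widely used Landis and Koch bands; the text does not name the scale it applied, and the 95% confidence interval is not computed here.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who judged the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must judge the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items with identical decisions.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    categories = count_a.keys() | count_b.keys()
    expected = sum(count_a[c] * count_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)


def agreement_label(kappa):
    """Verbal label for kappa (assumed Landis and Koch bands)."""
    if kappa < 0:
        return "poor"
    if kappa <= 0.20:
        return "slight"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "substantial"
    return "almost perfect"


# Hypothetical title-screening decisions for eight records.
reviewer_1 = ["include", "include", "exclude", "exclude",
              "include", "exclude", "exclude", "exclude"]
reviewer_2 = ["include", "exclude", "exclude", "exclude",
              "include", "exclude", "include", "exclude"]

kappa = cohens_kappa(reviewer_1, reviewer_2)
print(round(kappa, 3), agreement_label(kappa))
```

Under the assumed bands, a kappa of 0.479 would be labeled moderate and a kappa of 0.647 substantial.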
A researcher (APALS) performed an initial selection, which excluded the titles that did not meet the inclusion criteria. The studies excluded were as follows: (i) reviews and editorials; (ii) studies not written in English, Portuguese, or Spanish; (iii) studies that did not provide the abstract or full text (even after attempts to obtain them by direct email to the authors); (iv) studies that evaluated only one or two classes of drugs; and (v) studies evaluating PIDT in only one or two diseases.
The papers that satisfied the inclusion criteria for data extraction were carefully examined regarding the following variables: country, sample size, duration, study type, practice scenario, language of publication, limitations, and fulfillment of the items proposed by the STROBE initiative. The final analysis was performed to assess the methodological rigor of the articles published in this research area; for that purpose, the STROBE tool was used (Malta et al., 2010). The tool's 22 items were separated into 34 items to allow a more complete and accurate description of observational studies. In this review, each item fulfilled by the article was awarded one point; thus, the score could vary from 0 (0%) to 34 (100%) points.
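The scoring rule above can be sketched as follows. This is only an illustration under stated assumptions: the function name `strobe_score` and the example article with 21 fulfilled items are hypothetical, and the mapping of the 22 STROBE items onto 34 scored items is not reproduced here.

```python
TOTAL_ITEMS = 34  # the 22 STROBE items split into 34 scored items


def strobe_score(items_fulfilled):
    """Return (points, percent) for one article.

    items_fulfilled: 34 booleans, True when the article fulfills the item.
    Each fulfilled item is worth one point, so points range from 0 to 34.
    """
    flags = list(items_fulfilled)
    if len(flags) != TOTAL_ITEMS:
        raise ValueError(f"expected {TOTAL_ITEMS} items, got {len(flags)}")
    points = sum(bool(flag) for flag in flags)
    return points, 100.0 * points / TOTAL_ITEMS


# Hypothetical article that fulfills 21 of the 34 items.
example_article = [True] * 21 + [False] * 13
points, percent = strobe_score(example_article)
print(points, round(percent, 1))
```

In this hypothetical case the article would fall within the 60 to 100% band, since 21 of 34 items is roughly 61.8%.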

RESULTS
From the various combinations of keywords, 8,610 articles were found. The first evaluation was performed by one of the evaluators (APALS), who excluded 7,372 articles that did not meet at least one of the inclusion criteria. Of the remaining 1,238 articles, 478 were repeated in the databases. Thus, 760 titles were considered potentially relevant. Of these, 365 were excluded for not meeting the inclusion criteria, leaving 395 articles to be evaluated according to the abstracts. In this study, 44 abstracts were not available; therefore, 351 abstracts were read and evaluated. From this evaluation, a further 144 articles were excluded for not meeting the inclusion criteria, leaving 207 articles to be read. At first, 76 articles had no free access, and 50 articles were later retrieved by the bibliographic commutation program of the Brazilian Institute of Science and Technology (IBICT-Comut). Of the articles assessed manually, 62 did not meet the inclusion criteria. At the end of the selection process, 119 articles met the specific inclusion criteria.
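The selection counts reported above can be checked with a short tally. The sketch below assumes that the 26 articles without free access that were not retrieved (76 minus 50) were dropped before full-text reading; the text does not state this explicitly, and the intermediate figure of 181 is derived here rather than reported.

```python
# (reason, number of articles removed at this step), in the reported order.
steps = [
    ("excluded at first evaluation", 7372),
    ("duplicates across databases", 478),
    ("titles not meeting inclusion criteria", 365),
    ("abstracts not available", 44),
    ("abstracts not meeting inclusion criteria", 144),
    ("full texts not retrieved (assumed: 76 - 50)", 76 - 50),
    ("full texts not meeting inclusion criteria", 62),
]

remaining = 8610  # articles found by the keyword search
trail = []
for reason, removed in steps:
    remaining -= removed
    trail.append((reason, remaining))

for reason, left in trail:
    print(f"{reason}: {left} remaining")
```

Under that assumption the tally ends at the 119 included articles, matching the total stated in the text.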
Figure 1 shows the progressive selection, number of articles, and reasons for exclusion at each step. The degree of agreement among the researchers was moderate for the titles (κ = 0.479) and substantial for abstracts (κ = 0.647). At all stages, the intervention of a third evaluator was necessary to resolve disagreements between the two primary researchers (Figure 1).
Of the selected studies, 40.3% were performed in Europe, 32.7% in North America, and 4.2% did not indicate the country where the research was conducted. The samples observed were heterogeneous; some studies evaluated individual patients, while other studies evaluated prescriptions in databases. Thus, the sample size varied from 30 patients in the study by Stuij et al. (2008) to 33,830,599 prescriptions in the study by Lai et al. (2009). When we grouped the different samples used in "number of reviews," the average number of reviews (patients or prescriptions) across studies was 1,223. Three studies did not specify the sample size. The duration of the studies varied from 1 month to 9 years. Notably, 79% of the studies did not indicate or specify a study duration.
Among the studies, 32.7% were cross-sectional, 19.3% were cohort, and 19.3% did not report which methodological design was used in the study. In addition, 22.6% did not provide a complete description of the methodological design. As for the study scenario, the most frequent were hospitals or outpatient clinics, which accounted for 38.6% of the studies. In 8.4% of the studies, retirement, social security, and health plan databases were used for data collection. Only two studies were undertaken using more than one study scenario (Crotty et al., 2004; Miquel et al., 2010). Additionally, 94.9% of the studies were written in English, and 15.9% of the articles did not mention their limitations in the text. Regarding the fulfillment of the items proposed by STROBE, 49 articles met between 60 and 100% of the 34 items recommended by the initiative (Table 1).

DISCUSSION
Most of the studies included were performed in the US. This may be because the Beers criteria (most used/cited in the literature), STOPP-START criteria, Medication Appropriateness Index (MAI), Assessing Care of Vulnerable Elders (ACOVE), drug use review (DUR), HEDIS criteria, and Zhan criteria were developed there. The prevalence of studies and criteria developed in the US confirms the country as a pioneer in the clinical arena, especially regarding the evaluation of pharmacotherapy (Silva et al., 2010). Additionally, several studies were conducted in Europe, which further indicates the progress of PIDT research in developed countries compared to developing countries. Therefore, it is necessary for developing countries to increase research in this area, focusing on the effectiveness of treatments and, above all, the safety of patients.
In the reviewed studies, we found a high variation in sample size, which provided a comprehensive evaluation of the tools used in different sample groups. However, two studies did not clearly describe the size of the sample surveyed (Goulding, 2004; Van der Hooft et al., 2005). In this case, two studies indicated that the lack of information on the sample could reduce the impact of the study (Holmes et al., 2009; Malta et al., 2010). Therefore, the sample in which the hypothesis is being tested should be stated and comprehensively detailed to ensure the robustness of the study.
The largest study samples consisted of retirement and health plan databases used to evaluate PIDT. Although this is a viable strategy to assess the situational diagnosis of a sample, it is necessary to question the validity of the results obtained from databases such as these because the use of secondary data can mask possible selection biases. According to Guaraldo et al. (2011), an active data search can decrease the overestimation or underestimation of drug use because it is unknown whether the patient actually used the prescribed pharmacotherapy.
There was a variation of 107 months between studies. Additionally, some of the manuscripts were unclear in differentiating between the time of data collection and the study duration. Thus, in most of the articles, the real time of execution of the study is not clear, which compromises the reader's understanding. According to von Elm et al. (2007), the author should describe the context in which the study is inserted, in addition to locations and relevant dates, including periods of recruitment, exposure, follow-up (if any), and data collection. Thus, an adequate description assists in the analysis of the results of the study so that they can be incorporated into public policies and/or large interventions, if necessary.
In this review, there were a large number of cross-sectional studies. Cross-sectional studies can be used as analytical studies to evaluate hypotheses of association between exposure/characteristics and an event; they are cost-effective, easy, and fast to perform. In addition, they describe what happens to a particular group, at a particular time, and are thus important guides for decision making in the health-planning sector (Lima-Costa and Barreto, 2003). However, there are limitations when trying to identify the nature of the relationships between exposure and event in these situations. Therefore, confounding factors must be considered in this type of study, which emphasizes the need for clinical trials to evaluate the effect of PIDT in the elderly population (Hanlon et al., 2000). Approximately 42% of the studies included in the analysis either lacked methodological rigor in the description of the study design or did not mention it at all. Methodological rigor is necessary to provide sufficient detail so that the reader can understand and duplicate the methodology if they wish (Holmes et al., 2009).
Among the practice scenarios, there was a higher prevalence of studies performed with institutionalized elderly people in comparison to studies with non-institutionalized elderly. However, this prevalence exists because the criteria used for these studies have been primarily developed for evaluating the pharmacotherapy of non-institutionalized elderly patients who have different socio-demographic and clinical characteristics from institutionalized patients (Hanlon et al., 2011). Moreover, it was observed that some tools developed a priori for non-institutionalized elderly patients were used in institutions. According to Bakken et al. (2012), these criteria should be applied carefully, because they can be affected by differences in study population and data source.

Table 1. Cont'd.
STROBE item | Compliance (%)
Cohort study: Report numbers of outcome events or summary measures over time. Case-control study: Report numbers in each exposure category, or summary measures of exposure. Cross-sectional study: Report numbers of outcome events or summary measures | 96.4
Give unadjusted estimates and, if applicable, confounder-adjusted estimates and their precision (e.g., 95% confidence interval). Make clear which confounders were adjusted for and why they were included | 74.1
Report category boundaries when continuous variables were categorized | 76.7
If relevant, consider translating estimates of relative risk into absolute risk for a meaningful time period | 14.2
Report other analyses done, e.g., analyses of subgroups and interactions, and sensitivity analyses | 86.6
Summarize key results with reference to study objectives | 100
Discuss limitations of the study, taking into account sources of potential bias or imprecision. Discuss both direction and magnitude of any potential bias | 83
Give a cautious overall interpretation of results considering objectives, limitations, multiplicity of analyses, results from similar studies, and other relevant evidence | 97.3
Discuss the generalizability (external validity) of the study results | 43.7
Give the source of funding and the role of the funders for the present study and, if applicable, for the original study on which the present article is based | 50.8
The institutionalization of patients can facilitate the collection and evaluation of data, justifying the high number of hospital-based studies.In this sense, the applicability and reliability of these tools should be evaluated carefully through the analysis of the results obtained in their respective studies to avoid reproducing the erroneous selection of criteria.
Regarding the citation of research limitations in the text, most of the studies were in agreement with Malta and colleagues, who advocate that the manuscript should describe its limitations and consider potential sources of inaccuracy (Malta et al., 2010). Further, the study should discuss the magnitude and direction of potential bias, which is essential for the reader's understanding, as well as for evaluations by the article reviewers (Holmes et al., 2009; Malta et al., 2010). Fewer than half of the observational articles included in the review fulfilled 60% or more of the items proposed by STROBE. Overall, the studies included in this review lacked good methodological consistency. This may be related to a lack of standardization among studies and the fact that the discussion on the use of PIDT is recent. The intention of the STROBE initiative is to offer a recommendation on how to report observational studies more accurately, without making recommendations or prescriptions for the design or conduct of these studies. However, adherence to the items contributes to a more accurate report of such studies, and consequently facilitates the review of these publications by editors, reviewers, and readers (Malta et al., 2010).
In general, the results of the studies included in this review indicated high levels of PIDT. Strategies to reduce unnecessary prescriptions should be implemented to promote more appropriate use of these medications among this age group. The careful use of PIDT lists can assist with the detection of these drugs and prevent problems related to their use (Gallagher et al., 2007). In addition to identification of PIDT, it is necessary to undertake practical interventions. A study that aimed to systematically review the effects of interventions to optimize prescription found that, of the 16 studies assessed, eight reviewed the impact of educational interventions, and of those, six showed statistically significant improvements in prescription quality. A multifaceted approach and clearer policy guidelines are required to improve prescriptions for these vulnerable patients (Loganathan et al., 2011). Moreover, strategies shown to be effective for improving prescription outcomes include educational outreach visits (academic detailing) and interventions involving a pharmacist. Pharmacist services, such as conducting medication reviews or providing advice to general practitioners, may lead to improvements in prescription outcomes (Clyne et al., 2013).

Strengths and limitations
The study's strength is that it was the first review to assess the methodological rigor of studies evaluating PIDT. Its limitations include the use of English, Portuguese, and Spanish keywords, which may have missed important publications in other languages; this limitation is common to systematic review articles. Other keywords, such as "potential inappropriate drug therapy," were not used. Furthermore, database restriction and the search strategy may have excluded important studies not published in the data sources used. The exclusion criteria used in this study may have also excluded relevant studies; however, it was necessary to adopt such measures, as the review's purpose was to evaluate studies focusing on various diseases and medications. Moreover, no studies were analyzed that evaluated the omission or underuse of medication, and studies that obtained negative results may not have been published.

Agenda for future studies
Current PIDT studies are potentially valuable because, in general, their objective is to verify PIDT prevalence in various scenarios, as well as to serve as a warning to health care professionals who work with elderly patients. However, more research is needed in this area, particularly in developing countries, as it is necessary to evaluate the morbidity and mortality related to PIDT use. To reduce the limitations of PIDT studies, an active data collection search is needed, through which the reported prevalence of PIDT will be more reliable. Moreover, studies that relate the use of PIDT with outcomes such as adverse effects, hospitalizations, and deaths are rare, but are required to verify the real problems associated with using PIDT. As noted in this review, studies evaluating interventions, such as education, have shown positive results. Thus, more studies, especially randomized clinical trials, are needed to conclude whether the interventions are indeed effective.

Conclusion
A discussion of the methodological rigor of studies evaluating PIDT is critical and can contribute to the wider health care discussion. This review showed that PIDT is studied mainly in developed countries, which reinforces the need for more research in developing countries. The articles included in this study focused on observing the prevalence of PIDT in various practice scenarios. Most studies were observational and fulfilled at least 40% of the items proposed by the STROBE initiative. Our results have highlighted the potential for more detailed studies about PIDT with practical implications for patient safety.

Table 1. Compliance with the items proposed by STROBE.
STROBE item | Compliance (%)
Cohort study: Give the eligibility criteria, and the sources and methods of selection of participants. Describe methods of follow-up. Case-control study: Give the eligibility criteria, and the sources and methods of case ascertainment and control selection. Give the rationale for the choice of cases and controls. Cross-sectional study: Give the eligibility criteria, and the sources and methods of selection of participants | 93.7
Cohort study: For matched studies, give matching criteria and number of exposed and unexposed. Case-control study: For matched studies, give matching criteria and the number of controls per case | 0
Clearly define all outcomes, exposures, predictors, potential confounders, and effect modifiers. Give diagnostic criteria, if applicable | 91
For each variable of interest, give sources of data and details of methods of assessment (measurement). Describe comparability of assessment methods if there is more than one group |
Report the numbers of individuals at each stage of the study, e.g., numbers potentially eligible, examined for eligibility, confirmed eligible, included in the study, completing follow-up, and analyzed |