Household location and self-assessed health among Brazilian adults living in large cities: A multilevel analysis

José Rodrigo de Moraes*, Jéssica Pronestino de Lima Moreira, Karynna Pimentel Viana, Alexandre dos Santos Brito and Ronir Raggio Luiz
Departamento de Estatística, Instituto de Matemática e Estatística, Universidade Federal Fluminense, Rua Mario Santos Braga s/nº – Campus Valonguinho – 7º andar – Centro, Niterói / Rio de Janeiro, CEP: 24020-140.
Instituto de Estudos em Saúde Coletiva, Universidade Federal do Rio de Janeiro, Rio de Janeiro (RJ), Brasil.


INTRODUCTION
Urbanization is still considered to be the phenomenon with the greatest influence on socioeconomic and environmental conditions in developing countries (Martine and Mcgranahan, 2010). Different regions and states within a country present unequal urbanization and contrasts in population distribution between urban and rural areas (Giffoni, 2010). Rapid population growth without proper adaptation of infrastructure conditions is a threat to sustainable development and has consequences for urban populations such as pollution, environmental degradation, and unsustainable production and consumption patterns (United Nations, 2014). Geib (2012) reinforced this idea by affirming that urbanization had worsened poverty and social exclusion, and had contributed towards maintaining income inequalities and the proliferation of poor-quality housing, thereby impeding development of the concept of healthy housing (Martine and Mcgranahan, 2010).
In developing countries, urbanization has taken place more rapidly and the rate of urbanization has presented a weaker correlation with economic growth than in developed countries (United Nations, 2013). While urbanization has brought positive opportunities for the population, especially in East and Southeast Asia, it has also brought negative effects for the health and wellbeing of the population, including in Latin America, North and sub-Saharan Africa, the Caribbean, and South Asia (Muggah, 2014). From a systematic review, Eckert and Kohler (2014) concluded that in developing countries, urbanization is not significantly associated with greater life expectancy and that risk factors for chronic diseases are more prevalent in urban areas. These authors also highlighted higher mortality among children under the age of 5 years in urban areas as an indicator of worse quality of life.
Brazilian urbanization has been marked by profound spatial and social transformations characterized by a set of risk factors that include the following: unemployment, poor urban housing and working conditions, inadequate basic infrastructure conditions, and violence (Soares et al., 2014; Neto, 2011; Angel and Bittschi, 2014). These problems have tended to amplify the adverse effects on health, especially in the absence of any proactive attitude towards the population's needs (Martine and Mcgranahan, 2010).
Under the new paradigm for the health-disease process, which is based on promotion and prevention and aims to improve the population's quality of life and well-being, public health studies that also take into account the attributes of urban spaces to explain health differences within urban populations have emerged over recent decades. However, the number of studies conducted in Brazil that have sought associations between individual and contextual determinants of housing location and self-assessed health remains small (Pavão et al., 2013). This gap justifies conducting studies that take into account the environmental characteristics inside and outside homes, along with the individuals' characteristics.
This study had the objective of establishing the association between household location and overall self-assessed health among adults in highly populated areas of Brazil, using multilevel analysis. In evaluating this association, the control variables consisted of a set of characteristics relating to the individuals and the environments inside and outside their homes.

National household sampling survey
The Brazilian National Household Sampling Survey (PNAD) is a series of complex sampling surveys of national coverage, conducted by the Brazilian Institute for Geography and Statistics (IBGE). For the 2008 survey, information on a probabilistic sample of 150,591 households and 391,868 individuals was gathered (IBGE, 2010).
The PNAD sample was planned such that representative estimates would be obtained for all of Brazil, major regions, federal states, and nine metropolitan regions. With regard to sample planning, PNAD was a cross-sectional study that used a complex sampling plan including stratification, unequal selection probabilities, and clustering of units into two or three selection stages, depending on whether the stratum was from self-representative or non-self-representative municipalities. For self-representative municipalities, the PNAD sampling plan was stratified according to municipality (stratum) and clustered into two stages, in which census tracts were the primary sampling units and households were the secondary sampling units.
For non-self-representative municipalities, the sampling plan was stratified with strata formed by sets of non-self-representative municipalities according to size and geographical proximity, and clustered into three selection stages in which the non-self-representative municipalities were the primary sampling units, the census tracts were the secondary sampling units, and the households were the tertiary sampling units (Silva et al., 2002).
The PNAD sampling weights comprised the product of the natural weights of the design (the inverse of the selection probabilities at each stage) and an adjustment factor calculated as the ratio between the estimated and known (or projected) total populations (Silva et al., 2002).
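The weight construction described above can be sketched in a few lines of Python. This is only an illustration: the function names, selection probabilities, and population total are hypothetical and are not taken from the PNAD, and the direction of the ratio adjustment (known total divided by estimated total) is an assumption chosen so that the weighted total reproduces the known total.

```python
from math import prod


def design_weight(selection_probs):
    """Natural design weight: inverse of the product of the
    selection probabilities at each sampling stage."""
    return 1.0 / prod(selection_probs)


def adjusted_weights(design_weights, known_total):
    """Rescale design weights so that their sum matches a known
    (or projected) population total. Assumed ratio direction:
    known total / estimated total."""
    estimated_total = sum(design_weights)
    factor = known_total / estimated_total
    return [w * factor for w in design_weights]


# Hypothetical two-stage selection probabilities
# (census tract, then household) for three sampled households.
probs = [(0.02, 0.10), (0.02, 0.05), (0.04, 0.10)]
base = [design_weight(p) for p in probs]
final = adjusted_weights(base, known_total=2100.0)
```

After the adjustment, the weights in `final` sum to the supplied total, which is the property the ratio factor is meant to guarantee.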

Study population
The study population was formed by 92,745 Brazilian adults aged 20 years or more who self-reported their overall state of health. They were living in permanent private households located in large-population municipalities, that is, self-representative municipalities.

Multilevel ordinal logistic regression analysis
In this study, a multilevel ordinal logistic regression was fitted using the Stata 10 software. The model had four hierarchical levels, such that the adults were the first-level units, the households were the second-level units, the census tracts were the third-level units, and the municipalities were the fourth-level units (Carle, 2009). The hierarchical data structure corresponded to the characteristics of the PNAD sampling plan for the municipalities considered in this study, with the exception of the survey sample weights. Before fitting the multilevel model, an analysis was conducted to assess whether the sampling weights would be informative, that is, whether these weights would correlate with the outcome of interest in the presence of the structural variables of the sampling plan (Carle, 2009; Johnson, 2008).
To assess the need to fit a four-level multilevel model, a fitted model comparability test was also applied (chi-square test for pseudo-likelihood ratios).
The outcome from the model was self-assessed health, according to the following three-category ordinal scale: (1) poor/very poor; (2) fair; and (3) very good/good. Besides considering the household location, a set of 18 control variables that portrayed the characteristics of the individuals and the environments inside and outside their homes (census tract) were also considered in the linear structure of the model.
The characteristics of the adults (first-level units) comprised 12 variables: sex, age group, color/race, schooling level, occupational situation, physical activity, smoking, self-reported morbidity, physical mobility, possession of a health insurance plan, consultation with a doctor within the last 12 months, and region of residence. The characteristics of the households (second-level units) comprised five variables: household registered with the family healthcare program (FHP), housing quality, possession of basic goods in the household, household occupation condition, and per-capita monthly household income. Lastly, the characteristics of the census tracts (third-level units) consisted of the proportion of the households in the census tract that were considered to present adequate housing quality, that is, in relation to basic social services (water, sewage, garbage, and electricity), housing density, and housing construction standards. For the municipal level, no variable was included other than the municipality identifier, which was used to incorporate the stratification of the sample.
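One common way to write a four-level random-intercept ordinal logistic model of the kind described above is the cumulative-logit (proportional odds) form below. The notation is an assumption introduced here for illustration; the text does not give the model equation or state which parameterization was used.

```latex
\[
\operatorname{logit} P(Y_{ijkl} \le c)
  = \kappa_c - \left( \mathbf{x}_{ijkl}^{\top}\boldsymbol{\beta}
  + u^{(2)}_{jkl} + u^{(3)}_{kl} + u^{(4)}_{l} \right),
  \qquad c = 1, 2,
\]
\[
u^{(2)}_{jkl} \sim N(0, \sigma^2_{2}), \quad
u^{(3)}_{kl} \sim N(0, \sigma^2_{3}), \quad
u^{(4)}_{l} \sim N(0, \sigma^2_{4}).
\]
```

Here i indexes adults, j households, k census tracts, and l municipalities; Y takes the values 1 (poor/very poor), 2 (fair), and 3 (very good/good); the kappa terms are thresholds; and the u terms are random intercepts at the household, census tract, and municipality levels. Under this sign convention, exp(beta) greater than 1 corresponds to higher odds of reporting a better health category.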

RESULTS
Among the adults living in large-population municipalities, 96.3% were living in households located in urban areas and the majority (72.3%) of them reported having a good/very good state of health, while 23.1% reported having a fair state of health and 4.6% reported having a poor/very poor state of health. In relation to the control variable distribution, it can be highlighted that the greatest proportion of the adults lived in the southeastern region (51.7%), in households with the four basic goods (89.5%), and with adequate housing quality (66.3%). It was also observed that in these municipalities, there were greater proportions of adults who have never smoked (52.4%), who practiced physical activity (28.5%), who did not have any chronic diseases (53.7%), who did not have a health insurance plan (63.0%), who did not have physical limitations (65.6%), who had consulted a doctor within the last 12 months (76.3%), and who did not live in households registered with the FHP (66.1%) (Table 1).
In the preliminary analysis correlating the sampling weights with the outcome of self-assessed health among the adults, multilevel modeling that took the hierarchical data structure into consideration (stratification and clustering of the units) showed that the sampling weights had no statistically significant correlation with the self-assessed health levels (p-value = 0.444). This indicated that the sampling weights were uninformative and that there was no need to incorporate them into the analysis for this particular outcome. Table 2 presents the results from the fitted model comparability tests (chi-square test for pseudo-likelihood ratios). The results show that the multilevel model with four hierarchical levels was the most appropriate one, that is, the random effects from the census tract and municipality, separately (tests 1 and 4) or together (test 2), contributed significantly to the quality of the model. Test 3 also showed that the random effect of the household was significantly different from zero when the other group effects (census tract and municipality) were kept in the model (χ² = 678.29; p-value < 0.001).
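As a rough sketch of how a likelihood-ratio style comparison of nested models turns a test statistic into a p-value, the Python snippet below uses the closed-form upper tail of the chi-square distribution with one degree of freedom. The degrees of freedom are an assumption (the text does not report them), the function names are hypothetical, and the sketch ignores the boundary correction that is often applied when testing a variance component against zero.

```python
from math import erfc, sqrt


def lr_statistic(loglik_reduced, loglik_full):
    """Likelihood-ratio statistic for two nested models:
    twice the gain in (pseudo-)log-likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_pvalue_df1(statistic):
    """Upper-tail probability of a chi-square variable with
    1 degree of freedom: P(X > x) = erfc(sqrt(x / 2))."""
    return erfc(sqrt(statistic / 2.0))


# Statistic reported for test 3 in the text; df = 1 is assumed here.
p_value = chi2_pvalue_df1(678.29)
is_significant = p_value < 0.001
```

With a statistic this large, the p-value is far below 0.001 under any small number of degrees of freedom, which is consistent with the reported conclusion.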
From the four-level null model, that is, the ordinal logistic model fitted only with the random intercepts at the levels of the households, census tracts, and municipalities (Table 3), variance partition coefficients (VPC) were obtained. Through the VPC calculation for the municipalities (fourth level), it was found that approximately 2.0% of the variation in the levels of self-assessed health was attributable to differences between the municipalities. The VPC for the census tracts (third level) indicated that 7.4% of the variation in the levels of self-assessed health was attributable to differences between census tracts within the same municipality. The VPC for the households (second level) was higher and showed that approximately 28% of the variation in the levels of self-assessed health was attributable to differences between households within the same census tract in the same municipality.
Although the proportion of the variation explained by differences between the municipalities was low (VPC ≈ 2.0%), the municipal level was taken into account in analyzing the levels of self-assessed health because the considered municipalities represented the strata in the PNAD sampling plan.
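A minimal sketch of a VPC calculation for a logistic multilevel model is given below. It assumes the latent-variable approach, in which the first-level residual variance is fixed at π²/3; the text does not state which method was used, and the variance values are hypothetical round numbers, not the estimates from Table 3.

```python
from math import pi

# Level-1 residual variance on the latent logistic scale (assumed method).
LOGISTIC_RESIDUAL_VARIANCE = pi ** 2 / 3


def variance_partition(variances):
    """Return each level's share of the total latent variance,
    where the total includes the fixed logistic residual term."""
    total = sum(variances.values()) + LOGISTIC_RESIDUAL_VARIANCE
    return {level: v / total for level, v in variances.items()}


# Hypothetical random-intercept variances for a null model.
null_model = {"household": 1.5, "census_tract": 0.4, "municipality": 0.1}
vpc = variance_partition(null_model)
```

Each entry of `vpc` is the share of variation attributable to that level alone, matching the non-cumulative interpretation used in the text (for example, households within the same census tract and municipality).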
In the multilevel model fitted only with the household location (Table 3), it was observed that the location presented a statistically significant effect on the levels of self-assessed health among the adults (OR=1.42; p-value<0.001). The odds ratio measurement of 1.42 indicated that the chance that the adults living in the considered municipalities would self-report a better state of health was 42% greater in the urban areas than in the rural areas. In comparison with the random part of the null model, it could be seen that the variance estimates of the random intercepts remained practically unaltered when the household location was introduced.
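To make the interpretation of an odds ratio in an ordinal model more concrete, the sketch below applies a proportional-odds shift to a set of ordered category probabilities. Two assumptions are made loudly here: the model is treated as a proportional odds model, and the overall sample distribution (4.6%, 23.1%, 72.3%) is used as a stand-in baseline because the distribution for rural areas alone is not reported in the text. The function name is hypothetical.

```python
def shift_distribution(probs, odds_ratio):
    """Apply a proportional-odds shift to ordered category probabilities.

    probs: probabilities ordered from worst to best category (sum to 1).
    odds_ratio: multiplies the odds of being above each cut-point.
    """
    shifted_upper = []
    for k in range(1, len(probs)):
        p_above = sum(probs[k:])            # P(Y > k) at baseline
        odds = p_above / (1.0 - p_above)
        new_odds = odds * odds_ratio        # same ratio at every cut-point
        shifted_upper.append(new_odds / (1.0 + new_odds))
    # Category probabilities are differences of adjacent upper-tail values.
    bounds = [1.0] + shifted_upper + [0.0]
    return [bounds[i] - bounds[i + 1] for i in range(len(probs))]


# Illustrative baseline: poor/very poor, fair, very good/good.
baseline = [0.046, 0.231, 0.723]
shifted = shift_distribution(baseline, 1.42)
```

An odds ratio above 1 moves probability mass towards the best category and away from the worst one, while an odds ratio of exactly 1 leaves the distribution unchanged.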
In fitting the multilevel model with the household location and all the 18 control variables that portrayed characteristics of the adults and the environments inside and outside the home, it was observed that only two variables of the environment inside the home (household occupation condition and FHP) did not present any significant effect. These were therefore excluded from the analysis, thus resulting in the model presented in Table 4. In controlling for the other variables, household location ceased to have a statistically significant effect on the levels of self-assessed health among the adults (OR=0.92; p-value=0.186).
In addition, after controlling for the association between household location and self-assessed health level using variables relating to the individuals and the environment inside and outside their homes (Table 4), it was observed that the variances of all the random intercepts of the model decreased, by the following percentages: 6.3% for the household level, 34.3% for the census tract level, and 35.7% for the municipality level. These reductions may have been due to differences in composition at house-
It was also observed that the chance of self-reporting a better state of health was 6% higher for adults living in households with adequate housing quality (OR=1.06; p-value=0.027), and 28% higher for those living in households with all the basic goods (OR=1.28; p-value<0.001). Furthermore, it was found that the chance that an adult would report a better state of health increased by 35% with an increase of one percentage point in the proportion of households with adequate housing quality in the census tract (OR=1.35; p-value<0.001).
Taking the central-western region as the reference category, it was observed that in the northern region (OR=0.86; p-value=0.007) and northeastern region (OR=0.75; p-value<0.001), there was a lower chance that an adult would self-report a better state of health, while in the southern region (OR=1.10; p-value=0.076) and southeastern region (OR=1.07; p-value=0.155), there was a higher chance, although the association found for the last two regions was not significant. It was also observed that the chance of better self-assessed health among the adults increased with increasing schooling level and per-capita household income, and decreased with increasing physical mobility problems.
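The percentage reductions in the random-intercept variances reported above correspond to what is often called the proportional change in variance between a reference model and an adjusted model. The sketch below shows that arithmetic; the function name and the variance values are hypothetical and are not the estimates from Tables 3 and 4.

```python
def proportional_change_in_variance(var_reference, var_adjusted):
    """Percentage reduction in a random-intercept variance when
    moving from a reference model to an adjusted model."""
    return 100.0 * (var_reference - var_adjusted) / var_reference


# Hypothetical variances before and after adding control variables.
reference = {"household": 2.0, "census_tract": 0.5, "municipality": 0.2}
adjusted = {"household": 1.5, "census_tract": 0.4, "municipality": 0.1}

pcv = {
    level: proportional_change_in_variance(reference[level], adjusted[level])
    for level in reference
}
```

A large reduction at a given level suggests that part of the between-group variation at that level reflects differences in the composition of the groups with respect to the added variables.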

DISCUSSION
This study used multilevel analysis to establish the relationships between self-assessed health levels and a set of factors relating to individuals and their environment, for a complex sample of adults living in large-population municipalities. The results showed that, in comparison with rural areas, urban areas were associated with better levels of self-assessed health among the adults. However, after controlling for variables relating to the individuals and the environment inside and outside their homes, the association between the household location and self-assessed health ceased to be significant. Moreover, after controlling for these variables, it was observed that the variance estimates for the random intercepts of the model underwent reductions, thus showing that there was a compositional effect from the housing location (municipality, census tract, and/or household) on the levels of self-assessed health among the adults. As in this study, Oliveira et al. (2014) observed in their analysis that individuals living in urban areas had a greater chance of reporting a better state of health than those living in rural areas. In the same way, these authors did not find any significant association between the area in which the home was located and self-assessed health when they used an ordinal (that is, non-multilevel) logistic model that included variables of socioeconomic, demographic, and health-related nature.
In this study, only those adults who declared their own state of health were taken into consideration. Those whose state of health was reported by other people living in the same household, or even by other people not living in the household, were excluded, given that the information provided by third parties could increase the chance of bias regarding the overall state of health. Self-assessed health is an indicator that has been surveyed in different population-based investigations within the field of healthcare, and this has been done for several reasons: its ease of measurement or application (Höfelmann and Blank, 2007); its reliability and validity as a measurement (Barros et al., 2009; Freitas et al., 2009; Peres et al., 2010); its capacity for international comparisons (Theme-Filha et al., 2008); its intrinsic subjective nature (Nogueira, 2008); its strong association with the real state of health (Camargos et al., 2009); and its capacity as a sensitive predictor of morbidity and mortality (Silva and Menezes, 2007; Idler and Benyamini, 1997).
Because of the hierarchical structure of the PNAD data, in which the adults are grouped in household units that are grouped in census tracts, which in turn are grouped in municipalities, a multilevel ordinal logistic regression model with four hierarchical levels (adult, household, census tract, and municipality) was used in this study. This model is appropriate for analyzing data from surveys that have some type of correlation structure, such as longitudinal surveys or those that use clustering, such as the PNAD surveys.
Multilevel analysis is a type of regression analysis that simultaneously takes into consideration multiple levels of aggregation, thereby making the standard errors, confidence intervals, and hypothesis tests correct (Laros and Marciano, 2008). In addition, this type of analytical approach not only enables inclusion of random intercepts that represent the heterogeneity between the groups relating to the outcome of interest, but also makes it possible to consider random coefficients that, in turn, represent the heterogeneity in the relationship between the outcome and the explanatory variables (Rabe-Hesketh et al., 2011).
It should also be mentioned that fitting this type of model to complex samples involves some difficulty, because of the need to incorporate not only information from the sampling plan (stratification, clustering, and sample weights), but also the hierarchical data structure. Nevertheless, by taking into consideration only the adults living in the self-representative municipalities, the hierarchical levels of the variables were made to coincide with the survey clustering and stratification structure that was used in these municipalities. In fitting the model using the GLLAMM software, the municipal stratum was taken to be a higher-level random effect (Sterba, 2009).
In this study, it was found that there was no relationship between the sampling weights and the study outcome. It was thus concluded that the sample weights were uninformative (Asparouhov et al., 2004), and for this reason, they were not taken into consideration in the multilevel modeling.
One of the limitations of this study may lie specifically in the definitions of urban and rural areas that are used in Brazil, which are political-administrative definitions based on municipal laws. The other limitation relates to noninclusion of other variables of importance for explaining the variation in adults' health levels, such as variables relating to nutrition and atmospheric pollution, since these did not form part of the PNAD supplement relating to health in 2008.
Regardless of whether the living spaces were urban or rural, this study showed the effect of living conditions in environments inside and outside homes on self-assessed health levels among adults in these municipalities. Specifically, adults who reported better health levels lived in homes of adequate housing quality, had all the basic goods, had higher per-capita household income, and lived in census tracts with higher percentages of homes of adequate housing quality. In relation to the housing question, Angel and Bittschi (2014) also obtained evidence of the effect of poor housing conditions on negative self-perceptions of health. In addition, they observed that the likelihood of suffering from chronic diseases was higher when housing problems accumulated over the course of time.
Furthermore, it was observed that there was an effect on self-reported health coming from individual factors (sociodemographic, health-related, and behavioral and lifestyle factors). Many of these factors were also shown to be associated with worse self-assessments of health in the study by Pavão et al. (2013), such as being in older age groups, having lower schooling levels, being a smoker or former smoker, not doing physical activity, and having a chronic disease.
Therefore, the need for urbanization to be guided through more effective governance is emphasized, with the aim of not worsening the social and environmental problems that exist in Brazilian cities. Urbanization should be accompanied by social and healthcare policies, so as to avoid its adverse effects on the population's health.