Comparing inequalities in health outcomes in European countries

A high-quality healthcare system is a priority for citizens of any country. Citizens’ health is also a core EU priority. The objective of this article is to apply multidimensional statistical techniques as a tool for adding information value to health outcomes data in European countries and for comparing those countries. To achieve this objective, factor analysis and multidimensional comparison methods have been applied to a matrix of 16 healthcare indicators for 25 selected European countries. The synthetic variable transforms countries described by a variety of healthcare indicators into a one-dimensional space, which considerably simplifies the monitoring of healthcare inequalities. The results are compared with the self-perceived health status reported by the citizens of the same countries. This comparison has demonstrated significant similarity between self-reported and objectively measured health status. The results are presented in visual form using tables and graphs.


INTRODUCTION
Huge differences in health and healthcare exist between and within EU countries and regions. The level of disease and the age at which people die are strongly influenced by such factors as employment, income, education and ethnicity, as well as access to healthcare. For example, life expectancy at birth varies by 10 years between the EU countries (European Union, 2013).

LITERATURE REVIEW
Several regularly updated online databases cover health, healthcare and health expenditure at the regional and national level, for EU member states, for OECD countries and worldwide. The basic source of data is the database of the World Health Organization (WHO), in particular the WHO/Europe portal, which provides a selection of core health statistics covering basic demographics, health status, health determinants and risk factors, and healthcare resources, utilization and expenditure in the 53 countries of the WHO European Region. The Eurostat Health Database collects data on a wide range of themes, including health statistics. Its data navigation tree contains a number of folders under the two main headings of public health and health and safety at work.
The above-mentioned databases are the basis for many publications that present key indicators of health and health systems of countries or regions and compare them in tables, graphs and various other forms of data visualization. Examples include annual publications such as statistical yearbooks, the European health reports, Economic information of health care, eight issues of OECD Health at a Glance publications since 2010, and numerous analyses and publications of the Institute for Health Metrics and Evaluation (IHME) and the Global Burden of Disease (GBD) study.
Individual data from medical records held in registers are the basis for many analyses whose results are published mostly in medical journals, such as cardiology and oncology journals. These analyses often use advanced statistical methods and models, mainly logistic regression, survival models, Markov chains and stochastic models. The indicators obtained from the registers are essential not only for improving the treatment of diseases, but also for public health policy and the efficiency of public health systems.
We focus on recent publications whose content and methods relate to the goals and content of this article. Álvarez-Gávlez & Castillo (2018) used multilevel models to test the hypothesized impact of social expenditure on reducing health inequalities. Their results show that health inequalities are lower in countries where social expenditure is higher. Moscelli et al. (2017), using advanced quantitative methods, confirmed substantive differences in waiting times within public hospitals between patients of different socioeconomic status. We consider waiting time for hospital treatment a serious factor in health outcomes, but we did not use this indicator in the analysis because it is not available in the databases we used.
Olsen & Dahl (2007) examined self-reported health among individuals in 21 European countries, based on data from the European Social Survey (ESS) conducted in 2003. Using hierarchical modelling, they tested how societal features such as public expenditure on health, socioeconomic development, lifestyle and social capital were related to subjective health. Because, according to their results, GDP per capita is the indicator most strongly associated with better health, the eastern European countries stand out as the countries where individuals report the poorest health. The results of our comparative analyses unfortunately also confirm the significantly worse health situation in post-socialist countries compared to other EU countries. Comparable results based on multivariate statistical methods were also obtained by Jindrová (2013), Jindrová & Kopecká (2017), Pacáková et al. (2016) and Pacáková & Papoušková (2016).

DATA AND METHODOLOGY
In accordance with the stated objectives of the article, 16 health indicators (Table 1) were chosen for further statistical analysis. The analysis is based on their values from the OECD Health Statistics 2017 online database (2015, or nearest year available) for 25 European countries that are members of the OECD (Table 4). Indicators (variables) H1-H5 together characterize the state of health, E1-E3 healthcare expenditure and C1-C6 the level of healthcare, while variables Y1 and Y2 represent survey results on the self-assessed health status of inhabitants of the monitored countries. We chose two multivariate methods, factor analysis and multidimensional comparison, to address the multidimensional problem of measuring and comparing inequalities in health outcomes in European countries.

Factor Analysis
This frequently used statistical method is described in detail in many foreign and domestic publications, for example (Hebák et al., 2007; Stankovičová & Vojtková, 2007; Hair et al., 2007; Johnson & Wichern, 2007). Its application is not practical without a statistical software package. We used Statistica 12 under the licence of the University of Pardubice. This article contains only the information necessary for understanding the computer output of factor analysis.
Factor analysis (FA) is a statistical approach that can be used to analyse interrelationships among a large number of variables and to explain these variables in terms of their common underlying factors. The general purpose of factor analytic techniques is to condense (summarize) the information contained in a number of original variables into a smaller set of new composite factors with a minimum loss of information. Numerous variations of the general factor model are available. The two most frequently employed approaches are principal component analysis and common factor analysis. The component model is used when the objective is to summarize most of the original information (variance) in a minimum number of factors. The scree plot can be very helpful in determining the number of factors to extract, because it displays the eigenvalues associated with each component or factor in descending order against the factor number.
An important concept in factor analysis is the rotation of factors. In practice, the objective of all rotation methods is to simplify the rows and columns of the factor matrix to facilitate interpretation. The Varimax criterion centres on simplifying the columns of the factor matrix. With the Varimax rotation approach, each column of the matrix tends to contain some high loadings (i.e., close to -1 or +1) and some loadings near 0. The factor loadings show the correlation between the original variables and the factors, and they are the key to understanding the nature of a particular factor. The factor scores in the output of the factor analysis procedure are the values of the rotated factors for each of the n cases, in our analysis for each of the 25 European countries. Factor scores show where each country falls with respect to the extracted factors.
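The extraction and rotation steps described above can be sketched in a few lines of NumPy. This is only an illustrative sketch under stated assumptions, not the Statistica 12 procedure used in the article: it assumes the principal component model, keeps components whose eigenvalue is at least 1, and applies a standard iterative Varimax algorithm. The function names `component_loadings` and `varimax` and the random input matrix are hypothetical placeholders.

```python
import numpy as np

def component_loadings(X):
    """Eigenvalues (descending) and unrotated component loadings
    of the correlation matrix of X (rows = cases, columns = variables)."""
    R = np.corrcoef(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(R)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    loadings = eigvecs * np.sqrt(np.clip(eigvals, 0.0, None))
    return eigvals, loadings

def varimax(L, max_iter=100, tol=1e-8):
    """Orthogonal Varimax rotation of a loading matrix L (variables x factors)."""
    p, k = L.shape
    R = np.eye(k)
    d = 0.0
    for _ in range(max_iter):
        Lr = L @ R
        col_ss = np.diag(np.sum(Lr ** 2, axis=0))  # column sums of squares
        u, s, vt = np.linalg.svd(L.T @ (Lr ** 3 - Lr @ col_ss / p))
        R = u @ vt
        d_new = np.sum(s)
        if d_new < d * (1 + tol):                  # criterion stopped improving
            break
        d = d_new
    return L @ R

# Placeholder data (assumption): 25 cases, 6 variables, NOT the study's indicators.
rng = np.random.default_rng(0)
X = rng.normal(size=(25, 6))

eigvals, loadings = component_loadings(X)
n_factors = int(np.sum(eigvals >= 1.0))            # eigenvalue >= 1 rule
rotated = varimax(loadings[:, :n_factors])
```

Plotting `eigvals` against the component number gives the scree plot. Because the rotation is orthogonal, it redistributes variance between factors but leaves each variable's communality unchanged.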

Multidimensional comparative analysis
Multidimensional comparative analysis deals with the methods and techniques of comparing multifeature objects, in our case selected European countries. One of the particular problems here is that of establishing a linear hierarchy (linear ordering) among a set of objects in a multidimensional space of features, from the point of view of certain characteristics which cannot be measured directly (the level of socio-economic development, the standard of health care, health status, etc.). These methods can also be regarded as methods of linear ordering of multidimensional objects using a synthetic variable created from the original variables. The synthetic variable makes it possible to replace the whole set of variables with one aggregated variable. A number of applications of these methods can be found in the publications of Polish statisticians and econometricians, for example (Sokolowski, 1999; Kuc, 2012). Examples of their use in publications of Czech authors are (Pacáková et al., 2016; Pacáková & Papoušková, 2016; Kopecká & Jindrová, 2017).
At the beginning of the analysis, the type of each variable must be defined. It is necessary to identify whether the high values of a variable positively influence the analysed processes (such variables are called stimulants) or whether their low values are favourable (these are called destimulants). The original variables are usually expressed in different units of measurement and must be standardised to create a synthetic (aggregate) variable. A number of formulas are used for standardisation.
The synthetic variable makes it possible to replace the whole set of original standardised variables with one aggregated variable. There is a variety of methods for creating a synthetic variable. In this paper, the synthetic variable for the i-th country, $z_i$, $i = 1, 2, \dots, n$, has been calculated as the average of the standardised values $z_{ij}$, $j = 1, 2, \dots, k$, that is $z_i = \frac{1}{k}\sum_{j=1}^{k} z_{ij}$, where the subscript i stands for the country number and the subscript j stands for the variable number.
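The construction of the synthetic variable can be illustrated with a short sketch. The article does not state which standardisation formula was applied, so the sketch assumes z-score standardisation and reverses the sign of destimulants so that higher values are always favourable. The function name `synthetic_variable` and the small data matrix are hypothetical and serve only as an illustration.

```python
import numpy as np

def synthetic_variable(X, is_stimulant):
    """Average of standardised variables per object (row).

    Assumption: z-score standardisation; destimulants are multiplied by -1
    so that a higher score always means a more favourable situation.
    """
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    signs = np.where(np.asarray(is_stimulant), 1.0, -1.0)
    return (Z * signs).mean(axis=1)

# Hypothetical values for 4 objects and 3 indicators (not the study's data).
X = np.array([[80.0, 10.0, 3.0],
              [75.0, 14.0, 2.5],
              [82.0,  8.0, 4.0],
              [70.0, 18.0, 2.0]])
is_stimulant = [True, False, True]   # the second indicator is a destimulant

scores = synthetic_variable(X, is_stimulant)
ranking = np.argsort(-scores)        # object indices ordered from best to worst
```

Sorting the objects by `scores` gives the linear ordering discussed in this section. Other standardisation formulas would change the score values and may also change the ordering.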
The agreement in the ordering of the countries by each pair of synthetic variables can be quantified using Spearman's rank correlation coefficient, which for any two variables X, Y and their ranks $i_x$, $i_y$ can be calculated according to the formula
$$r_S = 1 - \frac{6 \sum (i_x - i_y)^2}{n(n^2 - 1)}$$
where the sum runs over all n objects. These rank correlation coefficients range between -1 and +1 and indicate the degree of agreement between the ranks.
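A minimal sketch of Spearman's rank correlation coefficient computed from rank differences is given below. It assumes that there are no tied values, which is the case in which the classical rank-difference formula is exact. The helper names `ranks` and `spearman` and the example values are hypothetical.

```python
def ranks(values):
    """Ranks 1..n of the values (assumption: no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        result[idx] = rank
    return result

def spearman(x, y):
    """Spearman rank correlation from squared rank differences."""
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

# Hypothetical scores of five objects under two synthetic variables.
score_a = [0.1, 0.4, 0.9, 1.3, 2.0]
score_b = [0.5, 0.2, 1.4, 1.1, 2.2]
r_s = spearman(score_a, score_b)   # squared rank differences sum to 4, n = 5
```

With tied values, average ranks and the Pearson correlation of the ranks would be the safer choice.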

Factor analysis results
The purpose of the factor analysis is to obtain a small number of factors that account for most of the variability in the 14 original variables H1-C6 from Table 1, which characterize health outcomes in 25 European countries (Table 4). In this case, 4 common factors have been extracted, since 4 factors had eigenvalues greater than or equal to 1 (Figure 1). Together they account for 84.0543% of the variability in the original data. Factorability tests indicate whether it is likely to be worthwhile attempting to extract factors from a set of variables. The KMO (Kaiser-Meyer-Olkin) statistic indicates how much common variability is present. For factorization to be worthwhile, KMO should normally be at least 0.6. Since KMO = 0.64, factorization is likely to provide useful information about any underlying factors.
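For readers who want to reproduce the factorability check, the overall KMO statistic can be sketched as follows. The sketch uses the usual definition, which compares squared correlations with squared partial correlations obtained from the inverse correlation matrix. It is not the Statistica 12 output, and the function name `kmo` and the simulated data are assumptions made for illustration.

```python
import numpy as np

def kmo(X):
    """Overall Kaiser-Meyer-Olkin measure for data X (cases x variables)."""
    R = np.corrcoef(X, rowvar=False)
    inv = np.linalg.inv(R)
    d = np.sqrt(np.diag(inv))
    partial = -inv / np.outer(d, d)            # partial correlations
    off = ~np.eye(R.shape[0], dtype=bool)      # mask of off-diagonal elements
    r2 = np.sum(R[off] ** 2)
    q2 = np.sum(partial[off] ** 2)
    return r2 / (r2 + q2)

# Simulated data (assumption): five variables driven by one common factor.
rng = np.random.default_rng(2)
common = rng.normal(size=(200, 1))
X = common + 0.5 * rng.normal(size=(200, 5))

kmo_value = kmo(X)
```

Values close to 1 indicate that partial correlations are small relative to the ordinary correlations, so the variables share enough common variance for factor extraction.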
Interpretation of the four extracted factors is based on the significantly higher loadings after Varimax rotation in Table 2. Factor 1 (F1), which explains 53.39% of the total variability in the data, has six significant loadings: four with positive signs, for variables H3, E1, E2 and C1, and two with negative signs, for variables H4 and E3. Therefore, F1 can be interpreted as the factor of good healthcare conditions and results. Using an analogous procedure, we have identified three other factors: F2, the factor of bad health state; F3, the factor of high morbidity and mortality for serious illness and a short life expectancy at birth; and F4, the factor of the number and intensity of use of MRI units.
Table 2 allows the values of each factor F1, F2, F3 and F4, called factor scores, to be calculated for each of the 25 selected countries. The first rotated factor has been calculated by the equation F1 = 0.452583*H1 + 0.459865*H2 + 0.720985*H3 - 0.534135*H4 - 0.261672*H5 + 0.82191*E1 + 0.916132*E2 - 0.839152*E3 + 0.88108*C1 - 0.392778*C2 + 0.00777301*C3 + 0.295482*C4 - 0.155185*C5 - 0.0363104*C6, where the values of the variables in the equation are standardized by subtracting their means and dividing by their standard deviations (see Hair et al., 2007; Stankovičová & Vojtková, 2007). By analogous calculation, a 25 x 4 matrix of factor scores was obtained, which is not published in the article due to its limited scope.
A graphical form has been preferred for the presentation of factor scores, in order to assess the causal relationships between the extracted common factors in the monitored European countries. Figure 2 illustrates a negative dependence between the values of factors F1 and F2. The best situation can be seen in the Scandinavian countries Sweden (SE) and Norway (NO); a bad situation is evident in the former socialist countries, particularly in the Slovak Republic (SK), Hungary (HU) and Latvia (LV).
In Figure 3 we can see that three different groups of countries have formed. The first, with high values of F1 and low values of F3, consists of Sweden (SE), Norway (NO), Switzerland (SW), Iceland (IS), Denmark (DK), Luxembourg (LU) and Finland (FI). A low level of healthcare conditions and results (F1) together with high morbidity and mortality for serious illness and a short life expectancy at birth (F3) is unfortunately again typical of the former socialist countries. The remaining countries belong to a cluster with a medium level of both factors.
Surprisingly strong is the dependence of F1, the factor of good healthcare conditions and results, on F4, the factor of the number and intensity of use of MRI units (Figure 4). The worst situation in both factors is again in Latvia (LV), Hungary (HU) and the Slovak Republic (SK); the best level is in the Scandinavian countries Sweden (SE) and Norway (NO). The highest level of factor F4 is in Germany (DE), but its level of factor F1 is lower. All these facts confirm that there are still significant differences in health status and healthcare among European countries.
We have quantified the relationships between the extracted factors F1-F4, and the degree of agreement between the ordering of the monitored countries by these factors and by perceived health status (Y1, Y2), using Spearman rank correlation coefficients. The results are shown in Table 3.
The synthetic variable Score(H1-C6) has been created from all the indicators in Table 1 except the indicators of perceived health status Y1 and Y2, which together form the synthetic variable Score(Y1, Y2). According to the value 0.7923 of the Spearman rank correlation coefficient between these two synthetic variables (Table 5), the ordering of the monitored European countries is consistent to about 80%. The highest values of perceived health status were found in Iceland, Sweden and Switzerland and considerably exceeded the values of the synthetic variable Score(H1-C6) based on health outcome indicators. A less optimistic self-reported health status in comparison with the real health situation can be observed in Germany and Portugal; in other countries the differences are insignificant.
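The factor score equation for F1 given earlier in this section can be applied to a data matrix as a weighted sum of standardized variables. The sketch below takes the 14 coefficients from that equation; the input matrix is random placeholder data, since the indicator values themselves are not reproduced in the article, and the function name `factor_scores` is hypothetical.

```python
import numpy as np

# Variable order and coefficients as in the F1 equation in the text.
names = ["H1", "H2", "H3", "H4", "H5", "E1", "E2", "E3",
         "C1", "C2", "C3", "C4", "C5", "C6"]
coef = np.array([0.452583, 0.459865, 0.720985, -0.534135, -0.261672,
                 0.82191, 0.916132, -0.839152,
                 0.88108, -0.392778, 0.00777301, 0.295482,
                 -0.155185, -0.0363104])

def factor_scores(X, coef):
    """Weighted sum of column-wise standardized variables for each row of X."""
    X = np.asarray(X, dtype=float)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    return Z @ coef

# Placeholder matrix (assumption): 25 countries x 14 indicators, random values.
rng = np.random.default_rng(1)
X = rng.normal(loc=50.0, scale=10.0, size=(25, 14))

f1 = factor_scores(X, coef)   # one F1 value per country
```

Repeating the calculation with the coefficients of the other three factors would give a 25 x 4 matrix of factor scores of the kind described in the text.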

CONCLUSION
Health status is a fundamental objective of health care systems, but improving health status also requires a wider focus on its determinants. In accordance with the objectives of the article, the results of the selected multidimensional methods confirm significant causal relationships between health status, health expenditure and health care resources, and also indicate significant inequalities in health outcomes across the monitored European countries. The results obtained in this article confirm the appropriateness of the multivariate methods used and the suitability of the chosen indicators for comparing health outcomes in the monitored countries. Factor analysis made it possible to extract four common factors in place of the original 14 variables. Displaying the countries in a two-dimensional coordinate system whose axes are the extracted common factors makes it possible to assess the situation in each country quickly and to compare the situation in different countries. We can also observe clusters of countries with a high level of a certain health dimension, represented by the relevant factors, as well as clusters with a medium or low level of these factors. Unfortunately, the former socialist countries always formed the cluster with the worst level on both factors.
The synthetic variable makes it possible to replace the whole set of variables with one aggregated variable and to transform a multidimensional space into a one-dimensional one. The synthetic indicators created in the article make it possible to quantify the interrelation between indicators of health status, health expenditure, personnel and technical resources of health care, and the subjective assessment of the health status of the population in the monitored European countries.

Table 1
Y1: Perceived health status, good/very good health status, total 15+ (% of population, 2015 or the nearest)
Y2: Perceived health status, bad/very bad health status, total 15+ (% of population, 2015 or the nearest)

Table 2
Factor loading matrix after Varimax rotation
Source: Authors' calculations, output from Statistica 12.

Table 3
Spearman rank correlations between values of common factors and perceived health status indicators

Table 5
Spearman rank correlations between values of synthetic variables in European countries