What does Bayesian probit regression tell us about Turkish female- and male-headed households' poverty?

The objectives of the study are to examine the determinants of poverty status and to illustrate the probabilities of household poverty in Turkey using the 2013 Household Budget Survey prepared by the Turkish Statistical Institute. The data are reorganized by rural and urban area and by female- and male-headed households in order to analyze the determinants of household poverty. Bayesian probit regression is applied using a Markov chain Monte Carlo algorithm, the Gibbs sampler. The results show that the variables that most strongly decrease the probability of living under the poverty line are an education level of a four-year bachelor's, master's or PhD degree for female-headed households and the household type of single adult for male-headed households in urban areas, and working full time for male- and female-headed households in rural areas. The variables that most strongly increase the risk of poverty are, in urban areas, being elderly, disabled or inoperable for male-headed households and being illiterate for female-headed households and, in rural areas, being elderly, disabled or inoperable for male-headed households and the marital status of being single for female-headed households.


INTRODUCTION
The most common qualitative choice model is the probit model, which belongs to the class of latent variable threshold models for analyzing dichotomous variables. This model is appropriate when the response takes one of only two possible values representing success and failure, or more generally the [...] are introduced in Section 4. Variables are presented in Section 5. Section 6 reports the estimation results. The final section presents the conclusions.

LITERATURE REVIEW
In the literature, Bayesian logit and probit models have been applied to topics such as credit scoring and life satisfaction or, more generally, to compare the Bayesian and classical forms of the models; see, for instance, Altaleb & Chauveau (2001), Mila & Michailides (2006), Genkin, Lewis & Madigan (2007), Tektas & Guney (2008), Cengiz, Terzi, Şenel & Murat (2012), Lund & Sørensen (2012), Merino, Olmos & Cebollero (2012) and Acquah (2013). All of these studies found that the Bayesian approach gives better results than the classical approach. However, as far as the authors know, this paper is the first study to apply the Bayesian approach to dichotomous probit regression in order to examine the determinants of household poverty.
The probit model has a range of applications in both econometrics and poverty analysis. Poverty is a common problem that threatens the whole world and leads to many other problems. According to current studies, people still live at risk of poverty and are likely to live under the poverty line in the future. Therefore, several studies in the empirical literature have focused on household poverty within the framework of the dichotomous probit model. Li et al. (2011) employed a probit regression model for an empirical analysis of the poverty of peasant households in minority regions. According to their results, variables such as the educational level of family members, health condition and outside labor service have an important effect on the poverty of peasant households.
Using household survey data from Fiji, Gounder (2013) estimated the correlates of household consumption and poverty with ordinary least squares, and a probit model was also estimated to check robustness. The findings indicate that a higher education level, policies supporting agricultural growth in rural areas and the reallocation of labor into the formal sector of the economy are effective in reducing household poverty. Ataguba et al. (2013) analyzed multidimensional poverty in Nsukka, Nigeria, using different approaches including the probit model; according to their findings, large family size, low level of education, living in a rural area, poor health and employment can be regarded as major determinants of poverty.
Using ordinary least squares and a probit model, Adjasi & Osei (2007) examined the correlates of poverty in Ghana with a household living standards survey. The findings reveal that expenditure inequality is high, and greater in rural than in urban areas, and that the probability of living under the poverty line is low for households whose heads are educated. However, household heads employed in the clerical, sales, services and agricultural sectors are more likely to be poor than others. Szulc (2006) analyzed the robustness of poverty measures for Poland by employing a probit model of the risk of poverty. In that study, variables such as low education, unemployment, rural residence and a large number of children are found to be robust correlates of high poverty. Oluwatayo (2014) used a probit model, along with simple descriptive statistics, to examine the determinants of poverty among male and female farmers. The results of the probit model indicate that variables such as age, gender, level of education, household size and major occupation have a significant effect on poverty status. Khalid and Akhtar (2011) employed probit regression for female household heads in Pakistan. The results show that illiteracy became less important in 2004-05 and that living in rural areas also had a small effect on the incidence of poverty. However, self-employment in agriculture had a reducing effect on poverty in 2004-05.

Ebru Çağlayan-Akay, Gülşah Sedefoğlu
Most of these studies examined the determinants of household poverty either by urban and rural area or by female- and male-headed households only. In our study, the determinants of household poverty are analyzed for male- and female-headed households and for urban and rural areas jointly.

BAYESIAN PROBIT REGRESSION
The analysis of dichotomous response data is considered extremely important in applied microeconometrics. The dichotomous random variables $y_1, \ldots, y_n$ are defined by

$$y_i = \begin{cases} 1, & y_i^{*} > 0 \\ 0, & \text{otherwise} \end{cases}$$

where $y_i$ is a dichotomous response variable that takes only the two values 0 and 1 for the $n$ observations, and $y_i^{*}$ is not observable, as it is a latent variable. A dichotomous response variable can be analyzed with the probit regression model, which can be explained simply and precisely by using normal distributions for the latent variable. The probit model specifies the conditional probability as

$$P(y_i = 1 \mid x_i) = \Phi(x_i'\beta)$$

where $\Phi$ is the standard normal cumulative distribution function, with derivative $\phi$, which is the standard normal density function; $\beta$ is a $p \times 1$ vector of unknown parameters and $x_i$ is the vector of known covariates for observation $i$, collected in $X$. Estimation of the probit model is usually based on maximum likelihood methods. The likelihood function for the probit regression can be written as follows:

$$L(\beta \mid y, X) = \prod_{i=1}^{n} \Phi(x_i'\beta)^{y_i} \left[1 - \Phi(x_i'\beta)\right]^{1 - y_i}$$

The parameters of the probit regression model are estimated by maximizing this likelihood function, and normality is assumed throughout the estimation. However, the estimation of the probit regression has some disadvantages, especially in small samples, where the estimates can be non-normal and inefficient. Moreover, the probit model is often applied without testing for normality, and when the distribution is non-normal, the standard maximum likelihood estimator of the probit model is mostly biased (Greene, 2002; Wilde, 2007). For this reason, approaches other than the classical one are needed to obtain better results, and the Bayesian approach is one such alternative. The key to the Bayesian approach is the use of a prior probability distribution that favors sparseness in the fitted model, along with an optimization algorithm and implementation tailored to that prior. Zellner and Rossi (1984) were the first authors to use Bayesian analysis of qualitative choice models in the econometrics literature. Yatchew & Grilliches (1985) provide an insightful discussion of what, in Bayesian terms, are sensitivities to prior belief in probit models.
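As a minimal sketch of the probit likelihood described above, the following Python code evaluates the log-likelihood for a given coefficient vector using only the standard library. The function names and the toy data are hypothetical and serve only as an illustration; they are not the authors' implementation.

```python
from math import log
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: cdf is Phi, pdf is phi


def probit_probability(x, beta):
    """Return P(y = 1 | x) = Phi(x' beta) for one observation."""
    index = sum(xj * bj for xj, bj in zip(x, beta))
    return STD_NORMAL.cdf(index)


def probit_log_likelihood(beta, X, y):
    """Sum of y_i*log(Phi_i) + (1 - y_i)*log(1 - Phi_i) over all observations."""
    total = 0.0
    for xi, yi in zip(X, y):
        p = probit_probability(xi, beta)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += yi * log(p) + (1 - yi) * log(1.0 - p)
    return total


# Hypothetical toy data: an intercept column and one covariate.
X = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (1.0, 2.0)]
y = [0, 0, 1, 1]

print(probit_log_likelihood((0.0, 0.0), X, y))   # all probabilities equal 0.5
print(probit_log_likelihood((-0.5, 1.0), X, y))  # higher: fits the toy data better
```

Maximum likelihood estimation searches for the coefficient vector that makes this quantity as large as possible.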
The Bayesian probit regression model combines the prior probability distribution and the likelihood function to obtain the posterior probability distribution. Using Bayes' theorem, the posterior probability distribution for $\beta$ can be written as follows:

$$\pi(\beta \mid y, X) \propto \pi(\beta)\, L(\beta \mid y, X)$$

where $\pi(\beta \mid y, X)$ is the posterior probability distribution function and $L(\beta \mid y, X)$ is the likelihood function. The posterior distribution contains all available information regarding the uncertainty of the parameters. $\pi(\beta)$ represents the prior distribution, which reflects little prior information. Zellner and Rossi (1984) laid out analytical approximations to the posterior distributions of the parameters of the probit model under diffuse and informative priors. Posterior inference in the probit model can be carried out using the Gibbs sampler with data augmentation. In Bayesian probit regression, the Gibbs sampler, a Markov chain Monte Carlo (MCMC) algorithm, is used instead of the maximum likelihood method. Two types of prior distribution can be defined: informative and non-informative. If something is known about the unknown parameters, or if the prior distribution plays an important role in the analysis, the prior is called informative. If, on the other hand, the prior distribution plays an insignificant role, a non-informative prior distribution can be employed (Acquah, 2013).
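To illustrate the data-augmentation idea, the two full conditional distributions that a Gibbs sampler alternates between can be written out under the assumption of a normal prior on the coefficients. The prior mean $b_0$, the prior covariance $B_0$ and the symbols $\hat{\beta}$ and $\hat{B}$ are notation introduced here for the sketch; $y_i^{*}$ denotes the latent variable, $x_i$ the covariate vector of observation $i$ and $X$ the matrix of covariates.

```latex
y_i^{*} \mid \beta, y_i, x_i \sim
\begin{cases}
N(x_i'\beta,\ 1)\ \text{truncated to } (0, \infty), & y_i = 1,\\[2pt]
N(x_i'\beta,\ 1)\ \text{truncated to } (-\infty, 0], & y_i = 0,
\end{cases}
\qquad
\beta \mid y^{*}, X \sim N\!\left(\hat{\beta},\ \hat{B}\right),
\quad
\hat{B} = \left(B_0^{-1} + X'X\right)^{-1},
\quad
\hat{\beta} = \hat{B}\left(B_0^{-1} b_0 + X'y^{*}\right).
```

With a standard normal prior, $b_0 = 0$ and $B_0 = I$, so the second conditional simplifies to a normal distribution with covariance $(I + X'X)^{-1}$.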
The Bayesian approach has developed rapidly alongside the MCMC method, which is seen as a revolution in statistical applications (Jackmon, 2000). The aim of the MCMC method is to create Markov chains using iterative Monte Carlo simulations (Sorensen & Gianola 2002). A Markov chain can be defined as a stochastic process in which, given the present state, the future state is independent of the past states. To create Markov chains, researchers usually prefer the Metropolis-Hastings algorithm and Gibbs sampling (Tierney 1994). In this study, the Gibbs sampler is used to estimate the parameters. The Gibbs sampler is a special case of the Metropolis-Hastings algorithm and was introduced by Geman & Geman (1984). In line with the explosion of work using Markov chain Monte Carlo methods, Albert & Chib (1993) showed how data augmentation, in conjunction with the Gibbs sampler, can be used to estimate posterior distributions of interest for the probit regression model.
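The Gibbs sampler with data augmentation can be sketched in a few lines for the simplest case of a single coefficient without an intercept. This is a hedged illustration in the spirit of Albert & Chib (1993), not the Zelig implementation: the function names, the N(0, 1) prior on the coefficient, the simulated data and the true value of 1.0 are all assumptions made for the example.

```python
import random
from statistics import NormalDist, fmean

STD_NORMAL = NormalDist()


def draw_latent(mean, y, rng):
    """Draw z ~ N(mean, 1) truncated to z > 0 if y == 1, else to z <= 0."""
    p_nonpositive = STD_NORMAL.cdf(-mean)  # P(z <= 0)
    if y == 1:
        u = rng.uniform(p_nonpositive, 1.0)
    else:
        u = rng.uniform(0.0, p_nonpositive)
    u = min(max(u, 1e-12), 1.0 - 1e-12)    # keep inv_cdf inside its domain
    return mean + STD_NORMAL.inv_cdf(u)


def gibbs_probit(x, y, n_iter=400, burn_in=100, prior_var=1.0, seed=0):
    """Sampler for y_i = 1[x_i * beta + e_i > 0] with beta ~ N(0, prior_var)."""
    rng = random.Random(seed)
    beta = 0.0
    post_var = 1.0 / (sum(xi * xi for xi in x) + 1.0 / prior_var)
    draws = []
    for it in range(n_iter):
        # Step 1: augment the data with latent variables given beta.
        z = [draw_latent(xi * beta, yi, rng) for xi, yi in zip(x, y)]
        # Step 2: draw beta from its normal full conditional given z.
        post_mean = post_var * sum(xi * zi for xi, zi in zip(x, z))
        beta = rng.gauss(post_mean, post_var ** 0.5)
        if it >= burn_in:
            draws.append(beta)
    return draws


# Simulated data with a hypothetical true coefficient of 1.0.
rng = random.Random(42)
x = [rng.gauss(0.0, 1.0) for _ in range(300)]
y = [1 if xi * 1.0 + rng.gauss(0.0, 1.0) > 0 else 0 for xi in x]

draws = gibbs_probit(x, y)
print(round(fmean(draws), 3))  # posterior mean; should be near the true value
```

The retained draws approximate the posterior distribution of the coefficient, so their mean, standard deviation and quantiles can be reported directly.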

DATA AND SAMPLING PROCEDURES
The data used in the study come from the Household Budget Survey carried out in 2013 by the Turkish Statistical Institute (TURKSTAT). The sample contains 10,051 households, of which 7,051 are in urban areas and 3,000 are in rural areas. To analyze the determinants of household poverty in Turkey, the data are reorganized by rural and urban area and by gender of the household head, and a Bayesian probit regression model is fitted to the data.
Demographic and socioeconomic characteristics of household heads for rural and urban areas by gender are given in Table 1. In urban areas, primary and middle school is the most common education level for both females and males. In rural areas, illiteracy is the largest category for females at 51.83%, while the rate is approximately 7% for males; primary and middle school has the highest percentage among education levels for males in rural areas. In 2013, the most common household type was the nuclear family for males and the single adult for females in both rural and urban areas. Furthermore, the rate of working full time is higher for males than for females in both urban and rural areas, and the rate of being elderly, disabled or inoperable for females is around 34% in rural areas compared with around 14% in urban areas.

DEPENDENT AND EXPLANATORY VARIABLES
Poverty can be separated into four groups: absolute, relative, subjective and objective poverty. In this study, the relative poverty line is used, as in many other studies, to construct the dependent variable. The relative poverty line (z) is generally constructed using the OECD modified equivalence scale and determines the poverty status of a household, which is a dichotomous dependent variable. According to the OECD scale, a value of 1 is assigned to the first adult, who is generally the head of the household, 0.5 to each additional adult and 0.3 to each child. In the study, household size is recalculated with the OECD modified equivalence scale, household income is divided by this equivalised household size to derive equivalised household income (EHI), and the poverty line is set at 50% of the median income. The dependent variable is defined as follows:

$$y_i = \begin{cases} 1, & EHI_i < z \\ 0, & \text{otherwise} \end{cases}$$

The dependent variable is dichotomous: 0 if a household is above the poverty line and 1 if it is below the poverty line. A dichotomous response model is estimated with the Bayesian probit model in the study. The Bayesian probit model is used to analyze qualitative data reflecting a choice between two alternatives, here being considered poor or non-poor. The model measures the relation between the demographic and socio-economic characteristics of household heads (the explanatory variables) and their poverty status. The specification helps to define a probability for monitoring poverty among households.
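The construction of the dependent variable can be sketched as follows. The function names and the three example households are hypothetical, and taking the median over equivalised incomes is an assumption of this sketch; the weights 1, 0.5 and 0.3 and the 50% threshold come from the description above.

```python
from statistics import median


def oecd_modified_scale(n_adults, n_children):
    """1 for the first adult, 0.5 per additional adult, 0.3 per child."""
    return 1.0 + 0.5 * max(n_adults - 1, 0) + 0.3 * n_children


def poverty_status(households, share_of_median=0.5):
    """Return (poverty line z, list of 0/1 indicators) from equivalised income."""
    ehi = [h["income"] / oecd_modified_scale(h["adults"], h["children"])
           for h in households]
    z = share_of_median * median(ehi)
    return z, [1 if value < z else 0 for value in ehi]


# Hypothetical households; incomes are in arbitrary currency units.
households = [
    {"income": 2100.0, "adults": 2, "children": 2},  # scale 2.1
    {"income": 1000.0, "adults": 1, "children": 0},  # scale 1.0
    {"income": 600.0, "adults": 2, "children": 3},   # scale 2.4
]
z, poor = poverty_status(households)
print(z, poor)  # only the third household falls below the line
```

In this toy example the third household has an equivalised income of 250, which is below half of the median, so it is the only one coded as poor.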
In the study, demographic and socio-economic characteristics of household heads are used as explanatory variables.
The explanatory variables are given in Table 2. They include both dichotomous and continuous variables: consumption, marital status, health insurance, education, household type, house type, house size, household size, employment status, having a second house, age and ownership status of the house. The variables commonly used in most studies are the education level, the age of the household head, household size, marital status, household type, etc. In this study, being elderly, disabled or inoperable, having health insurance, house type and having a second house are also used as independent variables along with these common variables.


RESULTS OF BAYESIAN PROBIT ESTIMATION
This section presents the Bayesian probit estimation results. Estimations were made in R using the Zelig package. In the Zelig package, the prior distribution is taken to be the standard normal distribution, and the Gibbs sampler, a Markov chain Monte Carlo method, is employed to estimate the posterior distribution (Imai, King and Lau 2008).
Several tests can be applied to assess Markov chain convergence. In this study, the Geweke statistic and the Heidelberger and Welch test have been applied to test convergence. According to Geweke (1992)

Table 3, Table 4, Table 5 and Table 6 report the 2013 Bayesian probit regression estimates for urban and rural areas and for male- and female-headed households. The mean of the posterior distribution, the standard deviation and the standard error are given in the tables, and the quantiles for each variable are in the last three columns. The quantiles describe the spread of the posterior distribution of each coefficient: the median is its middle value, and the 2.5% and 97.5% quantiles give the lower and upper bounds of the range that contains 95% of the posterior distribution.
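The summary statistics reported in the tables can be computed directly from the sampled values of a coefficient. The sketch below assumes that the standard error is the naive one (standard deviation divided by the square root of the number of draws); the function names and the evenly spaced stand-in draws are hypothetical.

```python
from math import sqrt
from statistics import fmean, stdev


def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list, 0 <= q <= 1."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    weight = position - lower
    return sorted_values[lower] * (1.0 - weight) + sorted_values[upper] * weight


def summarize(draws):
    """Posterior mean, SD, naive SE and the 2.5%, 50%, 97.5% quantiles."""
    ordered = sorted(draws)
    sd = stdev(ordered)
    return {
        "mean": fmean(ordered),
        "sd": sd,
        "naive_se": sd / sqrt(len(ordered)),  # ignores autocorrelation
        "q2.5": quantile(ordered, 0.025),
        "median": quantile(ordered, 0.5),
        "q97.5": quantile(ordered, 0.975),
    }


# Hypothetical draws standing in for the sampled values of one coefficient.
draws = [0.1 * k for k in range(-20, 21)]  # 41 evenly spaced values in [-2, 2]
summary = summarize(draws)
print(summary)
```

If the interval between the 2.5% and 97.5% quantiles excludes zero, the sign of the coefficient is well supported by the posterior distribution.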
According to Table 3, the posterior means show that consumption (0.00037), having a high school or technical and industrial vocational high school degree (0.4015), having a bachelor's, master's or PhD degree (0.0691), having a nuclear family (0.3940), being a couple without a child (0.7563), the household type of single adult (0.8675), having health insurance (0.7114), having a second house (0.5593) and working full time (0.4476) decrease the probability of living under the poverty line, while being illiterate (0.6642), living in a detached house (0.3445), being elderly, disabled or inoperable (1.6182) and the marital status of being single (0.4941) increase it. Clearly, the table indicates that the household type of single adult, being a couple without a child, having health insurance and having a second house have a large impact in reducing the probability of living under the poverty line for male-headed households in urban areas. These results are as expected. For example, previous studies found that an increase in household size raises poverty, so a couple without a child, as a household type, reduces poverty more than a couple with a child or a nuclear family. Similarly, the single-adult household type is related to household size: a single household head bears no extra expenditure for other household members, so the probability of living under the poverty line, which is calculated from the household income level, is low for single adults.
Table 4 presents the Bayesian probit regression results for female-headed households in urban areas. The results indicate that consumption (0.00030), having a high school or technical and industrial vocational high school degree (0.2888), having a bachelor's, master's or PhD degree (0.8606), the household type of single adult (0.5253), having health insurance (0.7706), living in an apartment (0.1886), working full time (0.8559) and an increase in age decrease the probability of living under the poverty line. Being illiterate (0.5504), having a nuclear family (0.5471), being elderly, disabled or inoperable (0.0877) and the marital status of being single (0.2208) increase the risk of living under the poverty line. According to the 2013 Household Budget Survey data for Turkey, 24% of female-headed households are illiterate and 13.96% are elderly, disabled or inoperable, and these characteristics are the main indicators of a person who is unable to earn money or to find a job. Furthermore, many studies indicate a direct correlation between education level and finding a job, or a good job, and there is also a correlation between having a good job and the level of income. Finding a good job is important for having enough income to meet all the needs of a household and to live above the poverty line.

**All variables are between -2 and 2, so the Markov chain reaches convergence.
***According to the results, the chain comes from a covariance stationary process and all variables passed the test, so the sample size is adequate to estimate the posterior distribution.

Table 5 depicts the results of Bayesian probit regression for male-headed households in rural areas. The results indicate that consumption (0.00033), having a high school or technical and industrial vocational high school degree (0.6121), having a bachelor's, master's or PhD degree (0.2736), having health insurance (0.5904), working full time (0.7224), an increase in house size (0.00538) and in age (0.00673) and being a tenant decrease the probability of living under the poverty line. However, being illiterate (0.3505), being elderly, disabled or inoperable (1.3580), the marital status of being single (0.2911) and an increase in household size raise poverty.
Table 6 shows the Bayesian probit regression results for female-headed households in rural areas. According to the results, an increase in consumption (0.00068), having health insurance (0.8060), working full time (0.9120) and an increase in house size (0.0053) reduce the probability of living under the poverty line. In contrast, being illiterate (0.2438), having a high school or technical and industrial vocational high school degree (0.5140), the marital status of being single (0.8005), being elderly, disabled or inoperable (0.4333), an increase in age (0.0069) and in household size (0.2788), and being a tenant (0.6042) increase the probability of living under the poverty line in 2013.
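Because probit coefficients act on the latent index rather than on the probability itself, it can help to translate one of them into a change in the probability of poverty. The sketch below does this for the health insurance estimate quoted for Table 6. The baseline index of 0 is a hypothetical reference household, and the negative sign is an assumption based on the reported poverty-reducing effect, since the text gives only magnitudes.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal cumulative distribution function


def poverty_probability(baseline_index, coefficient=0.0):
    """P(poor) = Phi(baseline_index + coefficient) for a dummy switched on."""
    return PHI(baseline_index + coefficient)


# Hypothetical baseline index of 0, i.e. a 50% poverty probability.
baseline = 0.0
# Magnitude 0.8060 is the health insurance estimate quoted for Table 6; the
# negative sign is an assumption based on its reported poverty-reducing effect.
with_insurance = poverty_probability(baseline, -0.8060)

print(round(poverty_probability(baseline), 3), round(with_insurance, 3))
```

The size of the change depends on the baseline: the same coefficient moves the probability less for households whose index is far from zero.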

Differences between urban-rural poverty:
For male-headed households, the effect of the education level of high school or technical and industrial vocational high school is higher than that of other education levels in both urban and rural areas. However, the effects of both this education level and the bachelor's, master's or PhD level are higher in rural areas than in urban areas. Working full time is the most notable variable in rural areas and has a smaller effect in urban areas. The marital status of being single has a larger effect in urban areas than in rural areas for male-headed households.
For female-headed households, the effect of the marital status of being single is higher in rural areas than in urban areas. While the education level of high school or technical and industrial vocational high school reduces poverty in urban areas, it increases poverty in rural areas.

Differences between female- and male-headed households:
Working full time is the most important indicator for female-headed households, while the household type of single adult is the most notable variable for male-headed households in urban areas. An education level of bachelor's, master's or PhD is more important than other education levels for female-headed households, but for male-headed households in urban areas the education level of high school or technical and industrial vocational high school reduces the risk of poverty more than other education levels, although the highest level of education was expected to reduce poverty the most for both female- and male-headed households. Education is a common factor in reducing poverty in Turkey and throughout the world, as found in this study and in related studies in the literature. This is because education is foremost in addressing socio-economic and political problems. For instance, Turkey has one of the lowest female employment rates in the world, especially among OECD countries, and the education level of females is the most notable factor behind it. Moreover, the household type of single adult for male-headed households and the nuclear family for female-headed households are more important than other household types. In rural areas, working full time and having an education level of high school or technical and industrial vocational high school are more notable than other variables for male-headed households and decrease the probability of living under the poverty line. For female-headed households in rural areas, working full time and having health insurance have the largest decreasing effect on poverty, whereas the education level of high school or technical and industrial vocational high school unexpectedly increases poverty, unlike for male-headed households.
Having health insurance is also a significant indicator of household poverty for both female- and male-headed households in urban and rural areas, with a decreasing effect on poverty. It appears that people who live under the poverty line generally do not have health insurance, either because their income is not enough to meet all needs or because they work uninsured, as mentioned by Caglayan & Sedefoglu (2016).

CONCLUSIONS
Poverty is a common problem that threatens the whole world, including Turkey, and leads to many other problems. According to current studies, people still live at risk of poverty and are likely to live under the poverty line in the future. According to TURKSTAT poverty statistics, poverty rates were higher in rural areas than in urban areas between 2006 and 2013. Although the poverty gap, which measures the depth of poverty, was larger in urban areas than in rural areas in 2006 and 2013, it was larger in rural areas in the other years between 2006 and 2013.
This study examines the determinants of poverty status and focuses on the poverty of male- and female-headed households in both urban and rural areas in Turkey. The findings of the study can be summarized as follows:
Education: In urban areas, the effect of the education level of high school or technical and industrial vocational high school is higher than that of the bachelor's, master's or PhD level for male-headed households, while the bachelor's, master's or PhD level has a larger effect on poverty than other variables for female-headed households. In rural areas, the education level of high school or technical and industrial vocational high school is more important than the bachelor's, master's or PhD level in decreasing poverty for male-headed households. However, both being illiterate and the education level of high school or technical and industrial vocational high school increase the probability of living under the poverty line for female-headed households.
Marital Status: Being single increases the probability of living under the poverty line for both male- and female-headed households in urban and rural areas. Its effect on poverty is considerably higher for female-headed households in rural areas than for male-headed households in urban and rural areas and for female-headed households in urban areas.
Health: Having health insurance is one of the most important variables reducing poverty for both male- and female-headed households in urban and rural areas.
Household Types: In urban areas, the household type of single adult has the largest poverty-reducing effect for male-headed households; for female-headed households it also reduces poverty, but the effect is not as large as for male-headed households.
Employment Status: Working full time reduces the risk of poverty in all estimations. Moreover, the effect of working full time is higher for female-headed households than for male-headed households in both rural and urban areas. Being elderly, disabled or inoperable increases the probability of living under the poverty line for male- and female-headed households in urban and rural areas, and for male-headed households it is the variable with the largest poverty-increasing effect.
Others: An increase in consumption and having a second house reduce the risk of poverty, while living in a detached house raises the risk of poverty for male-headed households in urban areas. For female-headed households, an increase in consumption and in age and having a second house decrease the risk of poverty. Moreover, an increase in house size and in age and being a tenant reduce the risk of poverty for male-headed households in rural areas. For female-headed households in rural areas, an increase in age raises the risk of poverty, unlike the age variable in the other estimations. Furthermore, being a tenant raises the probability of living under the poverty line for female-headed households in rural areas, as expected.
This study also uses a Bayesian approach for binary regression models with a parametric link, utilizing a Markov chain Monte Carlo algorithm to simulate from the joint posterior distribution of the regression and link parameters. The limitations of Bayesian probit regression are the difficulty of choosing the prior distribution and the subjectivity of the decisions taken by researchers. Nevertheless, this approach is more flexible than classical probit regression because it does not require the assumptions of maximum likelihood estimation, such as those checked by testing for normality and heteroskedasticity. It may provide researchers with a useful alternative approach for overcoming the normality and small-sample problems of probit regression.
We believe that Bayesian probit regression can be a useful approach for future studies, not just for analyzing the determinants of household poverty but also for other topics in the social sciences that use small samples.

Table 1
Demographic and Socioeconomic Characteristics of Household Heads in 2013

Urban Area: Observation for Female: 967, Observation for Male: 6084
Rural Area: Observation for Female: 382, Observation for Male: 2618
Source: Turkish Statistical Institute, Household Budget Survey, 2013 and authors' calculations.

Table 2
Description of Explanatory Variables

According to the Geweke (1992) statistics, if the variables take values between -2 and +2, the Markov chain has reached convergence, that is, the desired posterior distribution. The Heidelberger and Welch test (1993) consists of two parts, the stability test and the halfwidth test. The stability test gives probability values that show whether or not the chain comes from a covariance stationary process, and the halfwidth test results show whether the sample size is adequate to estimate the posterior distribution. In this study, all variables pass the Geweke diagnostic, taking values between -2 and +2. Moreover, the results of the Heidelberger and Welch test indicate that the chain comes from a covariance stationary process and that the sample size is sufficient to estimate the posterior distribution.

Table 3
Bayesian Probit Regression Results for Male-headed Households in Urban Area*
Notes: R reports whether or not the variables pass the test; according to the test results, all variables passed the test and the sample size is adequate to estimate the posterior distribution. In addition, the test statistic for the halfwidth test can be calculated: the halfwidth is divided by the mean, and if the absolute value of the result is smaller than at least one of the values 0.01, 0.05 and 0.1, the sample size is adequate to estimate the posterior distribution.
*Basic categories for this model are EDU2, HTYPE5, HOUSE2, EMPSTATU2 and MARITALST2.
**When the variables take values between -2 and +2, the Markov chain has reached convergence.
***According to the p-values, if the null hypothesis is not rejected, the chain comes from a covariance stationary process.
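The Geweke check described in this part of the text can be illustrated with a simplified z-score that compares the mean of an early segment of the chain with the mean of a late segment. This sketch is not the full diagnostic: it uses plain sample variances instead of spectral density estimates, and the 10% and 50% segment fractions, the function name and the two example chains are assumptions made for the illustration.

```python
from math import sqrt
from statistics import fmean, variance


def geweke_z(chain, first=0.1, last=0.5):
    """Z-score comparing the mean of the first 10% and last 50% of a chain.

    Simplification: plain sample variances replace the spectral density
    estimates of the original diagnostic, so autocorrelation is ignored.
    """
    head = chain[: int(first * len(chain))]
    tail = chain[len(chain) - int(last * len(chain)):]
    se = sqrt(variance(head) / len(head) + variance(tail) / len(tail))
    return (fmean(head) - fmean(tail)) / se


# A hypothetical chain that is still drifting, and one that oscillates around 0.5.
drifting = [i / 1000 for i in range(1000)]
stable = [float(i % 2) for i in range(1000)]

print(geweke_z(drifting))  # far outside [-2, 2]
print(geweke_z(stable))    # 0.0: both segments have the same mean
```

A z-score inside the interval from -2 to +2 is consistent with the early and late parts of the chain sharing the same mean, which is the rule of thumb applied to every variable in the tables.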

Table 4
Bayesian Probit Regression Results for Female-headed Households in Urban Area*

Table 5
Bayesian Probit Regression Results for Male-headed Households in Rural Area*
*Basic categories for this model are EDU2, EMPSTATU2, MARITALST2 and OWNSTATU1.
**All variables are between -2 and 2, so the Markov chain reaches convergence.
***According to the results, the chain comes from a covariance stationary process and all variables passed the test, so the sample size is adequate to estimate the posterior distribution.

Table 5
depicts the results of Bayesian probit regression for male-headed households in rural area.Results indicate that mean of the consumption with 0.00033, having high school, technical and industrial