Using household surveys for deriving labour market, poverty and inequality trends in South Africa

Yu, Kwan Cheung Derek
Stellenbosch : Stellenbosch University
ENGLISH ABSTRACT: In order to evaluate the extent to which South Africa has achieved the objectives of poverty and inequality reduction as well as job creation, up-to-date and reliable data are required. Since the transition, various survey datasets have commonly been used for these analyses, namely the Census, Community Survey (CS) 2007, Income and Expenditure Survey (IES), October Household Survey (OHS), Labour Force Survey (LFS), Quarterly Labour Force Survey (QLFS), General Household Survey (GHS), Project for Statistics on Living Standards and Development (PSLSD), National Income Dynamics Study (NIDS) and All Media Products Survey (AMPS). However, these datasets are not fully comparable, due to differences in the sampling design, sample size, questionnaire structure, methodology to derive labour market status, as well as the way the income and expenditure information was collected. Hence, this dissertation begins by analysing these issues in each survey in Chapter 2. Income and expenditure information was collected differently across the surveys: the recall method was used in all surveys except IES 2005/2006, the only survey that adopted the diary method; respondents were asked to report the actual amount in some surveys but only to declare the relevant interval in others; under the former approach, respondents could declare either a single estimated amount or amounts for sub-categories that were then aggregated; for interval data, various methods can be used to determine the amount in each interval. Thus, Chapter 3 begins by discussing the merits and drawbacks of these approaches, as well as how they would affect the reliability and comparability of income and expenditure variables across the surveys. In some surveys (e.g., the two censuses and CS 2007), quite high proportions of households incorrectly reported zero income or expenditure or did not specify their income or expenditure. 
Poverty and inequality estimates could be influenced by either including or excluding these households from the analyses. Hence, various approaches to deal with these households are examined in Chapter 3. As the surveys typically under-captured income or expenditure when compared with the national accounts income, the validity of the resultant poverty and inequality estimates might be affected. Hence, arguments for and against adjusting the survey means in line with the national accounts mean (e.g., by shifting the survey distribution rightwards) are discussed. As the survey data are, strictly speaking, cross-sectional and not designed for time-series labour market, poverty and inequality analyses, it is sometimes argued that the data should be re-weighted to be consistent with demographic and geographic numbers presented by the Actuarial Society of South Africa (ASSA) and Census data. This cross entropy re-weighting approach is discussed in Chapter 3. Finally, the chapter examines the labour market status derivation methodology in all OHSs, LFSs and QLFSs in greater detail, and investigates how the changes across the surveys could possibly affect the comparability of labour market estimates throughout the years. The dissertation then examines the labour market trends since the transition by using the OHS, LFS and QLFS data, and it is found that both the labour force and employment numbers increased in general since the transition, but the latter increase was not rapid enough to absorb the expanding labour force. In addition, the number of narrow unemployed doubled between 1994 and 2009, and the narrow unemployment rate showed an upward trend and peaked at just above 30% in 2003. It decreased between 2004 and 2007, before rising again in 2008-2009 due to the impact of the global recession. 
Application of the cross entropy approach does not substantially affect labour market trends, suggesting that the trends (including the abrupt increase in labour market estimates during the changeover from OHS to LFS) were either real or took place due to the improvement of the questionnaire to capture the labour market status of the respondents better. Furthermore, the application of the LFS 2000b-LFS 2007b methodology to the earlier surveys reduced the extent of the abrupt increase in the number of broad unemployed and broad unemployment rates during the changeover between OHS and LFS. Finally, the use of the QLFS methodology (which required minor revisions) on the LFSs greatly reduced the extent of the abrupt decrease in unemployment aggregates between LFS 2007b and QLFS 2008Q1, thereby improving the comparability of these aggregates across the surveys. In Chapter 5 poverty and inequality concepts are reviewed, followed by a detailed explanation of the sequential regression multiple imputation (SRMI) technique to deal with households with zero or missing income or expenditure, as well as the derivation of real income, expenditure and consumption variables in each survey. Poverty and inequality trends since the transition are examined in Chapter 6. With regard to poverty, with the exception of AMPS, the poverty trends were very similar across the surveys, that is, poverty increased after the transition, before a downward trend set in from 2000. As far as inequality is concerned, both the levels and trends in the Gini coefficients differed considerably among the surveys, as the estimates were very stable in the AMPSs, showed an upward trend in surveys such as the IESs and GHSs, but first increased until 2000 before a downward trend took place in others (e.g., the two censuses and CS 2007). The levels of inequality also differed when comparing the surveys. 
The abovementioned poverty and inequality estimates and trends could in part be affected by the various issues discussed in Chapter 3; thus, there is a need for careful analysis. The impact of the number and width of intervals in which income or expenditure data are recorded on poverty and inequality estimates and trends is dealt with in greater detail in Chapter 6 by applying various intervals to the three IESs and NIDS 2008. It is found that the number and width of intervals had only some impact on these estimates and trends in some surveys. The effect of adjusting the survey means in line with the national accounts mean is also investigated. Finally, the application of the cross entropy re-weighting technique did not have any significant impact on the poverty and inequality estimates and trends.
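To illustrate why the number and width of intervals can matter, the following is a minimal Python sketch and not the dissertation's own procedure. It assumes one simple rule for interval data (the midpoint of each closed band, and the lower bound times a hypothetical factor of 1.1 for the open-ended top band) and then computes a weighted poverty headcount ratio and a Gini coefficient. The function names, band limits, weights and poverty line are all invented for illustration.

```python
from typing import List, Optional, Sequence, Tuple

# (lower, upper) bounds of an income band; upper=None marks the open-ended top band
Interval = Tuple[float, Optional[float]]


def representative_amounts(intervals: List[Interval], top_factor: float = 1.1) -> List[float]:
    """Midpoint of each closed band; lower bound * top_factor for the open top band.

    The midpoint rule and top_factor are assumptions made for this sketch only.
    """
    amounts = []
    for lower, upper in intervals:
        if upper is None:
            amounts.append(lower * top_factor)
        else:
            amounts.append((lower + upper) / 2)
    return amounts


def headcount_ratio(values: Sequence[float], weights: Sequence[float], poverty_line: float) -> float:
    """Weighted share of units whose amount lies strictly below the poverty line."""
    total = sum(weights)
    poor = sum(w for v, w in zip(values, weights) if v < poverty_line)
    return poor / total


def gini(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted Gini coefficient from the trapezoidal area under the Lorenz curve."""
    pairs = sorted(zip(values, weights))
    total_w = sum(w for _, w in pairs)
    total_y = sum(v * w for v, w in pairs)
    cum_y = 0.0
    twice_area = 0.0  # accumulates 2 * area under the Lorenz curve
    for v, w in pairs:
        prev_y = cum_y
        cum_y += v * w
        twice_area += (w / total_w) * (prev_y + cum_y) / total_y
    return 1.0 - twice_area


if __name__ == "__main__":
    line = 500.0  # hypothetical poverty line
    # The same hypothetical population recorded in four narrow bands and in two wide bands
    fine_bands = [(0, 400), (400, 800), (800, 1600), (1600, None)]
    fine_weights = [40, 30, 20, 10]
    coarse_bands = [(0, 800), (800, None)]
    coarse_weights = [70, 30]
    for bands, weights in ((fine_bands, fine_weights), (coarse_bands, coarse_weights)):
        amounts = representative_amounts(bands)
        print(len(bands), "bands:", headcount_ratio(amounts, weights, line), gini(amounts, weights))
```

In this toy setting the coarser banding moves every household in the lowest wide band to a midpoint below the poverty line, so the headcount ratio changes even though the underlying population is the same. Other rules for interval data (for example, fitting a distribution within each band) would give different numbers.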
AFRIKAANS ABSTRACT (English translation): Up-to-date and reliable data are required to evaluate the extent to which South Africa achieves its objectives regarding the reduction of poverty and inequality and the creation of employment opportunities. Since the political transition, various survey datasets have commonly been used for such analyses, for example the Censuses, the Community Survey of 2007, the Income and Expenditure Surveys, the October Household Surveys, the Labour Force Surveys, the Quarterly Labour Force Surveys, the General Household Surveys, the National Income Dynamics Study and the All Media Products Surveys. However, owing to differences in sample design, questionnaire structure, the methodology used to classify labour market status, and the ways in which income and expenditure information was collected, these datasets are not fully comparable. Consequently, this dissertation begins in Chapter 2 by analysing each of these issues in each survey. Income and expenditure information was collected differently in the surveys: in most surveys respondents were asked to indicate how much they had spent or earned in the past, but in the Income and Expenditure Survey of 2005/2006 the diary method was used; in some surveys respondents were asked to state the exact amount, while in others they had to indicate the relevant income or expenditure interval; for the former, they were asked either to declare a single amount or to distinguish a number of sub-components; for interval data, different methods can be used to estimate the income in each interval. Chapter 3 therefore begins with an overview of the advantages and disadvantages of the different approaches and a discussion of how they affect the reliability and comparability of income and expenditure variables across the surveys. In some surveys (e.g. the two censuses and the Community Survey of 2007) many households incorrectly indicated that they earn no income or incur no expenditure, or did not specify how much they earn or spend. Estimates of poverty and inequality can be affected by including or excluding such respondents in the analyses. Consequently, various approaches to dealing with them are discussed in Chapter 3. Because surveys typically underestimate income or expenditure compared with the national accounts, the validity of the resulting poverty and inequality estimates may be affected. Consequently, arguments for and against adjusting the survey data to bring them in line with the national accounts (i.e. by shifting the distribution to the right) are discussed. Because the survey data are, strictly speaking, cross-sectional and not designed for time series of the labour force, poverty and inequality, it is sometimes argued that the data should be re-weighted to be consistent with demographic and geographic data obtained from the Actuarial Society of South Africa and census data. This cross entropy re-weighting method is discussed in Chapter 3. Finally, the chapter examines in greater detail the methodology for determining labour market status in all the OHS, LFS and QLFS surveys, as well as how the changes across the different survey series may affect the comparability of labour market estimates over the years. The dissertation then analyses labour market trends since the political transition using the October Household Surveys, Labour Force Surveys and Quarterly Labour Force Surveys. Both the labour force and employment increased since the transition, but the increase in employment was insufficient to absorb the expansion of the labour force. Furthermore, the number of narrowly defined unemployed doubled between 1994 and 2009, and the narrow unemployment rate showed an increase and reached a peak of 30% in 2003. It then declined between 2004 and 2007 before rising again in 2008-2009 owing to the global recession. The application of the cross entropy approach did not materially affect labour market trends, which suggests that these trends (including the sudden increase in labour force estimates in the changeover from the October Household Survey data to the Labour Force Survey data) were real, or otherwise occurred because of changes in the survey questionnaires intended to capture respondents' labour market status better. Furthermore, applying the LFS 2000b to LFS 2007b methodology to the earlier surveys reduced the abrupt increase in the number of broadly defined unemployed and in broad unemployment rates in the changeover between the OHS and LFS. Finally, the use of the QLFS methodology on the LFS (which required minor revisions) considerably reduced the abrupt decrease between LFS 2007b and QLFS 2008Q1, thereby improving the comparability of these aggregates across the surveys. In Chapter 5 an overview of poverty and inequality concepts is given first, after which the sequential regression multiple imputation technique is discussed in detail. This technique is used in particular for cases where households indicate that their income or expenditure is zero, or where they do not respond. There is also a discussion of the derivation of real income, expenditure or consumption variables in each survey. Poverty and inequality trends are discussed in Chapter 6. Regarding poverty, with the exception of the All Media Products Survey, the surveys agree that it first rose after the political transition before beginning to decline from 2000. As far as inequality is concerned, trends in the Gini coefficient differ greatly between the surveys: the estimates are stable over the period for the All Media Products Survey, rise for the Income and Expenditure Surveys and the General Household Surveys, and rise until 2000 before declining in other surveys (e.g. the two censuses and the Community Survey of 2007). Levels of inequality also differ between the surveys. The poverty and inequality trends mentioned above may be partly attributable to the issues discussed in Chapter 3. The effect of the number and width of the intervals in which income and expenditure data are collected on estimates of poverty and inequality is discussed in more detail in Chapter 6. By applying different intervals to data from the three Income and Expenditure Surveys and the National Income Dynamics Study, it is found that the number and width of intervals have a limited effect on these estimates and trends. The effect of adjusting the survey data to bring them in line with the national accounts is also considered. Finally, it is shown that the use of the cross entropy method does not have any significant effect on poverty and inequality estimates and trends.
Thesis (PhD)--Stellenbosch University, 2012.
Labour market -- South Africa, Poverty -- South Africa, Inequality -- South Africa, Household surveys -- South Africa, Dissertations -- Economics, Theses -- Economics