The estimation of spatial distribution patterns of different socio-economic status (SES) groups by housing advertisement data and machine learning techniques: a case study in Brooklyn, New York

Research Article
Open access

Wei Yuan 1*
  • 1 The University of Edinburgh    
  • *corresponding author ivyweiyuan@hkust-gz.edu.cn
Published on 14 June 2023 | https://doi.org/10.54254/2755-2721/6/20230742
ACE Vol.6
ISSN (Print): 2755-273X
ISSN (Online): 2755-2721
ISBN (Print): 978-1-915371-59-1
ISBN (Online): 978-1-915371-60-7

Abstract

Poverty eradication has long been central to the Sustainable Development Goals (SDGs), which draws attention to urban inequalities that can hinder regional economic development and increase unemployment and crime rates. Understanding the local socio-economic distribution pattern is critical for better urban policies and planning strategies. Traditional socio-economic status (SES) measurements rely mainly on census data and surveys, which are updated slowly and are often out of date by the time of analysis. SES inference methods that use other data (e.g., satellite maps, nighttime lighting data) lack a theoretical basis and have coarse resolution. This study uses up-to-date data (i.e., online housing advertisements) and points of interest (POIs) to infer fine-grained, block-group-level SES in Brooklyn through machine learning techniques. In addition, natural language processing (NLP) methods are used to derive twelve housing-related SES predictors. The results show that the inference models and predictors are feasible, and that the gradient boosting decision tree (GBDT) algorithm is the most efficient of the seven algorithms tested. The SES distribution map demonstrates a clear socio-economic stratification in Brooklyn: wealthier residents are mainly concentrated in the western and northern areas, which have a high density of facilities. Based on the analysis of local SES, three policy recommendations are proposed. First, to address the inequitable distribution of facilities, additional investment should be made in the central and eastern regions. Second, a high level of greenery should be given priority in urban planning. Third, housing policy should pay attention to disadvantaged groups.
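As a rough illustration of how advertisement text might be turned into block-group-level predictors, the sketch below computes a keyword share per ad and averages it, together with rent, over each block group. It is a minimal sketch under stated assumptions, not the study's pipeline: the abstract does not list the twelve predictors, so the lexicon `AMENITY_TERMS`, the feature names, the block group IDs, and the sample ads are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical lexicon; the study's actual twelve predictors are not given in the abstract.
AMENITY_TERMS = {"doorman", "gym", "rooftop", "renovated"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def amenity_share(text):
    """Fraction of tokens in an ad that belong to the amenity lexicon."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    hits = sum(1 for token in tokens if token in AMENITY_TERMS)
    return hits / len(tokens)


def block_group_features(ads):
    """Aggregate ad-level values to block groups.

    ads: iterable of (block_group_id, monthly_rent, description) tuples.
    Returns a dict mapping block_group_id to a dict of averaged features.
    """
    grouped = defaultdict(list)
    for block_group_id, rent, description in ads:
        grouped[block_group_id].append((rent, amenity_share(description)))

    features = {}
    for block_group_id, rows in grouped.items():
        n = len(rows)
        features[block_group_id] = {
            "n_ads": n,
            "mean_rent": sum(rent for rent, _ in rows) / n,
            "mean_amenity_share": sum(share for _, share in rows) / n,
        }
    return features


# Made-up sample ads with placeholder block group IDs.
sample_ads = [
    ("BG_A", 4200.0, "Renovated loft with gym and rooftop"),
    ("BG_A", 3800.0, "Doorman building near park"),
    ("BG_B", 1900.0, "Cozy studio close to subway"),
]
features = block_group_features(sample_ads)
```

A table of such block-group features could then be passed to any regression or classification algorithm, GBDT included, to estimate SES.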

Keywords:

socio-economic status, machine learning, natural language processing, SES predictors, Brooklyn.

Yuan, W. (2023). The estimation of spatial distribution patterns of different socio-economic status (SES) groups by housing advertisement data and machine learning techniques: a case study in Brooklyn, New York. Applied and Computational Engineering, 6, 1316-1328.

Data availability

The datasets used and/or analyzed during the current study will be available from the authors upon reasonable request.

About volume

Volume title: Proceedings of the 3rd International Conference on Signal Processing and Machine Learning

ISBN:978-1-915371-59-1(Print) / 978-1-915371-60-7(Online)
Editor:Omer Burak Istanbullu
Conference website: http://www.confspml.org
Conference date: 25 February 2023
Series: Applied and Computational Engineering
Volume number: Vol.6
ISSN:2755-2721(Print) / 2755-273X(Online)

© 2024 by the author(s). Licensee EWA Publishing, Oxford, UK. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
