Heterogeneous Job and Skill Landscapes in the Italian Labor Market

Abstract

Traditional labor market data provide very limited knowledge about the type and skill content of jobs. Yet the profound labor market transformations brought about by rapid technological change call for a better understanding of work skill attributes and how they characterize modern jobs. In this paper, I use machine learning methods to analyze a large, highly unstructured, and novel data source of online job vacancies posted in Italy between 2014 and 2019. I exploit information on the industry of posting firms and on the location and occupation of job positions, as well as on hundreds of granular skill requirements given in the form of short strings uniquely identified by tags. I assign these short strings to skill categories in the standard Deming and Kahn (2018) taxonomy through a keyword-based matching routine. Next, I use a bag-of-tags approach in two natural language processing routines to detect patterns among jobs based on the similarity of their skill requirements. First, I run a K-means model, which assigns jobs to mutually exclusive clusters representing job types. Second, I run a Latent Dirichlet Allocation model, which probabilistically assigns skill tags to mutually exclusive latent topics representing work domains and returns their prevalence within jobs. The two methods capture different nuances in the data while delivering comparable job groupings. Lastly, I explore variation in job types and work domains across locations, industries, and occupations to describe the heterogeneous job and skill landscapes in the Italian labor market.
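The pipeline described in the abstract can be illustrated with a minimal sketch. It assumes scikit-learn as the toolkit, which the text does not specify, and it uses invented toy vacancies, a hypothetical keyword dictionary (SKILL_KEYWORDS), and an arbitrary choice of two clusters and two topics. It is not the paper's implementation, only an illustration of keyword-based matching, a bag-of-tags representation, K-means clustering, and Latent Dirichlet Allocation.

```python
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

# Hypothetical keyword lists, loosely inspired by broad skill categories.
# They are NOT the mapping used in the paper.
SKILL_KEYWORDS = {
    "cognitive": ["problem solving", "analytical", "research"],
    "social": ["communication", "teamwork", "negotiation"],
    "software": ["python", "sql", "java"],
    "customer_service": ["customer", "sales", "client"],
}


def categorize(tag):
    """Return, in alphabetical order, every category with a keyword found in the tag."""
    text = tag.lower()
    return sorted(
        category
        for category, keywords in SKILL_KEYWORDS.items()
        if any(keyword in text for keyword in keywords)
    )


# Toy data (assumption): each vacancy is a list of skill tags.
vacancies = [
    ["python", "sql", "analytical thinking"],
    ["java", "sql", "problem solving"],
    ["customer care", "communication", "sales"],
    ["teamwork", "negotiation", "client management"],
]

# Bag of tags: every tag is treated as a single token, so the analyzer
# simply returns the list of tags unchanged.
vectorizer = CountVectorizer(analyzer=lambda tags: tags)
X = vectorizer.fit_transform(vacancies)  # vacancies x distinct tags

# K-means: each vacancy receives exactly one cluster label (a "job type").
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
job_types = kmeans.fit_predict(X)

# LDA: each vacancy receives a distribution over latent topics ("work domains").
lda = LatentDirichletAllocation(n_components=2, random_state=0)
topic_shares = lda.fit_transform(X)  # each row sums to 1

# Topic-by-tag weights, useful for inspecting which tags define each topic.
tag_weights = lda.components_
```

In this sketch, `job_types` gives a hard assignment of vacancies to clusters, while `topic_shares` gives the prevalence of each topic within a vacancy, which mirrors the contrast between the two methods drawn in the abstract. The number of clusters and topics would need to be chosen on the actual data.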

Giuseppe Grasso
Ph.D. Candidate in Economics