4Alpha Research: Is there a systemic overestimation of U.S. employment data?
4Alpha Research Researcher: Kamiu
In today's global economy, employment data is plainly important to monetary policymakers and trading markets alike. As a key indicator of economic health, U.S. non-farm payroll data always attracts close attention. However, the market has long raised two questions: why do U.S. employment data and CPI trends often diverge, and why are there substantial discrepancies between the household survey and the business survey? These divergences have led some to doubt the non-farm payroll data released by the U.S. Department of Labor and to suggest that it may contain errors, or even systematic overestimation. The doubts have grown since 2024 as the non-farm data has shown anomalies, and the sharp, unexpected drop in the July 2024 figure escalated them further.
Next, we will explore the reasons behind this phenomenon and its potential impact on market analysis and policy-making.
1. Why has U.S. employment data long been suspected of being inaccurate, or even systematically overestimated?
The non-farm payroll report released monthly by the U.S. Bureau of Labor Statistics (BLS) includes the number of new jobs, the unemployment rate, and average hourly earnings, and is regarded as one of the most important macroeconomic indicators. The number of new non-farm jobs reflects the jobs created in the U.S. non-agricultural sector, encompassing industries such as manufacturing, services, and construction, as well as government. This figure helps gauge the pace of expansion in the U.S. job market and the tightness of the labor market. The unemployment rate is the proportion of the labor force that is unemployed in a given period. It is another important indicator of economic health, reflecting the degree of underutilization in the labor market. Average hourly earnings reflect the income level of U.S. workers and are an important indicator of consumer purchasing power and potential inflationary pressure.
Non-farm data significantly impacts financial markets, government policy-making, and economic forecasting. Investors, economists, and policymakers closely monitor this report to assess the trajectory of the U.S. economy and make their investment and policy decisions accordingly. The performance of non-farm data often influences the Federal Reserve's monetary policy, which in turn affects global financial markets. However, in recent years, a growing number of commentators have argued that U.S. employment data may be inaccurate and could be systematically overestimated, primarily for the following reasons:
The discrepancies between non-farm data from different sources are becoming increasingly pronounced (detailed below), and the lack of robustness in the data has become more evident, leading to questions about the credibility of non-farm employment data;
Different macro data series contradict one another to some degree. Despite the recent significant decline in CPI data, the job market continues to show moderate growth, as illustrated below:
January 2024:
- CPI: According to data from the U.S. Bureau of Labor Statistics, the CPI in January decreased by 0.1% month-on-month and increased by 6.4% year-on-year.
- Non-farm employment data: The number of new non-farm jobs in January was 517,000, with the unemployment rate remaining at 3.4%.
February 2024:
- CPI: The CPI in February remained flat month-on-month and increased by 6.0% year-on-year.
- Non-farm employment data: The number of new non-farm jobs in February was 311,000, with the unemployment rate slightly decreasing to 3.3%.
March 2024:
- CPI: The CPI in March decreased by 0.2% month-on-month and increased by 5.2% year-on-year.
- Non-farm employment data: The number of new non-farm jobs in March was 235,000, with the unemployment rate remaining unchanged.
April 2024:
- CPI: The CPI in April decreased by 0.4% month-on-month and increased by 4.9% year-on-year.
- Non-farm employment data: The number of new non-farm jobs in April was 213,000, with the unemployment rate slightly rising to 3.4%.
May 2024:
- CPI: The CPI in May decreased by 0.3% month-on-month and increased by 4.0% year-on-year.
- Non-farm employment data: The number of new non-farm jobs in May was 184,000, with the unemployment rate remaining at 3.4%.
June 2024:
- CPI: The CPI in June decreased by 0.2% month-on-month and increased by 3.2% year-on-year.
- Non-farm employment data: The number of new non-farm jobs in June was 176,000, with the unemployment rate slightly decreasing to 3.3%.
The data above depicts a somewhat peculiar scenario: in the first half of 2024, U.S. CPI trends downward month after month, while non-farm employment continues to rise moderately and shows strong resilience. This does not match the simple prediction observers would make from the Phillips curve. Admittedly, the Phillips curve has historically shown very limited predictive power, and its elasticity remains a long-standing debate in macroeconomics. Even so, the sustained deviation of the data from the Phillips curve since 2023 still raises questions about the data itself (this article sets aside the discussion of the statistical criteria for CPI);
The sub-series within the non-farm report contradict each other. For example, in the May 2024 report, widely regarded as one of the most bizarre of the last decade, employment recorded significant growth, yet the unemployment rate rose significantly without a noticeable increase in the labor force, which is self-contradictory (the number of new jobs in May's report was also revised down significantly in June, which further deepened doubts among the market and commentators about the reliability of the initial data);
Non-farm employment data has been revised downward repeatedly. Since 2023, and again several times since the beginning of 2024, the figures released by the U.S. Bureau of Labor Statistics have frequently been revised down after initial publication. For example, the non-farm data for May 2024 showed an increase of 272,000 jobs, far exceeding the market expectation of 185,000, but the repeated downward revisions of earlier months have led the market to question the accuracy of this figure. The Philadelphia Fed even suggested that the non-farm data for 2023 may have overestimated job creation by as much as 800,000;
Non-farm employment data contradicts other employment survey data and consistently exceeds economists' consensus forecasts. In recent months, the Quarterly Census of Employment and Wages (QCEW) and the ADP private-sector employment report have already shown signs of a cooling job market, yet non-farm data continues to show unexpected resilience in U.S. employment. It is generally believed that non-farm employment data does not distinguish between formal and informal employment, while QCEW and similar sources focus more on formal employment, with limited coverage of informal and part-time work.
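The monthly CPI and non-farm figures listed earlier in this section can be put side by side with a few lines of Python. This is only a minimal sketch that reuses the numbers exactly as quoted in this article; the variable and function names are illustrative assumptions, and nothing here is an official BLS calculation.

```python
# Figures copied from the monthly list earlier in this section (as quoted in the article).
months = ["Jan", "Feb", "Mar", "Apr", "May", "Jun"]
cpi_yoy = [6.4, 6.0, 5.2, 4.9, 4.0, 3.2]     # CPI, percent year-on-year
new_jobs = [517, 311, 235, 213, 184, 176]    # new non-farm jobs, thousands


def is_strictly_decreasing(values):
    """Return True if every value is lower than the one before it."""
    return all(later < earlier for earlier, later in zip(values, values[1:]))


cumulative_jobs = sum(new_jobs)                   # thousands of jobs added Jan-Jun
cpi_change = round(cpi_yoy[-1] - cpi_yoy[0], 1)   # change in percentage points

print("CPI YoY falls every month:", is_strictly_decreasing(cpi_yoy))
print("CPI YoY change, Jan to Jun (pp):", cpi_change)
print("Monthly job gains shrink every month:", is_strictly_decreasing(new_jobs))
print("Every monthly job gain is positive:", all(j > 0 for j in new_jobs))
print("Jobs added Jan-Jun (thousands):", cumulative_jobs)
```

On these quoted figures, the year-on-year CPI rate falls every month, while monthly job gains shrink but stay positive, so the level of employment keeps rising throughout the period.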
2. A brief introduction to how non-farm employment data is calculated
The BLS compiles non-farm data based on a series of detailed surveys and statistical methods. The following are the key steps and methods for calculating non-farm employment data:
Sample Survey: The BLS collects data through household surveys (Current Population Survey, CPS) and business surveys (Current Employment Statistics, CES). The household survey is primarily used to calculate the unemployment rate and labor force participation rate, while the business survey is used to calculate the number of jobs added and average hourly earnings;
Industry Classification: Non-farm employment data categorizes employment into different industry categories, such as manufacturing, construction, services, etc., for more detailed analysis of employment conditions in each industry;
Data Adjustment: This mainly includes seasonal adjustments and Birth/Death adjustments:
- To ensure the accuracy of the data, the BLS performs seasonal adjustments to remove the impact of seasonal factors on employment data. First, the BLS analyzes historical data to identify and quantify seasonal patterns, meaning fluctuations in employment caused by regular or predictable factors (such as holidays, weather changes, and school vacations) at specific times of the year. Second, the BLS uses seasonal ARIMA (S-ARIMA) time series methods, fitting model parameters so that the residuals are white noise and applying seasonal differencing to the raw data to remove seasonal fluctuations.
- Additionally, since the CES survey cannot capture real-time employment changes in newly established and closed businesses, the BLS uses the Birth/Death Adjustment model to estimate these changes to more accurately reflect the actual situation in the job market. The Birth Model estimates the jobs created by newly established businesses. This model is based on historical data, considering growth trends in different industries and macroeconomic conditions to predict the contribution of new businesses to the job market; the Death Model estimates the jobs lost due to closed businesses. This model is also based on historical data, analyzing the frequency and patterns of business closures, as well as the impact of macroeconomic conditions on business survival.
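The seasonal differencing step mentioned in the first bullet above can be illustrated on a toy series. The sketch below is not the BLS procedure, which is considerably more elaborate; it only shows, on made-up numbers, how subtracting the value from 12 months earlier cancels a pattern that repeats every year. The series, the seasonal pattern, and all names are hypothetical.

```python
# Hypothetical monthly employment levels (thousands): a steady trend of +10 per
# month plus a fixed seasonal pattern that repeats every 12 months.
PERIOD = 12
SEASONAL = [-30, -20, 0, 10, 20, 30, 25, 15, 5, 0, 10, 40]  # illustrative only


def make_series(n_months, start=1000, trend=10):
    """Build trend + repeating seasonal pattern for n_months."""
    return [start + trend * t + SEASONAL[t % PERIOD] for t in range(n_months)]


def seasonal_difference(series, period=PERIOD):
    """Compute y[t] - y[t - period], which cancels any pattern of that period."""
    return [series[t] - series[t - period] for t in range(period, len(series))]


raw = make_series(36)
monthly_change = [later - earlier for earlier, later in zip(raw, raw[1:])]
diffed = seasonal_difference(raw)

print("raw month-to-month changes, first year:", monthly_change[:12])
print("seasonally differenced values:", diffed[:6])
```

In this toy case the raw month-to-month changes swing with the calendar, while every seasonally differenced value equals the 12-month trend, because the repeating component cancels exactly. Real data would leave residual noise that a fitted model has to absorb.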
3. Conclusion: Is U.S. employment data intentionally overestimated?
The author believes that CPI and non-farm data are alike in how they are questioned: both are monthly releases with major macro implications, and the market has long asked whether they are artificially manipulated to win support and votes for incumbent U.S. politicians, which in turn casts doubt on the independence of the Federal Reserve. The author cannot completely rule out this conspiracy theory, but believes that the anomalies and inconsistencies in non-farm data in recent years owe more to three interrelated causes: outdated statistical methods, structural changes in the U.S. economy after the pandemic, and the accelerating influx of illegal immigrants.
- Outdated Statistical Methods
As discussed below, the U.S. economy may have undergone structural changes, but the seasonal and B/D adjustments applied to CES data rely heavily on historical patterns, which may introduce significant bias. The B/D adjustment is the most heavily criticized.
According to the data, 231,000 of the jobs added in May's non-farm report came from the B/D model, which estimates net job creation from business openings and closings. These jobs were not directly observed in the survey; they were assumed to exist and added to the total. Since April 2023, the B/D model has added 1.9 million jobs, accounting for 56% of all new jobs during the same period. In other words, over half of the "job growth" in the past year has come from this adjustment, and most market opinion points to the B/D model as the culprit for the "outrageous" non-farm data in May 2024, as shown in the figure below. In recent years, the percentage gap between CES and CPS results has been widening, which is also considered strong evidence that the CES sampling and statistical adjustment methods have severely failed.
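The arithmetic implied by these B/D figures can be checked in a short sketch. It uses only numbers quoted in this article (1.9 million jobs, a 56% share, 231,000 B/D jobs in May, and the 272,000 May headline cited earlier); the variable names are illustrative. One caveat, stated as a caution rather than a correction: B/D contributions are applied to data that is not seasonally adjusted, while the headline figure is seasonally adjusted, so the May ratio is a rough comparison at best.

```python
# Figures as quoted in this article.
bd_jobs_since_apr_2023 = 1_900_000   # jobs attributed to the B/D model
bd_share = 0.56                      # stated share of all new jobs

# If 1.9 million is 56% of the total, the total and the remainder follow directly.
implied_total = bd_jobs_since_apr_2023 / bd_share
implied_non_bd = implied_total - bd_jobs_since_apr_2023

may_headline = 272_000               # May 2024 initial print quoted earlier
may_bd = 231_000                     # May 2024 B/D contribution
may_bd_share = may_bd / may_headline # rough ratio, see caveat in the lead-in

print(f"Implied total new jobs: {implied_total:,.0f}")
print(f"Implied jobs not from B/D: {implied_non_bd:,.0f}")
print(f"B/D share of May headline: {may_bd_share:.1%}")
```

Under these quoted figures, the implied total is roughly 3.4 million new jobs, of which about 1.5 million would come from sources other than the B/D model.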
- Structural Changes in the U.S. Economy Post-Pandemic
Since the COVID-19 pandemic, the share of informal work has surged noticeably and young people's willingness to seek employment has declined rapidly, and both trends persist today. There is currently no particularly strong explanation for this. Some argue that the rise in informal work and the decline in willingness to work may be due to long COVID (LC) reducing overall labor capacity at the population level, but this remains inconclusive. Regardless, the increase in part-time work clearly complicates non-farm employment statistics. Because the establishment survey counts jobs rather than people, one person holding several part-time jobs is counted several times, so the job count inevitably overstates the number of people actually employed, while filtering out this noise would make the survey disproportionately expensive. At the same time, a large number of eligible individuals exiting the labor force (the denominator of the unemployment rate) also distorts both the unemployment rate and the reported job gains.
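Both effects in this paragraph can be made concrete with a toy example: counting jobs versus counting people, and what happens to the unemployment rate when unemployed people leave the labor force. The records, worker IDs, and head counts below are entirely hypothetical and chosen only to make the mechanics visible.

```python
from collections import Counter

# Hypothetical payroll records: one (worker_id, employer) pair per job held.
payroll_records = [
    ("w1", "cafe"), ("w1", "bookstore"),                    # two part-time jobs
    ("w2", "factory"),
    ("w3", "retail"), ("w3", "warehouse"), ("w3", "cinema"),  # three part-time jobs
    ("w4", "hospital"),
]

jobs_counted = len(payroll_records)                          # job-based count
persons_employed = len({worker for worker, _ in payroll_records})  # person-based count
jobs_per_worker = Counter(worker for worker, _ in payroll_records)
multiple_jobholders = sum(1 for n in jobs_per_worker.values() if n > 1)


def unemployment_rate(unemployed, employed):
    """Unemployed divided by the labor force (unemployed + employed)."""
    return unemployed / (unemployed + employed)


rate_before = unemployment_rate(unemployed=6, employed=94)
# Two unemployed people stop looking for work and leave the labor force.
rate_after = unemployment_rate(unemployed=4, employed=94)

print("jobs counted:", jobs_counted, "| persons employed:", persons_employed)
print("multiple jobholders:", multiple_jobholders)
print(f"unemployment rate before: {rate_before:.2%}, after exits: {rate_after:.2%}")
```

In this sketch, seven jobs correspond to only four employed people, and the unemployment rate falls even though nobody found a job, simply because the labor force shrank.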
- Ineffective Border Control and Accelerating Influx of Illegal Immigrants
This point is closely related to the aforementioned changes in economic structure, as undocumented immigrants have a significantly higher probability of engaging in informal work. Additionally, the employment of illegal immigrants may lead to potential sampling biases.
The BLS's non-farm employment data is derived from the CES sample survey. If the sample does not adequately represent the employment of illegal immigrants, the survey results may deviate from reality. For example, suppose the CES sample (with employers as sampling units) covers more large enterprises, which tend to hire legal workers, and neglects the smaller or underground enterprises where illegal immigrants are more likely to work. If those under-sampled enterprises are adding jobs more slowly than the sampled ones, the employment data is likely to be significantly overestimated.
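The sampling-bias argument can be sketched numerically. The example below is a deliberately naive illustration, not a model of the actual CES estimator: it compares employment growth weighted by each group's true share of employment with growth weighted by its share of the sample. All shares and growth rates are hypothetical assumptions. It also shows that the direction of the bias depends on how the under-covered firms actually behave.

```python
def weighted_growth(strata, weight_key):
    """Average growth across strata, weighted by the chosen share."""
    total_weight = sum(s[weight_key] for s in strata)
    return sum(s[weight_key] * s["growth"] for s in strata) / total_weight


def make_strata(small_firm_growth):
    """Hypothetical economy: large firms are over-represented in the sample."""
    return [
        {"name": "large firms", "true_share": 0.60, "sample_share": 0.90,
         "growth": 0.020},
        {"name": "small firms", "true_share": 0.40, "sample_share": 0.10,
         "growth": small_firm_growth},
    ]


for small_growth in (-0.010, 0.040):
    strata = make_strata(small_growth)
    true_g = weighted_growth(strata, "true_share")
    naive_g = weighted_growth(strata, "sample_share")
    direction = "overestimates" if naive_g > true_g else "underestimates"
    print(f"small-firm growth {small_growth:+.1%}: "
          f"true {true_g:.2%}, naive sample {naive_g:.2%} ({direction})")
```

With small firms shrinking, the unweighted sample overstates growth; with small firms growing faster than large ones, the same sample understates it. Whether the real data is biased upward therefore hinges on an empirical question this sketch cannot answer.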