Monday, June 20, 2016

How to Count: Choosing the Right Data Source (Michael Simkovic)


In response to my last post, a reader asked why different data sources give different counts for the total number of lawyers in a given year.

Regardless of the survey used, the number of jobs for “lawyers” has increased above 2007 levels in recent years, and thus we can have confidence that the number of working lawyers is higher today than it was before the recession (as is the total amount of money being spent on lawyers).

In the U.S. there are several important and widely used labor market and demographic surveys with different sampling methods and different rules for inclusion. These include the Department of Labor, Bureau of Labor Statistics Occupational Employment Statistics (BLS OES), which is a survey of businesses that participate in state unemployment insurance programs, and several household surveys by the Census Bureau: the Current Population Survey (CPS), the American Community Survey (ACS), the decennial Census, and a lesser-known but extremely useful survey called the Survey of Income and Program Participation (SIPP).

Each of these surveys has strengths and weaknesses, and the best survey or combination of surveys depends on the particular question the researcher is trying to answer, the particular population he or she hopes to measure, and the relevant time period.[i] There are tradeoffs between the sample size (larger sample sizes increase precision of estimates, especially for small subgroups), the frequency with which the survey is conducted, the amount of detailed information available about each respondent, the reliability of certain kinds of information, the specificity of geography or industry or occupation or type of education, how many years of data are available, and so on.
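The point about sample size and precision can be made concrete with the textbook standard error of a proportion under simple random sampling. This is only a rough sketch: the government surveys discussed here use complex sample designs and weights, so their published standard errors will differ, and the share and sample sizes below are made-up numbers chosen for illustration, not actual survey figures.

```python
import math


def proportion_standard_error(share: float, sample_size: int) -> float:
    """Standard error of an estimated share under simple random sampling.

    Assumption: simple random sampling with no weights or design effects,
    which is a simplification of how real household surveys are run.
    """
    if not 0.0 <= share <= 1.0:
        raise ValueError("share must be between 0 and 1")
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return math.sqrt(share * (1.0 - share) / sample_size)


# Hypothetical inputs: a small occupational group that makes up 0.5% of
# respondents, measured with three illustrative sample sizes.
hypothetical_share = 0.005
for n in (50_000, 200_000, 800_000):
    se = proportion_standard_error(hypothetical_share, n)
    print(f"n={n:>7,}: standard error = {se:.6f}, "
          f"relative error = {se / hypothetical_share:.1%}")
```

Under these assumptions, quadrupling the sample size cuts the standard error in half, which is why a larger survey is so much more useful for small subgroups.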

CPS and ACS generally report reasonably similar numbers in a given year. ACS has a much larger sample size and can estimate better for more granular geographies or smaller groups. CPS has annual historical data going back many more years.

BLS OES reports a much lower number of lawyers than Census household surveys because of the nature of the BLS OES sample (businesses that participate in unemployment insurance rather than all households) and the rules for inclusion (business owners and self-employed individuals do not count as employees). However, BLS applies these rules across occupations, so it likely also undercounts other occupations where business ownership (for example, through a partnership structure) or self-employment is common. BLS also applies these rules across years, so the same undercounting principle applies in every year, although the undercounting rate could differ from year to year. In the other direction, because OES surveys employers about positions, it will double-count individuals who hold more than one job with different employers.
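A toy example may help show how the two counting rules diverge. The sketch below is a deliberately simplified model, not the actual OES or Census methodology: the four people, their employers, and the `is_employee` flag are all hypothetical, and real surveys apply many more rules than these two.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class Job:
    employer: str
    is_employee: bool  # False for owners, partners, and the self-employed


@dataclass(frozen=True)
class Person:
    label: str
    jobs: Tuple[Job, ...]


def household_style_count(people: Sequence[Person]) -> int:
    """Count each working person once, whatever their employment type."""
    return sum(1 for person in people if person.jobs)


def employer_style_count(people: Sequence[Person]) -> int:
    """Count employee positions only, once per position reported by an employer."""
    return sum(1 for person in people for job in person.jobs if job.is_employee)


# Hypothetical lawyers, invented purely for illustration.
lawyers = [
    Person("associate", (Job("Firm A", True),)),
    Person("partner", (Job("Firm A", False),)),          # owner, not an employee
    Person("solo practitioner", (Job("Self", False),)),  # self-employed
    Person("two jobs", (Job("Firm B", True), Job("University C", True))),
]

print("household-style count:", household_style_count(lawyers))
print("employer-style count:", employer_style_count(lawyers))
```

In this made-up group, the employer-style rule drops the partner and the solo practitioner but counts the person with two employers twice, so the two tallies differ even though the underlying people are identical.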

Picking the right survey or surveys requires detailed research and planning and the professional judgment that comes with extensive experience as an empirical researcher.

Because labor market data is typically going to be used for comparative purposes (for example, comparing earnings and employment rates by level of education, or comparing growth rates over time), the important thing is that each comparison should be apples-to-apples. Definitions should be consistent, and the sampling methods and rules for inclusion or exclusion should be consistent as well.

Typically, a comparison over time or across occupations or types of education should be made within a single survey so that definitions and inclusion rules stay the same on both sides of the comparison.
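To see why mixing sources is risky, consider a small sketch with two hypothetical surveys that measure the same occupation at different levels because of their inclusion rules. All of the counts below are invented for illustration and are not drawn from any real survey.

```python
def growth_rate(start: float, end: float) -> float:
    """Proportional change from start to end."""
    if start == 0:
        raise ValueError("start must be nonzero")
    return (end - start) / start


# Hypothetical counts for one occupation in an earlier and a later year.
# The employer survey runs lower in both years because of its inclusion rules.
counts = {
    "household_survey": {"earlier": 1000, "later": 1100},
    "employer_survey": {"earlier": 600, "later": 660},
}

# Apples-to-apples: each growth rate uses a single survey for both years.
within_survey = {
    name: growth_rate(series["earlier"], series["later"])
    for name, series in counts.items()
}

# Apples-to-oranges: earlier year from one survey, later year from the other.
mixed_sources = growth_rate(
    counts["household_survey"]["earlier"],
    counts["employer_survey"]["later"],
)

for name, rate in within_survey.items():
    print(f"{name}: {rate:+.0%}")
print(f"mixed sources: {mixed_sources:+.0%}")
```

With these made-up numbers, both within-survey comparisons show the same growth, while the mixed comparison shows a large decline that is purely an artifact of switching definitions partway through.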

This means that surveys that focus exclusively on lawyers or law school graduates, like surveys by NALP or the ABA, are much less useful than broad-based government surveys that cut across types and levels of education, occupations, and industries. The advantage of single-industry or single-occupation surveys is that they might have detailed information that is not available in general-purpose surveys. However, basic information such as the number of people in a given occupation, earnings, employment rates, and work hours will typically all be available from large household surveys.

Some surveys are much more reliable than others, but if several within-survey comparisons from reputable and well-designed surveys report a similar relationship, this correspondence across surveys increases confidence in the size and direction of the relationship.

There is little doubt that there are more working lawyers as of the most recent available data (generally 2014 or 2015, depending on the source) than there were in 2007, and that this has been the case in most or all years since 2009.


[i] Frank McIntyre and I have used SIPP extensively in our research because we are interested in the earnings premium to a law degree for all law school graduates, including the large proportion who do not work as lawyers and even including the smaller group who do not pass a bar exam. SIPP is the only major national survey that identifies “law degree holder” rather than merely identifying “lawyers” or “professional degree holders.” SIPP is a household survey and so—unlike After the JD—does not require individuals to have passed a bar exam to be included. But SIPP has several limitations, which we’ve discussed in our research. We’ve also supplemented SIPP with other data sources when other data sources have important advantages.


