non-profits' distribution of management expenses

At the recent DC Data Dive, GuideStar put out for analysis some IRS financial data for non-profit organizations.  The non-profits are identified only by an ID number, topic of work, and geographic scope. The question for the data dive was: “What financial data may be predictive of an organization’s defunct status in two years?”  Volunteer data analysts went at that question with various data analysis and modeling strategies.

The financial data for the non-profits includes expense data partitioned into the categories program services, management, fundraising, and payments to affiliates.  A rudimentary means for evaluating non-profits is to look at management, fundraising, or overhead expense ratios.  The meaningfulness of those ratios depends on accounting distinctions between expenses.  Is formulating plans to expand a program a management expense or a program expense?  Is reviewing a program a management expense or a program expense?  The answers to these sorts of accounting questions aren’t obvious, but the answers clearly affect expense ratios.  Moreover, accounting, like everything else, responds to incentives.  Since non-profits are commonly measured on expense ratios, innovative accounting is likely to favor lower expense ratios.

Analysis of digit distributions can provide insight into data-generating processes.  Across orders of magnitude, numbers that are the outcomes of exponential growth processes, or are choices from uncorrelated probability distributions, have first-digit distributions that follow Benford’s Law. Numbers that human or industry conventions select are less likely to follow Benford’s Law.  Does the digit distribution of non-profits’ management expenses follow Benford’s Law?
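For reference, Benford's Law gives first digit d (from 1 to 9) the probability log10(1 + 1/d), so about 30.1% of figures should start with 1 and only about 4.6% with 9. Here is a minimal Python sketch of how observed first-digit shares could be tabulated for comparison. The function names are my own illustrative choices, not part of the dataset or of any tool used in this analysis.

```python
import math
from collections import Counter


def benford_probabilities():
    """Expected first-digit shares under Benford's Law: log10(1 + 1/d)."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the leading nonzero decimal digit of a positive number."""
    if x <= 0:
        raise ValueError("first digit is only defined here for positive numbers")
    # Scientific notation puts the leading digit first, e.g. '1.2345...e+04'.
    # 17 significant digits avoid rounding 9999999.9 up to 1.0e+07.
    return int(f"{x:.16e}"[0])


def first_digit_shares(values):
    """Observed share of each first digit among the positive values."""
    digits = [first_digit(v) for v in values if v > 0]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


if __name__ == "__main__":
    # Hypothetical example: steady 5% growth across many orders of magnitude.
    growth = [1.05 ** k for k in range(1000)]
    observed = first_digit_shares(growth)
    for d, expected in benford_probabilities().items():
        print(d, round(expected, 3), round(observed[d], 3))
```

A series produced by steady exponential growth, like the hypothetical one in the example, should produce shares close to the Benford column.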

In any data analysis, understanding the data is fundamental.  The non-profit dataset consists of multiple years of observations for many non-profit organizations.  Time-series data for a single organization is generated differently from cross-sectional data for a set of organizations.  Nonetheless, I combine these two different data types to get a sufficiently large sample to analyze the non-profits by topic.  The analysis thus conflates data generation within and across organizations.

Preliminary review of the data suggested that management expenses for organizations serving “At-Risk Youth” and “People with Disabilities” provide an interesting comparison.  Both types of organizations serve individuals.  Their shares of reviewer-identified “top” organizations are relatively close.  Both have about 900 management expense figures in the dataset.  Thus distribution tests for these two groups have considerable and roughly comparable power.  Kernel density plots show that the management expense distributions for each group are similar and have much probability density across more than two orders of magnitude.  Hence fitting their first-digit distributions to Benford’s Law seems reasonable.

Despite organizational and statistical similarities, the first-digit distributions for “At-Risk Youth” and “People with Disabilities” management expenses differ significantly.  Management expenses for “At-Risk Youth” organizations are strongly inconsistent with Benford’s Law.  Management expenses for “People with Disabilities” organizations show no evidence of being inconsistent with Benford’s Law.  What accounts for that difference?

A quantile-quantile (Q-Q) plot of management expenses (plotted in base-10 logarithms) indicates that the distributions differ most in their tails.  Nonetheless, if the digit-distribution test is applied only to management expenses greater than $10,000 and less than $10,000,000, the results still differ significantly.  Across that subset of management expenses, the p-values for log-likelihoods for Benford’s Law are 0.0039 and 0.0601 for “At-Risk Youth” and “People with Disabilities” organizations, respectively.  The Q-Q plot over that range displays an undulation apparently associated with the different digit distributions.  Management expenses for “At-Risk Youth”, combined across organizations and over time, appear to be more conventional and less naturally selected than those for “People with Disabilities.”
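A test of this kind can be sketched in a few lines of Python. The sketch below assumes the standard log-likelihood ratio (G) statistic for nine first-digit categories, compared against a chi-square distribution with 8 degrees of freedom. It is not the code behind the p-values above, and it may not match the Stata module's output exactly. The function names and the default range limits are illustrative.

```python
import math
from collections import Counter

# Benford's Law first-digit probabilities.
BENFORD = {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def leading_digit(x):
    """Leading decimal digit of a positive number."""
    return int(f"{x:.16e}"[0])


def benford_lr_test(values, low=10_000, high=10_000_000):
    """Log-likelihood ratio test of first digits against Benford's Law.

    Keeps only values strictly between low and high.
    Returns (G statistic, p-value, number of values kept).
    """
    kept = [v for v in values if low < v < high]
    n = len(kept)
    if n == 0:
        raise ValueError("no values fall inside the requested range")
    counts = Counter(leading_digit(v) for v in kept)
    # G = 2 * sum(observed * ln(observed / expected)); empty cells add zero.
    g = 2.0 * sum(c * math.log(c / (n * BENFORD[d])) for d, c in counts.items())
    g = max(g, 0.0)  # guard against tiny negative rounding error
    # Chi-square survival function with 8 degrees of freedom (closed form
    # for even degrees of freedom): exp(-g/2) * sum_{i<4} (g/2)^i / i!
    half = g / 2.0
    p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(4))
    return g, p_value, n


if __name__ == "__main__":
    # Hypothetical figures spread evenly in logarithms over $10,000 to $10,000,000.
    sample = [10 ** (4 + 3 * (k + 0.5) / 900) for k in range(900)]
    print(benford_lr_test(sample))
```

A small p-value indicates first digits inconsistent with Benford's Law. A large p-value indicates no evidence of inconsistency, which is weaker than evidence of conformity.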

Natural selection, for numbers, organizations, and organisms, is generally associated with increased fitness.  I thus tentatively predict that organizations serving “People with Disabilities” are more effective and less likely to go defunct than those serving “At-Risk Youth.”

All the data used in the above analysis are publicly available.  Do your own analysis to evaluate my prediction and to formulate your own predictions.

*  *  *  *  *

Analysis note:  Digit distributions were tested using Ben Jann’s Digdis Stata module from the Statistical Software Components (ssc) archive. Benford’s Law log-likelihood ratio p-values for all the topics are available in this summary table.  Organizations serving at-risk youth and people with disabilities average about 9 years of management-expense figures per organization.  The figures are mainly from 1998 to 2008.

The original dataset is available here (Excel file).  Here’s some data documentation.  Here’s a tab-separated text version of the dataset, with area and topic created from the original cause field.
