Analyzing public investment with Lorenz curves
This Jupyter notebook describes an approach to measuring equality in public investments. The proposed approach utilizes the well-known Lorenz curve graph, as well as the complementary Gini coefficient.
The notebook begins with some basic background on both the Lorenz curve and the Gini coefficient. It then generates a straightforward example using Python code. The code is thoroughly annotated with comments. While a programming background would be useful, a careful reader of the code comments should be able to get an understanding of each code line’s purpose.
The Lorenz curve & Gini coefficient
Typically, the Lorenz curve is used to visualize the distribution of income or of wealth. It was created by American economist Max O. Lorenz in 1905 as a graphing approach for representing inequality of wealth distribution. The image below is an example of this use. The “B” area, which lies under the Lorenz curve, represents the cumulative resources owned by the population. The established convention is that the X-axis starts on the left with the most marginal group (e.g. the lowest income categorization) and ends on the right with the most powerful group.
The “A” area represents the area or “gap” between the line of perfect equality and the actual distribution. The Gini coefficient is the ratio of the area that lies between the line of equality and the Lorenz curve (“A”) over the total area under the line of equality (“A” and “B”).
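The ratio described above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the notebook's own code: the function names `lorenz_points` and `gini` and the two toy datasets are assumptions made for this example. Area “B” is approximated with the trapezoid rule, and the total area under the line of equality (“A” plus “B”) is 0.5 because both axes run from 0 to 1.

```python
def lorenz_points(shares):
    """Return cumulative (population share, resource share) points.

    `shares` holds the amount of the resource held by each group, with one
    entry per equally sized group. Groups are sorted from the smallest
    holding to the largest, as in a classic income or wealth Lorenz curve.
    """
    ordered = sorted(shares)
    total = sum(ordered)
    n = len(ordered)
    points = [(0.0, 0.0)]  # the curve always starts at the origin
    running = 0.0
    for i, amount in enumerate(ordered, start=1):
        running += amount
        points.append((i / n, running / total))
    return points


def gini(points):
    """Gini coefficient A / (A + B), with B found by the trapezoid rule."""
    area_b = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area_b += (x1 - x0) * (y0 + y1) / 2
    area_total = 0.5  # A + B: the triangle under the line of equality
    return (area_total - area_b) / area_total


# Toy data (invented): four equally sized groups.
equal = lorenz_points([10, 10, 10, 10])   # everyone holds the same amount
unequal = lorenz_points([0, 0, 0, 40])    # one group holds everything

print(round(gini(equal), 3))
print(round(gini(unequal), 3))
```

With perfectly equal holdings the curve coincides with the line of equality and the coefficient is 0. When one of four groups holds everything, the coefficient is 0.75, which is the maximum this piecewise-linear approximation can reach with four groups.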
Equitable Public Investment
While equitable public investment is an oft-mentioned goal of policymakers, it is not always the case that spending and/or capital project allocations are rigorously measured to ensure said goal is met.
In the example below, we use both the Lorenz curve and Gini coefficient to create a plot of the distribution of investment across a population. No specific investment type is identified; it could be transit service hours, infrastructure maintenance, park acreage, etc. The sample data is grouped by ventiles (a “ventile” is one part of twenty, i.e. 5% of the population). This is not a requirement, however: you may choose any other quantile, such as deciles or quintiles.
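As a rough sketch of how ventile data turns into curve points, the snippet below builds cumulative shares from twenty invented investment figures. The numbers, the variable `investment`, and the helper names are all hypothetical and are not the notebook's sample data. One assumption matters here: the groups are kept in order of economic marginality rather than re-sorted by the amount they receive, so the curve can rise above the line of equality and the coefficient can turn negative when marginal groups receive more than a proportional share.

```python
# Hypothetical investment per ventile, ordered from the most economically
# marginal ventile (index 0) to the least marginal (index 19).
# These figures are invented for illustration only.
investment = [2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 7, 7, 7, 8, 8]


def cumulative_shares(amounts):
    """Cumulative (population share, investment share), keeping group order."""
    total = sum(amounts)
    n = len(amounts)  # 20 for ventiles, 10 for deciles, and so on
    points = [(0.0, 0.0)]
    running = 0
    for i, amount in enumerate(amounts, start=1):
        running += amount
        points.append((i / n, running / total))
    return points


def gini_from_points(points):
    """1 - 2B, equivalent to A / (A + B) when A + B = 0.5."""
    area_b = sum((x1 - x0) * (y0 + y1) / 2
                 for (x0, y0), (x1, y1) in zip(points, points[1:]))
    return 1 - 2 * area_b


curve = cumulative_shares(investment)
bottom_half_share = curve[10][1]  # share received by the ten most marginal ventiles
coefficient = gini_from_points(curve)

print(f"Bottom ten ventiles receive {bottom_half_share:.0%} of investment")
print(f"Gini coefficient: {coefficient:.3f}")
```

Because `cumulative_shares` takes the group count from the length of the list, switching from ventiles to another quantile only requires passing a list of a different length.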
To illustrate the practical potential of the approach, a series of nine “small multiples” is created, each representing a hypothetical district/geography within a community. This gives a granular view of how economic marginality affects investment in different spatial groupings within a community, and it makes it easy to check broader patterns at a glance (e.g. whether the “North” part of the community shows a different investment distribution than the “South” part).
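A grid of small multiples like the one described above could be laid out with matplotlib's `plt.subplots`. The sketch below is an assumption about how such a figure might be built, not the notebook's actual code: the district names are hypothetical, and the investment figures are random numbers generated purely to give each panel something to draw.

```python
import random

import matplotlib.pyplot as plt

random.seed(0)  # make the invented data reproducible

# Hypothetical district names; replace with real geographies.
districts = ["North-West", "North", "North-East",
             "West", "Central", "East",
             "South-West", "South", "South-East"]


def cumulative_shares(amounts):
    """Return x and y lists of cumulative population and investment shares."""
    total = sum(amounts)
    n = len(amounts)
    xs, ys, running = [0.0], [0.0], 0.0
    for i, amount in enumerate(amounts, start=1):
        running += amount
        xs.append(i / n)
        ys.append(running / total)
    return xs, ys


# A 3 x 3 grid with shared axes so the panels are directly comparable.
fig, axes = plt.subplots(3, 3, figsize=(9, 9), sharex=True, sharey=True)

for ax, name in zip(axes.flat, districts):
    # Invented investment per ventile, most marginal ventile first.
    amounts = [random.uniform(1, 10) for _ in range(20)]
    xs, ys = cumulative_shares(amounts)
    ax.plot([0, 1], [0, 1], linestyle="--", color="gray")  # line of perfect equality
    ax.plot(xs, ys)  # this district's investment curve
    ax.set_title(name)

axes[2, 1].set_xlabel("Cumulative share of population")
axes[1, 0].set_ylabel("Cumulative share of investment")
fig.tight_layout()
```

Sharing both axes across the grid is a deliberate choice: it keeps every panel on the same 0-to-1 scale, so a curve that sags further below the dashed line in one district is visibly more unequal than its neighbors.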
# These instructions tell the program creating this website what other
Informatx’s Jupyter notebooks are all available on GitHub.