Here is what I mean by astounding. As a test, I calculated the median household income and the total population within a 1-mile radius of each dance club in Arlington, TX, comparing the following methods: (1) total population using ArcMap's standard spatial join tool, (2) total population using an area-weighted summation, (3) average median household income using the standard spatial join tool, and (4) average median household income using an area-weighted average.
The following table displays the results.
Field A displays the dance club's name. Fields B, C, and D display the difference between the standard ArcMap spatial join tool and the area-weighted spatial join tool when calculating average median household income. Fields E, F, and G display the corresponding differences when calculating the total population.
The differences in both cases are quite large. I am thinking of myself and all of the students whom I have seen naively rely on the standard spatial join tool for these types of calculations. Wow...
Why is there such a difference?
An area-weighted spatial join between two polygons comes in two flavors, depending on whether it is calculating an average or a sum.
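If it is calculating an average, the formula is [area-percent] * [value] + [area-percent] * [value] + ..., with one term per join feature. The most important consideration is the percentage of each target feature's area that is covered by each join feature. For example, in a particular zip code, there might be 3 block groups. Let's further suppose that block group 1 comprises 50% of the zip code's area, block group 2 comprises 35%, and block group 3 comprises 15%. The area-weighted average is then 0.50 * [value 1] + 0.35 * [value 2] + 0.15 * [value 3], whereas the standard tool gives each of the three block groups an equal weight of one third.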
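As a rough illustration of the averaging case, here is a minimal Python sketch using the 50% / 35% / 15% split from the example. The income figures and the function name are made up for illustration, and this is not the code in the downloadable scripts. Dividing by the total share is an added safeguard for the case where the block groups do not cover the whole zip code; it changes nothing when the shares add up to 100%.

```python
def area_weighted_average(parts):
    """Average of values, weighted by each join feature's share of the target area.

    parts: list of (share_of_target_area, value) pairs, shares as fractions.
    """
    total_share = sum(share for share, _ in parts)
    if total_share <= 0:
        raise ValueError("no join feature overlaps the target feature")
    # Normalizing by total_share is an assumption added for robustness.
    return sum(share * value for share, value in parts) / total_share


# Hypothetical median household incomes for the three block groups.
block_groups = [(0.50, 40000), (0.35, 60000), (0.15, 80000)]

weighted_income = area_weighted_average(block_groups)

# What an unweighted average of the same three values would give.
unweighted_income = sum(value for _, value in block_groups) / len(block_groups)

print("area-weighted:", round(weighted_income))
print("unweighted:   ", round(unweighted_income))
```

With these made-up numbers the unweighted figure is pulled upward by the block group that covers only 15% of the zip code.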
If it is calculating a sum, the most important consideration is the percentage of each join feature's area that actually intersects the target feature. The formula is [% of join feature's area that intersects target feature] * [value] + [% of join feature's area that intersects target feature] * [value] + ..., with one term per intersecting join feature. This is why you will see a much larger error when using the standard spatial join tool for summations than for averages. If 2% of a block group intersects a zip code, the standard tool will include the entire population of the block group instead of only 2%.
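The summation case can be sketched the same way. This minimal Python example assumes that each block group contributes its population scaled by the fraction of its own area that falls inside the target, which is what the 2% example implies. The populations, fractions, and function names are hypothetical, and this is not the code in the downloadable scripts.

```python
def area_weighted_sum(parts):
    """Sum of values, each scaled by the fraction of its join feature inside the target.

    parts: list of (fraction_of_join_feature_inside_target, value) pairs.
    """
    return sum(fraction * value for fraction, value in parts)


def whole_feature_sum(parts):
    """Sum that counts the full value of every join feature touching the target."""
    return sum(value for fraction, value in parts if fraction > 0)


# Hypothetical block groups: one fully inside the zip code, one 40% inside,
# and one that overlaps the zip code by only 2% of its area.
block_groups = [(1.00, 1200), (0.40, 2000), (0.02, 3000)]

weighted_population = area_weighted_sum(block_groups)
whole_population = whole_feature_sum(block_groups)

print("area-weighted:", round(weighted_population))
print("whole-feature:", round(whole_population))
```

Here the block group that barely touches the zip code adds its full 3,000 people to the whole-feature total but only about 60 to the area-weighted one.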
Is This a Perfect Solution?
No. The weighting assumes that whatever is being measured is distributed perfectly evenly within each join feature, which real populations rarely are. It is, however, a huge improvement.
Where Can I Get the Script?
Download it here. Extract the compressed archive and you will see three Python scripts and an ArcGIS toolbox. Open ArcMap or ArcCatalog, ensure ArcToolbox is visible, and add the Spatial Join Tools.tbx (single-click).
Caveat: These scripts are first drafts and have not been tested on any systems other than the ArcINFO Desktop 9.1 & 9.2 systems here at UT Arlington. There is no documentation. Also, the scripts run on the slow side. Eventually these will be optimized, but at the current time they are presented as is.
Description of the three tools:
- Average Area Weighted: Use this tool to calculate an area-weighted average spatial join between two polygons.
- Sum Area Weighted Join: Use this tool to calculate an area-weighted summation spatial join between two polygons.
- Percent Area Report: Use this tool to generate a report that specifies the percent of the join layer that intersected the target layer.