Weighting the Data
To make disparate raw data comparable and as useful as possible, the data is relativized through normalization and non-linear aggregation. Both procedures are explained below.
Score Normalization
Because Indicators are measured in diverse ways, the algorithm requires a method of normalization so that each Indicator can be scored on a common scale.
For example, ‘Unemployment’ is generally measured as the percentage of eligible individuals who lack employment, so a lower value is desirable. By contrast, ‘Life Expectancy’ is typically measured in years, so a higher value is desirable. Normalizing Indicator scores therefore involves mathematically translating these raw values onto a 0-100 scale, where 0 is the worst possible value and 100 is the best, regardless of the original unit or direction.
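As an illustration only, the translation onto a 0-100 scale can be sketched as a simple linear (min-max) rescaling between a "worst" and a "best" raw value. The source does not state the exact formula CitiIQ uses, so the function name normalize, the linear mapping, and the bounds shown are all assumptions chosen for the example.

```python
def normalize(raw, worst, best):
    """Map a raw Indicator value onto 0-100 (0 = worst, 100 = best).

    Assumption: a linear min-max rescaling. Because `worst` and `best`
    are passed explicitly, the same function handles Indicators where
    lower is better (worst > best) and where higher is better.
    """
    if worst == best:
        raise ValueError("worst and best bounds must differ")
    score = (raw - worst) / (best - worst) * 100.0
    # Clamp values that fall outside the assumed bounds.
    return max(0.0, min(100.0, score))


# Hypothetical bounds, not taken from the source.
# Unemployment (%): lower is better, so worst=20 and best=0.
unemployment_score = normalize(5.0, worst=20.0, best=0.0)

# Life Expectancy (years): higher is better, so worst=50 and best=90.
life_expectancy_score = normalize(70.0, worst=50.0, best=90.0)

print(unemployment_score, life_expectancy_score)
```

Under these assumed bounds, both Indicators land on the same scale and point in the same direction: a higher normalized score is always better, whatever the original unit.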
Constraint of Linear Aggregation
Forms of linear aggregation, such as the average or mean, allot equal weight to each part comprising the whole and therefore fail to reflect the complex relationships between Indicators and Considerations. For example, while a high score should indicate effective performance, aggregating scores linearly (where each is weighted equally) can obscure outlier scores, such as Indicators that score particularly well or particularly poorly.
To promote clarity and accuracy in exploring community performance across the 35 Considerations, CitiIQ employs non-linear aggregation to combine Indicator scores and to highlight the “gaps” that might otherwise be overlooked.
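The contrast between the two approaches can be sketched with a small example. The source does not specify the non-linear formula CitiIQ uses, so the power mean below (with an exponent below 1, which pulls the result toward low scores) is a stand-in chosen purely for illustration. The function names and the four scores are likewise hypothetical.

```python
def linear_aggregate(scores):
    """Plain arithmetic mean: every score carries equal weight."""
    return sum(scores) / len(scores)


def power_mean_aggregate(scores, p=0.5):
    """Generalized (power) mean of non-negative scores.

    Assumption: used here only as one example of non-linear
    aggregation. For p < 1 a low score drags the result down more
    than it would in an arithmetic mean.
    """
    if p <= 0:
        raise ValueError("this sketch supports p > 0 only")
    if any(s < 0 for s in scores):
        raise ValueError("scores must be non-negative")
    return (sum(s ** p for s in scores) / len(scores)) ** (1.0 / p)


# Hypothetical normalized Indicator scores: three strong, one weak.
scores = [90.0, 90.0, 90.0, 10.0]

linear_score = linear_aggregate(scores)         # 70.0
nonlinear_score = power_mean_aggregate(scores)  # about 62.5

print(linear_score, nonlinear_score)
```

In this sketch the arithmetic mean reports 70, which hides how weak the fourth Indicator is, while the non-linear result is lower and so makes the gap more visible.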
The following figure illustrates the bias that linear aggregation can introduce into a model. The linear score is calculated by averaging the four scores, and the non-linear score is calculated on a weighted basis.