Although these issues of statistical generalization apply to data symbolized by points, lines, and areas, this discussion is developed around the mapping of areas in choropleth maps. This is partly because choropleth maps are so widely used, and partly because they are difficult to execute effectively. Choropleth maps have an inherent weakness: they aggregate data within areal units that do not correspond exactly with the underlying spatial distribution of the data. Focusing on choropleth mapping in the following examples allows some of these weaknesses to be revealed and discussed.
These three maps are divided into quantiles of two, five, and nine categories, respectively.
B. Comparison of maps using different ranging methods
These three maps each have five ranges of data, but they were determined using different methods. The first map uses equal steps, the second has user defined ranges, and the third is divided into quintiles.
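The difference between equal steps and quintiles can be sketched in a few lines of Python. This is only an illustration: the function names and the sample values are hypothetical and are not taken from the maps above, and the quintile cutpoints rely on the standard library's interpolation rule, which is one of several in common use.

```python
import statistics

def equal_step_cutpoints(values, n_classes):
    """Interior cutpoints that split the data range into equal-width classes."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_classes
    return [lo + width * i for i in range(1, n_classes)]

def quantile_cutpoints(values, n_classes):
    """Interior cutpoints that put roughly equal numbers of observations in each class."""
    return statistics.quantiles(values, n=n_classes, method="inclusive")

# Hypothetical, strongly skewed data (assumption for illustration only)
values = [1, 2, 2, 3, 4, 5, 7, 9, 14, 20, 35, 60, 100]

equal = equal_step_cutpoints(values, 5)   # four cutpoints, each 19.8 units apart
quint = quantile_cutpoints(values, 5)     # four cutpoints based on rank order

print("equal steps:", equal)
print("quintiles:  ", quint)
```

With skewed data such as this, the two methods place their cutpoints very differently, which is why maps of the same data can look so unlike one another.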
6.3 Exploring your data and its "shape"
6.4 Commonly employed ranging methods for assigning cutpoints
In generalizing statistical distributions, cartographers use the term "cutpoint" to refer to the boundaries between categories. All the following methods pertain to the calculation or assignment of these cutpoints. Remember, all systems of classification depend upon the use of "exhaustive" and "mutually exclusive" categories. Exhaustive means that the categories classify all values of a given data range--no values within that range are omitted from the classification system. Mutually exclusive means that any given observation can be placed in one and only one category--data categories cannot overlap. If you are using an automated mapping system, check that the system does not assign overlapping cutpoints automatically when it creates the map legend.
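One way to guarantee both properties is to treat each category as a half-open interval, so that a value falling exactly on a cutpoint belongs to the upper category only. The following is a minimal sketch under that assumption; the function name `classify` and the cutpoint values are hypothetical.

```python
from bisect import bisect_right

def classify(value, cutpoints):
    """Return the 0-based category index for a value.

    Categories are half-open intervals [lower, upper), so a value equal
    to a cutpoint is placed in the upper category only (mutually exclusive).
    Values below the first or above the last cutpoint fall in the end
    categories, so no value is left unclassified (exhaustive).
    """
    return bisect_right(cutpoints, value)

cutpoints = [10, 20, 30]  # hypothetical interior cutpoints -> 4 categories
observations = [5, 10, 19.9, 20, 30, 45]
classes = [classify(v, cutpoints) for v in observations]
print(classes)  # [0, 1, 1, 2, 3, 3]
```

A legend written as "10-20, 20-30" leaves the value 20 ambiguous; the half-open convention above is one way to resolve it, and the legend labels should state whichever convention is used.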
The method is useful for mapping rectangular distributions. It is also useful for exploratory analysis, at times when you wish to develop a "feel" for the characteristics of a data distribution.
This method can be applied effectively to data that is J-shaped with a peak at the low end of the distribution.
In this method, the widths of the category intervals are increased in size at a geometric (that is, multiplicative) rate. If your first category is 2 units wide, the second would be 2x2 or 4 units wide, the third 2x2x2 or 8 units wide, and so forth to the end of the distribution.
This method can be applied effectively to data that is J-shaped with a peak at the low end of the distribution but with a long "stretch" between low and high values.
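The geometric widening of intervals described above can be sketched as follows. The function name and its parameters are hypothetical; the starting value of 0 is an assumption, while the first width of 2 and the doubling ratio follow the example in the text.

```python
def geometric_bounds(start, first_width, ratio, n_classes):
    """Return all category boundaries (n_classes + 1 values) whose
    widths grow by a constant multiplicative ratio."""
    bounds = [start]
    width = first_width
    for _ in range(n_classes):
        bounds.append(bounds[-1] + width)
        width *= ratio
    return bounds

# Widths of 2, 4, 8, and 16 units, starting from an assumed minimum of 0
bounds = geometric_bounds(start=0, first_width=2, ratio=2, n_classes=4)
print(bounds)  # [0, 2, 6, 14, 30]
```

In practice the first width and the ratio would be chosen so that the last boundary reaches the maximum value in the data.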
This method can be applied to distributions that approximate a normal curve.
If your data is J-shaped with a peak at the high end of the distribution, the inverses of the arithmetic and geometric progressions can be employed. Inverting the cutpoints places the narrowest intervals at the high end of the distribution, where the observations are concentrated.
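One possible reading of "inverting" is to reverse the order of the interval widths, so that the widest category comes first and the narrowest last. The sketch below follows that interpretation for the geometric case; the function name, the starting value, and the parameters are assumptions for illustration.

```python
def inverted_geometric_bounds(start, first_width, ratio, n_classes):
    """Category boundaries for a geometric progression whose widths are
    reversed, so the narrowest intervals sit at the high end."""
    widths = [first_width * ratio ** i for i in range(n_classes)]
    widths.reverse()  # widest interval first, narrowest last
    bounds = [start]
    for w in widths:
        bounds.append(bounds[-1] + w)
    return bounds

# Widths of 16, 8, 4, and 2 units, starting from an assumed minimum of 0
inverted = inverted_geometric_bounds(start=0, first_width=2, ratio=2, n_classes=4)
print(inverted)  # [0, 16, 24, 28, 30]
```

The same reversal applies to an arithmetic progression by replacing the multiplicative widths with ones that grow by a constant increment.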
6.5 Symbolizing the Category Ranges
6.6 Statistical annotations are needed for some complex datasets
Further Reading
Coulson, Michael R.C. 1987. In the matter of class intervals for choropleth maps: With particular reference to the work of George F. Jenks. Cartographica 24 (2): 16-39.
Evans, Ian S. 1977. The selection of class intervals. Transactions of the Institute of British Geographers New Series 2: 98-124.
Jenks, George F. 1963. Generalization in statistical mapping. Annals of the Association of American Geographers 53: 15-26.
Jenks, George F. and Duane S. Knos. 1961. The Use of Shading Patterns in Graded Series. Annals of the Association of American Geographers 51: 316-334.