Evenness indices once again: critical analysis of properties

Various properties have been advocated for biological evenness indices, with some properties being clearly desirable while others appear questionable. With a focus on such properties, this paper makes a distinction between properties that are clearly necessary and those that appear to be unnecessary or even inappropriate. Based on Euclidean distances as a criterion, conditions are introduced in order for an index to provide valid, true, and realistic representations of the evenness characteristic (attribute) from species abundance distributions. Without such value-validity property, it is argued that a measure or index provides only limited information about the evenness and results in misleading interpretations and evenness comparisons and incorrect results and conclusions. Among the overabundant variety of evenness indices, each of which is typically derived by rescaling a diversity measure to the interval from 0 to 1 and thereby controlling or adjusting for the species richness, most are found to lack the value-validity property and some lack the property of strict Schur-concavity. The most popular entropy-based index reveals an especially poor performance with a substantial overstatement of the evenness characteristic or a large positive value bias. One evenness index emerges as the preferred one, satisfying all properties and conditions. This index is based directly on Euclidean distances between relevant species abundance distributions and has an intuitively meaningful interpretation in terms of relative distances between distributions. The value validity of the indices is assessed by using a recently introduced probability distribution and from the use of computer-generated distributions with randomly varying species richness and probability (proportion) components.


Introduction
There is now an embarrassment of riches of indices used to measure the evenness or uniformity of species distributions in biology. A researcher seeking an evenness index for a particular study faces a bewildering choice, and there does not appear to be any general agreement as to which index is to be preferred over others. Extensive reviews of such indices and their properties have been provided by Smith and Wilson (1996) and Tuomisto (2012). Table 1 summarizes proposed evenness indices that take on values over the interval from 0 to 1 (including a new index $E_{11}$).
Evenness indices are typically functions of some diversity measure $D$ and the number of species in a sample or collection, called species richness and denoted by $S$, i.e.,

$$E = f(D, S) \tag{1}$$

where the function $f$ is generally such that the evenness index ranges in value from 0 to 1. While there is a plethora of measures of species diversity and no consensus about which measure is "best" (e.g., Magurran 2004, Ch. 4; Grassle et al. 1979; Tuomisto 2012), the present paper is confined to the measurement of species evenness with the focus on the properties of such measures or indices. A variety of properties have been imposed on evenness indices (see, e.g., Smith and Wilson 1996; Taillie 1979; Engen 1979; Tuomisto 2012; Gosselin 2001; Ricotta 2004). Some of these appear to be entirely appropriate and generally agreed upon, while others seem to be unnecessary and even unreasonable. This paper starts off with a discussion of properties that are considered to be necessary for any acceptable evenness index, followed by comments about some dubious properties. One of the properties introduced as necessary is the value-validity property, which basically requires that an evenness index takes on values that are all reasonable with respect to an acceptable criterion. It is argued that this property, which nearly all proposed evenness indices lack, is necessary in order for comparisons between differences in evenness values to be valid. An index lacking this property may cause misleading results and incorrect conclusions.
The value-validity requirement of an index is based on the recently introduced lambda distribution (Kvålseth 2011) and a criterion involving Euclidean distances between species abundance distributions. The assessment of whether or not an index meets this requirement is then done analytically using the lambda distribution and empirically using computer-generated random species abundance distributions. It is also proved that some well-known indices are not valid since they lack the essential strict Schur-concavity property and hence do not preserve the Lorenz order.

Properties of evenness indices
Let $E(P_S)$, or simply $E$, represent the value of a generic evenness index for the species abundance distribution $P_S = (p_1, \ldots, p_S)$, where $p_i$ is the probability (proportion) of the $i$th species, with $p_i \ge 0$ for $i = 1, \ldots, S$ and $\sum_{i=1}^{S} p_i = 1$. If each $p_i$ is based on a sample or collection as opposed to unknown population probabilities, then $p_i = n_i/N$, where $n_i$ is the frequency of individuals in species $i$, $i = 1, \ldots, S$, and $N = \sum_{i=1}^{S} n_i$ is the sample size. The extreme distributions are then

$$P_S^0 = (1, 0, \ldots, 0), \qquad P_S^1 = (1/S, \ldots, 1/S) \tag{2}$$

for which $E$ is expected to take on its extremal values for any given species richness $S$. In the case of a censused community or collection of species, the most uneven species abundance distribution becomes $(1 - (S - 1)/N, 1/N, \ldots, 1/N)$ and the most even distribution equals $P_S^1$ in (2) only when $N/S$ is an integer. However, when the sample size $N$ is large, these two extreme distributions become effectively equivalent to those in (2).

Table 1 Proposed evenness indices varying over the interval from 0 to 1 and based on the species abundance probabilities (proportions) $p_1, \ldots, p_S$ and species richness $S$

Designation | Formula | Reference | Notes
$E_1$ | … | Pielou (1966) |
$E_2$ | $(e^H - 1)/(S - 1)$ | Heip (1974) | a
$E_3$ | … | Smith and Wilson (1996) |
$E_4$ | $-\log\left(\sum_{i=1}^{S} p_i^2\right)/\log S$ | Smith and Wilson (1996) |
… | … | … | …
$E_{11}$ | … | New | c

Notes: a. The $H$ stands for the Shannon (1948) entropy defined for $E_1$ and with base-$e$ (natural) logarithms ($E_1$ and $E_4$ are indifferent as to which logarithm is used). b. Engen (1979) attributed this index to F.M. Williams (1977) in an unpublished manuscript. c. The $p_i$'s are here in descending order ($p_1 \ge p_2 \ge \cdots \ge p_S$).
Based on this notation, the various potential properties of an evenness index E will be discussed, starting with an outline and explanation of those properties that are suggested as being indeed necessary. Subsequent sections of this paper will discuss the value-validity property necessary for making appropriate comparisons and interpretations of E-values.

Necessary properties
(P1) Continuity: E is a continuous function of each of the p i , i = 1, …, S, for any given S.
(P2) Symmetry: E is (permutation) symmetric in p 1 , …, p S so that the value of E is invariant with respect to all permutations of the p i 's.
(P3) Normalization: $E$ takes on values between 0 and 1.
(P4) Schur-concavity: $E$ is strictly Schur-concave. In order to explain this property, the concept of majorization needs to be first defined. The rather vague notion that the components of $P_S = (p_1, \ldots, p_S)$ are "less spread out", "more nearly equal", or "more even" than are the components of the distribution $Q_S = (q_1, \ldots, q_S)$ can be expressed as "$P_S$ being majorized by $Q_S$" and denoted by $P_S \prec Q_S$. Thus, if the $p_i$ ($i = 1, \ldots, S$) are ordered such that

$$p_1 \ge p_2 \ge \cdots \ge p_S \tag{3}$$

with $Q_S$ being similarly ordered, then, by definition,

$$P_S \prec Q_S \quad \text{if} \quad \sum_{i=1}^{k} p_i \le \sum_{i=1}^{k} q_i, \; k = 1, \ldots, S - 1, \quad \sum_{i=1}^{S} p_i = \sum_{i=1}^{S} q_i \tag{4}$$

and $E$ is strictly Schur-concave if

$$P_S \prec Q_S \;\Rightarrow\; E(P_S) > E(Q_S) \tag{5}$$

whenever $P_S$ is not a permutation of $Q_S$ (Marshall et al. 2011, pp. 8, 80). As a simple numerical example, consider the two distributions $P_4 = (0.40, 0.30, 0.20, 0.10)$ and $Q_4 = (0.70, 0.15, 0.10, 0.05)$ where the components have been ordered as in (3). The partial sums of $P_4$ are 0.40, 0.70, and 0.90, none of which exceeds the corresponding partial sums 0.70, 0.85, and 0.95 of $Q_4$, so it is readily apparent from (4) that $P_4$ is majorized by $Q_4$, i.e., $P_4 \prec Q_4$. Then, since the evenness index $E$ is required to be strictly Schur-concave, $E(P_4) > E(Q_4)$ from (5). Clearly, the components of $P_4$ are "more evenly distributed" or "more nearly equal" than those of $Q_4$ and the evenness based on $P_4$ should be greater than that based on $Q_4$. This is precisely what Property P4 requires.
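The majorization check behind Property P4 can be sketched in a few lines of Python. This is only an illustration of the partial-sum comparison described above, not part of the original analysis; the function name `is_majorized_by` and the tolerance `tol` are hypothetical choices made here.

```python
from itertools import accumulate

def is_majorized_by(p, q, tol=1e-12):
    """Return True if p is majorized by q (partial-sum comparison).

    Note: this is the weak relation, so it also returns True when p is
    a permutation of q; the strict inequality for E excludes that case.
    """
    if len(p) != len(q):
        raise ValueError("distributions must have the same number of components")
    # Put both distributions in descending order.
    ps = sorted(p, reverse=True)
    qs = sorted(q, reverse=True)
    # The totals must agree (both are 1 for probability distributions).
    if abs(sum(ps) - sum(qs)) > tol:
        return False
    # Each partial sum of p must not exceed the matching partial sum of q.
    return all(a <= b + tol for a, b in zip(accumulate(ps), accumulate(qs)))

P4 = [0.40, 0.30, 0.20, 0.10]
Q4 = [0.70, 0.15, 0.10, 0.05]
print(is_majorized_by(P4, Q4))  # True
print(is_majorized_by(Q4, P4))  # False
```

Any strictly Schur-concave index should therefore rank the first distribution as more even than the second.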
(P5) Value Validity: E takes on values that are all valid representations of the true extent of the species evenness characteristic.
This last property, which is considered necessary for making appropriate evenness difference comparisons, is explained in detail in subsequent sections of the paper.

Comment on property P1
The continuity requirement ensures that, for any given or fixed $S$, small changes in some of the $p_i$'s cause only a small change in the value of $E$. However, since $E$ may be a function of both the $p_i$ ($i = 1, \ldots, S$) and $S$, a change in $S$ and hence changes in the $p_i$'s will necessarily result in a discontinuous change in $E$. In fact, if $S$ is very small, the addition of one more species may produce a substantial jump in the value of $E$. Such discontinuity with varying $S$ has been discussed by, for instance, Routledge (1983), Jost (2010), and Ricotta (2004).

Other implications from property P4
The strict Schur-concavity Property P4 of $E$ has some important implications. Since, for any $P_S = (p_1, \ldots, p_S)$, it is readily apparent from the definition in (4) that $P_S^1 \prec P_S \prec P_S^0$ for the $P_S^1$ and $P_S^0$ in (2), it then follows from (5) that

$$E(P_S^0) \le E(P_S) \le E(P_S^1) \tag{6}$$

with the inequalities being strict if $P_S$ is not a permutation of $P_S^0$ or $P_S^1$. This can be expressed as the following sub-property:
(P4a) For any given species richness $S$, $E$ attains its minimum and maximum values for the species abundance distributions $P_S^0$ and $P_S^1$ in (2), respectively.
Another consequence of Property P4 is that $E$ satisfies the principle of transfers, which, within the context of income distributions, was introduced by Dalton (1920). That is, if $p_i < p_j$ in the species abundance distribution $P_S = (p_1, \ldots, p_S)$ and if an amount $\delta$ is transferred from $p_j$ to $p_i$, with $\delta < p_j - p_i$, then the value of $E$ increases. This can be seen to follow from (3)-(5). See also Marshall et al. (2011), pp. 6-8. That is, a sub-property of $E$ is as follows:
(P4b) $E$ satisfies the transfer property.
As a simple example, consider the four-species distribution $Q_4 = (0.30, 0.15, 0.50, 0.05)$ with $\delta = 0.20$ being transferred from $p_3$ to $p_4$ to produce the distribution $P_4 = (0.30, 0.15, 0.30, 0.25)$. It is then clear from (3)-(4) that $P_4$ is majorized by $Q_4$, so that, from (5) and the strict Schur-concavity of $E$, $E(P_4) > E(Q_4)$.
A third consequence of Property P4 may be stated in terms of Lorenz curves as follows (see Marshall et al. 2011, pp. 5-6, 713-715):
(P4c) $E$ preserves the Lorenz order.
As an explanation, the Lorenz curve for a distribution $P_S = (p_1, \ldots, p_S)$ is typically based on the cumulative probabilities $F_{(i)} = \sum_{j=1}^{i} p_{(j)}$ after the $p_i$'s have been ordered such that $p_{(1)} \le p_{(2)} \le \cdots \le p_{(S)}$. The Lorenz curve (after Lorenz 1905) is then obtained by joining the consecutive points $(i/S, F_{(i)})$ for $i = 0, 1, \ldots, S$ and $F_{(0)} = 0$ by successive line segments to connect the origin (0, 0) with the point (1, 1). If the Lorenz curve for $P_S$ is nowhere below and does not coincide everywhere with the Lorenz curve for the distribution $Q_S = (q_1, \ldots, q_S)$, this is equivalent to the majorization in (3)-(4) so that, from (5), $E(P_S) > E(Q_S)$. The Lorenz curve for the uniform or completely even distribution $P_S^1$ in (2) is the straight line between the two points (0, 0) and (1, 1) so that the value $E(P_S)$ increases as the Lorenz curve for $P_S$ gets closer to this straight line.
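The Lorenz-curve construction and the transfer example of Property P4b can be illustrated together. The sketch below is a minimal illustration under the definitions just given; the helper names `lorenz_points` and `lorenz_nowhere_below` are hypothetical, and the vertex-by-vertex comparison is only adequate because both distributions have the same $S$.

```python
from itertools import accumulate

def lorenz_points(p):
    """Points (i/S, F_(i)) for i = 0..S, with components sorted ascending."""
    S = len(p)
    cum = [0.0] + list(accumulate(sorted(p)))
    return [(i / S, cum[i]) for i in range(S + 1)]

def lorenz_nowhere_below(p, q, tol=1e-12):
    """True if the Lorenz curve of p lies on or above that of q.

    Both curves are piecewise linear with the same breakpoints i/S,
    so comparing them at the vertices is sufficient (same S assumed).
    """
    if len(p) != len(q):
        raise ValueError("this simple check assumes equal species richness")
    return all(fp >= fq - tol
               for (_, fp), (_, fq) in zip(lorenz_points(p), lorenz_points(q)))

Q4 = [0.30, 0.15, 0.50, 0.05]
P4 = [0.30, 0.15, 0.30, 0.25]  # Q4 after moving 0.20 from p_3 to p_4

for x, f in lorenz_points(P4):
    print(f"{x:.2f}  {f:.2f}")
print(lorenz_nowhere_below(P4, Q4))  # True
```

The transfer moves the Lorenz curve toward the diagonal, which is the graphical counterpart of $P_4 \prec Q_4$.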

Introductory comments on value validity
The motivation behind the need for an evenness index to have value validity can perhaps be best illustrated by means of some simple numerical examples. Consider, for instance, the species abundance distributions $P_2 = (0.75, 0.25)$ and $Q_5 = (0.60, 0.10, 0.10, 0.10, 0.10)$ and the Pielou index $E_1$ in Table 1, for which $E_1(P_2) = 0.81$ and $E_1(Q_5) = 0.76$. Both of these results would seem to indicate a "high" degree of evenness, with $P_2$ appearing to be about 7% more even than $Q_5$. However, the components of each of these two distributions are equally far from the corresponding components of the respective pairs of extreme distributions $P_2^0$ and $P_2^1$ and $Q_5^0$ and $Q_5^1$ defined in (2), so that the only reasonable evenness value should be 0.50 for both $P_2$ and $Q_5$. Furthermore, if, say, $\sqrt{E_1}$ or more generally $E_1^{\alpha}$ ($\alpha > 0$) were to be considered as alternative indices, which can be shown to have the same Properties P1-P4 as $E_1$, the results could be even more unreasonable, such as $\sqrt{E_1(Q_5)} = 0.87$. Of course, the purpose of using an evenness index $E$ is to compare the degree or extent of evenness of different species abundance distributions. Thus, in simplified notation, if $e_1, e_2, \ldots$ denote values of the index $E$ for different species distributions, the various comparisons of potential interest may be defined as follows:

$$\text{Size comparisons:} \quad e_1 > e_2 \tag{7a}$$

$$\text{Difference comparisons:} \quad e_1 - e_2 > e_3 - e_4 \tag{7b}$$

$$\text{Proportional difference comparisons:} \quad e_1 - e_2 = c(e_3 - e_4) \tag{7c}$$

where $c$ is some constant. Even the above interpretation of $E_1(0.75, 0.25) = 0.81$ as indicating a "high" degree of evenness is basically a difference comparison as in (7b), with $0.81 - 0 \gg 1 - 0.81$, i.e., the 0.81 value is much further from the minimum $E_1$-value of 0 than it is from the maximum $E_1$-value of 1.
Similarly, the above statement that E 1 (0.75, 0.25) = 0.81 exceeds E 1 (0.60, 0.10, …, 0.10) by about 7% is equivalent to e 1 = 0.81, e 2 = e 3 = 0.76, e 4 = 0, and c = 0.07 in (7c). When making such comparisons, it is essential that they are not limited to the evenness index itself, but that they provide true representations of the attribute (characteristic) being measured. Otherwise, the results obtained may be invalid and misleading, which is not uncommon in the published literature. What is needed of the index E is that it takes on values throughout its range that are all accurate, true, or valid representations of the evenness attribute. That is, E has to have value validity (Property P5). In measurement theory, the term validity of a measurement procedure means that it does in fact measure what it purports to measure. While several different types of validity have been defined (see, e.g., Hand 2004, pp. 129-134), value validity as used here applies specifically to the requirement that all potential numerical values of a measure have to be appropriate or reasonable with respect to some generally acceptable criterion as explained in the next subsection of this paper.
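The numerical values quoted in this example can be checked with a short script. The sketch assumes that $E_1$ takes the usual Pielou form $H/\log S$ with $H$ the Shannon entropy, an assumption made here because it reproduces the quoted 0.81 and 0.76; the function name `pielou` is hypothetical.

```python
import math

def pielou(p):
    """Assumed form of E_1: Shannon entropy H divided by log S (0 log 0 = 0)."""
    S = len(p)
    H = -sum(x * math.log(x) for x in p if x > 0)
    return H / math.log(S)

P2 = [0.75, 0.25]
Q5 = [0.60, 0.10, 0.10, 0.10, 0.10]

e_p, e_q = pielou(P2), pielou(Q5)
print(round(e_p, 2), round(e_q, 2))   # 0.81 0.76
print(round(math.sqrt(e_q), 2))       # 0.87

# Both distributions lie halfway between the two extreme distributions,
# i.e. each equals 0.5 * (1, 0, ..., 0) + 0.5 * (1/S, ..., 1/S).
for p in (P2, Q5):
    S = len(p)
    midpoint = [0.5 + 0.5 / S] + [0.5 / S] * (S - 1)
    print(all(abs(a - b) < 1e-12 for a, b in zip(p, midpoint)))  # True
```

Both index values sit well above 0.50 even though each distribution is the exact midpoint of its extremes, which is the overstatement discussed above.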
The importance of a concept such as value validity does not appear to have been the subject of rigorous attention in the numerous publications involving evenness measurement and indices. Some have commented that certain evenness indices "tend to strongly overestimate evenness" or "give unreasonably high values of evenness" (Bulla 1994, pp. 167, 169). The results from computer-generated combinations of species frequencies by Fager (1972) implied that indices such as Pielou's $E_1$ and Smith and Wilson's $E_3$ in Table 1 tended to overstate evenness and had "strongly skewed distributions…" indicating that those indices "would give poor discrimination over much of the range of samples" (p. 301). Molinari (1989) did consider the importance of reasonable intermediate values of an evenness index, focusing on the case of two-species communities, and Smith and Wilson (1996) as well as Jost (2010) emphasized the need to consider intuitively reasonable intermediate index values. However, no specific value-validity conditions appear to have been formulated.

Conditions for value validity
In order to establish value-validity conditions on $E$ (Property P5), it will be convenient to use the following lambda distribution recently introduced by this author (Kvålseth 2011, 2014):

$$P_S^{\lambda} = \left(1 - \lambda + \frac{\lambda}{S}, \frac{\lambda}{S}, \ldots, \frac{\lambda}{S}\right), \quad 0 \le \lambda \le 1 \tag{8}$$

where $\lambda$ is a parameter and $S$ denotes the species richness. The $\lambda$ is basically an evenness or uniformity parameter with $\lambda = 1$ for the completely even (uniform) distribution $P_S^1$ in (2) and $\lambda = 0$ for the degenerate distribution $P_S^0$ in (2). Note also that each component of $P_S^{\lambda}$ can be considered the weighted mean of components of $P_S^0$ and $P_S^1$, i.e.,

$$P_S^{\lambda} = \lambda P_S^1 + (1 - \lambda) P_S^0 \tag{9}$$

Since the strict Schur-concavity of $E$ ensures that the inequality in (6) holds for all distributions $P_S = (p_1, \ldots, p_S)$, it also holds for $P_S^{\lambda}$ and all $S$ and $\lambda$. Therefore, for any given $P_S$, there exists one, and only one, value of $\lambda$ such that

$$E(P_S) = E(P_S^{\lambda}) \quad \text{for any given } P_S \text{ and one unique } \lambda\text{-value} \tag{10}$$

Of course, for any given $S$, there may be any number of different $P_S$-distributions for which the value of $E$ would be the same and hence for which (10) would apply. Because of (10), the value validity of $E$ and appropriate conditions can be considered in terms of the $P_S^{\lambda}$-distribution in (8).
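The existence of a unique matching $\lambda$ can be illustrated numerically. The sketch below assumes the lambda distribution has the form $(1 - \lambda + \lambda/S, \lambda/S, \ldots, \lambda/S)$, which is the weighted mean of the two extreme distributions, and it uses a Pielou-type index $H/\log S$ purely as one example of a strictly Schur-concave $E$. The function names and the bisection tolerance are hypothetical choices.

```python
import math

def lambda_distribution(lam, S):
    """Assumed form: lam * (1/S, ..., 1/S) + (1 - lam) * (1, 0, ..., 0)."""
    return [1 - lam + lam / S] + [lam / S] * (S - 1)

def pielou(p):
    """Example index only: Shannon entropy divided by log S."""
    H = -sum(x * math.log(x) for x in p if x > 0)
    return H / math.log(len(p))

def matching_lambda(index, p, tol=1e-10):
    """Bisection for the lam in [0, 1] with index(P_S^lam) == index(p).

    Relies on index(P_S^lam) increasing in lam, which holds for a
    strictly Schur-concave index.
    """
    S, target = len(p), index(p)
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if index(lambda_distribution(mid, S)) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

Q5 = [0.60, 0.10, 0.10, 0.10, 0.10]
print(round(matching_lambda(pielou, Q5), 4))        # 0.5

P5 = [n / 98 for n in (20, 5, 60, 10, 3)]
lam = matching_lambda(pielou, P5)
print(abs(pielou(lambda_distribution(lam, 5)) - pielou(P5)) < 1e-8)  # True
```

The first distribution is itself a lambda distribution with $\lambda = 0.5$, so the search simply recovers that value; for the second, the search returns the single $\lambda$ at which the two index values agree.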
Consider now that each distribution $P_S$ is a point or vector in $S$-dimensional Euclidean space with Cartesian coordinates $p_1, \ldots, p_S$. The Euclidean distance between the points $P_S$ and $Q_S$ is given by

$$d(P_S, Q_S) = \left[\sum_{i=1}^{S} (p_i - q_i)^2\right]^{1/2}$$

A most logical requirement for value validity based on Euclidean distances may then be formulated as the following equality between distance ratios:

$$\frac{E(P_S^1) - E(P_S^{\lambda})}{E(P_S^1) - E(P_S^0)} = \frac{d(P_S^{\lambda}, P_S^1)}{d(P_S^0, P_S^1)} \tag{11}$$

for the $P_S^0$, $P_S^1$, $P_S^{\lambda}$ defined in (2) and (8). Since the right-hand side ratio in (11) can be seen to equal $1 - \lambda$ and from the Schur-concavity Property P4, specifically (6), the relationship in (11) can equivalently be expressed as

$$E(P_S^{\lambda}) = \lambda E(P_S^1) + (1 - \lambda) E(P_S^0) \tag{12}$$

which, for $E(P_S^0) = 0$ and $E(P_S^1) = 1$, reduces to

$$E(P_S^{\lambda}) = \lambda \tag{13}$$

The distance-based condition in (12) is also a logical implication from (9). That is, the evenness value in (12) consists of a weighted mean equivalent to that of the underlying species distributions in (9). The condition in (11) and hence in (12)-(13) applies specifically to the lambda distribution $P_S^{\lambda}$ in (8). However, because of (10), it seems entirely reasonable to also require that (11) should generally provide a good (close) approximation when $P_S^{\lambda}$ is replaced by any species distribution $P_S = (p_1, \ldots, p_S)$. Therefore, for any distribution $P_S$, the approximation corresponding to (13) becomes

$$E(P_S) \approx 1 - \frac{d(P_S, P_S^1)}{d(P_S^0, P_S^1)} = 1 - \left[\frac{S\sum_{i=1}^{S} p_i^2 - 1}{S - 1}\right]^{1/2} \tag{14}$$

Thus, (13) and (14) become conditions for value validity of the evenness index $E$.
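The distance-ratio condition can be verified numerically. The sketch below computes one minus the ratio of Euclidean distances, $1 - d(P_S, P_S^1)/d(P_S^0, P_S^1)$, and checks that it returns $\lambda$ exactly for lambda distributions, here assumed to take the form $(1 - \lambda + \lambda/S, \lambda/S, \ldots, \lambda/S)$. The function names are hypothetical.

```python
import math

def euclid(p, q):
    """Euclidean distance between two distributions of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_evenness(p):
    """One minus the distance to the uniform distribution, relative to
    the distance between the degenerate and uniform distributions."""
    S = len(p)
    p0 = [1.0] + [0.0] * (S - 1)   # degenerate distribution
    p1 = [1.0 / S] * S             # uniform distribution
    return 1 - euclid(p, p1) / euclid(p0, p1)

def lambda_distribution(lam, S):
    """Assumed form of the lambda distribution."""
    return [1 - lam + lam / S] + [lam / S] * (S - 1)

# The distance-based value equals lam for every lambda distribution tried.
max_error = max(
    abs(distance_evenness(lambda_distribution(lam, S)) - lam)
    for S in (2, 5, 20, 100)
    for lam in (0.0, 0.1, 0.5, 0.9, 1.0)
)
print(max_error < 1e-12)  # True

print(round(distance_evenness([0.75, 0.25]), 2))                    # 0.5
print(round(distance_evenness([0.60, 0.10, 0.10, 0.10, 0.10]), 2))  # 0.5
```

The last two lines return 0.50 for the two distributions of the earlier numerical example, which is the value argued there to be the only reasonable one.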

Distance criterion
The Euclidean distance is the metric being used as the basic criterion for the value-validity conditions in (11)-(14). Other metric distance functions could have been considered, such as alternative members of the Minkowski family

$$d_{\alpha}(P_S, Q_S) = \left(\sum_{i=1}^{S} |p_i - q_i|^{\alpha}\right)^{1/\alpha}$$

for parameter $\alpha \ge 1$, with $\alpha = 2$ being the Euclidean metric and $\alpha = 1$ being the so-called city-block or Manhattan metric (e.g., Upton and Cook 2008, pp. 118-119). It is easily seen that, if any member of this distance family is used in (11), the expressions in (12)-(13) would still apply, but the right-hand side of (14) would generally differ for different values of $\alpha$. The Euclidean distance is used here since it is the standard or ordinary measure of distance used in mathematics and the various sciences. The use of any other alternative distance metric or function would seem to require some particular justification or explanation. Also, it is of importance to note that the right-hand side of (14) based on the Euclidean distance is strictly Schur-concave whereas, had it been based on the city-block distance, the corresponding expression would have been Schur-concave, but not strictly Schur-concave. This follows from the fact that (a) the Euclidean distance between $P_S$ and $P_S^1$ is strictly Schur-convex while the city-block distance is Schur-convex, but not strictly so (Kvålseth 2011), and (b) the right-hand side of (14) is a strictly decreasing function of those two alternative distance metrics (Marshall et al. 2011, pp. 88-91, 138-139).
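The difference between the Euclidean and city-block criteria can be made concrete with a small sketch. It assumes the standard Minkowski distance $\left(\sum |p_i - q_i|^{\alpha}\right)^{1/\alpha}$ and forms one minus the corresponding distance ratio; the function names and the two test distributions are hypothetical illustrations.

```python
def minkowski(p, q, alpha):
    """Minkowski distance; alpha = 2 is Euclidean, alpha = 1 is city-block."""
    return sum(abs(a - b) ** alpha for a, b in zip(p, q)) ** (1 / alpha)

def distance_ratio_evenness(p, alpha):
    """One minus d(P, uniform) / d(degenerate, uniform) for the given alpha."""
    S = len(p)
    p0 = [1.0] + [0.0] * (S - 1)
    p1 = [1.0 / S] * S
    return 1 - minkowski(p, p1, alpha) / minkowski(p0, p1, alpha)

# For a general distribution the value depends on alpha.
P = [0.40, 0.30, 0.20, 0.10]
print(round(distance_ratio_evenness(P, 2), 4), round(distance_ratio_evenness(P, 1), 4))

# A transfer of 0.10 between two components that both stay above 1/S:
# a is strictly "less even" than b in the majorization sense.
a = [0.50, 0.30, 0.10, 0.10]
b = [0.40, 0.40, 0.10, 0.10]
city_a, city_b = distance_ratio_evenness(a, 1), distance_ratio_evenness(b, 1)
eucl_a, eucl_b = distance_ratio_evenness(a, 2), distance_ratio_evenness(b, 2)
print(abs(city_a - city_b) < 1e-12)  # True: city-block cannot tell them apart
print(eucl_b > eucl_a)               # True: Euclidean ranks b as more even
```

The second pair shows in miniature why the city-block version is Schur-concave but not strictly so: the transfer changes the distribution without changing the city-block distance to the uniform distribution.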

Bounds on E
For a censused community with species abundance frequencies $n_1, \ldots, n_S$ and total number of individuals $N = \sum_{i=1}^{S} n_i$, the most uneven species abundance distribution becomes $P_S^{0+} = (1 - (S - 1)/N, 1/N, \ldots, 1/N)$ and the most even distribution $P_S^{1-}$ equals $P_S^1$ in (2) only if $N/S$ is an integer. For these extremal distributions, none of the evenness indices in Table 1 can attain the lower and upper bounds of 0 and 1, since $E_i(P_S^{0+}) > 0$ and, unless $N/S$ is an integer, $E_i(P_S^{1-}) < 1$. When $N$ is large, however, $P_S^{0+}$ and $P_S^{1-}$ become effectively equal to the distributions in (2) and $E_i(P_S^{0+}) \approx 0$ and $E_i(P_S^{1-}) \approx 1$. As a simple example, consider $S = 5$ and $N = 98$. Then, from the above definition, $P_5^{0+} = (1 - 4/98, 1/98, \ldots, 1/98)$ for the frequencies $n_i = 94, 1, \ldots, 1$ and $P_5^{1-} = (20/98, 20/98, 20/98, 19/98, 19/98)$. For, say, the Williams' $E_9$ in Table 1, the values for these two extreme distributions are found to be $E_9(P_5^{0+}) = 0.05$ and $E_9(P_5^{1-}) = 0.99$ as compared to $E_9(P_5^0) = 0$ and $E_9(P_5^1) = 1$ for the $P_5^0$ and $P_5^1$ in (2). Some (e.g., Fager 1972) have suggested that an evenness index should be able to attain the values 0 and 1 for any $S$ and $N$, while others do not impose such a requirement as long as an index takes on values near 0 and 1 when the species distribution is as uneven or even as possible for any given $S$ and $N$ (e.g., Smith and Wilson 1996). Of course, an index $E$ with $E(P_S^{0+}) > 0$ and $E(P_S^{1-}) < 1$ can easily be rescaled to the interval [0, 1] by defining

$$E^{\bullet}(P_S) = \frac{E(P_S) - E(P_S^{0+})}{E(P_S^{1-}) - E(P_S^{0+})} \tag{15}$$

In the above example, with $S = 5$ and $N = 98$, consider $n_i = 20, 5, 60, 10, 3$ for which the value of $E_9$ in Table 1 becomes $E_9(P_5) = 0.46$ whereas, from (15), $E_9^{\bullet}(P_5) = (0.46 - 0.05)/(0.99 - 0.05) = 0.44$. However, there seems to be no particular basis for imposing such a rescaling requirement on an evenness index. In fact, on intuitive grounds, it seems unreasonable. To suggest that the above distribution $P_5^{1-} = (20/98, 20/98, 20/98, 19/98, 19/98)$ indicates complete evenness is simply not appropriate. Rather, this $P_5^{1-}$ is as even as it can be when $S = 5$ and $N = 98$. Similarly, the above distribution $P_5^{0+} = (94/98, 1/98, \ldots, 1/98)$ is as uneven as possible for given $S = 5$ and $N = 98$, but it should not be considered to imply perfect unevenness. The above value, $E_9(P_5^{0+}) = 0.05$ rather than zero, would seem to provide a reasonable representation of the evenness.
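The figures in this example can be reproduced with a short sketch. It assumes that Williams' $E_9$ coincides with the Euclidean distance-based expression $1 - d(P_S, P_S^1)/d(P_S^0, P_S^1)$, an assumption made here because it reproduces the quoted values; the helper names are hypothetical.

```python
import math

def euclid(p, q):
    """Euclidean distance between two distributions of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def distance_evenness(counts):
    """Distance-based evenness for frequencies n_i (assumed form of E_9)."""
    N, S = sum(counts), len(counts)
    p = [n / N for n in counts]
    p0 = [1.0] + [0.0] * (S - 1)
    p1 = [1.0 / S] * S
    return 1 - euclid(p, p1) / euclid(p0, p1)

def most_uneven_counts(S, N):
    """One individual in each of S - 1 species, the rest in one species."""
    return [N - (S - 1)] + [1] * (S - 1)

def most_even_counts(S, N):
    """Counts as equal as integer frequencies allow."""
    base, extra = divmod(N, S)
    return [base + 1] * extra + [base] * (S - extra)

S, N = 5, 98
e_low = distance_evenness(most_uneven_counts(S, N))
e_high = distance_evenness(most_even_counts(S, N))
e_obs = distance_evenness([20, 5, 60, 10, 3])
e_rescaled = (e_obs - e_low) / (e_high - e_low)

print(most_even_counts(S, N))  # [20, 20, 20, 19, 19]
print(round(e_low, 2), round(e_high, 2), round(e_obs, 2), round(e_rescaled, 2))
# 0.05 0.99 0.46 0.44
```

The rescaled value differs from the unscaled one only in the second decimal, which is consistent with the argument that the rescaling adds little and obscures the fact that the extreme attainable distributions are not perfectly uneven or even.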

Independence of S
It has been argued strongly by some that an evenness index $E$ should be independent of the species richness $S$ (e.g., Peet 1975; Hill 1973; Heip et al. 1998). Such a requirement would eliminate all indices in Table 1 for an insufficient reason. It is probably inevitable that any $E$ with the above Properties P1-P5 will to some extent depend on $S$. This dependence has also been discussed more recently by Gosselin (2006) and Jost (2010).
It can certainly be argued that the dependence of $E$ on $S$ is not by itself an undesirable property. The purpose of an evenness index $E$ is to measure how evenly (uniformly) the relative abundances $p_i$ ($i = 1, \ldots, S$) are distributed across the $S$ different species, irrespective of the value of $S$. The reason for normalizing (rescaling) an evenness index such that the values of $E$ all fall within the interval from 0 to 1 (or nearly so) is simply to be able to properly compare values of $E$ for species distributions with differing $S$-values. That is, the normalized $E$ controls or adjusts for $S$. Without such control or adjustment for $S$, an evenness index would be a confounded representation of species richness $S$ and the form of the species distribution. For such an index, one would not be able to tell whether different index values were due to differences in $S$-values or differences in the form of species distributions or both. An evenness index $E \in [0, 1]$ is only supposed to measure the form of the species distribution, thereby requiring the control (adjustment) for $S$, and thus making $E$ a function of both $p_1, \ldots, p_S$ and $S$.
In statistical analysis of categorical data, an evenness index such as E 3 in Table 1 has been used as a measure of variation and referred to as the index of qualitative variation (IQV) (e.g., Mueller and Schuessler 1977, pp. 175-181;Reynolds 1984, pp. 61-64). For such measurement, the fact that the normalized E 3 ∈ [0, 1] depends on the number of categories S does not appear to be much of an issue of concern, while papers on biological evenness seem to have made the dependence on S into an unnecessary issue. Just as it is for a measure of qualitative variation, inclusion of S into the formulation of E ∈ [0, 1], besides the ease of interpretation, is precisely for the purpose of obtaining an evenness index that can be used to compare the evenness of different species distributions of differing S.
One type of situation in which there is reason for genuine concern about $E$ depending on species richness is when the relative species frequencies $p_i = n_i/N$ ($i = 1, \ldots, S$) and sample size $N = \sum_{i=1}^{S} n_i$ are based on random observations from a population of species and when the computed (sample) value $E(n_1/N, \ldots, n_S/N)$ is used to make statistical inferences about the corresponding unknown population index $E_p$. Such statistical inferences would include (a) the use of the sample value of $E$ as an appropriate estimate of $E_p$ and (b) the construction of confidence intervals for $E_p$. However, the necessary conditions for making such statistical inferences are rarely met in biological studies. It is typical for such sampling data that neither the total number of species $S_p$ in the defined population nor all the specific types of species are known, with $S < S_p$ and $S$ depending to some extent on the sample size $N$. It is then most prudent to confine the evenness measurement to the sample itself.

Replication principle
A measurement property that has also been referred to as the "replication property" or "replication principle" was introduced by Dalton (1920) for the measurement of economic inequality. Thus, if $(x_1, \ldots, x_n)$ denotes the set of incomes of $n$ individuals and if another set of identical incomes for $n$ other individuals is combined with it to produce the set of incomes $(x_1, x_1, x_2, x_2, \ldots, x_n, x_n)$ for $2n$ individuals, then a measure of income inequality should have the same value for $(x_1, x_1, \ldots, x_n, x_n)$ as for $(x_1, \ldots, x_n)$. Any number of such replications should produce the same result. Dalton (1920) called this the "principle of proportionate additions to persons", while others (e.g., Cowell 2011, pp. 63-64) have referred to it as the "principle of population". The term "replication" property or principle, as also used for biological evenness measures (e.g., Taillie 1979; Tuomisto 2012), would seem to be most descriptive and will be used here. However, it is questionable whether such a replication property should be considered necessary or even desirable, especially when the underlying data are categorical, such as biological species frequencies (counts), as opposed to quantitative data such as personal incomes. Even when measuring income inequality, some feel that the desirability of this property is debatable and not self-evident (Cowell 2011, pp. 63-64), while others do not even mention this property when discussing inequality measures (Bellù and Liberati 2006; Marshall et al. 2011, Sec 13F).
For quantitative data, it does at least make realistic sense to consider a replication $(x_1, x_1, \ldots, x_n, x_n)$. However, for categorical data such as a set of biological species abundance frequencies (counts) $(n_1, n_2, \ldots, n_S)$, a replication $(n_1, n_1, n_2, n_2, \ldots, n_S, n_S)$ would seem to be an unrealistic and intuitively meaningless concept. Each $n_i$ has to be identified with a specific species (or category) within a mutually exclusive and exhaustive set of $S$ species. If, say, $(n_1, \ldots, n_S)$ are the frequencies for a set of $S$ different types of apple trees in one geographic area and if exactly the same frequencies are observed for the same set of $S$ types of apple trees in an adjacent area, it would make no sense to determine an evenness value based on the data $(n_1, n_1, \ldots, n_S, n_S)$ for the two combined areas. Such determination should be based on $(2n_1, \ldots, 2n_S)$. Otherwise, one would be "comparing apples and oranges".
One situation in which it does make sense to consider replication would be when a set of S different species with frequencies (n 1 , …, n S ) are split into males and females, resulting in the species-sex frequencies (n 1 / 2, n 1 /2, …, n S /2, n S /2). The S "species" categories have been split into 2S "species-sex" categories. These are two different sets of categories for which there would seem to be no basis for requiring that the evenness should be the same. Jost (2010) has also argued against the requirement that an evenness index should possess the replication property. In fact, he argues that replication invariance would be an undesirable property for an evenness index E ∈ [0, 1].
Any proposed condition such as the one involving replication also has to make some intuitive sense when simply looking at the data, both in terms of frequencies $n_i$ and proportions (relative frequencies) $p_i = n_i/N$ for $i = 1, \ldots, S$ and $N = \sum_{i=1}^{S} n_i$. For instance, consider two sets of data: frequencies (80, 20) or proportions $P_2 = (0.80, 0.20)$ versus the replicated (80, 80, 20, 20) or $Q_4 = (0.40, 0.40, 0.10, 0.10)$. It would be hard to justify the proposition that both data sets reveal the same degree of evenness, especially when considering the $P_2$ versus the $Q_4$. Clearly, $Q_4$ indicates greater evenness than $P_2$ since the components of $Q_4$ differ less than those of $P_2$.
As an example of a replication-invariant evenness index producing such intuitively unreasonable values, consider

$$E_T = \frac{1}{S}\left(2\sum_{i=1}^{S} i\,p_i - 1\right) \tag{16}$$

where $p_1, \ldots, p_S$ are ordered as in (3) (Taillie 1979, p. 56). This $E_T$ represents a slight modification of Solomon's $E_{10}$ in Table 1. Because of its lower bound $1/S$, an interpretation such as $E_T(1, 0) = 0.50$ showing more than twice the evenness of $E_T(1, 0, 0, 0, 0) = 0.20$ makes no intuitive sense. Similarly, in terms of frequencies $n_1, \ldots, n_S$, with $N = \sum_{i=1}^{S} n_i$ and $p_i = n_i/N$, (16) gives such unreasonable results as $E_T(99, 1) = 0.51$ and $E_T(96, 1, 1, 1, 1) = 0.24$. By comparison, from the $E_9$ in Table 1, $E_9(99, 1) = 0.02 < E_9(96, 1, 1, 1, 1) = 0.05$, an intuitively reasonable result. From the definition of the Lorenz curve under Property P4c, it is readily seen that the Lorenz curves of, say, $P_2 = (0.80, 0.20)$ and $Q_4 = (0.40, 0.40, 0.10, 0.10)$ coincide (one curve with two more points than the other), a result that holds for all replication situations. However, considering the preceding arguments, such a Lorenz-curve fact would not be sufficient to argue for a proposition that an evenness index should be invariant with respect to replication.
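The replication example can be explored numerically under two explicit assumptions: that $E_T$ has the form $(2\sum_i i\,p_i - 1)/S$ with the $p_i$ in descending order, and that $E_9$ equals the Euclidean distance-based expression $1 - [(S\sum p_i^2 - 1)/(S - 1)]^{1/2}$. Both forms are assumptions made here because they reproduce the values quoted above; the function names are hypothetical.

```python
import math

def taillie_index(p):
    """Assumed form of E_T: (2 * sum_i i * p_i - 1) / S, p sorted descending."""
    S = len(p)
    ps = sorted(p, reverse=True)
    return (2 * sum(i * x for i, x in enumerate(ps, start=1)) - 1) / S

def distance_evenness(p):
    """Assumed form of E_9: 1 - sqrt((S * sum p_i^2 - 1) / (S - 1))."""
    S = len(p)
    return 1 - math.sqrt((S * sum(x * x for x in p) - 1) / (S - 1))

def replicate(p, k=2):
    """Repeat every component k times and renormalize to sum to 1."""
    return [x / k for x in p for _ in range(k)]

def proportions(counts):
    N = sum(counts)
    return [n / N for n in counts]

P2 = [0.80, 0.20]
Q4 = replicate(P2)                       # [0.4, 0.4, 0.1, 0.1]
print(round(taillie_index(P2), 2), round(taillie_index(Q4), 2))          # 0.7 0.7
print(round(distance_evenness(P2), 2), round(distance_evenness(Q4), 2))  # 0.4 0.65

for counts in ([99, 1], [96, 1, 1, 1, 1]):
    p = proportions(counts)
    print(counts, round(taillie_index(p), 2), round(distance_evenness(p), 2))
# [99, 1] 0.51 0.02
# [96, 1, 1, 1, 1] 0.24 0.05
```

The replication-invariant form assigns the same value to $P_2$ and $Q_4$, whereas the distance-based form ranks $Q_4$ as more even, in line with the intuitive argument above.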

Assessment of proposed indices
Based on the above discussion of appropriate properties for an evenness index, the validity of various proposed indices may now be assessed. All of the indices listed in Table 1 can be seen to meet the requirements for continuity, symmetry, and normalization, i.e., Properties P1-P3. However, as will be explained next, some of those indices are not strictly Schur-concave (Property P4) and most do not satisfy the conditions for value validity (Property P5).
Among other proposed evenness indices, Smith and Wilson (1996) introduced one as a strictly decreasing function of the variance $V$ of the logged frequencies, $V = \sum_{i=1}^{S}\left(\log n_i - \sum_{j=1}^{S} \log n_j / S\right)^2 / S$. However, since $V$ is not Schur-convex and a decreasing function of $V$ is therefore not Schur-concave (Marshall et al. 2011, pp. 89, 561), the index proposed by Smith and Wilson (1996) is not Schur-concave and is therefore not a valid evenness index.
Hill (1973) proposed a family of evenness indices as

$$E_{H\alpha,\beta} = \frac{N_{\alpha}}{N_{\beta}}, \qquad N_{\alpha} = \left(\sum_{i=1}^{S} p_i^{\alpha}\right)^{1/(1 - \alpha)} \tag{17}$$

with no restriction on the parameters $\alpha$ and $\beta$. However, in order for a member of this family to be strictly Schur-concave, it is necessary that $\alpha > 0$ and $\beta < 0$. These parameter ranges do not include the specific member $E_{H2,1}$ for $\alpha = 2$ and $\beta = 1$ in (17) mentioned by Hill (1973) and of which the $E_5$ in Table 1 is a slight modification. Also, $E_{H\alpha,\beta}$ in (17) is undefined for $\beta < 0$ unless it is assumed that all $p_i$ are positive. A proof of these parameter restrictions is given below in Appendix B.
Hill (1997) also proposed the evenness index

$$\left(\frac{N_3}{N_2}\right)^2 = \frac{\left(\sum_{i=1}^{S} p_i^2\right)^2}{\sum_{i=1}^{S} p_i^3}$$

where $N_2$ and $N_3$ are defined from (17). This index is replication invariant and relatively independent of $S$, but, as can be inferred from the preceding paragraph, it cannot be Schur-concave since it is a ratio of Schur-convex functions. The fact that this index is not Schur-concave and therefore not a valid evenness index is also evident from, for instance, the two distributions (0.6, 0.4) and (0.99, 0.01) for which this Hill's index takes on the respective values 0.97 and 0.99. This is clearly an absurd result since the components of (0.6, 0.4) are more even than those of (0.99, 0.01), i.e., (0.6, 0.4) is majorized by (0.99, 0.01) (see (3)-(4)).
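The counterexample for Hill's (1997) index can be checked with a brief sketch. It assumes the Hill numbers $N_a = (\sum p_i^a)^{1/(1-a)}$ and takes the index to be $(N_3/N_2)^2$, a form assumed here because it reproduces the two values quoted above; the function names are hypothetical.

```python
def hill_number(p, a):
    """Hill number N_a = (sum p_i^a)^(1 / (1 - a)), for a != 1 and p_i > 0."""
    return sum(x ** a for x in p) ** (1 / (1 - a))

def hill_1997_index(p):
    """Assumed form of Hill's (1997) index: (N_3 / N_2)^2."""
    return (hill_number(p, 3) / hill_number(p, 2)) ** 2

more_even = [0.6, 0.4]     # majorized by the second distribution
less_even = [0.99, 0.01]

print(round(hill_1997_index(more_even), 2))   # 0.97
print(round(hill_1997_index(less_even), 2))   # 0.99

# A Schur-concave index would have to give the larger value to (0.6, 0.4).
print(hill_1997_index(less_even) > hill_1997_index(more_even))  # True
```

The ordering of the two values is the reverse of what majorization requires, which is the sense in which the index fails Property P4.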
Ricotta (2004) made an interesting attempt to formulate an evenness index, $E_R$, by using a fuzzy set theory approach, with the $p_i$'s ordered as in (3) and with $p_{S+1} = 0$. However, this index is neither Schur-concave nor Schur-convex and is therefore not a valid evenness index. This fact can be shown by a counterexample such as the two species distributions $P_3 = (0.6, 0.2, 0.2)$ and $Q_3 = (0.6, 0.3, 0.1)$, where $Q_3$ is derived from $P_3$ by transferring 0.1 from $p_3$ to $p_2$. From the definition in (4), it is clear that $P_3$ is majorized by $Q_3$, but $E_R(P_3) = 0.53 < E_R(Q_3) = 0.57$ so that, from (5), $E_R$ is not Schur-concave.
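The majorization relation used in these counterexamples can be verified mechanically. The sketch below uses the standard partial-sum definition of majorization, which is assumed to agree with the definition in (4); the helper name `is_majorized_by` is hypothetical.

```python
from itertools import accumulate


def is_majorized_by(p, q, tol=1e-12):
    """True if p is majorized by q.

    Both vectors are sorted in descending order; every partial sum of p
    must not exceed the matching partial sum of q, and the totals must agree.
    """
    if len(p) != len(q) or abs(sum(p) - sum(q)) > tol:
        return False
    p_sums = accumulate(sorted(p, reverse=True))
    q_sums = accumulate(sorted(q, reverse=True))
    return all(a <= b + tol for a, b in zip(p_sums, q_sums))


P3 = (0.6, 0.2, 0.2)
Q3 = (0.6, 0.3, 0.1)

p3_below_q3 = is_majorized_by(P3, Q3)   # P3 is the more even distribution
q3_below_p3 = is_majorized_by(Q3, P3)
print(p3_below_q3, q3_below_p3)
```

Since $P_3$ is majorized by $Q_3$ but not conversely, any Schur-concave index must assign $P_3$ a value at least as large as that of $Q_3$.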

Value validity (condition (13))
Of the evenness indices identified so far in this paper, most lack the value-validity property as defined and discussed above. Those indices therefore cannot be used to make valid difference comparisons as in (7b)-(7c), and their numerical values cannot be used as valid indicators of the true extent of evenness in a data set. Only "larger than" comparisons as in (7a) may be valid for those indices. Among the indices defined in Table 1, only $E_8$, $E_9$, $E_{10}$, and $E_{11}$ can be seen to meet the condition in (13) for value validity. However, as proved above, Bulla's $E_8$, while being Schur-concave, lacks the strict Schur-concavity property. As a demonstration of such lack of value validity, consider the results in Table 2 giving the values of the measures in Table 1 for the lambda distribution $P_S^\lambda$ in (8) with some different values of $\lambda$ and $S$. These results clearly indicate that, except for $E_8, \ldots, E_{11}$, the various indices violate the condition in (13) that $E(P_S^\lambda) = \lambda$, some more so than others. For some of the indices, the values for the distribution $P_S^\lambda$ exceed the $\lambda$-values, i.e., they overstate the extent of evenness, while such bias is reversed for other indices, depending upon the values of $\lambda$ and $S$. Although these results are based on the specific distribution $P_S^\lambda$ in (8), they have definite implications for species distributions $P_S = (p_1, \ldots, p_S)$ in general because of the relationship in (10).
It is then apparent from Table 2 together with (10) that the general tendency for each of the $E_1, \ldots, E_6$ is to overstate the degree of evenness, especially when the species richness $S$ is small. The indices $E_1$, $E_3$, and $E_6$ appear to consistently overstate the evenness. In terms of the value bias of an evenness index $E$ (VBE) (as distinct from statistical bias of an estimator) defined as
$$\mathrm{VBE}(P_S^\lambda) = E(P_S^\lambda) - \lambda, \tag{18}$$
$E_1$, $E_3$, and $E_6$ appear to have such a consistent positive bias for all $P_S^\lambda$ and hence generally for all $P_S = (p_1, \ldots, p_S)$ from (10). Interestingly, such positive bias for $E_3$ does not depend explicitly on $S$ since, from (18) and the expression for $E_3$ in Table 1, this bias is found to be $\mathrm{VBE}_3(P_S^\lambda) = \lambda(1 - \lambda)$, which is clearly maximal for $\lambda = 0.5$. Note also in Table 2, for $S > 2$, the extremely large negative value bias for the index $E_7$ proposed by this author (Kvålseth 1991), making it entirely invalid for the difference comparisons in (7b)-(7c).
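The value bias of an index on the lambda distribution can be illustrated with a short computation. This sketch rests on three assumptions that are not confirmed by the text at hand: that the lambda distribution in (8) is the mixture $(1 - \lambda + \lambda/S, \lambda/S, \ldots, \lambda/S)$, that $E_1$ is the normalized entropy $H/\log S$, and that $E_3$ is $(1 - \sum p_i^2)/(1 - 1/S)$. All function names are illustrative.

```python
import math


def lambda_distribution(lam, S):
    """Assumed form of (8): lam * uniform + (1 - lam) * (1, 0, ..., 0)."""
    return [1.0 - lam + lam / S] + [lam / S] * (S - 1)


def pielou_e1(p):
    """Normalized entropy H / log S; zero components contribute nothing."""
    h = -sum(x * math.log(x) for x in p if x > 0)
    return h / math.log(len(p))


def simpson_e3(p):
    """Assumed form of E_3: (1 - sum p_i^2) / (1 - 1/S)."""
    S = len(p)
    return (1.0 - sum(x * x for x in p)) / (1.0 - 1.0 / S)


def value_bias(index, lam, S):
    """Value bias: index value on the lambda distribution minus lam."""
    return index(lambda_distribution(lam, S)) - lam


for S in (2, 5, 20):
    for lam in (0.25, 0.5, 0.75):
        print(S, lam,
              round(value_bias(pielou_e1, lam, S), 3),
              round(value_bias(simpson_e3, lam, S), 3))
```

Under these assumptions the bias of $E_3$ equals $\lambda(1 - \lambda)$ for every $S$, while the bias of $E_1$ is positive and varies with both $\lambda$ and $S$.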
For the evenness indices $E_9$, $E_{10}$, and $E_{11}$, which all have Properties P1-P4 and which meet the condition in (13), it remains to be determined whether those indices also meet the value-validity condition in (14). This is considered next.

Value validity (condition (14))
Table 2 Values of the evenness indices in Table 1 for the lambda distribution $P_S^\lambda$ in (8) with different $\lambda$ and $S$ values
Williams' $E_9$ meets the condition in (14) exactly since
$$1 - \frac{d(P_S, P_S^1)}{d(P_S^0, P_S^1)} = 1 - \left[\frac{\sum_{i=1}^{S}(p_i - 1/S)^2}{1 - 1/S}\right]^{1/2} \tag{19}$$
which equals $E_9$ in Table 1. The $E_9$ is also equivalent to the coefficient of nominal variation (CNV) proposed by Kvålseth (1995) as a measure of variation for nominal categorical data. It can also be expressed as a linear function of the standard deviation $s_S$ of $p_1, \ldots, p_S$ (with divisor $S$), i.e., $E_9 = 1 - \big(S/\sqrt{S-1}\big)\, s_S$. This is so obvious an evenness index that others have undoubtedly also thought of it. The determination of whether or not Solomon's $E_{10}$ and this author's $E_{11}$ comply with the requirement in (14) then becomes a comparison of $E_{10}$ and $E_{11}$ with the 'gold standard' $E_9$ for different distributions $P_S = (p_1, \ldots, p_S)$. The $E_{11}$ is presented as an interesting alternative to $E_{10}$, both of which are based on the ordering $p_1 \ge p_2 \ge \ldots \ge p_S$. The $E_{11}$ is based on the complement of the homogeneity (dominance, concentration) measure $\sum_{i=1}^{S} p_i/i$ proposed by Kvålseth (1993) and normed to the [0, 1]-interval. For large $S$, the denominator of $E_{11}$ can more conveniently be computed as $1 - (\log S + 0.5772)/S - 1/(2S^2)$, which is found to be accurate to four decimal places when $S > 11$ (see, e.g., Knopp 1990, p. 538).
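The large-$S$ approximation for the denominator of $E_{11}$ can be compared with the exact value. The sketch assumes that the denominator is $1 - (1/S)\sum_{i=1}^{S} 1/i$, which is what norming the complement of $\sum p_i/i$ to the [0, 1]-interval would give, and it uses the standard expansion $\sum_{i=1}^{S} 1/i \approx \log S + 0.5772 + 1/(2S)$. Function names are illustrative.

```python
import math


def e11_denominator_exact(S):
    """1 - (1/S) * sum_{i=1}^{S} 1/i (assumed normalizing constant of E_11)."""
    return 1.0 - sum(1.0 / i for i in range(1, S + 1)) / S


def e11_denominator_approx(S):
    """Approximation based on log S + 0.5772 + 1/(2S) for the harmonic sum."""
    return 1.0 - (math.log(S) + 0.5772) / S - 1.0 / (2.0 * S * S)


for S in (5, 11, 12, 30, 100):
    exact = e11_denominator_exact(S)
    approx = e11_denominator_approx(S)
    # Show both values and the absolute error in scientific notation.
    print(S, round(exact, 6), round(approx, 6), f"{abs(exact - approx):.1e}")
```

With this form the absolute error falls below 0.00005 from $S = 12$ onward, which matches the stated accuracy threshold of $S > 11$.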
To determine whether $E_{10}$ and $E_{11}$ comply with the requirement in (14), i.e., the extent to which $E_{10}$ and $E_{11}$ approximate $E_9$, randomly generated distributions $P_S = (p_1, \ldots, p_S)$ were used. For each such $P_S$, the value of $S$ was first generated as a random integer between 2 and 30 (inclusive), and then each $p_i$ was generated in descending order ($p_1 \ge p_2 \ge \ldots \ge p_S$) as a random number (to 6 decimal places) within a specified interval. For each of the 25 such computer-generated distributions $P_S$, the values of $E_9$, $E_{10}$, and $E_{11}$ were computed, as were the values of Pielou's $E_1$ in Table 1. Five of the generated data sets were excluded since their index values were all nearly equal to 0 or 1. The normalized entropy $E_1$ was included, although it clearly violates (13), since this entropy-based index appears to be the most popular evenness index. The results are summarized in Table 3.
As expected from the results in Table 2 together with (10), Table 3 shows that $E_1$ substantially and consistently violates the value-validity condition in (14) since its values generally differ considerably from those of the 'gold standard' $E_9$. The overstatement (positive value bias) of evenness by Pielou's $E_1$ appears to be as large as about 150 percent. By comparison, the approximation in (14) is much better for $E_{10}$ and $E_{11}$, i.e., the values of $E_{10}$ and $E_{11}$ in Table 3 are generally much closer to the corresponding values of $E_9$ than are those of $E_1$. This is especially the case for $E_{11}$.
In the case of Solomon's $E_{10}$, the data in Table 3 indicate that values of $E_{10}$ tend to be systematically and sometimes considerably smaller than those of $E_9$. For data sets 3, 6, 12, 14, and 18, the $E_{10}$ values are half or less of the $E_9$ values. If $E_9$ is used to predict $E_{10}$ as $\hat{E}_{10} = E_9$, then the coefficient of determination (when properly computed; see Kvålseth 1985) is found from Table 3 to be $R^2 = 1 - \sum (E_{10} - E_9)^2 / \sum (E_{10} - \bar{E}_{10})^2 = 0.86$. Similarly, the root mean square difference (RMSD) between the values of $E_{10}$ and $E_9$ is found to be 0.10. When similarly comparing $E_{11}$ with $E_9$, it turns out that $R^2 = 0.99$ and $\mathrm{RMSD}(E_{11}, E_9) = 0.03$. That is, nearly 100% of the total variation of $E_{11}$ (about its mean) is explained by the 'fitted' model $\hat{E}_{11} = E_9$ as compared to 86% in the case of $E_{10}$. Consequently, $E_{11}$ cannot be rejected for lacking value validity since it complies with the requirement in (13) and seems to have a high degree of approximation to $E_9$ as required by (14). However, in the case of Solomon's $E_{10}$, while it complies with (13), one could certainly question its value validity since its approximation to $E_9$ is rather marginal.
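The two agreement statistics used above can be computed as follows. This is a minimal sketch: the $R^2$ follows the formula quoted in the text for the fixed model in which the prediction equals $E_9$, the RMSD is assumed to use divisor $n$, and the index values are hypothetical numbers chosen for illustration only, not the Table 3 data.

```python
import math


def r_squared_fixed_model(observed, predicted):
    """R^2 = 1 - sum (y - yhat)^2 / sum (y - ybar)^2 with yhat given, not fitted."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((y - yhat) ** 2 for y, yhat in zip(observed, predicted))
    ss_tot = sum((y - mean_obs) ** 2 for y in observed)
    return 1.0 - ss_res / ss_tot


def rmsd(a, b):
    """Root mean square difference between paired values (divisor n assumed)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Hypothetical index values for illustration; these are NOT the Table 3 data.
e9_values = [0.20, 0.35, 0.50, 0.65, 0.80]
e10_values = [0.12, 0.30, 0.41, 0.60, 0.77]

r2 = r_squared_fixed_model(e10_values, e9_values)
gap = rmsd(e10_values, e9_values)
print(round(r2, 3), round(gap, 3))
```

Because the predictor is fixed rather than fitted, this $R^2$ can be negative when the index values differ from $E_9$ by more than they vary about their own mean.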

Concluding comments
Table 3 Values of indices $E_1$, $E_9$, $E_{10}$, and $E_{11}$ defined in Table 1 for species distributions $P_S = (p_1, \ldots, p_S)$ with randomly generated $S$ and $p_i$ ($i = 1, \ldots, S$)
As advocated in this paper and about which most researchers seem to agree, an evenness index $E$ should take on values over the interval from 0 to 1, with $E(P_S^0) = 0$ and $E(P_S^1) = 1$ for the distributions in (2). These fixed bounds are preferable when comparing the evenness of different species collections or communities with differing species richness $S$. Some indices have been proposed, however, for which the bounds depend on $S$, such as $E_T$ by Taillie (1979) and $E_C$ by Camargo (1993), for which $E(P_S^0) = 1/S$. The $E_T$ is seen to be a simple transformation of $E_{10}$ in Table 1, i.e., $E_T = [(S - 1)E_{10} + 1]/S$. The $E_C$ index has been referred to as Camargo's evenness index (e.g., Magurran 2004, p. 118) although it is the same as the $E_T$ proposed earlier by Taillie (1979). The fact that $E_C = E_T$ follows directly from an equality in which $p_1, \ldots, p_S$ on the right-hand side are ordered as in (3).
Since all the indices in Table 1 can be seen to have Properties P1-P4, with the exception of Alatalo's $E_5$ and Bulla's $E_8$, which are not strictly Schur-concave (Property P4), those indices can reasonably be expected to provide valid size (order) comparisons as in (7a). That is, for an evenness index $E$ with Properties P1-P4 and for which, say, $E(P_S) = 0.90$ and $E(Q_R) = 0.60$, one can reasonably say that the species collection with abundance distribution $P_S$ has higher evenness than that with abundance distribution $Q_R$, but nothing more can be said with validity. One cannot credibly say that one has "considerably higher" evenness than the other or that the 0.90 value shows a "very high" degree of evenness even though the 0.90 value is near the top end of the [0, 1]-interval over which $E$ is defined. Such interpretations and the types of difference comparisons in (7b)-(7c) require that $E$ meets conditions (13)-(14) for value validity, conditions which are only clearly satisfied by Williams' $E_9$ and the new index $E_{11}$ in Table 1.
Of the two indices $E_9$ and $E_{11}$ with the Properties P1-P5, there seems to be no particular reason for preferring $E_{11}$ over $E_9$. The $E_9$ also has an intuitively appealing geometric interpretation in terms of the Euclidean distances between points in $S$-dimensional space in (19), i.e., $E_9$ is the relative extent to which the distance between $P_S^1$ in (2) and the species abundance distribution (point) $P_S = (p_1, \ldots, p_S)$ is less than the distance between $P_S^1$ and $P_S^0$ defined in (2). Alternatively, since $P_S$ is majorized by $P_S^0$ and since the distance $d(P_S, P_S^1)$ can be shown to be strictly Schur-convex, the $E_9$ in (19) can be expressed as
$$E_9 = 1 - \frac{d(P_S, P_S^1)}{\max_{Q_S} d(Q_S, P_S^1)} \tag{20}$$
that is, $E_9$ is the relative extent to which the distance between an observed $P_S$ and the $P_S^1$ is less than the maximum such distance over all $S$-species distributions. For instance, from the data by Magurran (2004, pp. 226-227) on the abundances of 16 species of dung beetles around Bangalore, India, (19) or (20) gives $E_9(P_{16}) = 0.48$, which means that the distance between the dung beetle abundance distribution $P_{16}$ and $P_{16}^1$ is 48% less than its maximum possible value for 16-species distributions. If, for these data with $S = 16$ and total number of dung beetles $N = 1745$, one uses $P_{16}^{0+} = (1730/1745, 1/1745, \ldots, 1/1745)$ and $P_{16}^{1-} = (110/1745, 109/1745, \ldots, 109/1745)$ instead of the $P_{16}^0$ and $P_{16}^1$ in (2) as discussed above, one obtains the distances $d(P_{16}, P_{16}^{1-}) = 0.5034$ and $d(P_{16}^{0+}, P_{16}^{1-}) = 0.9588$ so that, from (19), $E_9(P_{16}) = 1 - 0.5034/0.9588 = 0.47$.
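The distance-based reading of $E_9$ can be reproduced with a few lines of code. The sketch below computes $E_9$ as one minus a ratio of Euclidean distances and recomputes the maximum distance for the adjusted reference distributions with $S = 16$ and $N = 1745$. The observed dung beetle abundances are not reproduced here, so the quoted distance 0.5034 is taken from the text rather than recomputed; function names are illustrative.

```python
import math


def euclidean_distance(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def e9(p):
    """E_9 = 1 - d(P, uniform) / d(single-species, uniform)."""
    S = len(p)
    uniform = [1.0 / S] * S
    single = [1.0] + [0.0] * (S - 1)
    return 1.0 - euclidean_distance(p, uniform) / euclidean_distance(single, uniform)


# Adjusted reference distributions for S = 16 and N = 1745, as given in the text.
N, S = 1745, 16
p_zero_plus = [1730 / N] + [1 / N] * (S - 1)
p_one_minus = [110 / N] + [109 / N] * (S - 1)

d_max = euclidean_distance(p_zero_plus, p_one_minus)
e9_adjusted = 1.0 - 0.5034 / d_max   # 0.5034 is the distance quoted in the text
print(round(d_max, 4), round(e9_adjusted, 2))
```

The function `e9` returns 1 for a uniform distribution and 0 for a single-species distribution, and it returns $\lambda$ for a mixture of the two with weight $\lambda$ on the uniform component.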
In order to avoid misrepresentations and incorrect interpretations by using evenness indices lacking important properties, the conclusion from the analysis in this paper seems clear: $E_9$ is the index of choice. Having all of the necessary properties (Properties P1-P5), $E_9$ is a most informative index for which all of the comparisons in (7a)-(7c) can reasonably be considered valid.

Appendix A
In order to prove that Bulla's $E_8$ in Table 1 is Schur-concave, but not strictly Schur-concave, consider the function $F_\alpha$, where $\alpha$ is some arbitrary parameter. This $F_\alpha$ is simply the sum of arithmetic means $m_{\alpha i}$ of order $\alpha$, of which $E_8$ is the particular limiting member $F_{-\infty}$ as $\alpha$ goes to $-\infty$. By taking the second-order partial derivative of $m_{\alpha i}$ with respect to $p_i$, it is found that this derivative is negative for $\alpha < 1$ so that $m_{\alpha i}$ is a strictly concave function of $p_i$ for $\alpha < 1$ and fixed $S$. Thus, being a sum of strictly concave functions, $F_\alpha$ is strictly Schur-concave for $\alpha < 1$ (Marshall et al. 2011, p. 92). Schur-concavity is preserved in the limit, but strictness need not be: as $\alpha \to -\infty$, when $(F_\alpha - 1/S)/(1 - 1/S)$ becomes Bulla's $E_8$, the strict Schur-concavity does not hold, as demonstrated by the earlier counterexample.
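The loss of strictness in the limit can be seen on a small numerical example. The sketch assumes Bulla's index has the form $(\sum_i \min(p_i, 1/S) - 1/S)/(1 - 1/S)$, and the pair of distributions is a hypothetical one chosen for illustration; it is not the counterexample referred to in the text.

```python
def bulla_e8(p):
    """Assumed form of Bulla's index: (sum_i min(p_i, 1/S) - 1/S) / (1 - 1/S)."""
    S = len(p)
    overlap = sum(min(x, 1.0 / S) for x in p)
    return (overlap - 1.0 / S) / (1.0 - 1.0 / S)


# Illustrative pair: 0.1 is transferred between the two largest components,
# both of which stay above 1/S = 0.25, so the minima are unchanged.
more_even = (0.4, 0.4, 0.1, 0.1)
less_even = (0.5, 0.3, 0.1, 0.1)

e8_more = bulla_e8(more_even)
e8_less = bulla_e8(less_even)
print(e8_more, e8_less)
```

The first distribution is majorized by the second and is not a permutation of it, yet both receive the same index value, so the index cannot be strictly Schur-concave under the assumed form.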