Devolution and grant-in-aid design for the provision of impure public goods

Traditional fiscal federalism theory postulates that devolution for the provision of local public goods increases welfare. However, most of the services offered at local level are local impure public goods whose characteristics may prevent devolution from being efficient. Our paper shows that devolution is the optimal choice only for local impure public goods. For an environment characterised by coordination and asymmetry of information problems, we propose the optimal grants-in-aid formula that Central Government should use to reduce welfare losses and we compare it with what suggested by the mainstream literature. Finally, we show under which conditions devolution should be preferred to a centralised solution. From a policy point of view, our paper may explain the heterogeneity in the choices made by countries in terms of devolution in the provision of merit and impure public goods.

health care and education, local transport, sporting facilities, waste disposal. These services generate utility directly to users (as private goods do), but also indirectly to non users (as local public goods do). Their beneficial effect is usually not confined to users in a specific jurisdiction and this why spillovers arise. According to OECD statistics, in 2014 public expenditure for health, and education accounted for about 12.2 % of total GDP, in the US the share is even higher (OECD 2016).
For these services, there is no consensus in the literature on which is the most appropriate government tier to deliver the good. 2 The institutional setting is quite heterogeneous among countries. Dziobek et al. (2011) examine the fiscal decentralisation index for several countries and observe a large variation that does not simply depends on economic or geographical factors; for education Turati et al. (2011) show a quite heterogeneous picture, which is even greater for health and social care (Costa-Font and Greer 2013). In Alcidi et al. (2014) several measures of fiscal decentralisation are studied for the European Union, showing that different decentralisation strategies have been adopted, even within the same country. For Europe as a whole fiscal decentralisation (defined as the ratio of local expenditure to total expenditure) is around 20 % for health 40 % for education and 80 % for environment protection. Some countries are below all these levels (Austria, Belgium, Germany, Ireland, Luxembourg, Macedonia, Slovenia), a second group are above the average (Lithuania, Poland) while a third group of countries are well above the limit for some services and below for others. 3 The reason for this observed heterogeneity is that the determination of the optimal quantity to be produced is more complex than for public goods. For the latter, the quantity supplied and financed also represents the quantity consumed by all the individuals in the community. For impure public goods a pseudo demand exists, and Government should use indirect instruments (such as subsidises to the price) to match demand with supply and to produce the optimal quantity. However, the information necessary to achieve a First Best (FB) result is usually not accessible and only second best solutions can be attained.
The traditional literature on fiscal federalism (Oates 1972;Tresch 2002) argue that the allocation of functions between Central and local Governments should follow efficiency principles. Production should be assigned to the tier which is better informed on local preferences, while Central Government (CG) may use grants for equity and efficiency reasons. Second generation models 4 suggest that the success of fiscal federalism depends on the information the agents possess about specific parameters (Akai and Mikami 2006;Levaggi 2002;Wildasin 2004;Snoddon and Wen 2003) and on the level of coordination in the actions of the different agents (Besley and Coate 2003;Köthenbürger 2008;Petretto 2000;Ogawa and Wildasin 2009). Both issues have been widely studied by the literature, which shows the existence of a trade-off between autonomy and control.
In this article we compare centralisation (defined as the provision of goods at local level by CG) with devolution (defined as the provision of goods at local level by a local government) for the provision of an impure local public good with spillovers in an environment characterised by asymmetry of information. We show that the traditional "devolution is always welfare improving" is valid only in the absence of spillovers. When they do exist, the gain in utility derived from devolution must be sufficiently high to compensate the welfare loss deriving from asymmetry of information and lack of coordination. In this setting, we derive the optimal grants-in-aid formula and compare it with what suggested by the literature for local public goods.
From a policy point of view, our paper may explain why some countries have preferred centralisation to devolution for the provision of services such as education and health care. Calsamiglia et al. (2006) argue that it may depend on altruism; in this article we show that efficiency may be the driving factor. For services whose comparative advantage in being locally produced is limited, while their consumption produces spillovers, centralisation may be the second best choice.
The organisation of the paper is as follows: in "The model" the main features of the model and the FB are presented; in "Centralisation" the centralised solution is analysed, while in "Devolution" devolution is presented; they are then compared in "Centralisation versus devolution"; finally in "Conclusions" the conclusions are drawn.

The model
A country, whose population is normalised to one, is divided into two local authorities j ∈ {1, 2}, equal in everything but their preferences for the impure public good y. 5 Each individual has an exogenous money income M in the range [M, M], whose distribution in each region has density function 1 2 f (M). Then, total income is: and total income in local authority j equals Y j = 1 2 Y . Income is used to buy private commodities and one or zero units of an impure local public good y, whose user charge is equal to p j u . y is an impure public good, which means that it has a double effect on the utility function of each individual: • as a private commodity; • for the whole quantity that is produced.
Let us first consider the private characteristic. A good of quality θ l produces an utility equal to αθ, where α is a taste parameter in the range (0, β), equally distributed among the population. Therefore, if the difference αθ l − p j u is positive the consumer buys y, otherwise he does not buy the good and private utility is zero. To simplify the algebra, we assume that M > p j u for all j, i.e. everybody can afford y. 5 We share this assumption with Wildasin (2001). In a more general setting, a reasonable assumption would be to assume that the two local authorities have a different average income. For the purpose of this analysis such assumption is irrelevant. Petretto (2000) and Levaggi and Menoncin (2014) present a model with these characteristics as regards the distribution of income.
The nature of impure public good means that the commodity y accrues utility to users and non users; in our model this characteristics is represented by the term z j (θ l (Q j + k Q −j )), where is the total quantity demanded in each community and Q −j is the quantity produced outside jurisdiction j. Let us examine each element: z j represents the relative importance that each community attaches to the public goods characteristic of y; the second important element is the parameter k ∈ [0, 1], which captures spillovers and represents the utility that consumers in jurisdiction j attribute to the production of good y outside their region. For k = 1, y is a national impure public good: y produces the same level of utility irrespective of its geographical location. For k = 0 consumers only care about the quantity produced locally (impure local public good) and for 0 < k < 1 we have a local impure public good with spillovers, i.e. consumers derive a higher level of utility from the quantity produced in their jurisdiction, but also care about the level produced elsewhere (see also Wildasin 2001Wildasin , 2004. The preferences for the impure public good are linear and homogeneous within each local authority, but specific to each of them. 6 y can be supplied at central level by CG or by an autonomous lower government tier (LG) in each jurisdiction. In line with the fiscal federalism theory, y produces a level of utility θ C = 1 if it is supplied by CG and θ L = θ > 1 when it is produced by LG. This classical hypothesis in the theory on fiscal federalism (Oates 1972;Tresch 2002) reflects the assumption that at local level the preferences for the community can be better matched by local supply. 7 Finally, we assume that, while the cost to produce the good is p, users pay only a fraction p j u of such cost. The difference is financed using income taxes at rate t j in each jurisdiction.
As in Besley and Coate (2003), we assume that utility is additive in its components and that taxation is linear. The use of a linear utility function allows concentrate the analysis on efficiency and to rule out distributional issues (Levaggi and Menoncin 2014).
The utility function for a representative individual living in community j is therefore written as: In this paper we take into consideration the problem of CG, that has to decide whether to delegate the supply of y to LGs and if so how to implement devolution. The objective of CG is to maximise the welfare of the population, defined as the aggregation of individual preferences. Asymmetry of information prevents CG to observe z 1 and z 2 ; CG therefore uses some expected value z for both, so that in general the goal is to maximise the following function: 6 The use of more general utility functions is presented in Levaggi (2010). 7 For example, let us consider local transport: the same number of bus services produces a different level of utility according to its hourly distribution within the day. (2) by reducing the user charges p j u , j = 1, 2. The traditional theory on fiscal federalism Oates (1972) postulates that when preferences are not homogeneous and the goods produced at local level have a higher level of utility, CG should always devolve any decision to lower tiers. However, coordination problems and information asymmetry may prevent the attainment of this objective when the good to be produced is, as in the present case, an impure public good with spillovers.

First Best (FB)
To start with, let us define the solution for an ideal world with perfect information, where a benevolent social planner can supply the good of the highest quality, is able to observe the local preference parameters z 1 , z 2 and can therefore supply the optimal amount of the local public good in each region.
This represents the ideal, FB allocation that will be used as a benchmark to evaluate the relative benefits of implementing either a centralised solution or devolution.
Let us examine (2) to understand the problem faced by the regulator. The quantity demanded depends on the private utility that each consumer receives from the good, but the benefit produced by that commodity also depends on the utility that the community as a whole derives from its provision. 8 This causes the usual market failure that the literature on impure public goods has long studied and calls for an intervention of the public decision maker (Musgrave and Musgrave 1989). By subsidising y, the regulator reduces the user charge p j u , and increases demand, so that the optimal equilibrium between the benefit (public and private) produced by the good is equal to its marginal cost. If the regulator finances a fraction (1 − ρ j ) of p in each region through a (national) linear income tax at rate t, the user charge becomes p j u = ρ j p and from (1) the budget constraint is equal to: The regulator has to find the optimal values of ρ 1 and ρ 2 for which total welfare in (3) is maximised. The optimal solution in terms of price subsidy p(1 − ρ FB j ), quantities produced in each region Q FB j , total quantity Q FB and welfare W FB is presented in Table 1 and derived in Appendix 1.
The first line shows the price subsidy the planner should provide. As expected, it depends on the utility generated by the public good characteristics, gross of the spillover it creates.
The optimal total quantity reflects the three components that characterise the impure public good: 1 − p θβ is the demand for its private good characteristic, while (1+k)(z 1 +z 2 ) 2β is the willingness to pay for its public good aspect, gross of the spillovers effects.
To understand this point, consider for example the case α = 0: these individuals do not buy y, but as long as z j is non negative, they attribute a value to the production of that good.
W FB represents the ideal level of welfare, given preferences and resources, but this outcome cannot be obtained. CG provision implies that only the good with the lowest quality (in terms of utility), i.e. θ = 1 is supplied and that at central level z 1 and z 2 cannot be observed. On the other hand, LGs produce the goods with highest utility, but they do not take spillovers into consideration. CG may influence the decisions of lower tiers, but does not have enough information to define an optimal policy. In this environment it is necessary to find the second best solution that allows the minimum welfare loss with respect to the FB solution. In what follows we will analyse and compare two alternatives: • Centralisation. CG produces the less productive variety of y. The quantity is uniform across regions and it is set according to estimated preferences for the good; • Devolution. CG delegates the production of y to lower government tiers (LG). A matching grant (King 1984) is supplied by CG to reduce the negative effects of spillovers.

Centralisation
Let's examine the case where CG produces the goods with lower productivity (θ C = 1).
Since z j cannot be observed, we assume that an expected estimate z is used to determine the optimal provision. CG has to set the optimal subsidy that maximises welfare in this context. The optimal subsidy and quantities can be found by substituting θ with θ C = 1 and z 1 = z 2 = z in the formulas in the first two rows of Table 1 and are presented in Table 2. The welfare W C is obtained by substituting the optimal user charges pρ C 1 and pρ C 2 in (3). Let's compare the results in Tables 1 and 2. Even when asymmetry of information is minimal, e.g. when z 1 = z 2 = z, the subsidy is lower than in FB: for the quality provided at central level the willingness to pay is lower and thus the subsidy. This in turn means that the quantity of good produced is not optimal. As a consequence, welfare is not maximised and W FB − W C measures the welfare loss.

Devolution
In the previous section we showed that centralised provision does not allow to reach the FB allocation; here we consider the alternative solution of devolving production to lower tiers that know local preferences and can provide a good that fits best user needs. In this section we consider and compare two possible alternatives: • "pure devolution" where LGs are solely responsible for the provision of the impure public good (the case will be indicated by the letter F); • devolution where LGs provide the good, but CG influences their decisions using a matching grant. 9

Pure devolution
If the goods produced at local level were local public goods, devolution would always allow to reach the welfare level of FB (Oates 1972). However, the presence of spillovers means that also devolution is a second best option. In this case each Local Government (LG) maximises its utility function, but it does not take into account the accrued utility experienced by users in other local authorities. As for centralisation, each LG has to find the optimal user charge p j u = pν j that maximises the aggregate utility of jurisdiction j. The subsidy will be financed using a (local) linear income tax at rate t j ; the budget constraint in this case is LG has to find the value of ν j that maximises the following objective function: Table 3 shows the results derived in Appendix 2 in terms of the optimal subsidy p(1 − ν F j ), quantities Q F j , total quantity Q F and total welfare W F . Comparing Table 1 with Table 3 it is straightforward to see that the two results coincide for k = 0, i.e. when there are no spillovers. In all the other cases, the subsidy is too low, total quantity falls short of the optimal level, and the welfare level attained is lower than in FB. This is the first result of our model: the findings of the traditional literature on fiscal federalism are valid also for the provision of impure public goods, provided that there are no spillovers among regions. In all the other cases, "pure" devolution produces a welfare loss which is equal to In the fiscal federalism literature matching grants are transfers from higher to lower government tiers designed to complement sub-national contributions. They have a form similar to a price subsidy.

Centralised allocation
Subsidy

Devolution with a matching grant
CG may try to influence the choice of the local subsidy using a matching grant. 10 This solution has some drawbacks: the asymmetry of information that prevents Central Government from providing the optimal quantity of y may also influence the optimal matching grant setting decision, for two main reasons: • coordination problems, fiscal illusion and spillovers: due to the specific characteristics of y, any change in Q j affects the level of utility in authority − j and its decision on Q −j . The matching grant introduces another interdependence in decisions, because the level of local expenditure has an impact on the national tax rate, hence on the welfare of each jurisdiction. If local decision makers misperceive the effects of their actions, CG may be unable to attain FB, even when it can observe the reaction function of LGs; • asymmetry of information: CG cannot observe local preferences parameters and the reaction function of each local government.
The environment is characterised by the following assumptions: y is subsidised by LG, which does not take into account the spillovers created by its production. CG influences the behaviour of LGs using a matching grant.
The timing of the game is as follows: (a) CG sets the grant to maximise total welfare using its beliefs on LGs' behaviour and users' preferences; (b) LGs set their reaction function and their local tax rate.
Although CG cannot observe some relevant parameters, LGs are followers, i.e. we rule out the possibility that they may act strategically in setting their reaction function. 11 The problem can be solved through backward induction: in the first stage LG decisions and reactions to a grant setting are considered; in the second stage CG finds the optimal grant, given its information set.
The analytical model is presented in Appendix 3. In what follows we discuss the intuition behind the findings.

LG reaction function
LG receives a matching grant at rate (1 − ρ j )p from CG. If it thinks that the good should be further subsidised, it will introduce a supplementary subsidy at rate η j , to be financed 10 A matching grant is a form of subsidy to LGs proportional to its expenditure. In other words, the use of a matching grant means that a fraction of the total cost for the provision of y is borne by CG. 11 Wildasin (2001) and Köthenbürger (2008) show how local government could play strategically in a game where the goods to be produced is a local public good with spillovers.
by a proportional tax on local income at rate τ j . The user charge for the service will be equal to p(ρ j − η j ).
LG decision has a twofold impact on total welfare: on the revenue side it will change the national tax rate; on the expenditure side it will alter the quantity of the impure public good. This is one of the novel elements of our model: given the nature of y, each LG has to foresee the behaviour of the other local authority and should take into account the impact of increasing its expenditure. These effects may not be correctly perceived by LGs. In our model we consider three alternative behaviours for the LGs.
(FC) Each LG thinks that the other local authority will replicate the same strategy (full coordination-FC), i.e. it will subsidise local production by the same amount. On the revenue side the overall change in the national tax rate is taken into account and on the expenditure side for each j the quantity Q −j is updated accordingly.
(PC) Each LG thinks that the other local government does not further subsidise the good (partial coordination-PC). In this case the quantity Q −j will not change, while a one-sided correction of the national tax rate is considered.
(FR) Each LG thinks that the effects of its expenditure on the tax rate are marginal, so that its decision influences neither the rate, nor Q (free rider, FR). The latter hypothesis may not be reasonable in a model with two local authorities, but in a more general context where the number of jurisdictions is fairly large this behaviour may be quite plausible. It is interesting to note that this is the hypothesis that has been used by the traditional literature in defining grant in the presence of spillovers.
The detailed derivation of the formulas for the local subsidy p η j is presented in Appendix 3 and reported in Table 4. For each LG behaviour two values are reported, because a second source of asymmetry of information has to be taken into account. The actual subsidy set at local level also depends on the local preferences z j , which cannot be observed; as in section "Centralisation" we assume that an estimate z, equal for both local authorities is used by CG to guess the subsidy level that will be set by LGs.
From Table 4 we note that the subsidy set by a LG behaving as a FR is higher than that in the PC case, as one might expect. A "free rider" behaviour implies that the LG does not take into account the increase in the national tax rate that is action is causing, i.e.
LG underestimate the tax price for good y and will be prepared to subsidise it at a higher rate. A more general conclusion cannot be drawn: the relative effects of spillovers and national matching grant will determine the result. The use of a matching grant may allow CG to improve on pure devolution only if the grant is set correctly, but CG can observe only a subset of the parameters. For this reason, it will have to devise a strategy to minimise the negative effects due to the lack of information.

Table 4 Local subsidy in case of devolution with a matching grant
LG behaviour Actual local subsidy Subsidy estimated by CG

Grant setting
In the second stage CG has to set the matching grant, based on his beliefs about the behaviour of the LGs and the estimation of the unobservable preferences. We assume that CG expects LGs to have the same behaviour (i.e. both are either FC, PC or FR), thus the matching grant will be equal for the two regions (i.e. ρ = ρ 1 = ρ 2 ). The estimated reactions of the LGs (the subsidy in the last column in Table 4) are then used to determine the welfare function W B (ρ) for B=FC,PC,FR using (3). CG has to decide which of the LG reaction is more plausible and assigns a probability π B to each of the three possible reactions; the matching grant ρ will be then found by maximising the expected welfare E(W ) = B π BWB (ρ). The analytic derivation of the optimal grant in the general case is presented in Appendix 3. The solution depends on the probabilities π B ; in Table 5 the optimal subsidies for the following relevant cases are shown: CG believes that LGs will behave as either FC, PC or FR (i.e. π B = 1; π −B = 0 for each possible value of B) and the case where CG assigns equal probability to each of the behaviours, i.e. π B = 1 3 for all B (the abbreviation "Equi" is used for this case). Note that if LGs act as FC, CG cannot influence the expected welfare and no matching grant will be used. For PC the grant is twice the size than for FR, as one might expect. Finally, if all the reactions functions are taken in consideration with equal weight (Equi) the optimal grant is slightly higher than for FR, but quite close. The traditional literature and most actual grant formulae use FR assumption to model the behaviour of LGs. FR behaviour has a boosting effect on expenditure; by assuming the worst scenario in terms of effects on expenditure, CG tries to reduce the negative impacts on its expenditure of LG choices.
Ex-post welfare analysis Once the grant has been set by CG, the LGs will further subsidise the good following the scheme presented in the second column in Table 4. In general, for each case of the grant setting, three different reactions of the LGs are possible. The state contingent solution is presented in Table 8 in Appendix 3. In what follows we examine the results from a more qualitative point of view; the discussion will be supported by the graphical visualisation of the main findings.
Let us start by examining the case where CG assumes that LGs reaction is of the "fully coordinated" type (FC). It turns out that the difference of the user charge with respect to the FB case does not depend on CG grant, which will therefore be set to zero. If the action of the LGs is either PC or FR, the outcome of the "pure" devolution case is replicated. The difference in quantities with respect to the FB case is: Note also that, if the reaction of LGs is FC, the outcome does not depend upon the action of CG: the total quantity is equal to the one in FB, but if the preferences in the two regions differ, it is not correctly distributed among them. In fact the difference in each region is k 2β (z j − z −j ) and this causes a welfare loss with respect to the FB case equal to LGs reaction is either PC or FR, CG can reduce the difference in the subsidy and total quantity using a matching grant, by an amount that depends on z. If CG beliefs are fulfilled, the result is the same in both cases (PC and FR) and, if z = z 1 +z 2 2 the total quantity produced is optimal, while the welfare loss is half the one obtained when LGs react as FC. Again, the total quantity is optimal, but its distribution across local authorities is different from FB and the welfare level is lower.
When the reaction of the local authorities is uncertain, the analysis has to be carried out by comparing the mean value of the welfare functions. In all cases the term k 2 θ β can be factored out, and the comparison only depends on z, i.e. on the quality of the information that CG has on the preferences of the two regions. The analytical comparison can be made by standard algebraic calculations; Fig. 1 illustrates the results by showing the relative position of the average welfare losses (wrt FB) under the different assumptions of CG about the reaction function of LGs. With the exception of the FC case, welfare losses are convex in z. The assumption that LG reacts as PC minimises the welfare loss only if CG considerably underestimates the average z = z 1 +z 2 2 of local preferences z < 3 4z . Let us call W FR and W Equi the ex-post average welfare differences (the last column in Table 8) under the two assumptions "FR" and "Equi" of CG on the behaviour of LG. If z ranges in 3 4z , 12 11z then W Equi is the lowest and has a minimum for z =z. The welfare loss that can be expected using FR performs better for higher values of z and has its minimum value for z = 6 5z . W Equi increases at a higher rate than W FR as the distance of z from z increases because the grant under FR is lower than with the other assumptions. At the left of z the grant may be too low to make local authorities react optimally and local preferences are underestimated. At the right of z the two effects may offset each other. The two welfare losses are equal for z = 12 11z ; this implies that for values of z lower than or very close to the mean, using a grant that minimises the expected welfare loss is preferred to assuming that LGs are free riders. In setting the matching grant, most traditional literature on fiscal federalism implicitly or explicitly 12 assumes that CG may observe local preference on average and sets the grant as if local authorities were free riders. If CG can observe the mean of true preferences (i.e. z =z), it should use the grant that minimises the welfare loss, but the mistake made using FR is small.
The comparison with the "pure devolution" case is less clearcut: the welfare loss W F does not depend on z and in a graphical comparison similar to the one in Fig. 1 it is represented as a horizontal line. Its relative position in the picture depends on the ratio of the preferences in the two regions. If the ratio is not too high, �W F > �W FC and obviously any solution with a matching grant is preferable. As the ratio gets larger, "pure devolution" could anyhow be a viable option only in extreme cases, where either z greatly underestimates z, or if the magnitude of the two parameters z 1 and z 2 is extremely different. 13 Thus, ruling out unrepresentative cases, we can then conclude that a matching grant generally improves welfare, i.e. in the presence of spillovers it is optimal for CG to induce LGs to change their expenditure patterns using a matching grant.

Centralisation versus devolution
Centralisation correctly takes into account spillovers, but it never allows to reach FB because goods produced in centralisation have the lowest quality level (θ c = 1). On the other hand, devolution has drawbacks in terms of coordination, because LGs are unable to take into account the utility that users outside their jurisdiction attaches to their production. In the previous section we have shown that "pure devolution" is not optimal: CG intervention with a matching grant improves welfare. In this section we compare centralisation with this model of devolution. The welfare loss in the centralised decision depends on the quality gap (θ) and on the difference between the true preferences for the public good (z 1 and z 2 ), and the estimate used by CG (z). The one in devolution depends on the information CG has on local preference and LG's reaction to the grant. Table 6 summarises the welfare losses under the various assumptions. The welfare loss for centralisation derives from under-provision and from the lower utility each unit of impure public good produces; for devolution the loss derives from the provision of the wrong quantity of impure public good. The welfare loss for devolution is zero if k = 0: if the goods produces spillovers (k > 0) there might be scope for centralised provision.
The two parameters θ and k are thus fundamental in the comparison between W C , the welfare loss for centralisation, and W D , the welfare loss for devolution (the latter depends on CG's information set and we denote it generally by a subscript D). As for the 13 In fact W F ≥ W FC whenever 0.268 ≈ 2 − √ 3 ≤ z1 z2 ≤ 2 + √ 3 ≈ 3.732, while W F is lower than the minimum welfare loss obtained with some matching grant only if z1 z2 > 29 + 2 √ 210 ≈ 57.983 or z1 z2 < 29 − 2 √ 210 ≈ 0.017. Fig. 1 Comparison of the mean welfare loss produced by the matching grant for various levels of z and under the different assumptions on the behaviour of LGs dependence on θ, the function W C is increasing and convex, while W D is linear in this variable; both are quadratic polynomials in k. For W C , an increase in k reduces the loss deriving from choosing a uniform level of provision in the two local authorities, but as θ increases, the loss caused by producing a goods of relatively lower quality increases. The prevailing effect depends on the values of k and θ.
If we assume that CG can observe average local preferences, i.e. z = z 1 +z 2 2 , the lowest value for W D is equal to W Equi . In Appendix 4 it is shown that the welfare loss increases more rapidly for centralisation than for (the best case of ) devolution. Therefore devolution is the best option for any θ whenever W C ≥ W Equi for θ = 1. In all other cases there exists a (unique but depending on k) value θ * > 1 for which if θ < θ * centralisation performs better than (the best case of ) devolution. The sign of W C ≥ W Equi depends on k and the following can be proved (see Appendix 4): The value of k * is shown in (16); the one for θ * can be found explicitly, but the expression is quite cumbersome.
The above analysis has the following economic interpretation: for θ = 1, the welfare loss in centralisation is due only to the use of z instead of z 1 and z 2 , which becomes less and less important as k increases. At the same time the loss in devolution increases if spillovers become important. For sufficiently high values of k centralisation should be preferred. On the other hand, for a fixed value of k, an increase in θ has the same effect on both welfare losses, but has a comparatively greater effect on W C and this gradually (as the spillover increases) offsets the gain produced by a mitigation of the mistake produced by the uniform distribution of the goods between the two regions.
The dependence of the difference of welfare levels on both k and θ is depicted in Figs. 2 and 3, while a numerical example is presented in Table 7. In Fig. 2 one can observe that as θ increases [i.e. passing from the graph (a-d)] the scope for centralisation gradually reduces. In (d) the two welfare losses are depicted for θ > θ m , where θ m is the value of θ * for k = 1: in this case devolution always performs better than centralisation.
Devolution with matching grant From the above results it also follows that the abscissa of the intercept between W C and W FR as functions of θ is smaller that θ * . Thus, assuming that LGs are free riders leaves more scope for devolution than what would be optimal.
If CG is able to predict the behaviour, but cannot observe preferences, the minimum productivity differential θ * for which devolution should be preferred to centralisation may be evaluated using the same approach presented above. In this case, given that in devolution the reduction in welfare is lower than for the general case, θ * will be lower than in the model just presented, but there will still be an area in which devolution is not the best option.
These results are confirmed by the numerical simulations presented in Table 7. The first two simulations shows the effect of the spillover k on the optimal choice for a good whose public good component is relatively high. Preferences are rather homogeneous Dependence on θ of the welfare losses in centralisation and devolution for z =z for k < k * and k > k * across regions and the quality produced in devolution is substantially higher than in centralisation. In this case devolution with a matching grant (Equi) is the second best choice, but the loss produced by FR is rather low. On the contrary, if we consider a case where the quality differential is not important and preferences are heterogeneous, the public good aspect is less significant and centralisation is the best choice. In both cases, k increases the distance between welfare losses.

Conclusions
This paper studies the conditions under which devolution is a second best choice for the provision of goods and services are impure public goods with spillovers. Devolution calls for coordination in the actions of lower tiers to prevent a failure of the process, even when CG observes the relevant parameters and could use a matching grant to internalise spillovers.
In the more general case where CG cannot observe LG's reaction function and local preferences, devolution may cause large welfare losses. Its size depends on the spillovers effect k, on the level of the productivity θ of the goods produced at local level and on how well CG can predict the parameters it cannot observe. Our main conclusion is that devolution should be used only if the productivity (in terms of utility) of the goods produced at local level is sufficiently high to counterbalance the welfare loss produced by coordination, asymmetry of information and spillover effects. In all the other cases a centralised solution might be more effective. The traditional results of the literature on fiscal federalism are replicated by our model: for a local public good (k = 0), the FB solution is centralisation. In this respect our model can be considered a generalisation of the framework proposed by the traditional literature. The second interesting result of our model is that CG correction through a matching grant is always welfare improving, provided the grant that minimises the expected loss is chosen. From a policy point of view the results of our model show that in the presence of asymmetry of information CG has to balance autonomy with control and it may prefer the former to the latter also for welfare maximisation reasons. When the quality of the goods does not depend on the tier that produces it (θ close to 1), but spillovers are important (k > k * ), a centralised solution might be optimal in a second best environment. This is the choice that has been made by some countries like the UK where the provision of services like health and education are still very centralised and in general it may explain why the level of decentralisation in decision making is lower in Europe than in the US.
The work presented in this paper could be extended in several directions: first of all, redistribution policies could be considered by introducing a specific income distribution at national and local level. This point is very important because the interaction between equalisation grants and fiscal federalism may produce perverse effects and it may be one of the causes for failures in the fiscal federalism structure such as soft budget constraint policies (Breuillé and Vigneault 2010;Levaggi and Menoncin 2013). Secondly, political considerations could be introduced by assuming that political parties compete for votes on different objectives and in a different setting at national and local level. and Since in this notation we have γ FB j = θ(z j + kz −j ), for any two couples (γ 1 ,γ 2 ), (γ 1 ,γ 2 ) the welfare difference can be written as and the total quantity and welfare differences wrt the FB case amount to: For each case of the grant setting, three different reactions of the LGs are possible. Ex post, the average welfare loss wrt FB is: where γ B j = p(1 − ρ * + η B j (ρ * )) is the unit subsidy. The overall state contingent solution is presented in detail in Table 8. The first column reports CG beliefs about LG's reaction function; 14 then for each case in the second column are listed the three possible LG behaviours. Reading further from left to right, for each (expost) case (CG belief, LG behaviour) the differences with respect to the FB case of the total subsidy γ j for region j, the total produced quantity Q and the welfare are shown. Finally, the last column presents, for each possible entry of the first column, the mean ex-post welfare loss. Lines in bolditalics refer to the case where CG beliefs are fulfilled and the only mistake that CG makes is due to its imperfect observation of local preferences.
interchanged. Apart from this, unless ν is very large, the considerations done above for ν = 1 are still valid in the general case.