Thomas Piketty, Paris School of Economics
Academic year 2011-2012
Course Notes:
Optimal redistributive taxation of labor income
The optimal labor income tax problem
Mirrlees (1971) : basic labor supply model used to analyse optimal labor income taxes:
- each agent i is characterized by an exogenous wage rate wi (=productivity),
- labor supply li
- pre-tax labor income yi = wili
- income tax t = t(yi)
(t(yi) can be >0 or <0 ; if <0, then this is an income transfer, or negative income tax)
- after-tax labor income zi = yi – t(yi)
- agents choose labor supply li by maximizing U(zi,li)
- social welfare function W = ∫ W(U(zi,li)) f(yi)dyi subject to the budget constraint: ∫ t(yi) f(yi)dyi ≥ 0 (or ≥ G, with G = exogenous public spending)
- if individual productivities wi were fully observable, then the first-best efficient tax system would be t=t(wi), i.e. would not depend at all on labor supply behaviour, so that there would be no distortion = fully efficient redistribution
- however if the tax system can only depend on income, i.e. t = t(yi), e.g. because of unobservable productivities wi (adverse selection), then we have an equity/efficiency trade-off
>>> Mirrlees 1971 provides analytical solutions for the second-best efficient tax system in the presence of such an adverse selection problem
But problems with the Mirrlees 1971 formula:
(i) very complicated and unintuitive formulas, hard to apply empirically
(ii) only robust conclusion: with a finite number of productivity types w1, …, wn, then zero marginal rate on the top group = completely off-the-mark
>>> Diamond (1998), Saez (2001): continuous distribution of types (no upper bound, so that the artificial zero-top-rate result disappears), first-order derivation of the optimal tax formulas, very intuitive and easy-to-calibrate formulas
First-order derivation of linear optimal labor income tax formulas
Linear tax schemes: t(y) = ty – t0
I.e. t = constant marginal tax rate
t0 >0 = transfer to individuals with zero labor income
Define e = labor supply elasticity
I.e. if the net wage (1-t)wi increases by 1%, labor supply li (and therefore labor income yi) increases by e%
E.g. if U(zi,li) = zi - V(li) (separable utility, no income effect), with V(l) = l^(1+µ)/(1+µ), then the first-order condition (1-t)wi = li^µ gives li = ((1-t)wi)^(1/µ), so that e = 1/µ
More generally, whatever the random labor income generating process yi = y(productivity wi, labor supply li, effort ei , luck li), one can define e = generalized labor supply elasticity = if the net-of-tax rate (1-t) increases by 1%, labor income y increases by e%
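The e = 1/µ result for the separable utility example can be checked numerically. The sketch below is only an illustration: the function names and the parameter values (w = 10, t = 0.3, µ = 2) are hypothetical choices, not taken from the notes.

```python
import math

def labor_supply(w, t, mu):
    """Optimal labor supply for U = z - l**(1+mu)/(1+mu), with z = (1-t)*w*l + t0.

    The first-order condition (1-t)*w = l**mu gives l = ((1-t)*w)**(1/mu).
    """
    return ((1.0 - t) * w) ** (1.0 / mu)

def elasticity(w, t, mu, bump=1e-6):
    """Numerical elasticity of labor supply with respect to the net wage (1-t)*w."""
    l0 = labor_supply(w, t, mu)
    # Scaling w by (1 + bump) raises the net wage (1-t)*w by the same proportion.
    l1 = labor_supply(w * (1.0 + bump), t, mu)
    return (math.log(l1) - math.log(l0)) / math.log(1.0 + bump)

mu = 2.0  # hypothetical curvature parameter
print(elasticity(w=10.0, t=0.3, mu=mu), 1.0 / mu)
```

The two printed numbers should agree closely, and the result does not depend on the chosen w or t because the elasticity is constant in this specification.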
Assume we’re looking for the tax rate t* maximizing tax revenues R = ty
(revenue-maximizing tax rate t* = top of the Laffer curve)
(revenue-maximizing tax rate t* = social optimum if W = Rawlsian, i.e. W’(U)=0 for all U>Umin, i.e. social objective = maximizing transfer t0)
(= useful reference point: by definition, socially optimal tax rates for non-Rawlsian welfare functions will be below revenue-maximizing tax levels)
First-order condition: if the tax rate goes from t to t+dt, then tax revenues go from R to R+dR, with:
dR = y dt + t dy
with dy/y = - e dt/(1-t)
I.e. dR = y dt - t e y dt/(1-t) = y dt [1 - e t/(1-t)]
dR = 0 if and only if t/(1-t) = 1/e
I.e. t* = 1/(1+e)
I.e. pure elasticity effect : if the elasticity e is higher, then the optimal tax t* is lower
I.e. if e=1 then t*=50%, if e=0.1 then t*≈91%, etc.
= the basic principle of optimal taxation theory: other things equal, don’t tax what’s elastic
(other example: Ramsey formulas on optimal indirect taxation: tax more the commodities with a less elastic demand, and conversely)
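The formula t* = 1/(1+e) can be checked against a brute-force search for the top of the Laffer curve. This is a minimal sketch under an added assumption: integrating dy/y = - e dt/(1-t) with a constant elasticity gives y(t) = y0 (1-t)^e, where y0 (income at a zero tax rate) and the function names are hypothetical.

```python
def revenue(t, e, y0=1.0):
    """Tax revenue R = t * y(t), assuming the constant-elasticity response y(t) = y0 * (1-t)**e."""
    return t * y0 * (1.0 - t) ** e

def revenue_maximizing_rate(e, steps=100_000):
    """Grid search over t in [0, 1] for the rate that maximizes revenue."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda t: revenue(t, e))

for e in (0.1, 0.5, 1.0):
    # Compare the grid-search optimum with the closed form 1/(1+e).
    print(e, round(revenue_maximizing_rate(e), 3), round(1.0 / (1.0 + e), 3))
```

A grid search is used rather than an optimizer so that the check does not rely on the first-order condition it is meant to confirm.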
First-order derivation of non-linear optimal labor income tax formulas
General non-linear tax schedule t(y)
I.e. marginal tax rates t’(y) can vary with y
Denote by f(y) the density function of labor income, and by F(y) the distribution function
Assume one wants to increase the marginal tax rate from t’ to t’+dt’ over some income bracket [y; y+dy]. Then tax revenues go from R to R+dR, with:
dR = (1-F(y)) dt’ dy – f(y)dy t’ey dt’/(1-t’)
dR = 0 if and only if t’*/(1-t’*) = [(1-F(y))/(y f(y))] · (1/e)
I.e. two effects:
Elasticity effect: higher elasticities e imply lower marginal tax rates t’*
Distribution effect: higher (1-F)/yf ratios imply higher marginal rates t’*
Intuition : (1-F)/yf = ratio between the mass of people above y (=mass of people paying more tax) and the mass of people right at y (=mass of people hit by adverse incentives effects)
For low y, the ratio (1-F)/yf is declining: other things equal, marginal rates should fall
But for high y, the ratio (1-F)/yf is increasing: other things equal, marginal rates should rise
>>> for constant elasticity profiles, U-shaped pattern of marginal tax rates
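Solving t’*/(1-t’*) = r/e for t’*, with r = (1-F)/yf, gives t’* = r/(r+e). The sketch below evaluates this for a lognormal income distribution, which is purely an illustrative assumption (as are the function names and the values m = 0, s = 1, e = 0.5). A lognormal only reproduces the declining part of the ratio; the rising part at high incomes requires a fatter upper tail.

```python
import math

def optimal_marginal_rate(ratio, e):
    """Solve t'/(1-t') = ratio/e for t', where ratio = (1-F(y))/(y f(y))."""
    return ratio / (ratio + e)

def lognormal_ratio(y, m=0.0, s=1.0):
    """(1-F(y))/(y f(y)) when log income is normal with mean m and std s (illustrative assumption)."""
    z = (math.log(y) - m) / s
    survival = 0.5 * math.erfc(z / math.sqrt(2.0))                        # 1 - F(y)
    y_density = math.exp(-0.5 * z * z) / (s * math.sqrt(2.0 * math.pi))   # y * f(y)
    return survival / y_density

e = 0.5  # hypothetical constant elasticity
for y in (0.25, 0.5, 1.0, 2.0, 4.0):
    r = lognormal_ratio(y)
    print(y, round(r, 3), round(optimal_marginal_rate(r, e), 3))
```

With these assumptions the ratio, and therefore the marginal rate, falls as y rises, which matches the distribution effect described for the lower part of the income range.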
Asymptotic optimal marginal rates for top incomes
With a Pareto distribution 1-F(y) = (k/y)^a and f(y) = a k^a / y^(1+a), then (1-F)/yf converges towards 1/a, i.e. t’* converges towards:
t’* = 1/(1+ae)
with e= elasticity, a = Pareto coefficient
Intuition: a higher a (i.e. a lower coefficient b=a/(a-1), i.e. a less fat upper tail) implies a lower tax rate, and conversely
Example: if e=0.5 and a=2, t’* = 50%
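The top-rate formula can be tabulated for a few parameter values. The sketch below also checks that (1-F)/yf equals 1/a for an exact Pareto distribution; the function names and the grid of (a, e) values are hypothetical, chosen only for illustration.

```python
def pareto_ratio(y, k, a):
    """(1-F(y))/(y f(y)) for an exact Pareto distribution with scale k and shape a."""
    survival = (k / y) ** a                 # 1 - F(y)
    density = a * k ** a / y ** (1.0 + a)   # f(y)
    return survival / (y * density)

def top_marginal_rate(e, a):
    """Asymptotic top marginal rate t'* = 1/(1 + a*e)."""
    return 1.0 / (1.0 + a * e)

# The ratio does not depend on y or k, only on a.
print(pareto_ratio(y=5.0, k=1.0, a=2.0), pareto_ratio(y=50.0, k=3.0, a=2.0))

for a in (1.5, 2.0, 2.5):
    for e in (0.2, 0.5):
        print(a, e, round(top_marginal_rate(e, a), 3))
```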
Note : key property of Pareto distributions: ratio average/threshold = constant
Denote by y*(y) the average income (or wealth, or wage) of the population above threshold y. Then y*(y) can be expressed as follows:
y*(y) = [ ∫z>y z f(z)dz ] / [ ∫z>y f(z)dz ] = [ ∫z>y dz/z^a ] / [ ∫z>y dz/z^(1+a) ] = ay/(a-1)
I.e.
y*(y)/y = b = a/(a-1)
(and a = b/(b-1) )
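The constant average/threshold ratio can be verified by numerical integration. The sketch below uses the change of variable u = y/z, under which the threshold y drops out of the ratio entirely; the function names, the midpoint rule and the chosen values of a are illustrative assumptions.

```python
def pareto_b(a):
    """Inverted Pareto coefficient b = a/(a-1); needs a > 1 for a finite mean above the threshold."""
    if a <= 1.0:
        raise ValueError("the mean above the threshold is infinite when a <= 1")
    return a / (a - 1.0)

def pareto_a(b):
    """Pareto coefficient a = b/(b-1); needs b > 1."""
    if b <= 1.0:
        raise ValueError("b must exceed 1")
    return b / (b - 1.0)

def mean_above_over_threshold(a, n=200_000):
    """Midpoint-rule estimate of y*(y)/y for a density proportional to 1/z**(1+a).

    With u = y/z, the numerator integral becomes y**(1-a) * integral of u**(a-2) on (0, 1)
    and the denominator becomes y**(-a) * integral of u**(a-1) on (0, 1), so that
    y*(y)/y is the ratio of the two u-integrals and does not depend on y.
    """
    h = 1.0 / n
    num = den = 0.0
    for i in range(n):
        u = (i + 0.5) * h
        num += u ** (a - 2.0) * h
        den += u ** (a - 1.0) * h
    return num / den

for a in (2.0, 2.5, 3.0):
    print(a, round(mean_above_over_threshold(a), 4), round(pareto_b(a), 4))
```

The numerical ratio and the closed form a/(a-1) should agree closely for each a, and pareto_a(pareto_b(a)) returns the original coefficient.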
In practice :
For top incomes:
France today: b = 1.7-1.8 (a=2.2-2.3)
France interwar, US interwar, US today: b = 2.2-2.3 (a=1.7-1.8)
For top wealth:
France today: b = 2.2-2.3 (or 2-2.5)
Higher b coefficients = fatter upper-tail = higher concentration