Thomas Piketty, Academic year 2010-2011
Course Notes: Optimal redistributive taxation of labor income
The optimal labor income tax problem
Mirrlees (1971): basic labor supply model used to analyse optimal labor income taxes:
- each agent i is characterized by an exogenous wage rate w_i (= productivity)
- labor supply l_i
- pre-tax labor income y_i = w_i l_i
- income tax t = t(y_i) (t(y_i) can be > 0 or < 0; if < 0, it is an income transfer, or negative income tax)
- after-tax labor income z_i = y_i - t(y_i)
- agents choose labor supply l_i by maximizing U(z_i, l_i)
- social welfare function W = ∫ W(U(z_i, l_i)) f(y_i) dy_i, subject to the budget constraint ∫ t(y_i) f(y_i) dy_i > 0 (or > G, with G = exogenous public spending)
- if individual productivities w_i were fully observable, then the first-best efficient tax system would be t = t(w_i), i.e. it would not depend at all on labor supply behaviour, so that there would be no distortion = fully efficient redistribution
- however, if the tax system can only depend on income, i.e. t = t(y_i), e.g. because productivities w_i are unobservable (adverse selection), then we have an equity/efficiency trade-off
>>> Mirrlees 1971 provides analytical solutions for the second-best efficient tax system in the presence of such an adverse selection problem
But problems with the Mirrlees 1971 formula:
(i) very complicated and unintuitive formulas, hard to apply empirically
(ii) only robust conclusion: with a finite number of productivity types w_1, …, w_n, zero marginal rate on the top group = completely off the mark
>>> Diamond (1998), Saez (2001): continuous distribution of types (no upper bound, so that the artificial zero-top-rate result disappears), first-order derivation of the optimal tax formulas, very intuitive and easy-to-calibrate formulas
First-order derivation of linear optimal labor income tax formulas
Linear tax schemes: t(y) = ty - t_0
I.e. t = constant marginal tax rate
t_0 > 0 = transfer to individuals with zero labor income
Define e = labor supply elasticity
I.e. if the net wage (1-t)w_i increases by 1%, labor supply l_i (and therefore labor income y_i) increases by e%
E.g. if U(z_i, l_i) = z_i - V(l_i) (separable utility, no income effect), with V(l) = l^(1+µ)/(1+µ), then e = 1/µ
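The e = 1/µ result can be checked numerically. The sketch below assumes the quasi-linear utility of the example, with after-tax income z = net_wage * l; the function names and the finite-difference step are illustrative choices, not part of the original notes.

```python
def labor_supply(net_wage: float, mu: float) -> float:
    """Optimal labor supply for U = z - l**(1+mu)/(1+mu), with z = net_wage * l.

    The first-order condition net_wage = l**mu gives l = net_wage**(1/mu).
    """
    return net_wage ** (1.0 / mu)


def elasticity(net_wage: float, mu: float, step: float = 1e-6) -> float:
    """Finite-difference elasticity of labor supply w.r.t. the net wage."""
    l0 = labor_supply(net_wage, mu)
    l1 = labor_supply(net_wage * (1.0 + step), mu)
    # Relative change in labor supply divided by relative change in net wage.
    return (l1 - l0) / l0 / step


if __name__ == "__main__":
    for mu in (0.5, 1.0, 2.0, 4.0):
        print(f"mu = {mu}: numerical e = {elasticity(10.0, mu):.4f}, 1/mu = {1.0 / mu:.4f}")
```

The numerical elasticity should not depend on the net wage chosen (10.0 here is arbitrary), since this utility function gives a constant elasticity.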
More generally, whatever the random labor income generating process y_i = y(productivity w_i, labor supply l_i, effort e_i, luck l_i), one can define e = generalized labor supply elasticity: if the net-of-tax rate (1-t) increases by 1%, labor income y increases by e%
Assume we're looking for the tax rate t* maximizing tax revenues R = ty
(revenue-maximizing tax rate t* = top of the Laffer curve)
(revenue-maximizing tax rate t* = social optimum if W = Rawlsian, i.e. zero welfare weight (W = 0) for all U > U_min, i.e. social objective = maximizing transfer t_0)
(= useful reference point: by definition, socially optimal tax rates for non-Rawlsian welfare functions will be below revenue-maximizing tax levels)
First-order condition: if the tax rate goes from t to t+dt, then tax revenues go from R to R+dR, with:
dR = y dt + t dy
with dy/y = - e dt/(1-t)
I.e. dR = y dt - t e y dt/(1-t)
dR = 0 if and only if t/(1-t) = 1/e
I.e. t* = 1/(1+e)
I.e. pure elasticity effect: if the elasticity e is higher, then the optimal tax rate t* is lower
I.e. if e = 1 then t* = 50%, if e = 0.1 then t* ≈ 91%, etc.
= the basic principle of optimal taxation theory: other things equal, don't tax what's elastic
(other example: Ramsey formulas on optimal indirect taxation: tax more heavily the commodities with a less elastic demand, and conversely)
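The revenue-maximizing formula t* = 1/(1+e) can be compared with a brute-force search along the Laffer curve. This is only a sketch: it assumes the constant-elasticity specification y = y0 (1-t)^e, which is one functional form consistent with dy/y = - e dt/(1-t); y0 and the function names are illustrative.

```python
def revenue(t: float, e: float, y0: float = 1.0) -> float:
    """Tax revenue R = t * y, assuming income y = y0 * (1 - t)**e."""
    return t * y0 * (1.0 - t) ** e


def revenue_maximizing_rate(e: float) -> float:
    """Closed-form top of the Laffer curve: t* = 1 / (1 + e)."""
    return 1.0 / (1.0 + e)


def grid_argmax(e: float, n: int = 100_000) -> float:
    """Brute-force check: rate on an even grid of [0, 1] with the highest revenue."""
    grid = [i / n for i in range(n + 1)]
    return max(grid, key=lambda t: revenue(t, e))


if __name__ == "__main__":
    for e in (0.1, 0.25, 0.5, 1.0):
        print(f"e = {e}: formula t* = {revenue_maximizing_rate(e):.4f}, "
              f"grid search t* = {grid_argmax(e):.4f}")
```

The grid search and the closed form should agree up to the grid spacing, and both fall as e rises.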
First-order derivation of non-linear optimal labor income tax formulas
General non-linear tax schedule t(y)
I.e. marginal tax rates t'(y) can vary with y
Denote by f(y) the density function for labor income, and by F(y) the distribution function
Assume one wants to increase the marginal tax rate from t' to t'+dt' over some income bracket [y; y+dy]. Then tax revenues go from R to R+dR, with:
dR = (1-F(y)) dt' dy - f(y) dy t' e y dt'/(1-t')
dR = 0 if and only if t'*/(1-t'*) = [(1-F(y))/(y f(y))] (1/e)
I.e. two effects:
Elasticity effect: higher elasticities e imply lower marginal tax rates t'*
Distribution effect: higher (1-F)/yf ratios imply higher marginal rates t'*
Intuition: (1-F)/yf = ratio between the mass of people above y (= mass of people paying more tax) and the mass of people right at y (= mass of people hit by adverse incentive effects)
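Both effects can be read off a small helper that solves the first-order condition for t'*. This is a sketch of the revenue-maximizing formula only; the function name and the ratio values in the loop are hypothetical illustrations, not empirical estimates.

```python
def optimal_marginal_rate(tail_ratio: float, e: float) -> float:
    """Solve t'/(1 - t') = tail_ratio / e for t'.

    tail_ratio stands for (1 - F(y)) / (y f(y)) at the income level y considered.
    """
    x = tail_ratio / e
    return x / (1.0 + x)


if __name__ == "__main__":
    # Hypothetical ratios, chosen only to illustrate the two effects.
    for tail_ratio in (0.25, 0.5, 1.0, 2.0):
        for e in (0.25, 0.5):
            rate = optimal_marginal_rate(tail_ratio, e)
            print(f"(1-F)/yf = {tail_ratio}, e = {e}: t'* = {rate:.3f}")
```

Holding e fixed, the rate rises with the ratio (distribution effect); holding the ratio fixed, it falls with e (elasticity effect).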
For low y, the ratio (1-F)/yf is declining: other things equal, marginal rates should fall
But for high y, the ratio (1-F)/yf is increasing: other things equal, marginal rates should rise
>>> for constant elasticity profiles, U-shaped pattern of marginal tax rates
Asymptotic optimal marginal rates for top incomes
With a Pareto distribution 1-F(y) = (k/y)^a and f(y) = a k^a / y^(1+a), then (1-F)/yf converges towards 1/a, i.e. t'* converges towards:
t'* = 1/(1+ae)
with e = elasticity, a = Pareto coefficient
Intuition: a higher a (i.e. a lower coefficient b = a/(a-1), i.e. a less fat upper tail) implies lower tax rates, and conversely
Example: if e = 0.5 and a = 2, t'* = 50%
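A short numerical check of the Pareto case, assuming an exact Pareto distribution above the scale parameter k (so that the ratio equals 1/a at every y > k, not only asymptotically). The function names are illustrative.

```python
def pareto_tail_ratio(y: float, k: float, a: float) -> float:
    """(1 - F(y)) / (y f(y)) for a Pareto distribution with scale k and coefficient a."""
    survival = (k / y) ** a              # 1 - F(y)
    density = a * k ** a / y ** (1 + a)  # f(y)
    return survival / (y * density)


def top_marginal_rate(a: float, e: float) -> float:
    """Asymptotic revenue-maximizing marginal rate t'* = 1 / (1 + a e)."""
    return 1.0 / (1.0 + a * e)


if __name__ == "__main__":
    for y in (2.0, 10.0, 100.0):
        print(f"y = {y}: (1-F)/yf = {pareto_tail_ratio(y, k=1.0, a=2.0):.4f}")
    for a in (1.5, 2.0, 3.0):
        for e in (0.25, 0.5):
            print(f"a = {a}, e = {e}: t'* = {top_marginal_rate(a, e):.3f}")
```

The first loop should print the same ratio for every y, which is why the top marginal rate depends only on a and e.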
Note: key property of Pareto distributions: ratio average/threshold = constant
Denote by y*(y) the average income (or wealth, or wage) of the population above threshold y. Then y*(y) can be expressed as follows:
y*(y) = [ ∫_{z>y} z f(z) dz ] / [ ∫_{z>y} f(z) dz ] = [ ∫_{z>y} dz/z^a ] / [ ∫_{z>y} dz/z^(1+a) ] = ay/(a-1)
I.e. y*(y)/y = b = a/(a-1)
(and a = b/(b-1))
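The constant average/threshold ratio can be illustrated numerically. The sketch below assumes an exact Pareto tail and approximates the conditional mean by averaging conditional quantiles on an even grid (a midpoint rule); the function names and the grid size n are illustrative choices, and the approximation slightly understates the true mean because of the heavy tail.

```python
def inverted_pareto_coefficient(a: float) -> float:
    """b = a / (a - 1); requires a > 1 so that the mean is finite."""
    if a <= 1.0:
        raise ValueError("the mean above a threshold is infinite for a <= 1")
    return a / (a - 1.0)


def pareto_coefficient(b: float) -> float:
    """Inverse mapping: a = b / (b - 1), for b > 1."""
    if b <= 1.0:
        raise ValueError("b must exceed 1")
    return b / (b - 1.0)


def mean_above_threshold(y: float, a: float, n: int = 200_000) -> float:
    """Approximate the average of z given z > y for a Pareto tail.

    Conditional on z > y, the quantile function is y * (1 - u)**(-1/a);
    averaging it over midpoints u of an even grid on (0, 1) approximates the mean.
    """
    total = 0.0
    for j in range(n):
        u = (j + 0.5) / n
        total += y * (1.0 - u) ** (-1.0 / a)
    return total / n


if __name__ == "__main__":
    a = 2.0
    print(f"b = a/(a-1) = {inverted_pareto_coefficient(a):.3f}")
    for y in (1.0, 10.0, 1000.0):
        print(f"y = {y}: y*(y)/y = {mean_above_threshold(y, a) / y:.3f}")
```

The ratio printed in the loop should be (approximately) the same at every threshold y and close to b.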
In practice:
For top incomes:
For top wealth:
Higher b coefficients = fatter upper tail = higher concentration