Thomas Piketty, Paris School of Economics

Academic year 2011-2012

Course Notes:

Optimal redistributive taxation of labor income

The optimal labor income tax problem

Mirrlees (1971) : basic labor supply model used to analyse optimal labor income taxes:

- each agent i is characterized by an exogenous wage rate wi (=productivity),

- labor supply  li

- pre-tax labor income yi = wi × li

- income tax t = t(yi)

(t(yi) can be >0 or <0 ; if <0, then this is an income transfer, or negative income tax)

 - after-tax labor income zi = yi – t(yi)

- agents choose labor supply  li by maximizing U(zi,li)

- social welfare function W = ∫ W(U(zi,li)) f(yi)dyi, subject to the budget constraint: ∫ t(yi) f(yi)dyi ≥ 0 (or ≥ G, with G = exogenous public spending)

- if individual productivities wi were fully observable, then the first-best efficient tax system would be t=t(wi), i.e. would not depend at all on labor supply behaviour, so that there would be no distortion = fully efficient redistribution

- however if the tax system can only depend on income, i.e. t = t(yi), e.g. because of unobservable productivities wi (adverse selection), then we have an equity/efficiency trade-off

>>> Mirrlees 1971 provides analytical solutions for the second-best efficient tax system in the presence of such an adverse selection problem

But problems with the Mirrlees 1971 formula:

(i) very complicated and unintuitive formulas, hard to apply empirically

(ii) only robust conclusion: with a finite number of productivity types w1, …, wn, zero marginal rate on the top group = completely off the mark

>>> Diamond (1998), Saez (2001): continuous distribution of types (no upper bound, so that the artificial zero-top-rate result disappears), first-order derivation of the optimal tax formulas, very intuitive and easy-to-calibrate formulas

First-order derivation of linear optimal labor income tax formulas

Linear tax schemes: t(y) = ty – t0

I.e. t = constant marginal tax rate

t0 >0 = transfer to individuals with zero labor income

Define e = labor supply elasticity

I.e. if the net wage (1-t)wi increases by 1%, labor supply li (and therefore labor income yi) increases by e%

E.g. if U(zi,li) = zi - V(li) (separable utility, no income effect), with V(l) = l^(1+µ)/(1+µ), then e=1/µ
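The e=1/µ result can be checked numerically. A minimal sketch, assuming the linear tax scheme of this section, so that the first-order condition of the agent is (1-t)wi = li^µ; the function names and parameter values are illustrative and not part of the notes:

```python
import math

def labor_supply(net_wage, mu):
    """Optimal labor supply for U = z - l**(1+mu)/(1+mu).

    The first-order condition net_wage = l**mu gives l = net_wage**(1/mu).
    """
    return net_wage ** (1.0 / mu)

def elasticity(net_wage, mu, step=1e-6):
    """Numerical elasticity d ln(l) / d ln(net_wage), central log-difference."""
    up = labor_supply(net_wage * math.exp(step), mu)
    down = labor_supply(net_wage * math.exp(-step), mu)
    return (math.log(up) - math.log(down)) / (2 * step)

mu = 2.0            # illustrative curvature parameter (assumption)
w, t = 10.0, 0.3    # illustrative wage rate and tax rate (assumptions)
e_numeric = elasticity((1 - t) * w, mu)
print(e_numeric, 1 / mu)  # the two numbers should agree
```

Since there is no income effect, the transfer t0 plays no role in the labor supply choice, which is why it does not appear in the sketch.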

More generally, whatever the random labor income generating process yi = y(productivity wi, labor supply li, effort ei , luck li), one can define e = generalized labor supply elasticity = if the net-of-tax rate (1-t) increases by 1%, labor income y increases by e%

Assume we’re looking for the tax rate t* maximizing tax revenues R = ty

(revenue-maximizing tax rate t* = top of the Laffer curve)

(revenue-maximizing tax rate t* = social optimum if W = Rawlsian, i.e. W’(U)=0 for all U>Umin, i.e. social objective = maximizing transfer t0)

(= useful reference point: by definition, socially optimal tax rates for non-Rawlsian welfare functions will be below the revenue-maximizing tax rate)

First-order condition: if the tax rate goes from t to t+dt, then tax revenues go from R to R+dR, with:

dR = y dt + t dy

with dy/y = - e dt/(1-t)

I.e. dR = y dt – t ey dt/(1-t)

dR = 0 if and only if  t/(1-t) = 1/e

I.e.   t* = 1/(1+e)

I.e. pure elasticity effect : if the elasticity e is higher, then the optimal tax t* is lower

I.e. if e=1 then t*=50%, if e=0.1 then t*≈91%, etc.

= the basic principle of optimal taxation theory: other things equal, don’t tax what’s elastic

(other example: Ramsey formulas on optimal indirect taxation: tax more the commodities with a less elastic demand, and conversely)
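The revenue-maximizing rate t* = 1/(1+e) derived above can be cross-checked by tracing the Laffer curve directly. A minimal sketch, assuming a constant elasticity e, so that dy/y = - e dt/(1-t) integrates to y = y0 (1-t)^e; y0 (income at t=0) and the grid size are hypothetical choices:

```python
def revenue(t, e, y0=1.0):
    """Tax revenue R = t * y, with y = y0 * (1 - t)**e (constant elasticity e)."""
    return t * y0 * (1.0 - t) ** e

def revenue_maximizing_rate(e, grid_size=100_000):
    """Grid search for the top of the Laffer curve over t in [0, 1)."""
    grid = (k / grid_size for k in range(grid_size))
    return max(grid, key=lambda t: revenue(t, e))

for e in (1.0, 0.5, 0.1):
    t_grid = revenue_maximizing_rate(e)
    t_formula = 1.0 / (1.0 + e)
    print(f"e={e}: grid search {t_grid:.3f}, formula {t_formula:.3f}")
```

The grid search is deliberately naive: it only confirms that the first-order condition picks out the maximum of R(t), not some other stationary point.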

First-order derivation of non-linear optimal labor income tax formulas

General non-linear tax schedule t(y)

I.e. marginal tax rates t’(y) can vary with y

Let f(y) denote the density function of labor income, and F(y) the cumulative distribution function

Assume one wants to increase the marginal tax rate from t’ to t’+dt’ over some income bracket [y; y+dy]. Then tax revenues go from R to R+dR, with:

dR = (1-F(y)) dt’ dy – f(y)dy t’ey dt’/(1-t’)

dR = 0 if and only if  t’*/(1-t’*) = [(1-F(y))/(y f(y))] × (1/e)
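The step from dR = 0 to the optimal rate can be spelled out as follows. This is only a rearrangement of the expression for dR given above (mechanical revenue gain on everyone above y, minus the behavioural loss on the bracket), with the last equality solved explicitly for t’*:

```latex
\begin{align*}
dR &= \bigl(1-F(y)\bigr)\,dt'\,dy \;-\; f(y)\,dy\,\frac{t'\,e\,y}{1-t'}\,dt' = 0 \\[4pt]
\Longleftrightarrow\quad
\frac{t'}{1-t'}\; e\,y\,f(y) &= 1-F(y) \\[4pt]
\Longleftrightarrow\quad
\frac{{t'}^{*}}{1-{t'}^{*}} &= \frac{1-F(y)}{y\,f(y)}\cdot\frac{1}{e}
\quad\Longleftrightarrow\quad
{t'}^{*} = \frac{1-F(y)}{\bigl(1-F(y)\bigr) + e\,y\,f(y)}
\end{align*}
```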

I.e.  two effects:

Elasticity effect: higher elasticities e imply lower marginal tax rates t’*

Distribution effect: higher (1-F)/yf ratios imply higher marginal rates t’*

Intuition: (1-F)/yf = ratio between the mass of people above y (=mass of people paying more tax) and the mass of people right at y (=mass of people hit by adverse incentive effects)

For low y, the ratio (1-F)/yf is declining: other things equal, marginal rates should fall

But for high y, the ratio (1-F)/yf is increasing: other things equal, marginal rates should rise

>>> for constant elasticity profiles, U-shaped pattern of marginal tax rates

Asymptotic optimal marginal rates for top incomes

With a Pareto distribution 1-F(y) = (k/y)^a and f(y) = a k^a / y^(1+a), then (1-F)/yf converges towards 1/a, i.e. t’* converges towards:

t’* = 1/(1+ae)

with e= elasticity, a = Pareto coefficient

Intuition: a higher a (i.e. lower coefficient b=a/(a-1), i.e. less fat upper tail) implies lower tax rates, and conversely

Example: if e=0.5 and a=2, t’* = 50%
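The Pareto case can be verified by plugging the survival function and the density into the general formula t’*/(1-t’*) = (1-F)/yf × 1/e. A minimal sketch, using a=2 and e=0.5 from the example above; the scale k=1 and the function names are arbitrary assumptions:

```python
def pareto_survival(y, a, k):
    """1 - F(y) = (k / y)**a for y >= k."""
    return (k / y) ** a

def pareto_density(y, a, k):
    """f(y) = a * k**a / y**(1 + a) for y >= k."""
    return a * k ** a / y ** (1 + a)

def optimal_marginal_rate(y, e, survival, density):
    """Solve t'/(1 - t') = [(1 - F(y)) / (y f(y))] * (1 / e) for t'."""
    ratio = survival(y) / (y * density(y))
    odds = ratio / e
    return odds / (1.0 + odds)

a, k, e = 2.0, 1.0, 0.5   # a, e from the example; k is an arbitrary scale
for y in (2.0, 10.0, 100.0):
    rate = optimal_marginal_rate(
        y, e,
        survival=lambda v: pareto_survival(v, a, k),
        density=lambda v: pareto_density(v, a, k),
    )
    print(y, rate, 1.0 / (1.0 + a * e))
```

With an exact Pareto distribution the ratio (1-F)/yf equals 1/a at every y above k, so the computed rate does not vary with y; with an empirical distribution that is only Pareto in the upper tail, the same function would give a rate that approaches 1/(1+ae) as y grows.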

Note : key property of Pareto distributions: ratio average/threshold = constant

Let y*(y) denote the average income (or wealth, or wage) of the population above threshold y. Then y*(y) can be expressed as follows:

y*(y) = [ ∫_{z>y} z f(z)dz ] / [ ∫_{z>y} f(z)dz ] = [ ∫_{z>y} dz/z^a ] / [ ∫_{z>y} dz/z^(1+a) ] = ay/(a-1)

I.e.

y*(y)/y = b = a/(a-1)

(and a = b/(b-1) )
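The constant average/threshold ratio can be checked numerically. A minimal sketch, assuming an exact Pareto distribution: conditional on z > y, one can write z = y × u^(-1/a) with u uniform on (0,1), and the conditional mean is approximated by a midpoint rule over u. The value a=3 and the function names are illustrative assumptions:

```python
def b_from_a(a):
    """Inverted Pareto coefficient b = a / (a - 1), defined for a > 1."""
    return a / (a - 1.0)

def a_from_b(b):
    """Pareto coefficient a = b / (b - 1), defined for b > 1."""
    return b / (b - 1.0)

def mean_above_threshold(y, a, n=200_000):
    """Average of a Pareto(a) variable conditional on exceeding y.

    Conditional on z > y, z = y * u**(-1/a) with u uniform on (0, 1);
    the mean is approximated by a midpoint rule over u.
    """
    h = 1.0 / n
    return sum(y * ((i + 0.5) * h) ** (-1.0 / a) for i in range(n)) * h

a = 3.0  # illustrative value (assumption); the sum converges more slowly as a nears 1
for y in (1.0, 5.0, 50.0):
    # The ratio should be (approximately) the same at every threshold y.
    print(y, mean_above_threshold(y, a) / y, b_from_a(a))
```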

In practice :

For top incomes:

France today: b = 1.7-1.8 (a=2.2-2.3)

France interwar, US interwar, US today:  b = 2.2-2.3 (a=1.7-1.8)

For top wealth:

France today: b = 2.2-2.3 (or 2-2.5)

Higher b coefficients = fatter upper-tail = higher concentration
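To see what b coefficients in the range quoted above imply for the asymptotic formula t’* = 1/(1+ae), one can convert b into a and plug in an elasticity. A purely illustrative sketch: e=0.5 is the value used in the earlier example, not an estimate, and the function names are hypothetical:

```python
def a_from_b(b):
    """Pareto coefficient a = b / (b - 1), defined for b > 1."""
    return b / (b - 1.0)

def asymptotic_top_rate(a, e):
    """Asymptotic optimal marginal rate t'* = 1 / (1 + a * e)."""
    return 1.0 / (1.0 + a * e)

e = 0.5  # illustrative elasticity, taken from the earlier example (assumption)
for b in (1.7, 1.8, 2.2, 2.3):
    a = a_from_b(b)
    print(f"b={b:.1f}  a={a:.2f}  t'*={asymptotic_top_rate(a, e):.3f}")
```

For a given elasticity, a higher b (fatter upper tail) gives a lower a and therefore a higher asymptotic rate.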