Thomas Piketty

Thomas Piketty, Paris School of Economics

Academic year 2009-2010

Course Notes:

Optimal redistributive taxation of labor income

The optimal labor income tax problem

Mirrlees (1971) : basic labor supply model used to analyse optimal labor income taxes:

- each agent i is characterized by an exogenous wage rate w_i (= productivity),

- labor supply l_i

- pre-tax labor income y_i = w_i l_i

- income tax t = t(y_i)

(t(y_i) can be >0 or <0 ; if <0, then this is an income transfer, or negative income tax)

- after-tax labor income z_i = y_i – t(y_i)

- agents choose labor supply l_i by maximizing U(z_i,l_i)

- social welfare function W = ∫ W(U(z_i,l_i)) f(y_i)dy_i, maximized subject to the budget constraint: ∫ t(y_i) f(y_i)dy_i ≥ 0 (or ≥ G, with G = exogenous public spending)

- if individual productivities w_i were fully observable, then the first-best efficient tax system would be t=t(w_i), i.e. would not depend at all on labor supply behaviour, so that there would be no distortion = fully efficient redistribution

- however, if the tax system can only depend on income, i.e. t = t(y_i), e.g. because of unobservable productivities w_i (adverse selection), then we have an equity/efficiency trade-off

>>> Mirrlees 1971 provides analytical solutions for the second-best efficient tax system in the presence of such an adverse selection problem

But problems with the Mirrlees 1971 formula:

(i) very complicated and unintuitive formulas, hard to apply empirically

(ii) only robust conclusion: with a finite number of productivity types w_1, …, w_n, the optimal marginal rate on the top group is zero = completely off the mark

>>> Diamond (1998), Saez (2001): continuous distribution of types (no upper bound, so that the artificial zero-top-rate result disappears), first-order derivation of the optimal tax formulas, very intuitive and easy-to-calibrate formulas

First-order derivation of linear optimal labor income tax formulas

Linear tax schemes: t(y) = ty – t₀

I.e. t = constant marginal tax rate

t₀ >0 = transfer to individuals with zero labor income

Define e = labor supply elasticity

I.e. if the net wage (1-t)w_i increases by 1%, labor supply l_i (and therefore labor income y_i) increases by e%

E.g. if U(z_i,l_i) = z_i - V(l_i) (separable utility, no income effect), with V(l) = l^(1+µ)/(1+µ), then e=1/µ
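The e = 1/µ result can be checked numerically. The following is a minimal sketch, assuming the linear tax scheme t(y) = ty – t₀, so that the agent maximizes (1-t)w l + t₀ – V(l) and the first-order condition (1-t)w = l^µ gives l = ((1-t)w)^(1/µ). The function names and parameter values are illustrative assumptions, not part of the notes.

```python
import math

def labor_supply(w, t, mu):
    """Optimal labor supply under U = z - l**(1+mu)/(1+mu) and a linear tax.

    The first-order condition (1 - t) * w = l**mu gives the closed form below.
    """
    return ((1.0 - t) * w) ** (1.0 / mu)

def net_wage_elasticity(w, t, mu, bump=1e-6):
    """Finite-difference elasticity of labor supply w.r.t. the net wage (1 - t) * w."""
    # Scaling the gross wage w by (1 + bump) scales the net wage by the same factor.
    l0 = labor_supply(w, t, mu)
    l1 = labor_supply(w * (1.0 + bump), t, mu)
    return (math.log(l1) - math.log(l0)) / math.log(1.0 + bump)

# Hypothetical parameter values, chosen only for illustration.
mu = 2.0
elasticity = net_wage_elasticity(w=10.0, t=0.3, mu=mu)
print(f"numerical elasticity = {elasticity:.4f}, 1/mu = {1.0 / mu:.4f}")
```

Because labor supply is a power function of the net wage, the log-difference estimate should match 1/µ up to floating-point error, whatever values of w and t are chosen.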

More generally, whatever the random labor income generating process y_i = y(productivity w_i, labor supply l_i, effort e_i , luck l_i), one can define e = generalized labor supply elasticity = if the net-of-tax rate (1-t) increases by 1%, labor income y increases by e%

Assume we’re looking for the tax rate t* maximizing tax revenues R = ty

(revenue-maximizing tax rate t* = top of the Laffer curve)

(revenue-maximizing tax rate t* = social optimum if W is Rawlsian, i.e. zero social weight on all U>U_min, i.e. social objective = maximizing the transfer t₀)

(= useful reference point: by definition, socially optimal tax rates for non-Rawlsian welfare functions will be below revenue-maximizing tax levels)

First-order condition: if the tax rate goes from t to t+dt, then tax revenues go from R to R+dR, with:

dR = y dt + t dy

with dy/y = - e dt/(1-t)

I.e. dR = y dt – t ey dt/(1-t)

dR = 0 if and only if t/(1-t) = 1/e

I.e. t* = 1/(1+e)

I.e. pure elasticity effect : if the elasticity e is higher, then the optimal tax t* is lower

I.e. if e=1 then t*=50%, if e=0.1 then t*≈91%, etc.

= the basic principle of optimal taxation theory: other things equal, don’t tax what’s elastic

(other example: Ramsey formulas on optimal indirect taxation: tax more the commodities with a less elastic demand, and conversely)
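The formula t* = 1/(1+e) can be illustrated by tracing the Laffer curve directly. This is a minimal sketch, assuming a constant-elasticity income response y = y0 (1-t)^e, which is one functional form consistent with dy/y = - e dt/(1-t); the normalization y0 = 1, the grid size and the function names are hypothetical choices.

```python
def laffer_revenue(t, e, y0=1.0):
    """Revenue R = t * y, assuming y = y0 * (1 - t)**e (constant elasticity e)."""
    return t * y0 * (1.0 - t) ** e

def revenue_maximizing_rate(e, steps=100_000):
    """Grid search for the top of the Laffer curve over t in [0, 1)."""
    grid = (i / steps for i in range(steps))
    return max(grid, key=lambda t: laffer_revenue(t, e))

for e in (1.0, 0.5, 0.1):
    t_grid = revenue_maximizing_rate(e)
    t_formula = 1.0 / (1.0 + e)
    print(f"e={e}: grid search t*={t_grid:.3f}, formula 1/(1+e)={t_formula:.3f}")
```

The grid search and the closed-form rate should agree to within the grid spacing, and the revenue-maximizing rate falls as e rises.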

First-order derivation of non-linear optimal labor income tax formulas

General non-linear tax schedule t(y)

I.e. marginal tax rates t’(y) can vary with y

Denote by f(y) the density function of labor income, and by F(y) the distribution function

Assume one wants to increase the marginal tax rate from t’ to t’+dt’ over some income bracket [y; y+dy]. Then tax revenues go from R to R+dR, with:

dR = (1-F(y)) dt’ dy – f(y)dy t’ey dt’/(1-t’)

dR = 0 if and only if t’*/(1-t’*) = [(1-F(y))/(y f(y))] × (1/e)

I.e. two effects:

Elasticity effect: higher elasticities e imply lower marginal tax rates t’*

Distribution effect: higher (1-F)/yf ratios imply higher marginal rates t’*

Intuition : (1-F)/yf = ratio between the mass of people above y (= mass of people paying more tax) and the mass of people right at y (= mass of people hit by adverse incentive effects)

For low y, the ratio (1-F)/yf is declining: other things equal, marginal rates should fall

But for high y, the ratio (1-F)/yf is increasing: other things equal, marginal rates should rise

>>> for constant elasticity profiles, U-shaped pattern of marginal tax rates
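The two effects can be made concrete by solving the first-order condition for t’*. This is a minimal sketch: the function name, the argument name dist_ratio (standing for (1-F)/yf) and the numerical values are hypothetical, chosen only to show how the formula responds to each input.

```python
def optimal_marginal_rate(dist_ratio, e):
    """Solve t'/(1 - t') = dist_ratio / e for t'.

    dist_ratio stands for (1 - F(y)) / (y * f(y)); e is the elasticity.
    """
    x = dist_ratio / e
    return x / (1.0 + x)

# Hypothetical values of (1-F)/yf and e, for illustration only.
for dist_ratio in (0.3, 0.5, 1.0):
    for e in (0.25, 0.5):
        rate = optimal_marginal_rate(dist_ratio, e)
        print(f"(1-F)/yf={dist_ratio}, e={e}: t'*={rate:.3f}")
```

Holding e fixed, a higher (1-F)/yf ratio raises t’* (distribution effect); holding the ratio fixed, a higher e lowers it (elasticity effect).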

Asymptotic optimal marginal rates for top incomes

With a Pareto distribution 1-F(y) = (k/y)^a and f(y)=ak^a/y^(1+a), then (1-F)/yf converges towards 1/a, i.e. t’* converges towards:

t’* = 1/(1+ae)

with e= elasticity, a = Pareto coefficient

Intuition: a higher a (i.e. a lower coefficient b=a/(a-1), i.e. a less fat upper tail) implies lower tax rates, and conversely

Example: if e=0.5 and a=2, t’* = 50%
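The Pareto case can be checked by plugging the stated functional forms into the (1-F)/yf ratio. This is a minimal sketch: the scale parameter k = 1 and the income levels are hypothetical, while a = 2 and e = 0.5 reuse the example above.

```python
def pareto_survival(y, k, a):
    """1 - F(y) = (k / y)**a, for y >= k."""
    return (k / y) ** a

def pareto_density(y, k, a):
    """f(y) = a * k**a / y**(1 + a), for y >= k."""
    return a * k ** a / y ** (1 + a)

def top_marginal_rate(a, e):
    """Asymptotic rate t'* = 1 / (1 + a * e)."""
    return 1.0 / (1.0 + a * e)

k, a, e = 1.0, 2.0, 0.5  # k is a hypothetical scale; a and e as in the example
for y in (2.0, 10.0, 100.0):
    ratio = pareto_survival(y, k, a) / (y * pareto_density(y, k, a))
    print(f"y={y}: (1-F)/yf={ratio:.4f}, 1/a={1.0 / a:.4f}")
print(f"t'* = {top_marginal_rate(a, e):.2f}")
```

Under an exact Pareto distribution the ratio equals 1/a at every income level above k, so the asymptotic rate applies throughout the tail.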

Note : key property of Pareto distributions: ratio average/threshold = constant

Denote by y*(y) the average income (or wealth, or wage) of the population above threshold y. Then y*(y) can be expressed as follows:

y*(y) = [ ∫_{z>y} z f(z)dz ] / [ ∫_{z>y} f(z)dz ] = [ ∫_{z>y} dz/z^a ] / [ ∫_{z>y} dz/z^(1+a) ] = ay/(a-1)

I.e.

y*(y)/y = b = a/(a-1)

(and a = b/(b-1) )
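The constant average/threshold ratio can be verified numerically. This is a minimal sketch, assuming an exact Pareto tail: above a threshold y, incomes can be written z = y u^(-1/a) with u uniform on (0,1) (inverse-CDF representation), and the conditional mean is approximated with a midpoint rule. The value a = 3, the thresholds and the function names are hypothetical.

```python
def b_from_a(a):
    """Inverted Pareto coefficient b = a / (a - 1), defined for a > 1."""
    return a / (a - 1.0)

def a_from_b(b):
    """Pareto coefficient a = b / (b - 1), defined for b > 1."""
    return b / (b - 1.0)

def mean_above_threshold(y, a, n=200_000):
    """Midpoint-rule estimate of E[z | z > y] for a Pareto tail with coefficient a."""
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n              # midpoint of the i-th cell of (0, 1)
        total += y * u ** (-1.0 / a)   # inverse CDF of the tail above y
    return total / n

a = 3.0  # hypothetical Pareto coefficient
for y in (1.0, 5.0, 50.0):
    ratio = mean_above_threshold(y, a) / y
    print(f"y={y}: average/threshold={ratio:.3f}, b=a/(a-1)={b_from_a(a):.3f}")
```

The estimated ratio does not depend on the threshold and should sit very slightly below b, since the midpoint rule underestimates the integrable singularity at u = 0.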

In practice :

For top incomes:

France today: b = 1.7-1.8 (a=2.2-2.3)

France interwar, US interwar, US today: b = 2.2-2.3 (a=1.7-1.8)

For top wealth:

France today: b = 2.2-2.3 (or 2-2.5)

Higher b coefficients = fatter upper-tail = higher concentration