**MPRA**

Munich Personal RePEc Archive

Measuring and decomposing the distance

to the Shapley wage function with

limited data

Victor Aguiar and Roland Pongou and Jean-Baptiste Tondji

University of Western Ontario, University of Ottawa, University of

Ottawa

30 August 2016

Online at https://mpra.ub.uni-muenchen.de/73606/

**MPRA** Paper No. 73606, posted 12 September 2016 08:21 UTC

Measuring and Decomposing the Distance to the

Shapley Wage Function with Limited Data ∗

Victor Aguiar † , Roland Pongou ‡ and Jean-Baptiste Tondji §

September 8, 2016

Abstract

We study the Shapley wage function, a wage scheme in which a worker’s pay depends both on the

number of hours worked and on the output of the firm. We then provide a way to measure the distance

of an arbitrary wage scheme to this function in limited datasets. In particular, for a fixed technology and

a given supply of labor, this distance is additively decomposable into violations of the classical axioms of

efficiency, equal treatment of identical workers, and marginality. The findings have testable implications

for the different ways in which popular wage schemes violate basic properties of distributive justice in

market organizations. Applications to the linear contract and to other well-known compensation schemes

are shown.

JEL: C71, C78, D20, D30, J30

Keywords: Shapley wage function, firm, fairness violations, linear contract, bargaining, limited data.

1 Introduction

We consider the classical problem of wage determination in a multi-worker firm. Shapley (1953) provides

an axiomatic solution to this problem, and determines the wage of each worker as his expected marginal

contribution to the output of the firm. The unique properties of this solution–efficiency, equal treatment

of identical workers, and marginality–also make it an important concept of distributive justice. 1 Moreover,

the Shapley value has found interesting applications in the literature on intra-firm bargaining, labor-search,

∗ We thank Greg Pavlov, Roberto Serrano and the participants of the Brown Theory Lunch for useful comments and encouragement.

† Department of Economics, University of Western Ontario, vaguiar@uwo.ca

‡ Department of Economics, Ottawa University, roland.pongou@uottawa.ca

§ Department of Economics, Ottawa University, jtond063@uottawa.ca

1 As a notion of distributive justice, the Shapley value has been widely studied (see, e.g., Moulin (1992)). Its appealing

properties have also been extended to several other economic environments (see, e.g., De Clippel and Serrano (2008)).


and contracts, following its derivation as the unique solution to a wage bargaining game between a firm and

its employees (Stole and Zwiebel (1996), Acemoglu and Hawkins (2014), Brügemann, Gautier, and Menzio

(2015)). Yet, despite the acknowledged theoretical appeal of this concept, it is recognized that the wage

schemes utilized in most real-life organizations (e.g., the piece-work scheme) depart from it. In this paper,

we provide a way to measure such a departure in limited datasets. In particular, we show that the distance of

an arbitrary wage scheme to the Shapley value is additively decomposable into the violations of its classical

axioms. The findings allow one to quantify the departures of any pay scheme from the testable implications

of the Shapley pay scheme and to account for the different ways in which well-known wage schemes violate

basic notions of distributive fairness in market organizations.

Our firm environment involves a finite set of workers who supply labor in discrete units (e.g., zero hour,

one hour, and so on) up to a maximum amount, and a production technology which maps each profile

of hours worked to an output. This environment is more flexible than the traditional transferable-utility

environment used by Shapley (1953) in that each worker may supply a different amount of labor. In this

more general setup, we define the Shapley wage function, a wage scheme in which a worker’s pay depends

both on the number of hours that he has worked and on the firm’s aggregate output. Importantly, the fact

that a worker’s pay also depends on the firm’s output means that it is partially determined by other workers’

inputs, which is an interesting property that reflects the complex complementarities and externalities among

workers in the production process. 2

As a preamble to the decomposition analysis set out below, we characterize axiomatically the Shapley

pay scheme in our firm environment, where, as mentioned above, the scheme determines the payoff to each

worker for a fixed production technology and for a given profile of hours worked. The first characterization

states that the Shapley value is the only wage scheme that satisfies the efficiency and marginality axioms

and that pays the same wage to identical workers. 3 The second characterization states that the Shapley

value is the only wage scheme that satisfies efficiency, the null-worker and additivity axioms, and that does

not discriminate between identical workers. 4 As is explained below, these two characterizations provide a

2 The structural nature of the Shapley wage function in our firm environment makes it possible to compare it with any other

pay scheme, in particular those that depend on the amount of labor supplied by the workers. Indeed, our environment subsumes

the classical transferable-utility environment in which it is assumed that workers have only two options (“work” and “not work”)

and that all workers work (see, e.g., Shapley (1953), Shapley and Shubik (1967)), thus keeping its conceptual basis intact while being more amenable to possible empirical work.

3 Efficiency means that the entire output is shared among the workers, implying that no portion of it is wasted. The

marginality axiom, due to Young (1985), means that a worker should earn more under a production technology that values his input

more. This axiom is related to the null-worker and the additivity axioms (see below). The null-worker axiom says that if a

worker’s input does not affect the firm’s output, then that worker should earn nothing. The additivity axiom means that,

following an additive technological improvement, a worker’s wage should only change to the extent to which the new technology

augments the value of his input.

4 As an important remark, we note that, although two workers might be identical or interchangeable in a firm environment,


basis for studying the formal relationships among the different violations of these appealing axioms by an

arbitrary wage scheme.

Our most significant contribution is to compare any arbitrary pay scheme to the Shapley pay scheme

when the observer only has limited data on the production environment. Given a fixed production function

and a profile of hours worked, we measure the distance of any arbitrary pay scheme to the Shapley value.

We decompose this distance into terms that measure the violations of the aforementioned classical axioms of

this value. It is interesting that this exercise shows how the violation of the marginality property is formally

related to the violations of the null-worker and additivity properties, thus further highlighting the correlation

or the dependence between these two axioms.

Importantly, in the decomposition analysis, we allow for an observer to have a priority order over the

different fairness axioms that characterize the Shapley pay scheme. Under efficiency, there are two possible

decompositions of the Shapley distance. 5 The first corresponds to the situation where the observer values

symmetry over marginality, and the second corresponds to the situation where the observer values marginality

over symmetry. We show that each of the two situations corresponds to a different evaluation of distributive

injustice. A natural question arises in regard to which decomposition is more meaningful. One answer is

that the observer may have some type of priority ordering as a basis for deciding which axiom is more

important. Alternatively, the observer could assign a weight to each priority order, and in this case, an average

decomposition would be preferable. We propose such a decomposition. All of these decompositions

provide a broad basis for the evaluation of pay fairness in the firm based on the observer’s tastes.

This analysis allows for the empirical examination of how popular pay schemes can violate basic notions

of distributive fairness in the firm. Indeed, given a particular type of technology, a profile of hours worked,

and a wage profile resulting from a fixed pay scheme, our analysis allows us to say whether Shapley fairness

is violated by this pay scheme, and points exactly to the axioms that are being violated. It also quantifies

the size of each violation based on tastes, thus allowing one to conclude whether a particular violation is

mild or severe.

A clear advantage of our framework is that it makes it possible to carry out the proposed empirical tests

with only limited data. For instance, even though the marginality and additivity axioms are stated using

two production functions, our framework allows us to test them only upon observing a pay scheme for a

fixed production function and for a given profile of hours worked. This is interesting because, in real-life, it

is very difficult to observe a firm’s output under two different technologies, or how a pay scheme behaves

under different production functions at the same level of labor supply.

if, at a given profile of hours worked, one worker supplies more labor than the other because of some exogenous reasons, then

our Shapley pay scheme will give a higher wage to the former. The classical Shapley value does not have such a flexibility in

wage determination.

5 Throughout the paper, the phrases “Shapley distance” and “distance to the Shapley value” are used interchangeably.


We develop several applications for our analysis. In particular, we examine the piece-work pay scheme,

which is also known as the linear contract. This scheme is appealing because it is by definition externality-free,

that is, the wage that is paid to a worker does not depend on the amount of labor supplied by the other

workers. Despite its appeal, we find that, in general, this scheme violates all of the axioms that characterize

the Shapley value. Nevertheless, we are able to determine the linear contract that is the closest possible to

the Shapley value. This latter scheme satisfies all of the axioms of the Shapley value, with the exception of

the efficiency property. But it retains its externality-free appeal. Finally, we study intra-firm bargaining in

the spirit of Stole and Zwiebel (1996), focusing on the effects of bargaining power on firm unfairness. For a

particular example, we find that firm bargaining power monotonically increases the violations of symmetry

and marginality.

To the best of our knowledge, no other work has analyzed and quantified departures from the Shapley

pay scheme. The existing work that most closely resembles our own is De Clippel and Rozen (2013), which

proposes a way to test the axioms of symmetry, null player, additivity and marginality under the assumption

of efficiency. In contrast to our approach, they suggest using a regression-based methodology and restrictions

over coefficients of such regressions for testing the different axioms. On the other hand, our approach can

be applied at the individual/subject level and is deterministic. We also provide a way of quantifying the

violations of pay fairness and identifying their different sources (according to the axioms that characterize

the Shapley pay scheme and according to an observer’s tastes or priority order over these axioms). The

decomposition of a goodness-of-fit measure into components that correspond to the violations of axioms

was first explored in Aguiar and Serrano (2015) in the context of consumer theory. We study a completely

different economic environment. We add to their idea that decomposable measures of departures from classical

concepts in economic theory provide a novel way of studying empirical counterparts of such concepts that

usually do not conform to the theory. We also hope to complement the classical works of Shapley (1953)

and Young (1985) by providing a way to compare any other pay scheme to the Shapley pay scheme in a

systematic way.

The rest of this paper is organized as follows. In section 2, we provide the preliminary definitions and

introduce the notion of a dataset in a production environment. In section 3, we extend the two classical

characterizations of the Shapley value to the Shapley wage function. In section 4, we make explicit the relationship

between our firm environment and the classical transferable-utility environment, showing that the

former subsumes the latter. In section 5, we propose a local test for the violation of the axioms characterizing

the Shapley pay scheme by any observed empirical pay scheme, and a decomposable measure of such

a violation. In section 6, we propose an extension of the analysis to the case of full datasets. In section 7,

we provide applications that illustrate the usefulness of our results. We conclude in section 8. All proofs are

collected in an appendix.


2 Preliminaries: Firm, Pay scheme, and Dataset

In this section, we introduce preliminary definitions. A firm is modeled as a list $F = (N, T, f)$ where $N = \{1, 2, \ldots, n\}$ is a non-empty finite set of workers/inputs of cardinality $n$; $T = \{0, 1, 2, \ldots, t\}$ is a non-empty finite set of actions that a worker can take, where 0 denotes a situation of inaction; and $f$ is a production function that maps each action profile $x = (x_1, \ldots, x_n) \in T^n$ to a real number output $f(x)$.

There are several interpretations of this model. First, the action set T can be interpreted as the set of

hours of labor or effort levels that a worker can supply. In this case, a worker can supply up to t hours of

labor. Here, the elements of T need not be natural numbers. Second, the set T can also be interpreted as

the set of job types that are available in the firm, where the jobs are not necessarily ranked. In this case, the

elements of T simply label the different job types in the firm. In general, all of our results hold regardless

of whether T is ordered or not. Moreover, the function f can also be interpreted as the profit or the cost

function. Interpreting it as the profit function might be useful in that it would be viewed as incorporating the

production and the cost functions. Regardless of the interpretation adopted, we assume that f(0, 0, ..., 0) = 0,

which means that if all the workers are inactive, there is no output.

Let $F = (N, T, f)$ be a firm and $S \in 2^N$ be a set of workers. We denote by $T^{|S|}$ the set of the possible vectors of effort levels for the workers in $S$. An element $x \in T^{|S|}$ can be written as $x = (x_1, \ldots, x_s)$, where $s = |S|$ is the number of workers in $S$ and where every $x_i \in T$ is the effort level supplied by the $i$th worker in $S$.

Throughout this paper, we denote by $e_i$ the $i$th unit vector $(0, 0, \ldots, 0, 1, 0, \ldots, 0)$, where all the entries are zero except the $i$th component, which is one. We will also use the symbols $\trianglerighteq$ and $\triangleright$, which we define as follows. Let $x, \bar{x} \in T^n$ be two effort profiles. We write $\bar{x} \trianglerighteq x$ to mean that $\bar{x}_i \neq x_i \Rightarrow \bar{x}_i = 0$, and we write $\bar{x} \triangleright x$ to mean that $\bar{x} \trianglerighteq x$ and $\bar{x} \neq x$. For example, $(1, 3, 5, 0, \ldots, 0) \triangleright (1, 3, 5, 1, 5, 0, \ldots, 0)$. We denote by $|x| = |\{i \in N : x_i > 0\}|$ the number of workers who are not inactive at $x$.
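The sub-profile relation can be made concrete with a short sketch. The helper names `active` and `sub_profiles` are illustrative and not part of the paper; workers are indexed from 0, and a positive entry is taken to mean that the worker is active.

```python
from itertools import combinations

def active(x):
    """Indices of the workers who supply a positive effort level at x."""
    return [i for i, xi in enumerate(x) if xi > 0]

def sub_profiles(x, strict=True):
    """Yield every profile obtained from x by making some active workers inactive.

    With strict=True the profile x itself is excluded (the strict relation).
    """
    x = tuple(x)
    act = active(x)
    for k in range(len(act) + 1):
        for keep in combinations(act, k):
            y = tuple(xi if i in keep else 0 for i, xi in enumerate(x))
            if strict and y == x:
                continue
            yield y

x = (1, 3, 0, 2)                     # hypothetical effort profile
print(len(active(x)))                # |x| = 3
print(len(list(sub_profiles(x))))    # 2**3 - 1 = 7 strict sub-profiles
```

Each sub-profile corresponds to one subset of the active workers, so a profile with $|x|$ active workers has $2^{|x|} - 1$ strict sub-profiles.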

A pay scheme for a firm F is a way to redistribute the output of the firm among the workers. In other

words, a pay scheme is a sharing or allocation rule. The formal definition of this concept is given below.

Definition 1. Let $F = (N, T, f)$ be a firm. A pay scheme for the firm $F$ is a function $\theta^f$ that maps any effort profile $x \in T^n$ to a non-null wage profile $\theta^f(x) = (\theta_1^f(x), \theta_2^f(x), \ldots, \theta_n^f(x))$, where, for all $i \in N$, $\theta_i^f(x) \in \mathbb{R}$ is interpreted as the wage earned by $i$ out of the output $f(x)$.

In the remainder of the paper, we fix the set of workers N and the action set T , so that a firm will be

completely characterized by a production function f. We now introduce the notion of a dataset and related

concepts.

An “observation” is a triple $(x, f, \theta^f(x))$ where $\theta$ is a pay scheme defined for any production function $f$ and for any effort profile $x$.

Let $\mathcal{F}$ be a non-empty set of production functions and $\mathcal{T}$ be a set of realized effort profiles. A “dataset” is a list of observations $D = (x, f, \theta^f(x))_{x \in \mathcal{T}, f \in \mathcal{F}}$. A “complete dataset” is a list of observations $D = (x, f, \theta^f(x))_{x \in \mathcal{T}, f \in \mathcal{F}}$ where $\mathcal{F}$ contains all possible production functions and where $\mathcal{T} = T^n$. A “limited dataset” is a list of observations $D = (x, f, \theta^f(x))_{x \in \mathcal{T}, f \in \mathcal{F}}$ where $\mathcal{F}$ consists of a unique production function $f$ and where $\mathcal{T}$ consists of a unique effort profile $x$ (i.e., the triple $(x, f, \theta^f(x))$ with a fixed production function and effort profile). In the context of a limited dataset, we may not have the details about how $\theta^f$ distributes $f(y)$ for other effort profiles $y \neq x$; we only know the realized profile of wages $\theta^f(x)$ at $x$. However, we have full information on $f$.
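One way to picture a limited dataset is as a single record that stores the whole production function but the wages at only one effort profile. The sketch below is illustrative: the class name `Observation`, its fields, and the example technology and wages are assumptions, not objects defined in the paper.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

@dataclass(frozen=True)
class Observation:
    """One observation: an effort profile, a production function, realized wages."""
    x: Tuple[int, ...]
    f: Callable[[Tuple[int, ...]], float]
    wages: Tuple[float, ...]

    def undistributed_output(self) -> float:
        """Output f(x) minus the total wage bill observed at x."""
        return self.f(self.x) - sum(self.wages)

# Hypothetical limited dataset: one production function, one effort profile.
obs = Observation(x=(1, 2, 0), f=lambda x: float(sum(x)) ** 2, wages=(2.0, 4.0, 0.0))
print(obs.undistributed_output())   # 9.0 - 6.0 = 3.0
print(obs.f((0, 2, 0)))             # f is known at every profile: 4.0
```

The record can evaluate `f` at any profile, which reflects full information on the technology, while the wages are known only at the realized profile.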

3 The Shapley Wage Function for the Firm

In this section, we define and characterize the Shapley wage function for the firm, which provides a basis for

analyzing the different ways in which an arbitrary pay scheme might violate the principle of fairness in wage

determination, even with a limited dataset. Shapley (1953) provides an axiomatic solution to the problem

of wage determination in a multi-worker firm. The production environment considered by Shapley (1953) is

defined by a transferable-utility function. As we show in section 4, this classical environment is much less

general than our model of a firm. We first extend the classical axioms that characterize the Shapley value to

our environment, and show that they uniquely characterize our new value, which we call the Shapley wage

function. More precisely, we provide two axiomatic characterizations of this function. The first states that

the Shapley wage function is the only function that satisfies the requirements of efficiency, equal treatment

of identical workers (symmetry or non-discrimination), and marginality. The second states that the Shapley

wage function is the only function that satisfies the symmetry, efficiency, and additivity requirements, as well

as the null-worker axiom.

In order to define these axioms in our firm environment, the following definitions are needed.

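Definition 2. Let $i, j \in N$ be two workers, $x$ be an effort profile, and $f$ be a production function.

1. Worker $i$ is a null-worker at $(x, f)$ if for any $\bar{x} \in T^n$ such that $\bar{x} \triangleright x$ and $\bar{x}_i = 0$, $mc(i, f, \bar{x}, x) = 0$, where $mc(i, f, \bar{x}, x) = f(\bar{x} + x_i e_i) - f(\bar{x})$ denotes the marginal contribution of worker $i$.

2. Workers $i$ and $j$ are said to be symmetrical or identical at $(x, f)$ if for all $\bar{x} \in T^n$ such that $\bar{x} \triangleright x$ and $\bar{x}_i = \bar{x}_j = 0$, $mc(i, f, \bar{x}, x) = mc(j, f, \bar{x}, x)$.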
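Both conditions in Definition 2 can be checked by brute force when the number of active workers is small. The following is a minimal sketch under stated assumptions: the function names, the tolerance parameter `tol`, and the example technology are illustrative, and workers are indexed from 0.

```python
from itertools import combinations

def strict_sub_profiles(x):
    """Profiles obtained from x by making at least one active worker inactive."""
    x = tuple(x)
    act = [i for i, xi in enumerate(x) if xi > 0]
    for k in range(len(act)):                      # keep fewer than |x| workers
        for keep in combinations(act, k):
            yield tuple(xi if i in keep else 0 for i, xi in enumerate(x))

def mc(i, f, sub, x):
    """Marginal contribution of worker i joining `sub` at effort level x[i]."""
    joined = tuple(x[i] if j == i else s for j, s in enumerate(sub))
    return f(joined) - f(sub)

def is_null_worker(i, f, x, tol=1e-12):
    return all(abs(mc(i, f, s, x)) <= tol
               for s in strict_sub_profiles(x) if s[i] == 0)

def are_symmetric(i, j, f, x, tol=1e-12):
    return all(abs(mc(i, f, s, x) - mc(j, f, s, x)) <= tol
               for s in strict_sub_profiles(x) if s[i] == 0 and s[j] == 0)

f = lambda x: (x[0] + x[1]) ** 2    # hypothetical technology; worker 2 never matters
x = (2, 2, 1)
print(is_null_worker(2, f, x))      # True
print(are_symmetric(0, 1, f, x))    # True
print(are_symmetric(0, 2, f, x))    # False
```

Note that symmetry is evaluated at the given profile: under the same technology, workers 0 and 1 would no longer be identical at `(1, 2, 1)`, because they supply different effort levels there.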

We now define the axioms.

Axiom 1. (Equal treatment or non-discrimination)

A pay scheme $\theta$ satisfies the property of equal-treatment or non-discrimination if for any effort profile $x \in T^n$, any production function $f$, and any workers $i$ and $j$ that are symmetrical at $(x, f)$, $\theta_i^f(x) = \theta_j^f(x)$.


Axiom 2. (Efficiency)

A pay scheme $\theta$ is efficient if for any production function $f$ and any effort profile $x \in T^n$, $\sum_{i \in N} \theta_i^f(x) = f(x)$.

Axiom 3. (Marginality)

A pay scheme $\theta$ is marginal if for any production functions $f$ and $g$, any worker $i \in N$ and any effort profiles $x$ and $\bar{x}$ such that $\bar{x} \triangleright x$ with $\bar{x}_i = 0$, $[f(\bar{x} + x_i e_i) - f(\bar{x}) \geq g(\bar{x} + x_i e_i) - g(\bar{x})] \Rightarrow [\theta_i^f(x) \geq \theta_i^g(x)]$.

Axiom 4. (Null worker property)

A pay scheme $\theta$ satisfies the property of null-worker if for any production function $f$, any effort profile $x \in T^n$, and any null-worker $i \in N$ at $(x, f)$, $\theta_i^f(x) = 0$.

Axiom 5. (Additivity)

A pay scheme $\theta$ is additive if for any production functions $f$ and $g$ and any fixed effort profile $x \in T^n$, $\theta^{f+g}(x) = \theta^f(x) + \theta^g(x)$.

These axioms require little justification. The equal-treatment axiom is a no-discrimination condition

(horizontal equality) that requires that workers who make the same marginal contribution at an effort profile

x and a production function f receive the same pay. Efficiency requires that the output of the firm be fully

shared among the various contributors. It can also be thought of in terms of Pareto optimality because if

an allocation is feasible but not efficient, it cannot be Pareto optimal under very general conditions (on the

workers’ tastes). Marginality means that a worker’s pay should be greater under a production technology

that places a higher value on his input. This is a very appealing property because it requires that the wage

of a worker depends only on his marginal contribution given other workers’ inputs. The null worker property

requires that those who do not contribute marginally should not receive any part of the realized output.

Finally, additivity could be interpreted in terms of technological improvement. The additivity axiom means

that, following an additive technological improvement, a worker’s wage should only change by the extent to

which the new technology augments the value of his input.

Despite the appeal of these axioms, it should be noted that testing axioms that are defined using two

production functions such as marginality and additivity requires access to a complete dataset; in other words,

it requires that all the possible production functions be observed. This is not possible in a real-world setting,

as we only have access to a limited dataset. A distinctive feature of our work is that we are able to quantify

departures of any pay scheme from these axioms in limited datasets, which also means that our analysis has

testable implications.

The results set out hereunder establish the necessary and sufficient axioms that characterize the Shapley

wage function (defined by equation (1) below).

Theorem 1. Let $F = (N, T, f)$ be a firm. There exists a unique pay scheme, denoted $\phi^f$, that satisfies the efficiency, equal-treatment, and marginality requirements, and it is given by:

$$\phi_i^f(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{(|\bar{x}|)!\,(|x| - |\bar{x}| - 1)!}{(|x|)!}\,[f(\bar{x} + x_i e_i) - f(\bar{x})], \quad \text{for all } i \in N. \tag{1}$$

Theorem 2. Let F = (N, T, f) be a firm. The pay scheme ϕ f defined by (1) is the unique pay scheme that

satisfies efficiency, the null-worker property, equal-treatment, and additivity.
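Formula (1) can be evaluated directly by enumerating, for each active worker, the sub-profiles in which that worker is inactive. The sketch below is an illustration rather than the authors' code: the function name `shapley_wages` and the example technology are assumptions, and workers are indexed from 0.

```python
from itertools import combinations
from math import factorial

def shapley_wages(f, x):
    """Shapley wage profile at effort profile x, computed from formula (1)."""
    x = tuple(x)
    n_active = sum(1 for xi in x if xi > 0)
    wages = []
    for i, xi in enumerate(x):
        if xi == 0:
            wages.append(0.0)   # an inactive worker has zero marginal contribution
            continue
        others = [j for j, xj in enumerate(x) if xj > 0 and j != i]
        total = 0.0
        for k in range(len(others) + 1):
            # weight k! (|x| - k - 1)! / |x|! for sub-profiles with k active workers
            weight = factorial(k) * factorial(n_active - k - 1) / factorial(n_active)
            for keep in combinations(others, k):
                base = tuple(xj if j in keep else 0 for j, xj in enumerate(x))
                joined = tuple(xi if j == i else bj for j, bj in enumerate(base))
                total += weight * (f(joined) - f(base))
        wages.append(total)
    return wages

f = lambda x: sum(x) ** 2                 # hypothetical technology with f(0,...,0) = 0
print(shapley_wages(f, (1, 2, 0)))        # [3.0, 6.0, 0.0]
```

In this example the wages sum to $f(1, 2, 0) = 9$, as efficiency requires, and the worker who supplies more hours earns more. The enumeration grows exponentially in the number of active workers, so the sketch is only practical for small firms.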

For clarity, the proofs of these and all the subsequent results are provided in the appendix. In order

to understand the Shapley wage function, one should recall that for any effort profile $x \in T^n$, $|x| = |\{i \in N : x_i > 0\}|$ is the number of active workers at $x$ (i.e., workers supplying a positive effort level in the vector $x$). We assume that workers enter the production process in a random order and that all of the $(|x|)!$ orderings of the workers supplying a positive level of effort are equally likely. Suppose that the vector $x$ consists of clusters occupied by individuals. Starting at a vector $\bar{x} \trianglerighteq x$ or $\bar{x} \in T^{|S|}$, where $S = \{i \in N : \bar{x}_i > 0 \text{ and } \bar{x}_i = x_i\}$, when a worker $i$ who is initially inactive ($\bar{x}_i = 0$) enters the firm to find the workers in $S$, he chooses his effort level $x_i$, which means that he is assigned to the $i$th cluster in the vector $x$. It follows that the fraction $\frac{(|\bar{x}|)!\,(|x| - |\bar{x}| - 1)!}{(|x|)!}$ represents the probability that a given worker $i$, with $\bar{x}_i = 0$, joins a subgroup $S$. When a worker $i$ joins the other workers who have already chosen their respective effort levels according to vector $\bar{x}$ (i.e., $\bar{x}$ is set with $\bar{x}_i = 0$), the new vector is $\bar{x} + x_i e_i$ and the firm’s outcome is $f(\bar{x} + x_i e_i)$; thus the marginal contribution of worker $i$ is $f(\bar{x} + x_i e_i) - f(\bar{x})$. The value $\phi_i^f$ is the expected marginal contribution of worker $i$ in the formation of the effort profile $x$. This value is the “structural” form of the Shapley value, whereas the classical Shapley value defined for transferable-utility games can be seen as the reduced form where the level of effort and the production function are subsumed within the characteristic function. Note that any transferable-utility game can be represented in the firm environment and that, conversely, for any fixed effort profile, we can represent the production process as a transferable-utility game. 6 However, our more general environment is crucial for allowing comparison of the Shapley wage function with other pay schemes, in particular with those schemes that depend on the level of effort exerted by workers and not just on the level of output. Examples of such schemes include the linear pay scheme, which cannot be defined in a transferable-utility game because it is a function of the effort exerted by the workers. Throughout this paper, we denote $\frac{(|\bar{x}|)!\,(|x| - |\bar{x}| - 1)!}{(|x|)!}$ by $\phi(\bar{x}, x)$ and the marginal contribution $f(\bar{x} + x_i e_i) - f(\bar{x})$ by $mc(i, f, \bar{x}, x)$.
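As a worked illustration of the expected-marginal-contribution reading, consider a hypothetical firm with two workers who are both active at $x = (x_1, x_2)$, so that $|x| = 2$. For worker 1, the only sub-profiles in which he is inactive are $(0, 0)$ and $(0, x_2)$, and each receives weight $0!\,1!/2! = 1!\,0!/2! = 1/2$; the same holds for worker 2 by symmetry of the argument. Using the assumption $f(0, 0) = 0$, the wages are:

```latex
\begin{align*}
\phi_1^f(x) &= \tfrac{1}{2}\,[f(x_1, 0) - f(0, 0)] + \tfrac{1}{2}\,[f(x_1, x_2) - f(0, x_2)], \\
\phi_2^f(x) &= \tfrac{1}{2}\,[f(0, x_2) - f(0, 0)] + \tfrac{1}{2}\,[f(x_1, x_2) - f(x_1, 0)], \\
\phi_1^f(x) + \phi_2^f(x) &= f(x_1, x_2) - f(0, 0) = f(x_1, x_2).
\end{align*}
```

The last line shows efficiency in this two-worker case: the stand-alone terms $f(x_1, 0)$ and $f(0, x_2)$ cancel across the two wages, so the entire output is distributed.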

One aim of this paper is the comparison of any allocation rule $\theta$ with the Shapley wage function $\phi$ in limited datasets. We denote the Euclidean norm defined in $\mathbb{R}^n$ by $\|\cdot\|$, and for any fixed effort profile $x$, the distance between the allocation $\theta^f(x) \in \mathbb{R}^n$ (i.e., the real wage profile at $x$ under the pay scheme $\theta$) and the Shapley wage profile $\phi^f(x) \in \mathbb{R}^n$ at $x$ is given by $\|\theta^f(x) - \phi^f(x)\|$. As demonstrated below, the

square of this distance can be decomposed into terms that are connected with violations of the classical

6 This is shown formally in section 4.


axioms that characterize the Shapley value (equal-treatment of identical workers, efficiency, and marginality

properties). Moreover we prove that in finite datasets, these terms can be used to make partial inferences

about the violations of the axioms defined for complete datasets and complete inference about the violations

of the axioms defined for a fixed production function. This is of interest because the observer usually does

not have information about a pay scheme under different technologies (production functions), thus making

it impossible to check the validity of the axioms that require comparisons between different technologies.

The characterizations of the Shapley wage function set out above prepare us for our main task, which

is the decomposition of the Euclidean distance to the Shapley sharing rule of any pay scheme in a limited

dataset. A simple but powerful fact is that for limited datasets, whenever the distance of any pay scheme to

the Shapley pay scheme is positive, the pay scheme for full data fails to satisfy at least one of the axioms

that we have provided in the characterization theorems. Of course, our remaining task is to decompose the

square of the distance to the Shapley pay scheme into components that test each of the axioms that we have

presented in this section.


Before proceeding however, we link the previous two characterizations by means of the lemma described

hereunder.

Lemma 1. The marginality property implies the null-worker property.

Evidently we can also infer that efficiency, equal-treatment and marginality taken together imply additivity.

Lemma 1 is important because it provides a way to test marginality in a limited dataset environment.

The reason is that, if a pay scheme fails the null-worker property then it must also fail the marginality

property.

4 Relation to the Transferable Utility Environment

Our firm environment is related to the traditional transferable utility environment in which the classical

Shapley value has found interesting applications (see, e.g., Shapley and Shubik (1967), Roth (1977)). Indeed,

we show that the firm environment subsumes this game environment. The lemma below summarizes this

relation:

Lemma 2. Let $(N, G)$, $G : 2^N \to \mathbb{R}$, be a transferable-utility game. The game $(N, G)$ can be represented by a firm $F = (N, T, f)$. For any set of workers $N$, any fixed effort profile $x$, and any production function $f$, there is a transferable-utility game $(N, G_x^f)$ such that $G_x^f(S) = f(x_S)$ where $x_S$ is defined as $x_{S,i} = 0$ for all $i \in N \setminus S$ and $x_{S,i} = x_i$ for all $i \in S$.

The Shapley wage function for the firm can be equivalently defined for any fixed effort profile $x \in T^n$ as the Shapley value of the corresponding transferable-utility game $G_x^f$: $\phi^f(x) := \phi^{TU}(N, G_x^f)$ where $\phi^{TU}(N, G_x^f)$ is the Shapley value of game $(N, G_x^f)$.
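The equivalence can be illustrated by building the induced game from a production function and a fixed effort profile, and then computing the classical Shapley value as an average of marginal contributions over all orderings of the workers. This is a sketch under stated assumptions: the names `induced_game` and `shapley_value_tu` and the example technology are illustrative, and workers are indexed from 0.

```python
from itertools import permutations

def induced_game(f, x):
    """Return the characteristic function G(S) = f(x_S), where x_S sets the
    effort of every worker outside the coalition S to zero."""
    x = tuple(x)
    def G(S):
        return f(tuple(xi if i in S else 0 for i, xi in enumerate(x)))
    return G

def shapley_value_tu(n, G):
    """Classical Shapley value: average marginal contribution over all orderings."""
    value = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        coalition = set()
        for i in order:
            before = G(frozenset(coalition))
            coalition.add(i)
            value[i] += G(frozenset(coalition)) - before
    return [v / len(orders) for v in value]

f = lambda x: sum(x) ** 2               # hypothetical technology
G = induced_game(f, (1, 2, 0))
print(shapley_value_tu(3, G))           # [3.0, 6.0, 0.0]
```

For this hypothetical technology, evaluating formula (1) by hand at the profile $(1, 2, 0)$ gives the same wages $(3, 6, 0)$. The inactive third worker is a null player in the induced game, so including him in the orderings does not change the other workers' values.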

Our environment is structural in the sense that it separates explicitly the level of effort from the output

level. Usually, the observer has access to information on the effort level and on the production function, and

thus our environment is more amenable to empirical work. Furthermore, the traditional transferable-utility

game environment cannot represent sharing schemes that depend only on effort such as the linear pay scheme

where people receive an hourly rate for each unit of effort (worked hour).
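A small sketch illustrates why a scheme of this kind needs the effort profile as an input. The hourly rate and the two production functions below are hypothetical, and the helper names are illustrative.

```python
def linear_pay(x, rate):
    """Wage profile under a linear contract: a fixed rate per unit of effort."""
    return [rate * xi for xi in x]

def efficiency_gap(f, x, wages):
    """Output at x that is left undistributed (negative if wages exceed output)."""
    return f(x) - sum(wages)

x = (1, 2, 0)
wages = linear_pay(x, rate=2.0)
f = lambda x: float(sum(x)) ** 2     # hypothetical technology
g = lambda x: float(sum(x))          # a second hypothetical technology

print(wages)                         # [2.0, 4.0, 0.0]
print(efficiency_gap(f, x, wages))   # 3.0
print(efficiency_gap(g, x, wages))   # -3.0
```

The wages are identical under both technologies because they are computed from hours alone, so the same contract can leave output undistributed under one technology and overspend it under another.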

5 A Decomposition of the Distance to the Shapley Pay Scheme

for Limited Datasets

We now examine how any observed pay scheme differs from the Shapley wage function by means of the

Euclidean distance between the two schemes evaluated at a fixed effort profile for a given production function.

More importantly, we show that this distance is additively decomposable into violations of each

of the properties that characterize the Shapley pay scheme. This approach is analogous to that of Aguiar

and Serrano (2015) who study departures of a demand function from rationality. Despite the similarity of

our respective approaches, in this paper, we are tackling a completely new question in a different economic

environment.

Formally, assume that we observe a pay scheme θ. We want to measure its Euclidean distance to the

Shapley pay scheme ϕ and to decompose this distance into three components measuring violations of symmetry

(sym), efficiency (eff), and marginality (mrg) requirements, respectively, when these measures are

determined sequentially in the aforementioned order. Indeed, we first find the closest pay scheme that satisfies

sym; then we find the closest pay scheme that satisfies eff in addition to sym; and finally we find

the closest pay scheme that satisfies mrg in addition to sym and eff, which is simply the Shapley pay

scheme. The order in which we impose these constraints is justified if an observer values certain principles of

fairness more than others. In particular, our order of imposition assumes that the observer has a preference

relation, denoted $\succ_o$, over the principles of Shapley fairness. We assume these preferences are captured by the ordering: 7

$$\text{sym} \succ_o \text{eff} \succ_o \text{mrg};$$

7 In the next section, we consider other alternatives.


this is a normative judgment that assigns precedence to “horizontal equality” (or the “equal pay for equal work” principle) over all other principles, that assigns Pareto efficiency secondary priority, and that assigns the marginality principle tertiary priority. Note that this is not the only way to decompose the Shapley distance, and that variations, whereby we alter the observer’s priority order, will be explored in the sequel. The particular decomposition with which we begin is meaningful, as each component measures a quantity of

economic interest as explained hereunder.

We start by fixing a pair consisting of an effort profile and a production function (x, f) and consider the

distance of θ to the Shapley pay scheme ϕ at this point, which we denote by:

||e sh || = ||θ f (x) − ϕ f (x)||,

where || · || is the Euclidean distance in R n .

Let $v^{sym}$ be the best approximation (pointwise under the chosen norm) to any observed pay scheme θ that satisfies the symmetry axiom. We prove that each entry evaluated at f(x), $v^{sym}_i(f(x))$, corresponds to the average pay under $\theta^f(x)$ among the workers who are symmetrical or identical to i under f(x). We then establish that any pay scheme θ can be written uniquely as the sum of its symmetric part $v^{sym}$ and a residual $e^{sym}$ that is orthogonal to $v^{sym}$ under the Euclidean inner product:

$\theta = v^{sym} + e^{sym}$.

In a similar way, let $v^{sym,eff}$ be the efficient pay scheme that is pointwise closest to the symmetric pay scheme $v^{sym}$. We prove that $v^{sym,eff}_i$ is given by the sum of $v^{sym}_i$ and the output wasted by $\theta^f(x)$ divided by the number of workers n. Again, we show that we can write uniquely:

$v^{sym} = v^{sym,eff} + e^{eff}$,

where $e^{eff}$ is the negative of the output wasted by $\theta^f(x)$ divided by the number of workers n.

Finally, we exploit the fact that the pay scheme satisfying marginality property that is pointwise closest to

the symmetric and efficient pay scheme v sym,eff , which we denote by v sym,eff,mrg , must be the Shapley value

because of the uniqueness established in Theorem 1. Thus v sym,eff,mrg = ϕ. We let e mrg = v sym,eff − ϕ.

Notice that we can always decompose (pointwise):

θ f (x) = ϕ f (x) + e sh (f(x)),

because θ f (x) and ϕ f (x) belong to the same vector space. With this preview in hand, we establish the main

result of this section.

Theorem 3. For any given observation (θ f (x), f, x), we have the unique pointwise decomposition:

$\theta^f(x) = \phi^f(x) + e^{sym}(f(x)) + e^{eff}(f(x)) + e^{mrg}(f(x))$.

Moreover, the squared distance to the Shapley pay scheme can be uniquely decomposed as:

$\|e^{sh}\|^2 = \|e^{sym}\|^2 + \|e^{eff}\|^2 + \|e^{mrg}\|^2$,

into its symmetry, efficiency, and marginality departures. Moreover:

(i) if ||e sh || > 0, then either equal-treatment, efficiency, or marginality fails;

(ii) if ||e sym || > 0, then equal-treatment fails;

(iii) if ||e eff || > 0, then efficiency fails; and,

(iv) if ||e mrg || > 0 and θ satisfies efficiency and equal treatment, then marginality fails.

The proposed decomposition of the Shapley distance that we just derived has economic meaning described

hereunder:

a) $\|e^{sym}\|^2 = \sum_{i \in N} [\theta_i(f(x)) - v^{sym}_i(f(x))]^2$, where for any worker i, $v^{sym}_i(f(x))$ is the average wage within the class [i] of workers who are equivalent to i. This means that $\|e^{sym}\|^2$ is a dispersion measure within the equivalence classes of workers. In other words, this quantity measures horizontal inequality, which is the inequality among workers who are identical.

b) $\|e^{eff}\|^2 = E^2/n$, where $E = f(x) - \sum_{i \in N} \theta^f_i(x)$ is the total waste produced by the pay scheme. This means that $\|e^{eff}\|^2$ increases solely due to the lack of Pareto efficiency.

c) $\|e^{mrg}\|^2 = \sum_{i \in N} [v^{sym,eff}_i(f(x)) - \phi_i(f(x))]^2$, where $v^{sym,eff}(f(x))$ is the symmetrized and efficient pay scheme that is closest under the Euclidean norm to the original pay scheme θ(f(x)). This means that $\|e^{mrg}\|^2$ is a measure of departures from the marginality principle conditional on fulfilling horizontal equality and efficiency.
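The three-step construction behind Theorem 3 can be sketched numerically for a single observation. The snippet below is only an illustration under stated assumptions: the function name `shapley_distance_decomposition` is hypothetical, the Shapley pay vector `phi` and the partition `classes` of identical workers are assumed to be supplied by the analyst, and NumPy is used for the arithmetic.

```python
import numpy as np

def shapley_distance_decomposition(theta, phi, output, classes):
    """Split theta - phi into symmetry, efficiency and marginality residuals.

    theta   : observed pay vector at the fixed (f, x)
    phi     : Shapley pay vector at the same (f, x), assumed already known
    output  : f(x), the output to be shared
    classes : partition of worker indices into groups of identical workers
              (an input assumed here, not computed from f)
    """
    theta = np.asarray(theta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    n = theta.size

    # v_sym: each worker receives the average pay of his equivalence class.
    v_sym = np.empty(n)
    for group in classes:
        v_sym[group] = theta[group].mean()

    # v_sym_eff: add an equal share of the wasted output E = f(x) - sum(theta).
    waste = output - theta.sum()
    v_sym_eff = v_sym + waste / n

    e_sym = theta - v_sym        # horizontal inequality
    e_eff = v_sym - v_sym_eff    # equals -E/n for every worker
    e_mrg = v_sym_eff - phi      # remaining gap to the Shapley pay
    return e_sym, e_eff, e_mrg
```

With a pay vector (3/4, 1/4), Shapley pay (1/2, 1/2), output 1 and both workers identical, the sketch attributes the whole gap to the symmetry residual, and the three squared norms add up to the squared distance whenever `phi` is itself symmetric and efficient.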

In order to prove Theorem 3, we need some preliminary lemmas that have an interest of their own.

Lemma 3. The best approximation of any pay scheme θ within the set of pay schemes that satisfy symmetry is given by $v^{sym}$, which is a pay scheme that gives the average pay of a group of symmetrical workers to each of the workers: $v^{sym}_i(f(x)) = \frac{1}{|[i]|} \sum_{j \in [i]} \theta^f_j(x)$, where [i] = {j ∈ N : j is symmetric to i in f(x)}.

Now, we present the solution to the efficiency best approximation. (Its proof is trivial and thus is omitted.)

Lemma 4. The best approximation to any pay scheme θ, and in particular to the symmetric pay scheme $v^{sym}$, satisfying symmetry and efficiency is given by $v^{sym,eff}$, which is a pay scheme that gives each worker i his payoff under $v^{sym}$ plus the wasted output shared equally among all the workers: $v^{sym,eff}_i(f(x)) = v^{sym}_i(f(x)) + [f(x) - \sum_{i \in N} \theta^f_i(x)]/n$.

We also need to prove a mathematical lemma that will be crucial to prove our decomposition result. We

first define a skew symmetric pay scheme.

Definition 3. A skew symmetric pay scheme is such that for an equivalence class defined by $i \sim_{sym} j$, where i and j are identical workers in f(x), we have:

$\sum_{j \in [i]} v_j(f(x)) = 0$, with [i] = {j ∈ N : j is symmetric to i in f(x)}.

Notice that when there are only two workers who are identical, say i ∼ sym j, we have the usual notion

of skew symmetry in that v i = −v j . Moreover, for the case of a unique worker k to whom no other worker

is identical, we have v k = 0.

Now, we are ready to prove the following property of skew symmetric pay schemes.

Lemma 5. Any skew symmetric pay scheme is orthogonal to any symmetric pay scheme.
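One way to see why Lemma 5 should hold (an informal sketch, not the appendix proof) is to group the inner product by equivalence classes. The symbols $w$ for a skew symmetric scheme, $v$ for a symmetric scheme, and $c_{[i]}$ for the common value of $v$ on the class $[i]$ are notation assumed here for illustration only.

```latex
\langle w, v \rangle
  = \sum_{[i]} \sum_{j \in [i]} w_j \, v_j
  = \sum_{[i]} c_{[i]} \sum_{j \in [i]} w_j
  = \sum_{[i]} c_{[i]} \cdot 0
  = 0 .
```

The second equality uses symmetry of $v$ (it is constant on each class), and the third uses Definition 3 (the entries of $w$ sum to zero on each class).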

We are now ready to prove Theorem 3. Here we provide only a sketch; the full proof is set out in the appendix. To establish the decomposition, we prove that the residuals e sym , e eff , e mrg are mutually orthogonal. To establish the “moreover” statements, we exploit the implications of θ satisfying the different axioms and the uniqueness of the Shapley pay scheme established in Theorem 1.

The previous decomposition is a simple way to test the different axioms for any observed pay scheme θ even if we have only finite datasets. More importantly, it allows us to quantify the size of such departures. On the other hand, it is not very conclusive for the marginality axiom. For that reason, we exploit the second characterization of the Shapley value, which uses additivity and the null-worker property (in addition to efficiency and symmetry), together with the result that the marginality property implies the null-worker property, to improve upon the proposed local measure of violations of marginality. Here, we first impose the null-worker property, and then the additivity axiom. Once again the observer has a preference:

null ≻ o add,

that captures the idea that the null-worker principle is more important to the observer’s idea of fairness than

the additivity principle.

We denote by $v^{sym,eff,null}$ the closest pointwise approximation of the symmetric and efficient pay scheme $v^{sym,eff}$ that satisfies the null-worker property.

Lemma 6. The best approximation to any pay scheme θ, and in particular to the symmetric pay scheme $v^{sym,eff}$, satisfying the equal-treatment, efficiency, and null-worker axioms is given by $v^{sym,eff,null}$; it is the sum of $v^{sym,eff}$ and the rule $e^{null}_i(f(x))$ that extracts all of the payoffs from the null players and shares it equally among all of the remaining workers, while giving zero to the null workers. Formally:

$e^{null}_i(f(x)) = -\frac{\sum_{k \in \mathcal{N}} v^{sym,eff}_k(f(x))}{n - |\mathcal{N}|}$ for $i \in N \setminus \mathcal{N}$,

and

$e^{null}_k = -v^{sym,eff}_k$ for $k \in \mathcal{N}$,

with $\mathcal{N}$ being the set of null workers in f(x). Thus,

(i) $v^{sym,eff,null}_i = v^{sym,eff}_i + \frac{\sum_{k \in \mathcal{N}} v^{sym,eff}_k(f(x))}{n - |\mathcal{N}|}$ for all $i \in N \setminus \mathcal{N}$; and,

(ii) $v^{sym,eff,null}_k = 0$ for all $k \in \mathcal{N}$.
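Parts (i) and (ii) of Lemma 6 describe a simple redistribution rule, which can be sketched as follows. The function name `impose_null_worker` and the representation of the null workers as a list of indices are assumptions made for this illustration; identifying which workers are null is taken as given.

```python
from fractions import Fraction

def impose_null_worker(v_sym_eff, null_workers):
    """Give zero to null workers and share their pay equally among the rest.

    v_sym_eff    : symmetric and efficient pay vector (list of numbers)
    null_workers : indices of the null workers (assumed known to the analyst)
    """
    n = len(v_sym_eff)
    null_set = set(null_workers)
    if len(null_set) == n:
        raise ValueError("at least one worker must be non-null")
    # Total pay currently received by the null workers.
    extracted = sum(v_sym_eff[k] for k in null_set)
    # Equal share handed to each of the n - |null set| remaining workers.
    share = extracted / (n - len(null_set))
    return [0 if i in null_set else v_sym_eff[i] + share for i in range(n)]

# Three workers, the third one null and paid 1/5 under v_sym_eff.
pay = [Fraction(2, 5), Fraction(2, 5), Fraction(1, 5)]
adjusted = impose_null_worker(pay, [2])
```

Because the extracted pay is only moved between workers, the total paid out is unchanged, so efficiency of the input is preserved by this step.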

We also notice that v sym,eff,null,add is the Shapley value ϕ, because of Theorem 2, and we define the

residual e add = v sym,eff,null − ϕ.

We establish this second decomposition theorem which allows us to test and quantify in a more accurate

fashion the departures from marginality requirement through the departure from the null-worker property.

Theorem 4. For any given observation (θ f (x), f, x), we have the unique pointwise decomposition:

$\theta^f(x) = \phi^f(x) + e^{sym}(f(x)) + e^{eff}(f(x)) + e^{null}(f(x)) + e^{add}(f(x))$.

Moreover, the squared distance to the Shapley pay scheme can be uniquely decomposed as:

$\|e^{sh}\|^2 = \|e^{sym}\|^2 + \|e^{eff}\|^2 + \|e^{add}\|^2 + \|e^{null}\|^2 + 2\langle e^{add}, e^{null} \rangle$,

into its symmetry, efficiency, null-worker, and additivity departures (with $\|e^{mrg}\|^2 = \|e^{add}\|^2 + \|e^{null}\|^2 + 2\langle e^{add}, e^{null} \rangle$, and $\langle e^{add}, e^{null} \rangle \neq 0$ in general). Moreover,

(i) If ||e null || > 0 and θ satisfies efficiency, then the null-worker and marginality properties fail; and,

(ii) If ||e add || > 0 and θ is symmetric, efficient and satisfies the null-worker property, then additivity fails.

The previous theorems establish tractable and easy ways to understand measures of departures from the

properties of the Shapley pay scheme. More importantly, they work for limited datasets, which is a realistic

situation, in the sense that the observer may not observe the behavior of an allocation rule θ over all possible

technologies for a given effort profile.

We finish this section by presenting a result that establishes ||e mrg || as a bona fide error of marginality. We first introduce some notation, and then we define a “random value” and a constant that we call the “marginality upper bound”.

We denote by R(N) the set of all possible linear orderings defined on N, and by ∆(R(N)) the simplex of probabilities defined over it, with typical element γ ∈ ∆(R(N)). Let f be a production function, x be an effort profile, and r ∈ R(N) be a given order of workers; we define the vector $x(r^i)$ componentwise as follows:

$x_j(r^i) = \begin{cases} x_j & \text{if } r(j) < r(i) \\ 0 & \text{if } r(j) > r(i) \\ 0 & \text{if } i = j, \end{cases}$

and we denote by $mc(i, f, r(x), x) = f(x(r^i) + x_i e_i) - f(x(r^i))$ the marginal contribution of a worker i ∈ N in the vector $[x(r^i) + x_i e_i]$.

Definition 4. (Random value) An allocation θ is a random value if it admits a representation:

$\theta^f_i(x) = \sum_{r \in R(N)} \gamma(r) \times mc(i, f, r(x), x)$,

for any f and fixed effort profile x.

Under the efficiency axiom, the marginality property implies that θ is a random value (this is established in Theorem 2 in Khmelnitskaya (1999) and is translated to our environment using Lemma 2, which relates a transferable-utility game to the firm environment). We define the “marginality upper bound” of a pay scheme.

Definition 5. (Marginality upper bound) The marginality upper bound of any pay scheme θ is the following non-negative constant:

$K^f(x) = \max_{\gamma \in \Delta(R(N))} \sum_{i \in N} \left\{ \sum_{r \in R(N)} \left[ \gamma(r) - \frac{1}{n!} \right] \frac{1}{|[i]|} \sum_{j \in [i]} mc(j, f, r(x), x) \right\}^2$,

with [i] = {j ∈ N : i ∼ sym j}, where i ∼ sym j indicates that workers i and j are symmetric workers in f at x.

The marginality upper bound is the square of the maximum possible distance from the set of symmetrized random values at (f, x) to the corresponding Shapley value. Recall that the Shapley value is a random value with the following uniform distribution:

$\phi^f_i(x) = \frac{1}{n!} \sum_{r \in R(N)} mc(i, f, r(x), x)$.
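The random-value representation with uniform weights can be sketched directly by enumerating all n! orders. The names `marginal_contribution`, `shapley_wages` and `f_example` are hypothetical; the production function used is the two-worker technology of Example 1 in Section 7 (output 1 unless nobody works), and exact fractions are used only to avoid rounding.

```python
from fractions import Fraction
from itertools import permutations

def marginal_contribution(i, f, order, x):
    """mc(i, f, r(x), x): worker i's contribution given the order r."""
    position = {worker: rank for rank, worker in enumerate(order)}
    # x(r^i): predecessors of i keep their effort; i and his successors get 0.
    before = [x[j] if position[j] < position[i] else 0 for j in range(len(x))]
    with_i = list(before)
    with_i[i] = x[i]
    return f(tuple(with_i)) - f(tuple(before))

def shapley_wages(f, x):
    """Average the marginal contributions over all n! orders (uniform gamma)."""
    n = len(x)
    orders = list(permutations(range(n)))
    weight = Fraction(1, len(orders))
    return [
        weight * sum(marginal_contribution(i, f, r, x) for r in orders)
        for i in range(n)
    ]

def f_example(x):
    """Technology of Example 1: output is 1 unless every effort level is 0."""
    return 0 if all(level == 0 for level in x) else 1

wages_both_work = shapley_wages(f_example, (1, 1))
wages_first_only = shapley_wages(f_example, (1, 0))
```

Enumerating all orders is only practical for small n, since the number of orders grows as n!.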

The marginality upper bound can be computed under limited datasets because it requires knowledge only of f and the realized effort profile x, and it is independent of θ. The importance of this quantity is established next.

Theorem 5. If |N| ≥ 3, for any given observation (θ f (x), f, x), with θ satisfying efficiency, it follows that:

• if $\|e^{mrg}\| > \sqrt{K^f(x)}$, then marginality fails, where $K^f(x)$ is the marginality upper bound; and,

• if θ also satisfies the null player axiom and $\|e^{add}\| > \sqrt{K^f(x)}$, then marginality and additivity must fail.
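A possible way to evaluate the marginality upper bound numerically is sketched below. It relies on an observation that is not made in the text and should be read as an assumption of this sketch: the objective in Definition 5 is a sum of squares of functions that are affine in γ, hence convex in γ, so its maximum over the simplex is attained at a vertex, that is, at a point mass on a single order r. The function name `marginality_upper_bound` and the input `classes` (the partition into symmetric workers) are hypothetical.

```python
from itertools import permutations

def marginality_upper_bound(f, x, classes):
    """Evaluate K^f(x) by checking every vertex of the simplex over orders."""
    n = len(x)
    orders = list(permutations(range(n)))

    def mc(i, order):
        rank = {w: k for k, w in enumerate(order)}
        before = [x[j] if rank[j] < rank[i] else 0 for j in range(n)]
        with_i = before[:]
        with_i[i] = x[i]
        return f(tuple(with_i)) - f(tuple(before))

    # Class-averaged marginal contribution of each worker under each order.
    class_of = {i: group for group in classes for i in group}
    sym_mc = [
        [sum(mc(j, r) for j in class_of[i]) / len(class_of[i]) for i in range(n)]
        for r in orders
    ]
    # Uniform average over orders: the symmetrized Shapley benchmark.
    mean = [sum(row[i] for row in sym_mc) / len(orders) for i in range(n)]
    # Squared distance to the benchmark when gamma puts all mass on one order.
    return max(sum((row[i] - mean[i]) ** 2 for i in range(n)) for row in sym_mc)

def g(x):
    """Illustrative asymmetric technology (an assumption, not from the paper)."""
    return x[0] + x[0] * x[1]

bound = marginality_upper_bound(g, (1, 1), [[0], [1]])
```

For the symmetric technology of Example 1 at x = (1, 1) the same routine returns 0, in line with the computation reported in Section 7.1.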

Notice that if θ is efficient and symmetric, we have established already (Theorem 3) that if ||e mrg || > 0,

then marginality fails. However, Theorem 5 deals with the cases where θ is efficient but fails the symmetry

axiom. In those cases, our previous results are silent about what we can infer about violations of marginality

from ||e mrg || as a whole (we know that ||e mrg || decomposes further into the null-worker property and other

parts). Theorem 5 establishes that if ||e mrg || is larger than a non-zero constant, we can conclude that

marginality fails with certainty, even in limited datasets.

Remark 1. For the case of |N| = 2, efficiency and marginality do not imply that θ is a random value; instead, additivity, efficiency, and marginality together imply that θ is a random value. In this case, if $\|e^{mrg}\| > \sqrt{K^f(x)}$, then additivity or marginality fails when θ is efficient.

6 Full Data and Variants

6.1 Other Types of Limited Datasets and Full Datasets

We have provided our decomposition results for limited datasets (i.e., for a given observation (θ f (x), f, x)). In this section, we first consider a situation of limited datasets where we observe more than one realized effort profile. In that case, we can generalize our Shapley distance to a summation over all effort profiles in a finite set. More specifically, we have the following:

$\|\theta^f - \phi^f\|^2_{\mathcal{T}} = \sum_{x \in \mathcal{T}} \|\theta^f(x) - \phi^f(x)\|^2$,

where $\mathcal{T} \subseteq T^n$ is a subset of the set of realized effort profiles, with the production function f being fixed. Similarly, we can consider several technologies captured in the set $F \subset \mathcal{F}$, where $\mathcal{F}$ is the set of all possible production functions, as long as F is finite. We then have:

$\|\theta - \phi\|^2_{\mathcal{T},F} = \sum_{f \in F} \|\theta^f - \phi^f\|^2_{\mathcal{T}}$.

The type of dataset used by Young’s (1985) and Shapley’s (1953) characterizations of the Shapley value corresponds to a case where we fix effort, generically $\mathcal{T} = \{x\}$, and we let the production function set be equal to the set of all the possible production functions, $F \equiv \mathcal{F}$. Here we would like to consider situations where $F \subset \mathcal{F}$ is finite and $\mathcal{T} = \{x\}$, for a fixed effort profile x. In this case, we simplify the norm: $\|\cdot\|_{\mathcal{T},F} = \|\cdot\|_F$.

We define $e^j : F \to R^n$ for j ∈ {ϕ, sym, eff, mrg, null, add} pointwise by $e^j(f) = e^j(f(x))$, where $e^j(f(x))$ is as defined earlier.

Remark 2. For a given dataset $(\theta^f(x), f, x)_{f \in F}$, if $\|e^j(f(x))\| > 0$ for a fixed observation (θ f (x), f, x), then $\|e^j\|_F > 0$ for j ∈ {ϕ, sym, eff, mrg, null, add} for the fixed effort profile x and the finite set of production functions F. Moreover, Theorems 3, 4, and 5 hold for the extended data, replacing $\|\cdot\|$ by $\|\cdot\|_F$.

Since we can never observe a full dataset, because the set $\mathcal{F}$ of all possible production functions is infinite in our environment, we focus on the idea of extending the behavior of a pay scheme observed in a limited dataset to the full dataset.

Definition 6. An extension of an observed allocation dataset $(\theta^f(x))_{f \in F}$ to the full dataset $\mathcal{F}$ is an allocation mapping $\vartheta : \mathcal{F} \to R^n$ such that the restriction $\vartheta|_F : F \to R^n$ satisfies $\vartheta(f(x)) = \theta^f(x)$ for f ∈ F, for a fixed realized effort profile x, and, for any $g \in \mathcal{F}$, $\sum_{i \in N} \vartheta_i(g(x)) \le g(x)$.

We now complement the results of Theorems 3, 4, and 5 by deriving partial converses of their “moreover” statements, using the idea of extensions to full datasets.

Theorem 6. For any finite set of production functions F and the allocation dataset $(\theta^f(x))_{f \in F}$:

(i) if $\|e^{sh}\|^2_F = \sum_{f \in F} \|\theta^f(x) - \phi^f(x)\|^2 = 0$, then there is an extension ϑ of $(\theta^f(x))_{f \in F}$ to $\mathcal{F}$ that corresponds exactly to the Shapley wage function (i.e., ϑ = ϕ);

(ii) if $\|e^{sym}\|^2_F = \sum_{f \in F} \|e^{sym}(f(x))\|^2 = 0$, then there is an extension ϑ of $(\theta^f(x))_{f \in F}$ to $\mathcal{F}$ that satisfies equal-treatment for each $f \in \mathcal{F}$;

(iii) if $\|e^{eff}\|^2_F = \sum_{f \in F} \|e^{eff}(f(x))\|^2 = 0$, then there is an extension ϑ of $(\theta^f(x))_{f \in F}$ to $\mathcal{F}$ that satisfies efficiency for each $f \in \mathcal{F}$; and,

(iv) if $\|e^{sym}\|_F = 0$, $\|e^{eff}\|_F = 0$, and $\|e^{null}\|^2_F = \sum_{f \in F} \|e^{null}(f(x))\|^2 = 0$, then there is an extension ϑ of $(\theta^f(x))_{f \in F}$ to $\mathcal{F}$ that satisfies efficiency and the null-worker property for each $f \in \mathcal{F}$.

We cannot obtain corresponding converse results for ||e mrg || F = 0 and ||e add || F = 0. It is easy to find

counter-examples (e.g., Example 2 in section 7) where these are zero and there is no extension that satisfies

marginality or additivity. We can however find weaker results that suggest new axiomatizations for the

Shapley wage function.

Recalling that $[i]_f = \{j \in N : i \sim_{sym,f} j\}$ is the set of workers symmetric to i ∈ N in production function f given the fixed realized effort x, we make explicit the dependence on the production function.

Axiom 6. (Average marginality)

A pay scheme θ is marginal on average if for any two production functions f and g, for all i ∈ N and for all $x' \triangleright x$:

$[f(x' + x_i e_i) - f(x') \ge g(x' + x_i e_i) - g(x')] \Rightarrow \left[ \frac{1}{|[i]_f|} \sum_{j \in [i]_f} \theta^f_j(x) \ge \frac{1}{|[i]_g|} \sum_{j \in [i]_g} \theta^g_j(x) \right]$.

This is a weakening of the marginality axiom, and we can show that it is satisfied by the Shapley pay scheme. It says that, if any worker i ∈ N, who belongs to an equivalence class of symmetric workers in f, has a higher marginal contribution in f than in g, then the average payoff of his equivalence class is at least as high in f as in g.

Heuristically, it means that the worker’s “expected” payoff increases. We can think of an individual observing

that the marginal contribution of i is higher in f than in g; then the observer could expect, under the

conditions described, a higher value for i in f than in g.

More interestingly, symmetry and average marginality imply marginality. This in turn implies that efficiency,

symmetry and average marginality properties characterize the Shapley value uniquely.

Axiom 7. (Average null-worker)

Let $N_f$ be the set of all null workers in a given production function f and $x \in T^n$. A pay scheme θ satisfies the property of the average null-worker if $\frac{1}{|N_f|} \sum_{j \in N_f} \theta^f_j(x) = 0$.

This is a weakening of the null-worker property requiring only that the equivalence class of null workers

in any production function f for fixed effort x receive a mean payoff of zero. Heuristically, this again could

be interpreted as the requirement that a null-worker receives an “expected” value of zero. Symmetry and the

average null-worker properties imply the null-worker property, thus efficiency, additivity, symmetry and the

average null worker properties uniquely characterize the Shapley pay scheme.

Theorem 7. For any finite set of production functions F and the allocation dataset $(\theta^f(x))_{f \in F}$:

(i) If $\|e^{mrg}\|^2_F = \sum_{f \in F} \|e^{mrg}(f(x))\|^2 = 0$ and $\|e^{eff}\|_F = 0$, then there is an extension ϑ of $(\theta^f(x))_{f \in F}$ to $\mathcal{F}$ that satisfies average marginality; and,

(ii) If $\|e^{null}\|^2_F = \sum_{f \in F} \|e^{null}(f(x))\|^2 = 0$ and $\|e^{eff}\|_F = 0$, then there is an extension ϑ of $(\theta^f(x))_{f \in F}$ to $\mathcal{F}$ that satisfies the average null-worker property for each $f \in \mathcal{F}$.

6.2 Variants of the Decomposition of the Shapley Distance

Under efficiency requirement, or under the assumption that all allowable allocations must be efficient for each

production function at a given realized effort profile, there are two possible decompositions of the Shapley

distance. The first one corresponds to what we present in the prequel where we first impose symmetry and

then consider the restriction to symmetry and marginality. The other possibility is first to impose marginality

and then to impose symmetry conditional on efficiency. The first possibility could be viewed as a situation in

which the observer, say a social planner, has a priority order where symmetry is more valued than marginality

in the evaluation of injustice. The second situation is the converse. We show that each situation will generally

lead to a different evaluation of injustice.

Here we explore the second possibility.


We first let $v^{mrg} = \arg\min_v \|\theta - v\|_F$ subject to $v : F \to R^n$ being marginal and efficient in F; that is, $v^{mrg}$ is the marginal and efficient scheme closest to θ. We know by Theorem 2 in Khmelnitskaya (1999) that any marginal and efficient value must be a random value, so that it has the representation:

$v^{mrg}_i(f(x)) = \sum_{r \in R(N)} \gamma^*(r) \, mc(i, f, r(x), x)$ for all i ∈ N,

where:

$\gamma^* \in \arg\min \left\{ \sum_{f \in F} \|\theta(f(x)) - \eta(\gamma, f(x))\|^2 : \gamma \in \Delta(R(N)) \right\}$,

and

$\eta_i(\gamma, f(x)) = \sum_{r \in R(N)} \gamma(r) \, mc(i, f, r(x), x)$

(with existence being guaranteed by the compactness of the simplex ∆(R(N))).

We let $r^{mrg}(f(x)) = \theta^f(x) - v^{mrg}(f(x))$. Finally, we let $r^{sym}(f(x)) = v^{mrg}(f(x)) - \phi^f(x)$ be the residual error of projecting $v^{mrg}$ onto the set of allocations that satisfy the efficiency, symmetry, and marginality axioms (i.e., the Shapley pay scheme).

Theorem 8. Let θ be an efficient pay scheme, $f \in \mathcal{F}$ be a production function, and (θ f (x), f, x) be an observation. Then $\theta^f(x) = \phi^f(x) + r^{mrg}(f(x)) + r^{sym}(f(x))$ is decomposed uniquely, with:

$\|e^{sh}\|^2 = \|r^{mrg}\|^2 + \|r^{sym}\|^2 + 2\langle r^{mrg}, r^{sym} \rangle$.

Furthermore,

(i) if $\|r^{mrg}\| > 0$, then θ fails the marginality axiom; and,

(ii) if θ satisfies marginality and $\|r^{sym}\| > 0$, then θ fails the equal-treatment axiom.

6.3 Average Decomposition

So far we have assumed that the observer has a preference or priority in the decomposition. A natural

question arises in regard to which decomposition is more meaningful. One answer is that the observer may

use some type of priority ordering to decide which axiom is more important. Alternatively, the observer could

attach a weight to each priority order, and in this case, an average decomposition would be more desirable.

In what follows, we omit f and x from the notation as they are fixed. We define below the goodness-of-fit

index, which proves to be a useful measure of fairness in a firm.

Definition 7. The goodness-of-fit index is: $\rho^{sh} = \|e^{sh}\|^2 / \|\theta\|^2$.

This index measures how much a pay scheme θ departs from the Shapley pay scheme in relative terms (as a percentage).

• If $\rho^{sh} = 0$, then θ is numerically equivalent to the Shapley value at f given the effort level x; and,

• The closer $\rho^{sh}$ is to 1, the farther θ is from the Shapley wage.
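The index of Definition 7 is a one-line computation once the Shapley pay vector is available. The sketch below assumes that vector is supplied as `phi`; the function name `goodness_of_fit` is hypothetical.

```python
import numpy as np

def goodness_of_fit(theta, phi):
    """rho_sh = ||theta - phi||^2 / ||theta||^2 for one observation."""
    theta = np.asarray(theta, dtype=float)
    phi = np.asarray(phi, dtype=float)
    norm_sq = float(theta @ theta)
    if norm_sq == 0.0:
        # The ratio is undefined when every worker is paid zero.
        raise ValueError("index undefined when theta is the zero vector")
    gap = theta - phi
    return float(gap @ gap) / norm_sq

# Pay (3/4, 1/4) against Shapley pay (1/2, 1/2).
rho = goodness_of_fit([0.75, 0.25], [0.5, 0.5])
```

In this illustration the squared gap is 1/8 and the squared norm of the pay vector is 5/8, so the index equals 0.2.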

Now we decompose the index $\rho^{sh}$ into the contributions of the axioms of symmetry and marginality, conditional upon θ being efficient at f and x.

Let $\rho^{sh} : 2^{\{sym, mrg\}} \to [0, 1]$ be the goodness-of-fit measure of imposing axioms sym, mrg on θ, so that:

$\rho^{sh}(\{i, j\}) = \|\theta - v^{\{i,j\}}\|^2 / \|\theta\|^2$,

where $v^{\{i,j\}}$ is the closest vector under the Euclidean norm to θ satisfying the properties in {i, j}. We define the marginal contributions of an axiom j ∈ {sym, mrg} to $\rho^{sh}$ with respect to the two possible orders in which the restrictions can be imposed as follows:

(i) For sym, given order sym ≻ mrg, we have:

$\rho^{sh}(sym) - \rho^{sh}(\emptyset) = \|\theta - v^{sym}\|^2 / \|\theta\|^2 = \|e^{sym}\|^2 / \|\theta\|^2$,

with $\rho^{sh}(\emptyset) = \|\theta - \theta\|^2 / \|\theta\|^2 = 0$.

(ii) For sym, given order mrg ≻ sym, we also have:

$\rho^{sh}(mrg, sym) - \rho^{sh}(mrg) = (\|\theta - v^{sym,mrg}\|^2 - \|\theta - v^{mrg}\|^2) / \|\theta\|^2 = (\|r^{sym}\|^2 + 2\langle r^{mrg}, r^{sym} \rangle) / \|\theta\|^2$.

(iii) For mrg, given order sym ≻ mrg, we have:

$\rho^{sh}(sym, mrg) - \rho^{sh}(sym) = (\|\theta - v^{sym,mrg}\|^2 - \|\theta - v^{sym}\|^2) / \|\theta\|^2 = \|e^{mrg}\|^2 / \|\theta\|^2$.

(iv) For mrg, given order mrg ≻ sym, it follows that:

$\rho^{sh}(mrg) - \rho^{sh}(\emptyset) = \|\theta - v^{mrg}\|^2 / \|\theta\|^2 = \|r^{mrg}\|^2 / \|\theta\|^2$.

We assign to each axiom its average marginal contribution to the goodness-of-fit index $\rho^{sh}$, with weight α ∈ [0, 1] on the order mrg ≻ sym and weight 1 − α on the order sym ≻ mrg:

• For marginality, $\rho^{mrg} = [(1 - \alpha)\|e^{mrg}\|^2 + \alpha\|r^{mrg}\|^2] / \|\theta\|^2$.

• For symmetry, $\rho^{sym} = [\alpha(\|r^{sym}\|^2 + 2\langle r^{mrg}, r^{sym} \rangle) + (1 - \alpha)\|e^{sym}\|^2] / \|\theta\|^2$.

This is an (additive) decomposition of the goodness-of-fit index, $\rho^{sh} = \rho^{mrg} + \rho^{sym}$, with respect to the two axioms, where we assign to each axiom its numerical contribution to the goodness-of-fit index. If $\rho^{mrg} > \rho^{sym}$ we say that the violations of marginality are worse than the violations of symmetry, and vice versa.
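The additivity claim can be checked in one line, as a sketch that uses only Theorems 3 and 8 for an efficient θ (so that $e^{eff} = 0$):

```latex
\begin{aligned}
\rho^{mrg} + \rho^{sym}
 &= \frac{(1-\alpha)\bigl(\|e^{mrg}\|^2 + \|e^{sym}\|^2\bigr)
        + \alpha\bigl(\|r^{mrg}\|^2 + \|r^{sym}\|^2
        + 2\langle r^{mrg}, r^{sym}\rangle\bigr)}{\|\theta\|^2} \\
 &= \frac{(1-\alpha)\,\|e^{sh}\|^2 + \alpha\,\|e^{sh}\|^2}{\|\theta\|^2}
  = \rho^{sh}.
\end{aligned}
```

The first bracket equals $\|e^{sh}\|^2$ by Theorem 3 and the second by Theorem 8, so the identity holds for every α ∈ [0, 1].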

We now define a desirable property for the proposed goodness-of-fit index, “monotonicity” (I). An additive decomposition for the axioms j ∈ {sym, mrg} of the goodness-of-fit index $\rho^{sh}$ satisfies monotonicity if, for any pay schemes θ and θ′ such that:

$\rho^{\theta,sh}(C \cup j) - \rho^{\theta,sh}(C) \ge \rho^{\theta',sh}(C \cup j) - \rho^{\theta',sh}(C)$ for j ∈ {sym, mrg} and all C ⊆ {sym, mrg} with j ∉ C,

then

$\rho^{\theta,j} \ge \rho^{\theta',j}$.

Theorem 9. An additive decomposition for the axioms j ∈ {sym, mrg} of the goodness-of-fit index $\rho^{sh}$ satisfies monotonicity (I) if and only if:

$\rho^{sym} = [\alpha(\|r^{sym}\|^2 + 2\langle r^{mrg}, r^{sym} \rangle) + (1 - \alpha)\|e^{sym}\|^2] / \|\theta\|^2$,

and

$\rho^{mrg} = [\alpha\|r^{mrg}\|^2 + (1 - \alpha)\|e^{mrg}\|^2] / \|\theta\|^2$ for α ∈ [0, 1].

Moreover:

(i) if $\rho^{sh} > 0$, then either the symmetry or the marginality property fails;

(ii) if $\rho^{sym} > \max\{\alpha(\rho^{sh}(sym, mrg) - \rho^{sh}(sym)), (1 - \alpha)(\rho^{sh}(sym) - \rho^{sh}(\emptyset))\}$, then sym fails; and,

(iii) if $\rho^{mrg} > \max\{\alpha(\rho^{sh}(mrg) - \rho^{sh}(\emptyset)), (1 - \alpha)(\rho^{sh}(mrg, sym) - \rho^{sh}(mrg))\}$, then mrg fails.

The presence of the constant α ∈ [0, 1] creates a family of decompositions. Now we pin down three cases

of interest. The first one has to do with adding an equal contributions axiom. This condition requires that,

if the additional contribution of an axiom when the other has not yet been imposed is the same for both

axioms (symmetry and marginality), then the decomposition should assign the same value to each axiom.

We call this axiom “equal contributions” (II). A marginal average decomposition of the goodness-of-fit index satisfies the equal contributions property if:

if $\rho^{sh}(mrg) = \rho^{sh}(sym)$, then $\rho^{mrg} = \rho^{sym}$.

Corollary 1. If a marginal average decomposition for axioms j ∈ {sym, mrg} of the goodness-of-fit index $\rho^{sh}$ also satisfies the equal contributions property (II), then $\alpha = \frac{1}{2}$. That is,

$\rho^{sym} = [\frac{1}{2}(\|r^{sym}\|^2 + 2\langle r^{mrg}, r^{sym} \rangle) + \frac{1}{2}\|e^{sym}\|^2] / \|\theta\|^2$,

and

$\rho^{mrg} = [\frac{1}{2}\|r^{mrg}\|^2 + \frac{1}{2}\|e^{mrg}\|^2] / \|\theta\|^2$.

We also obtain decompositions in the line of Theorem 3 and Theorem 8.

First consider property (II’), which states that if $\rho^{sym} > 0$, then sym fails.

Corollary 2. If a marginal average decomposition for axioms j ∈ {sym, mrg} of the goodness-of-fit index $\rho^{sh}$ also satisfies property (II’), it follows that α = 0 (i.e., $\rho^{sym} = \|e^{sym}\|^2 / \|\theta\|^2$ and $\rho^{mrg} = \|e^{mrg}\|^2 / \|\theta\|^2$).

Finally consider property (II”), which states that if $\rho^{mrg} > 0$, then marginality fails.

Corollary 3. If a marginal average decomposition for axioms j ∈ {sym, mrg} of the goodness-of-fit index $\rho^{sh}$ also satisfies property (II”), it follows that α = 1. That is,

$\rho^{sym} = (\|r^{sym}\|^2 + 2\langle r^{mrg}, r^{sym} \rangle) / \|\theta\|^2$,

and

$\rho^{mrg} = \|r^{mrg}\|^2 / \|\theta\|^2$.

Remark 3. As we can see, there is no decomposition of the goodness-of-fit measure such that properties

(II’) and (II”) are satisfied simultaneously. However, we believe that the decomposition of Theorem 3 is very

desirable because it is completely tractable. Furthermore, we have proven that marginality property can be

tested using an upper bound in Theorem 5.

The next section provides applications.

7 Applications and Examples

7.1 The Quasi-linear Contract

Our first application is to quasi-linear pay schemes in a firm with two workers, with each choosing his effort level from a set that contains two levels. The quasi-linearity of the pay scheme means that one worker is paid a rate on the amount of input he contributes and that the other worker receives the residual output. This type of output sharing rule cannot be represented in the classical transferable-utility environment, which again shows the empirical relevance of our firm environment.

Our objective is to measure the divergence of this allocation rule from the Shapley pay scheme and to identify the sources of this divergence. The example set out below shows that the quasi-linear pay scheme fails the equal-treatment and marginality properties, and hence the additivity axiom.

Example 1. Consider a firm F = (N, T, f) where N = {1, 2} is the set of workers, T = {0, 1} is the set of

effort levels, and f is the production function defined as follows:

$f(x) = \begin{cases} 1 & \text{if } x \neq (0, 0) \\ 0 & \text{if } x = (0, 0) \end{cases} \qquad (2)$

Consider the quasi-linear pay scheme Qlc defined as follows: $Qlc_1(x) = \frac{3}{4} x_1$ and $Qlc_2(x) = f(x) - \frac{3}{4} x_1$ for all $x \in T^2$. For each $x \in T^2$, we have $Qlc_1(x) + Qlc_2(x) = f(x)$, which means that Qlc is efficient.

We now show that Qlc does not satisfy the equal-treatment property. Consider the labor supply x = (1, 1). The only vector $x'$ such that $x' \triangleright x$ with $x'_1 = x'_2 = 0$ is $x' = (0, 0)$; moreover we have $mc(1, f, x', x) = mc(2, f, x', x) = 1$, which shows that the two workers are identical at x = (1, 1). However, $Qlc_1(1, 1) \neq Qlc_2(1, 1)$, and therefore, Qlc does not satisfy the equal-treatment property.

In order to quantify the violations of the properties that characterize the Shapley pay scheme, let us first derive the Shapley wage of each worker at each vector x. The Shapley wage profile at each x is given by the following matrices:

$\phi^f(X) = \begin{pmatrix} (0, 0) & (0, 1) \\ (1, 0) & (\frac{1}{2}, \frac{1}{2}) \end{pmatrix}$, where $X = \begin{pmatrix} (0, 0) & (0, 1) \\ (1, 0) & (1, 1) \end{pmatrix}$

is the matrix that contains all of the possible vectors of effort levels, with the first component of each cell denoting the effort level of worker 1, and the second component denoting the effort level of worker 2.

The quasi-linear wage profile is given by:

$Qlc(X) = \begin{pmatrix} (0, 0) & (0, 1) \\ (\frac{3}{4}, \frac{1}{4}) & (\frac{3}{4}, \frac{1}{4}) \end{pmatrix}$.

Using the difference between the two matrices,

$\phi^f(X) - Qlc(X) = \begin{pmatrix} (0, 0) & (0, 0) \\ (\frac{1}{4}, -\frac{1}{4}) & (-\frac{1}{4}, \frac{1}{4}) \end{pmatrix}$,

we can compute the distance $\|\phi^f - Qlc\|^2_{\mathcal{T}} = 1/4$, where the set of effort profiles summed over is $\mathcal{T} \equiv T^n$.
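The arithmetic behind this distance can be reproduced with exact fractions. The dictionaries below simply transcribe the two wage matrices of the example; the helper name `squared_distance` is hypothetical.

```python
from fractions import Fraction as Fr

# Shapley wage profiles of Example 1, indexed by effort profile x = (x1, x2).
shapley = {
    (0, 0): (0, 0),
    (0, 1): (0, 1),
    (1, 0): (1, 0),
    (1, 1): (Fr(1, 2), Fr(1, 2)),
}

# Quasi-linear contract: worker 1 gets (3/4) * x1, worker 2 the residual output.
qlc = {
    x: (Fr(3, 4) * x[0], (0 if x == (0, 0) else 1) - Fr(3, 4) * x[0])
    for x in shapley
}

def squared_distance(pay_a, pay_b, profiles):
    """Sum over effort profiles of the squared Euclidean distance."""
    return sum(
        (a - b) ** 2
        for x in profiles
        for a, b in zip(pay_a[x], pay_b[x])
    )

total = squared_distance(shapley, qlc, shapley.keys())
```

Only the profiles (1, 0) and (1, 1) contribute, each with 2/16, which gives the total of 1/4.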

The quasi-linear contract Qlc satisfies efficiency, but it does not satisfy symmetry. We evaluate the contribution of the violation of each of these properties. We know that:

$Qlc = \phi^f + e^{sym} + e^{eff} + e^{mrg}$.

1. Let $e^{sym} = Qlc - v^{sym}$. For all x ≠ (1, 1), $Qlc(x) = v^{sym}(x)$. For x = (1, 1), we have $v^{sym}_1(x) = v^{sym}_2(x) = \frac{1}{2}[Qlc_1(x) + Qlc_2(x)] = \frac{1}{2}$. Hence, the distribution is $v^{sym} = \begin{pmatrix} (0, 0) & (0, 1) \\ (\frac{3}{4}, \frac{1}{4}) & (\frac{1}{2}, \frac{1}{2}) \end{pmatrix}$. It follows that $e^{sym}$ can be represented by the matrix $\begin{pmatrix} (0, 0) & (0, 0) \\ (0, 0) & (\frac{1}{4}, -\frac{1}{4}) \end{pmatrix}$, leading to $\|e^{sym}\|^2_{\mathcal{T}} = \frac{2}{16}$.

2. Let $e^{eff} = v^{sym} - v^{sym,eff}$. We have $v^{sym,eff}_i(x) = v^{sym}_i(x) + \frac{f(x) - \sum_i Qlc_i(x)}{2} = v^{sym}_i(x)$. It follows that $e^{eff}$ can be represented by a null matrix and that $\|e^{eff}\|^2_{\mathcal{T}} = 0$.

3. Let $e^{mrg} = v^{sym,eff} - \phi^f = \begin{pmatrix} (0, 0) & (0, 0) \\ (-\frac{1}{4}, \frac{1}{4}) & (0, 0) \end{pmatrix}$, so $\|e^{mrg}\|^2_{\mathcal{T}} = \frac{2}{16}$.

Now we compute the marginality bound for each effort vector.

1. For the effort vector (1, 1), the random value is $\theta_1 = \gamma(r_{12}) \cdot 1 + \gamma(r_{21}) \cdot 0$ and $\theta_2 = \gamma(r_{12}) \cdot 0 + \gamma(r_{21}) \cdot 1$; the Shapley wage profile is $(\frac{1}{2}, \frac{1}{2})$, and the symmetrized random value is $v^{sym} = \gamma(r_{12}) \frac{1}{2} + \gamma(r_{21}) \frac{1}{2}$. The bound is therefore $K^f((1, 1)) = \max_{\gamma \in \Delta(R(N))} \{2(\gamma(r_{12}) \frac{1}{2} + \gamma(r_{21}) \frac{1}{2} - \frac{1}{2})^2\} = 0$, which implies that, if $\|e^{mrg,f}(1, 1)\| > 0$, then the marginality property is violated.

2. For the effort vector (1, 0), the random value is $\theta_1 = \gamma(r_{12}) \cdot 1 + \gamma(r_{21}) \cdot 1$ and $\theta_2 = \gamma(r_{12}) \cdot 0 + \gamma(r_{21}) \cdot 0$; the Shapley wage profile is (1, 0), and the symmetrized random value is the same. The marginality bound is given by $K^f((1, 0)) = \max_{\gamma \in \Delta(R(N))} \{(0)^2 + (0)^2\} = 0$.

3. For the effort vector (0, 1), the bound is $K^f((0, 1)) = 0$.

We can observe that $\|e^{mrg}\|^2 + \|e^{eff}\|^2 + \|e^{sym}\|^2 = \frac{1}{4}$. We conclude that 50% of the unfairness of the quasi-linear pay scheme in this example is explained by the violation of the equal-treatment property, and that the other 50% is explained by the violation of the marginality property. It is important to note that, notwithstanding the fact that the statement of the marginality axiom requires that all of the production functions be known, in this example we were able to quantify the violation of this axiom knowing only one production function. This again shows the empirical relevance of our approach.

In what follows, we derive the closest quasi-linear pay scheme $Qlc^f$ to the Shapley pay scheme. We therefore have: $Qlc^f_1(x) = a x_1$ and $Qlc^f_2(x) = f(x) - a x_1$. The matrix of wage profiles for the pay scheme $Qlc^f$ is $\begin{pmatrix} (0, 0) & (0, 1) \\ (a, 1 - a) & (a, 1 - a) \end{pmatrix}$. Thus, $\|\phi^f - Qlc^f\|^2_{\mathcal{T}} = D(a) = 2(1 - a)^2 + 2(\frac{1}{2} - a)^2$. It follows that we should find the value a that minimizes D(a): $\min_a D(a)$. We obtain $a = \frac{3}{4}$ and $Qlc^f = Qlc$.
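The minimizer can be verified from the first-order condition; the short derivation below is a worked check of the value reported above.

```latex
\begin{aligned}
D(a)   &= 2(1-a)^2 + 2\left(\tfrac{1}{2}-a\right)^2, \\
D'(a)  &= -4(1-a) - 4\left(\tfrac{1}{2}-a\right) = 8a - 6, \\
D'(a)  &= 0 \iff a = \tfrac{3}{4}, \qquad D''(a) = 8 > 0, \\
D\left(\tfrac{3}{4}\right) &= 2\cdot\tfrac{1}{16} + 2\cdot\tfrac{1}{16} = \tfrac{1}{4}.
\end{aligned}
```

Since D is strictly convex, a = 3/4 is the unique minimizer, and the minimal value 1/4 coincides with the distance computed for Qlc.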

7.2 The Linear Contract

Our second application concerns linear pay schemes, in which each worker's pay is a linear function of his effort level. We conduct two kinds of analyses. In the first, we take an arbitrary linear pay scheme and study the effect of increasing the pay rate of a worker on the violation of fairness. This is a comparative statics analysis. In the second, we derive the linear pay scheme that is the closest possible to the Shapley pay scheme. We show that the only property of the Shapley pay scheme that is violated by this pay scheme is efficiency.


7.2.1 Comparative Statics

Consider a firm $F = (N, T, f)$ and an effort profile $x$. The pay of each worker $i$ at $x$ is $v^{lc}_i = \alpha_i x_i$, where $\alpha_i > 0$ is the pay rate of $i$. The closest pay scheme that is symmetric is given by:

$$v^{sym}_i = \frac{1}{|[i]|} \sum_{j \in [i]} \alpha_j x_j,$$

where $[i] = \{j \in N : i \sim_{sym} j\}$, so that all workers $j \in [i]$ receive the same pay.

The closest pay scheme that is both symmetric and efficient is given by:

$$v^{sym,eff}_i = v^{sym}_i + \frac{1}{n}\Big[f(x) - \sum_{k \in N} \alpha_k x_k\Big],$$

or equivalently by:

$$v^{sym,eff}_i = v^{sym}_i + \frac{1}{n}[f(x) - L(x)],$$

where $L(x) = \sum_{k \in N} \alpha_k x_k$.

Finally, the pay scheme that is symmetric and efficient and that satisfies the marginality property is the Shapley value of the firm, given by:

$$\phi^f_i(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[f(\bar{x} + x_i e_i) - f(\bar{x})].$$

The residuals are computed as follows:

$$e^{lc,sym} = v^{lc} - v^{sym}, \qquad e^{sym}_i = \alpha_i x_i - \frac{1}{|[i]|} \sum_{j \in [i]} \alpha_j x_j.$$

Furthermore, the efficiency residual is $e^{eff} = v^{sym} - v^{sym,eff}$. It therefore follows that:

$$e^{eff}_i = -\frac{1}{n}\Big[f(x) - \sum_{k \in N} \alpha_k x_k\Big].$$

Finally, the marginality residual is $e^{mrg} = v^{sym,eff} - \phi^f$. That is:

$$e^{mrg}_i(x) = v^{sym,eff}_i(x) - \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[f(\bar{x} + x_i e_i) - f(\bar{x})].$$

Simplification results in the following:

$$e^{mrg}_i(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[v^{sym,eff}_i(x) - (f(\bar{x} + x_i e_i) - f(\bar{x}))].$$

Observe that the marginality residual is a weighted average of the average linear payoff of the workers who are symmetrical to worker $i$ plus an equal split of any output wasted under the linear scheme minus the marginal contribution of worker $i$ with respect to any vector $\bar{x} \triangleright x$:

$$e^{mrg}_i(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!} \Big\{ \frac{1}{|[i]|} \sum_{j \in [i]} \alpha_j x_j + \frac{1}{n}[f(x) - L(x)] - (f(\bar{x} + x_i e_i) - f(\bar{x})) \Big\}.$$

Intuitively, it is the weighted average of the difference between the corrected linear pay scheme and the marginal contribution under the firm's different configurations.

The total residual is a weighted average of the difference between the linear pay scheme and the marginal contribution:

$$e^{lc}_i = \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[\alpha_i x_i - (f(\bar{x} + x_i e_i) - f(\bar{x}))].$$

The distance of the linear pay scheme to the Shapley pay scheme at $x$ is therefore:

$$\|e^{lc}(\alpha)\|^2 = \sum_{i \in N} \Big[ \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[\alpha_i x_i - (f(\bar{x} + x_i e_i) - f(\bar{x}))] \Big]^2,$$

which is a function of the vector $(\alpha_i)_i$ of pay rates.

We now analyze the effect of increasing a worker $i$'s pay rate $\alpha_i$ on this distance. We have:

$$\frac{\partial}{\partial \alpha_i} \|e^{lc}\|^2(\alpha) = 2 e^{lc}_i x_i.$$

This shows that the sign of the effect of a change in $\alpha_i$ depends entirely on the sign of $e^{lc}_i$. Furthermore, the magnitude of this effect depends on the effort level $x_i$ and the residual $e^{lc}_i$. A necessary and sufficient condition for the residual $e^{lc}_i$ to be positive is that the linear payoff that worker $i$ receives is greater than what the worker would have received under the Shapley pay scheme: $\alpha_i x_i > \phi^f_i(x)$. Therefore, increasing the effort unit rate $\alpha_i$ increases the level of unfairness only if worker $i$ is getting more than his fair pay.
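The derivative above can be checked numerically. The sketch below uses the fact that the Shapley weights sum to one, so that $e^{lc}_i = \alpha_i x_i - \phi^f_i(x)$, and compares the closed form $2 e^{lc}_i x_i$ with a central finite difference. The effort profile and Shapley wages are those of Example 2 in Section 7.3 at $x = (1, 2)$; the pay rates and all function names are hypothetical choices made for illustration.

```python
def squared_distance(alpha, x, shapley):
    """||e^lc(alpha)||^2 with e^lc_i = alpha_i * x_i - phi_i(x)."""
    return sum((a * xi - phi) ** 2 for a, xi, phi in zip(alpha, x, shapley))

def analytic_derivative(alpha, x, shapley, i):
    """Closed form from the text: 2 * e^lc_i * x_i."""
    return 2 * (alpha[i] * x[i] - shapley[i]) * x[i]

def numeric_derivative(alpha, x, shapley, i, h=1e-6):
    """Central finite difference in alpha_i (illustrative check only)."""
    up, down = list(alpha), list(alpha)
    up[i] += h
    down[i] -= h
    return (squared_distance(up, x, shapley)
            - squared_distance(down, x, shapley)) / (2 * h)

x = (1, 2)            # effort profile
shapley = (30, 40)    # Shapley wages at x in Example 2
alpha = (26.0, 26.0)  # hypothetical pay rates

for i in range(2):
    print(i, analytic_derivative(alpha, x, shapley, i),
          numeric_derivative(alpha, x, shapley, i))
```

In this illustration worker 1 is paid less than his Shapley wage (26 against 30), so raising his rate reduces the distance, while worker 2 is paid more (52 against 40), so raising his rate increases it.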

We now determine how each component of the distance between the linear pay scheme and the Shapley

pay scheme at x is affected by a change in α i .

First of all, notice that the violation of the equal-treatment axiom is the variance of the average pay of symmetric workers:

$$\|e^{lc,sym}(\alpha)\|^2 = \sum_{i \in N} \Big[\alpha_i x_i - \frac{1}{|[i]|} \sum_{j \in [i]} \alpha_j x_j\Big]^2.$$

The derivative of this measure with respect to $\alpha_i$ is:

$$\frac{\partial}{\partial \alpha_i} \|e^{lc,sym}(\alpha)\|^2 = 2 e^{sym}_i \frac{|[i]| - 1}{|[i]|} x_i - \frac{1}{|[i]|} \sum_{j \in [i],\ j \neq i} 2 e^{sym}_j x_i.$$

We note that the latter derivative depends on two components. One component is the additional violation of the equal-treatment property by worker $i$, which is positive when $v^{lc}_i > v^{sym}_i$ (that is, when worker $i$ receives under the linear pay scheme a payoff greater than the average payoff of the group of symmetric workers to which $i$ belongs). The second component measures discrimination due to the payoff of the other workers symmetric to $i$ which is smaller than the average: $v^{lc}_j < v^{sym}_j$ for $j \neq i$. It is clear that an increase in $\alpha_i$ has a direct effect and an externality effect that depend on the relative position of the people within the group of workers who are symmetric to $i$.

The violation of efficiency is simply the square of the wasted output divided by the number of workers:

$$\|e^{eff}(\alpha)\|^2 = \sum_{i \in N} \Big[\frac{1}{n}\Big[\sum_{k \in N} \alpha_k x_k - f(x)\Big]\Big]^2 = \frac{\big[\sum_{k \in N} \alpha_k x_k - f(x)\big]^2}{n}.$$

The effect of increasing the pay rate $\alpha_i$ of worker $i$ on the efficiency violation is:

$$\frac{\partial}{\partial \alpha_i} \|e^{lc,eff}(\alpha)\|^2 = \frac{2\big[\sum_{k \in N} \alpha_k x_k - f(x)\big]}{n}\, x_i.$$

This effect is always non-positive because $\sum_{k \in N} \alpha_k x_k \leq f(x)$. It follows that increasing a worker's pay rate always (weakly) increases efficiency. Together with the findings on the effect of increasing a worker's pay rate on the symmetry violation, this finding suggests that the linear pay scheme trades off horizontal fairness and efficiency under certain configurations.

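The marginality violation is equal to:

$$\|e^{lc,mrg}(\alpha)\|^2 = \sum_{i \in N} \Big( \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \phi(x, \bar{x}) \Big( \frac{1}{|[i]|} \sum_{j \in [i]} \alpha_j x_j + \frac{1}{n}\Big[f(x) - \sum_{k \in N} \alpha_k x_k\Big] - mc(i, f, \bar{x}, x) \Big) \Big)^2.$$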

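Taking the derivative of $\|e^{lc,mrg}(\alpha)\|^2$ with respect to $\alpha_i$ yields:

$$\frac{\partial}{\partial \alpha_i} \|e^{lc,mrg}(\alpha)\|^2 = 2 e^{mrg}_i \Big(\frac{1}{|[i]|} x_i - \frac{1}{n} x_i\Big) + \sum_{j \in [i],\ j \neq i} 2 e^{mrg}_j \Big(\frac{1}{|[i]|} x_i - \frac{1}{n} x_i\Big) + \sum_{k \in N,\ k \notin [i]} 2 e^{mrg}_k \Big(-\frac{1}{n} x_i\Big).$$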

A sufficient condition for the first two components of this derivative to be positive is that the symmetry- and efficiency-corrected payoffs of $i$ and of every $j \in [i]$ are greater than their fair shares: $v^{sym,eff}_j > \phi^f_j(x)$ for all $j \in [i]$; this means that increasing the effort rate of worker $i$ increases unfairness. The final component is positive if the workers outside the equivalence class of worker $i$ have symmetry- and efficiency-corrected payoffs that are below their fair payoffs, that is, $v^{sym,eff}_k < \phi^f_k(x)$.

In summary, increasing a worker $i$'s wage increases the violation of marginality when the worker himself or workers who are symmetric to him are receiving more than they should receive under the Shapley pay scheme and when other workers who are different from $i$ receive less than their Shapley wage.

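By a simple rule of differentiation, we note that the total effect of a change in the effort rate $\alpha_i$ is also additively decomposable into the terms that we have presented:

$$\frac{\partial}{\partial \alpha_i} \|e^{lc}(\alpha)\|^2 = \frac{\partial}{\partial \alpha_i} \|e^{lc,sym}(\alpha)\|^2 + \frac{\partial}{\partial \alpha_i} \|e^{lc,eff}(\alpha)\|^2 + \frac{\partial}{\partial \alpha_i} \|e^{lc,mrg}(\alpha)\|^2.$$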

7.3 The Closest Linear Contract to the Shapley Pay Scheme

We apply the generalization of the Shapley distance to more than one observation to study additional properties of the linear contract. In particular, we find the closest linear pay scheme to the Shapley pay scheme when all of the possible effort profiles are observed. We show that this particular linear scheme satisfies all of the properties that characterize the Shapley pay scheme with the exception of the efficiency property. Therefore, the distance between this pay scheme and the Shapley wage function only measures the efficiency violation. Interestingly, in an example, we show that when a pay scheme does not satisfy efficiency, the fact that $\|e^{mrg}\| > 0$ is not sufficient to conclude that it fails marginality.

Let $(N, T, f)$ be a firm in which $T = \{0, 1, ..., t\}$ and $|N| = n > 2$. We want to find the linear contract $lc$ that is the closest possible to the Shapley pay scheme $\phi^f$. This is the linear contract that minimizes injustice within the firm. For a worker $i$, assume that $lc_i(x) = a_i x_i$ for all $x \in T^n$. The square of the distance between $\phi^f$ and $lc$ is:

$$\|\phi^f - lc\|^2_T = \sum_{x \in T^n} \big\{ \|\phi^f(x) - lc(x)\|^2 \big\}.$$

Given that $\phi^f(x) - lc(x)$ is a vector of $\mathbb{R}^n$, we can rewrite $\|\phi^f - lc\|^2_T$ as:

$$\|\phi^f - lc\|^2_T = \sum_{x \in T^n} \Big\{ \sum_{i=1}^{n} [\phi^f_i(x) - lc_i(x)]^2 \Big\} = \sum_{i=1}^{n} \Big\{ \sum_{x \in T^n} [\phi^f_i(x) - lc_i(x)]^2 \Big\} = D(a_1, a_2, ..., a_n).$$

To find the closest linear contract $lc^*$ to the Shapley pay scheme, we need to find the vector $(a_1, a_2, ..., a_n)$ that minimizes $D(a_1, a_2, ..., a_n)$. Taking the first-order conditions yields for each $i$:

$$a^f_i = \frac{\sum_{x \in T^n} x_i\, \phi^f_i(x)}{\sum_{x \in T^n} x_i^2}.$$

Notice that the linear contract $lc^*$ can be written as the projection of the Shapley pay scheme onto the line spanned by the effort vector. More specifically, $lc^*_i(x) = a^f_i x_i$, with $a^f_i = \frac{\langle x_i, \phi^f_i \rangle}{\langle x_i, x_i \rangle}$, where $x_i = (x_i)_{x \in T^n}$ is the vector of all possible efforts that $i$ can exert, and $\phi^f_i = (\phi^f_i(x))_{x \in T^n}$ is the vector of the Shapley values that correspond to those efforts.

Let us show that the linear contract $lc^*$ is symmetric. Suppose two workers $i, j \in N$ are such that $mc(i, f, \bar{x}, x) = mc(j, f, \bar{x}, x)$ for all $\bar{x}, x \in T^n$ with $\bar{x}_i = \bar{x}_j = 0$. It follows that $\phi^f_i(x) = \phi^f_j(x)$ since $i$ and $j$ are symmetric and since $\phi^f$ is symmetric. Hence $a^f_i = a^f_j$ and we conclude that $lc^*$ is symmetric.

Given that $lc^*$ satisfies the symmetry property, its distance to the Shapley pay scheme can only be decomposed into its violations of the efficiency and marginality requirements. It follows that:

$$\|\phi^f - lc^*\|^2_T = \sum_{x \in T^n} \big\{ \|\phi^f(x) - lc^*(x)\|^2 \big\} = \sum_{x \in T^n} \big\{ \|e^{eff}(x)\|^2 + \|e^{mrg}(x)\|^2 \big\}.$$

If one defines $L(x) = [f(x) - \sum_{i \in N} a^f_i x_i]$, then $v^{eff}_i(x) = a^f_i x_i + \frac{L(x)}{n}$ and $e^{eff}_i(x) = -\frac{L(x)}{n}$. Finally,

$$e^{mrg}_i(x) = v^{eff}_i(x) - \phi^f_i(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \phi(x, \bar{x}) \Big[a^f_i x_i + \frac{L(x)}{n} - mc(i, f, \bar{x}, x)\Big].$$

The quantity $\|e^{eff}(x)\|^2 = \frac{L(x)^2}{n}$ penalizes the square of the wasted output.

The quantity

$$\|e^{mrg}(x)\|^2 = \sum_{i \in N} \Big\{ \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \phi(x, \bar{x}) \Big[a^f_i x_i + \frac{L(x)}{n} - mc(i, f, \bar{x}, x)\Big] \Big\}^2$$

is the average difference between the efficiency-corrected linear pay scheme, where the waste is split equally among the workers, and the marginal contribution to output.

We also show that the linear pay scheme $lc^*$ satisfies marginality. Indeed, consider $lc^{*f}_i(x) = a^f_i x_i$ and $lc^{*g}_i(x) = a^g_i x_i$ such that $g(y + e_i x_i) - g(y) \geq f(y + e_i x_i) - f(y)$ for all $y \triangleright x$ with $y_i = 0$ and $x \in T^n$. Given that the Shapley value $\phi_i$ satisfies the marginality property, it follows that $\phi^g_i(x) \geq \phi^f_i(x)$, and thus $a^g_i \geq a^f_i$. Hence $lc^{*g}_i \geq lc^{*f}_i$, and $lc^*_i$ satisfies the marginality property.

The example set out below illustrates the fact that the fairest linear pay scheme is not efficient in general.

More importantly, when efficiency fails, the fact that the marginality component of the Shapley distance is

positive does not provide a sufficient basis to conclude that marginality fails.

Example 2. We consider a firm with two workers 1 and 2. Each worker can supply up to two hours of labor.

Labor is an essential input to production and we assume that the technology of the firm displays constant

returns to scale. For each worker, the first hour of work yields 50 units of output, and any additional hour

has a constant marginal product of 10. We also assume that the price of each output produced is equal to

1, such that for a vector of hours supplied x, the value f(x) represents the monetary value of the output

generated by x. The following table shows the mapping of the input to the output:

x f(x) x f(x) x f(x)

(0, 0) 0 (0, 2) 60 (2, 1) 70

(0, 1) 50 (2, 0) 60 (1, 1) 60

(1, 0) 50 (1, 2) 70 (2, 2) 80

Notice that at the input vectors (0, 0), (1, 1) and (2, 2), the workers are symmetric.

A. Assume that the manager offers a linear contract: $lc_1(x) = a x_1$ and $lc_2(x) = b x_2$ for $x \in T^2$.

• The payoffs under the linear contract are given by:

$$lc = \begin{pmatrix} (0, 0) & (0, b) & (0, 2b) \\ (a, 0) & (a, b) & (a, 2b) \\ (2a, 0) & (2a, b) & (2a, 2b) \end{pmatrix}.$$

• The Shapley payoffs are given by:

$$\phi^f = \begin{pmatrix} (0, 0) & (0, 50) & (0, 60) \\ (50, 0) & (30, 30) & (30, 40) \\ (60, 0) & (40, 30) & (40, 40) \end{pmatrix},$$

with symmetric diagonal as expected.

– The square of the distance between the Shapley and the linear pay schemes is given by:

$$\|\phi - lc\|^2_T = (50 - a)^2 + (60 - 2a)^2 + (50 - b)^2 + 2 \times (30 - a)^2 + 2 \times (30 - b)^2 + (40 - 2a)^2 + (60 - 2b)^2 + (40 - 2b)^2 + (40 - 2a)^2 + (40 - 2b)^2.$$
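The Shapley payoffs and the distance expression above can be reproduced from the production table of Example 2. The sketch below computes each worker's wage as his average marginal contribution over all orders of entry (the interpretation of the Shapley wage used in the proof of Theorem 1); the function names are illustrative, and the dictionary key `(x1, x2)` corresponds to row $x_1$ and column $x_2$ of the matrices.

```python
from fractions import Fraction
from itertools import permutations, product

# Production table of Example 2: f[(x1, x2)] = output.
f = {(0, 0): 0, (0, 1): 50, (0, 2): 60,
     (1, 0): 50, (1, 1): 60, (1, 2): 70,
     (2, 0): 60, (2, 1): 70, (2, 2): 80}

def shapley_wages(f, x):
    """Average marginal contribution of each worker over all orders of entry."""
    n = len(x)
    totals = [Fraction(0)] * n
    orders = list(permutations(range(n)))
    for order in orders:
        current = [0] * n                  # nobody has entered yet
        for i in order:
            before = f[tuple(current)]
            current[i] = x[i]              # worker i enters with effort x_i
            totals[i] += f[tuple(current)] - before
    return tuple(t / len(orders) for t in totals)

phi = {x: shapley_wages(f, x) for x in product(range(3), repeat=2)}

def distance_sq(a, b):
    """Squared distance between the Shapley scheme and lc(x) = (a*x1, b*x2)."""
    return sum((phi[x][0] - a * x[0]) ** 2 + (phi[x][1] - b * x[1]) ** 2
               for x in phi)

print(phi[(1, 2)], phi[(2, 1)], distance_sq(26, 26))
```

A worker who supplies zero hours has a zero marginal contribution in every order, so he receives a zero wage, as in the first row and first column of the Shapley matrix.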

B. We would like to determine the fairest linear contract $lc^*$. Thus, we should solve the problem:

$$\min_{a, b} \big\{ \|\phi^f - lc\|^2_T \big\}.$$

The unique solution for $a$ and $b$ respectively is $a^* = 26$ and $b^* = 26$. The linear scheme $lc^*$ is not efficient. Indeed, with $a^*_1 = a^*_2 = 26$ and $x = (1, 1)$, for instance, $lc^*_1(x) + lc^*_2(x) = 52 \neq 60 = f(1, 1)$.
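The solution can be verified with the first-order condition $a^f_i = \sum_x x_i \phi^f_i(x) / \sum_x x_i^2$ derived earlier in this section. The sketch below copies the Shapley payoffs from the matrix in part A; `closest_rate` is a hypothetical helper name.

```python
from fractions import Fraction

# Shapley payoffs of Example 2, indexed by (x1, x2), copied from the text.
phi = {(0, 0): (0, 0), (0, 1): (0, 50), (0, 2): (0, 60),
       (1, 0): (50, 0), (1, 1): (30, 30), (1, 2): (30, 40),
       (2, 0): (60, 0), (2, 1): (40, 30), (2, 2): (40, 40)}
f_11 = 60  # output at x = (1, 1), from the production table

def closest_rate(phi, i):
    """Projection coefficient: sum_x x_i * phi_i(x) / sum_x x_i^2."""
    numerator = sum(x[i] * phi[x][i] for x in phi)
    denominator = sum(x[i] ** 2 for x in phi)
    return Fraction(numerator, denominator)

a_star, b_star = closest_rate(phi, 0), closest_rate(phi, 1)
paid_at_11 = a_star * 1 + b_star * 1   # total pay at x = (1, 1)
print(a_star, b_star, paid_at_11, f_11)
```

The total pay at $x = (1, 1)$ falls short of the output, which is the inefficiency noted in the text.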

Let us compute the distance between $lc^*$ and the Shapley pay scheme at the fairest hourly rates $(a^*, b^*)$. We have $\|\phi^f - lc(a^*, b^*)\|^2_T = 1920$. We now decompose this total violation of justice into its different components under the assumption of an observer who prefers Pareto efficiency over marginality.

We know that $lc^*$ is symmetric, and so $\|e^{sym}\|^2_T = 0$. Furthermore, we have $\|e^{eff}\|^2_T = 1024$, thus implying that efficiency is violated. Also, $\|e^{mrg}\|^2_T = 896$. It therefore follows that 53.3% of the unfairness in the linear pay scheme is assigned to the violation of the efficiency property and 46.7% is assigned to the marginality component. Note, however, that the fact that $\|e^{mrg}\|^2_T > 0$ is not a sufficient basis to conclude that marginality has failed, especially given the fact that linear pay schemes are not efficient in general. Thus, this amount corresponds to the additional change in the distance due to the imposition of marginality on top of efficiency. We now decompose $\|e^{mrg}\|^2_T$ into the violations of additivity and of the null-worker property. We find that $\|e^{null}\|^2_T = 640$, $\|e^{add}\|^2_T = 256$, and $\langle e^{add}, e^{null} \rangle = 0$, which leads to $\|e^{mrg}\|^2_T = \|e^{null}\|^2_T + \|e^{add}\|^2_T$. The assignment of the marginality component of the Shapley distance in this example is 71.4% due to the violation of the null-worker property and 28.6% due to the violation of the additivity axiom. We conclude that unfairness in the linear pay scheme is 53.3% due to the efficiency violation, 33.3% due to the violation of the null-worker property, and 13.3% due to the additivity violation.
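The efficiency and marginality components of this decomposition can be recomputed directly from the definitions $e^{eff}_i(x) = -L(x)/n$ and $e^{mrg}_i(x) = a^f_i x_i + L(x)/n - \phi^f_i(x)$, with $L(x) = f(x) - \sum_i a^f_i x_i$. The sketch below is a check under those definitions; the production table and Shapley payoffs are copied from Example 2, and the variable names are illustrative. It does not reproduce the further split into null-worker and additivity components.

```python
from fractions import Fraction

f = {(0, 0): 0, (0, 1): 50, (0, 2): 60,
     (1, 0): 50, (1, 1): 60, (1, 2): 70,
     (2, 0): 60, (2, 1): 70, (2, 2): 80}
phi = {(0, 0): (0, 0), (0, 1): (0, 50), (0, 2): (0, 60),
       (1, 0): (50, 0), (1, 1): (30, 30), (1, 2): (30, 40),
       (2, 0): (60, 0), (2, 1): (40, 30), (2, 2): (40, 40)}
RATE = 26        # fairest hourly rate a* = b* from the text
N_WORKERS = 2

total = eff = mrg = Fraction(0)
for x in f:
    lc = [RATE * xi for xi in x]
    share = Fraction(f[x] - sum(lc), N_WORKERS)            # L(x) / n
    e_eff = [-share for _ in lc]
    e_mrg = [pay + share - phi[x][i] for i, pay in enumerate(lc)]
    total += sum((pay - phi[x][i]) ** 2 for i, pay in enumerate(lc))
    eff += sum(e ** 2 for e in e_eff)
    mrg += sum(e ** 2 for e in e_mrg)

print(total, eff, mrg)
print(float(eff / total), float(mrg / total))
```

The two components add up to the total because, at each $x$, the efficiency residual is a constant vector while the marginality residuals sum to zero, so the two are orthogonal.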

Now we fix the effort profile at $x = (1, 2)$. The linear pay vector is (26, 52) with $a^f_1 = a^f_2 = 26$. The efficient pay scheme would be $(22, 48) = (26, 52) + \frac{f(x) - 78}{2}(1, 1)$. Then (22, 48) is symmetric (vacuously, because workers 1 and 2 are not symmetric at this realized level of effort) and efficient by construction. However, the Shapley pay scheme, which not only is symmetric and efficient but also satisfies marginality, corresponds to (30, 40), which is different from (22, 48). In this case, $\|e^{mrg}\| > 0$ while marginality holds. This situation is not ruled out by our decomposition Theorem 3 since efficiency does not hold.
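The arithmetic at $x = (1, 2)$ can be spelled out as follows. This is only a sketch of the efficiency correction (an equal split of the gap between output and total pay); `efficiency_corrected` is a hypothetical helper name, and the numbers are those quoted in the paragraph above.

```python
from fractions import Fraction

def efficiency_corrected(pay, output):
    """Add an equal share of (output - total pay) to every worker's pay."""
    gap_share = Fraction(output - sum(pay), len(pay))
    return tuple(p + gap_share for p in pay)

linear_pay = (26 * 1, 26 * 2)     # lc*(x) at x = (1, 2)
output = 70                       # f(1, 2)
shapley = (30, 40)                # Shapley wages at x = (1, 2)

corrected = efficiency_corrected(linear_pay, output)
e_mrg = tuple(c - s for c, s in zip(corrected, shapley))
print(corrected, e_mrg)
```

The residual is nonzero even though its components cancel out, which reflects that the corrected scheme exhausts the output but distributes it differently from the Shapley wages.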

7.4 Intra-firm Bargaining and Firm Unfairness

We consider an at-will firm with two identical workers. The firm and the workers bargain over wages using a Rolodex procedure in the spirit of Stole and Zwiebel (1996). The bargaining is done bilaterally, and $p \in [0, 1]$ is the probability that the negotiation breaks down if the firm rejects an offer from a worker, while $1 - p$ is the probability that the negotiation proceeds to the next stage. Thus, $p$ is a proxy for the firm's negotiating power. When $p \to 0$, the firm has no negotiating power at all. We let the firm be indexed as 0 and the workers be indexed as 1, 2. We spare the details of the bargaining protocol and refer the reader to the work of Brügemann et al. (2015).

In this example, the negotiation finishes with the following pay vector:

$$w_1(p) = b + \frac{1}{1 + (1-p) + (1-p)^2}\,[y_2 - \pi_0(p) - 2b],$$

$$w_2(p) = b + \frac{1-p}{1 + (1-p) + (1-p)^2}\,[y_2 - \pi_0(p) - 2b],$$

and with the firm receiving:

$$w_0(p) = \pi(p) = \pi_0(p) + \frac{(1-p)^2}{1 + (1-p) + (1-p)^2}\,[y_2 - \pi_0(p) - 2b],$$

with $\pi_0(p) = y_0 + \frac{1-p}{2-p}[y_1 - y_0 - b]$, where $b$ is the outside option of both workers and $y_i$ is the production of the firm with $i \in \{0, 1, 2\}$ workers being active.

The result of letting $p \to 0$ is that the pay scheme converges to the Shapley value. In particular:

$$\phi_1 = w_1(0) = b + \tfrac{1}{3}[y_2 - \pi_0(0) - 2b],$$

$$\phi_2 = w_2(0) = b + \tfrac{1}{3}[y_2 - \pi_0(0) - 2b],$$

and:

$$\phi_0 = w_0(0) = \pi(0) = \pi_0(0) + \tfrac{1}{3}[y_2 - \pi_0(0) - 2b].$$
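The convergence claim can be illustrated numerically. The sketch below codes the closed-form wages from the text and compares them at $p = 0$ with the Shapley value of a three-player coalitional game. That game is an assumption made here for illustration and is not spelled out in the text: a coalition containing the firm and $k$ workers is worth $y_k$, and a coalition of $k$ workers without the firm is worth $k b$. The parameter values are likewise hypothetical.

```python
from fractions import Fraction
from itertools import permutations

def pi0(p, y0, y1, b):
    return y0 + (1 - p) / (2 - p) * (y1 - y0 - b)

def wages(p, y0, y1, y2, b):
    """(w0, w1, w2) from the closed forms in the text."""
    d = 1 + (1 - p) + (1 - p) ** 2
    surplus = y2 - pi0(p, y0, y1, b) - 2 * b
    w1 = b + surplus / d
    w2 = b + (1 - p) * surplus / d
    w0 = pi0(p, y0, y1, b) + (1 - p) ** 2 * surplus / d
    return w0, w1, w2

def shapley_three_player(y0, y1, y2, b):
    """Shapley value of the ASSUMED game: player 0 is the firm, 1 and 2 are workers."""
    def worth(coalition):
        workers = len(coalition - {0})
        if 0 in coalition:
            return (y0, y1, y2)[workers]
        return workers * b                 # workers alone earn their outside options
    totals = [Fraction(0)] * 3
    orders = list(permutations(range(3)))
    for order in orders:
        members = set()
        for player in order:
            before = worth(members)
            members.add(player)
            totals[player] += worth(members) - before
    return tuple(t / len(orders) for t in totals)

params = dict(y0=1, y1=4, y2=10, b=1)      # hypothetical numbers
print(wages(Fraction(0), **params), shapley_three_player(**params))
print(sum(wages(Fraction(1, 2), **params)))
```

For any $p$ the three payoffs sum to $y_2$, so the bargaining outcome is efficient; only the split between the firm and the two workers changes with $p$.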

We observe that workers are symmetric and different from the firm in general. We compute:

$$v^{sym}_0(p) = w_0(p) = \pi_0(p) + \frac{(1-p)^2}{1 + (1-p) + (1-p)^2}\,[y_2 - \pi_0(p) - 2b],$$

$$v^{sym}_1(p) = v^{sym}_2(p) = \tfrac{1}{2}[w_1(p) + w_2(p)] = b + \frac{1 - \frac{1}{2}p}{1 + (1-p) + (1-p)^2}\,[y_2 - \pi_0(p) - 2b].$$

The corresponding error terms are:

$$e^{sym}_1(p) = \frac{\frac{1}{2}p}{1 + (1-p) + (1-p)^2}\,[y_2 - \pi_0(p) - 2b],$$

and

$$e^{sym}_2(p) = \frac{-\frac{1}{2}p}{1 + (1-p) + (1-p)^2}\,[y_2 - \pi_0(p) - 2b],$$

(with $e^{sym}_0(p) = 0$).
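These symmetry residuals follow from averaging the two workers' wages. The sketch below recomputes them from the wage formulas and compares them with the closed form $\pm \frac{p/2}{1 + (1-p) + (1-p)^2}[y_2 - \pi_0(p) - 2b]$; the parameter values and function names are hypothetical.

```python
from fractions import Fraction

def pi0(p, y0, y1, b):
    return y0 + (1 - p) / (2 - p) * (y1 - y0 - b)

def wages(p, y0, y1, y2, b):
    """(w0, w1, w2) from the closed forms in the text."""
    d = 1 + (1 - p) + (1 - p) ** 2
    surplus = y2 - pi0(p, y0, y1, b) - 2 * b
    return (pi0(p, y0, y1, b) + (1 - p) ** 2 * surplus / d,
            b + surplus / d,
            b + (1 - p) * surplus / d)

def symmetry_residuals(p, y0, y1, y2, b):
    """w - v^sym, where v^sym gives both workers their average wage."""
    w0, w1, w2 = wages(p, y0, y1, y2, b)
    mean = (w1 + w2) / 2
    return (w0 - w0, w1 - mean, w2 - mean)

def closed_form_e1(p, y0, y1, y2, b):
    d = 1 + (1 - p) + (1 - p) ** 2
    return (p / 2) / d * (y2 - pi0(p, y0, y1, b) - 2 * b)

p = Fraction(1, 2)
params = dict(y0=1, y1=4, y2=10, b=1)      # hypothetical numbers
residuals = symmetry_residuals(p, **params)
print(residuals, closed_form_e1(p, **params))
```

The residuals vanish at $p = 0$, where the two workers receive identical wages.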


The error of marginality is:

$$e^{mrg}_1(p) = v^{sym}_1(p) - \phi_1 = \Big[\frac{p(\frac{1}{2} - \frac{1}{3}p)}{1 + (1-p) + (1-p)^2}\Big][y_2 - 2b] + \tfrac{1}{3}\pi_0(0) - \frac{1 - \frac{1}{2}p}{1 + (1-p) + (1-p)^2}\,\pi_0(p),$$

and

$$e^{mrg}_2(p) = e^{mrg}_1(p).$$

Note that:

$$e^{mrg}_0(p) = \pi_0(p) - \pi_0(0) + \frac{(1-p)^2}{1 + (1-p) + (1-p)^2}\,[y_2 - \pi_0(p) - 2b] - \tfrac{1}{3}[y_2 - \pi_0(0) - 2b].$$

Without loss of generality, we fix $y_1 - y_0 - b = 0$ such that $\pi_0(p) = \pi_0(0) = y_0$. Then:

$$e^{mrg}_0(p) = e^{mrg}_1(p) = e^{mrg}_2(p) = \Big[\frac{p(\frac{1}{2} - \frac{1}{3}p)}{1 + (1-p) + (1-p)^2}\Big][y_2 - y_0 - 2b].$$

We also fix $b = 0$ and $y_0 = 0$ (with no consequence for the insights that we derive but with gains in tractability), so that the outside options of the firm and the workers are zero, and we compute the goodness-of-fit index.

The components of the relative index are:

$$\|\theta\|^2 = \frac{1 + (1-p)^2 + (1-p)^4}{(1 + (1-p) + (1-p)^2)^2}\, y_2^2,$$

$$\|e^{sym}\|^2 = \frac{\frac{1}{2}p^2}{(1 + (1-p) + (1-p)^2)^2}\, y_2^2,$$

and

$$\|e^{mrg}\|^2 = \frac{3p^2(\frac{1}{2} - \frac{1}{3}p)^2}{(1 + (1-p) + (1-p)^2)^2}\, y_2^2.$$

Recall that:

$$\frac{\|e^{sh}(p)\|^2}{\|\theta(p)\|^2} = \frac{\|e^{sym}(p)\|^2}{\|\theta(p)\|^2} + \frac{\|e^{mrg}(p)\|^2}{\|\theta(p)\|^2} = \frac{\frac{1}{2}p^2}{1 + (1-p)^2 + (1-p)^4} + \frac{3p^2(\frac{1}{2} - \frac{1}{3}p)^2}{1 + (1-p)^2 + (1-p)^4}.$$

We then have:

$$\frac{\|e^{sh}(p)\|^2}{\|\theta(p)\|^2} \to 0 \text{ as } p \to 0.$$

We notice that whether the errors of symmetry are more important than the errors of marginality depends on the value of $p$ (the bargaining power of the firm). For lower values of $p$, a violation of symmetry is worse than a violation of marginality, but, for high enough values of $p$, the inverse is true. In fact:

$$\frac{\|e^{sym}(p)\|^2}{\|\theta(p)\|^2} > \frac{\|e^{mrg}(p)\|^2}{\|\theta(p)\|^2}, \quad \text{for } p \in [0, \tfrac{1}{2}(3 - \sqrt{6})).$$

Also

$$\frac{\|e^{sym}(p)\|^2}{\|\theta(p)\|^2} = \frac{\|e^{mrg}(p)\|^2}{\|\theta(p)\|^2}, \quad \text{at } p = \tfrac{1}{2}(3 - \sqrt{6}),$$

and

$$\frac{\|e^{sym}(p)\|^2}{\|\theta(p)\|^2} < \frac{\|e^{mrg}(p)\|^2}{\|\theta(p)\|^2}, \quad \text{for } p \in (\tfrac{1}{2}(3 - \sqrt{6}), 1].$$

More importantly, the derivative of the distance to the Shapley value with respect to $p$ is always positive. This means that the higher the bargaining power of the firm, the more the pay scheme differs from the Shapley wages. For $p \in (0, 1]$, we have:

$$\frac{\partial}{\partial p} \frac{\|e^{sh}(p)\|^2}{\|\theta(p)\|^2} = \frac{\partial}{\partial p} \frac{\|e^{sym}(p)\|^2}{\|\theta(p)\|^2} + \frac{\partial}{\partial p} \frac{\|e^{mrg}(p)\|^2}{\|\theta(p)\|^2} = p + \tfrac{1}{6}p(9 + 2p(-9 + 4p)) > 0.$$

Note that each term is positive in its specified domain.

These findings show that a sufficiently powerful firm can induce its workers to increase their contributions

to profits but only at the cost of creating inequality among identical workers, and, even more importantly, at

the cost of marginality. It remains to be seen if these insights into the effects of the firm’s bargaining power

on Shapley unfairness can be extended to the case of n workers. Nonetheless, the analysis of a two-worker

firm provided above is very suggestive.

8 Conclusion

In this study, we have provided a local measure of the departures of any pay scheme from the Shapley pay

scheme in limited datasets. The local measure permits one to draw inference about violations of the classical

axioms that characterize the Shapley value (efficiency, equal treatment of identical workers, and marginality).

Our measure is decomposable into these axioms and is tractable. Our findings have testable implications for

the different ways in which a wage scheme may violate basic properties of distributive justice in a firm. We

also provide applications to well-known pay schemes.


References

Acemoglu, D. and Hawkins, W. B. (2014). Search with multi-worker firms. Theoretical Economics, 9(3),

583–628.

Aguiar, V. H. and Serrano, R. (2015). Slutsky matrix norms: The size, classification, and comparative statics

of bounded rationality. Mimeo.

Brügemann, B., Gautier, P., and Menzio, G. (2015). Bargaining, intra firm and Shapley values. Mimeo.

De Clippel, G. and Rozen, K. (2013). Fairness through the lens of cooperative game theory: An experimental

approach. Mimeo.

De Clippel, G. and Serrano, R. (2008). Marginal contributions and externalities in the value. Econometrica,

76(6), 1413–1436.

Khmelnitskaya, A. B. (1999). Marginalist and efficient values for TU games. Mathematical Social Sciences,

38(1), 45–54.

Moulin, H. (1992). An application of the Shapley value to fair division with money. Econometrica, (pp.

1331–1349).

Roth, A. E. (1977). The Shapley value as a von Neumann-Morgenstern utility. Econometrica, (pp. 657–664).

Shapley, L. (1953). A Value for n-person Games. Contributions to the Theory of Games, 2, 307–317.

Shapley, L. S. and Shubik, M. (1967). Ownership and the production function. The Quarterly Journal of

Economics, (pp. 88–111).

Stole, L. A. and Zwiebel, J. (1996). Intra-firm bargaining under non-binding contracts. The Review of

Economic Studies, 63(3), 375–410.

Young, H. P. (1985). Monotonic solutions of cooperative games. International Journal of Game Theory,

14(2), 65–72.


9 Appendix

Define the following production function $f_x$, for $x \in T^n$, by:

$$f_x(y) = \begin{cases} 1 & \text{if } x \trianglelefteq y \\ 0 & \text{if } x \ntrianglelefteq y \end{cases}. \quad (3)$$

Lemma 7. (A linear basis for the production function) Any production function $f$ is a linear combination of production functions $f_x$:

$$f = \sum_{x \in T^n;\ x \neq 0} c_x(f) f_x, \quad \text{where } c_x(f) = \sum_{x' \trianglelefteq x} (-1)^{|x| - |x'|} f(x'). \quad (4)$$

Proof. Indeed, for $y \in T^n$:

$$\sum_{x \in T^n;\ x \neq 0} c_x(f) f_x(y) = \sum_{x \trianglelefteq y} \Big( \sum_{x' \trianglelefteq x} (-1)^{|x| - |x'|} f(x') \Big) f_x(y)$$

$$= \sum_{x \trianglelefteq y} \sum_{x' \trianglelefteq x} (-1)^{|x| - |x'|} f(x')$$

$$= \sum_{x' \trianglelefteq y} \Big( \sum_{|x| = |x'|}^{|y|} (-1)^{|x| - |x'|} C^{|x| - |x'|}_{|y| - |x'|} \Big) f(x').$$

The expression in parentheses vanishes except for $|x'| = |y|$, and so we are left with $f(y)$. Thus, $f(y) = \sum_{x \in T^n;\ x \neq 0} c_x(f) f_x(y)$.
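The decomposition in Lemma 7 can be checked on the production table of Example 2. The sketch below rests on one reading of the lemma, adopted here as an assumption: $x' \trianglelefteq x$ means that $x'$ is obtained from $x$ by setting some coordinates to zero, and $|x|$ counts the nonzero coordinates of $x$. All function names are illustrative.

```python
from itertools import product

# Production table of Example 2: f[(x1, x2)] = output.
f = {(0, 0): 0, (0, 1): 50, (0, 2): 60,
     (1, 0): 50, (1, 1): 60, (1, 2): 70,
     (2, 0): 60, (2, 1): 70, (2, 2): 80}

def subvectors(y):
    """All vectors obtained from y by setting some coordinates to zero (assumed relation)."""
    choices = [(0, yi) if yi > 0 else (0,) for yi in y]
    return set(product(*choices))

def size(x):
    """Number of nonzero coordinates of x."""
    return sum(1 for xi in x if xi > 0)

def coefficient(f, x):
    """c_x(f) = sum over sub-vectors x' of x of (-1)^(|x| - |x'|) f(x')."""
    return sum((-1) ** (size(x) - size(xp)) * f[xp] for xp in subvectors(x))

def reconstruct(f, y):
    """sum over x != 0 of c_x(f) f_x(y), where f_x(y) = 1 iff x is a sub-vector of y."""
    return sum(coefficient(f, x) for x in subvectors(y) if size(x) > 0)

print(coefficient(f, (1, 2)), reconstruct(f, (1, 2)), f[(1, 2)])
```

For $y = (1, 2)$, for instance, the coefficients of $(1, 0)$, $(0, 2)$ and $(1, 2)$ are 50, 60 and $-40$, which add up to $f(1, 2) = 70$.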

Proof of Theorem 1

We start this proof by showing that $\phi^f$ satisfies efficiency, additivity, marginality and symmetry. Let $f$ be a production function, $x \in T^n$ and $\bar{x} \triangleright x$.

1. Efficiency: consider an order of entering the firm as follows: worker 1 enters first, worker 2 follows, then worker 3, and so on until worker $n$ enters in the $n$th position. Let $x^0 = (0, 0, ..., 0)$ be the initial state; then, at the first step, the marginal contribution of worker 1 is given by:

$$\phi^f_1 = f(\underbrace{x^0 + x_1 e_1}_{x^1}) - f(x^0) = f(x^1) - f(x^0).$$

When worker 2 enters at the second step, his marginal contribution is given by:

$$\phi^f_2 = f(\underbrace{x^1 + x_2 e_2}_{x^2}) - f(x^1) = f(x^2) - f(x^1).$$

Continuing in the same manner, the last worker enters in the $n$th position and receives as marginal contribution:

$$\phi^f_n = f(\underbrace{x^{n-1} + x_n e_n}_{x}) - f(x^{n-1}) = f(x) - f(x^{n-1}).$$

If we apply the summation across $\phi_i$, we have:

$$\sum_{i \in N} \phi^f_i(x) = [f(x^1) - f(x^0)] + [f(x^2) - f(x^1)] + ... + [f(x^{n-1}) - f(x^{n-2})] + [f(x) - f(x^{n-1})]$$

$$= f(x) - f(x^0)$$

$$= f(x), \text{ since } f(0, 0, ..., 0) = 0.$$

This proof can be done for any order of entry of workers, and will lead to the same conclusion.

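2. Additivity: let $g$ be another production function, and $i$ a worker with $\bar{x}_i = 0$. We know that the marginal contribution of worker $i$ by joining $\bar{x}$ under the sum of production functions $f$ and $g$ is the sum of the marginal contributions of worker $i$ by joining $\bar{x}$ under each of the production functions $f$ and $g$ respectively. More specifically, $mc(i, f + g, \bar{x}, x) = mc(i, f, \bar{x}, x) + mc(i, g, \bar{x}, x)$. It follows that:

$$\phi^{f+g}_i(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[mc(i, f + g, \bar{x}, x)]$$

$$= \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[mc(i, f, \bar{x}, x)] + \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[mc(i, g, \bar{x}, x)]$$

$$= \Big\{ \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[mc(i, f, \bar{x}, x)] \Big\} + \Big\{ \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[mc(i, g, \bar{x}, x)] \Big\}$$

$$= \phi^f_i(x) + \phi^g_i(x).$$

Therefore $\phi^f$ satisfies additivity.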

3. Null worker property: If we consider $i$ a null-worker with $\bar{x}_i = 0$, then $mc(i, f, \bar{x}, x) = 0$; then, we can deduce that $\phi^f_i(x) = 0$.

4. Marginality: Let $g$ be another production function, and $i \in N$ such that $mc(i, f, \bar{x}, x) \geq mc(i, g, \bar{x}, x)$ for all $\bar{x} \triangleright x$ with $\bar{x}_i = 0$. By definition of the value $\phi^f_i(x)$, we can deduce that $\phi^f_i(x) \geq \phi^g_i(x)$.

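5. Symmetry: Consider $i$ and $j$ two identical workers. The marginal contributions of workers $i$ and $j$ by joining $\bar{x}$ are identical, for all $\bar{x} \triangleright x$ such that $\bar{x}_i = \bar{x}_j = 0$. We want to show that $\phi^f_i(x) = \phi^f_j(x)$. If $x_i = x_j = 0$, it is obvious that $\phi^f_i(x) = \phi^f_j(x) = 0$. Assume that at least one of $x_j$ and $x_i$ is positive. By definition, we have:

$$\phi^f_i(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \phi(x, \bar{x})[mc(i, f, \bar{x}, x)]$$

$$= \underbrace{\sum_{\bar{x} \triangleright x,\ \bar{x}_i = \bar{x}_j = 0} \phi(x, \bar{x})[mc(i, f, \bar{x}, x)]}_{A} + \underbrace{\sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0,\ \bar{x}_j = x_j} \phi(x, \bar{x})[mc(i, f, \bar{x}, x)]}_{B}.$$

In the same manner, we also have:

$$\phi^f_j(x) = \sum_{y \triangleright x,\ y_j = 0} \phi(x, y)[mc(j, f, y, x)]$$

$$= \underbrace{\sum_{y \triangleright x,\ y_j = y_i = 0} \phi(x, y)[mc(j, f, y, x)]}_{C} + \underbrace{\sum_{y \triangleright x,\ y_j = 0,\ y_i = x_i} \phi(x, y)[mc(j, f, y, x)]}_{D}.$$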

By using the definition of symmetry, it follows that the terms $A$ and $C$ are equal since $mc(i, f, z, x) = mc(j, f, z, x)$ for any vector $z$ such that $z_i = z_j = 0$ and $z \triangleright x$. Now, we have to show that $B = D$. Indeed, it suffices to prove that (considering the vectors $y$ and $\bar{x}$ as they appear in the expression of the Shapley value for workers $j$ and $i$ respectively) $mc(j, f, y, x) = mc(i, f, \bar{x}, x)$, or $[f(y + x_j e_j) - f(y)] = [f(\bar{x} + x_i e_i) - f(\bar{x})]$, for the vectors $\bar{x}$ and $y$ which satisfy: $[\bar{x}_i = 0$ and $\bar{x}_j = x_j]$, $[y_j = 0$ and $y_i = x_i]$ and $[\bar{x}_k = y_k, \forall k \neq i, j]$. Consider $x^* = \bar{x} - x_j e_j$ and $y^* = y - x_i e_i$. These latter vectors are such that $x^*_i = y^*_j = 0$ and $x^*_k = y^*_k$ for all $k \neq i, j$, leading to $x^* = y^*$. Thus, $f(y + x_j e_j) - f(y) = f(y^* + x_i e_i + x_j e_j) - f(y^* + x_i e_i)$ and $f(\bar{x} + x_i e_i) - f(\bar{x}) = f(x^* + x_j e_j + x_i e_i) - f(x^* + x_j e_j) = f(y^* + x_i e_i + x_j e_j) - f(y^* + x_j e_j)$. Since $y^*$ is such that $y^* \triangleright x$ and $y^*_i = y^*_j = 0$, it follows that $f(y^* + x_j e_j) = f(y^* + x_i e_i)$, and $mc(j, f, y, x) = mc(i, f, \bar{x}, x)$. We can conclude that $B = D$ and $\phi^f_i(x) = \phi^f_j(x)$.
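The telescoping argument used for efficiency in item 1 can be illustrated on the production table of Example 2: along any order of entry, the marginal contributions add up to $f(x)$. The sketch below is only an illustration for two workers; `marginal_contributions` is a hypothetical helper name.

```python
from itertools import permutations

# Production table of Example 2: f[(x1, x2)] = output.
f = {(0, 0): 0, (0, 1): 50, (0, 2): 60,
     (1, 0): 50, (1, 1): 60, (1, 2): 70,
     (2, 0): 60, (2, 1): 70, (2, 2): 80}

def marginal_contributions(f, x, order):
    """Marginal contribution of each worker when workers enter in the given order."""
    current = [0] * len(x)
    contributions = {}
    for i in order:
        before = f[tuple(current)]
        current[i] = x[i]                  # worker i joins with effort x_i
        contributions[i] = f[tuple(current)] - before
    return contributions

x = (2, 1)
by_order = {order: marginal_contributions(f, x, order)
            for order in permutations(range(len(x)))}
for order, mc in by_order.items():
    print(order, mc, sum(mc.values()))
```

Averaging the contributions over the two orders gives the Shapley wages (40, 30) at $x = (2, 1)$, which also sum to $f(2, 1) = 70$.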

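Now, we need to prove the uniqueness of the value $\phi^f$. Consider another pay scheme $\varepsilon^f$ which satisfies the four aforementioned axioms. Let $i \in N$; we should prove that $\phi^f_i = \varepsilon^f_i$. By Lemma 7, the production function $f$ can be rewritten as $f = \sum_{\bar{x} \in T^n} c_{\bar{x}}(f) f_{\bar{x}}$. Let $x \in T^n$; then we have:

$$\varepsilon^f_i(x) = \varepsilon_i \Big\{ \sum_{\bar{x} \triangleright x} c_{\bar{x}}(f) f_{\bar{x}}(x) \Big\}$$

$$= \sum_{\bar{x} \triangleright x} \varepsilon_i \{ c_{\bar{x}}(f) f_{\bar{x}}(x) \}, \text{ by additivity of } \varepsilon_i$$

$$= \sum_{\bar{x} \triangleright x,\ \bar{x}_i > 0} \frac{c_{\bar{x}}(f)}{|\bar{x}|},$$

by using the symmetry and the null worker properties of $\varepsilon_i$.

Since $c_{\bar{x}}(f) = \sum_{x' \trianglelefteq \bar{x}} (-1)^{|\bar{x}| - |x'|} f(x')$, then:

$$\varepsilon^f_i(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i > 0} \sum_{x' \trianglelefteq \bar{x}} \frac{\{ (-1)^{|\bar{x}| - |x'|} f(x') \}}{|\bar{x}|}. \quad (5)$$

Simplifying the latter equation gives:

$$\varepsilon^f_i(x) = \sum_{\bar{x} \triangleright x,\ \bar{x}_i > 0} \frac{(|\bar{x}| - 1)!\,(|x| - |\bar{x}|)!}{|x|!}\, f(\bar{x}) - \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\, f(\bar{x})$$

$$= \sum_{\bar{x} \triangleright x,\ \bar{x}_i > 0} \frac{(|\bar{x}| - 1)!\,(|x| - |\bar{x}|)!}{|x|!}\,[f(\bar{x}) - f(\bar{x} - x_i e_i)]$$

$$= \sum_{\bar{x} \triangleright x,\ \bar{x}_i = 0} \frac{|\bar{x}|!\,(|x| - |\bar{x}| - 1)!}{|x|!}\,[f(\bar{x} + x_i e_i) - f(\bar{x})]$$

$$= \phi^f_i(x).$$

We have $\varepsilon^f_i(x) = \phi^f_i(x)$ for all $i \in N$, and this ends our proof.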

Proof of Theorem 2

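We already demonstrated that $\phi^f_i$ satisfies efficiency, marginality and symmetry. Let $f$ be a production function. We want to show that $\varepsilon^f = \phi^f$ for every allocation procedure $\varepsilon$ satisfying symmetry, efficiency and marginality.

Recall that marginality means that for any two production functions $f$ and $g$ defined on $T^n$, and $i \in N$ such that $mc(i, f, \bar{x}, x) = mc(i, g, \bar{x}, x)$ $\forall \bar{x} \triangleright x$, we have $\varepsilon^f_i(x) = \varepsilon^g_i(x)$. Consider a production function $f$ that is identically zero on all vectors of effort levels; then $mc(i, f, \bar{x}, x) = 0$ for all $i \in N$ and all $\bar{x} \triangleright x$. By symmetry of $\varepsilon$, we have $\varepsilon^f_i(x) = \varepsilon^f_j(x)$ and, by efficiency of $\varepsilon$, we also have $\sum_i \varepsilon^f_i(x) = f(x) = 0$. It follows that $\varepsilon^f_i(x) = 0$ for all $i \in N$. We can conclude that for any production function $f$ on $T^n$ and $i \in N$:

$$mc(i, f, \bar{x}, x) = 0 \ \ \forall \bar{x} \triangleright x \ \Rightarrow \ \varepsilon^f_i(x) = 0. \quad (6)$$

Using Lemma 7, we can write $f(y) = \sum_{\bar{x} \in T^n;\ \bar{x} \neq 0} c_{\bar{x}}(f) f_{\bar{x}}(y)$. It follows that:

$$\varepsilon^f_i(x) = \varepsilon_i \Big\{ \sum_{\bar{x} \triangleright x} c_{\bar{x}}(f) f_{\bar{x}}(x) \Big\} = \sum_{\bar{x} \triangleright x} \varepsilon_i \{ c_{\bar{x}}(f) f_{\bar{x}}(x) \}. \quad (7)$$

We have

$$c_{\bar{x}}(f) f_{\bar{x}}(a) = \begin{cases} c_{\bar{x}}(f) & \text{if } \bar{x} \trianglelefteq a \\ 0 & \text{if } \bar{x} \ntrianglelefteq a \end{cases},$$

then $\varepsilon_i \{ c_{\bar{x}}(f) f_{\bar{x}}(x) \} = \frac{c_{\bar{x}}(f)}{|\bar{x}|}$ and $\varepsilon^f_i(x) = \sum_{\bar{x} \triangleright x} \frac{c_{\bar{x}}(f)}{|\bar{x}|}$.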

Define the index $I$ of production function $f$ to be the minimum number of non-zero terms in some expression for $f$ of the form established in Lemma 7. This theorem is proven by induction on $I$.

• If $I = 0$, then $f \equiv 0$, so that $f(x) = 0$ for all $x \in T^n$. Using condition (6), we can conclude that $\varepsilon^f_i(x) = 0$ as well as $\phi^f_i(x) = 0$.

• If $I = 1$, then $f = c_{\bar{x}}(f) f_{\bar{x}}$ for some $\bar{x} \triangleright x$. Let $X_{\bar{x}} = \{l \in N : \bar{x}_l > 0\}$.

Assume $i \notin X_{\bar{x}}$; then for all $y \triangleright x$, $f(y + x_i e_i) - f(y) = 0$. It follows that $mc(i, f, y, x) = 0$ for $y \triangleright x$; then $\varepsilon^f_i(x) = 0$ from equation (6).

If $i, j \in X_{\bar{x}}$, then by the symmetry property we have $\varepsilon^f_i(x) = \varepsilon^f_j(x)$. Given that $\varepsilon^f_i$ satisfies the efficiency property, we have $\sum_{k \in N} \varepsilon^f_k(x) = f(x)$; hence $|\bar{x}|\,\varepsilon^f_k(x) = c_{\bar{x}}(f)$ since $f_{\bar{x}}(x) = 1$; therefore $\varepsilon^f_k(x) = \frac{c_{\bar{x}}(f)}{|\bar{x}|}$ for all $k \in X_{\bar{x}}$.

It follows that $\varepsilon^f(x)$ is the Shapley value whenever the index of $f$ is 0 or 1.

• Assume now that $\varepsilon^f$ is the Shapley value whenever the index of $f$ is at most $I$, and let $f$ have index $I + 1$ with expression $f = \sum_{k=1}^{I+1} c_{x^k}(f) f_{x^k}$, with all $c_{x^k}(f) \neq 0$ and $x^k \triangleright x$. Let $X_k = \{ i \in N : (x^k)_i > 0 \}$, $X = \bigcap_{k=1}^{I+1} X_k$ and assume $i \notin X$. Consider $g = \sum_{k:\ i \in X_k} c_{x^k}(f) f_{x^k}$. The index of $g$ is at most $I$ and $mc(i, f, y, x) = mc(i, g, y, x)$ for all $y \triangleright x$. By induction and marginality, $\varepsilon^f_i(x) = \varepsilon^g_i(x) = \sum_{k:\ i \in X_k} \frac{c_{x^k}(f)}{|x^k|}$, which is the value of $\phi^f_i(x)$.

It remains to show that $\varepsilon^f_i(x)$ is the value $\phi^f_i(x)$ when $i \in X$. By symmetry, $\varepsilon^f_i(x)$ is a constant $\varepsilon$ for all members of $X$; likewise the value $\phi^f_i(x)$ is some constant $\varepsilon'$ for all members of $X$. Since both allocations sum to $f(x)$ and are equal for all $i \notin X$, it follows that $\varepsilon = \varepsilon'$. This ends our proof.

Proof of Lemma 1

Let $f$ be a production function, $i \in N$ be a null-worker and $\theta_i$ be an allocation procedure satisfying the marginality property. Consider a vector of effort levels $x \in T^n$ such that $mc(i, f, \bar{x}, x) = 0$ for all $\bar{x} \triangleright x$ with $\bar{x}_i = 0$. If $g$ is a null production function, then $mc(i, g, \bar{x}, x) = 0$ for all $\bar{x}$ and $x$, with $\bar{x} \triangleright x$ and $\bar{x}_i = 0$. It follows that $mc(i, f, \bar{x}, x) = mc(i, g, \bar{x}, x)$; then $\theta^f_i(x) = \theta^g_i(x)$. Given that $g \equiv 0$, we have $\theta^g_i(x) = 0$ for all $x \in T^n$. Therefore, $\theta^f_i(x) = 0$ and $\theta_i$ satisfies the null-worker property.

Proof of Lemma 2

Let $(N, G)$ be a transferable-utility game and $T = \{0, 1\}$. We fix the effort level $x = (1, \cdots, 1)$, we assign $f(x) = G(N)$, and we define the vector $x_S$ by $x_{S,j} = 0$ for all $j \in N \setminus S$ and $x_{S,i} = x_i$ for all $i \in S$. Now we can construct $f : T^n \to \mathbb{R}$ such that $f(x_S) = G(S)$ for all $S \subseteq N$. Note that $T^n \equiv \bigcup_{S \subseteq N} \{x_S\}$, which means that $f$ is well defined on the domain $T^n$ and that $f(0) = G(\emptyset) = 0$ by definition of a characteristic function. It follows that $f$ is a production function as we have defined it.

Let $F = (T, N, f)$ be any firm. If we fix $x$, we can build a transferable-utility game for the fixed effort, $G^f_x : 2^N \to \mathbb{R}$, as follows: $G^f_x(S) = f(x_S)$, where $x_S$ is defined by $x_{S,j} = 0$ for all $j \in N \setminus S$ and $x_{S,i} = x_i$ for all $i \in S$.

We check that $(N, G^f_x)$ is a game. To do this it suffices to check that $G^f_x$ is a characteristic function. We observe that $G^f_x(\emptyset) = f(0) = 0$ by assumption and, under the assumption of limited datasets, we observe all $x_S$ for a given $x$ and for any $S \subseteq N$. Thus $G^f_x$ is a characteristic function.
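The construction of the game from the firm can be illustrated with a minimal sketch. The technology `f` below (squared total effort) and the helper names `restrict` and `game_from_firm` are assumptions introduced only for this example; they are not part of the paper.

```python
from itertools import combinations

def restrict(x, S):
    """Return x_S: keep x_i for i in S and set every other coordinate to 0."""
    return tuple(x_i if i in S else 0 for i, x_i in enumerate(x))

def game_from_firm(f, x):
    """Build the characteristic function G(S) = f(x_S) for every coalition S."""
    n = len(x)
    return {
        frozenset(S): f(restrict(x, S))
        for size in range(n + 1)
        for S in combinations(range(n), size)
    }

# Hypothetical technology: output is the square of total effort, so f(0) = 0.
def f(x):
    return sum(x) ** 2

G = game_from_firm(f, (1, 1, 1))
```

With three workers the dictionary `G` has one entry per coalition, and the empty coalition maps to `f(0) = 0`, which is the property the proof checks.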

Proof of Lemma 3

We fix $f(x)$ and omit it from the notation. It should be clear that this optimization is pointwise. We want to solve $\min_{v \in \mathbb{R}^n} \|v - \theta\|^2$ subject to $v$ satisfying the equal-treatment property. To define the problem in a tractable way, we write $i \sim_{sym} j$ when workers $i$ and $j$ are identical or symmetric in $f(x)$. Notice that all workers are identical to themselves. With this, we define for any worker $i$ the equivalence class $[i] = \{j \in N \mid j \sim_{sym} i\}$. We notice that imposing the restriction $(v_i - v_j) = 0$ for $i, j \in [i]$ is equivalent to the “normalized” restriction $\frac{1}{|[i]|}(v_i - v_j) = 0$, where $|[i]| \geq 1$ is the cardinality of the equivalence class.

Formally, solving the problem of interest can be formulated as:

$$\min_{v \in \mathbb{R}^n} \; \frac{1}{2} \sum_{i \in N} (v_i - \theta_i)^2 + \sum_{i \in N} \sum_{j \in [i]} \lambda_i (v_i - v_j).$$

The first-order conditions (which are necessary and sufficient) are:

$$v_i - \theta_i + \lambda_i - \sum_{j \in [i]} \lambda_j = 0 \quad \text{for all } i \in N;$$

$$(v_i - v_j) = 0 \quad \text{for all } i \in N \text{ and all } j \in [i].$$

Because $v_i = v_j$ for all $i, j \in [i]$ we can call $v_{[i]} = v_i$ (without the index). Thus:

$$\sum_{j \in [i]} v_j = |[i]|\, v_{[i]},$$

and

$$\lambda_i - \lambda_j = \theta_i - \theta_j.$$

Adding up the last expression over $j \in [i]$ leads to:

$$|[i]|\lambda_i - \sum_{j \in [i]} \lambda_j = |[i]|\theta_i - \sum_{j \in [i]} \theta_j.$$

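Then, solving for $\lambda_i$ gives us:

$$\lambda_i = \theta_i - \frac{1}{|[i]|} \sum_{j \in [i]} \theta_j + \frac{1}{|[i]|} \sum_{j \in [i]} \lambda_j.$$

We know that

$$\sum_{r \in [i]} \theta_r - \sum_{r \in [i]} \frac{1}{|[i]|} \sum_{j \in [i]} \theta_j = 0;$$

then we conjecture that $\sum_{j \in [i]} \lambda_j = 0$, and this implies:

$$\lambda_i = \theta_i - \frac{1}{|[i]|} \sum_{j \in [i]} \theta_j \quad \text{and} \quad v_{[i]} = \frac{1}{|[i]|} \sum_{j \in [i]} \theta_j.$$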

We verify that $\lambda_i$ and $v_{[i]}$ satisfy the first-order conditions, showing that they are a solution. Since the problem has a unique solution, it follows that this is the unique solution. We conclude that the optimal solution is:

$$v^{sym}_i = v_{[i]} = \frac{1}{|[i]|} \sum_{j \in [i]} \theta_j.$$

This means that the optimal solution is the average payoff of the equivalence class induced by the equivalence relation $\sim_{sym}$ of identical workers.
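The class-average solution can be sketched numerically. The pay vector `theta`, the partition `classes`, and the function names below are hypothetical and serve only to illustrate the result of the lemma.

```python
def equal_treatment_projection(theta, classes):
    """Replace each worker's pay by the average pay of his equivalence class."""
    v_sym = [0.0] * len(theta)
    for members in classes:
        average = sum(theta[j] for j in members) / len(members)
        for i in members:
            v_sym[i] = average
    return v_sym

def squared_distance(u, v):
    """Squared Euclidean distance between two pay vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))

theta = [5.0, 1.0, 4.0]      # hypothetical observed pay
classes = [[0, 1], [2]]      # assumption: workers 0 and 1 are identical
v_sym = equal_treatment_projection(theta, classes)
```

Here `v_sym` pays 3.0 to each of the two identical workers and leaves the unique worker at 4.0. Any other pay vector that treats workers 0 and 1 equally, for example `[2.5, 2.5, 4.0]`, lies farther from `theta` in squared distance.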

Proof of Lemma 5


Take $v^{sym}$ to be any symmetric pay scheme and $v$ to be a skew-symmetric pay scheme. Notice that $\langle v^{sym}, v \rangle = \sum_{i \in N} v^{sym}_i v_i$, and notice furthermore that, for singleton equivalence classes $[i]$ of the identical-workers equivalence relation (that is, “unique workers”), the skew-symmetric pay scheme must have zero payoff, i.e., $v_i = 0$ for all “unique” workers. For non-unique workers who are identical, say $i \sim_{sym} j$, we have that $v^{sym}_i = v^{sym}_j$ and $v_i = -v_j$, which makes $v^{sym}_i v_i + v^{sym}_j v_j = 0$. For more general cases, take the equivalence class $[i]$ and notice that:

$$\langle v^{sym}, v \rangle = \sum_{j \in [i]} v^{sym}_j v_j = v^{sym} \sum_{j \in [i]} v_j = 0,$$

where $v^{sym}$ is a scalar value equal to the symmetric payoff given to any member of the equivalence class $[i]$, and because $\sum_{j \in [i]} v_j = 0$ by definition of a skew-symmetric payoff. This implies that $\langle v^{sym}, v \rangle = 0$. Notice that this proof can be directly extended to the case of several equivalence classes.
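As a small worked illustration of this argument, consider three workers in which workers 1 and 2 are identical and worker 3 is unique. The scalars $a$, $b$, $c$ are arbitrary and are introduced here only for the example.

```latex
\begin{align*}
v^{sym} &= (a,\; a,\; b), \qquad v = (c,\; -c,\; 0), \\
\langle v^{sym}, v \rangle &= a \cdot c + a \cdot (-c) + b \cdot 0 = 0 .
\end{align*}
```

The two classes $\{1, 2\}$ and $\{3\}$ each contribute zero to the inner product, which is the several-classes case mentioned at the end of the proof.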

Proof of Theorem 3

First, we have to prove that θ = ϕ + e sym + e eff + e mrg . This follows directly from the lemmas that derive the approximations and residuals v sym , e sym and v sym,eff , e eff , and from Theorem 1, which gives e mrg = v sym,eff − ϕ and e sh = θ − ϕ, leading to:

e mrg = e sh − e sym − e eff .

Now, it is necessary to obtain the decomposition. Notice that:

||e sh || 2 = ||e sym + e eff + e mrg || 2 = ||e sym || 2 + ||e eff || 2 + ||e mrg || 2

+ 2 < e sym , e eff > +2 < e sym , e mrg > +2 < e eff , e mrg > .

The proof amounts to checking that the residuals e sym , e eff and e mrg are pairwise orthogonal.

• First we prove that $\langle e^{sym}, e^{eff} \rangle = 0$.

Notice that $e^{eff}_i = -E/n$, where $E = f(x) - \sum_{i \in N} \theta_i(f(x))$ is the wasted output, and so it is a symmetric pay scheme. We also know that $e^{sym}_i = \theta_i - \frac{1}{|[i]|} \sum_{j \in [i]} \theta_j$, where $[i] = \{j \in N : j \sim_{sym} i\}$ is the set of workers symmetric to $i$ at $f$ and realized effort $x$. Thus $\sum_{j \in [i]} e^{sym}_j = 0$ always, hence making $e^{sym}$ a skew-symmetric pay scheme. By Lemma 5, we conclude that $\langle e^{sym}, e^{eff} \rangle = 0$.

• Second we prove that < e sym , e mrg >= 0.

We use the identity e mrg = θ − ϕ − e sym − e eff and the properties of the inner product to write:

< e mrg , e sym >=< θ − ϕ, e sym > + < −e sym , e sym > + < −e eff , e sym > .

Here we notice that the third component is zero by the first step. Now, we have:

< e mrg , e sym >=< θ − ϕ, e sym > + < −e sym , e sym > .


Notice furthermore that the Shapley pay scheme ϕ fulfills the equal-treatment property, that is, it is a symmetric pay scheme, so that < ϕ, e sym >= 0 by Lemma 5. Then,

< θ − ϕ, e sym >=< θ, e sym > + < −ϕ, e sym >=< θ, e sym > .

Notice also that the payoff can be decomposed into its symmetric pay scheme projection and the skew

symmetric residual θ = v sym + e sym , such that:

< θ, e sym >=< v sym , e sym > + < e sym , e sym >=< e sym , e sym > .

Therefore:

< e mrg , e sym >=< e sym , e sym > − < e sym , e sym >= 0.

• The third and final step consists of checking $\langle e^{eff}, e^{mrg} \rangle = 0$.

First we apply the bilinearity of the inner product to expand:

$$\langle e^{mrg}, e^{eff} \rangle = \langle v^{sym,eff} - \phi, e^{eff} \rangle = -\frac{E}{n} \langle \mathbf{1}, v^{sym,eff} - \phi \rangle = 0, \quad \text{where } E = f(x) - \sum_{i \in N} \theta_i(f(x)).$$

Observe that $\langle \mathbf{1}, v^{sym,eff} - \phi \rangle = \sum_{i \in N} v^{sym,eff}_i - \sum_{i \in N} \phi_i = 0$, because $v^{sym,eff}$ and $\phi$ are efficient.

The moreover part of the statement follows from:

(i) If θ satisfies the equal-treatment, efficiency, and marginality axioms, then by Theorem 1, we conclude

that θ f (x) = ϕ f (x) is the Shapley value at f(x); Thus, ||e sh || = 0. The moreover statement (i) in

the theorem follows from the contrapositive of the previous result (i.e., if ||e sh || > 0, then θ is lacking

equal-treatment, efficiency or marginality).

(ii) If θ satisfies the equal-treatment axiom, then it follows that v sym = θ; thus, ||e sym || = 0. The moreover

statement (ii) in the theorem follows from the contrapositive of the previous result (i.e., if ||e sym || > 0,

then θ is not symmetric).

(iii) Observe that if θ is efficient, then $f(x) - \sum_{i \in N} \theta_i(x) = 0$. Thus:

$$\|e^{eff}\|^2 = \sum_{x \in T^n} \frac{\left[f(x) - \sum_{i \in N} \theta_i(x)\right]^2}{n} = 0.$$

Hence, the moreover statement follows from the contrapositive of the previous result (i.e., if $\|e^{eff}\| > 0$, then θ is not efficient).

(iv) If θ is efficient and symmetric, then $v^{sym,eff}(f(x)) = \theta(f(x))$; therefore, by (ii) and (iii), we have $\|e^{sym}\| = \|e^{eff}\| = 0$ and $\|e^{sh}\| = \|e^{mrg}\|$. If $\|e^{mrg}\| > 0$, then $\theta(f(x)) \neq \phi(f(x))$, so θ is not the Shapley value. Thus, it does not satisfy the marginality property, by the uniqueness result of Theorem 1.
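The orthogonal decomposition can be checked numerically on a toy firm. The sketch below assumes a hypothetical three-worker game `G` (squared total productivity, with workers 0 and 1 identical) and a hypothetical pay vector; the function names are illustrative and not the paper's.

```python
from itertools import combinations, permutations
from math import isclose

WEIGHTS = (1, 1, 2)  # hypothetical productivities; workers 0 and 1 are identical

def G(S):
    """Hypothetical game G(S) = f(x_S): squared total productivity of S."""
    return sum(WEIGHTS[i] for i in S) ** 2

def shapley(n):
    """Average marginal contribution over all n! orderings of the workers."""
    phi = [0.0] * n
    orders = list(permutations(range(n)))
    for order in orders:
        before = set()
        for i in order:
            phi[i] += G(before | {i}) - G(before)
            before.add(i)
    return [p / len(orders) for p in phi]

def identical(i, j, n):
    """i ~ j when swapping them never changes the output of any coalition."""
    others = [k for k in range(n) if k not in (i, j)]
    return all(
        G(set(S) | {i}) == G(set(S) | {j})
        for size in range(len(others) + 1)
        for S in combinations(others, size)
    )

def decompose(theta):
    """Return the residuals e_sh, e_sym, e_eff, e_mrg for the pay vector theta."""
    n = len(theta)
    phi = shapley(n)
    classes = [[j for j in range(n) if identical(i, j, n)] for i in range(n)]
    v_sym = [sum(theta[j] for j in c) / len(c) for c in classes]
    waste = G(range(n)) - sum(theta)          # E = f(x) minus total pay
    v_sym_eff = [v + waste / n for v in v_sym]
    e_sym = [t - v for t, v in zip(theta, v_sym)]
    e_eff = [-waste / n] * n
    e_mrg = [v - p for v, p in zip(v_sym_eff, phi)]
    e_sh = [t - p for t, p in zip(theta, phi)]
    return e_sh, e_sym, e_eff, e_mrg

def sq_norm(v):
    return sum(a * a for a in v)

e_sh, e_sym, e_eff, e_mrg = decompose([6.0, 2.0, 5.0])  # hypothetical pay
assert isclose(sq_norm(e_sh), sq_norm(e_sym) + sq_norm(e_eff) + sq_norm(e_mrg))
```

In this example the three squared norms are 8, 3 and 6, and they add up to the squared distance 17 between the pay vector and the Shapley value `[4, 4, 8]`.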


Proof of Lemma 6

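The solution $v^{sym,eff,null}$ is obtained by solving the following optimization problem, where $\mathcal{N} \subseteq N$ denotes the set of null workers:

$$v^{sym,eff,null} = \arg\min_{v \in \mathbb{R}^n} \|v^{sym,eff} - v\| \quad \text{subject to } v_k = 0 \text{ for } k \in \mathcal{N} \text{ and such that } \sum_{i \in N} v_i = f(x).$$

This optimization problem is equivalent to what follows:

$$\min_{v \in \mathbb{R}^n} \Big\{ \frac{1}{2} \sum_{i \in N} (v^{sym,eff}_i - v_i)^2 + \sum_{k \in \mathcal{N}} \lambda_k v_k + \nu \Big( \sum_{i \in N} v_i - f(x) \Big) \Big\}.$$

The first-order conditions are given by:

$$v_i = v^{sym,eff}_i - \nu \quad \text{for } i \in N \setminus \mathcal{N};$$

$$v_k = v^{sym,eff}_k - \lambda_k - \nu \quad \text{for } k \in \mathcal{N};$$

$$\lambda_k = v^{sym,eff}_k - \nu \quad \text{for } k \in \mathcal{N}.$$

Replacing these conditions into the constraints, we have:

$$\sum_{i \in N \setminus \mathcal{N}} v^{sym,eff}_i - [n - |\mathcal{N}|]\,\nu = f(x),$$

$$f(x) - \sum_{k \in \mathcal{N}} v^{sym,eff}_k - [n - |\mathcal{N}|]\,\nu = f(x),$$

$$\nu = -\frac{\sum_{k \in \mathcal{N}} v^{sym,eff}_k}{n - |\mathcal{N}|},$$

$$v^{sym,eff,null}_i = v^{sym,eff}_i + \frac{\sum_{k \in \mathcal{N}} v^{sym,eff}_k}{n - |\mathcal{N}|} \quad \text{for } i \in N \setminus \mathcal{N},$$

$$v^{sym,eff,null}_k = 0 \quad \text{for } k \in \mathcal{N}.$$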
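The closed-form solution can be sketched in a few lines. The input vector and the set of null workers are hypothetical, and the sketch assumes that at least one worker is not a null worker so that the denominator is positive.

```python
def null_worker_projection(v_sym_eff, null_workers):
    """Zero the pay of null workers and share it equally among the others."""
    n = len(v_sym_eff)
    removed = sum(v_sym_eff[k] for k in null_workers)
    share = removed / (n - len(null_workers))   # equals -nu in the proof
    return [
        0.0 if i in null_workers else v_sym_eff[i] + share
        for i in range(n)
    ]

v_sym_eff = [4.0, 4.0, 2.0]   # hypothetical efficient pay with f(x) = 10
projected = null_worker_projection(v_sym_eff, null_workers={2})
```

The pay of the null worker is split equally between the two remaining workers, so total pay is unchanged and the result stays efficient.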

Proof of Theorem 4

First, we notice that, pointwise:

$$\theta^f(x) = \phi^f(x) + e^{sym}(f(x)) + e^{eff}(f(x)) + e^{null}(f(x)) + e^{add}(f(x)).$$

In fact, due to Theorem 2, we have:

$$\begin{aligned}
\theta &= v^{sym} + e^{sym} \\
&= v^{sym,eff} + e^{eff} + e^{sym} \\
&= v^{sym,eff,null} + e^{null} + e^{eff} + e^{sym} \\
&= v^{sym,eff,null,add} + e^{add} + e^{null} + e^{eff} + e^{sym} \\
&= \phi + e^{add} + e^{null} + e^{eff} + e^{sym}.
\end{aligned}$$

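By definition, we have:

$$\begin{aligned}
\|\theta - \phi\|^2 &= \|e^{sym} + e^{eff} + e^{null} + e^{add}\|^2 \\
&= \|e^{sym}\|^2 + \|e^{eff}\|^2 + \|e^{null}\|^2 + \|e^{add}\|^2 \\
&\quad + 2\langle e^{sym}, e^{eff}\rangle + 2\langle e^{sym}, e^{null}\rangle + 2\langle e^{sym}, e^{add}\rangle \\
&\quad + 2\langle e^{eff}, e^{null}\rangle + 2\langle e^{eff}, e^{add}\rangle + 2\langle e^{null}, e^{add}\rangle.
\end{aligned}$$

Now, we study which residuals are orthogonal to each other.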

1. We already know that < e sym , e eff >= 0 by Theorem 3.

2. Notice that < e sym , e null >= 0, because e null is symmetric and e sym is skew symmetric by Lemma 5.

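3. We also know that

$$\langle e^{eff}, e^{null} \rangle = \frac{E}{n} \langle \mathbf{1}, e^{null} \rangle = 0, \quad \text{with } E = -\Big[f(x) - \sum_{i \in N} \theta_i\Big],$$

because $\langle \mathbf{1}, e^{null} \rangle = 0$. In fact, by definition, $e^{null} = v^{sym,eff} - v^{sym,eff,null}$, or entry-wise, writing $\mathcal{N}$ for the set of null workers:

$$e^{null}_i = -\frac{\sum_{k \in \mathcal{N}} v^{sym,eff}_k}{|N \setminus \mathcal{N}|} = -\frac{f(x) - \sum_{j \in N \setminus \mathcal{N}} v^{sym,eff}_j}{|N \setminus \mathcal{N}|} \quad \text{for } i \in N \setminus \mathcal{N};$$

and $e^{null}_k = v^{sym,eff}_k$ for $k \in \mathcal{N}$. Therefore,

$$\begin{aligned}
-\langle \mathbf{1}, e^{null} \rangle &= -\sum_{i \in N} e^{null}_i \\
&= f(x) - \sum_{i \in N \setminus \mathcal{N}} v^{sym,eff}_i - \sum_{k \in \mathcal{N}} v^{sym,eff}_k \\
&= f(x) - \sum_{k \in N} v^{sym,eff}_k(x) \\
&= f(x) - f(x), \quad \text{since } v^{sym,eff} \text{ is efficient} \\
&= 0.
\end{aligned}$$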

4. The additivity error is given by e add = v sym,eff,null − ϕ. We have < e add , e sym >= 0, because e add is

symmetric and e sym is skew symmetric by Lemma 5.


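5. We show that $\langle e^{add}, e^{eff} \rangle = 0$. Indeed:

$$\begin{aligned}
\langle e^{add}, e^{eff} \rangle &= \langle v^{sym,eff,null}, e^{eff} \rangle - \langle \phi, e^{eff} \rangle \\
&= \langle v^{sym,eff,null}, \mathbf{1} \rangle \frac{E}{n} - \langle \phi, \mathbf{1} \rangle \frac{E}{n} \\
&= \frac{f(x) E}{n} - \frac{f(x) E}{n} \\
&= 0.
\end{aligned}$$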

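6. The remaining term $\langle e^{add}, e^{null} \rangle$ is in general non-zero. Indeed, with $\mathcal{N}$ the set of null workers, we have what follows:

$$\langle \phi, e^{null} \rangle = -\Big[\frac{\sum_{k \in \mathcal{N}} v^{sym,eff}_k}{|N \setminus \mathcal{N}|}\Big] \sum_{i \in N \setminus \mathcal{N}} \phi_i + \sum_{k \in \mathcal{N}} v^{sym,eff}_k \phi_k.$$

We can decompose $\langle v^{sym,eff,null}, e^{null} \rangle$ as:

$$\begin{aligned}
\langle v^{sym,eff,null}, e^{null} \rangle &= \langle v^{sym,eff}, e^{null} \rangle + \langle e^{null}, e^{null} \rangle \\
&= \langle v^{sym,eff}, e^{null} \rangle + \frac{\big[\sum_{k \in \mathcal{N}} v^{sym,eff}_k\big]^2}{|N \setminus \mathcal{N}|} + \sum_{k \in \mathcal{N}} (v^{sym,eff}_k)^2.
\end{aligned}$$

In the same manner, we can rewrite $\langle v^{sym,eff,null}, e^{null} \rangle$ as:

$$\begin{aligned}
\langle v^{sym,eff,null}, e^{null} \rangle = {} & -\Big[\frac{f(x) - \sum_{i \in N \setminus \mathcal{N}} v^{sym,eff}_i}{|N \setminus \mathcal{N}|}\Big] \sum_{i \in N \setminus \mathcal{N}} v^{sym,eff}_i + \sum_{k \in \mathcal{N}} (v^{sym,eff}_k)^2 \\
& + \frac{\big[\sum_{k \in \mathcal{N}} v^{sym,eff}_k\big]^2}{|N \setminus \mathcal{N}|} + \sum_{k \in \mathcal{N}} (v^{sym,eff}_k)^2.
\end{aligned}$$

Given the equation that

$$\langle v^{sym,eff,null}, e^{null} \rangle = -\frac{f(x) \sum_{i \in N \setminus \mathcal{N}} v^{sym,eff}_i}{|N \setminus \mathcal{N}|} + 2 \sum_{k \in \mathcal{N}} (v^{sym,eff}_k)^2,$$

it follows that $\langle e^{add}, e^{null} \rangle \neq 0$.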

The conclusion is the following:

$$\begin{aligned}
\|\theta - \phi\|^2 &= \|e^{sym} + e^{eff} + e^{null} + e^{add}\|^2 \\
&= \|e^{sym}\|^2 + \|e^{eff}\|^2 + \|e^{add}\|^2 + \|e^{null}\|^2 + 2\langle e^{add}, e^{null}\rangle,
\end{aligned}$$

and $\|e^{mrg}\|^2 = \|e^{null}\|^2 + \|e^{add}\|^2 + 2\langle e^{null}, e^{add}\rangle$.

The moreover part of the statement now is established:

(i) If θ satisfies the null-worker and efficiency properties, or if $\theta^f_k(x) = 0$ for all null workers $k \in \mathcal{N}$ in $f(x)$, then $v^{sym}_k(f(x)) = 0$ for all $k \in \mathcal{N}$ because the null workers are symmetric among each other. Under the efficiency requirement, we have $v^{sym,eff}(f(x)) = v^{sym}(f(x))$. Therefore $v^{sym,eff}(f(x))$ already satisfies the null-worker property. Hence $v^{sym,eff,null}(f(x)) = v^{sym,eff}(f(x))$. If θ satisfies the null-worker property and efficiency, then $\|e^{null}\| = \|v^{sym,eff} - v^{sym,eff,null}\| = 0$. Moreover, if $\|e^{null}\| > 0$, then the null-worker property fails and, as a consequence, the marginality property must fail by Lemma 1.

(ii) If θ is additive, symmetric, and efficient and satisfies the null-worker property, then θ = ϕ is the Shapley wage function; then $\|e^{sh}\| = 0$. Moreover, if $\|e^{add}\| > 0$, then at least one of the axioms (additivity, symmetry, efficiency, and the null-worker property) fails. If $\|e^{i}\| = 0$ for some $i \in \{sym, eff, null\}$, then we know that either additivity or one of the axioms in $\{sym, eff, null\} \setminus \{i\}$ fails. In particular, we know with certainty that, if none of the axioms $i \in \{sym, eff, null\}$ fails, then $\|e^{add}\| > 0$ implies that additivity fails.

Proof of Theorem 5

First, we prove that a pay scheme θ that satisfies efficiency and marginality is a random value, i.e.:

$$\theta^f_i(x) = \sum_{r \in R(N)} \gamma(r)\, mc(i, f, r(x), x),$$

where $\gamma \in \Delta(R(N))$ is a probability distribution in the simplex over linear orderings of $N$ and $f$ is a production function. This follows from the equivalence between the firm's random value and the random value of a transferable-utility game, due to Lemma 2, and from Theorem 2 in Khmelnitskaya (1999).

(i) First we prove that marginality implies that $\theta^f_i(x) = \theta_i(\{mc(i, f, r(x), x)\}_{r \in R(N)})$ is a marginalist value for all $i \in N$, where $\theta_i : \mathbb{R}^{2^{n-1}} \to \mathbb{R}$. Notice that marginality implies that, for any two production functions $f$ and $g$ and for $i \in N$ such that $mc(i, f, y, x) = mc(i, g, y, x)$ for all $y \rhd x$ with $y_i = 0$, we have $\theta^f_i(x) = \theta^g_i(x)$; thus $\theta^f_i$ is a marginalist value.

(ii) Second, we prove that marginality implies that $\theta^f_i(x)$ is a monotone value. A monotone value is such that, for any two production functions $f$ and $g$ with $g(x_S) \geq f(x_S)$ when $i \in S$ and $g(x_S) = f(x_S)$ when $i \notin S$, for $x_S \in T^n$ ($x_S$ is defined by $x_{S,i} = x_i$ when $i \in S$ and $x_{S,i} = 0$ when $i \notin S$), we have $\theta^g_i(x) \geq \theta^f_i(x)$. If, for two functions $f$ and $g$, it holds that $g(x_S) \geq f(x_S)$ when $i \in S$ and $g(x_S) = f(x_S)$ when $i \notin S$, then $mc(i, g, y, x) = g(y + x_i e_i) - g(y) \geq mc(i, f, y, x) = f(y + x_i e_i) - f(y)$ for all $y \rhd x$ such that $y_i = 0$. Therefore, by the marginality property, we conclude that $\theta^g_i(x) \geq \theta^f_i(x)$, implying that $\theta_i$ is a monotone value.

(iii) Third, we recall that marginality implies the null-worker property by Lemma 1.

By (i), (ii) and (iii), by Lemma 2 and Theorem 2 in Khmelnitskaya (1999), and by the efficiency of θ and the assumption that $n \geq 3$, we conclude that θ is a random value:

$$\theta^f_i(x) = \sum_{r \in R(N)} \gamma(r)\, mc(i, f, r(x), x) \quad \text{for } \gamma \in \Delta(R(N)).$$
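A random value can be sketched as a weighted average of marginal-contribution vectors, one per ordering of the workers. The game `G` below (squared coalition size) and the two distributions over orderings are hypothetical choices made only to illustrate the formula.

```python
from itertools import permutations

def marginal_vector(G, order):
    """Marginal contribution of each worker when workers enter in `order`."""
    pay = [0.0] * len(order)
    before = frozenset()
    for i in order:
        pay[i] = G(before | {i}) - G(before)
        before = before | {i}
    return pay

def random_value(G, n, gamma):
    """Expected marginal vector under the distribution gamma over orderings."""
    value = [0.0] * n
    for order, weight in gamma.items():
        for i, m in enumerate(marginal_vector(G, order)):
            value[i] += weight * m
    return value

def G(S):
    """Hypothetical game: squared number of workers in the coalition."""
    return len(S) ** 2

n = 3
orders = list(permutations(range(n)))
uniform = {order: 1 / len(orders) for order in orders}   # Shapley weights
single = {(0, 1, 2): 1.0}                                # all mass on one ordering

shapley_pay = random_value(G, n, uniform)
ordered_pay = random_value(G, n, single)
```

Both pay vectors sum to `G(N) = 9`, so every random value is efficient. The uniform distribution gives each worker 3 (up to floating-point rounding), while the degenerate distribution pays the workers their marginal contributions 1, 3 and 5 in entry order.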

Now, we prove that, if θ is efficient and satisfies marginality, then $\|e^{mrg}\| \leq \sqrt{K^f(x)}$. First recall that:

$$v^{sym}_i = \frac{1}{|[i]|} \sum_{j \in [i]} \theta^f_j(x),$$

for $[i] = \{j \in N : j \sim_{sym} i\}$, the equivalence class of $i$ under the relation of symmetric workers in $f$ at $x$, for all $i \in N$. Now, from the fact that θ is a random value:

$$v^{sym}_i = \sum_{r \in R(N)} \gamma(r) \Big\{ \frac{1}{|[i]|} \sum_{j \in [i]} mc(j, f, r(x), x) \Big\}.$$

By efficiency of θ, we have $v^{sym,eff}_i = v^{sym}_i$. Now $\|e^{mrg}\|^2 = \|v^{sym,eff} - \phi\|^2$. Recall that the Shapley value can be written as a random value with uniform probability $\gamma(r) = 1/n!$:

$$\phi^f_i(x) = \frac{1}{n!} \sum_{r \in R(N)} mc(i, f, r(x), x).$$

Now, the Shapley value is symmetric. Thus:

$$\frac{1}{|[i]|} \sum_{j \in [i]} \phi^f_j(x) = \phi^f_i(x).$$

We then notice that:

$$\phi^f_i(x) = \frac{1}{n!} \sum_{r \in R(N)} \Big\{ \frac{1}{|[i]|} \sum_{j \in [i]} mc(j, f, r(x), x) \Big\}.$$

Given this latter equation, we conclude that:

$$\|e^{mrg}\|^2 = \sum_{i \in N} \Big\{ \sum_{r \in R(N)} \Big[\gamma(r) - \frac{1}{n!}\Big] \frac{1}{|[i]|} \sum_{j \in [i]} mc(j, f, r(x), x) \Big\}^2.$$

By definition of

$$K^f(x) = \max\Big\{ \sum_{i \in N} \Big( \sum_{r \in R(N)} \Big[\rho(r) - \frac{1}{n!}\Big] \frac{1}{|[i]|} \sum_{j \in [i]} mc(j, f, r(x), x) \Big)^2 : \rho \in \Delta(R(N)) \Big\},$$

it follows that $\|e^{mrg}\|^2 \leq K^f(x)$ if θ is efficient and marginal.

By the contrapositive, if $\|e^{mrg}\| > \sqrt{K^f(x)}$ and θ is efficient, then marginality must fail.

• The moreover part of the statement holds because, under marginality and efficiency of θ, $e^{null} = 0$. Thus $\|e^{mrg}\| = \|e^{add}\|$. We conclude that, if θ satisfies the null-worker property and is efficient for $n \geq 3$, then $\|e^{add}\| > \sqrt{K^f(x)}$ implies a violation of the marginality and additivity properties.

Proof of Theorem 6

(i) If $\|e^{sh}\|_{\mathcal{F}} = 0$, then $\theta^f(x) = \phi^f(x)$ for all $f \in \mathcal{F}$. Thus, we build $\vartheta = \phi$, which is an extension of the data.

(ii) If $\|e^{sym}\|_{\mathcal{F}} = 0$, then $\theta^f(x) = v^{sym}(f(x))$; thus it is symmetric for each $f \in \mathcal{F}$. We build $\vartheta(g(x)) = \phi^g(x)$ for $g \in F \setminus \mathcal{F}$ and $\vartheta(f(x)) = \theta^f(x)$ for $f \in \mathcal{F}$; this is symmetric for each $f \in F$.

(iii) If $\|e^{eff}\|_{\mathcal{F}} = 0$, then $v^{sym}(f(x)) = v^{sym,eff}(f(x))$ and $\sum_{i \in N} v^{sym}_i(f(x)) = \sum_{i \in N} \theta^f_i(x) = f(x)$; thus $\theta^f(x)$ is efficient for each $f \in \mathcal{F}$. We build $\vartheta(g(x)) = \phi^g(x)$ for $g \in F \setminus \mathcal{F}$ and $\vartheta(f(x)) = \theta^f(x)$ for $f \in \mathcal{F}$; this is efficient for each $f \in F$.

(iv) If $\|e^{sym}\|_{\mathcal{F}} = 0$, $\|e^{eff}\|_{\mathcal{F}} = 0$, and $\|e^{null}\|_{\mathcal{F}} = 0$, then $\theta^f(x) = v^{sym,eff,null}(f(x))$; thus, $\theta^f(x)$ is symmetric and efficient and has the null-worker property, since $v^{sym,eff,null}_k(f(x)) = 0$ for all null workers $k \in \mathcal{N}$ in $f$ at effort $x$. We build $\vartheta(g(x)) = \phi^g(x)$ for $g \in F \setminus \mathcal{F}$ and $\vartheta(f(x)) = \theta^f(x)$ for $f \in \mathcal{F}$; this is symmetric and efficient and has the null-worker property for each $f \in F$.

Proof of Theorem 7

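(i) Under $\|e^{eff}\|_{\mathcal{F}} = 0$ for each $f \in \mathcal{F}$, we have $v^{sym}(f(x)) = \phi^f(x)$. Thus, we have

$$\frac{1}{|[i]_f|} \sum_{j \in [i]_f} \theta_j(f(x)) = \phi_i(f(x))$$

for all $i \in N$ and all $f \in \mathcal{F}$. We build $\vartheta(g(x)) = \phi(g(x))$ for $g \in F \setminus \mathcal{F}$ and $\vartheta(f(x)) = \theta^f(x)$ for $f \in \mathcal{F}$. This $\vartheta$ satisfies the average marginality by construction, because $\frac{1}{|[i]_f|} \sum_{j \in [i]_f} \vartheta_j(f(x)) = \phi_i(f(x))$ for each $f \in F$.

(ii) Under $\|e^{eff}\|_{\mathcal{F}} = 0$ for each $f \in \mathcal{F}$, we have $v^{sym}(f(x)) = v^{sym,eff,null}(f(x))$; thus for each null worker $k \in \mathcal{N}_f$ in $f$ under $x$, we have $v^{sym}_k(f(x)) = 0$. It follows that

$$\frac{1}{|\mathcal{N}_f|} \sum_{j \in \mathcal{N}_f} \theta_j(f(x)) = 0.$$

We build $\vartheta(g(x)) = \phi(g(x))$ for $g \in F \setminus \mathcal{F}$ and $\vartheta(f(x)) = \theta^f(x)$ for $f \in \mathcal{F}$. This satisfies the average null-worker property by construction, because $\frac{1}{|\mathcal{N}_f|} \sum_{j \in \mathcal{N}_f} \vartheta_j(f(x)) = 0$ for each $f \in F$.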

Proof of Theorem 8

(i) For the uniqueness of the decomposition, it suffices to notice that

$$\gamma^* \in \arg\min_{\gamma \in \Delta(R(N))} \Big\{ \sum_{i \in N} \Big\| \theta_i(f(x)) - \sum_{r \in R(N)} \gamma(r)\, mc(i, f, r(x), x) \Big\|^2 \Big\}$$

is unique. This follows from the convexity of $\Delta(R(N))$ and from the fact that we are using a Euclidean norm. The rest follows trivially.

(ii) If θ is marginal and efficient, then by Theorem 2 in Khmelnitskaya (1999) we know that θ is a random value, that is, there exists a $\gamma' \in \Delta(R(N))$ such that $\theta^f_i(x) = \sum_{r \in R(N)} \gamma'(r)\, mc(i, f, r(x), x)$, and thus $\|r^{mrg}\| = 0$. The statement follows from the contrapositive: if $\|r^{mrg}\| > 0$, then θ must fail the marginality property.

(iii) Assume, toward a contradiction, that $\|r^{sym}\| > 0$ and that θ satisfies the equal-treatment property. Then, under the efficiency and marginality properties of θ, Theorem 1 gives θ = ϕ. By (ii) and the assumption that θ satisfies the marginality property, we have $v^{mrg} = \theta$. Hence, $\|r^{sym}\| = \|\theta - \phi\| > 0$, which implies θ ≠ ϕ, a contradiction. Therefore, θ must fail the equal-treatment property.


Proof of Theorem 9

In what follows, we omit $f$ and $x$ from the notation as they are fixed. By Khmelnitskaya (1999), and based on the fact that we are looking for an additive decomposition, we know that the decomposition of $\rho^{sh}$ corresponds to an average marginal decomposition:

$$\rho^{sym} = \Big[\alpha\big(\|r^{sym}\|^2 + 2\langle r^{mrg}, r^{sym}\rangle\big) + (1 - \alpha)\|e^{sym}\|^2\Big] / \|\theta\|^2$$

and

$$\rho^{mrg} = \Big[\alpha\|r^{mrg}\|^2 + (1 - \alpha)\|e^{mrg}\|^2\Big] / \|\theta\|^2$$

for $\alpha \in [0, 1]$, because we assign weight $\alpha$ to the ordering $mrg \succ sym$ and weight $(1 - \alpha)$ to the ordering $sym \succ mrg$. Notice that $\rho^{sym}$ and $\rho^{mrg}$ are then the average marginal contributions under the two orders of the axioms sym and mrg to the goodness-of-fit index $\rho^{sh}$.
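The averaging over the two orderings of the axioms can be sketched as follows. The dictionary `rho`, its numerical values, and the function name are hypothetical; the sketch assumes that each axiom's contribution under an ordering is the increase in the index when that axiom is added, which is one reading of the description above.

```python
def axiom_shares(rho, alpha):
    """Average each axiom's marginal contribution over the two orderings.

    rho maps a frozenset of imposed axioms to a goodness-of-fit value;
    alpha is the weight on the ordering in which marginality comes first.
    """
    none, sym, mrg = frozenset(), frozenset({"sym"}), frozenset({"mrg"})
    both = frozenset({"sym", "mrg"})
    rho_sym = alpha * (rho[both] - rho[mrg]) + (1 - alpha) * (rho[sym] - rho[none])
    rho_mrg = alpha * (rho[mrg] - rho[none]) + (1 - alpha) * (rho[both] - rho[sym])
    return rho_sym, rho_mrg

# Hypothetical index values, chosen only for illustration.
rho = {
    frozenset(): 0.0,
    frozenset({"sym"}): 0.10,
    frozenset({"mrg"}): 0.25,
    frozenset({"sym", "mrg"}): 0.40,
}
rho_sym, rho_mrg = axiom_shares(rho, alpha=0.5)
```

Whatever the value of `alpha`, the two shares add up to the total index `rho[both] - rho[none]`, which is what makes the decomposition additive.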

For the moreover part, we have the following:

(i) If $\rho^{sh} > 0$, then either the symmetry or the marginality property fails. This follows from the fact that $\rho^{sh} = 0$ if θ = ϕ (it is numerically equal to the Shapley value or θ satisfies the marginality and symmetry properties) at $f$ given the vector of effort levels $x$. By the contrapositive, if $\rho^{sh} > 0$, then θ fails at least one of the two axioms.

(ii) To prove that, if $\rho^{sym} > \max\{\alpha(\rho^{sh}(sym, mrg) - \rho^{sh}(sym)), (1 - \alpha)(\rho^{sh}(sym) - \rho^{sh}(\emptyset))\}$, then sym fails, we use Theorem 3, because $\rho^{sh}(sym) - \rho^{sh}(\emptyset) = \|e^{sym}\|^2 / \|\theta\|^2$ and because by Theorem 3 we know that if $\rho^{sh}(sym) - \rho^{sh}(\emptyset) > 0$ then θ fails sym. Observe that if $\rho^{sym} > \max\{\alpha(\rho^{sh}(sym, mrg) - \rho^{sh}(sym)), (1 - \alpha)(\rho^{sh}(sym) - \rho^{sh}(\emptyset))\}$, then we conclude that $\rho^{sh}(sym) - \rho^{sh}(\emptyset) > 0$, thus establishing the result.$^{8}$

(iii) To prove that, if $\rho^{mrg} > \max\{\alpha(\rho^{sh}(mrg) - \rho^{sh}(\emptyset)), (1 - \alpha)(\rho^{sh}(mrg, sym) - \rho^{sh}(mrg))\}$, then mrg fails, we use Theorem 8, because $\rho^{sh}(mrg) - \rho^{sh}(\emptyset) = \|r^{mrg}\|^2 / \|\theta\|^2$ and because by Theorem 8 we know that if $\rho^{sh}(mrg) - \rho^{sh}(\emptyset) > 0$, then θ must fail the mrg property. Observe that if $\rho^{mrg} > \max\{\alpha(\rho^{sh}(mrg) - \rho^{sh}(\emptyset)), (1 - \alpha)(\rho^{sh}(mrg, sym) - \rho^{sh}(mrg))\}$, then we conclude that $\rho^{sh}(mrg) - \rho^{sh}(\emptyset) > 0$, thus establishing the result.

$^{8}$ For the extremes of $\alpha \in \{0, 1\}$ the statement is vacuously satisfied, since it will never be the case that $\rho^{sym} > \max\{\alpha(\rho^{sh}(sym, mrg) - \rho^{sh}(sym)), (1 - \alpha)(\rho^{sh}(sym) - \rho^{sh}(\emptyset))\}$.
