Matching via Majorization for Consistency of Product
Quality∗
Lirong Cui† Dejing Kong† Haijun Li‡
Abstract
A new matching method is introduced in this paper to match attributes of parts
in order to ensure consistent quality of products that are assembled from matched
parts. The method yields invariant optimal matching that depends only on ranking of
attributes of assembling parts, where the optimal matching criteria consist of a large
class of metrics, including the Kantorovich cost function for matched pairs. Our method
is non-parametric and based on the theory of majorization that reinforces a general
rank-invariant matching strategy that “small matches with small and large matches
with large” for variance reduction. Using this majorization-based matching method,
several specific multi-part matching problems are investigated in order to illustrate its
wide applicability in quality assurance.
Key words and phrases: Invariant optimal solutions; matching pattern; majorization; Schur-convexity; Kantorovich cost function.
1 Introduction
The common theme of discrete matching problems is to match elements of one finite set
S1 to elements of another finite set S2 with some cost associated to matched elements and
the goal is to find best matching strategies for minimizing the total associated cost of all
possible matched elements. Matching is usually one-to-one, but can also be a subset-to-
subset matching.
∗ This work is in honor of Professor Alan G. Hawkes on his 75th birthday.
† [email protected], [email protected], School of Management and Economics, Beijing Institute of Technology, Beijing, 100081, China. Supported by the NSF of China under grant 71371031.
‡ [email protected], Department of Mathematics, Washington State University, Pullman, WA 99164, U.S.A. Supported by NSF grant DMS 1007556.
The early work in matching appeared in the 1940s (see, e.g., [4]), and the research has
accelerated in observational studies since the 1970s after the paper by Cochran and Rubin
[2] and the paper by Rubin [20]. Optimal matching has now been widely studied in various
contexts; for example, in design of observational studies [18, 19, 21], economics [6, 23],
sociology [8, 15], epidemiology and medicine [13, 1], and in political science [7], to mention
just a few. Optimal matching has also been studied as a design problem in optimizing network
flows (see, e.g., [5] and the references therein).
The continuous matching problem can be formulated as the Monge-Kantorovich mass
transportation problem [16, 17], where S1 and S2 are usually multi-dimensional Euclidean
spaces (or general Hilbert spaces). Given two probability distributions f1 and f2 on S1 and
S2, respectively, the goal is to find an optimal matching (or transport) map from S1 to S2
so that a certain cost functional of f1 and f2 is minimized. Such an optimal transport exists
under mild model assumptions, and it can be found explicitly in special cases such
as S1 = S2 = R.
In this paper we develop a discrete matching method based on the theory of majorization
[14] and apply it to quality assurance. The simplest setup of our problem is described as
follows. Let S1 = {x1, . . . , xn} and S2 = {y1, . . . , yn} denote two finite sets with the same
size. If xi < yj for all 1 ≤ i, j ≤ n, then find a permutation π(·) of {1, . . . , n} so as to
minimize the total variance
\[
d_\pi(S_1, S_2) := \frac{1}{n} \sum_{i=1}^{n} \Big[ (y_{\pi(i)} - x_i) - \frac{1}{n} \sum_{k=1}^{n} (y_{\pi(k)} - x_k) \Big]^2, \tag{1.1}
\]
subject to $\sum_{i=1}^{n} (y_i - x_i) = c$ being a constant. The problem was motivated by optimal
assembling of parts. For example, an axle’s diameter is an important attribute, and its
sleeve has a larger hole diameter. In an assembling process, the two parts are fitted
together, one inside the other; see Figure 1.1. Due to manufacturing variability,
the measurements, x1, . . . , xn, of inside parts are represented by a random variable X, and
the measurements, y1, . . . , yn, of outside parts are represented by a random variable Y . We
assume that P (X ≤ Y ) = 1 because of conforming regularity. In practice, parts are matched
randomly in an assembling process, which cannot make the dimension gaps of paired parts
consistently minimal; that is, the variance of these gaps is rather large. The dimension
gap between two paired parts is crucial to ensure the quality of products since the larger
dimension gaps could result in shorter lifetimes for products due to various factors such as
friction and vibration. An important issue is, as formulated in (1.1), to match the inside and
outside parts properly to make the product quality highly consistent in a batch of products.
Figure 1.1: The values of the diameters Φ1 and Φ2 for inside (axle) and outside (sleeve) parts
are x1, . . . , xn and y1, . . . , yn, respectively.
Quality management covers many topics, such as sampling plans (see, e.g., [11, 12]) and
quality control (see, e.g., [9]). Delivering consistency of product quality has always been
an important issue in quality management, but it has not been studied in depth, especially in
quantitative analysis. Consistent product quality not only makes efficient use of materials but
also facilitates effective management of production system operation. In today’s digital era,
detailed data related to parts or sub-systems can be shared by many different departments
such as design and planning, manufacturing, quality assurance, etc. The shared product data
in manufacturing provide valuable information on quality management of assembling parts.
Based on part manufacturing information, the optimal matching strategies for assembling
parts are called for to safeguard consistency of assembled product quality. In this paper,
we shall present and make more precise a heuristic invariant optimal matching rule: “small
matches with small and large matches with large” to ensure the consistency of product or
system quality.
Specifically, we show in this paper how the one-to-one matching problem (1.1) can be
solved explicitly via majorization (see Section 2 for the definition). We also show that this
majorization-based matching method can be used to solve general subset-to-subset optimal
matching problems with multiple objective functions, which often arise in reliability
modeling and quality management. In terms of discrete matching in the design of
observational studies, our optimal matching problem (1.1) can be thought of as an optimal
matching with a single covariate for each treatment and control individual. In contrast to
optimal matching in observational studies, the optimal matching criteria used in our match-
ing problems consist of a large class of metrics, including the Mahalanobis distance (1.1)
and Kantorovich cost function for matched pairs. Our majorization-based optimal matching
can be carried out by successive applications of a finite number of local matchings, each of
which reduces the variance, and so the method can be applied to matching for large batches
of products that may have various matching constraints. The theory of majorization is elegant
and mathematically powerful [14, 3], and our majorization-based method can be extended
to continuous optimal matching problems that arise in, e.g., shape recognition or
data compression. To the best of our knowledge, this is the first paper that applies optimal
matching to quality assurance.
The rest of the paper is organized as follows. Section 2 presents main comparison results
for the majorization-based matching. Section 3 discusses optimal matching solutions for
several multi-part product assemblies. The remarks in Section 4 conclude the paper.
2 Optimal Matching via Majorization
Various notions of majorization describe the fundamental phenomenon of one vector being
more spread out than another. For any real vector x = (x1, . . . , xn) ∈ R^n, x(k) denotes the
kth smallest among components x1, . . . , xn, and x[k] denotes the kth largest among
components x1, . . . , xn, 1 ≤ k ≤ n.
Definition 2.1. Let x = (x1, . . . , xn), y = (y1, . . . , yn) ∈ R^n.
1. x is said to be majorized by y, denoted as x ≺ y, if
\[
\sum_{i=1}^{k} x_{(i)} \ge \sum_{i=1}^{k} y_{(i)}, \quad k = 1, \dots, n-1, \quad \text{and} \quad \sum_{i=1}^{n} x_{(i)} = \sum_{i=1}^{n} y_{(i)}.
\]
2. x is said to be weakly majorized by y, denoted as x ≺w y, if
\[
\sum_{i=1}^{k} x_{[i]} \le \sum_{i=1}^{k} y_{[i]}, \quad k = 1, \dots, n.
\]
For example, (1, 2, 3) ≺ (2, 0, 4) and (1, 2, 3) ≺w (2, 1, 4). That is, if x ≺ y, then y is more
spread out than x. Obviously, x ≺w y if and only if x ≺ z and z ≤ y component-wise
for some z ∈ R^n. For any non-negative vector x = (x1, . . . , xn) ∈ R^n,
\[
(\bar{x}, \bar{x}, \dots, \bar{x}) \prec (x_1, x_2, \dots, x_n) \prec \Big( \sum_{i=1}^{n} x_i, 0, \dots, 0 \Big), \tag{2.1}
\]
where $\bar{x} = \sum_{i=1}^{n} x_i / n$ denotes the average.
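The partial-sum conditions of Definition 2.1 can be checked mechanically. The following is a minimal Python sketch, not part of the original paper; the function names `is_majorized` and `is_weakly_majorized` and the tolerance `tol` are illustrative assumptions.

```python
from typing import Sequence


def is_majorized(x: Sequence[float], y: Sequence[float], tol: float = 1e-12) -> bool:
    """Check x ≺ y: partial sums of the k smallest entries of x dominate
    those of y for k = 1, ..., n-1, and the total sums agree."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    xs, ys = sorted(x), sorted(y)  # increasing order: x_(1) <= ... <= x_(n)
    px = py = 0.0
    for k in range(len(xs) - 1):
        px += xs[k]
        py += ys[k]
        if px < py - tol:
            return False
    return abs(sum(xs) - sum(ys)) <= tol


def is_weakly_majorized(x: Sequence[float], y: Sequence[float], tol: float = 1e-12) -> bool:
    """Check x ≺w y: partial sums of the k largest entries of x never
    exceed those of y, for k = 1, ..., n."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    xs, ys = sorted(x, reverse=True), sorted(y, reverse=True)
    px = py = 0.0
    for a, b in zip(xs, ys):
        px += a
        py += b
        if px > py + tol:
            return False
    return True


# The two examples from the text, and the chain (2.1) for x = (1, 2, 3).
print(is_majorized([1, 2, 3], [2, 0, 4]))
print(is_weakly_majorized([1, 2, 3], [2, 1, 4]))
print(is_majorized([2, 2, 2], [1, 2, 3]) and is_majorized([1, 2, 3], [6, 0, 0]))
```

Sorting once and accumulating partial sums keeps the check at O(n log n), which is sufficient for batch sizes of the kind considered in this paper.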
Definition 2.2. A real-valued function φ defined on a set D ⊆ R^n is said to be Schur-convex
on D if, for x, y ∈ D, x ≺ y =⇒ φ(x) ≤ φ(y).
If a symmetric function φ defined on an open subset D is differentiable, then φ is Schur-
convex if and only if the partial derivative ∂φ(x)/∂xk satisfies Schur’s condition:
\[
(x_i - x_j) \Big( \frac{\partial \phi(x)}{\partial x_i} - \frac{\partial \phi(x)}{\partial x_j} \Big) \ge 0, \quad \text{for all } i \ne j. \tag{2.2}
\]
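For example, the function $\phi(x) = \sum_{i=1}^{n} x_i^p$ is Schur-convex on non-negative vectors for any p ≥ 2. This, together
with (2.1), implies that for any non-negative vector (x1, . . . , xn),
\[
\Big( \frac{1}{n} \sum_{i=1}^{n} x_i \Big)^p \le \frac{1}{n} \sum_{i=1}^{n} x_i^p, \quad \text{for any } p \ge 2.
\]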
As illustrated in this example, the notion of majorization provides a powerful method for
deriving various inequalities. A most comprehensive treatment on majorization and Schur
convexity is the monograph [14] by Marshall, Olkin and Arnold. A theory of majorization
on partially ordered sets was developed in [22], and stochastic versions of majorization and
Schur-convexity have also been introduced in the literature; see, e.g., [14, 10]. The following
properties of majorization and Schur-convexity can be found in [14].
Theorem 2.3. Let x = (x1, . . . , xn),y = (y1, . . . , yn) ∈ Rn.
1. x ≺w y ⇐⇒ φ(x) ≤ φ(y) for all Schur-convex and non-decreasing functions φ.
2. x ≺ y if and only if there exist a finite number (say m ≥ 1) of vectors $x^{(i)}$, i = 1, . . . , m,
such that
\[
x = x^{(1)} \prec x^{(2)} \prec \cdots \prec x^{(m)} = y,
\]
where $x^{(i)}$ and $x^{(i+1)}$ differ in two coordinates only, and all $x^{(i)}$'s are of the following
form
\[
(z_1, \dots, z_{k-1}, z_k + \Delta, z_{k+1}, \dots, z_{l-1}, z_l - \Delta, z_{l+1}, \dots, z_n), \quad 1 \le k < l \le n, \ \Delta \in \mathbb{R}. \tag{2.3}
\]
3. A symmetric function φ(·) defined on an open set D is Schur-convex if and only if for
any z1 ≥ z2 ≥ · · · ≥ zn and ∆ ≥ 0,
\[
\phi(z_1, \dots, z_{k-1}, z_k + \Delta, z_{k+1}, \dots, z_{l-1}, z_l - \Delta, z_{l+1}, \dots, z_n) \ \text{is non-decreasing in } \Delta, \tag{2.4}
\]
where 1 ≤ k < l ≤ n.
4. For any convex function g(·) and any increasing Schur-convex function φ(·),
\[
\psi(x) = \phi(g(x_1), \dots, g(x_n))
\]
is also Schur-convex. In particular, the functions $\sum_{i=1}^{n} g(x_i)$ and max{g(x1), . . . , g(xn)}
are Schur-convex for any convex function g(·).
The transform (2.3) is known as the Robin Hood transform that allocates positive weight
from a larger component to a smaller component so as to make the vector less spread out.
Such a transform is powerful and essentially reduces derivation of any majorization-based
inequality to a two-dimensional problem. The Robin Hood transform on partially ordered sets
has also been developed [22].
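As an illustration of a single Robin Hood step, the sketch below moves a non-negative amount from the larger of two components to the smaller one, without overshooting the gap between them. It is a hedged example rather than the authors' code; the names `robin_hood` and `sum_of_squares` are hypothetical, and indices are 0-based, unlike the 1-based indices in (2.3).

```python
from typing import List, Sequence


def robin_hood(z: Sequence[float], k: int, l: int, delta: float) -> List[float]:
    """Move `delta` from the larger of z[k], z[l] to the smaller one.

    Requiring 0 <= delta <= gap keeps both new values between the two old
    ones, so the result is majorized by the input vector.
    """
    rich, poor = (k, l) if z[k] >= z[l] else (l, k)
    gap = z[rich] - z[poor]
    if not 0 <= delta <= gap:
        raise ValueError("delta must lie between 0 and the gap of the two components")
    out = list(z)
    out[rich] -= delta
    out[poor] += delta
    return out


def sum_of_squares(z: Sequence[float]) -> float:
    """A Schur-convex function, used here only to observe the effect of the step."""
    return sum(v * v for v in z)


before = [5.0, 1.0, 3.0]
after = robin_hood(before, 0, 1, 1.0)  # take 1 from the 5 and give it to the 1
print(after, sum_of_squares(before), sum_of_squares(after))
```

The total is preserved by construction, and any Schur-convex function, such as the sum of squares, cannot increase under such a step.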
Theorem 2.4. If x = (x1, . . . , xn),y = (y1, . . . , yn) ∈ Rn, such that x(n) ≤ y(1), then
φ(g(y(1) − x(1)), . . . , g(y(n) − x(n))) ≤ φ(g(y1 − x1), . . . , g(yn − xn))
≤ φ(g(y(1) − x(n)), . . . , g(y(n) − x(1)))
for any convex function g(·) and any increasing Schur-convex function φ(·).
Proof: Since φ(·) is symmetric, we assume without loss of generality that the $y_i$'s are already
arranged in the increasing order; that is, y1 ≤ y2 ≤ · · · ≤ yn. In light of Theorem 2.3 (4), we
need to show that
\[
(y_1 - x_{(1)}, \dots, y_n - x_{(n)}) \prec (y_1 - x_1, \dots, y_n - x_n) \prec (y_1 - x_{(n)}, \dots, y_n - x_{(1)}). \tag{2.5}
\]
1. To prove the first inequality in (2.5), we write, without loss of generality, the components
of $(y_1 - x_1, \dots, y_n - x_n)$ in terms of the order statistics $x_{(i)}$'s:
\[
(y_1 - x_1, \dots, y_n - x_n) = (y_1 - x_{(k)}, \dots, y_l - x_{(1)}, \dots), \quad k, l \ge 1.
\]
Obviously,
\[
y_1 - x_{(k)} \le \min\{y_1 - x_{(1)}, y_l - x_{(k)}\} \le \max\{y_1 - x_{(1)}, y_l - x_{(k)}\} \le y_l - x_{(1)},
\]
which implies that
\[
(y_1 - x_{(1)}, y_l - x_{(k)}) \prec (y_1 - x_{(k)}, y_l - x_{(1)}).
\]
Keeping all other components i ≠ 1, i ≠ l the same, we perform the first Robin Hood
transform on components 1 and l:
\[
x^{(1)} := (y_1 - x_{(1)}, \dots, y_l - x_{(k)}, \dots) \prec (y_1 - x_{(k)}, \dots, y_l - x_{(1)}, \dots) = (y_1 - x_1, \dots, y_n - x_n).
\]
Note that the first component of $x^{(1)}$ is the same as the first component of
$(y_1 - x_{(1)}, \dots, y_n - x_{(n)})$. Starting from the second component of $x^{(1)}$, repeat similar Robin
Hood transforms on the remaining components, leading to a sequence of Robin Hood
transforms:
\[
(y_1 - x_{(1)}, \dots, y_n - x_{(n)}) = x^{(n-1)} \prec \cdots \prec x^{(2)} \prec x^{(1)} \prec (y_1 - x_1, \dots, y_n - x_n),
\]
and the first inequality of (2.5) follows.
2. To prove the second inequality of (2.5), we use the reverse Robin Hood transforms (to
make the vectors more spread out). Write
\[
y^{(0)} := (y_1 - x_1, \dots, y_n - x_n) = (y_1 - x_{(k)}, \dots, y_l - x_{(n)}, \dots), \quad k, l \le n.
\]
Obviously,
\[
y_1 - x_{(n)} \le \min\{y_1 - x_{(k)}, y_l - x_{(n)}\} \le \max\{y_1 - x_{(k)}, y_l - x_{(n)}\} \le y_l - x_{(k)},
\]
which implies that
\[
(y_1 - x_{(k)}, y_l - x_{(n)}) \prec (y_1 - x_{(n)}, y_l - x_{(k)}).
\]
Keeping all other components i ≠ 1, i ≠ l the same, we perform the first reverse Robin
Hood transform on components 1 and l:
\[
(y_1 - x_1, \dots, y_n - x_n) = (y_1 - x_{(k)}, \dots, y_l - x_{(n)}, \dots) \prec (y_1 - x_{(n)}, \dots, y_l - x_{(k)}, \dots) =: y^{(1)}.
\]
Note that the first component of $y^{(1)}$ is the same as the first component of
$(y_1 - x_{(n)}, \dots, y_n - x_{(1)})$. Starting from the second component of $y^{(1)}$, repeat similar reverse
Robin Hood transforms on the remaining components, to get
\[
y^{(2)} := (y_1 - x_{(n)}, y_2 - x_{(n-1)}, \dots),
\]
and repeat again and again on the remaining components, which leads to a sequence of reverse
Robin Hood transforms:
\[
(y_1 - x_1, \dots, y_n - x_n) \prec y^{(1)} \prec y^{(2)} \prec \cdots \prec y^{(n-1)} = (y_1 - x_{(n)}, \dots, y_n - x_{(1)}),
\]
and the second inequality of (2.5) follows. □
Remark 2.5. Theorem 2.4 presents a general result for optimal matching and, more
importantly, the local Robin Hood transforms used in the proof are very powerful. For
example, if there are some constraints on matching vectors x and y, then inequalities
can be established by performing Robin Hood transforms within the region defined
by the constraints. In matching problems, $\sum_{i=1}^{n} (y_i - x_i)$ is fixed, leading naturally
to majorization. If the sum is not fixed (e.g., the $x_i$'s and $y_i$'s may be drawn from larger
batches), then weak majorization can be used.
Corollary 2.6. Let x = (x1, . . . , xn), y = (y1, . . . , yn) ∈ R^n, such that x(n) ≤ y(1). Define
the Kantorovich cost function:
\[
K_g(x, y) := \min \Big\{ \frac{1}{n} \sum_{i=1}^{n} g(y_{\pi(i)} - x_i) : \text{all permutations } \pi(\cdot) \text{ of } \{1, \dots, n\} \Big\} \tag{2.6}
\]
where g(·) is a convex function. Then it follows from Theorem 2.4 that
\[
K_g(x, y) = \frac{1}{n} \sum_{i=1}^{n} g(y_{(i)} - x_{(i)}).
\]
The Kantorovich cost function (2.6) is a discrete version of Kantorovich’s function used
in the Monge-Kantorovich mass transportation problem [16, 17].
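To make Corollary 2.6 concrete, the following Python sketch computes the Kantorovich cost in two ways: by sorting, as the corollary states, and by exhaustive search over all permutations as in definition (2.6). It is a sketch under stated assumptions: the function names and the sample data are hypothetical, and the brute-force search is only feasible for very small n.

```python
import math
from itertools import permutations
from typing import Callable, Sequence


def kantorovich_sorted(x: Sequence[float], y: Sequence[float],
                       g: Callable[[float], float]) -> float:
    """Closed form of Corollary 2.6: pair the i-th smallest x with the i-th smallest y."""
    xs, ys = sorted(x), sorted(y)
    return sum(g(b - a) for a, b in zip(xs, ys)) / len(xs)


def kantorovich_brute_force(x: Sequence[float], y: Sequence[float],
                            g: Callable[[float], float]) -> float:
    """Definition (2.6): minimum average cost over all n! matchings."""
    n = len(x)
    return min(
        sum(g(yp[i] - x[i]) for i in range(n)) / n
        for yp in permutations(y)
    )


# Hypothetical data with every y exceeding every x, as the corollary requires.
x = [1.0, 3.0, 2.0, 2.5]
y = [5.0, 4.0, 6.5, 4.5]

for g in (lambda t: t ** 2, math.exp):  # two convex cost functions
    print(kantorovich_sorted(x, y, g), kantorovich_brute_force(x, y, g))
```

The sorted version costs O(n log n), whereas the exhaustive search grows as n!, which is why the rank-based closed form matters for large batches.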
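Example 2.7. Let x = (x1, . . . , xn), y = (y1, . . . , yn) ∈ R^n, such that x(n) ≤ y(1). Define the
variance of the difference vector (y1 − x1, . . . , yn − xn) as
\[
\mathrm{Var}(y - x) = \frac{1}{n} \sum_{i=1}^{n} \big[ (y_i - x_i) - \overline{(y - x)} \big]^2, \quad \text{where } \overline{y - x} = \frac{1}{n} \sum_{i=1}^{n} (y_i - x_i).
\]
The average gap $\overline{y - x}$ is the same for every matching, so Corollary 2.6 applies with the
convex function $g(t) = (t - \overline{y - x})^2$. It follows that Var(y − x) achieves the minimum
\[
\frac{1}{n} \sum_{i=1}^{n} \big[ (y_{(i)} - x_{(i)}) - \overline{(y - x)} \big]^2
\]
when the ith smallest x(i) matches the ith smallest y(i); that is, optimal matching occurs
when small matches small and large matches large. □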
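The sandwich inequality of Theorem 2.4, specialized to the variance, can be observed numerically. The sketch below is illustrative only: the helper names `gap_variance` and `matching_bounds` and the sample diameters are assumptions, and `statistics.pvariance` is used because it divides by n, as in the definition above.

```python
from statistics import pvariance
from typing import Sequence, Tuple


def gap_variance(x: Sequence[float], y: Sequence[float]) -> float:
    """Population variance (divisor n) of the gaps y_i - x_i for the given pairing."""
    return pvariance([b - a for a, b in zip(x, y)])


def matching_bounds(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float, float]:
    """Return (best, as given, worst) gap variances.

    best  : i-th smallest x paired with i-th smallest y
    worst : i-th largest x paired with i-th smallest y
    """
    xs, ys = sorted(x), sorted(y)
    best = gap_variance(xs, ys)
    worst = gap_variance(xs[::-1], ys)
    return best, gap_variance(x, y), worst


# Hypothetical inside (x) and outside (y) diameters with max(x) <= min(y).
x = [1.0, 3.0, 2.0, 2.5]
y = [5.0, 4.0, 6.5, 4.5]
best, given, worst = matching_bounds(x, y)
print(best, given, worst)
```

Any pairing of the same parts falls between the two extremes, so the gap between `best` and `worst` indicates how much consistency a random assembly can lose.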
For subset-to-subset matching, the key issue is the specification of optimal matching
criteria, and there is often more than one objective function. For example, optimal matching
can be achieved by minimizing some cost function of all matched subsets as well as variances
within subsets. We illustrate our majorization-based matching method in the following
one-to-subset matching problem (see Figure 2.1).
Figure 2.1: The values of the diameters Φ1 and Φj, 2 ≤ j ≤ m + 1, for inside (axle) and
outside (sleeve) parts are xi and yim−m+1, . . . , yim, respectively.
Consider x = (x1, . . . , xn) ∈ R^n, y = (y1, . . . , ynm) ∈ R^{nm}, such that x(n) ≤ y(1). The goal
is to match xi to a subset of m components of y so as to minimize the following objective
functions: find a permutation π(·) on {1, . . . , nm} such that
\[
\min \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} g(y_{\pi(im-m+j)} - x_i), \quad \text{where } g(\cdot) \text{ is convex, and} \tag{2.7}
\]
\[
\min \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} [y_{\pi(im-m+j)} - \bar{y}_i]^2 = \frac{1}{n} \sum_{i=1}^{n} \Big\{ \frac{1}{m} \sum_{j=1}^{m} [y_{\pi(im-m+j)} - \bar{y}_i]^2 \Big\} \tag{2.8}
\]
where $\bar{y}_i = \frac{1}{m} \sum_{j=1}^{m} y_{\pi(im-m+j)}$, 1 ≤ i ≤ n. Since xi is fixed for the subset
{im − m + 1, . . . , im}, the second objective function can be written as
\[
\frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} [y_{\pi(im-m+j)} - \bar{y}_i]^2 = \frac{1}{n} \sum_{i=1}^{n} \Big\{ \frac{1}{m} \sum_{j=1}^{m} \big[ (y_{\pi(im-m+j)} - x_i) - (\bar{y}_i - x_i) \big]^2 \Big\}.
\]
That is, the second objective function describes the average of the variances of matching
differences within subsets.
Proposition 2.8. Assume without loss of generality that the components of x are arranged
in the increasing order: x1 ≤ x2 ≤ · · · ≤ xn. The optimal matching (2.7) and (2.8) can be
achieved via the permutation
\[
y_{\pi(im-m+j)} = y_{(im-m+j)}, \quad i = 1, \dots, n, \ j = 1, \dots, m. \tag{2.9}
\]
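Proof: We enlarge x as follows,
\[
x^* = (\underbrace{x_1, \dots, x_1}_{m}, \underbrace{x_2, \dots, x_2}_{m}, \dots, \underbrace{x_n, \dots, x_n}_{m}).
\]
It then follows from Corollary 2.6 that the permutation (2.9) minimizes (2.7). To show the
permutation (2.9) also minimizes (2.8), consider
\[
\sum_{i=1}^{n} \Big\{ \frac{1}{m} \sum_{j=1}^{m} [y_{\pi(im-m+j)} - \bar{y}_i]^2 \Big\}
= \sum_{i=1}^{n} \Big\{ \frac{1}{m} \sum_{j=1}^{m} y_{\pi(im-m+j)}^2 - \bar{y}_i^2 \Big\}
= \frac{1}{m} \sum_{i=1}^{n} \sum_{j=1}^{m} y_{\pi(im-m+j)}^2 - \frac{1}{m^2} \sum_{i=1}^{n} \Big[ \sum_{j=1}^{m} y_{\pi(im-m+j)} \Big]^2.
\]
Since $\sum_{i=1}^{n} \sum_{j=1}^{m} y_{\pi(im-m+j)}^2$ is invariant under any permutation π(·), minimizing (2.8) boils
down to maximizing $\sum_{i=1}^{n} \big[ \sum_{j=1}^{m} y_{\pi(im-m+j)} \big]^2$.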
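Since $\sum_{i=1}^{n} z_i^2$ is Schur-convex in (z1, . . . , zn), we have, for any permutation π(·),
\[
\Big( \sum_{j=1}^{m} y_{\pi(j)}, \dots, \sum_{j=1}^{m} y_{\pi(nm-m+j)} \Big) \prec \Big( \sum_{j=1}^{m} y_{(j)}, \dots, \sum_{j=1}^{m} y_{(nm-m+j)} \Big)
\]
\[
\Longrightarrow \ \sum_{i=1}^{n} \Big[ \sum_{j=1}^{m} y_{\pi(im-m+j)} \Big]^2 \le \sum_{i=1}^{n} \Big[ \sum_{j=1}^{m} y_{(im-m+j)} \Big]^2,
\]
which implies that (2.8) is minimized via the permutation (2.9). □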
It is worth mentioning that the permutation (2.9) is just one of the n! permutations that
minimize (2.8), but it is the only one that also minimizes (2.7). Note that the objective
function (2.8) can be made more general, but using variance as an optimality criterion is a
common practice in quality assurance.
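The block assignment (2.9) can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the names `one_to_subset_matching` and `within_subset_variance` are hypothetical, indices are 0-based, and the sample data are made up.

```python
from typing import Dict, List, Sequence


def one_to_subset_matching(x: Sequence[float], y: Sequence[float], m: int) -> Dict[int, List[int]]:
    """Assign to the part with the r-th smallest x the r-th block of m
    consecutive order statistics of y, as in (2.9).

    Returns a dict mapping each index of x to the list of matched indices of y.
    """
    n = len(x)
    if len(y) != n * m:
        raise ValueError("y must contain exactly n * m values")
    x_order = sorted(range(n), key=lambda i: x[i])
    y_order = sorted(range(n * m), key=lambda j: y[j])
    return {xi: y_order[r * m:(r + 1) * m] for r, xi in enumerate(x_order)}


def within_subset_variance(y: Sequence[float], groups: Dict[int, List[int]]) -> float:
    """Objective (2.8): average over subsets of the within-subset population variance."""
    total = 0.0
    for idx in groups.values():
        vals = [y[j] for j in idx]
        mean = sum(vals) / len(vals)
        total += sum((v - mean) ** 2 for v in vals) / len(vals)
    return total / len(groups)


# Hypothetical data: two axles, each needing two sleeves.
x = [2.0, 1.0]
y = [5.0, 9.0, 6.0, 8.0]
groups = one_to_subset_matching(x, y, m=2)
print(groups, within_subset_variance(y, groups))
```

Returning indices rather than values keeps the link to the physical parts, which is what an assembly line would need in order to pick the matched sleeves.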
3 Numerical Examples and Invariant Optimal Matching Strategies
As mentioned in Sections 1 and 2, invariant optimal matching strategies explore ranking of
matching elements and their solutions do not depend on specific values of these elements.
Using the method we developed in Section 2, the optimal solutions can be constructed
explicitly using majorization.
It is worth mentioning that in manufacturing practice, there can be many types of optimal
matching; for example, matching with two types of parts can be one-to-one, one-to-subset,
subset-to-subset, and matching with more than two types of parts can be one-to-one-to-one,
one-to-one-to-subset, subset-to-subset-to-subset, etc. All these matching problems require
various optimal criteria to ensure consistency, precision and compatibility of materials. We
begin with two illustrative numerical examples on one-to-one and one-to-subset matching
before discussing a more complex matching problem.
Example 3.1. A two-part product consists of an axle and its sleeve, and the size data of
8 products for both parts and their rankings are given in Table 3.1. The optimal matching
criterion is the variance (see Example 2.7):
\[
\min \mathrm{Var}(y - x) = \frac{1}{n} \sum_{i=1}^{n} \big[ (y_{\pi(i)} - x_i) - \overline{(y - x)} \big]^2,
\]
where $\overline{y - x} = \frac{1}{n} \sum_{i=1}^{n} (y_{\pi(i)} - x_i)$. This is a one-to-one matching problem, in which the total
variance of dimension gaps of matched parts is the optimality criterion. The optimal matching
is obtained using Corollary 2.6 and is listed in Table 3.2. Note that the optimal matching is
not unique in this example because some parts have identical sizes, but all optimal
solutions follow our invariant optimal matching strategy that “small matches with small
and large matches with large”.
Table 3.1: The size data are shown in the top table, and the ranked data (in the increasing
order) are shown in the bottom table.
If we match both parts randomly, we only have a chance with probability 16/8! to obtain
the optimal matching. We may also encounter the worst matching with probability 16/8!.
The total gap variance in the optimal matching is 6.9038 × 10^{−7} and the largest total gap
variance is 1.1609 × 10^{−5}. □
Table 3.2: Optimal matching with gaps
Example 3.2. A two-part product consists of an axle and four sleeves in different positions
of the axle, and the size data for axles and sleeves are given in Table 3.3. The optimal
matching criteria are the total variance and the sum of variances within subsets:
\[
\min \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} \big( y_{\pi(im-m+j)} - x_i - \overline{y - x} \big)^2, \quad \text{and}
\]
\[
\min \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} [y_{\pi(im-m+j)} - \bar{y}_i]^2,
\]
where $\overline{y - x} = \frac{1}{nm} \sum_{i=1}^{n} \sum_{j=1}^{m} (y_{\pi(im-m+j)} - x_i)$ and $\bar{y}_i = \frac{1}{m} \sum_{j=1}^{m} y_{\pi(im-m+j)}$, 1 ≤ i ≤ n.
After sorting the data sets of axles and sleeves, respectively, in the increasing order, we obtain
the optimal matching using Proposition 2.8 as follows:
x2 : y2, y17, y11, y13
x1 : y1, y16, y5, y26
x6 : y35, y3, y29, y34
x9 : y6, y19, y25, y32
x3 : y21, y14, y7, y23
x8 : y22, y31, y20, y4
x5 : y33, y30, y10, y27
x4 : y28, y15, y24, y12
x7 : y9, y36, y8, y18
where the $x_i$'s are arranged in the increasing order. This is a 1-to-4 matching problem with two
optimality criteria, but as long as the objective functions are Schur-convex, our matching
method (see Theorem 2.4) yields the invariant optimal matching strategy that “small matches
with small and large matches with large”. □
Table 3.3: Size data for a one-to-subset matching
Figure 3.1: Subset-to-subset optimal sequential matching
Our majorization-based matching method can be applied to sequential optimal matching.
We illustrate this using a two-dimensional matching problem (see Figure 3.1). Consider the
data sets X = {(x1, y1), (x2, y2), . . . , (xn, yn)}, $U = \{u_1, \dots, u_{nm_1}\}$ and $V = \{v_1, \dots, v_{nm_2}\}$.
The goal is to match {x1, . . . , xn} to U and match {y1, . . . , yn} to V; that is, to find a
permutation π(·) on $\{1, \dots, nm_1\}$ and a permutation τ(·) on $\{1, \dots, nm_2\}$ so as to minimize
\[
\min \frac{1}{nm_1} \sum_{i=1}^{n} \sum_{j=1}^{m_1} \big( u_{\pi(im_1-m_1+j)} - x_i - \overline{u - x} \big)^2; \tag{3.1}
\]
\[
\min \frac{1}{nm_1} \sum_{i=1}^{n} \sum_{j=1}^{m_1} [u_{\pi(im_1-m_1+j)} - \bar{u}_i]^2; \tag{3.2}
\]
\[
\min \frac{1}{nm_2} \sum_{i=1}^{n} \sum_{j=1}^{m_2} \big( v_{\tau(im_2-m_2+j)} - y_i - \overline{v - y} \big)^2; \tag{3.3}
\]
\[
\min \frac{1}{nm_2} \sum_{i=1}^{n} \sum_{j=1}^{m_2} [v_{\tau(im_2-m_2+j)} - \bar{v}_i]^2, \tag{3.4}
\]
where $\overline{u - x} = \frac{1}{nm_1} \sum_{i=1}^{n} \sum_{j=1}^{m_1} (u_{\pi(im_1-m_1+j)} - x_i)$ and $\bar{u}_i = \frac{1}{m_1} \sum_{j=1}^{m_1} u_{\pi(im_1-m_1+j)}$, 1 ≤ i ≤ n,
and $\overline{v - y} = \frac{1}{nm_2} \sum_{i=1}^{n} \sum_{j=1}^{m_2} (v_{\tau(im_2-m_2+j)} - y_i)$ and $\bar{v}_i = \frac{1}{m_2} \sum_{j=1}^{m_2} v_{\tau(im_2-m_2+j)}$, 1 ≤ i ≤ n.
Without loss of generality, we assume that x1 ≤ x2 ≤ · · · ≤ xn, and correspondingly
$(y_1, y_2, \dots, y_n) = (y_{(\rho(1))}, y_{(\rho(2))}, \dots, y_{(\rho(n))})$, where ρ(·) is the permutation on {1, . . . , n} that
maps i to the rank ρ(i) of $y_i$ among y1, . . . , yn; that is, $y_i$ is the ρ(i)th smallest. Using
Proposition 2.8, the permutation
\[
u_{\pi(im_1-m_1+j)} = u_{(im_1-m_1+j)}, \quad 1 \le i \le n, \ 1 \le j \le m_1,
\]
minimizes (3.1) and (3.2), and the permutation
\[
v_{\tau(im_2-m_2+j)} = v_{(\rho(i)m_2-m_2+j)}, \quad 1 \le i \le n, \ 1 \le j \le m_2,
\]
minimizes (3.3) and (3.4).
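The two rank-based assignments above can be sketched as follows. This Python example is a hedged illustration under stated assumptions: the helper names `rank_blocks` and `sequential_matching` are hypothetical, ties are broken by input order, and the data are invented for the demonstration.

```python
from typing import Dict, List, Sequence, Tuple


def rank_blocks(keys: Sequence[float], pool: Sequence[float], m: int) -> Dict[int, List[float]]:
    """Give the item whose key has rank r (r = 1 is smallest) the r-th block
    of m consecutive values of the sorted pool."""
    if len(pool) != len(keys) * m:
        raise ValueError("pool must contain exactly len(keys) * m values")
    order = sorted(range(len(keys)), key=lambda i: keys[i])
    pool_sorted = sorted(pool)
    return {i: pool_sorted[r * m:(r + 1) * m] for r, i in enumerate(order)}


def sequential_matching(
    pairs: Sequence[Tuple[float, float]],
    U: Sequence[float],
    V: Sequence[float],
    m1: int,
    m2: int,
) -> Dict[int, Tuple[List[float], List[float]]]:
    """Match each pair (x_i, y_i) to m1 values of U by the rank of x_i
    and to m2 values of V by the rank rho(i) of y_i."""
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    u_blocks = rank_blocks(xs, U, m1)
    v_blocks = rank_blocks(ys, V, m2)
    return {i: (u_blocks[i], v_blocks[i]) for i in range(len(pairs))}


# Hypothetical data: item 0 has the smaller x but the larger y.
pairs = [(1.0, 20.0), (2.0, 10.0)]
U = [4.0, 3.0, 6.0, 5.0]
V = [30.0, 33.0, 31.0, 32.0, 35.0, 34.0]
result = sequential_matching(pairs, U, V, m1=2, m2=3)
print(result)
```

Because the two coordinates are ranked independently, the same item can receive a low block of U and a high block of V, which is the role played by the permutation ρ in the text.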
Example 3.3. A multi-part product consists of a twined axle and two different sleeves, and
the first axle needs to match two sleeves of the first type and the second axle needs to
match three sleeves of the second type. The size data for axles and sleeves are given in
Table 3.4. The optimal matching strategy is given below:
(x3; y3) : u2, u1; v10, v7, v2
(x2; y2) : u5, u3; v9, v11, v5
(x4; y4) : u6, u7; v1, v4, v3
(x1; y1) : u4, u8; v6, v8, v12
Note that the multi-dimensional optimal matching is a special case of optimal matching on
partially ordered sets. This problem can be viewed as a sequential matching with a 1-to-2
matching followed by a 1-to-3 matching with multiple optimality criteria. Again, as
long as the objective functions are Schur-convex, our matching method (see Theorem 2.4) yields
the invariant optimal matching strategy that “small matches with small and large matches
with large”. □
Table 3.4: Subset-to-subset optimal sequential matching
4 Concluding Remarks
The matching problems presented in this paper were motivated by a consulting problem in
quality management, and the goal is to match non-overlapping subsets of parts of one type
to non-overlapping subsets of parts of another type so as to minimize the total matching
variance and the sum of variances within subsets. We develop a majorization-based method
to solve this problem and find the optimal solutions explicitly by constructing a sequence of
pair-wise local matching operations for persistent variance reductions.
In contrast to optimal matching problems in observational studies [2, 20, 18, 19, 21],
our method can be applied to a wide class of optimal criteria metrics, including the Maha-
lanobis distance and Kantorovich cost functionals. Our method focuses on the structural
properties of matching, such as ranking of matched elements, with the aim of developing the
majorization-based method for continuous matching problems arising from quality manage-
ment. The majorization-based matching method developed in this paper sheds new light
on the optimal matching heuristic that “small matches with small and large matches with
large” for a wide class of objective functions. Our future research includes majorization-based
optimal matching for partially ordered data sets and its application to quality assurance.
Acknowledgement
This work is in honor of Professor Alan G. Hawkes on his 75th birthday. This work was
supported partly by the NSF of China under grant 71371031 and NSF grant DMS 1007556. The
authors would like to thank two anonymous referees and the editor for their valuable suggestions
for improving the paper.
References
[1] Brookhart, M. A., Schneeweiss, S., Rothman, K. J., Glynn, R. J., Avorn, J., and
Sturmer, T. (2006): Variable selection for propensity score models. American Journal
of Epidemiology, 163, 1149-1156.
[2] Cochran, W. G. and Rubin, D. B. (1973): Controlling bias in observational studies: A
review. Sankhya Ser. A, 35, 417-446.
[3] Egozcue, M., and Wong, W. K. (2010): Gains from diversification on convex
combinations: A majorization and stochastic dominance approach. European Journal of
Operational Research, 200, 893-900.
[4] Greenwood, E. (1945): Experimental Sociology: A Study in Method. Kings Crown
Press, New York.
[5] Hansen, B. B. and Klopfer, S. O. (2006): Optimal full matching and related designs via
network flows. Journal of Computational and Graphical Statistics, 15, 609-627.
[6] Imbens, G. W. (2004): Nonparametric estimation of average treatment effects under
exogeneity: A review. Review of Economics and Statistics, 86, 4-29.
[7] Ho, D. E., Imai, K., King, G., and Stuart, E. A. (2007): Matching as nonparametric
preprocessing for reducing model dependence in parametric causal inference. Political
Analysis, 15, 199-236.
[8] Lechner, M. (2002): Some practical issues in the evaluation of heterogeneous labour
market programmes by matching methods. Journal of the Royal Statistical Society, Ser.
A, 165, 59-82.
[9] Lee, M. H. (2013): Variable sampling rate multivariate exponentially weighted moving
average control chart with double warning lines, Quality Technology and Quantitative
Management, 10, 353-368.
[10] Li, H. and Shaked, M. (1993): Stochastic majorization of stochastically monotone fam-
ilies of random variables. Advances in Applied Probability, 25, 895-913.
[11] Liu, F. Y., and Cui, L. R. (2013): A design of attributes single sampling plans for
three-class products. Quality Technology and Quantitative Management, 10, 369-287.
[12] Liu, S. W., and Wu, C. W. (2014): Design and construction of a variables repetitive
group sampling plan for unilateral specification limit, Communications in Statistics-
Simulation and Computation, 43, 1866-1878.
[13] Lu, B., Zanutto, E., Hornik, R., and Rosenbaum, P. R. (2001): Matching with doses in
an observational study of a media campaign against drug abuse. Journal of the American
Statistical Association, 96, 1245-1253.
[14] Marshall, A. W., Olkin, I., and Arnold, B. C. (2009): Inequalities: Theory of Majorization and
Its Applications. Springer, New York.
[15] Morgan, S. L. and Harding, D. J. (2006): Matching estimators of causal effects:
Prospects and pitfalls in theory and practice. Sociological Methods and Research, 35,
3-60.
[16] Rachev, S. T. (1991): Probability Metrics and the Stability of Stochastic Models. John
Wiley & Sons, New York.
[17] Rachev, S. T., and Ruschendorf, L. (1998): Mass Transportation Problems, Volume I:
Theory. Springer, New York.
[18] Rosenbaum, P. R. (2002): Observational Studies, Springer, New York.
[19] Rosenbaum, P. R. (2010): Design of Observational Studies, Springer, New York.
[20] Rubin, D. B. (1973): Matching to remove bias in observational studies. Biometrics, 29,
159-184.
[21] Stuart, E. A. (2010): Matching methods for causal inference: A review and a look
forward. Statistical Science, 25, 1-21.
[22] Xu, S. H., and Li, H. (2000): Majorization of weighted trees: A new tool to study
correlated stochastic systems. Mathematics of Operations Research, 25, 298-323.
[23] Zhao, Z. (2004): Using matching to estimate treatment effects: Data requirements,
matching metrics, and Monte Carlo evidence. Review of Economics and Statistics, 86,
91-107.