Progress On The Union-Closed Conjecture and Offsprings in Winter 2022-2023
Stijn Cambie∗
arXiv:2306.12351v1 [math.CO] 21 Jun 2023
Mathematicians had little idea whether the easy-to-state union-closed conjecture was true or false
even after 40 years. However, last winter saw a surge of interest in the conjecture and its variants,
initiated by the contribution of a researcher at Google. Justin Gilmer made a significant breakthrough
by discovering a first constant lower bound for the proportion of the most common element in a union-
closed family.
[Figure: diagram of an example family of sets; the surviving labels are {1, 2, 3, 4}, {1, 2, 3}, {1} and {2}.]
Korea, supported by the Institute for Basic Science (IBS-R029-C4)

The Union-closed conjecture can now be formally stated as follows.
Conjecture 1 (Union-closed conjecture). If F ≠ {∅} is a union-closed family with ground set [n],
then there exists an element i ∈ [n] such that at least half of the sets in F contain i.
Considering our previous example Fm for large m, one can verify that it might be that only a small
fraction of the elements of the ground set are abundant (belong to at least half of the sets) and their
average proportion of sets to which they belong can tend to zero. Note that this conjecture would be
(arguably) false when taking an infinite ground set N, e.g. by considering the (union-closed) family
of finite subsets of N.
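The two notions used in Conjecture 1 can be checked by brute force on small families. The following is only a minimal sketch: the function names and the toy family are illustrative assumptions, not taken from the text.

```python
from itertools import combinations

def is_union_closed(family):
    """Return True if the union of any two member sets is again a member."""
    sets = set(family)
    return all((a | b) in sets for a, b in combinations(sets, 2))

def abundant_elements(family):
    """Elements that belong to at least half of the sets of the family."""
    sets = set(family)
    ground = set().union(*sets)
    return {i for i in ground if 2 * sum(i in s for s in sets) >= len(sets)}

# Hypothetical toy family, chosen only for illustration.
toy = [frozenset(s) for s in ({1}, {2}, {1, 2}, {1, 2, 3})]
toy_is_closed = is_union_closed(toy)
toy_abundant = abundant_elements(toy)
```

For this toy family, the elements 1 and 2 are abundant while 3 is not, so the conclusion of the conjecture holds for it.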
This conjecture can also be formulated in many different ways. For example, one can consider
bitstrings in $\{0, 1\}^n$ with the element-wise OR-operation. For instance, when n = 4 and
F = {0011, 1100, 1111}, we note that 0011 + 1100 = 1111. This family is closed under the
OR-operation, which corresponds to being union-closed in the initial formulation.
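The bitstring formulation can be illustrated with integer bit operations. This is a small sketch in which the helper name is_or_closed is an assumption made for illustration.

```python
def is_or_closed(bitstrings):
    """Check closure under element-wise OR for equal-length 0/1 strings."""
    values = {int(b, 2) for b in bitstrings}
    return all((x | y) in values for x in values for y in values)

family = ["0011", "1100", "1111"]      # the example with n = 4
family_closed = is_or_closed(family)
without_top = is_or_closed(["0011", "1100"])
```

Dropping 1111 from the family breaks closure, since the OR of 0011 and 1100 is then missing.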
Taking the complements of the sets, one obtains the Intersection-closed sets conjecture, which states
that an intersection-closed family has an element in its ground set appearing in at most half of the
sets. In [3, Sec. 3], one can also find a lattice-, graph-, and Salzborn-formulation.
On November 17, 2022, Justin Gilmer [10], a researcher at Google working in machine learning,
made a breakthrough by proving a first constant fraction for Conjecture 1. Within a few days,
his result prompted others to post improvements and related results on the preprint server arXiv.
In this note, we summarize the contributions and progress made in the winter of 2022–2023.
We explain the main ideas of Gilmer’s approach (Section 2), mention the forthcoming extensions of
his method (Sections 3 and 4), as well as an unsuccessful attempt (Section 5) and discuss other work
related to the Union-closed conjecture (Section 6).
When sampling uniformly at random from F , the entropy will equal log2 |F | and no higher entropy
is possible. If one can sample from F ∪ F in such a way that the entropy is larger than log2 |F |, then
one can conclude that |F ∪ F | > |F |. This is exactly the core of Gilmer’s approach.
More precisely, he proved the following statement.
Theorem 2. Let A and B denote independent and identically distributed random variables that sample
from a common distribution over subsets of [n]. Assume that for all i ∈ [n], P[i ∈ A] ≤ 0.01. Then
H(A ∪ B) ≥ 1.26H(A).
As a corollary, by taking the uniform distribution over the sets of F, one knows that if F ⊂ $2^{[n]}$
is a family for which every element is contained in no more than 1% of the sets, then $|F \cup F| \geq |F|^{1.26}$.


This implies that whenever |F | ≥ 2, either |F ∪ F | > |F | (and so the family is not union-closed) or
there is an element appearing in at least a 0.01 fraction of the sets in F . From this, one can conclude
that Conjecture 1 is true for a half replaced by 0.01.
Example 3. Let F = {{1}, {2}} and thus F ∪ F = {{1}, {2}, {1, 2}}. Let A and B be i.i.d. random
variables that output a set of F uniformly at random. Then $P(A = \{1\}) = P(A = \{2\}) = \frac{1}{2}$ and
analogously for B, which implies
$$P(A \cup B = \{1\}) = P(A \cup B = \{2\}) = \frac{1}{4} \quad\text{and}\quad P(A \cup B = \{1, 2\}) = \frac{1}{2}.$$
Now $H(A) = 2 \cdot \frac{1}{2} \log_2 2 = 1$ and $H(A \cup B) = 2 \cdot \frac{1}{4} \log_2 4 + \frac{1}{2} \log_2 2 = \frac{3}{2}$ $(< \log_2 3)$. Since $\log_2(2) < H(A \cup B)$,
we conclude that it is impossible that A ∪ B takes values in a family with only 2 elements
and thus |F ∪ F| > |F|, i.e. Gilmer’s method verifies that F is not union-closed.
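The entropies in Example 3 can be recomputed mechanically. The sketch below assumes entropy is measured in bits; the helper names entropy and union_distribution are illustrative and not part of the original text.

```python
from collections import Counter
from itertools import product
from math import log2

def entropy(dist):
    """Shannon entropy (in bits) of a dict mapping outcomes to probabilities."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

def union_distribution(family):
    """Distribution of A ∪ B for A, B i.i.d. uniform on the family."""
    counts = Counter(a | b for a, b in product(family, repeat=2))
    total = len(family) ** 2
    return {s: c / total for s, c in counts.items()}

F = [frozenset({1}), frozenset({2})]
H_A = log2(len(F))                        # entropy of the uniform distribution on F
H_union = entropy(union_distribution(F))  # entropy of A ∪ B
```

Here H_A equals 1 and H_union equals 3/2, so H_union exceeds log2 |F| and the union cannot be supported on only two sets.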
Example 4. Let $F = \binom{[3]}{\leq 2}$ and thus $F \cup F = 2^{[3]}$. Note that |F| = 7 and every 1 ≤ i ≤ 3 appears in
exactly 3 sets and thus in a $\frac{3}{7}$ fraction. Let A, B be i.i.d. random variables that output a set of F
uniformly at random. Then
$$P(A \cup B = \emptyset) = \frac{1}{49}, \quad P(|A \cup B| = 1) = \frac{9}{49}, \quad P(|A \cup B| = 2) = \frac{27}{49}, \quad P(A \cup B = [3]) = \frac{12}{49},$$
$$H(A) = 7 \cdot \frac{1}{7} \log_2(7) = \log_2(7),$$
$$H(A \cup B) = \frac{1}{49} \log_2(49) + 3 \cdot \frac{3}{49} \log_2(49/3) + 3 \cdot \frac{9}{49} \log_2(49/9) + \frac{12}{49} \log_2(49/12),$$
and thus H(A) > H(A ∪ B). We conclude that this is an example for which Gilmer’s method does not
provide evidence that the family is not union-closed, even though the maximum fraction of occurrence
of an element is $\frac{3}{7}$.
Note: Analogously, when $F = \binom{[5]}{\leq 3}$, one can verify that $H(A) = \log_2(26) \approx 4.7$ and $H(A \cup B) \approx 4.54$.
Every element appears in a $\frac{11}{26}$ fraction in this case.
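The numbers in Example 4 and the subsequent note can be reproduced by enumeration. This sketch assumes that the two families are all subsets of [3] of size at most 2 and all subsets of [5] of size at most 3 (the latter inferred from |F| = 26); the helper names are illustrative.

```python
from collections import Counter
from itertools import combinations, product
from math import log2

def small_subsets(n, max_size):
    """All subsets of {1, ..., n} with at most max_size elements."""
    return [frozenset(c) for r in range(max_size + 1)
            for c in combinations(range(1, n + 1), r)]

def union_entropy(family):
    """H(A ∪ B) in bits for A, B i.i.d. uniform on the family."""
    counts = Counter(a | b for a, b in product(family, repeat=2))
    total = len(family) ** 2
    return -sum(c / total * log2(c / total) for c in counts.values())

# Map (n, max_size) to (|F|, H(A), H(A ∪ B)).
results = {}
for n, max_size in [(3, 2), (5, 3)]:
    F = small_subsets(n, max_size)
    results[(n, max_size)] = (len(F), log2(len(F)), union_entropy(F))
```

In both cases the third entry is smaller than the second, in line with H(A) > H(A ∪ B), so the entropy comparison alone gives no certificate for these families.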
Lemma 5. Let $\varphi = \frac{1+\sqrt{5}}{2}$ and 0 ≤ x ≤ 1, then $h(x^2) \geq \varphi x h(x)$.
The validity of this lemma was established in two different ways by [1] and Sawin [18]. The former
used accurate computer calculations and applied interval arithmetic on three intervals, while the latter
utilized a purely calculus-based approach. Thanks to some communication between the authors of [1]
and [5], in [5] a reference to the formal proof of [1] was added. In [15] the lemma was split in two
parts without formal proof, but both can be verified easily.
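A numerical sanity check of Lemma 5 on a grid is easy to run; it is of course not a proof. The sketch assumes that φ is the golden ratio (1 + √5)/2, as suggested by the root 1/φ mentioned in the text, and that h is the binary entropy function in bits.

```python
from math import log2, sqrt

PHI = (1 + sqrt(5)) / 2  # golden ratio, assumed value of φ in Lemma 5

def h(x):
    """Binary entropy in bits, with h(0) = h(1) = 0."""
    if x <= 0.0 or x >= 1.0:
        return 0.0
    return -x * log2(x) - (1 - x) * log2(1 - x)

def gap(x):
    """Left side minus right side of the inequality in Lemma 5."""
    return h(x * x) - PHI * x * h(x)

grid = [i / 10_000 for i in range(10_001)]
min_gap = min(gap(x) for x in grid)   # should be nonnegative up to rounding
gap_at_root = gap(1 / PHI)            # equality case x = 1/φ
```

The gap vanishes at x = 1/φ because there x^2 = 1 − x and φx = 1, so both sides equal h(x).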
A shorter and more elegant proof of Lemma 5 was given later by Boppana [2], although the proof
itself reportedly dates back to 1989. This proof relies on the following extension of the classical Rolle’s
theorem, which follows from observations in e.g. [12].
Theorem 6. Let f be a differentiable function on an interval I. Let m(f) be the sum of multiplicities
of the roots of f in I. Then m(f′) ≥ m(f) − 1.
By iterating the theorem three times, one finds m(f) ≤ m(f′′′) + 3. Applying this result to
the function $f(x) = h(x^2) - \varphi x h(x)$ and counting the multiplicities of the roots 0, $\frac{1}{\varphi}$ and 1 of f,
the conclusion that f is nonnegative on [0, 1] follows quickly. Once Lemma 5 is derived, the proof
of Conjecture 1 with the constant ψ (instead of 0.5) is rather short in each of the papers [1, 5, 15, 18],
as indicated e.g. by the total length of the paper by Chase and Lovett [5]. Their work has three steps.
First, they extended the analytic claim (Lemma 5) to the two-variate function $f(x, y) := \frac{h(xy)}{h(x)y + h(y)x}$.
Next, they proved a strengthened inequality between the entropy of A ∪ B and those of A and B,
for random variables A and B (not necessarily identically distributed) on $\{0, 1\}^n$ for which every bit is 1 with
bounded probability. Finally, they finished the proof of their slightly more general statement, which holds
for approximate union-closed families. These are families for which the union of two randomly
drawn sets belongs to the family with high probability.
One example which certifies the sharpness of their proof can be derived from F1 + F2 = {A |
A ∈ F1 ∨ A ∈ F2}, where $F_1 = \binom{[n]}{\psi n + n^{2/3}}$ and $F_2 = \binom{[n]}{\geq (1-\psi) n}$. For this, one needs to note that
|F1| ≫ |F2| and that the union of two (i.i.d. uniformly sampled) random sets from F1 belongs with very
high probability to F2. The expected size of the union is slightly larger (with an additional term of
order $n^{2/3}$, i.e. $\Theta(n^{2/3})$) than $n - (1 - \psi)^2 n = (1 - \psi) n$, and since the standard deviation of the size is
$O(n^{1/2})$, the union almost surely belongs to F2 as well. The conclusion is still valid when replacing
the term $n^{2/3}$ by any function g(n) for which $n \gg g(n) \gg n^{1/2}$.
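The sharpness example can be explored with a small simulation. In this sketch, ψ is taken to be (3 − √5)/2, the value in [0, 1] satisfying the identity 1 − (1 − ψ)^2 = 1 − ψ used above. The choice n = 10000, the number of trials and the fixed seed are arbitrary assumptions made for illustration.

```python
import random
from math import sqrt

PSI = (3 - sqrt(5)) / 2  # assumed value of ψ: the root of (1 - x)^2 = x in [0, 1]

def union_lands_in_F2(n, rng):
    """Draw two uniform sets from F1 and test whether their union is in F2."""
    size = round(PSI * n + n ** (2 / 3))      # common size of the sets in F1
    a = set(rng.sample(range(n), size))
    b = set(rng.sample(range(n), size))
    return len(a | b) >= (1 - PSI) * n        # membership test for F2

rng = random.Random(0)                        # fixed seed for reproducibility
trials = 200
hits = sum(union_lands_in_F2(10_000, rng) for _ in range(trials))
```

With these parameters the expected union size exceeds the threshold (1 − ψ)n by several hundred elements, far more than the typical fluctuation, so every trial should land in F2.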
Figure 2: An approximate union-closed family whose elements appear in at most a ψ + o(1) fraction.
In a different direction, in his paper, Gilmer included some ideas for a full resolution of Conjecture 1,
but some of these directions were immediately proven not to hold by Sawin and Ellis [18, 7].
4 Further refinements and extensions related to Gilmer’s work
Sawin [18] gave a suggestion to improve the bound further, which, given the sharpness of the form
for union-closed families, may be considered surprising. The essence lies in a question stated
purely in terms of probability distributions. His suggestion was worked out by Yu [20] and Cambie [4].
Yu [20] considered the approach in a slightly more general form initially and made a lower bound
computable by restricting to the suggestion of Sawin and applying [1, Lem. 5] and the Krein-Milman
theorem [13] to bound the support (number of values with nonzero probability) of a joint distribution
by 4. A numerical computation then yields a bound equal to (roughly) 0.38234. In parallel, Cambie [4]
found an upper bound for Sawin’s approach which indicates that the improvement is much smaller than
expected and hoped for. The construction is a discrete probability distribution with only
two values having nonzero probability, with the values determined by a system of equations involving
the entropy function. Additionally he proved that this value is sharp, by first reducing the support to
3 elements, where one of the elements equals 1. Finally, the conclusion is derived from the combination
of 3-dimensional plots, a numerical minimization problem and a more precise solution for the case
where the support has exactly two elements, one of which equals 1.
Finally, building upon the work of [5], Yuster [21] considered families that are almost k-union-closed,
meaning that the union of k independent uniform random sets from F belongs to F with
high probability. He conjectured a tight version for the minimum frequency (the proportion of sets
containing the element) of some element in such families, with the threshold for this frequency being
the unique real root in [0, 1] of $(1 - x)^k = x$, denoted by $\psi_k$. To understand the sharpness of his
conjecture and the intuition behind the choice of $\psi_k$, consider the union of $F_1 = \binom{[n]}{\psi_k n + n^{2/3}}$ and
$F_2 = \binom{[n]}{\geq (1-\psi_k) n}$. If at least one set from F2 is included among the k sets drawn, the union is guaranteed
to belong to F2. If all k sets belong to F1, the expected size of the union is $n - (1 - \psi_k)^k n + \Theta(n^{2/3})$,
and since the standard deviation is $O(n^{1/2})$, the union almost surely belongs to F2 as well. The conjecture is
proven to be true for k ≤ 4, while for larger values of k a weaker bound is established.
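The thresholds ψ_k can be approximated numerically. The sketch below uses plain bisection on g(x) = (1 − x)^k − x, which is strictly decreasing on [0, 1]; the function name psi_k and the tolerance are assumptions made for illustration.

```python
def psi_k(k, tol=1e-12):
    """Root of (1 - x)^k = x in [0, 1], found by bisection."""
    lo, hi = 0.0, 1.0              # g(0) = 1 > 0 and g(1) = -1 < 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (1 - mid) ** k - mid > 0:
            lo = mid               # root lies to the right of mid
        else:
            hi = mid               # root lies to the left of mid (or at mid)
    return (lo + hi) / 2

thresholds = {k: psi_k(k) for k in range(2, 6)}
```

For k = 2 this recovers (3 − √5)/2, and the thresholds decrease as k grows, since a union of more sets covers more of the ground set.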
For every X ∈ F, $\Pr[A_\delta = X] \geq (1 - \delta) \Pr[A = X]$ and thus for δ sufficiently small, we have
$h(\Pr[A_\delta = X]) - h(\Pr[A = X]) \gtrsim -\frac{\delta}{|F|} h'(1/|F|)$. On the other hand, for X ∈ (F ∪ F) \ F, let the
probability p := Pr[A ∪ B = X]. We have that $h(\delta p) \sim -\delta p (\log \delta + \log p - 1)$. By choosing δ to be
sufficiently small such that −log δ is much greater than $\frac{1}{p} h'(1/|F|)$, we can ensure that $H(A_\delta) > H(A)$
[Figure: plot of h(x) for x between 0 and 1.]
construction of the family $P_4^{12}$ in [11] can be extended to such families. Let k ≥ 3 be a fixed integer and
let n be a sufficiently large even integer as a function of k (n ≥ 10k works). Let $E_n$ = {i ∈ [n] | i ≡ 0
(mod 2)} and $O_n$ = {i ∈ [n] | i ≡ 1 (mod 2)} be the sets of even and odd integers in [n] respectively.
Consider the family $P_k^n$ consisting of subsets S of [n] of size at least k, such that either
• {1, 2} ⊂ S,
• S ⊂ $E_n$ and 2 ∈ S, or
• S ⊂ $O_n$ and 1 ∈ S.
It is clear that 1 and 2 are abundant elements. The other elements all appear equally often (by
symmetry), and by a small bijection and counting argument, we conclude that these elements are not
abundant whenever
$$\binom{n-3}{k-3} < 2 \binom{n/2-2}{\geq k-1},$$
where $\binom{m}{\geq j}$ denotes the number of subsets of an m-element set with at least j elements.
Since this is the case for n sufficiently large, the conclusion is clear.
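For small parameters the family can be enumerated directly. The sketch below does so for k = 4 and n = 12, the parameters of the family from [11] mentioned above; the function name build_family is an illustrative assumption.

```python
from itertools import combinations

def build_family(n, k):
    """Sets S ⊆ [n] with |S| ≥ k satisfying one of the three conditions."""
    evens = frozenset(range(2, n + 1, 2))
    odds = frozenset(range(1, n + 1, 2))
    family = set()
    for r in range(k, n + 1):
        for c in combinations(range(1, n + 1), r):
            s = frozenset(c)
            if {1, 2} <= s or (s <= evens and 2 in s) or (s <= odds and 1 in s):
                family.add(s)
    return family

family = build_family(12, 4)
freq = {i: sum(i in s for s in family) for i in range(1, 13)}
abundant = {i for i, f in freq.items() if 2 * f >= len(family)}
closed = all((a | b) in family for a in family for b in family)
```

With these parameters the family has 1045 sets and is union-closed. The elements 1 and 2 each lie in 1029 sets, while every other element lies in 522 sets, just below half.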
Another result relating union-closed families and the smallest set size was published in early
2023. Ellis, Ivan and Leader [9] proved that for every k ∈ N, there exists a union-closed family in which
the (unique) smallest set has size k, but where each element of this set has frequency $(1 + o(1)) \frac{\log k}{2k}$.
This shows, in the strongest possible sense, that focusing on the smallest set cannot work. They
also proposed the problem of verifying the union-closed conjecture for a family for which they were
unable to verify the statement. The latter was verified by Pulaj and Wood [17]. They also proved
new bounds on the least number m (given k and n) such that every union-closed family F containing
any $\mathcal{A} \subseteq \binom{[n]}{k}$ with $|\mathcal{A}| = m$ as a subfamily satisfies Conjecture 1.
We can conclude that despite the progress that originates from the breakthrough of Justin Gilmer,
the exact version of Conjecture 1 is still not proven. Mathematicians are still thinking about other
directions or modifications of the strategy and hope to resolve Conjecture 1 in the future. Taking
into account that the improvement by taking combinations suggested by Sawin [18] turned out to be
tinier than expected and hoped for, as illustrated by the example in [4], it seems that the focus should
go towards essentially new ideas. In particular, the union-closed conjecture might be a distraction from a
more general behaviour that $|F \cup F| > |F|^c$ for some c(ε) > 1 when every element of [n] appears in
less than a $\frac{1}{2} - \varepsilon$ fraction of the sets in F.
Note added: In June 2023, Liu [14] improved the constant slightly with a different method of
We thank Zachary Chase, Justin Gilmer, Raffaele Scandone and Lei Yu for internal communication
while writing this manuscript.

