Randomized Truthful Auctions with Learning Agents

Gagan Aggarwal
Google Research
Anupam Gupta
New York University, Google Research
Andres Perlroth
Google Research
Grigoris Velegkas
Yale University
Part of the work was done while the author was a research intern at Google Research in Mountain View.
Abstract

We study a setting where agents use no-regret learning algorithms to participate in repeated auctions. Kolumbus and Nisan (2022a) showed, rather surprisingly, that when bidders participate in second-price auctions using no-regret bidding algorithms, no matter how large the number of interactions $T$ is, the runner-up bidder may not converge to bidding truthfully. Our first result shows that this holds for general deterministic truthful auctions. We also show that the ratio of the learning rates of the bidders can qualitatively affect the convergence of the bidders. Next, we consider the problem of revenue maximization in this environment. In the setting with fully rational bidders, Myerson (1981) showed that revenue can be maximized by using a second-price auction with reserves. We show that, in stark contrast, in our setting with learning bidders, randomized auctions can have strictly better revenue guarantees than second-price auctions with reserves, when $T$ is large enough. Finally, we study revenue maximization in the non-asymptotic regime. We define a notion of auctioneer regret comparing the revenue generated to the revenue of a second-price auction with truthful bids. When the auctioneer has to use the same auction throughout the interaction, we show an (almost) tight regret bound of $\widetilde{\Theta}(T^{3/4})$. If the auctioneer can change auctions during the interaction, but in a way that is oblivious to the bids, we show an (almost) tight bound of $\widetilde{\Theta}(\sqrt{T})$.

1 Introduction

In auction design, truthfulness is a highly sought-after property. It allows bidders to simply reveal their true valuations, simplifying the bidding process. In the standard single item setting with fully rational profit-maximizing bidders, Myerson’s seminal paper Myerson (1981) shows that an auctioneer can achieve optimal revenue by using a truthful and deterministic auction mechanism – a Second Price Auction (SPA) with a reserve price.

In many applications nowadays, buyers no longer bid directly in the auction but, instead, use learning algorithms to bid on their behalf. For example, in online advertising, platforms offer automated bidding tools that manage ad campaigns on behalf of advertisers. Such bidders learn to bid over many rounds and are not fully rational. In a surprising result, Kolumbus and Nisan (2022a) show that some appealing properties of second-price auctions break down in the presence of such learning bidders. In particular, when (profit-maximizing) bidders use no-regret learning algorithms, the second-price auction does not achieve as much revenue as with fully rational bidders. Indeed, bidders do not learn to bid their value, and consequently, the runner-up bidder's bid is less than their value with positive probability, which diminishes the second-price auction's revenue. Moreover, Kolumbus and Nisan (2022b) show that in a setting where rational agents use learning algorithms to bid, it is no longer optimal for them to truthfully submit their value as the input to the learning algorithm. This raises a crucial question: are there truthful auctions that promote convergence to the true valuations within a learning environment, and can they also guarantee strong revenue performance?

In this paper we provide an affirmative answer to this question. In doing so, we also showcase the value of randomized mechanisms — often overlooked in settings with profit-maximizing bidders — for environments where bidders are learning agents. While randomization introduces inherent inefficiencies due to allocations to low-valuation bidders, this very behavior facilitates learning among low-valuation bidders. A revenue-maximizing auctioneer must now carefully balance the randomization within a truthful mechanism to incentivize learning without incurring excessive revenue loss due to mis-allocation.

We build our theory based on the model presented by Kolumbus and Nisan (2022a). We consider single-item repeated interactions over $T$ periods. There are two profit-maximizing bidders participating in the auctions, with valuations that are drawn independently from the same distribution, and fully persistent over time. This assumption is motivated by online ad auctions, where multiple auctions are taking place every second, and the valuations of the advertisers remain stable for certain time scales, e.g., a day or a week. Thus, there is typically a very large sequence of auctions where the valuations of the participating agents are persistent. Bidders use mean-based no-regret learning algorithms (Braverman et al., 2018) and receive full feedback on which they base their updates. (Many of our results extend immediately to multiple bidders. We discuss other extensions, such as the partial feedback settings, in Appendix G.) The auctioneer focuses on truthful auctions, and their objective is to maximize the total revenue they achieve over the $T$ rounds of interaction. Our results are the following:

1.1 Our Results and Techniques

Limitations of Deterministic Auctions. Our first set of results (in Section 3) characterize the convergence of learners who are using Multiplicative Weights Update (MWU) in repeated deterministic auctions. In particular, we show the following sharp phase transition:

  • If the learning rate of the winning type is at least as fast as the learning rate of the runner-up type, then the runner-up type will not converge to bidding truthfully, even as $T \rightarrow \infty$; in fact, it will be bidding strictly below its true value, in expectation.

  • On the other hand, we show that in many auctions, such as SPA, if the learning rate of the runner-up type is strictly faster than that of the winning type, then the runner-up type will indeed converge to truthful bidding.

These generalize the results of Kolumbus and Nisan (2022a), who showed that in SPA, when bidders are using MWU with the same learning rate, the low type will not converge to bidding truthfully. The main challenges in proving this set of results arise from our study of general deterministic auctions, which have less structure than second-price auctions. Indeed, small differences in the learning rates can affect the landscape qualitatively, as our results demonstrate. Moreover, while the auctions are deterministic, the learning algorithms are randomized and highly correlated. Hence our approach is to break down the interaction into several epochs and establish some qualitative properties which hold, with high probability, at the end of each epoch. This requires a careful accounting of the cumulative utility of each bid of both bidders within every epoch; in particular, if our estimation is off by even some $\omega(1)$ term, then it will not be sufficient to establish our result.

Strictly-IC Auctions and the Power of Randomized Mechanisms. The results in Section 3 show that since the low valuation bidder tends to underbid, an auctioneer using SPA with reserve makes strictly less revenue than that predicted by the model with rational agents. Motivated by this, we consider a special class of randomized auctions called strictly-IC auctions. These are randomized truthful auctions where for each bidder, it is strictly better to bid their true valuation compared to any other bid. We show that any strictly-IC auction is asymptotically truthful: that is, each bidder's bid converges to their true value in the limit. Furthermore, we provide a black-box transformation from any truthful auction $A$ (deterministic or not) to a randomized auction $A'$ that has the following two properties: (i) the bidders converge towards truthful bidding, and (ii) the difference between the allocation and payment rules of the original auction $A$ and its strictly-IC counterpart $A'$ is negligible for any bid profile. Hence, such an auction $A'$ behaves similarly to $A$, but, crucially, it conveys information to the low bidder to help it converge to truthful bidding. As a corollary of this result, we get that SPA with reserve is not revenue-maximizing in this setting, and that randomization can get strictly more revenue than SPA with reserve. This is in stark contrast with the seminal result of Myerson (1981), which shows that SPA with reserve is optimal for rational bidders.
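As an illustration only, the following Python sketch shows one way a truthful auction can be perturbed into a strictly-IC one. The construction is an assumption made for this example and is not necessarily the transformation used in the paper: with a small probability `eps` the mechanism runs a hypothetical strictly-IC component with allocation $b/2$ and payment $b^2/4$, and otherwise it runs the base auction (here SPA with an assumed tie-break in favor of bidder 0). All function names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def spa(i, b_i, b_other):
    """Second-price auction for bidder i in {0, 1}; ties go to bidder 0 (assumed tie-break)."""
    wins = b_i > b_other or (b_i == b_other and i == 0)
    return (Fraction(1), b_other) if wins else (Fraction(0), Fraction(0))

def strict_part(i, b_i, b_other):
    """Assumed strictly-IC component: x = b/2, p = b^2/4.
    The utility v*b/2 - b^2/4 exceeds that of any other bid b' by (v - b')^2 / 4."""
    return b_i / 2, b_i * b_i / 4

def mix(base, strict, eps):
    """Run `strict` with probability eps and `base` otherwise (expected allocation and payment)."""
    def mechanism(i, b_i, b_other):
        x_a, p_a = base(i, b_i, b_other)
        x_s, p_s = strict(i, b_i, b_other)
        return (1 - eps) * x_a + eps * x_s, (1 - eps) * p_a + eps * p_s
    return mechanism

def is_strictly_ic(mechanism, grid):
    """Check that truthful bidding is strictly better than every other bid on the grid."""
    for i, v, b, b_other in product((0, 1), grid, grid, grid):
        if b == v:
            continue
        x_t, p_t = mechanism(i, v, b_other)
        x_d, p_d = mechanism(i, b, b_other)
        if v * x_t - p_t <= v * x_d - p_d:
            return False
    return True

grid = [Fraction(k, 4) for k in range(5)]   # B_Delta with Delta = 4
eps = Fraction(1, 100)
mixed = mix(spa, strict_part, eps)
print(is_strictly_ic(spa, grid), is_strictly_ic(mixed, grid))
```

In this sketch the mixed mechanism differs from SPA by at most `eps` in both allocation and payment at every bid profile, which mirrors property (ii) above under the stated assumptions.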

At a more conceptual level, our results for randomized mechanisms can be viewed as showing that having enough randomness is key to the low bidder converging to truthful bidding: this randomness can come from the process itself, e.g., if bidder values are independently drawn in each round, as in Feng et al. (2021). But if not, and if the ranking of the bidders does not change much due to the lack of inherent randomness, our results show that injecting external randomness into the auction induces the desired learning behavior and hence improves the revenue. Having persistent valuations is just one case of the ranking of the bidders remaining stable over time: studying this case allows us to showcase our main ideas, but a central message of our work is that the presence or absence of stability in the rankings of the bidders is the main factor that dictates convergence to truthful bidding.

A Non-Asymptotic Analysis. Our next set of results, in Section 5, addresses the non-asymptotic regime. Here we consider the prior-free setting, meaning that the valuations of the bidders could be drawn from potentially different distributions that are unknown to the auctioneer. In order to evaluate the auctioneer's revenue performance when bidders are learning agents, we introduce the notion of auctioneer regret for an auction, which measures the difference between the revenue achieved over $T$ rounds of implementing a given auction with learning bidders and the revenue achieved by implementing the optimal auction with rational bidders (i.e., SPA with a reserve price). Proposition 5.2 shows that if the auctioneer is constrained to use the same auction rule for all $T$ rounds, then no truthful auction, deterministic or randomized, can achieve an auctioneer regret better than $\widetilde{O}(T^{3/4})$ in the setting of adversarial valuations. However, if the auctioneer can change the auction rule just once within the $T$ rounds, with the change happening at a time independent of the bid history, then the auctioneer's regret drops to $\widetilde{O}(\sqrt{T})$, as we show in Section 5. Moreover, we show in Proposition 5.4 that this bound of $\widetilde{O}(\sqrt{T})$ is optimal even if the auctioneer can design the auction schedule. As a byproduct of our result, we show that the first-stage randomized auction used by the mechanism leads to the fastest convergence to truthful bidding from no-regret learning agents.

To show that an auctioneer facing learning bidders using MWU must suffer an $\Omega(T^{3/4})$ revenue loss compared to the setting when it is facing rational agents, we break down the revenue loss into two non-overlapping epochs: one where the learning bidders have not converged to truthful bidding, and the other where the bidders are truthful. Now an auctioneer using the same auction throughout the interaction faces a trade-off: they can speed up the learning process to reduce the revenue loss from the first epoch, but this loses revenue in the second epoch due to the fact that the auction now differs significantly from SPA. Our result optimizes this trade-off to show that an $\Omega(T^{3/4})$ revenue loss is unavoidable. This naturally suggests decomposing the interaction into two epochs: in the first one, the auctioneer uses a truthful auction to facilitate the convergence to truthful bidding, and in the second one it uses SPA. We then design an auction that guarantees the fastest convergence to truthful bidding for mean-based learners in the prior-free setting, and we show that an improved revenue loss of at most $\widetilde{O}(\sqrt{T})$ can be achieved with this approach. (Importantly, to maintain truthfulness, the decisions of the auctioneer are fixed before the beginning of the interaction and are not affected by the bids.) This regret of $\widetilde{O}(\sqrt{T})$ seems surprising, because in traditional no-regret learning settings the optimal regret is achieved when the exploration and exploitation phases are intermixed.

1.2 Related Work

The most closely related works to our setting are Feng et al. (2021); Deng et al. (2022); Kolumbus and Nisan (2022a); Banchio and Skrzypacz (2022); Rawat (2023). All these works study the long-term behavior of bidding algorithms that participate in repeated auctions, focusing on first-price and second-price auctions, but they give qualitatively different results. This is because they make different assumptions across two important axes: the type of learning algorithms that the bidders use and whether their valuation is persistent across the interaction or freshly drawn in each round. Feng et al. (2021) studied the convergence of no-regret learning algorithms that bid repeatedly in second-price and first-price auctions, where all agents have i.i.d. valuations that are redrawn in every round from a discrete distribution that has non-negligible mass on each point. They show that in this setting the bidders exhibit the same long-term behavior in both second-price and first-price auctions that classical theory predicts, i.e., the bids in second-price auctions are truthful and the bids in first-price auctions form Bayes-Nash equilibria. Kolumbus and Nisan (2022a) studied the same setting with the crucial difference that agents' valuations are persistent across the execution and are not resampled from some distribution at every iteration. Interestingly, they showed that in the case of two bidders in second-price auctions, the agent that has the highest valuation will end up bidding between the low valuation and its own valuation, whereas the agent with the low type will end up bidding strictly below its valuation. Intuitively, in their setting the high type bidder quickly learns to bid above the valuation of the low type bidder and always wins the auction, and thus the low type does not get enough signal to push its bid distribution up to its valuation. On the other hand, when the valuations are redrawn as in Feng et al. (2021), the competition that the agents face varies. In the long run, this gives enough information to the algorithms to realize that bidding truthfully is the optimal strategy. In the case of first-price auctions where the agents have persistent valuations, both Kolumbus and Nisan (2022a) and Deng et al. (2022) provide convergence guarantees for no-regret learning algorithms. The type of "meta-games" we touch upon in our work, where we want to understand the incentives of the agents who are submitting their valuations to bidding algorithms that participate in the auctions on behalf of these agents, were originally studied by Kolumbus and Nisan (2022a) and, subsequently, for more general classes of games by Kolumbus and Nisan (2022b).

The pioneering work of Hart and Mas-Colell (2000) showed that when players deploy no-regret algorithms to participate in games they converge to coarse-correlated equilibria. Recently, there has been a growing interest in the study of no-regret learning in repeated auctions. The empirical study of Nekipelov et al. (2015) showed that the bidding behavior of advertisers on Bing is consistent with the use of no-regret learning algorithms that bid on their behalf. Subsequently, Braverman et al. (2018) showed, among other things, that when a seller faces a no-regret buyer in repeated auctions and can use non-truthful auctions, it can extract the whole welfare as its revenue. A very recent work (Cai et al., 2023) extended some of the previous results to the setting with multiple agents. For a detailed comparison between our work and Cai et al. (2023), we refer to Appendix B.

Banchio and Skrzypacz (2022); Rawat (2023) diverge from the previous works and consider agents that use $Q$-learning algorithms instead of no-regret learning algorithms. Their experimental findings show that in first-price auctions, such algorithmic bidders exhibit collusive phenomena, whereas they converge to truthful bidding in second-price auctions. One of the main reasons for these phenomena is the asynchronous update used by the $Q$-learning algorithm. The collusive behavior of such algorithms has also been exhibited in other settings (Calvano et al., 2020; Asker et al., 2021, 2022b; den Boer et al., 2022; Epivent and Lambin, 2022; Asker et al., 2022a). Notably, Bertrand et al. (2023) formally proved that $Q$-learners do collude when deployed in repeated prisoner's dilemma games.

In a related line of work, Zhang et al. (2023) study the problem of steering no-regret learning agents to a particular equilibrium. They show that the auctioneer can use payments to incentivize the algorithms to converge to a particular equilibrium that the designer wants them to. An interpretation of our results is that randomization is a way to achieve some kind of equilibrium steering in repeated auctions.

Diverging slightly from the setting we consider, some recent papers have illustrated different advantages of using randomized auctions over deterministic ones. Mehta (2022); Liaw et al. (2023) showed that there are randomized auctions which induce equilibria with better welfare guarantees for value-maximizing autobidding agents compared to deterministic ones. In the setting of revenue maximization in the presence of heterogeneous rational buyers, Guruganesh et al. (2022) showed that randomization helps when designing prior-free auctions with strong revenue guarantees, when the valuations of the buyers are drawn independently from, potentially, non-identical distributions.

2 Model

Our model follows the setup used in Kolumbus and Nisan (2022a). There are $T$ rounds, and the auctioneer sells a single item in each round $t = 1, \ldots, T$. There are two bidders, with bidder $i \in \{1, 2\}$ having a persistent private valuation $v_i$ drawn i.i.d. over the discrete set $B_\Delta := \{0, 1/\Delta, 2/\Delta, \ldots, 1\}$ from a regular distribution $F$. (A discrete distribution is regular if the discrete virtual valuation function $\phi(v) := v - \frac{1}{\Delta} \frac{\sum_{v' > v} \Pr[v']}{\Pr[v]}$ is non-decreasing.) Given an allocation probability $x$ and price $p$, the bidder with valuation $v$ receives a payoff of $v \cdot x - p$. In what follows, we refer to the bidder with valuation $v_L = \min\{v_1, v_2\}$ (resp. $v_H = \max\{v_1, v_2\}$) as the low type (resp. high type).
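To make the regularity condition concrete, the following minimal Python sketch evaluates the discrete virtual valuation $\phi$ on the grid $B_\Delta$ and checks monotonicity. The function names and the two example distributions are hypothetical and chosen only for illustration; the sketch assumes every grid point has strictly positive probability.

```python
from fractions import Fraction

def virtual_values(probs):
    """probs[k] = Pr[v = k/Delta] for k = 0..Delta (all entries assumed positive).
    Returns phi(k/Delta) = k/Delta - (1/Delta) * Pr[v' > v] / Pr[v] for each k."""
    delta = len(probs) - 1
    phis = []
    for k, p in enumerate(probs):
        tail = sum(probs[k + 1:])              # Pr[v' > k/Delta]
        phis.append(Fraction(k, delta) - Fraction(1, delta) * tail / p)
    return phis

def is_regular(probs):
    """A distribution is regular if phi is non-decreasing along the grid."""
    phis = virtual_values(probs)
    return all(a <= b for a, b in zip(phis, phis[1:]))

# Uniform distribution on {0, 1/4, 1/2, 3/4, 1} (Delta = 4).
uniform = [Fraction(1, 5)] * 5
# A distribution on {0, 1/2, 1} with very little mass on the middle point.
dip = [Fraction(50, 100), Fraction(1, 100), Fraction(49, 100)]

print(virtual_values(uniform), is_regular(uniform))
print(virtual_values(dip), is_regular(dip))
```

For the uniform example, $\phi(k/4) = k/2 - 1$, which is increasing, so the distribution is regular; the second example has $\phi(0) = -1/2$ and $\phi(1/2) = -24$, so it is not.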

We are interested in truthful auctions (also called strategy-proof auctions, or dominant-strategy incentive-compatible mechanisms) that are individually rational, so that at every round $t$ the auctioneer uses a mechanism $((x^t_1, x^t_2), (p^t_1, p^t_2))$ satisfying

\begin{align*}
v_i \cdot x_i^t(v_i, b') - p_i^t(v_i, b') &\geq v_i \cdot x_i^t(b, b') - p_i^t(b, b'), & \forall v_i, b, b' \in B_\Delta,\ i = 1, 2, \\
v_i \cdot x_i^t(v_i, b') - p_i^t(v_i, b') &\geq 0, & \forall v_i, b' \in B_\Delta,\ i = 1, 2.
\end{align*}
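The two constraints above can be checked by brute force on a small grid. The following Python sketch is an illustration under assumptions: the mechanism interface `(i, b_i, b_other) -> (allocation, payment)`, the function names, and the tie-break in favor of bidder 0 are all hypothetical choices made for this example.

```python
from fractions import Fraction
from itertools import product

def spa(i, b_i, b_other):
    """Second-price auction for bidder i in {0, 1}; ties go to bidder 0 (assumed tie-break)."""
    wins = b_i > b_other or (b_i == b_other and i == 0)
    return (1, b_other) if wins else (0, 0)

def first_price(i, b_i, b_other):
    """First-price auction with the same tie-break, included as a non-truthful contrast."""
    wins = b_i > b_other or (b_i == b_other and i == 0)
    return (1, b_i) if wins else (0, 0)

def is_truthful_and_ir(mechanism, grid):
    """Check both displayed constraints for every v_i, b, b' on the grid and both bidders."""
    for i, v, b, b_other in product((0, 1), grid, grid, grid):
        x_t, p_t = mechanism(i, v, b_other)   # outcome when bidding the true value
        x_d, p_d = mechanism(i, b, b_other)   # outcome when deviating to bid b
        if v * x_t - p_t < v * x_d - p_d:     # truthfulness violated
            return False
        if v * x_t - p_t < 0:                 # individual rationality violated
            return False
    return True

grid = [Fraction(k, 4) for k in range(5)]     # B_Delta with Delta = 4
print(is_truthful_and_ir(spa, grid))          # SPA satisfies both constraints
print(is_truthful_and_ir(first_price, grid))  # first-price fails truthfulness
```

The first-price auction fails because a bidder with value 1 facing a bid of 0 gains by shading its bid, whereas truthful bidding yields zero utility.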

In this work, we study various properties of randomized truthful auctions.

Definition 2.1 (Randomized Truthful Auction).

A truthful auction $((x_1, x_2), (p_1, p_2))$ is randomized if there is some bid profile $(b_1, b_2) \in B_\Delta \times B_\Delta$ such that either $x_1(b_1, b_2) \in (0, 1)$ or $x_2(b_1, b_2) \in (0, 1)$.

Bidders employ learning algorithms that bid over the $T$ rounds. We assume that the learning algorithms are mean-based no-regret learning algorithms (Braverman et al., 2018). For the following discussion, define $U_i^t(b \mid \mathbf{b}^t_{-i}) := \sum_{\tau=1}^{t} \left( v_i \cdot x_i^\tau(b, b^\tau_{-i}) - p_i^\tau(b, b^\tau_{-i}) \right)$ to be the cumulative reward of agent $i$ when they bid $b$ over the $t$ rounds, while the other agent's bids are $\mathbf{b}^t_{-i} = \{b^\tau_{-i}\}_{\tau \in [t]}$. The mean-based property states that if a bid $b \in B_\Delta$ has performed significantly better than bid $b' \in B_\Delta$, then the probability of bidding $b'$ in the next round is negligible. This is formalized below.

Definition 2.2 (Mean-Based Property (Braverman et al., 2018)).

An algorithm for agent $i$ is $\delta$-mean-based if for any bid sequence $\mathbf{b}^t_{-i}$ such that $U_i^{t-1}(b \mid \mathbf{b}^t_{-i}) - U_i^{t-1}(b' \mid \mathbf{b}^t_{-i}) > \delta \cdot T$, for some $b, b' \in B_\Delta$, the probability of playing bid $b'$ in the next round is at most $\delta$. We say that an algorithm is mean-based if it is $\delta$-mean-based for some $\delta = o(1)$.

The no-regret learning property states that the cumulative utility that the bidding algorithm generates is close to the cumulative utility that the optimal fixed bid would have generated, regardless of the history of bids the other bidders played. This is formalized in Definition C.1. Mean-based no-regret learning algorithms are becoming a standard class of learning algorithms to use in auction environments (see, e.g., Braverman et al. (2018); Feng et al. (2021); Deng et al. (2022); Kolumbus and Nisan (2022a), and references therein) and include many known no-regret learning algorithms, including the multiplicative-weights update algorithm (MWU). For completeness, we present the version of MWU that we use in our work in Algorithm 1. The above definitions consider a fixed value of $T$. To study a sequence of such values $T$ and the limiting behavior as $T \rightarrow \infty$, we say that a family of algorithms, parameterized by the time horizon $T$, satisfies the mean-based definition if there exists $\{\delta_T\}_{T \in \mathbb{N}}$ such that $\delta_T \rightarrow 0$ as $T \rightarrow \infty$, and each algorithm in this family is $\delta_T$-mean-based. We define the no-regret property of such a family of algorithms in a similar way. In general, the asymptotic behavior of the algorithms we study in this work is with respect to $T$, and the big-$O$ notation suppresses quantities that do not depend on $T$.
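As a rough illustration of why MWU is mean-based in the sense of Definition 2.2, the sketch below uses the textbook exponential-weights form, in which each bid is played with probability proportional to $\exp(\eta \cdot U)$ for its cumulative utility $U$. This is an assumption: Algorithm 1 is not reproduced here and may differ in details, and the learning rate `eta` and the numbers in the example are hypothetical.

```python
import math

def mwu_distribution(cum_utils, eta):
    """Bid distribution proportional to exp(eta * cumulative utility).
    Subtracting the maximum keeps the exponentials numerically stable."""
    top = max(cum_utils)
    weights = [math.exp(eta * (u - top)) for u in cum_utils]
    total = sum(weights)
    return [w / total for w in weights]

# Two bids whose cumulative utilities differ by 100, with learning rate 0.1:
# the lagging bid is played with probability about exp(-10).
probs = mwu_distribution([100.0, 0.0], eta=0.1)
print(probs)
```

In this form, a utility gap of $\delta T$ between two bids bounds the probability of the worse bid by $\exp(-\eta \delta T)$, which is the kind of decay that the mean-based property asks for.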

For the sake of exposition, we focus on the full feedback setting: after every round $t \in [T]$, the algorithm learns for each bid $b \in B_{\Delta}$ the (expected) utility it would have generated had it played bid $b$. In Appendix G, we discuss potential extensions.
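To make the mean-based property concrete, the following is a minimal sketch of a multiplicative-weights learner under full feedback. It is not necessarily identical to Algorithm 1: the exponential-weights form, the function names `mwu_distribution` and `full_feedback_update`, and the representation of $B_{\Delta}$ as a Python list are all assumptions made for illustration. Under this form, the probability of a bid $b'$ is at most $\exp(-\eta \cdot \text{gap})$ whenever some allowed bid has cumulative utility larger by `gap`, which is the kind of bound the mean-based definition asks for.

```python
import math


def mwu_distribution(cum_utils, eta, bids, value):
    """Bid distribution of an exponential-weights learner.

    cum_utils[k] is the cumulative (counterfactual) utility of bids[k].
    Bids above `value` receive probability zero (no-overbidding assumption).
    """
    allowed = [k for k, b in enumerate(bids) if b <= value]
    # Subtract the best cumulative utility for numerical stability.
    top = max(cum_utils[k] for k in allowed)
    weights = [0.0] * len(bids)
    for k in allowed:
        weights[k] = math.exp(eta * (cum_utils[k] - top))
    total = sum(weights)
    return [w / total for w in weights]


def full_feedback_update(cum_utils, round_utils):
    """Full feedback: the utility of every grid bid is observed and added."""
    return [u + r for u, r in zip(cum_utils, round_utils)]
```

Because the best allowed bid always contributes weight 1 to the normalizer, each other bid's probability is bounded by its own exponential weight, which is where the $\exp(-\eta \cdot \text{gap})$ bound comes from.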

Throughout this paper we make a natural assumption on the algorithms, which restricts bidders to never bid over their value. Specifically, for any round $t$ and any history of bids before round $t$, agent $i$ bids $b_i > v_i$ with zero probability. Without this assumption, Braverman et al. (2018); Cai et al. (2023) show that the auctioneer can extract the entire welfare in the setting where the valuations of the agents are drawn i.i.d. in each round. We focus on the last-iterate convergence of the distribution of the bids of the algorithms as $T \rightarrow \infty$. This is a desirable property of algorithms in multi-agent games, and recent work has focused on establishing it for learning algorithms (Cai et al., 2022b, a; Cai and Zheng, 2022). Last-iterate convergence is formalized in Definition C.2.

Due to space limitations, all the proofs of our results can be found in the appendix.

3 Deterministic Truthful Auctions

In this section we study the effect of the learning rate on the convergence of no-regret learning algorithms in non-degenerate deterministic truthful auctions. Informally, the non-degeneracy requirement states that i) the winning bidder $W$ under truthful bidding gets strictly positive utility, and ii) there is some sufficiently small bid of the winning bidder such that the runner-up bidder $R$ wins the item by bidding $v_R$ but does not win by bidding $v_R - \nicefrac{1}{\Delta}$. The formal definition is given in Definition D.1. We focus our attention on bidders that use MWU to participate in the auctions, and we study the bidding distribution they converge to as a function of the ratio of the learning rates of their algorithms. Throughout this section we refer to the bidder who wins the auction under truthful bidding as the winning bidder and to the bidder who loses the auction under truthful bidding as the runner-up bidder. Our main result in this section shows the following behavior in non-degenerate deterministic truthful auctions:

  • The winning bidder converges to bidding between its minimum winning bid and its true value, regardless of the learning rates of the algorithms.

  • If the learning rate of the runner-up bidder is strictly faster than the learning rate of the winning bidder, then the runner-up bidder converges to bidding truthfully.

  • If the learning rate of the runner-up bidder is not strictly faster than that of the winning bidder, then the runner-up bidder converges to a bidding distribution whose mean is strictly smaller than its true value. This result holds under an even milder requirement than non-degeneracy, namely, as long as the utility of the winning bidder under truthful bidding is strictly positive.

We remark that, when the learning rates of the algorithms are fixed before the two valuations of the agents are drawn i.i.d. from some distribution $F$, then with probability at least $1/2$ the runner-up bidder will not converge to bidding truthfully, if the underlying auction is deterministic. As we will show later, this behavior worsens the revenue guarantees of the auction.

Let us first set up some notation to facilitate our discussion. We denote by $v_W \in \{v_L, v_H\}$ and $\eta_T^W$ (resp., $v_R \in \{v_L, v_H\}$ and $\eta_T^R$) the value and learning rate of the winning bidder, i.e., the one who wins if both bidders bid truthfully (resp., of the runner-up bidder). We would like to remind the readers that, typically, the learning rate $\eta_T$ of MWU is a decreasing function of $T$ and is chosen in a way to minimize the quantity $\nicefrac{C_\Delta}{\eta_T} + C'_\Delta \cdot \eta_T \cdot T$, where $C_\Delta, C'_\Delta$ are discretization-dependent constants.
Usually, it is instantiated with $\eta_T = 1/\sqrt{T}$. However, for the purposes of our analysis we will say that $\eta_T$ is non-degenerate if $\lim_{T \rightarrow \infty} \eta_T \cdot T = \infty$ and $\lim_{T \rightarrow \infty} \eta_T \cdot \log T = 0$. The intuition is that if the learning rate is slower than $1/T$, the bidder will be adjusting its bid distribution very slowly, so it will not learn to bid correctly. On the other hand, if the rate is faster than $1/\log T$, then the bidder will be adjusting its distribution too aggressively.
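As a short worked check of the two statements above, assuming only that $C_\Delta, C'_\Delta > 0$: minimizing the stated quantity over $\eta_T$ recovers the usual $1/\sqrt{T}$ scaling, and that choice satisfies both non-degeneracy conditions.

```latex
\begin{align*}
f(\eta) &= \frac{C_\Delta}{\eta} + C'_\Delta \cdot \eta \cdot T,
\qquad
f'(\eta) = -\frac{C_\Delta}{\eta^2} + C'_\Delta \cdot T = 0
\;\Longrightarrow\;
\eta^\star_T = \sqrt{\frac{C_\Delta}{C'_\Delta \cdot T}},
\qquad
f(\eta^\star_T) = 2\sqrt{C_\Delta \cdot C'_\Delta \cdot T}, \\
\eta^\star_T \cdot T &= \sqrt{\frac{C_\Delta}{C'_\Delta}} \cdot \sqrt{T} \;\longrightarrow\; \infty,
\qquad
\eta^\star_T \cdot \log T = \sqrt{\frac{C_\Delta}{C'_\Delta}} \cdot \frac{\log T}{\sqrt{T}} \;\longrightarrow\; 0 .
\end{align*}
```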

Our results show that in deterministic auctions the convergence behavior of the bidders depends heavily on the ratio between the learning rates. In particular, for the bidder with valuation $v_W$, we show that its bids converge to a distribution supported between $\hat{p}$, the price it would pay if both bidders bid truthfully, and its value $v_W$, no matter what the choice of the learning rate of its algorithm is. On the other hand, the convergence behavior of the runner-up bidder is more nuanced: if $\nicefrac{\eta_T^R}{\eta_T^W} = \omega(1)$, i.e., the runner-up bidder learns more aggressively than the winning bidder, then it converges to bidding truthfully.
However, if $\nicefrac{\eta_T^R}{\eta_T^W} < C_\Delta$, where $C_\Delta$ is some discretization-dependent constant, then the runner-up converges to a bidding distribution that puts positive mass on every (discretized) point between $0$ and $v_R$, and, in particular, its expected value is strictly less than $v_R$. We remark that even though our proof idea is inspired by Kolumbus and Nisan (2022a), our analysis considers all the possible learning rates that MWU could be instantiated with and requires a more technically involved argument. In particular, we notice that while the result of Kolumbus and Nisan (2022a) is, implicitly, proved for identical learning rates, we show that the choice of the learning rate affects the qualitative behavior of the algorithms in a crucial way.

We prove this result in two parts. We start with the case where $\nicefrac{\eta_T^R}{\eta_T^W} < C_\Delta$. The idea of the proof is to split the horizon into consecutive periods of size $O(1/\eta_T^R)$, which we call epochs. Now, following the idea of Kolumbus and Nisan (2022a), we show that within each epoch the runner-up bidder bids truthfully $\Omega(1/\eta_T^W)$ many times, so the total utility of the winning bidder for bidding between $\hat{p}$ and $v_W$ will be at least $\Omega(1/\eta_T^W)$ greater than bidding anything between $0$ and $\hat{p} - 1/\Delta$.
Because its learning rate is $\eta_T^W$, this means that it will move a constant fraction of its mass from the region $\{0, 1/\Delta, \ldots, \hat{p} - 1/\Delta\}$ to the region $\{\hat{p}, \ldots, v_W\}$. Summing this geometric series, we see that the winning bidder will submit bids in the region $\{0, 1/\Delta, \ldots, \hat{p} - 1/\Delta\}$ at most $O(1/\eta_T^W)$ many times. Let us now focus on the runner-up bidder. Following the previous argument, its total utility for bidding $v_R$ will be at most $O(1/\eta_T^W)$ greater than bidding some other bid $b' \in B_\Delta$.
Since $\eta_T^R/\eta_T^W < C_\Delta$, this means the probability of bidding $b'$ after $T$ rounds is only smaller than the probability of bidding $v_R$ by a discretization-dependent multiplicative constant. The formal statement of this result and its proof are postponed to Appendix D.

Our next result illustrates that the convergence behavior of the runner-up type exhibits a sharp phase-transition phenomenon: if $\eta_T^R$ is even slightly faster than $\eta_T^W$, i.e., $\nicefrac{\eta_T^R}{\eta_T^W} = \omega(1)$, then the runner-up will learn to bid truthfully. Let us first give a high-level idea of the proof. As before, we split the horizon into intervals of size $O(1/\eta_T^W)$. We consider the first interval of this interaction. Because of the choice of the learning rate, we can show that the winning bidder will bid $v_R - 1/\Delta$ at least $\Omega(1/\eta_T^W)$ many times.
This means that the total utility of bidding $v_R$ for the runner-up bidder will be at least $\Omega(1/\eta_T^W)$ greater than bidding any other bid. Since $\eta_T^R/\eta_T^W = \omega(1)$, after the first epoch the MWU algorithm will place all but an $o(1)$-fraction of the probability mass on bidding truthfully. The formal statement and its proof appear in Appendix D.
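The dependence on the ratio of learning rates can be explored numerically. The sketch below simulates two exponential-weights bidders in a repeated second-price auction with full feedback. Several choices are assumptions for illustration rather than details taken from the paper: ties are split evenly, values are given as grid indices `k_w > k_r` on $\{0, 1/\Delta, \ldots, 1\}$, and the helper names are hypothetical. It returns the final bid distributions of the winning and runner-up bidders so that their means can be compared for different values of `eta_r / eta_w`.

```python
import math
import random


def mwu_probs(cum_utils, eta, top_idx):
    """Exponential-weights distribution over grid indices 0..top_idx."""
    best = max(cum_utils[: top_idx + 1])
    weights = [math.exp(eta * (u - best)) for u in cum_utils[: top_idx + 1]]
    total = sum(weights)
    padding = [0.0] * (len(cum_utils) - top_idx - 1)  # no bids above value
    return [w / total for w in weights] + padding


def spa_utility(value_idx, bid_idx, other_idx, delta):
    """Second-price utility on the grid; ties split evenly (an assumption)."""
    if bid_idx > other_idx:
        return (value_idx - other_idx) / delta
    if bid_idx == other_idx:
        return 0.5 * (value_idx - other_idx) / delta
    return 0.0


def simulate(delta, k_w, k_r, eta_w, eta_r, rounds, seed=0):
    """Return the final bid distributions (winning bidder, runner-up)."""
    rng = random.Random(seed)
    grid = range(delta + 1)
    cum_w = [0.0] * (delta + 1)
    cum_r = [0.0] * (delta + 1)
    for _ in range(rounds):
        bid_w = rng.choices(grid, weights=mwu_probs(cum_w, eta_w, k_w))[0]
        bid_r = rng.choices(grid, weights=mwu_probs(cum_r, eta_r, k_r))[0]
        # Full feedback: every grid bid is scored against the realized
        # opposing bid.
        for k in grid:
            cum_w[k] += spa_utility(k_w, k, bid_r, delta)
            cum_r[k] += spa_utility(k_r, k, bid_w, delta)
    return mwu_probs(cum_w, eta_w, k_w), mwu_probs(cum_r, eta_r, k_r)


if __name__ == "__main__":
    T = 20000
    eta = 1 / math.sqrt(T)
    dist_w, dist_r = simulate(10, k_w=8, k_r=5, eta_w=eta, eta_r=eta, rounds=T)
    print(sum(k / 10 * p for k, p in enumerate(dist_r)))  # mean runner-up bid
```

A finite-horizon run of this kind only illustrates the dynamics; it is not a substitute for the asymptotic statements proved in Appendix D.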

Next, we discuss the implications that our results have for the revenue guarantees of the auctioneer. In the setting with rational bidders, the seminal work of Myerson (1981) showed that using second-price auctions with an anonymous reserve price, which depends on the value distribution $F$, generates the optimal revenue for the auctioneer. Our next result shows that this is no longer true when the bidders are learning agents, even when the valuations of the agents are drawn i.i.d. from the uniform distribution on $B_\Delta$, which we denote by $U[B_\Delta]$. Intuitively, this happens because, no matter what the reserve price is, with some non-zero probability the valuations of both agents will be higher than the reserve price. Then, since the runner-up bids will be strictly lower than the true valuation, the generated revenue will be strictly lower than in the setting with rational agents, even when $T \rightarrow \infty$.

Theorem 3.1 (SPA with Reserve Is Not Revenue Optimal).

Let two agents draw their valuations from the uniform distribution $U[B_\Delta]$ and participate in $T$ repeated auctions using mean-based learners. Let $b_1^T, b_2^T$ be the bid distributions after $T$ rounds. Let $\mathrm{Rev}(b_1, b_2; r)$ denote the revenue of the second-price auction with reserve price $r$ when the bids are $(b_1, b_2) \in B_\Delta^2$. Then, for all $r < 1 - 1/\Delta$,

$$\mathop{\mathbf{E}}_{v_1, v_2 \sim U[B_\Delta]}\left[\lim_{T \rightarrow \infty} \mathop{\mathbf{E}}_{b_1 \sim b_1^T,\, b_2 \sim b_2^T}\left[\mathrm{Rev}(b_1, b_2; r) \mid v_1, v_2\right]\right] < \mathop{\mathbf{E}}_{v_1, v_2 \sim U[B_\Delta]}\left[\mathrm{Rev}(v_1, v_2; r)\right] - c\,,$$

where $c > 0$ is some constant that does not depend on $T$.
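The right-hand benchmark in Theorem 3.1 can be computed exactly on a small grid. The sketch below does so with exact fractions and, purely for illustration, compares it against a hypothetical shading model in which the lower-value bidder bids uniformly on the grid points up to its value while the higher-value bidder bids truthfully. The uniform shading is an assumed stand-in for a limiting distribution with mass below $v_R$, not the distribution derived in the paper; the function names and the convention that the winner pays the larger of the reserve and the losing bid are likewise assumptions.

```python
from fractions import Fraction
from itertools import product


def spa_revenue(b1, b2, r):
    """Revenue of a second-price auction with reserve r for bids b1, b2."""
    hi, lo = max(b1, b2), min(b1, b2)
    if hi < r:
        return Fraction(0)  # nobody clears the reserve, so there is no sale
    return max(lo, r)       # winner pays the larger of reserve and losing bid


def truthful_revenue(delta, r):
    """Expected revenue when both values are uniform on the grid and bid truthfully."""
    grid = [Fraction(k, delta) for k in range(delta + 1)]
    total = sum(spa_revenue(v1, v2, r) for v1, v2 in product(grid, grid))
    return total / len(grid) ** 2


def shaded_revenue(delta, r):
    """Illustrative only: the lower-value bidder bids uniformly on grid
    points up to its value; the higher-value bidder bids truthfully."""
    grid = [Fraction(k, delta) for k in range(delta + 1)]
    total = Fraction(0)
    for v1, v2 in product(grid, grid):
        hi, lo = max(v1, v2), min(v1, v2)
        shades = [b for b in grid if b <= lo]
        total += sum(spa_revenue(hi, b, r) for b in shades) / len(shades)
    return total / len(grid) ** 2
```

With no reserve, the shaded model collects the runner-up's bid rather than its value, so its revenue falls below the truthful benchmark by a gap that does not depend on $T$.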

4 The Value of Randomized Truthful Auctions: The Asymptotic Case

In this section we show that there is a class of randomized auctions such that when mean-based no-regret learners participate in them repeatedly, they converge to truthful bidding. This holds for any choice of the learning rates of these algorithms, which is in contrast to the results of Section 3. We start by defining a class of auctions called strictly IC.

Definition 4.1 (Strictly IC Auctions).

An auction is called strictly IC if for every bidder $i \in [n]$, valuation $v_i \in B_\Delta$, and bid profile $b_{-i} \in B_\Delta^{n-1}$ it holds that $v_i \cdot x_i(v_i, b_{-i}) - p_i(v_i, b_{-i}) > v_i \cdot x_i(b, b_{-i}) - p_i(b, b_{-i}), \forall b \neq v_i$.
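Definition 4.1 can be checked by brute force on a small grid. The sketch below does this with exact fractions and applies it to two auctions. The first is a hypothetical randomized auction, introduced here only for illustration and not the example from Section 5, which allocates to bidder $i$ with probability $b_i/n$ and charges $b_i^2/(2n)$; truthful bidding then beats any other bid $b$ by exactly $(v_i - b)^2/(2n)$. The second is a two-bidder second-price auction with ties going to bidder 0 (an assumed tie-breaking rule), which is IC but not strictly IC.

```python
from fractions import Fraction
from itertools import product


def is_strictly_ic(alloc, pay, n, delta):
    """Brute-force check of strict IC over the grid {0, 1/delta, ..., 1}.

    alloc(i, bids) and pay(i, bids) take a bidder index and a tuple of bids.
    """
    grid = [Fraction(k, delta) for k in range(delta + 1)]
    for i in range(n):
        for v in grid:
            for others in product(grid, repeat=n - 1):
                def utility(b):
                    bids = others[:i] + (b,) + others[i:]
                    return v * alloc(i, bids) - pay(i, bids)
                truthful = utility(v)
                if any(utility(b) >= truthful for b in grid if b != v):
                    return False
    return True


# Hypothetical randomized auction: win probability proportional to own bid.
def proportional_alloc(i, bids):
    return bids[i] / len(bids)


def proportional_pay(i, bids):
    return bids[i] ** 2 / (2 * len(bids))


# Two-bidder second-price auction; ties go to bidder 0 (an assumption).
def spa_alloc(i, bids):
    j = 1 - i
    wins = bids[i] > bids[j] or (bids[i] == bids[j] and i == 0)
    return Fraction(int(wins))


def spa_pay(i, bids):
    return bids[1 - i] * spa_alloc(i, bids)
```

The second-price auction fails the check because a bidder who loses under truthful bidding gets utility zero from every other losing bid as well, so the inequality is not strict.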

The next result, which is very useful for our derivation, states that when mean-based no-regret learning algorithms bid in some strictly IC auction they converge to bidding truthfully. Recall the definition of a mean-based learner (cf. Definition 2.2), which states that if the cumulative utility of some bid $b$ up until round $t-1$ is much smaller than the utility of some other bid $b'$, then the probability of playing $b$ in the next round $t$ is negligible. The proof is postponed to Appendix E.

Lemma 4.2 (Convergence in Strictly IC Auctions).

Consider $n$ bidders that participate in a repeated strictly IC auction $A$ using mean-based no-regret learning algorithms. Then, as $T \rightarrow \infty$, the bidders converge to truthful bidding in a last-iterate sense.

The next important observation is that when we are taking a non-trivial combination of an IC auction with a strictly IC auction, the resulting auction is strictly IC. The notion of mixture we consider is formalized in Definition 4.3.

Definition 4.3 (Mixture of Auctions).

Let $A = (x(\cdot), p(\cdot))$ be an IC auction and $A' = (x'(\cdot), p'(\cdot))$ be a strictly IC auction. For some $q \in (0,1)$ we define the $q$-mixture of the auctions $A, A'$ to be the auction $\widetilde{A}_q = \left(q \cdot x(\cdot) + (1-q) \cdot x'(\cdot),\; q \cdot p(\cdot) + (1-q) \cdot p'(\cdot)\right)$.

Notice that for the allocation rule $q \cdot x(\cdot) + (1-q) \cdot x'(\cdot)$, Myerson’s lemma states that the corresponding payment rule that makes the auction truthful is indeed $q \cdot p(\cdot) + (1-q) \cdot p'(\cdot)$. The following claim, whose proof follows from the definition of this class of auctions, formalizes the fact that the class of strictly IC auctions is closed under mixtures with IC auctions.

Claim 1 (Mixture of IC and Strictly IC Auction).

Let $A, A'$ be an IC auction and a strictly IC auction, respectively. Then, for any $q \in (0,1)$ the auction $q \cdot A + (1-q) \cdot A'$ is strictly IC.
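For completeness, the one-line calculation behind the claim can be written out as follows. The shorthand $u_i^{A}(b) = v_i \cdot x_i(b, b_{-i}) - p_i(b, b_{-i})$ is introduced here only for this display; $v_i$, $b_{-i}$ and $b \neq v_i$ are arbitrary.

```latex
\begin{align*}
u_i^{\widetilde{A}_q}(v_i) - u_i^{\widetilde{A}_q}(b)
  &= q \cdot \underbrace{\left(u_i^{A}(v_i) - u_i^{A}(b)\right)}_{\ge\, 0 \text{ since } A \text{ is IC}}
   + (1-q) \cdot \underbrace{\left(u_i^{A'}(v_i) - u_i^{A'}(b)\right)}_{>\, 0 \text{ since } A' \text{ is strictly IC}} \\
  &\ge (1-q) \cdot \left(u_i^{A'}(v_i) - u_i^{A'}(b)\right) \;>\; 0 ,
\end{align*}
```

where the last inequality uses $q < 1$. The same display shows that the truthfulness gap of the mixture is at least a $(1-q)$ fraction of the gap of $A'$.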

We remark that we can construct strictly IC auctions using randomization; such an example is presented in Section 5. Equipped with the above results, we can show that there is a black-box transformation from any IC auction $A$ to a strictly IC auction $A'$ so that, as $T \rightarrow \infty$, any mean-based learning algorithm converges to truthful bidding, and the auction $A'$ is close to the auction $A$ in the sense that $|x_i(b) - x'_i(b)| = o(1), |p_i(b) - p'_i(b)| = o(1), \forall i \in [n], \forall b \in B_\Delta^n$. The formal statement of the result follows.

Theorem 4.4.

Let $A$ be an IC auction for $n$ agents with valuations $v_1, \ldots, v_n$. Let each agent $i \in [n]$ use a mean-based no-regret learning algorithm to bid in the auction. Then, there exists an auction $A'$ such that for each agent $i \in [n]$ we have that $\lim_{T \rightarrow \infty} b_i^T = v_i$ and $|x_i(b) - x'_i(b)| = o(1), |p_i(b) - p'_i(b)| = o(1), \forall b \in B_\Delta^n$, where $x_i(\cdot), x'_i(\cdot)$ (resp. $p_i(\cdot), p'_i(\cdot)$) is the allocation (resp. payment) rule of $A, A'$.
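One natural way to see where the closeness guarantee can come from is the following sketch. It assumes, without claiming this is the construction used in Appendix E, that $A'$ is a mixture $q \cdot A + (1-q) \cdot A''$ of $A$ with some strictly IC auction $A''$ in the sense of Definition 4.3, and that all allocations and payments lie in $[0,1]$. The symbol $A'' = (x''(\cdot), p''(\cdot))$ is introduced here only for illustration.

```latex
\begin{align*}
|x'_i(b) - x_i(b)| &= (1-q) \cdot |x''_i(b) - x_i(b)| \;\le\; 1-q, \\
|p'_i(b) - p_i(b)| &= (1-q) \cdot |p''_i(b) - p_i(b)| \;\le\; 1-q .
\end{align*}
```

Under these assumptions, any mixing weight with $1 - q = o(1)$ gives the stated closeness. How slowly $1-q$ may vanish relative to the mean-based parameter $\delta_T$ is not addressed by this sketch.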

Equilibria of Meta-Game in Repeated Strictly IC Auctions. We now describe the implications that our results have for the meta-game that we alluded to in Section 1. Recall that this game is defined as follows: the agents submit their valuations to mean-based no-regret learning algorithms and then, given these fixed valuations, the algorithms bid on behalf of the agents in a repeated truthful auction $A$. The main question we are interested in is the following: given the specification of the auction and the valuations of the agents, what is the optimal value each agent should submit to its algorithm in order to maximize its utility after a large number of steps?

Despite the fact that $A$ is IC and IR, Kolumbus and Nisan (2022a) showed that, rather surprisingly, when two agents use MWU to participate in repeated second price auctions there are instances where the agent with the low valuation has an incentive to report a higher value to its algorithm than its true one. This is because the valuation reported by one agent affects the bidding distribution that the other agent will converge to. To illustrate this point, assume that the low type reports $v'_L > v_H$ to its bidding algorithm. Then, the bidder with type $v_H$ will take the role of the low bidder in the interaction and will converge to bidding strictly below $v_H$. Now if its expected bid is also below $v_L$, this will generate strictly positive utility for its opponent. Using our previous construction from Theorem 4.4 and transforming the auction $A$ to a strictly IC auction $A'$, we can show that in the new meta-game every agent can gain at most $o(1)$ more utility in the long run by misreporting to the algorithm than by reporting its true valuation. The reason why we observe a qualitatively different behavior in our construction is that every algorithm converges to bidding its reported value, no matter what the reported values of the other agents are. Due to space constraints, we refer the interested reader to Appendix E.

Revenue Maximization in the Learning Setting. In this section, we illustrate another application of Theorem 4.4 to revenue maximization in the learning setting. We are interested in auctions with strong revenue guarantees when the bids come from the limiting distribution of the algorithms, as $T \rightarrow \infty$. This has the additional complication that not only do agents draw their valuations from the distribution $F$, but also their bids come from the limiting distribution that the algorithms converge to, as $T \rightarrow \infty$. As we have seen already, this distribution depends on the valuation reported to the algorithm, the particular mean-based algorithm that it is using, and, potentially, the reported valuations and the algorithms of the opposing bidders.

As we explained in Section 3, second-price auctions with reserves have strictly worse revenue guarantees in the setting with learning bidders compared to the setting with rational bidders. Using our transformation described in Theorem 4.4 we can restore their revenue guarantees. The following result, whose formal proof is deferred to Appendix E, is essentially a corollary of Theorem 4.4. Let us denote by $\mathrm{Rev}(A; b_1, \ldots, b_n)$ the revenue of some auction $A$ and by $\mathrm{Rev}(\mathrm{Myerson}; b_1, \ldots, b_n)$ the revenue of Myerson's optimal auction for $F$, where the bid profile is $b_1, \ldots, b_n \in B^n_\Delta$.

Corollary 4.5.

Consider an environment with $n$ agents that draw their values i.i.d. from some regular distribution $F$ and participate in repeated single-item auctions using mean-based no-regret learning algorithms. Then, there is a randomized auction $A$ so that

$$\mathbf{E}_{v_1,\ldots,v_n \sim F^n}\bigg[\lim_{T \rightarrow \infty} \mathbf{E}_{b_1 \sim b_1^T, \ldots, b_n \sim b_n^T}\left[\mathrm{Rev}(A; b_1, \ldots, b_n)\right] ~\bigg|~ v_1, \ldots, v_n\bigg] \geq \mathbf{E}_{v_1,\ldots,v_n \sim F^n}\left[\mathrm{Rev}(\mathrm{Myerson}; v_1, \ldots, v_n)\right] - o(1).$$

Given the results from Theorem 3.1 and Corollary 4.5 we would like to remark the following.

Remark 1 (Randomized Auctions vs. SPA with Reserve).

Our results illustrate that randomized auctions have strictly better revenue guarantees compared to SPA with reserve price, when the bidders are using mean-based no-regret learning algorithms. This is a property of randomized auctions that is not witnessed in the setting where the bidders are fully rational, as proven by Myerson (1981).

5 Revenue Maximization in the Finite Time Horizon Setting

So far, we have focused on the asymptotic regime and we have studied the convergence of the learning bidders under various auctions. In this section, we study the finite-horizon setting, where our goal is to come up with auctions that have strong revenue guarantees for the auctioneer. We focus on the prior-free setting, meaning that the auctioneer does not have any distributional knowledge about the valuations of the agents. As in the rest of the paper, we assume that the two buyers are using mean-based no-regret learning algorithms to participate in single-item auctions for $T$ rounds. Since we are working in the prior-free setting, it is natural to compete with the cumulative revenue of the second-price auction. The goal of the auctioneer is to choose an auction in a way that minimizes

$$\widetilde{\mathrm{Reg}}_T(A; v_L, v_H) = \sum_{t=1}^{T} \mathrm{Rev}(v_L, v_H; \mathrm{SP}) - \mathbf{E}\left[\sum_{t=1}^{T} \mathrm{Rev}(b_L^t, b_H^t; A)\right],$$

where the expectation is taken with respect to the randomness of the learning algorithms and, potentially, the auction. We will refer to this benchmark as the auctioneer regret. One quantity that will be useful for the derivation of our results is the following

$$\gamma_A = \min_{i \in \{1,2\},\; b_i, b_{-i}, v_i \in B^3_\Delta \,:\, b_i \neq v_i} \left\{ \left(v_i \cdot x_i(v_i, b_{-i}) - p_i(v_i, b_{-i})\right) - \left(v_i \cdot x_i(b_i, b_{-i}) - p_i(b_i, b_{-i})\right) \right\},$$

i.e., the minimum increase in utility from bidding truthfully instead of bidding non-truthfully in $A$.
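To make the definition of $\gamma_A$ concrete, the following minimal sketch enumerates the minimum by brute force for a two-bidder auction. The grid $B_\Delta = \{0, 1/\Delta, \ldots, 1\}$, the function names, and the uniform tie-breaking rule for the second-price auction are assumptions made for illustration and are not taken from the paper.

```python
from fractions import Fraction
from itertools import product

def bid_grid(delta):
    # Assumed grid B_Delta = {0, 1/Delta, ..., 1}.
    return [Fraction(k, delta) for k in range(delta + 1)]

def strictness_gap(alloc, pay, grid):
    """Return gamma_A: the smallest utility gain of truthful bidding over any misreport.

    alloc(i, bids) and pay(i, bids) give bidder i's allocation probability and
    expected payment for the two-bidder bid profile `bids` (a tuple).
    """
    best = None
    for i, v, b, b_other in product((0, 1), grid, grid, grid):
        if b == v:
            continue  # the minimum ranges over misreports only

        def utility(own_bid):
            bids = (own_bid, b_other) if i == 0 else (b_other, own_bid)
            return v * alloc(i, bids) - pay(i, bids)

        gain = utility(v) - utility(b)
        best = gain if best is None else min(best, gain)
    return best

def spa_alloc(i, bids):
    # Second-price auction with ties broken uniformly at random (assumed).
    mine, other = bids[i], bids[1 - i]
    if mine > other:
        return Fraction(1)
    return Fraction(1, 2) if mine == other else Fraction(0)

def spa_pay(i, bids):
    # The winner pays the other bid.
    return spa_alloc(i, bids) * bids[1 - i]

gamma_spa = strictness_gap(spa_alloc, spa_pay, bid_grid(4))
```

Under these assumptions the second-price auction has `gamma_spa == 0`: it is truthful but not strictly so, since any bid above the opponent's bid yields the same utility.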

Our first goal is to understand the dependence of the auctioneer regret on the time horizon $T$. Then, we will move on to establishing bounds with respect to the number of discretized bids $\Delta$. Our first result shows that given any strictly IC auction $A$ there exists an auction $A_T$ that achieves auctioneer regret $O\left(T \cdot \sqrt{\frac{\Delta \cdot \delta_T}{\gamma_A}}\right)$. This is formalized below and the proof is postponed to Appendix F.

Proposition 5.1.

There exists an auction $A_T$, which is a mixture of some strictly IC auction $A$ and $\mathrm{SPA}$, such that, for all $v_L, v_H \in [0,1]^2$ and for all $\delta_T$-mean-based learning algorithms, it holds that $\widetilde{\mathrm{Reg}}_T(A_T; v_L, v_H) = O\left(\sqrt{\frac{\Delta \cdot \delta_T}{\gamma_A}} \cdot T\right), \forall v_L, v_H \in B^2_\Delta$.

We emphasize that for common mean-based no-regret learning algorithms such as MWU it is the case that $\delta_T = \widetilde{O}\left(1/\sqrt{T}\right)$, which implies that the auctioneer regret from Proposition 5.1 grows as $\widetilde{O}\left(T^{3/4}\right)$. Our next result complements this result by showing that even if the high-valuation bidder always bids truthfully and the low-valuation bidder uses MWU with learning rate $\Theta(1/\sqrt{T})$, no auction can achieve a better auctioneer regret than $O(T^{3/4})$.
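For reference, the $T^{3/4}$ rate follows by direct substitution into the bound of Proposition 5.1, treating $\Delta$ and $\gamma_A$ as constants and assuming $\delta_T = \widetilde{O}(T^{-1/2})$:

```latex
\[
T \cdot \sqrt{\frac{\Delta \cdot \delta_T}{\gamma_A}}
= T \cdot \sqrt{\frac{\Delta}{\gamma_A}} \cdot \widetilde{O}\!\left(T^{-1/4}\right)
= \widetilde{O}\!\left(\sqrt{\frac{\Delta}{\gamma_A}} \cdot T^{3/4}\right).
\]
```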

Proposition 5.2 (Lower Bound for Constant Auction Policies).

Consider a repeated auction environment where the high-valuation bidder bids truthfully and the low-valuation bidder uses MWU with rate $\Theta(1/\sqrt{T})$. Then, every truthful auction $A_T$ has an auctioneer regret $\widetilde{\mathrm{Reg}}_T(A_T; v_L, v_H) \geq C_\Delta \cdot T^{3/4}$, where $C_\Delta > 0$ is some constant that depends on the discretization parameter.

The proof is postponed to Appendix F. We note that choosing the learning rate of MWU to be $1/\sqrt{T}$ gives the optimal no-regret guarantees. Other choices, such as $\eta_T = \Omega(1)$, have trivial regret bounds.

Having established the previous results for repeated auctions where the auction remains constant across all iterations, it is natural to ask whether we can get improved results when the auctioneer is allowed to change the underlying auction, but in a way that is oblivious to the bids that bidders have submitted so far. In other words, the auctioneer has to commit to an auction schedule $\{A_1, \ldots, A_T\}$ before the beginning of the interaction. We extend the definition of the auctioneer regret in a natural way to allow for different auctions in every timestep and we denote $\widetilde{\mathrm{Reg}}_T(A_1, \ldots, A_T; v_L, v_H) = \sum_{t=1}^{T} \mathrm{Rev}(v_L, v_H; \mathrm{SP}) - \mathbf{E}\left[\sum_{t=1}^{T} \mathrm{Rev}(b_L^t, b_H^t; A_t)\right]$. Our next result shows that there exists an auction schedule where the auctioneer changes the underlying auction only once throughout the interaction so that its regret is bounded by $\widetilde{O}(\delta_T \cdot T)$. For typical choices of $\eta_T$ this translates to an auctioneer regret bounded by $\widetilde{O}(\sqrt{T})$. The main insight is that the auctioneer can split the interaction into two intervals: the first interval has size $T_0$, for some appropriately chosen $T_0 \in [T]$, where the auctioneer uses some strictly IC auction $A$ that encourages the learners to converge to bidding truthfully. Then, assuming that $T_0$ is large enough to guarantee this convergence, the auctioneer switches to using the second-price auction. This is perhaps counterintuitive because in other no-regret learning settings, such as multi-armed bandits, the optimal regret bound is achieved when exploration and exploitation are happening simultaneously, whereas in our setting these two phases are separated.
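The two-phase idea can be sketched as follows. This is only an illustration under stated assumptions: the strictly IC auction used in the first phase is the staircase auction of Definition 5.5 with the expected payment $b^2/4$ per bidder (obtained from the continuous form of Myerson's payment formula), the function names are hypothetical, and the regret is computed for one given bid sequence rather than in expectation over the learners' randomness.

```python
from fractions import Fraction

def spa_revenue(b_low, b_high):
    # Second-price auction: the winner pays the lower bid.
    return min(b_low, b_high)

def staircase_revenue(b_low, b_high):
    # Assumed strictly IC auction: each bidder is picked w.p. 1/2 and wins
    # w.p. equal to its bid; expected payment b^2 / 4 per bidder.
    return (b_low ** 2 + b_high ** 2) / 4

def two_phase_schedule(T, T0):
    # Oblivious schedule: strictly IC auction for T0 rounds, then SPA.
    return [staircase_revenue] * T0 + [spa_revenue] * (T - T0)

def auctioneer_regret(v_low, v_high, bids, schedule):
    # bids[t] = (b_low^t, b_high^t); the benchmark is SPA with truthful bids.
    benchmark = len(schedule) * spa_revenue(v_low, v_high)
    earned = sum(rev(bl, bh) for rev, (bl, bh) in zip(schedule, bids))
    return benchmark - earned

# Example: even with truthful bids in every round, the first phase costs revenue.
v_low, v_high = Fraction(1, 2), Fraction(1)
truthful_bids = [(v_low, v_high)] * 10
regret = auctioneer_regret(v_low, v_high, truthful_bids, two_phase_schedule(10, 4))
```

In this example the regret is 3/4, all of it incurred during the four rounds of the first phase, which illustrates why $T_0$ should be only as large as convergence requires.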

Theorem 5.3.

There exists an auction schedule $(A_1, \ldots, A_T)$ in which $A_1 = A_2 = \ldots = A_{T_0} = A$, where $A$ is any strictly IC auction, and $A_{T_0+1} = A_{T_0+2} = \ldots = A_T = \mathrm{SP}$, that achieves $\widetilde{\mathrm{Reg}}(A_1, \ldots, A_T; v_L, v_H) \leq O\left(\delta_T \cdot T \cdot \left(\frac{1}{\gamma_A} + \Delta\right)\right), \forall v_L, v_H \in B^2_\Delta$.

The formal proof of this result is postponed to Appendix F. The previous result shows that for $\eta_T = \widetilde{O}\left(1/\sqrt{T}\right)$ the auctioneer regret of the auction schedule we designed is $\widetilde{O}(\sqrt{T})$. Thus, we see an $\widetilde{O}(T^{1/4})$ improvement compared to the previous setting where the auctioneer was restricted to using the same auction across all iterations.

Next, we prove that even if the auctioneer uses a different auction in every step, our bound from Theorem 5.3 is (almost) optimal with respect to the time horizon $T$. The proof idea is that when the agents are using MWU with learning rate $\eta_T$, the signals in the first $O(1/\eta_T)$ steps are insufficient for them to move their bidding distribution to truthful bids. That is, with at least some constant probability in every round within the first $O(1/\eta_T)$ rounds, they will not be bidding their true valuation. Importantly, our lower bound holds even in the (unrealistic) setting where the auctioneer can choose $A_1, \ldots, A_T$ conditioned on $v_L, v_H$. This is formalized below; the proof is postponed to Appendix F.

Proposition 5.4.

When two agents are using MWU with learning rate $1/\sqrt{T}$ to participate in repeated single-item auctions, for all auction schedules $(A_1, \ldots, A_T)$ it holds that $\widetilde{\mathrm{Reg}}(A_1, \ldots, A_T; v_L, v_H) = \Omega(\sqrt{T})$.
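The intuition behind this lower bound can be checked numerically. The sketch below assumes an exponential-weights update $w \leftarrow w \cdot e^{\eta u}$ with utilities in $[0,1]$ and uniform initial weights; the exact MWU variant analyzed in the paper may differ in its update rule. It considers the case most favorable to one bid (utility 1 in every round, 0 for all other bids) and reports the probability placed on that bid after the first $1/\eta_T$ rounds.

```python
import math

def favored_mass_after(k, eta, num_bids):
    """Probability on one bid after k rounds of exponential weights.

    Most favorable case for that bid: it earns utility 1 in every round
    while every other bid earns 0 (utilities assumed to lie in [0, 1]).
    """
    log_weight = 0.0
    for _ in range(k):
        log_weight += eta * 1.0  # w <- w * exp(eta * utility), in log space
    favored = math.exp(log_weight)
    # All other bids keep their initial weight of 1.
    return favored / (favored + (num_bids - 1))

T = 10_000
eta = 1 / math.sqrt(T)   # learning rate 1/sqrt(T)
k = round(1 / eta)       # the first 1/eta rounds
mass = favored_mass_after(k, eta, num_bids=10)
```

Here `mass` equals $e/(e+9)$, which is below $1/4$: even in the best case, after $1/\eta_T = \sqrt{T}$ rounds the learner still plays other bids with constant probability, and these rounds are what the $\Omega(\sqrt{T})$ bound charges for.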

Having established the optimal dependence with respect to the time horizon $T$, we now shift our attention to understanding the dependence of the auctioneer regret on the discretization parameter $\Delta$. First, we define an auction $\bar{A}$ that satisfies $\gamma_{\bar{A}} = \Theta(1/\Delta^2)$.

Definition 5.5 (Staircase Auction).

We define the allocation rule of auction $\bar{A}$ in the following way: select a bidder $i \in \{1,2\}$ uniformly at random (each with probability $1/2$), independently of the bids, and then allocate to $i$ with probability $b_i$. We define the payment rule in the way that makes the auction truthful.

A simple application of Myerson's lemma shows that $\gamma_{\bar{A}} = \Theta(1/\Delta^2)$. This is because between any two consecutive bids, i.e., bids whose distance is $1/\Delta$, the increase in the allocation is $\frac{1}{2\Delta}$ and the function is linear. A corollary of Theorem 5.3 shows the following bound on the auctioneer regret.
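As a sanity check on the $\Theta(1/\Delta^2)$ claim, the sketch below computes $\gamma_{\bar{A}}$ by enumeration. It assumes the grid $B_\Delta = \{0, 1/\Delta, \ldots, 1\}$ and the payment $p(b) = b \cdot x(b) - \int_0^b x(z)\,dz = b^2/4$ given by the continuous form of Myerson's formula for $x(b) = b/2$; the paper's discretized payment rule may differ by constants. Since a bidder's allocation in $\bar{A}$ does not depend on the other bid, it suffices to range over the bidder's own value and bid.

```python
from fractions import Fraction

def staircase_utility(value, bid):
    # Allocation x(b) = b / 2 and assumed payment p(b) = b^2 / 4.
    return value * bid / 2 - bid ** 2 / 4

def staircase_gamma(delta):
    # Minimum gain of truthful bidding over any misreport on the grid.
    grid = [Fraction(k, delta) for k in range(delta + 1)]
    return min(
        staircase_utility(v, v) - staircase_utility(v, b)
        for v in grid
        for b in grid
        if b != v
    )
```

Under these assumptions the gain from truthfulness is $(v - b)^2/4$, so the minimum is attained at adjacent grid points and `staircase_gamma(delta)` equals $1/(4\Delta^2)$.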

Corollary 5.6.

Let the bidders use a mean-based learner with $\eta_T = \widetilde{O}\left(\sqrt{\log \Delta / T}\right)$ and the auctioneer use the schedule $(A_1, \ldots, A_T)$ with $A_1 = \ldots = A_{T_0} = \bar{A}$, $A_{T_0+1} = A_{T_0+2} = \ldots = A_T = \mathrm{SPA}$, for $T_0 = \widetilde{O}\left(\sqrt{T}/\Delta^2\right)$.
Then, $\widetilde{\mathrm{Reg}}(A_1, \ldots, A_T; v_L, v_H) \leq \widetilde{O}\left(\Delta^2 \sqrt{T}\right), \forall v_L, v_H \in B^2_\Delta$.
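The bound follows by substituting $\gamma_{\bar{A}} = \Theta(1/\Delta^2)$ into Theorem 5.3, assuming $\delta_T = \widetilde{O}(1/\sqrt{T})$ as for MWU:

```latex
\[
\delta_T \cdot T \cdot \left(\frac{1}{\gamma_{\bar{A}}} + \Delta\right)
= \widetilde{O}\!\left(\sqrt{T}\right) \cdot \left(\Theta\!\left(\Delta^2\right) + \Delta\right)
= \widetilde{O}\!\left(\Delta^2 \sqrt{T}\right).
\]
```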

6 Conclusion

Our work studies the behavior of learning bidders in repeated single-item auctions, with persistent valuations. We show the limitations of deterministic mechanisms, and how nuances such as learning rates can qualitatively affect participant behavior. Moreover, we show that randomized auctions can encourage faster convergence of bidders to truthful behavior. We hope our work paves the way to better understanding of learning agents’ behavior in single-parameter environments, and of the power of randomization.

Acknowledgements

Anupam Gupta is supported in part by NSF grants CCF-1955785 and CCF-2006953. Grigoris Velegkas is supported in part by the AI Institute for Learning-Enabled Optimization at Scale (TILOS).

References

  • Asker et al. (2021) John Asker, Chaim Fershtman, and Ariel Pakes. 2021. Artificial intelligence and pricing: The impact of algorithm design. Technical Report. National Bureau of Economic Research.
  • Asker et al. (2022a) John Asker, Chaim Fershtman, and Ariel Pakes. 2022a. The Impact of AI Design on Pricing. Technical Report. Working Paper.
  • Asker et al. (2022b) John Asker, Chaim Fershtman, Ariel Pakes, et al. 2022b. Artificial intelligence, algorithm design and pricing. In AEA Papers and Proceedings, Vol. 112. American Economic Association, 452–56.
  • Banchio and Skrzypacz (2022) Martino Banchio and Andrzej Skrzypacz. 2022. Artificial intelligence and auction design. In Proceedings of the 23rd ACM Conference on Economics and Computation. 30–31.
  • Bertrand et al. (2023) Quentin Bertrand, Juan Duque, Emilio Calvano, and Gauthier Gidel. 2023. Q-learners Can Provably Collude in the Iterated Prisoner’s Dilemma. arXiv:2312.08484 [cs.GT]
  • Braverman et al. (2018) Mark Braverman, Jieming Mao, Jon Schneider, and Matt Weinberg. 2018. Selling to a no-regret buyer. In Proceedings of the 2018 ACM Conference on Economics and Computation. 523–538.
  • Cai et al. (2023) Linda Cai, S Matthew Weinberg, Evan Wildenhain, and Shirley Zhang. 2023. Selling to Multiple No-Regret Buyers. arXiv preprint arXiv:2307.04175 (2023).
  • Cai et al. (2022a) Yang Cai, Argyris Oikonomou, and Weiqiang Zheng. 2022a. Accelerated algorithms for monotone inclusions and constrained nonconvex-nonconcave min-max optimization. arXiv preprint arXiv:2206.05248 (2022).
  • Cai et al. (2022b) Yang Cai, Argyris Oikonomou, and Weiqiang Zheng. 2022b. Finite-time last-iterate convergence for learning in multi-player games. Advances in Neural Information Processing Systems 35 (2022), 33904–33919.
  • Cai and Zheng (2022) Yang Cai and Weiqiang Zheng. 2022. Accelerated single-call methods for constrained min-max optimization. arXiv preprint arXiv:2210.03096 (2022).
  • Calvano et al. (2020) Emilio Calvano, Giacomo Calzolari, Vincenzo Denicolò, Joseph E Harrington Jr, and Sergio Pastorello. 2020. Protecting consumers from collusive prices due to AI. Science 370, 6520 (2020), 1040–1042.
  • den Boer et al. (2022) Arnoud V den Boer, Janusz M Meylahn, and Maarten Pieter Schinkel. 2022. Artificial collusion: Examining supracompetitive pricing by Q-learning algorithms. Amsterdam Law School Research Paper 2022-25 (2022).
  • Deng et al. (2022) Xiaotie Deng, Xinyan Hu, Tao Lin, and Weiqiang Zheng. 2022. Nash convergence of mean-based learning algorithms in first price auctions. In Proceedings of the ACM Web Conference 2022. 141–150.
  • Epivent and Lambin (2022) Andréa Epivent and Xavier Lambin. 2022. On Algorithmic Collusion and Reward-Punishment Schemes. Available at SSRN 4227229 (2022).
  • Feng et al. (2021) Zhe Feng, Guru Guruganesh, Christopher Liaw, Aranyak Mehta, and Abhishek Sethi. 2021. Convergence analysis of no-regret bidding algorithms in repeated auctions. In Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 35. 5399–5406.
  • Guruganesh et al. (2022) Guru Guruganesh, Aranyak Mehta, Di Wang, and Kangning Wang. 2022. Prior-Independent Auctions for Heterogeneous Bidders. arXiv preprint arXiv:2207.09429 (2022).
  • Hart and Mas-Colell (2000) Sergiu Hart and Andreu Mas-Colell. 2000. A simple adaptive procedure leading to correlated equilibrium. Econometrica 68, 5 (2000), 1127–1150.
  • Kolumbus and Nisan (2022a) Yoav Kolumbus and Noam Nisan. 2022a. Auctions between regret-minimizing agents. In Proceedings of the ACM Web Conference 2022. 100–111.
  • Kolumbus and Nisan (2022b) Yoav Kolumbus and Noam Nisan. 2022b. How and why to manipulate your own agent: On the incentives of users of learning agents. Advances in Neural Information Processing Systems 35 (2022), 28080–28094.
  • Liaw et al. (2023) Christopher Liaw, Aranyak Mehta, and Andres Perlroth. 2023. Efficiency of Non-Truthful Auctions in Auto-bidding: The Power of Randomization. In Proceedings of the ACM Web Conference 2023. 3561–3571.
  • Mehta (2022) Aranyak Mehta. 2022. Auction design in an auto-bidding setting: Randomization improves efficiency beyond VCG. In Proceedings of the ACM Web Conference 2022. 173–181.
  • Myerson (1981) Roger B Myerson. 1981. Optimal auction design. Mathematics of Operations Research 6, 1 (1981), 58–73.
  • Nekipelov et al. (2015) Denis Nekipelov, Vasilis Syrgkanis, and Eva Tardos. 2015. Econometrics for learning agents. In Proceedings of the Sixteenth ACM Conference on Economics and Computation. 1–18.
  • Rawat (2023) Pranjal Rawat. 2023. Designing Auctions when Algorithms Learn to Bid: The critical role of Payment Rules. arXiv preprint arXiv:2306.09437 (2023).
  • Roughgarden (2010) Tim Roughgarden. 2010. Algorithmic game theory. Commun. ACM 53, 7 (2010), 78–86.
  • Skreta (2006) Vasiliki Skreta. 2006. Mechanism design for arbitrary type spaces. Economics Letters 91, 2 (2006), 293–299. https://doi.org/10.1016/j.econlet.2005.12.005
  • Zhang et al. (2023) Brian Hu Zhang, Gabriele Farina, Ioannis Anagnostides, Federico Cacciamani, Stephen Marcus McAleer, Andreas Alexander Haupt, Andrea Celli, Nicola Gatti, Vincent Conitzer, and Tuomas Sandholm. 2023. Steering No-Regret Learners to Optimal Equilibria. arXiv preprint arXiv:2306.05221 (2023).

Appendix A Multiplicative Weights Update (MWU)

In this section we describe the version of MWU we consider in this work. Similar to Braverman et al. (2018), we are using the following version of the algorithm.

1:  Choose $\eta_T = \sqrt{\frac{\log \Delta}{T}}$. Initialize $\Delta$ weights, letting $w^t_i$ be the value of the $i$th weight at round $t$. Initially, set all $w^0_i = 1$, and let $v$ be the valuation of the agent.
2:  for $t = 1$ to $T$ do
3:     Choose bid $b_i$ with probability $p^t_i = w^{t-1}_i / \sum_j w^{t-1}_j$.
4:     for $j = 1$ to $K$ do
5:        Let $u^t_j = v \cdot x^t(b_j, b') - p^t(b_j, b')$
6:        Set $w^t_j = w^{t-1}_j \cdot e^{\eta_T u^t_j}$.
7:     end for
8:  end for
ALGORITHM 1 Multiplicative Weights Update Algorithm.
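
The update in Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `second_price_utility` and `mwu_bidder` are hypothetical, the auction is taken to be a two-bidder second-price auction in which a tie is lost, the opponent's bids are supplied as a fixed sequence, and the number of grid points stands in for $\Delta$ in the learning rate.

```python
import math
import random


def second_price_utility(v, bid, other_bid):
    """Utility v*x - p in a second-price auction; a tie is lost (assumption)."""
    return v - other_bid if bid > other_bid else 0.0


def mwu_bidder(v, bids, other_bids, utility=second_price_utility, seed=0):
    """Run MWU over the bid grid `bids` against the sequence `other_bids`.

    Returns the list of bids played and the final probability vector.
    """
    rng = random.Random(seed)
    T = len(other_bids)
    # Learning rate eta_T = sqrt(log(#bids) / T); #bids plays the role of Delta.
    eta = math.sqrt(math.log(len(bids)) / T)
    weights = [1.0] * len(bids)  # w_i^0 = 1 for every bid
    played = []
    for b_other in other_bids:
        # Sample a bid with probability proportional to its current weight.
        played.append(rng.choices(bids, weights=weights, k=1)[0])
        # Full-information update: every grid bid is scored against b_other.
        weights = [w * math.exp(eta * utility(v, b, b_other))
                   for w, b in zip(weights, bids)]
    total = sum(weights)
    return played, [w / total for w in weights]
```

For instance, calling `mwu_bidder(0.75, [0.0, 0.25, 0.5, 0.75, 1.0], [0.5] * 400)` scores every grid bid in each round, so all bids that beat the opponent's bid accumulate the same weight, regardless of which bid was actually sampled.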

Appendix B Further Related Work

We view our results and the setting in which we work as orthogonal to the setting of Cai et al. (2023). Firstly, they do not restrict themselves to truthful auctions, and for their welfare extraction results, the agents are allowed to overbid. Secondly, in their setting, redrawing valuations i.i.d. in every round helps the learning process (this was also observed by Feng et al. (2021)). Intuitively, consider two agents and SPA: for every valuation of player 1, there is some positive probability that player 2’s draw is below it, hence player 1 will learn that bidding truthfully is strictly better (in expectation over the other random draw), which leads to the desired bidding behavior. In such a system, randomness is already present due to the draws of the valuations, which helps the convergence to the right bidding behavior.

Our work also differs from Cai et al. (2023) in having different conceptual goals: we aim to “restore” the single-shot behavior in natural auctions, such as second-price auctions, in the presence of mean-based learning agents by making minimal modifications to the underlying auction rule. On the other hand, Cai et al. (2023) aim to exploit the mean-based learning behavior to extract more revenue, and their auctions diverge from the truthful ones we consider in our work. Thus, in our setting, it is clear that reporting the valuation truthfully to the bidding algorithm is an (almost) optimal strategy for the agents (i.e., the so-called “meta-game” considered by Kolumbus and Nisan (2022a) is truthful), whereas it is not clear to us whether reporting the valuations truthfully to the no-regret algorithms is an optimal strategy in the setting of Cai et al. (2023).

Appendix C Omitted Details from Section 2

Skreta (2006) shows that our discrete-type space mechanism design problem approximates the mechanism design problem with continuous type space as $\Delta \to \infty$: specifically, Proposition 1 from that paper gives the following claims.

Claim 2.

A mechanism is truthful if and only if, for every $v_{-i}$, $x_i(v_i, v_{-i})$ is non-decreasing in $v_i$ and $p_i$ satisfies

$$\left|\, p_i(v_i, v_{-i}) - \left( v_i x_i(v_i, v_{-i}) - \int_0^{v_i} x_i(z, v_{-i})\, dz \right) \right| \leq O(1/\Delta).$$
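
As an illustration of the payment formula (our own example, not part of Skreta's statement), consider a two-bidder second-price auction in the continuous limit, assuming bidder $i$ wins exactly when $v_i > v_{-i}$. The formula then recovers the familiar rule that the winner pays the other bid:

```latex
x_i(z, v_{-i}) = \mathbf{1}[z > v_{-i}]
\;\Longrightarrow\;
\int_0^{v_i} x_i(z, v_{-i})\, dz = \max\{v_i - v_{-i},\, 0\},
\qquad
v_i x_i(v_i, v_{-i}) - \int_0^{v_i} x_i(z, v_{-i})\, dz =
\begin{cases}
  v_{-i}, & \text{if } v_i > v_{-i} \\
  0,      & \text{otherwise.}
\end{cases}
```
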
Claim 3.

Suppose bidders are rational agents (i.e., they maximize profits). Let $OPT$ be the revenue of the revenue-maximizing mechanism (among truthful or non-truthful) that the auctioneer can implement, and $Rev(r\text{-}SPA)$ be the revenue of a Second Price Auction with reserve $r$. Then for $r = \min\{v : \phi(v) \geq 0\}$, we have that $OPT = Rev(r\text{-}SPA)$.

Definition C.1 (No-Regret Learning Property).

Let $\{b_i^{\tau}\}_{\tau \in [T]}$ be the bid sequence submitted by agent $i$'s algorithm, and let $U_i^T(\mathbf{b}^T) = \sum_{\tau=1}^{T} \left( v_i \cdot x_i^{\tau}(b_i^{\tau}, b^{\tau}_{-i}) - p_i^{\tau}(b_i^{\tau}, b^{\tau}_{-i}) \right)$ be the total reward agent $i$ receives. We say that this algorithm satisfies the no-regret property if for any sequence $\mathbf{b}^T_{-i}$ it holds that

$$\mathbf{E}\left[ \max_{b \in B_\Delta} U_i^T(b \mid \mathbf{b}_{-i}^T) - U_i^T(\mathbf{b}^T) \right] = o(T)\,,$$

where the expectation is taken with respect to the randomness of the algorithm.
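
The quantity inside the expectation can be computed for a single run as sketched below. This is only an illustration under assumptions: the helper names are hypothetical, the auction is a two-bidder second-price auction in which a tie is lost, and the function returns the realized regret of one bid sequence rather than the expectation over the algorithm's randomness.

```python
def spa_utility(v, bid, other_bid):
    """Utility in a second-price auction; a tie is lost (assumption)."""
    return v - other_bid if bid > other_bid else 0.0


def realized_regret(v, grid, my_bids, other_bids):
    """Best fixed bid in hindsight minus the utility actually earned."""
    earned = sum(spa_utility(v, b, o) for b, o in zip(my_bids, other_bids))
    # U_i^T(b | b_{-i}^T): play the same grid bid b in every round.
    best_fixed = max(sum(spa_utility(v, b, o) for o in other_bids)
                     for b in grid)
    return best_fixed - earned
```

Dividing the returned value by the number of rounds gives the average regret, which must vanish as $T$ grows for a no-regret algorithm.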

Definition C.2 (Last Iterate Convergence (LIC)).

Let $\tilde{b}_i^T$ be the bid distribution of bidder $i$ in the last round $T$. We say that $\tilde{b}_i^T$ converges to some distribution $\tilde{q}$ over $B_\Delta$ if $\lim_{T \to \infty} d_{\mathrm{TV}}(\tilde{b}_i^T, \tilde{q}) = o(1),$ where $d_{\mathrm{TV}} := \frac{1}{2} \left( \sum_{b \in B_\Delta} |\tilde{b}_i^T(b) - \tilde{q}(b)| \right)$ is the Total-Variation (TV) distance between $\tilde{b}_i^T$ and $\tilde{q}.$
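
For concreteness, the TV distance between two bid distributions on a finite grid can be computed as in the following sketch. The function name `tv_distance` and the representation of a distribution as a dictionary from bids to probabilities are our own assumptions.

```python
def tv_distance(p, q):
    """Total-variation distance between two distributions given as
    dicts mapping each bid to its probability (missing bids count as 0)."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(b, 0.0) - q.get(b, 0.0)) for b in support)
```

A distribution that splits its mass evenly between two bids is at distance $1/2$ from the point mass on either of them.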

Appendix D Omitted Details from Section 3

Definition D.1 (Non-Degenerate auctions).

A single-item auction $(x, p)$ for two agents is non-degenerate with respect to the valuation profile $(v_1, v_2)$ if there are bid profiles $b_1 \leq v_1, b_2 \leq v_2,$ so that

$$\begin{aligned}
v_1 \cdot x_1(v_1, b_2) - p_1(v_1, b_2) &> v_1 \cdot x_1(v_1 - 1/\Delta, b_2) - p_1(v_1, b_2) \geq 0 \\
v_2 \cdot x_2(b_1, v_2) - p_2(b_1, v_2) &> v_2 \cdot x_2(b_1, v_2 - 1/\Delta) - p_2(b_1, v_2 - 1/\Delta) \geq 0\,,
\end{aligned}$$

and

$$\max\left\{ v_1 \cdot x_1(v_1, v_2) - p_1(v_1, v_2),\; v_2 \cdot x_2(v_1, v_2) - p_2(v_1, v_2) \right\} > 0\,.$$

In order to show our result, we utilize a characterization (cf. Theorem D.2) regarding the structure of truthful deterministic single-item auctions that charge non-negative payments (see, e.g., Roughgarden (2010, Thm 9.36)) for $n$ bidders.

Theorem D.2 (Characterization of Truthful Deterministic Single-Item Auctions Roughgarden (2010)).

A single-item auction is truthful, and satisfies NPT, i.e., no payment transfers from the auctioneer to the bidders, if and only if:

  • $x_i(\cdot, v_{-i})$ is monotone for every $i \in [n], v_{-i} \in B_\Delta^{n-1}.$

  • For all $i \in [n], v_i \in B_\Delta, v_{-i} \in B_\Delta^{n-1}$ we have that

    $$p_i(v_i, v_{-i}) = \begin{cases} 0, & \text{if } x_i(v_i, v_{-i}) = 0 \\ \min\{b \in B_\Delta : x_i(b, v_{-i}) = 1\}, & \text{if } x_i(v_i, v_{-i}) = 1 \end{cases}\,.$$
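
The threshold payment in the second condition can be computed by scanning the bid grid, as in this sketch. The names `spa_alloc` and `threshold_payment` are hypothetical, and the example allocation rule (highest bid wins, ties go to the lower index) is just one monotone rule chosen for illustration.

```python
def spa_alloc(i, bids):
    """Example monotone rule: highest bid wins, ties go to the lower index."""
    winner = max(range(len(bids)), key=lambda j: (bids[j], -j))
    return 1 if winner == i else 0


def threshold_payment(alloc, grid, i, bids):
    """Payment of bidder i: 0 if she loses, otherwise the smallest
    grid bid with which she would still win against the others' bids."""
    if alloc(i, bids) == 0:
        return 0.0
    for b in sorted(grid):
        trial = list(bids)
        trial[i] = b  # replace only bidder i's bid
        if alloc(i, trial) == 1:
            return b
    raise ValueError("bidder i's bid is not on the grid")
```

With this tie-breaking, the bidder favored in ties pays exactly the other bid, while the other bidder must pay one grid step more, since matching the competing bid is not enough for her to win.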
Theorem D.3 (No Deterministic Auction Leads to Truthful Bidding).

Fix a valuation profile $(v_1, v_2)$ and a deterministic truthful auction. Suppose bidders bid using MWU and with non-degenerate learning rates. Let $W$ (respectively, $R$) be the bidder $i \in \{1, 2\}$ such that $x_i(v_i, v_{-i}) = 1$ (respectively, $x_i(v_i, v_{-i}) = 0$) and let $\hat{p} = p_W(v_W, v_R).$
Assume that $\lim_{T \to \infty} \eta_T^R / \eta_T^W < \infty$ and $v_W \cdot x_W(v_W, v_R) - \hat{p} > 0.$ Then, with probability at least $0.99$, the winner's bids converge to a distribution supported between $\hat{p}, v_W$ and the runner-up bidder converges to a bidding distribution satisfying $0 < \mathbf{Pr}[0] \leq \mathbf{Pr}[1/\Delta] \leq \ldots \leq \mathbf{Pr}[v_R].$

Proof of Theorem D.3.

The idea of the proof is to split the horizon $T$ into contiguous, non-overlapping epochs of length $c / \eta_T^W$, where $c$ is some sufficiently large constant that depends on the discretization parameter $\Delta$. Notice that since $\lim_{T \to \infty} \eta_T^W \cdot T = \infty$, these epochs are well-defined when $T$ is sufficiently large. Assume without loss of generality that the weights of all the bids that are at most $v_W$ (resp. $v_R$) for the winning bidder (resp. runner-up) are initialized to 1. (The proof holds as long as there is some constant mass on each bid at the initialization stage, albeit with different constants.) We denote the epochs by $\tau$ and the rounds of the interaction by $t.$

Let $c_W = v_W - \hat{p}$ be the utility the bidder gets when it wins the auction. By assumption, $c_W > 0.$ Let $W_W$ be the set of bids between $\hat{p}$ and $v_W$, i.e., $W_W = \{\hat{p}, \hat{p} + 1/\Delta, \ldots, v_W\}.$ Whenever the runner-up bids $v_R$, all the bids in $W_W$ increase their weights by a multiplicative factor of $e^{c_W \cdot \eta_T^W}$, whereas the weights of the other bids remain unchanged. Moreover, since the allocation rule is non-decreasing and the price does not depend on the bid, whenever the weight of some bid $b \in B_\Delta$ is increased, the weights of all the bids that are greater than $b$ are also increased by the same amount. 
Notice that, since bidding $v_R$ is a weakly-dominant strategy for the runner-up type, the mass that it puts on $v_R$ will never decrease relative to the mass of the rest of the bids. Thus, the probability of bidding $v_R$ for the runner-up type is at least $1/\Delta$ in every round. Hence, if we consider an interval of size $T_0 = 8\Delta^2 / (\eta_T^W \cdot c_W)$ and we denote by $Z_i, i \in [T_0],$ the indicator variable of whether the runner-up bids $v_R$ in round $i \in [T_0]$, we have that for any $\alpha > 0$

$$\mathbf{Pr}\left[ Z_1 + \ldots + Z_{T_0} \geq \alpha \right] \geq \mathbf{Pr}\left[ \tilde{Z}_1 + \ldots + \tilde{Z}_{T_0} \geq \alpha \right]\,,$$

where $\tilde{Z}_i, i \in [T_0],$ are i.i.d. Bernoulli random variables with mean $1/\Delta.$ Then, the multiplicative version of the Chernoff bound on $\{\tilde{Z}_i\}_{i \in [T_0]}$ shows that, with probability at least $1 - e^{-\Delta / (\eta_T^W \cdot c_W)}$, the runner-up type will bid $v_R$ at least $4\Delta / (\eta_T^W \cdot c_W)$ times in this window. 
By a union bound, we know that with probability at least $1 - (T \cdot \eta_T^W / c) \cdot e^{-\Delta / (\eta_T^W \cdot c_W)}$ this holds across all the $T \cdot \eta_T^W / c$ different epochs. We call this event $\mathcal{E}_1$ and condition on it for the rest of the proof. Our assumption that $\eta_T$ is non-degenerate shows that this probability is at least $1 - o(1).$
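
The constants in the Chernoff step above can be checked as follows. This is our own worked calculation, assuming the standard lower-tail form $\mathbf{Pr}[X \leq (1 - \delta)\mu] \leq e^{-\delta^2 \mu / 2}$ with $\delta = 1/2$:

```latex
\mu = \mathbf{E}\Big[ \sum_{i \in [T_0]} \tilde{Z}_i \Big]
    = \frac{T_0}{\Delta}
    = \frac{8\Delta}{\eta_T^W \cdot c_W},
\qquad
\mathbf{Pr}\Big[ \sum_{i \in [T_0]} \tilde{Z}_i \leq \frac{\mu}{2} \Big]
    \leq e^{-\mu/8}
    = e^{-\Delta / (\eta_T^W \cdot c_W)},
\qquad
\frac{\mu}{2} = \frac{4\Delta}{\eta_T^W \cdot c_W}.
```
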

Let wWτ(b)superscriptsubscript𝑤𝑊𝜏𝑏w_{W}^{\tau}(b)italic_w start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_τ end_POSTSUPERSCRIPT ( italic_b ) be the total weight that the winning type assigns to b𝑏bitalic_b at the beginning of epoch τ𝜏\tauitalic_τ and mWτ(b)superscriptsubscript𝑚𝑊𝜏𝑏m_{W}^{\tau}(b)italic_m start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_τ end_POSTSUPERSCRIPT ( italic_b ) be its probability. Notice that at τ=1𝜏1\tau=1italic_τ = 1 this distribution is uniform. Consider the ratio of the weights of any bp^1/Δ𝑏^𝑝1Δb\leq\hat{p}-1/\Deltaitalic_b ≤ over^ start_ARG italic_p end_ARG - 1 / roman_Δ and p^.^𝑝\hat{p}.over^ start_ARG italic_p end_ARG . We have that

\[
\frac{w_W^{\tau+1}(b)}{w_W^{\tau+1}(\hat{p})} \leq \frac{w_W^{\tau}(b)}{w_W^{\tau}(\hat{p})}\cdot e^{-4c_W\cdot\Delta\cdot\eta_T^W/(c_W\cdot\eta_T^W)} = \frac{w_W^{\tau}(b)}{w_W^{\tau}(\hat{p})}\cdot e^{-4\Delta}\,, \tag{1}
\]

where $w_W^{\tau}(b), w_W^{\tau}(\hat{p})$ are the weights that the winner puts on $b,\hat{p}$ at the beginning of epoch $\tau$ (similarly for the $\tau+1$ terms). For the probability of each bid in MWU, $m_W^{\tau+1}(b)=\frac{w_W^{\tau+1}(b)}{\sum_{b'\in B_\Delta}w_W^{\tau+1}(b')}$ (and symmetrically for the other terms).
Thus, by dividing the numerator and the denominator of the RHS of Equation 1 by $\sum_{b'\in B_\Delta}w_W^{\tau}(b')$ and the numerator and denominator of the LHS of Equation 1 by $\sum_{b'\in B_\Delta}w_W^{\tau+1}(b')$ we get:

\[
\frac{m_W^{\tau+1}(b)}{m_W^{\tau+1}(\hat{p})} \leq \frac{m_W^{\tau}(b)}{m_W^{\tau}(\hat{p})}\cdot e^{-4\Delta}.
\]

Multiplying by $m_W^{\tau+1}(\hat{p})$ gives us

\[
m_W^{\tau+1}(b) \leq \frac{m_W^{\tau+1}(\hat{p})}{m_W^{\tau}(\hat{p})}\cdot m_W^{\tau}(b)\cdot e^{-4\Delta}\,.
\]
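
The passage from weights to probabilities uses only that MWU divides every weight by the same normalizing sum, so ratios between bids are unchanged. A minimal Python sketch of this fact on a hypothetical four-bid instance (the utilities, the learning rate, and the helper names `mwu_step` and `normalize` are illustrative assumptions, not taken from the paper):

```python
import math

def mwu_step(weights, utilities, eta):
    """One multiplicative-weights update: w(b) <- w(b) * exp(eta * u(b))."""
    return [w * math.exp(eta * u) for w, u in zip(weights, utilities)]

def normalize(weights):
    """Bid distribution induced by the weights: m(b) = w(b) / sum_b' w(b')."""
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical toy instance: four bids, uniform initial weights,
# and arbitrary utilities for a single round.
weights = [1.0, 1.0, 1.0, 1.0]
utilities = [0.0, 0.25, 0.5, 0.5]
eta = 0.1

new_weights = mwu_step(weights, utilities, eta)
probs = normalize(new_weights)

# The ratio between two bids is the same for weights and for probabilities,
# because the normalizing sum cancels.
ratio_w = new_weights[0] / new_weights[2]
ratio_m = probs[0] / probs[2]
```

Since the normalizer cancels, any multiplicative bound on a weight ratio carries over verbatim to the corresponding probability ratio.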

Notice that $m_W^{1}(\hat{p})=1/\Delta$, $m_W^{\tau}(\hat{p})$ is non-decreasing in $\tau$ since bidding $\hat{p}$ is a weakly-dominant strategy for the winning type (footnote 1: this is where we are using the assumption that the runner-up type does not overbid. Otherwise, the argument can still go through with a different constant since we can show that the winning type will overbid only some $O(\eta_T^W)$ many times and we need to account for this term), and, by definition, $m_W^{\tau+1}(\hat{p})\leq 1$, so $\frac{m_W^{\tau+1}(\hat{p})}{m_W^{\tau}(\hat{p})}\leq\Delta$. Hence,

\[
m_W^{\tau+1}(b) \leq \Delta e^{-4\Delta}\cdot m_W^{\tau}(b) < 0.1\cdot m_W^{\tau}(b), \quad \forall b<\hat{p}\,,
\]

where the second inequality follows from $xe^{-4x}\leq 1/(4e)<0.1$ for all $x>0$. Thus, after each epoch the probability that the winning type does not bid in $W_W$ shrinks to at most a $0.1$ fraction of its previous value, i.e., it decreases by a factor of at least $0.9$. Hence, we can see that after $O(\eta_T^W\cdot T)$ epochs the total mass in this region is at most $O(0.1^{\eta_T^W\cdot T-1})=o(1)$. This proves the claim about the distribution of the winning type.
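
The constant in the last display can be sanity-checked numerically: the function $xe^{-4x}$ has derivative $(1-4x)e^{-4x}$, so it peaks at $x=1/4$ with value $1/(4e)$, which is below $0.1$. A short Python check under these observations (the function name `f` and the evaluation grid are illustrative choices):

```python
import math

def f(x):
    """The function x * exp(-4x) that controls the per-epoch shrinkage."""
    return x * math.exp(-4.0 * x)

# The derivative (1 - 4x) * exp(-4x) vanishes only at x = 1/4.
peak = f(0.25)

# Brute-force maximum over a fine grid of x in (0, 20].
grid_max = max(f(k / 1000.0) for k in range(1, 20001))

# Upper bound on the mass left below p-hat after k epochs, starting from mass <= 1.
def remaining_mass_bound(k):
    return 0.1 ** k
```

The grid search is only a sanity check; the calculus argument is what establishes the bound for all $x>0$.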

Let $Z_i$, $i\in[T]$, be the random variable that indicates whether the winning type $v_W$ bids in $\{0,1/\Delta,\ldots,\hat{p}-1/\Delta\}$ in round $i$. Let also $T'$ denote the total number of epochs. Let $\widehat{Z}_\tau=Z_{(\tau-1)\cdot T_0+1}+\ldots+Z_{\tau\cdot T_0}$ be the number of such bids in epoch $\tau$, so that $\mathbb{E}[Z_1+\ldots+Z_T]=\sum_{\tau=1}^{T'}\mathbb{E}[\widehat{Z}_\tau]$.
The preceding steps of the proof showed that after every round, the probability that the winner bids in this region is non-increasing (since the bids in interval $I$ are weakly dominated by the bids in $\{\widehat{p},\ldots,v_W\}$), hence $\mathbb{E}[\widehat{Z}_\tau]\leq T_0\cdot\mathbb{E}[Z_{(\tau-1)\cdot T_0+1}]$. Thus, it suffices to bound $\sum_{\tau=1}^{T'}\mathbb{E}[Z_{(\tau-1)\cdot T_0+1}]$.

By definition, $\mathbb{E}[Z_{(\tau-1)\cdot T_0+1}]=\sum_{b<\widehat{p}}m_W^{(\tau-1)\cdot T_0+1}(b)$. Now, the previous step of the proof showed that the mass of each bid in interval $I$ drops to at most a $0.1$ fraction of its value between the beginning of consecutive epochs, i.e., $m_W^{\tau\cdot T_0+1}(b)\leq 0.1\cdot m_W^{(\tau-1)\cdot T_0+1}(b)$ for all $b\in\{0,1/\Delta,\ldots,\widehat{p}-1/\Delta\}$. This implies $\mathbb{E}[Z_{\tau\cdot T_0+1}]\leq 0.1\cdot\mathbb{E}[Z_{(\tau-1)\cdot T_0+1}]$.
Using $\mathbb{E}[Z_1]\leq 1$, we get $\sum_{\tau=1}^{T'}\mathbb{E}[Z_{(\tau-1)\cdot T_0+1}]\leq\sum_{\tau=1}^{T'}(0.1)^{\tau-1}$. Multiplying this by the value of $T_0$ gives

\begin{align*}
\mathbf{E}\left[Z_1+\ldots+Z_T\right] &\leq \sum_{\tau=1}^{T'}\left(8\Delta^2/(\eta_T^W\cdot c_W)\right)\cdot(0.1)^{\tau-1}\\
&\leq \sum_{\tau=1}^{\infty}\left(8\Delta^2/(\eta_T^W\cdot c_W)\right)\cdot(0.1)^{\tau-1}\\
&\leq 16\Delta^2/(\eta_T^W\cdot c_W)\,.
\end{align*}
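
The last step is a geometric series: $\sum_{\tau\geq 1}(0.1)^{\tau-1}=1/0.9\leq 2$, which turns the prefactor $8\Delta^2/(\eta_T^W\cdot c_W)$ into at most $16\Delta^2/(\eta_T^W\cdot c_W)$. A small Python sketch of this arithmetic, where the values of `delta`, `eta_w`, and `c_w` are hypothetical placeholders chosen only for illustration:

```python
def epoch_sum(prefactor, num_epochs):
    """Partial sum of prefactor * 0.1^(tau - 1) for tau = 1..num_epochs."""
    return sum(prefactor * 0.1 ** (tau - 1) for tau in range(1, num_epochs + 1))

# Hypothetical parameter values (not from the paper).
delta, eta_w, c_w = 10, 0.01, 1.0

prefactor = 8 * delta**2 / (eta_w * c_w)   # the per-epoch bound, i.e. the value of T_0
partial = epoch_sum(prefactor, 50)         # finite sum over T' = 50 epochs
limit = prefactor / 0.9                    # closed form of the infinite series
final_bound = 16 * delta**2 / (eta_w * c_w)
```

Any finite number of epochs gives a sum below the infinite-series value, which in turn is below the stated bound.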

Hence, using Markov’s inequality we see that

\[
\mathbf{Pr}\left[Z_1+\ldots+Z_T\geq 101\cdot\left(16\Delta^2/(\eta_T^W\cdot c_W)\right)\right] \leq \frac{\mathbf{E}\left[Z_1+\ldots+Z_T\right]}{101\cdot\left(16\Delta^2/(\eta_T^W\cdot c_W)\right)} \leq \frac{1}{101}\,.
\]

Let $\mathcal{E}_2$ denote the complementary event that $Z_1+\ldots+Z_T<101\cdot\left(16\Delta^2/(\eta_T^W\cdot c_W)\right)$, which therefore holds with probability at least $100/101$, and condition on it.

Let us now consider the bid distribution of the runner-up type after the end of the last epoch. We denote this distribution by $\widehat{m}_R(\cdot)$. Recall that whenever the winning type bids in $W_W$, the runner-up type performs no updates. Moreover, whenever it does perform an update its utility when it bids $v_R$ is at most $1$ greater than bidding $b=0$. Notice that whenever the weight of some bid $b$ is increased, the weights of all the bids greater than $b$ are also increased by the same amount, so the monotonicity of the bid distribution follows immediately. It suffices now to bound the ratio of the probability of bidding zero and the probability of bidding $v_R$ by some quantity that is independent of $T$. We have that

\[
\frac{\widehat{m}_R(0)}{\widehat{m}_R(v_R)} \geq e^{-\eta_T^R\cdot 101\cdot\left(16\Delta^2/(\eta_T^W\cdot c_W)\right)} \implies \widehat{m}_R(0) \geq \frac{e^{-\eta_T^R\cdot 101\cdot\left(16\Delta^2/(\eta_T^W\cdot c_W)\right)}}{\Delta}\,,
\]

where the second inequality follows from the fact that the distribution is initialized to be uniform and $v_R$ is a weakly-dominant strategy across all rounds, so its probability is not decreased. Since

\[
\lim_{T\rightarrow\infty}\eta_T^R/\eta_T^W<C\,,
\]

for some discretization-dependent $C$, it follows that $\widehat{m}_R(0)>C'$, where $C'>0$ is some discretization-dependent constant. Since $\mathbf{Pr}[\mathcal{E}_1]\geq 1-o(1)$ and $\mathbf{Pr}[\mathcal{E}_2]\geq 100/101$, a union bound gives $\mathbf{Pr}[\mathcal{E}_1\cap\mathcal{E}_2]\geq 99/100$ when $T$ is large enough. ∎
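
The final probability bound is a union bound over the two failure events. The sketch below, in exact rational arithmetic, shows how small the $o(1)$ failure probability of $\mathcal{E}_1$ must be for the stated $99/100$ to hold (the function name `intersection_lower_bound` is a hypothetical helper introduced here for illustration):

```python
from fractions import Fraction

def intersection_lower_bound(fail_1, fail_2):
    """Union bound: Pr[E1 and E2] >= 1 - Pr[not E1] - Pr[not E2]."""
    return 1 - fail_1 - fail_2

# Failure probability of E2 from Markov's inequality.
fail_2 = Fraction(1, 101)

# Largest failure probability of E1 that still leaves 99/100 overall.
slack = Fraction(1, 100) - fail_2

bound = intersection_lower_bound(slack, fail_2)
```

So "large enough $T$" here means that the failure probability of $\mathcal{E}_1$ has dropped below $1/100-1/101=1/10100$.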

Theorem D.4 (Effect of Learning Rate on Convergence).

Fix a valuation profile $(v_1,v_2)$ and a non-degenerate deterministic truthful auction with respect to $(v_1,v_2)$. Suppose bidders bid using MWU and with non-degenerate learning rates. Let $W$ (respectively, $R$) be the bidder $i\in\{1,2\}$ such that $x_i(v_i,v_{-i})=1$ (respectively, $x_i(v_i,v_{-i})=0$). Let $\hat{p}$ be the minimum winning bid of $W$ when $R$ bids $v_R$. Assume that $\eta_T^R/\eta_T^W=\omega(1)$.
Then, with probability at least $1-o(1)$, bidder $R$ converges to bidding $v_R$ and bidder $W$ converges to a bidding distribution supported in $\{\hat{p},\hat{p}+1/\Delta,\ldots,v_W\}$.
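
To experiment with the setting of the theorem, the following Python sketch simulates two MWU bidders in a repeated second-price auction, used here as one example of a deterministic truthful auction. Everything specific is an assumption made for illustration: full-information feedback (each bidder updates every bid against the opponent's realized bid), ties broken in favor of $W$, no overbidding, the function names, and all parameter values. It is a toy for fixed finite $T$, not a verification of the asymptotic statement.

```python
import math
import random

def softmax(log_weights):
    """Bid distribution from log-weights, computed stably."""
    top = max(log_weights)
    exps = [math.exp(x - top) for x in log_weights]
    total = sum(exps)
    return [e / total for e in exps]

def simulate(v_w, v_r, delta, eta_w, eta_r, rounds, seed=0):
    """MWU bidders in a repeated second-price auction on the grid {0, 1/delta, ...}.

    v_w > v_r are the values as integer grid indices; bid index b means bid b/delta.
    Bids above one's own value are excluded (assumption: no overbidding).
    Returns the final bid distributions (winner, runner-up).
    """
    rng = random.Random(seed)
    log_w = [0.0] * (v_w + 1)   # uniform initialization for W
    log_r = [0.0] * (v_r + 1)   # uniform initialization for R
    for _ in range(rounds):
        b_w = rng.choices(range(v_w + 1), weights=softmax(log_w))[0]
        b_r = rng.choices(range(v_r + 1), weights=softmax(log_r))[0]
        # W wins with any bid b >= b_r (ties go to W) and pays b_r.
        for b in range(v_w + 1):
            if b >= b_r:
                log_w[b] += eta_w * (v_w - b_r) / delta
        # R wins with any bid b > b_w and pays b_w.
        for b in range(v_r + 1):
            if b > b_w:
                log_r[b] += eta_r * (v_r - b_w) / delta
    return softmax(log_w), softmax(log_r)

# Hypothetical run in which the runner-up learns 50 times faster than the winner.
m_w, m_r = simulate(v_w=8, v_r=5, delta=10, eta_w=0.01, eta_r=0.5, rounds=2000)
```

Varying the ratio `eta_r / eta_w` while keeping the other parameters fixed gives a rough feel for how the relative learning rates shape the runner-up's final distribution `m_r`.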

Proof of Theorem D.4.

Consider the first $T_0=c_\Delta'/\eta_T^W$ rounds of the game, for some discretization-dependent constant $c_\Delta'$. Assume without loss of generality that the weights of all the bids that are at most $v_W$ (resp. $v_R$) for the winning bidder (resp. runner-up) are initialized to 1. (Again, the argument works so long as all the weights are initialized with some constants.) Since the auction is non-degenerate with respect to $v_W,v_R$, there exists some bid of the winning type $b_W\leq v_W$ so that the runner-up bidder wins the auction when bidding truthfully and gets positive utility, i.e.,

\[
v_R\cdot x_R(v_R,b_W)-p_R(v_R,b_W)>0\,.
\]

Moreover, for all bids $b_R<v_R$ it holds that

\[
v_R\cdot x_R(v_R,b_W)-p_R(v_R,b_W)-\left(v_R\cdot x_R(b_R,b_W)-p_R(b_R,b_W)\right)>0\,.
\]

Since the auction is truthful, the difference above is minimized at $b_R=v_R-1/\Delta$. Let

\[
u'_R:=v_R\cdot x_R(v_R,b_W)-p_R(v_R,b_W)-\left(v_R\cdot x_R(v_R-1/\Delta,b_W)-p_R(v_R-1/\Delta,b_W)\right)\,,
\]

and, by definition, $u'_R>0$. Let us consider the winning type and look at the worst-case ratio of the probability that is placed on bids $b_W^t=b_W$ and $b_W^t=v_W$ at the end of every round $t\in\{1,\ldots,T_0\}$. We have that

𝐏𝐫[bWt=bW]𝐏𝐫[bWt=vW]𝐏𝐫delimited-[]superscriptsubscript𝑏𝑊𝑡subscript𝑏𝑊𝐏𝐫delimited-[]superscriptsubscript𝑏𝑊𝑡subscript𝑣𝑊\displaystyle\frac{\mathop{\bf Pr\/}[b_{W}^{t}=b_{W}]}{\mathop{\bf Pr\/}[b_{W}% ^{t}=v_{W}]}divide start_ARG start_BIGOP bold_Pr end_BIGOP [ italic_b start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT = italic_b start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT ] end_ARG start_ARG start_BIGOP bold_Pr end_BIGOP [ italic_b start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_t end_POSTSUPERSCRIPT = italic_v start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT ] end_ARG eηTWvWtabsentsuperscript𝑒superscriptsubscript𝜂𝑇𝑊subscript𝑣𝑊𝑡\displaystyle\geq e^{-\eta_{T}^{W}\cdot v_{W}\cdot t}≥ italic_e start_POSTSUPERSCRIPT - italic_η start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_W end_POSTSUPERSCRIPT ⋅ italic_v start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT ⋅ italic_t end_POSTSUPERSCRIPT
eηTWvWT0absentsuperscript𝑒superscriptsubscript𝜂𝑇𝑊subscript𝑣𝑊subscript𝑇0\displaystyle\geq e^{-\eta_{T}^{W}\cdot v_{W}\cdot T_{0}}≥ italic_e start_POSTSUPERSCRIPT - italic_η start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT start_POSTSUPERSCRIPT italic_W end_POSTSUPERSCRIPT ⋅ italic_v start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT ⋅ italic_T start_POSTSUBSCRIPT 0 end_POSTSUBSCRIPT end_POSTSUPERSCRIPT
=ecΔvW,absentsuperscript𝑒superscriptsubscript𝑐Δsubscript𝑣𝑊\displaystyle=e^{-c_{\Delta}^{\prime}\cdot v_{W}}\,,= italic_e start_POSTSUPERSCRIPT - italic_c start_POSTSUBSCRIPT roman_Δ end_POSTSUBSCRIPT start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ⋅ italic_v start_POSTSUBSCRIPT italic_W end_POSTSUBSCRIPT end_POSTSUPERSCRIPT ,

where the first inequality follows from the fact that bidding $v_W$ always yields at most $v_W$ utility more than bidding any other bid and the second one because $t \leq T_0$. Moreover, since $\mathop{\bf Pr\/}[b_W^1 = v_W] = 1/\Delta$ and the probability that is placed on $b_W^t = v_W$ is non-decreasing across the executions (since it is a weakly-dominant strategy), we have that

\[
\mathop{\bf Pr\/}[b_W^t = b_W] \geq e^{-c'_\Delta \cdot v_W}/\Delta, \quad \forall t \in \{1, \ldots, T_0\}\,.
\]

Let $Z^{T_0}$ denote the random variable that counts the number of times the winning type bids $b_W$ within the first $T_0$ rounds. Let $\tilde{Z}_\tau, \tau \in [T_0]$, be independent Bernoulli random variables with mean $e^{-c'_\Delta \cdot v_W}/\Delta$. Notice that, $\forall \alpha > 0$, it holds that $\mathop{\bf Pr\/}[Z^{T_0} \geq \alpha] \geq \mathop{\bf Pr\/}\left[\sum_{\tau=1}^{T_0} \tilde{Z}_\tau \geq \alpha\right]$. Moreover,

\[
\mathop{\bf E\/}\left[\sum_{\tau=1}^{T_0} \tilde{Z}_\tau\right] \geq T_0 \cdot e^{-c'_\Delta \cdot v_W}/\Delta = c'_\Delta/\eta_T^W \cdot e^{-c'_\Delta \cdot v_W}/\Delta\,.
\]

To simplify the notation, let us denote $\tilde{c}_\Delta = c'_\Delta \cdot e^{-c'_\Delta \cdot v_H}/\Delta$. Thus, a multiplicative Chernoff bound shows that, with probability at least $1 - e^{-\tilde{c}_\Delta/(8\eta_T^W)} = 1 - o(1)$, we have that $Z^{T_0} \geq \tilde{c}_\Delta/(2\eta_T^W)$. Let us call this event $E$ and condition on it.
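The Chernoff step can be sanity-checked numerically. The sketch below is only an illustration under assumed values: `n_rounds` plays the role of $T_0$ and `p` the role of the per-round lower bound $e^{-c'_\Delta \cdot v_W}/\Delta$, and neither number comes from the paper. It estimates the probability that a sum of independent Bernoulli variables with mean $\mu$ falls below $\mu/2$ and compares it with the multiplicative Chernoff bound $e^{-\mu/8}$.

```python
import math
import random


def chernoff_lower_tail_check(n_rounds, p, n_trials=2000, seed=0):
    """Estimate Pr[sum of n_rounds Bernoulli(p) draws < mu/2], where mu = n_rounds * p,
    and return it together with the multiplicative Chernoff bound exp(-mu/8)."""
    rng = random.Random(seed)
    mu = n_rounds * p
    below = 0
    for _ in range(n_trials):
        # One simulated run: count the rounds in which the Bernoulli draw succeeds.
        total = sum(1 for _ in range(n_rounds) if rng.random() < p)
        if total < mu / 2:
            below += 1
    return below / n_trials, math.exp(-mu / 8)


# Illustrative (assumed) values: 400 rounds, success probability 0.1, so mu = 40.
empirical, bound = chernoff_lower_tail_check(n_rounds=400, p=0.1)
print(f"empirical lower tail: {empirical:.4f}, Chernoff bound: {bound:.4f}")
```

In the proof, $\mu$ is at least $\tilde{c}_\Delta/\eta_T^W$, which grows as the learning rate $\eta_T^W$ shrinks, so the failure probability $e^{-\mu/8}$ vanishes.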

Let us now focus on the bid distribution of the runner-up bidder after the first $T_0$ rounds. Notice that whenever the winning bidder bids $b_W$, bidding $v_R$ yields the runner-up type utility at least $u'_R$ greater than bidding any other bid, and in the rounds where this does not happen, bidding $v_R$ is still a weakly dominant strategy, so it generates at least as much utility as any other bid. Thus, we have that

\begin{align*}
\frac{\mathop{\bf Pr\/}[b_R^{T_0} = v_R - 1/\Delta]}{\mathop{\bf Pr\/}[b_R^{T_0} = v_R]} &\leq e^{-u'_R \cdot \eta_T^R \cdot Z^{T_0}} \\
&\leq e^{-\eta_T^R \cdot \tilde{c}_\Delta/(2\eta_T^W \Delta)} \\
&= o(1)\,.
\end{align*}

Since bidding $v_R$ is a weakly dominant strategy for the runner-up, this ratio is non-increasing in $t$, so we can immediately see that

\[
\frac{\mathop{\bf Pr\/}[b_R^{T_0} = v_R - 1/\Delta]}{\mathop{\bf Pr\/}[b_R^{T_0} = v_R]} = o(1)\,,
\]

which gives that

\[
\mathop{\bf Pr\/}[b_R^T = v_R - 1/\Delta] = o(1)\,.
\]

The same argument can be applied to all bids in $\{0, 1/\Delta, \ldots, v_R - 1/\Delta\}$.

For the winning type, a symmetric argument shows that since after $O(\eta_T^W)$ many rounds the runner-up type bids $v_R$ with high probability, all the bids in the region $\{\hat{v}_W, \ldots, v_W\}$ will yield utility that is larger than bidding $v_R - 1/\Delta$ by at least $1/\Delta$ (again with high probability), so after another $\omega(\eta_T^W)$ rounds its mass will be concentrated on bidding in this region. ∎

Proof of Theorem 3.1.

Let $\mathcal{E} = \{r < v_1\} \cap \{r < v_2\} \cap \{v_1 \neq v_2\}$. We can decompose $\mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\mathrm{Rev}(v_1, v_2; r)\right]$ as:

\begin{align*}
\mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\mathrm{Rev}(v_1, v_2; r)\right] = {} & \mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\mathrm{Rev}(v_1, v_2; r) \,\middle|\, \mathcal{E}\right] \cdot \mathop{\bf Pr\/}_{v_1, v_2 \sim U[B_\Delta]}[\mathcal{E}] \\
& + \mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\mathrm{Rev}(v_1, v_2; r) \,\middle|\, \mathcal{E}'\right] \cdot \mathop{\bf Pr\/}_{v_1, v_2 \sim U[B_\Delta]}[\mathcal{E}']\,.
\end{align*}

Notice that under $\mathcal{E}'$, the revenue of the auction in the learning setting satisfies

\[
\mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\lim_{T \rightarrow \infty} \mathop{\bf E\/}_{b_1 \sim b_1^T, b_2 \sim b_2^T}[\mathrm{Rev}(b_1, b_2; r) \mid v_1, v_2] \,\middle|\, \mathcal{E}'\right] \leq \mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\mathrm{Rev}(v_1, v_2; r) \,\middle|\, \mathcal{E}'\right]\,.
\]

This is because both bidders will be bidding at most their valuation, so the revenue of the auction cannot increase. Let us now focus on the first term. Under the event $\mathcal{E}$, the revenue of the auction under rational agents is $\min\{v_1, v_2\} > r$. However, in the learning setting, the runner-up bidder will be bidding strictly below their valuation in expectation, by Theorem D.3. Hence, we have that

\begin{align*}
\mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\lim_{T \rightarrow \infty} \mathop{\bf E\/}_{b_1 \sim b_1^T, b_2 \sim b_2^T}[\mathrm{Rev}(b_1, b_2; r) \mid v_1, v_2] \,\middle|\, \mathcal{E}\right] &< \mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\min\{v_1, v_2\} \,\middle|\, \mathcal{E}\right] - c' \\
&= \mathop{\bf E\/}_{v_1, v_2 \sim U[B_\Delta]}\left[\mathrm{Rev}(v_1, v_2; r) \,\middle|\, \mathcal{E}\right] - c'\,.
\end{align*}

Since $\mathop{\bf Pr\/}[\mathcal{E}] > 0$, the result follows by combining the two inequalities. ∎
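The decomposition over $\mathcal{E}$ and its complement used in this proof can be checked exactly on a small grid. The sketch below is illustrative only and rests on two assumptions that are not taken from the text: the grid is $B_\Delta = \{0, 1/\Delta, \ldots, 1\}$, and the second-price auction with reserve $r$ sells only when the highest bid is at least $r$, charging the maximum of $r$ and the second-highest bid. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def revenue(v1, v2, r):
    """Revenue of a second-price auction with reserve r under truthful bids.
    Assumed rule: no sale if the highest bid is below r; otherwise the winner
    pays the larger of r and the second-highest bid."""
    hi, lo = max(v1, v2), min(v1, v2)
    if hi < r:
        return Fraction(0)
    return max(r, lo)


def decompose(grid_size, r):
    """Return (E[Rev], E[Rev | E] * Pr[E], E[Rev | E'] * Pr[E'], Pr[E])
    for valuations drawn uniformly and independently from the assumed grid."""
    grid = [Fraction(k, grid_size) for k in range(grid_size + 1)]
    pairs = list(product(grid, repeat=2))
    weight = Fraction(1, len(pairs))  # uniform probability of each valuation pair

    def in_event(a, b):
        # The event E: both valuations exceed the reserve and they are distinct.
        return r < a and r < b and a != b

    total = sum(revenue(a, b, r) for a, b in pairs) * weight
    part_e = sum(revenue(a, b, r) for a, b in pairs if in_event(a, b)) * weight
    part_ec = sum(revenue(a, b, r) for a, b in pairs if not in_event(a, b)) * weight
    pr_e = sum(1 for a, b in pairs if in_event(a, b)) * weight
    return total, part_e, part_ec, pr_e


total, part_e, part_ec, pr_e = decompose(grid_size=4, r=Fraction(1, 4))
print(total, part_e, part_ec, pr_e)
```

Because the arithmetic uses exact fractions, the two parts add up to the total exactly, and $\Pr[\mathcal{E}]$ is strictly positive whenever at least two grid points lie above the reserve.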

Appendix E Omitted Details from Section 4

Proof of Lemma 4.2.

Let

\[
\gamma_A = \min_{i \in [n],\, v \in B_\Delta,\, b_{-i} \in B_\Delta^{n-1},\, b \in B_\Delta : b \neq v} \{u_i(v, b_{-i}) - u_i(b, b_{-i})\}\,,
\]

i.e., the minimum improvement in the utility that is guaranteed to every player when they switch to bidding truthfully from any non-truthful bid, no matter what their valuation and the bids of the opponents are. Notice that for any fixed auction $A$ this quantity does not depend on $T$. Moreover, since $A$ is a strictly IC auction we have that $\gamma_A > 0$. Consider any round $t \in [T]$ of the interaction. For any player $i \in [n]$, we have that

\[
u^t(v_i, b^t_{-i}) - u^t(b', b^t_{-i}) \geq \gamma_A, \quad \forall b' \neq v_i\,,
\]

no matter what the bids $b^t_{-i}$ are. Let $\delta_1, \ldots, \delta_n$ be the mean-based parameters of the algorithms that the agents are using. Moreover, let $T_0 = \max_{i \in [n]} \delta_i \cdot T/\gamma_A$. Notice that since $\delta_i = o(1), \forall i \in [n]$, by picking $T$ sufficiently large we have that $T_0 < T$. We immediately get that, for every player $i \in [n]$,

\[
\sum_{t=1}^{T_0} \left(u^t(v_i, b^t_{-i}) - u^t(b', b^t_{-i})\right) \geq \gamma_A \cdot T_0 \geq \delta_i \cdot T, \quad \forall b' \neq v_i\,,
\]

no matter what the bid profile $b^t_{-i}$ of the other bidders in every round is. Thus, for every bidder $i \in [n]$, by taking a union bound over all bids $b \neq v_i$, we see that in round $T_0 + 1$ the probability of not bidding truthfully is at most $\Delta \cdot \delta_i = o(1)$. Hence, we have shown the result. ∎
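To make the quantities $\gamma_A$ and $T_0$ concrete, the following sketch computes them by brute force for a toy mechanism. Everything specific here is an assumption made for illustration rather than a construction from the paper: the mechanism allocates with probability $x(b) = b$ and charges $p(b) = b^2/2$ regardless of the other bids, the grid is $\{0, 1/\Delta, \ldots, 1\}$ with $\Delta = 4$, and the values of $T$ and of the mean-based parameter are arbitrary.

```python
from fractions import Fraction


def utility(v, b):
    """Utility of a bidder with valuation v who bids b in the toy mechanism
    (assumption): allocation probability x(b) = b and payment p(b) = b^2 / 2."""
    return v * b - b * b / 2


def strictness_gap(grid_size):
    """Brute-force gamma_A: the smallest utility loss from any non-truthful bid
    over all valuations on the assumed grid {0, 1/grid_size, ..., 1}."""
    grid = [Fraction(k, grid_size) for k in range(grid_size + 1)]
    return min(utility(v, v) - utility(v, b) for v in grid for b in grid if b != v)


gamma = strictness_gap(4)             # here u(v, v) - u(v, b) = (v - b)^2 / 2
T = 10**6                             # assumed horizon
mean_based_delta = Fraction(1, 1000)  # assumed mean-based parameter delta_i
T0 = mean_based_delta * T / gamma     # T_0 = delta_i * T / gamma_A

print(gamma, T0)
```

With these assumed numbers the cumulative advantage of truthful bidding after $T_0$ rounds, $\gamma_A \cdot T_0$, equals $\delta_i \cdot T$, which is the threshold at which a mean-based learner places at most $\delta_i$ probability on each non-truthful bid.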

Proof of Theorem 4.4.

Let $\delta_1, \ldots, \delta_n$ be the mean-based parameters of the algorithms that the agents are using. Recall that these parameters do depend on $T$. Assume without loss of generality that $\delta_1$ is the slowest one, i.e., $\lim_{T \rightarrow \infty} \nicefrac{\delta_i}{\delta_1} \leq C, \forall i \in [n]$, where $C$ is some discretization-dependent constant. Let $\widetilde{A}$ be a strictly IC auction and define

\[
\gamma_{\widetilde{A}} = \min_{i \in [n],\, v \in B_\Delta,\, b_{-i} \in B_\Delta^{n-1},\, b \in B_\Delta : b \neq v} \{\widetilde{u}_i(v, b_{-i}) - \widetilde{u}_i(b, b_{-i})\}\,.
\]

Similarly as in the previous proof, notice that γA~subscript𝛾~𝐴\gamma_{\widetilde{A}}italic_γ start_POSTSUBSCRIPT over~ start_ARG italic_A end_ARG end_POSTSUBSCRIPT does not depend on T.𝑇T.italic_T . Consider the qTsubscript𝑞𝑇q_{T}italic_q start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT-mixture of the auctions A,A~𝐴~𝐴A,\widetilde{A}italic_A , over~ start_ARG italic_A end_ARG and let us denote this auction by A.superscript𝐴A^{\prime}.italic_A start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT . Let x,x~,x𝑥~𝑥superscript𝑥x,\widetilde{x},x^{\prime}italic_x , over~ start_ARG italic_x end_ARG , italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT be the allocation rules of A,A~,A,𝐴~𝐴superscript𝐴A,\widetilde{A},A^{\prime},italic_A , over~ start_ARG italic_A end_ARG , italic_A start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT , respectively, and let us define the payment rules in a symmetric way. Notice that since x()=qTx~()+(1qT)x(),p()=qTp~()+(1qT)p()formulae-sequencesuperscript𝑥subscript𝑞𝑇~𝑥1subscript𝑞𝑇𝑥superscript𝑝subscript𝑞𝑇~𝑝1subscript𝑞𝑇𝑝x^{\prime}(\cdot)=q_{T}\widetilde{x}(\cdot)+(1-q_{T})x(\cdot),p^{\prime}(\cdot% )=q_{T}\widetilde{p}(\cdot)+(1-q_{T})p(\cdot)italic_x start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( ⋅ ) = italic_q start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT over~ start_ARG italic_x end_ARG ( ⋅ ) + ( 1 - italic_q start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ) italic_x ( ⋅ ) , italic_p start_POSTSUPERSCRIPT ′ end_POSTSUPERSCRIPT ( ⋅ ) = italic_q start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT over~ start_ARG italic_p end_ARG ( ⋅ ) + ( 1 - italic_q start_POSTSUBSCRIPT italic_T end_POSTSUBSCRIPT ) italic_p ( ⋅ ), it follows immediately that

\[
\gamma_{A'}\geq q_T\cdot\gamma_{\widetilde{A}}\,.
\]

Moreover, notice that

\begin{align*}
|x'(\cdot)-x(\cdot)| &\leq q_T\cdot|\widetilde{x}(\cdot)-x(\cdot)|\leq q_T \\
|p'(\cdot)-p(\cdot)| &\leq q_T\cdot|\widetilde{p}(\cdot)-p(\cdot)|\leq q_T\,.
\end{align*}

Let us focus on agent 1 since it is the one that has the slowest convergence. After $T_0$ rounds of the game we have that

\[
\sum_{t=1}^{T_0}\left(u^{t}(v_1,b^{t}_{-1})-u^{t}(b',b^{t}_{-1})\right)\geq\gamma_{A'}\cdot T_0\geq q_T\cdot\gamma_{\widetilde{A}}\cdot T_0,\quad\forall b'\neq v_1\,,
\]

no matter what the bid profile of the other bidders is in each round. Thus, for the mean-based guarantee of the first bidder's algorithm to give the desired convergence, we need $T_0\geq\nicefrac{\delta_1\cdot T}{q_T\cdot\gamma_{\widetilde{A}}}$. Since $T_0\leq T$, this places a constraint on the choice of $q_T$, namely that $q_T\geq\nicefrac{\delta_1}{\gamma_{\widetilde{A}}}$. Since this is the only constraint on the choice of $q_T$, choosing $q_T=\nicefrac{2\delta_1}{\gamma_{\widetilde{A}}}=o(1)$ suffices to get the result. ∎
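The gap inequality $\gamma_{A'}\geq q_T\cdot\gamma_{\widetilde{A}}$ used in this proof can be checked numerically on a small grid. The sketch below is illustrative only: the grid resolution `DELTA`, the tie-breaking rule of the second-price auction, and the random-threshold auction standing in for $\widetilde{A}$ are assumptions made here, not constructions taken from the paper.

```python
from fractions import Fraction
from itertools import product

DELTA = 4  # grid resolution, chosen only for illustration
GRID = [Fraction(k, DELTA) for k in range(DELTA + 1)]
MIDS = [Fraction(2 * k + 1, 2 * DELTA) for k in range(DELTA)]


def spa_utility(v, b, b_other):
    """Second-price utility for value v and bid b; ties lose (an assumption)."""
    return v - b_other if b > b_other else Fraction(0)


def threshold_utility(v, b, b_other):
    """Hypothetical strictly truthful auction: a price r is drawn uniformly
    from the grid midpoints and the bidder wins at price r iff b > r."""
    return sum((v - r for r in MIDS if b > r), Fraction(0)) / DELTA


def mixture_utility(q):
    """Utility in the q-mixture: threshold auction w.p. q, SPA w.p. 1 - q."""
    def utility(v, b, b_other):
        return (q * threshold_utility(v, b, b_other)
                + (1 - q) * spa_utility(v, b, b_other))
    return utility


def gap(utility):
    """Minimum utility loss of a non-truthful bid over all v, b_{-i}, b != v."""
    return min(utility(v, v, o) - utility(v, b, o)
               for v, o, b in product(GRID, GRID, GRID) if b != v)


q = Fraction(1, 10)
print("gap of SPA:      ", gap(spa_utility))
print("gap of threshold:", gap(threshold_utility))
print("gap of mixture:  ", gap(mixture_utility(q)))
```

Exact rational arithmetic is used so that the comparison between the mixture's gap and $q$ times the threshold auction's gap is not affected by rounding.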

Proof of Corollary 4.5.

Let $A'$ be the output of Theorem 4.4 when the input auction is Myerson’s revenue-optimal auction for $F$. For any fixed valuation profile $v\in B_{\Delta}^{n}$, for sufficiently large $T$, each bidder $i\in[n]$ will be bidding $v_i$ except with probability $o(1)$. Moreover, the payments in these two auctions differ by some $o(1)$. Thus,

𝐄b1b1T,,bnbnT[limTRev(A;b1,,bn)]Rev(Myerson;v1,,vn)o(1).subscript𝐄formulae-sequencesimilar-tosubscript𝑏1subscriptsuperscript𝑏𝑇1similar-tosubscript𝑏𝑛subscriptsuperscript𝑏𝑇𝑛delimited-[]subscript𝑇Rev𝐴subscript𝑏1subscript𝑏𝑛RevMyersonsubscript𝑣1subscript𝑣𝑛𝑜1\mathop{\bf E\/}_{b_{1}\sim b^{T}_{1},\ldots,b_{n}\sim b^{T}_{n}}\left[\lim_{T% \rightarrow\infty}\mathrm{Rev}(A;b_{1},\ldots,b_{n})\right]\geq\mathrm{Rev}(% \mathrm{Myerson};v_{1},\ldots,v_{n})-o(1)\,.start_BIGOP bold_E end_BIGOP start_POSTSUBSCRIPT italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT ∼ italic_b start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_b start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ∼ italic_b start_POSTSUPERSCRIPT italic_T end_POSTSUPERSCRIPT start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT end_POSTSUBSCRIPT [ roman_lim start_POSTSUBSCRIPT italic_T → ∞ end_POSTSUBSCRIPT roman_Rev ( italic_A ; italic_b start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_b start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) ] ≥ roman_Rev ( roman_Myerson ; italic_v start_POSTSUBSCRIPT 1 end_POSTSUBSCRIPT , … , italic_v start_POSTSUBSCRIPT italic_n end_POSTSUBSCRIPT ) - italic_o ( 1 ) .

The result follows by taking the expectation over the random draw of $v_1,\ldots,v_n$.

We present the formal result about the equilibria of the meta-game below.

Corollary E.1 (Equilibria of Meta-Game).

Let $A$ be an IC, IR auction. Let $T$ be the number of interactions. Assume that $n$ agents use mean-based no-regret learning algorithms to bid in these repeated auctions. Then, there is an auction $A'$ such that

  • $|x_i(b)-x'_i(b)|=o(1),\ |p_i(b)-p'_i(b)|=o(1),\ \forall i\in[n],\forall b\in B^{n}_{\Delta}.$

  • In the meta-game that is induced by $A'$ every agent can gain at most $o(1)$ utility by misreporting its value to the bidding algorithm.

Proof of Corollary E.1.

Let $v_1,\ldots,v_n$ be the values of the agents and let $\hat{v}_1,\ldots,\hat{v}_n$ be the reports to the bidding algorithms. Let $A'$ be the auction obtained by feeding the auction $A$ into the transformation described in Theorem 4.4. The guarantees of this result show that

  • $|x_i(b)-x'_i(b)|=o(1),\ |p_i(b)-p'_i(b)|=o(1),\ \forall i\in[n],\forall b\in B^{n}_{\Delta},$

  • $\mathbf{Pr}[b_i^{T}\neq\hat{v}_i]=o(1),\ \forall i\in[n],$

where $b_i^{T}$ is the bid of the $i$-th agent in round $T$. Thus, with high probability after a large enough number of rounds, for every agent $i\in[n]$ the algorithm is bidding the reported value $\hat{v}_i$ no matter what the other reports $\hat{v}_{-i}$ are. Since the auction $A'$ is truthful, the utility of each agent is maximized when $b_i^{T}=v_i$. Hence, the optimal strategy, up to $o(1)$, is to report $v_i=\hat{v}_i,\forall i\in[n]$. To be more formal, the expected utility of the $i$-th agent in round $T$ is

\[
\mathbf{E}\left[u'_i(b_i^{T},b_{-i}^{T})\right]=u'_i(\hat{v}_i,\hat{v}_{-i})+o(1)\,,
\]

thus, since $A'$ is truthful, this quantity is maximized for $\hat{v}_i=v_i$, up to the $o(1)$ term. ∎
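The convergence property that this proof relies on can be illustrated with a small deterministic simulation. The sketch below assumes two bidders running full-information Hedge (multiplicative weights), a standard example of a mean-based learner, in a mixture of a second-price auction with a hypothetical strictly truthful random-threshold auction. The function names, the grid resolution, the mixing weight `q`, the learning rate `eta`, and the horizon are all illustrative choices and do not come from the paper.

```python
import math

DELTA = 4  # grid resolution, chosen only for illustration
GRID = [k / DELTA for k in range(DELTA + 1)]
MIDS = [(2 * k + 1) / (2 * DELTA) for k in range(DELTA)]


def mixture_utility(v, b, b_other, q):
    """Utility in the q-mixture of a hypothetical random-threshold auction
    (price drawn uniformly from MIDS) and a second-price auction (ties lose)."""
    threshold = sum(v - r for r in MIDS if b > r) / DELTA
    spa = v - b_other if b > b_other else 0.0
    return q * threshold + (1 - q) * spa


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def run_hedge(values, q, eta, rounds):
    """Two Hedge learners; in each round every learner adds, for each bid,
    its expected utility against the opponent's current mixed strategy."""
    cumulative = [[0.0] * len(GRID) for _ in values]
    for _ in range(rounds):
        strategies = [softmax([eta * c for c in cum]) for cum in cumulative]
        for i, v in enumerate(values):
            opponent = strategies[1 - i]
            for j, b in enumerate(GRID):
                cumulative[i][j] += sum(
                    prob * mixture_utility(v, b, o, q)
                    for o, prob in zip(GRID, opponent)
                )
    return [softmax([eta * c for c in cum]) for cum in cumulative]


final = run_hedge(values=[0.5, 0.75], q=0.5, eta=0.5, rounds=2000)
truthful_mass = [final[0][GRID.index(0.5)], final[1][GRID.index(0.75)]]
print(truthful_mass)
```

Because the mixture has a strictly positive gap in every round regardless of the opponent's strategy, the cumulative advantage of the truthful bid grows linearly and both learners concentrate on their reported values.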

Appendix F Omitted Details from Section 5

Proof of Proposition 5.1.

Let $A_T=p_T\cdot A+(1-p_T)\cdot\mathrm{SPA}$, where $A$ is some auction with $\gamma_A>0$ and $p_T$ is a mixing probability that will be defined shortly. Notice that

\[
\gamma_{A_T}\geq p_T\cdot\gamma_A+(1-p_T)\cdot\gamma_{\mathrm{SPA}}\geq p_T\cdot\gamma_A\,.
\]

Since the bidders are mean-based no-regret learners, we know that when

\[
\sum_{\tau=1}^{T_0}v_i\cdot x_i(v_i,b_\tau)-p_i(v_i,b_\tau)\geq\sum_{\tau=1}^{T_0}v_i\cdot x_i(b',b_\tau)-p_i(b',b_\tau)+\delta_T\cdot T,\quad\forall i\in\{0,1\},\forall b'\in B_{\Delta}\,,
\]

they will be bidding truthfully with probability at least $1-\Delta\cdot\eta_T$. We know that in every round

\begin{align*}
v_i\cdot x_i(v_i,b_\tau)-p_i(v_i,b_\tau) &\geq v_i\cdot x_i(b',b_\tau)-p_i(b',b_\tau)+\gamma_{A_T} \\
&\geq v_i\cdot x_i(b',b_\tau)-p_i(b',b_\tau)+p_T\cdot\gamma_A,\quad\forall i\in\{0,1\},\ b_\tau,b'\in B_{\Delta}^{2},\ b'\neq v_i\,.
\end{align*}

Thus, we define $T_0=\min\{t\in\mathbb{N}:p_T\cdot\gamma_A\cdot t\geq\delta_T\cdot T\}=\nicefrac{\delta_T\cdot T}{p_T\cdot\gamma_A}$. The regret is

\begin{align*}
\widetilde{\mathrm{Reg}}_T(A_T;v_L,v_H) &= \widetilde{\mathrm{Reg}}_{T_0}(A_T;v_L,v_H)+\left(\sum_{t=1}^{T}\mathrm{Rev}(v_L,v_H;\mathrm{SP})-\mathbf{E}\left[\sum_{t=T_0+1}^{T}\mathrm{Rev}(b_L^{t},b_H^{t};A)\right]\right) \\
&\leq v_L\cdot T_0+v_L\cdot(T-T_0)\cdot(2\Delta\cdot\delta_T)\cdot(1-p_T)+(T-T_0)\cdot p_T\cdot v_L \\
&\leq v_L\cdot\left(T_0+2\Delta\cdot\delta_T\cdot T\cdot(1-p_T)+T\cdot p_T\right) \\
&\leq v_L\cdot\left(\frac{\delta_T\cdot T}{p_T\cdot\gamma_A}+2\Delta\cdot\delta_T\cdot T+p_T\cdot T\right) \\
&\leq v_L\cdot\left(\frac{2\Delta\cdot\delta_T\cdot T}{p_T\cdot\gamma_A}+p_T\cdot T\right)\,,
\end{align*}

where the first inequality follows from the fact that, in each round after the first $T_0$ rounds, the auctioneer's regret is at most $v_L$ times the sum of two probabilities: the probability that the auction is SPA and the bidders do not bid truthfully, which is at most $(1-p_T)\cdot 2\Delta\cdot\eta_T$, and the probability that the auction is not SPA, which is $p_T$. The rest of the inequalities are just algebraic manipulations. Thus, by setting $p_T=\sqrt{\nicefrac{2\Delta\cdot\delta_T}{\gamma_A}}$ we get that

\[
\widetilde{\mathrm{Reg}}_T(A_T;v_L,v_H)\leq v_L\cdot\left(3\cdot\sqrt{\frac{2\Delta\cdot\delta_T}{\gamma_A}}\cdot T\right)\,,
\]

which concludes the proof. ∎
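For readers who want the final substitution spelled out, the following worked computation plugs the chosen $p_T$ into the last line of the regret bound. The remark on rates after the block assumes $\delta_T=\widetilde{O}(T^{-1/2})$, which is not stated in this proof and is used here only as an illustration.

```latex
\begin{align*}
p_T=\sqrt{\frac{2\Delta\cdot\delta_T}{\gamma_A}}
\;&\Longrightarrow\;
\frac{2\Delta\cdot\delta_T\cdot T}{p_T\cdot\gamma_A}
=\frac{p_T^{2}\cdot T}{p_T}=p_T\cdot T, \\
v_L\cdot\left(\frac{2\Delta\cdot\delta_T\cdot T}{p_T\cdot\gamma_A}+p_T\cdot T\right)
&=2\,v_L\cdot p_T\cdot T
\;\leq\;3\,v_L\cdot\sqrt{\frac{2\Delta\cdot\delta_T}{\gamma_A}}\cdot T .
\end{align*}
```

Under the illustrative assumption $\delta_T=\widetilde{O}(T^{-1/2})$, this choice gives $p_T=\widetilde{O}(T^{-1/4})$ and hence a regret of $\widetilde{O}(T^{3/4})$, in line with the rate quoted in the abstract.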

Proof of Proposition 5.2.

Consider the $v_L,v_H$ pairs of the form $v_H=v_L+\nicefrac{1}{\Delta}$, such that both are bounded away from $0$ and $1$. Then, Myerson’s payment formula shows that $p_H(v_H,v_L)\leq(v_H-\nicefrac{1}{\Delta})\cdot x_H(v_H,v_L)=v_L\cdot x_H(v_H,v_L)$. We first argue that $x_H(v_H,v_L)<1$.
Indeed, suppose that $x_H(v_H,v_L)=1$. Then the low type gets no signal about their bid and hence bids uniformly at random in $[0,v_L]$. In particular, with some probability $C_{\Delta}$ that is independent of $T$, the low type bids the value $b_L=v_L/2$. Now the only way for the auction $A_T$ to generate $(v_L-o(1))$ revenue from such rounds is if $x_H(v_H,v_L/2)-x_H(v_L,v_L/2)=1-o(1)$.
But if this is the case, then consider the valuation pair $(v_L/2,v_L/2+\nicefrac{1}{\Delta})$: the auctioneer allocates at most $x_H(v_L/2+\nicefrac{1}{\Delta},v_L/2)\leq o(1)$ per round, and gets almost no revenue from the high type. Moreover, the low type will generate at most $v_L/2$ revenue, so the regret of the auctioneer is linear in $T$; this gives the desired contradiction.

Since $x_H(v_H, v_L) < 1$, let $q := 1 - x_H(v_H, v_L)$. Then, $x_L(v_L, v_H) \le q$ and so $u_L(v_L, v_H) - u_L(v_L - 1/\Delta, v_H) \le q \cdot 1/\Delta \le q$. In order to cancel the effect of the learning rate $\eta_T$, we need to wait for $T_0 := \Omega(1)/(q \cdot \eta_T)$ rounds.
For some $C'_\Delta$ fraction of these $T_0$ rounds the agent of low type will bid $v_L/2$, and an argument similar to the previous paragraph shows that the revenue of the auction will be at least $1/\Delta - o(1)$ less than $v_L$. Thus, the regret in these $T_0$ rounds will be $\Omega(T_0)$, where we are hiding constants depending on $\Delta$. Let us assume that after $T_0$ rounds the low type starts bidding truthfully. Then, the total regret in this period due to allocation of the item to the low type is $\Omega\left((T - T_0) \cdot q\right)$. Summing up the two terms we get a regret of $\Omega\left(1/(q \eta_T) + q \cdot T - 1/\eta_T\right)$.
Since $\eta_T = \Theta(1/\sqrt{T})$, this is $\Omega(\sqrt{T}/q + qT - \sqrt{T})$, which for any choice of $q$ is $\Omega\left(T^{3/4}\right)$.
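The last step is a balancing argument: by the AM-GM inequality, $\sqrt{T}/q + qT \ge 2T^{3/4}$, with equality at $q = T^{-1/4}$. The following Python sketch checks this numerically. It is only an illustration: the function names and the search grid are assumptions made here, and the constants hidden by the $\Omega(\cdot)$ notation are set to 1.

```python
import math


def regret_lower_bound(q: float, T: int) -> float:
    """Dominant terms sqrt(T)/q + q*T - sqrt(T); hidden constants set to 1 (assumption)."""
    return math.sqrt(T) / q + q * T - math.sqrt(T)


def minimize_over_q(T: int, grid_size: int = 10_000) -> tuple[float, float]:
    """Return (best_q, best_value) over a uniform grid of q in (0, 1]."""
    candidates = [(k + 1) / grid_size for k in range(grid_size)]
    best_q = min(candidates, key=lambda q: regret_lower_bound(q, T))
    return best_q, regret_lower_bound(best_q, T)


if __name__ == "__main__":
    for T in (10**4, 10**6, 10**8):
        q_star, value = minimize_over_q(T)
        # The grid minimizer should sit near T^(-1/4), and value / T^(3/4)
        # should stay bounded away from zero as T grows.
        print(T, q_star, T ** -0.25, value / T ** 0.75)
```

No choice of $q$ brings the expression below order $T^{3/4}$: a small $q$ inflates the first term, and a large $q$ inflates the second.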

Proof of Theorem 5.3.

We will upper bound the auctioneer regret in the two epochs $\{1, \ldots, T_0\}$ and $\{T_0 + 1, \ldots, T\}$ separately, where $T_0 \in [T]$ is a parameter of the design which we will define shortly. For the first epoch, we will use the simple upper bound of $v_L \cdot T_0$.

Let us consider the bid distribution of the two bidders after $T_0$ rounds. Since they are mean-based no-regret learners we know that if

$$\sum_{\tau=1}^{T_0} v_i \cdot x_i(v_i, b_\tau) - p_i(v_i, b_\tau) \ge \sum_{\tau=1}^{T_0} v_i \cdot x_i(b', b_\tau) - p_i(b', b_\tau) + \delta_T \cdot T, \quad \forall i \in \{1, 2\},\ \forall b' \in B_\Delta\,,$$

then, by a union bound over the possible bids, they will both be bidding truthfully with probability at least $1 - 2\Delta \cdot \eta_T$.

We know that in every round $\tau \in [T_0]$ we have that

$$v_i \cdot x_i(v_i, b_\tau) - p_i(v_i, b_\tau) \ge v_i \cdot x_i(b', b_\tau) - p_i(b', b_\tau) + \gamma_A, \quad \forall i \in \{1, 2\},\ b_\tau, b' \in B_\Delta^2,\ b' \ne v_i\,.$$

Therefore, we set $T_0 = \min\{t \in \mathbb{N} : t \cdot \gamma_A \ge \delta_T \cdot T\} = \delta_T \cdot T / \gamma_A$. Thus, we can upper bound the cumulative auctioneer regret by

\begin{align*}
\widetilde{\mathrm{Reg}}(A, \ldots, A, \mathrm{SPA}, \ldots, \mathrm{SPA}; v_L, v_H) &\le v_L \cdot T_0 + v_L \cdot (T - T_0) \cdot 2\Delta \cdot \eta_T \\
&\le v_L \cdot \frac{\delta_T \cdot T}{\gamma_A} + v_L \cdot T \cdot 2\Delta \cdot \eta_T \\
&= O\left(\delta_T \cdot T \cdot \left(\frac{1}{\gamma_A} + \Delta\right)\right)\,,
\end{align*}

where the first inequality follows from the fact that with probability at most $2\Delta \cdot \eta_T$ one of the two bidders will not be truthful in the last $(T - T_0)$ rounds, and the remaining steps are just algebraic manipulations. ∎

Proof of Proposition 5.4.

It is not hard to see that in the setting we are working in the auctioneer cannot have negative auctioneer regret in any interval of the interaction. For instance, when $v_H = v_L - 1/\Delta$, the SPA performs optimally. Since every $A_t$, $t \in [T]$, is a truthful auction, Myerson's lemma shows that

$$u^t_i(v_i, b_{-i}) - u^t_i(b', b_{-i}) = \int_{z=b'}^{v_i} x^t_i(z, b_{-i})\,dz - \left(v_i - b'\right) \cdot x^t_i(b', b_{-i}), \quad \forall i \in \{1, 2\},\ \forall v_i, b', b_{-i} \in B^3_\Delta\,,$$

so for $b' = v_i - 1/\Delta$ we get that

$$u^t_i(v_i, b_{-i}) - u^t_i(v_i - 1/\Delta, b_{-i}) \le \frac{1}{\Delta}, \quad \forall v_i, b', b_{-i} \in B^3_\Delta\,.$$
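The omitted step is short. Allocation probabilities lie in $[0,1]$, so the integral over an interval of length $1/\Delta$ is at most $1/\Delta$, and the subtracted term is non-negative. Written out, as a sketch of the reasoning rather than a quotation of the original argument:

```latex
\begin{align*}
u^t_i(v_i, b_{-i}) - u^t_i(v_i - 1/\Delta, b_{-i})
  &= \int_{z = v_i - 1/\Delta}^{v_i} x^t_i(z, b_{-i})\,dz
     - \frac{1}{\Delta} \cdot x^t_i(v_i - 1/\Delta, b_{-i}) \\
  &\le \int_{z = v_i - 1/\Delta}^{v_i} 1\,dz - 0
   = \frac{1}{\Delta}\,.
\end{align*}
```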

Thus, in every iteration the utility of bidding $v_i$ is at most $1/\Delta$ greater than the utility of bidding $v_i - 1/\Delta$. Summing up over the first $T_0$ iterations, we get that

$$\sum_{t=1}^{T_0} \left(u^t_i(v_i, b_{-i}) - u^t_i(v_i - 1/\Delta, b_{-i})\right) \le \frac{T_0}{\Delta}, \quad \forall v_i, b', b_{-i} \in B^3_\Delta\,.$$

Let us now shift our attention to the weights that MWU puts on $v_i - 1/\Delta$ and $v_i$ after $T_0$ iterations. We have

\begin{align*}
\frac{\Pr[b^{T_0}_i = v_i]}{\Pr[b^{T_0}_i = v_i - 1/\Delta]} &= e^{\eta_T \sum_{t=1}^{T_0} \left(u^t_i(v_i, b^t_{-i}) - u^t_i(v_i - 1/\Delta, b^t_{-i})\right)} \\
&\le e^{\eta_T \cdot \frac{T_0}{\Delta}}\,,
\end{align*}

so for $T_0 = \Delta / \eta_T$ we have that

$$\Pr[b^{T_0}_i = v_i - 1/\Delta] \ge \frac{\Pr[b^{T_0}_i = v_i]}{e}\,.$$

This immediately implies that

$$\Pr[b^t_i = v_i - 1/\Delta] \ge \frac{\Pr[b^t_i = v_i]}{e}, \quad \forall t \in [T_0]\,.$$

Thus, each algorithm bids truthfully with probability at most $9/10$ in each of the first $T_0$ rounds: the two probabilities above sum to at most $1$, so $\Pr[b^t_i = v_i] \le e/(1+e) < 9/10$. When $v_H = v_L + 1/\Delta$ and both bidders are not bidding truthfully, the revenue loss compared to $\mathrm{SPA}$ is at least $1/\Delta$. Putting it together, we can see that within the first $T_0$ rounds the total revenue loss compared to $\mathrm{SPA}$ is at least $C \cdot \frac{1}{\Delta} \cdot T_0 = C / \eta_T = \Omega(\sqrt{T})$, for some absolute constant $C > 0$, where we used $T_0 = \Delta / \eta_T$ and $\eta_T = \Theta(1/\sqrt{T})$.
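To make the weight-ratio argument concrete, here is a small Python sketch of the worst case for two bids. It assumes the exponential-weights form of MWU implied by the displayed ratio, keeps only the two bids $v_i$ and $v_i - 1/\Delta$, and lets the truthful bid gain exactly $1/\Delta$ per round. The function names and parameter values are illustrative assumptions, not taken from the paper.

```python
import math


def mwu_probabilities(cumulative_utilities: list[float], eta: float) -> list[float]:
    """Exponential weights: Pr[b] is proportional to exp(eta * cumulative utility of b)."""
    shift = max(cumulative_utilities)  # subtract the max for numerical stability
    weights = [math.exp(eta * (u - shift)) for u in cumulative_utilities]
    total = sum(weights)
    return [w / total for w in weights]


def truthful_probability_after(T0: int, eta: float, delta: int) -> float:
    """Probability of the truthful bid after T0 rounds in the two-bid worst case."""
    cum_truthful = T0 * (1.0 / delta)  # truthful bid gains 1/delta every round
    cum_shaded = 0.0                   # bid v_i - 1/delta used as the baseline
    return mwu_probabilities([cum_truthful, cum_shaded], eta)[0]


if __name__ == "__main__":
    T = 10**6
    eta = 1.0 / math.sqrt(T)   # eta_T = Theta(1/sqrt(T)), constant set to 1
    delta = 10
    T0 = round(delta / eta)    # T0 = Delta / eta_T
    print(T0, truthful_probability_after(T0, eta, delta))
```

With more than two bids in play the probability of the truthful bid can only be smaller, so the two-bid case is the one that matters for the bound.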

Next, we show that the auction we defined in Definition 5.5 is optimal in terms of its parameter $\gamma_A$.

Lemma F.1.

In the setting with two bidders it holds that the optimal choice of the parameter $\gamma_A$ is $\Theta\left(1/\Delta^2\right)$. Moreover, the auction defined in Definition 5.5 achieves that bound.

Proof of Lemma F.1.

Consider some auction $A$ and fix the bid of the second bidder to be $b' \in B_\Delta$. Then, $x_1(\cdot, b')$ is a non-decreasing function, with $0 \le x_1(b, b') \le 1$ for all $b \in B_\Delta$. Notice that for any consecutive bids, Myerson's lemma shows that

$$u_1(b, b') - u_1(b - 1/\Delta, b') \le \frac{1}{\Delta} \cdot \left(x_1(b, b') - x_1(b - 1/\Delta, b')\right)\,.$$

Since there are $\Delta + 1$ different $b \in B_\Delta$ and the function $x_1(\cdot, b')$ is monotone and takes values in $[0, 1]$, the sum of consecutive differences telescopes and we have

\begin{align*}
\sum_{b > 0} \left(x_1(b, b') - x_1(b - 1/\Delta, b')\right) &= x_1(1, b') - x_1(0, b') \\
&\le 1\,,
\end{align*}

and since there are $\Delta$ terms in the summation, all of which are non-negative, at least one of them must be at most $1/\Delta$. Let $b^*_1 \in B_\Delta$ be such that $x_1(b^*_1, b') - x_1(b^*_1 - 1/\Delta, b') \le \frac{1}{\Delta}$. Then, picking $v_1 = b^*_1$ witnesses that $\gamma_A \le \frac{1}{\Delta^2}$.
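The pigeonhole step can be checked mechanically. The Python sketch below draws random non-decreasing allocation rules on the grid $\{0, 1/\Delta, \ldots, 1\}$ and confirms that the smallest consecutive increment never exceeds $1/\Delta$, so the implied bound on $\gamma_A$ never exceeds $1/\Delta^2$. All names are illustrative assumptions. The linear rule $x(b) = b$ is a hypothetical example of a rule with equal increments and is not claimed to be the auction of Definition 5.5.

```python
import random
from fractions import Fraction


def smallest_increment(allocation: list[Fraction]) -> Fraction:
    """Smallest difference between allocation values at consecutive grid bids."""
    return min(hi - lo for lo, hi in zip(allocation, allocation[1:]))


def gamma_upper_bound(allocation: list[Fraction], delta: int) -> Fraction:
    """Bound on gamma_A from the proof: (1/delta) times the smallest increment."""
    return smallest_increment(allocation) / delta


def random_monotone_allocation(delta: int, rng: random.Random) -> list[Fraction]:
    """Random non-decreasing values in [0, 1] on the delta + 1 grid points."""
    return sorted(Fraction(rng.randint(0, 1000), 1000) for _ in range(delta + 1))


def linear_allocation(delta: int) -> list[Fraction]:
    """Hypothetical linear rule x(b) = b, whose increments all equal 1/delta."""
    return [Fraction(k, delta) for k in range(delta + 1)]


if __name__ == "__main__":
    rng = random.Random(0)
    delta = 10
    worst = max(gamma_upper_bound(random_monotone_allocation(delta, rng), delta)
                for _ in range(1000))
    print(worst <= Fraction(1, delta**2), gamma_upper_bound(linear_allocation(delta), delta))
```

Equal increments are the only way to make every one of the $\Delta$ differences as large as $1/\Delta$, which is why a linear rule meets the bound with equality.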

Appendix G Extensions

In this section we discuss potential extensions of our model and adaptations of our results.

Extension to partial feedback setting.

Our results can be adapted to the partial feedback setting, with different quantitative bounds. In particular, there are mean-based no-regret algorithms such as EXP3 (Braverman et al., 2018) with $\eta_T = \widetilde{O}(T^{1/4})$. Notice that our positive results are stated for mean-based learners, so the guarantees hold in this setting as well.

Extension to multiple bidders.

We underline that our results in Section 4 are already stated and proven for multiple bidders. For our upper bounds in Section 5 there is a $1/n$ degradation to the auctioneer regret bound. When we are dealing with $n$ bidders we can create a strictly IC auction $A$ by building upon our “staircase auction” approach for two bidders in the following way: we select some bidder $i \in [n]$ uniformly at random (independently of their bids) and then we allocate to bidder $i$ with probability $b_i$. Thus, for each bidder $i \in [n]$ their allocation probability $x_i(b)$ is a linear function with $x_i(0) = 0$, $x_i(1) = 1/n$. Hence, Myerson's lemma shows that $u_i(v_i) - u_i(v_i - 1/\Delta) = \Theta(1/(n\Delta^2))$, and thus $\gamma_A = \Theta(1/(n\Delta^2))$.
Recall that in the two-bidder case we have shown that this auction gives $\gamma_A = \Theta(1/\Delta^2)$, so the degradation in $\gamma_A$ by $1/n$ leads to a degradation of the same factor in the auctioneer regret compared to the two-bidder setting.
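The $\Theta(1/(n\Delta^2))$ gap can be verified by hand or with exact arithmetic. The Python sketch below plugs the linear allocation $x_i(z) = z/n$ into Myerson's identity in the integral form used in the proof of Proposition 5.4 and evaluates the closed-form integral. The function name is an assumption, and the continuous integral is used as stated there; a payment rule defined directly on the discrete grid may differ by constant factors.

```python
from fractions import Fraction


def utility_gap(v: Fraction, b_prime: Fraction, n: int) -> Fraction:
    """u_i(v) - u_i(b') for x_i(z) = z / n, via Myerson's identity.

    gap = integral_{b'}^{v} (z / n) dz - (v - b') * x_i(b')
    """
    integral = (v * v - b_prime * b_prime) / (2 * n)  # closed form of the integral
    correction = (v - b_prime) * (b_prime / n)        # (v - b') * x_i(b')
    return integral - correction


if __name__ == "__main__":
    delta, n = 10, 4
    step = Fraction(1, delta)
    for k in range(1, delta + 1):
        v = Fraction(k, delta)
        # Every consecutive pair gives the same gap, 1 / (2 * n * delta^2).
        print(v, utility_gap(v, v - step, n))
```

Algebraically the gap simplifies to $(v - b')^2 / (2n)$, so for consecutive grid bids it equals $1/(2n\Delta^2)$ regardless of $v$.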

Extension of regret bounds to the distributional setting.

In Section 5 we consider a setting where the auctioneer does not have any distributional knowledge about the valuation of the bidders. Notice that our lower bounds are witnessed by valuation pairs (low type, high type) of the form $v_L = v$, $v_H = v + 1/\Delta$. Let us now consider a distributional setting where $v_1, v_2$ are drawn from distributions $\mathcal{D}_1, \mathcal{D}_2$, and then the two bidders participate in repeated second-price auctions using MWU parametrized by these valuations. Similarly to the prior-free setting, the goal of the auctioneer is to have small expected regret, where the expectation is over the random draw of the valuations and the random behavior of MWU.
Notice that the cumulative revenue of SPA when the bidders are truthful is $T \cdot \mathbb{E}_{v_1 \sim \mathcal{D}_1, v_2 \sim \mathcal{D}_2}[\min\{v_1, v_2\}]$, so this is the benchmark the auctioneer competes with (in this setting, we can modify the benchmark to be SPA with personalized reserves with the same arguments). If these distributions $\mathcal{D}_1, \mathcal{D}_2$ place some constant probability (i.e., independent of $T$) on every element of $\{0, 1/\Delta, 2/\Delta, \ldots, 1\}$, then with some constant probability we will see a draw of the form $v_L = v$, $v_H = v + 1/\Delta$, so these pairs will be contributing a constant fraction of the expected revenue of the second-price auction, i.e., the term $\mathbb{E}_{v_1 \sim \mathcal{D}_1, v_2 \sim \mathcal{D}_2}[\min\{v_1, v_2\}]$. Thus, if the auctioneer wants to have expected regret at most $O(R_T)$, they need to have regret at most $O(R_T)$ for all such valuation pairs, where in the notation $O(\cdot)$ we are suppressing all the parameters that do not depend on $T$.