Introduction
As the Gitcoin community is fond of saying, “public goods are good.”
Public goods are defined by economists as goods (“useful things”) which are both non-excludable (anyone can use them) and non-rivalrous (my use doesn’t prevent your use). Classic examples include knowledge, clean air, lighthouses, and open-source software — obviously valuable things.
The classic problem posed by public goods is that while anyone can enjoy them, no one has a reason to produce them. Under vanilla capitalism, people invest to earn a return; ergo, projects must have some way of returning capital to investors. Public goods have fewer ways of making money, the story goes, and so go under-funded. Known as the “free rider” problem, the challenge of articulating and producing public goods has puzzled economists for years, producing a large body of literature.1
The Web3 community, flush with cash and dependent on public software goods to grow, has over the last five years invested significant time and energy into this problem, with impressive results and tens of millions of dollars distributed to open-source projects. Broadly framed as “grants programs,” large institutional-scale donations are channeled through innovative allocation mechanisms towards hundreds of public-goods projects.
Eschewing centralized grant committees in favor of a distributed “wisdom of the crowds,” these programs rely on large online communities to allocate resources. As such, a key component of these programs is the mechanism, or set of mechanisms, used to determine the final allocation of funding. Every mechanism represents a set of tradeoffs and beliefs; the choice of mechanism has major implications for the experience and outcomes of the program.
This post will discuss and compare two approaches to public-goods funding used by the Web3 community: that of quadratic funding, and that of pairwise preference. Both produce the same kind of result, an allocation of funding (defined as some percent of the total going to each project), but arrive at it by very different means. The goal of this analysis is not to advocate for one mechanism over another, but rather, through the act of contrast, to gain greater insight into the underlying problem.
Quadratic Funding
Theory
Initially described in this December 2018 paper by Vitalik Buterin, Zoë Hitzig, and E. Glen Weyl, Quadratic Funding (QF) adapts existing ideas of Quadratic Voting to the capital-allocation setting.
A QF process consists of three parts: 1) a set of projects, 2) a large matching fund, and 3) a set of participant donations to individual projects. The donations are taken as representing the “collective intelligence” of the community, and used to determine the allocation of the matching funds to the projects, by calculating the percent of total individual donations going to each individual project.
The catch is that the individual donations are not summed up directly: each donation is first taken to its square root, and for each project the sum of those square roots is then squared (hence “quadratic”). The effect is to privilege small donors over large, and thus “correct” for the plutocratic and self-interested tendencies of capitalism-as-usual.
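To make the arithmetic concrete, here is a minimal sketch of the standard formula in Python, ignoring the caps, normalizations, and anti-collusion adjustments used in production rounds (all names are illustrative):

```python
import math

def qf_match_shares(contributions: dict[str, list[float]]) -> dict[str, float]:
    """Toy quadratic funding: each project's matching share is
    proportional to the square of the sum of the square roots of
    its individual donations."""
    scores = {
        project: sum(math.sqrt(c) for c in donations) ** 2
        for project, donations in contributions.items()
    }
    total = sum(scores.values())
    return {project: score / total for project, score in scores.items()}

# Ten donors giving $1 each outweigh one donor giving $10:
# A scores (10 * sqrt(1))^2 = 100, while B scores sqrt(10)^2 = 10.
print(qf_match_shares({"A": [1.0] * 10, "B": [10.0]}))  # A ≈ 0.91, B ≈ 0.09
```

Note how broad support beats deep pockets: the same $10, spread across ten donors, commands roughly ten times the matching.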
Practice
Quadratic Funding is both theoretically optimal2 and useful in practice, having been used by the Gitcoin Grants community and others to allocate tens of millions of dollars over several years to dozens of projects (including ours). The quadratic constraint encourages participants to spread their attention and resources over multiple projects to maximize their individual influence over the final allocation.
See here and here for discussions of past QF-based Gitcoin Grants rounds, and here for a more in-depth analysis of the use of QF in public-goods-funding broadly.
Further, inasmuch as “budgets are policy,” we can argue that QF has done more than most to advance the broader practice of “DAOs” by evidencing mechanisms for the pro-social coordination of distributed, independent actors.
Limits
Inherent to the goal of privileging smaller voices, QF requires a strong identity system to function. And given a sufficiently strong identity system, QF nonetheless remains vulnerable to collusion among participants, with research showing that “there are no bounds on how much money can be illicitly extracted from a [QF process] if even two agents are capable of coordinating.”
Fortunately, QF has benefitted from hundreds of hours of collective research into these vulnerabilities, resulting in powerful pre- and post-processing techniques such as sybil-resistant identity “passports” and connection-oriented cluster matching (COCM) for penalizing colluding or dishonest actors.
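The details of COCM are beyond the scope of this post, but the core move can be sketched: apply the square root at the cluster level rather than the wallet level, so that coordinated donors gain nothing by splitting funds across wallets. A toy illustration, loosely inspired by COCM (the production algorithm is considerably more sophisticated):

```python
import math
from collections import defaultdict

def clustered_score(donations: list[tuple[str, float]],
                    cluster_of: dict[str, str]) -> float:
    """Toy cluster matching for a single project: donations are pooled
    by cluster before the square root is applied, so splitting funds
    across wallets in the same cluster yields no extra matching."""
    pooled = defaultdict(float)
    for wallet, amount in donations:
        pooled[cluster_of.get(wallet, wallet)] += amount  # unclustered wallets stand alone
    return sum(math.sqrt(total) for total in pooled.values()) ** 2

clusters = {"w1": "c1", "w2": "c1"}
print(clustered_score([("w1", 10.0)], clusters))               # 10.0
print(clustered_score([("w1", 5.0), ("w2", 5.0)], clusters))   # still 10.0
```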
In reality, no identity system is perfect, and projects have every incentive to try to coordinate their contributions to maximize their benefit. As such, every layer of intermediation adds complexity and new challenges. In their retrospective of the community round they ran as part of Gitcoin Grants 21, Open Civics observed that:
Upon reviewing the cluster matching results, we observed discrepancies between the expected matching results and the actual outcomes. This discrepancy was traced back to a parameter within cluster matching that reduces the matching for wallets that only donate to a single project. While this serves as a useful Sybil protection mechanism, it also unintentionally penalized grantees with unique donor bases mobilized specifically for their projects.
We see how the anti-collusion mechanism, while effective at suppressing a specific type of undesirable behavior, creates knock-on effects which must be addressed.3
Future Directions
QF is supported by an active community of researchers, practitioners, and advocates, with a well-understood roadmap: continued improvements in sybil-detection and collusion resistance, better UI and UX, and a larger body of case studies to deepen the experience of practice.
Widely used and well-resourced, it is likely that QF will become increasingly useful and influential in the years to come.
Pairwise Preference
Theory
Initially described in this December 2018 paper by myself, Aron Fischer, and Jack du Rose, Pairwise Preference (PP) adapts a variation of Google’s PageRank algorithm to the capital-allocation setting.4
Like QF, a PP process consists of three parts: 1) a set of projects, 2) some funds, and 3) a set of participant pairwise preferences between the projects. The pairwise preferences are used to determine the allocation of the grant fund, by constructing a large “graph” of preferences and “flowing” allocation through the graph until a steady state is achieved.5
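As a rough illustration of the “flow” idea, here is a minimal sketch in the spirit of PageRank (this is not General Magic’s implementation; the graph construction and damping parameter are assumptions):

```python
import numpy as np

def pairwise_allocation(projects: list[str],
                        preferences: list[tuple[str, str]],
                        damping: float = 0.85,
                        iters: int = 100) -> dict[str, float]:
    """Toy pairwise preference: each vote "A over B" adds an edge from
    the loser to the winner; allocation then flows through the graph
    (PageRank-style) until it reaches a steady state."""
    n = len(projects)
    idx = {p: i for i, p in enumerate(projects)}
    votes = np.zeros((n, n))
    for winner, loser in preferences:
        votes[idx[loser], idx[winner]] += 1.0  # the loser "endorses" the winner

    # Row-normalize: each project passes its weight to the projects
    # preferred over it (or uniformly, if it never lost a comparison).
    out = votes.sum(axis=1, keepdims=True)
    transition = np.where(out > 0, votes / np.where(out > 0, out, 1.0), 1.0 / n)

    # Power iteration with damping, as in PageRank.
    alloc = np.full(n, 1.0 / n)
    for _ in range(iters):
        alloc = (1 - damping) / n + damping * alloc @ transition
    return dict(zip(projects, alloc / alloc.sum()))

print(pairwise_allocation(["A", "B", "C"], [("A", "B"), ("A", "C"), ("B", "C")]))
# A receives the largest share, C the smallest.
```

Note the sports-ranking intuition at work: weight received from a highly-valued project counts for more than weight received from a lowly-valued one.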
The essential insight of PP is that while valuing the absolute merit of an individual project is cognitively demanding, evaluating the relative merit of two projects is cognitively simple. With the help of PP, a large community can be meaningfully engaged in the project of collective allocation. Further, when compared with QF, PP is less dependent on strong identity systems or matching funds — direct participant contributions may suffice.6
The deeper difference, however, between QF and PP is that QF inputs are objective contributions to individual projects, while PP inputs are subjective preferences between pairs of projects: instead of “$25 to A” it is “A over B.” By bringing decisions “closer to the (phenomenological) metal,” PP can be seen as a machine for transforming subjectivity to objectivity, with the advantage being a more intuitive and engaging experience for the participant, and the higher attentional efficiency this implies.
Practice
General Magic (GM) developed an implementation of PP called Pairwise for use in Optimism’s RetroPGF program, which featured in RetroPGF rounds 3, 4, and 5.
In RetroPGF 3, Pairwise was used to help participants create and share “lists” of high quality projects to facilitate the final direct allocation process. In their retrospective, GM observed the following (emphasis added):
The gamification and choosing between two choices was really fun
The community responded positively, showing excitement about the voting mechanism
It was easy to discover and appreciate unknown smaller name projects with Pairwise because of the categories and being compelled to compare them against other projects.
GM continued iterating Pairwise for RetroPGF, extending its use further down the process pipeline. In their retrospective on RetroPGF 4, GM observed that:
In general, wallets, privacy dapps and governance projects performed better on Pairwise than in the purely metric based round and meme coins and DeFi projects with relatively lower metrics performed significantly worse.
This suggests that PP can help participants cut through hype to more clearly discern underlying value. Finally, in a voting data analysis of RetroPGF 5, Carl Cervone observed that:
… many of the voters who used [Pairwise] had a higher coefficient of variation than those who didn’t. This suggests that Pairwise helped voters develop more nuanced distribution preferences. … It's worth exploring additional applications of Pairwise in future rounds (eg, in eligibility reviews, determining consensus picks, etc).
Limits
While users appreciated the interface, some found the paradigm shift jarring. Consider this observation by the GM team:
Many badgeholders created their own assessment system in spreadsheets and Pairwise was difficult to integrate into that system.
and this user feedback:
“It’s very hard to convert the relative preferences of an individual into an absolute allocation of tokens. I struggled a lot with this step.”
PP makes a different ask than other mechanisms: rather than provide a structured way of thinking through absolute allocations, it provides a structured way of thinking through relative priorities, promising to return an absolute allocation incorporating that relative knowledge.
For participants used to submitting ratings directly, this may feel like an unacceptable loss of control. Getting participants to trust that the final allocations fairly reflect their underlying values and intentions remains an ongoing UX challenge.7 Until this challenge is resolved, PP may be limited to more peripheral use-cases.
Future Directions
A common critique of PP is that it results in more, albeit smaller, decisions; constraining the problem size via a pre-processing step is useful in practice. For example, in RetroPGF 5 GM experimented with a pre-processing step in which participants rated each project on a 1-5 scale, sorting the projects into quality tiers and facilitating ease-of-comparison within tiers.
Another approach, however, would be to leverage voting information within the voting process itself. Specifically, in lieu of presenting voters with pairs at random (drawing from a large set of O(n^2) possible combinations), project pairs could be presented in a way that maximizes the marginal information of each particular input. A project which is consistently voted down can be shown less often, reflecting the relatively little information to be gained from another downvote. Conversely, a pair of projects which consistently receive alternating preferences can be shown more often, engaging limited voter attention in more nuanced discernment. By leveraging available information to adaptively group and surface projects in a way that continually reduces total uncertainty, the same allocation can be reached with fewer votes, perhaps even O(n) in the number of pairs.8
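A minimal sketch of what such adaptive pair selection could look like, assuming a Bradley-Terry-style strength model (the scoring model and selection rule are illustrative assumptions, not an implemented feature of Pairwise):

```python
import itertools
import math

def next_pair(scores: dict[str, float]) -> tuple[str, str]:
    """Toy adaptive pair selection: given current strength estimates
    (e.g., from a Bradley-Terry or Elo model), surface the pair whose
    outcome is most uncertain (win probability closest to 50%), since
    that vote carries the most information."""
    def entropy(a: str, b: str) -> float:
        p = 1.0 / (1.0 + math.exp(scores[b] - scores[a]))  # P(a beats b)
        return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))
    return max(itertools.combinations(scores, 2), key=lambda pair: entropy(*pair))

# "D" is consistently voted down, so pairs involving it carry little
# information; the near-tied pair ("A", "B") is surfaced instead.
print(next_pair({"A": 1.0, "B": 1.1, "C": 2.5, "D": -3.0}))  # ('A', 'B')
```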
Discussion
Using this juxtaposition as an entry point, we can briefly consider a few broader questions about public-goods-funding mechanism design in general.
Output unpredictability
With both mechanisms, users have reported frustration with the way that final outcomes diverge from prior expectations, experiencing the results as “unfair.” In the QF context, the issue partly stems from the way that the sybil-resistance and anti-collusion processes opaquely adjust participant inputs. In the PP context, the issue partly stems from the iterative and non-linear process used to convert subjective inputs into absolute allocations.
While it is hard to call surprising outcomes a feature, neither are they fully a bug. Rather, they are an invitation to reframe the problem. Notable game designer Raph Koster has observed that (emphasis added):
Cheating is an apparently advantageous violation of player assumptions about the game. When those assumptions are satisfied, all apparently advantageous methods are fair. When they are violated, no apparently advantageous methods are fair.
Koster’s insight is that perceived fairness is less a result of the actual mechanics than of participants’ expectations of those mechanics. By encouraging the public-goods-funding community to view each round as an iteration of a long-running evolutionary process, we create more room for experimentation and learning.
There will inevitably be tradeoffs to make between the often-contradictory goals of predictability, fairness, correctness, and security. As we saw in the example of QF collusion-resistance, while each additional layer of complexity has the potential to create an overall more effective system, every intervention has downsides. The question is not “how can we avoid these tradeoffs” but rather “how can we manage these tradeoffs Pareto-optimally over the long run.”
One could dismiss this thinking as lazy or defeatist, and argue that grants programs can and should do more to align allocations with proven impact. Meaningful work is being done to make “objective” metrics more accessible and legible. Yet, as the authors of the 2024 State of Web3 Grants report write (emphasis added):
Programs are starting to think more about how to measure the long-term impact of their grants, but there’s still a lot of work to do in this area. It’s important to note that this is by no means a challenge solely in web3, but in fact is a common challenge for just about all grant programs.
It is important to acknowledge that, fundamentally, there is no knowable correct allocation for a grants round, no external “ground truth” to benchmark against. Being slightly over- or under-funded relative to expectation becomes part of the experience: occasional disappointment to some, occasional windfall to others.9 The community should embrace a certain degree of “chaos magic” and focus on engaging as many participants as possible in the process of collective sensemaking, trusting that better information and process will lead to course-correction over time.
Preference intensity
A perennial and irresistible debate among decision theorists is that of ranked vs. rated voting, with numerical rating (cardinal) methods providing greater capacity for both information transmission and measurement error when compared with their purely ranked (ordinal) counterparts.
For profound epistemological reasons, there is no single answer. After making historical contributions to the theory of ranked voting, Kenneth Arrow famously conceded that “I'm a little inclined to think that score systems where you categorize in maybe three or four classes probably (in spite of what I said about manipulation) is probably the best.”10 That said, data show that while rating scales can be useful for small groups, they easily collapse into binary thumbs up / thumbs down systems in pseudonymous online settings where participants are motivated to vote strategically.
The question of ranked vs. rated is thus highly context-dependent. Ratings should be introduced to the extent that they can be made robust to divergent interpretations.
QF models numerical intensity directly via quadratically-constrained contributions, the sine qua non of the mechanism. PP does not model intensity by default, but can be easily extended to do so. The choice of how and when to consider ranked vs. rated inputs should be made by the practitioner based on their assessment of the audience.
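One hedged way to add intensity to PP, building on the toy graph construction sketched earlier (the weighting scheme is an illustrative assumption):

```python
import numpy as np

# Building on the earlier pairwise sketch: let each vote carry a weight
# expressing intensity, so "strongly prefer A over B" adds a heavier
# loser -> winner edge than "slightly prefer A over B".
projects = ["A", "B", "C"]
idx = {p: i for i, p in enumerate(projects)}
votes = np.zeros((len(projects), len(projects)))
for winner, loser, weight in [("A", "B", 3.0), ("B", "C", 1.0)]:
    votes[idx[loser], idx[winner]] += weight  # was += 1.0 in the unweighted version
```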
Continuous processes
A subtle but important property of these allocation mechanisms relates to the validity of their inputs over time. To what extent does participant input remain relevant as circumstances change and projects evolve? In the economics literature, this is known as the “Independence of Irrelevant Alternatives” (IIA), related to the Plurality concept of “contextual integrity.”
While both QF and PP judgments are fundamentally relative, PP models that relativity explicitly. A preference for project A over project B is valid regardless of the presence (or absence) of projects C, D, and E. A user’s inputs, then, remain valid (“contextually integral”) even as the set of total projects evolves over time. Under QF, however, this relativity is implicit and the results are thus somewhat more ephemeral: it could very well be that the introduction of project C in a future iteration would affect the user’s chosen contribution to project A; every contribution decision is thus conditional on the entire set.
One implication is that while pairwise preferences can be persisted across funding iterations, quadratic contributions must be generated anew at every iteration. Thus, PP more naturally accommodates long-lived, continuous funding processes over evolving sets of items,11 something tricky but not impossible to do under QF.
Conclusion
An allocation mechanism can be seen as a measurement process, with the goal being the reduction of uncertainty concerning present beliefs about the future. An effective process will gather and leverage as much information as possible while maximizing the signal-to-noise ratio of that information — aims which are often at odds.
It is crucial to recognize that this process is fundamentally speculative, and design mechanisms according to this reality. Rather than chase the goal of a single perfect round, public-goods-funding leaders should instead work to cultivate an engaged and enthusiastic community of participants, a diverse and growing ecosystem of projects, and a deep pool of committed institutional donors.
This sensibility will become increasingly important as the community works to overcome the limits of attention in the public-goods-funding ecosystem. In their retrospective of RetroPGF 3, the Optimism team observed that:
The sheer volume of applications that needed to be processed by each badgeholder was overwhelming.
while sharing this user feedback (emphasis added):
“Spray and pray is a natural reaction to cognitive overload and limited bandwidth. However, our focus shouldn't be solely on creating more efficient tools for spraying. A new feature like a CSV upload button would make the work go faster, but it still encourages us to spray. What we actually need are better ways of designing, iterating, and submitting complete voting strategies.” - Carl
As the world grapples with the detrimental effects of information overload, it may be worth thinking more boldly and radically about the ways we use and leverage attention to make decisions. Applying information-theoretic frameworks to the analysis of these systems may provide great insight.
Ultimately, the most important thing to do is to keep experimenting, and to cultivate a culture which welcomes experimentation. As Ostrom’s law states, “a resource arrangement that works in practice can work in theory.” By continuing to broaden and deepen its collective practice of Quadratic Funding, Pairwise Preference, and other allocation mechanisms, the Web3 public-goods-funding community will also advance the theoretical understanding of these mechanisms, pushing the ecosystem — and the world — forward.
See the seminal work of Yochai Benkler and more recent scholarship of Nadia Asparouhova, as well as this essay by Other Internet.
Because donors who favor a small set of pet projects are penalized, strategic donors are now incentivized to make supplemental donations at random to unrelated projects, irrespective of quality — trading one theoretical misallocation for another.
Further reading for anyone interested in the deeper mathematical history of the technique.
The intuition is similar to that of sports rankings: beating a high-ranked team impacts your position more than beating a low-ranked team. A project preferred over a highly-valued project will receive more funding than a project preferred over a lowly-valued one.
PP has less well-developed theory, and so we refrain from making confident statements about sybil-resistance, collusion-resistance, and the like until better research is available.
It is important to note that in the context of Optimism’s RetroPGF program, Pairwise was never used to produce a final allocation, but rather as an auxiliary sense-making tool or intermediate step in the pipeline. Whether increased end-to-end use of PP would exacerbate or in fact resolve these frictions remains to be seen.
Note that the more truncated the vote set, the more sensitive the results will be to the order in which the projects were presented.
This could arguably be seen as an ecosystem feature, with rounds of under-funding forcing projects to temporarily double down on core competencies (“exploit”), and rounds of over-funding allowing projects to temporarily pursue riskier innovation strategies (“explore”).
Note that his endorsement was not for a real number line, but a small number of semantically-meaningful categories.
As demonstrated by Chore Wheel’s chores system.