
Who Profits When Coffee Data Stays Scarce?

The specialty coffee industry solved price transparency. The harder problem, product metadata, remains relationship-gated. Across 35 suppliers, disclosure ranges from 1 to 13 attributes per listing. That gap is not an accident; it is an economic structure with beneficiaries.

The specialty coffee industry has made real progress on price transparency. The SCA/Emory Sustainable Coffee Buyer’s Guide publishes farmer-reported cost-of-production data. Counter Culture prints FOB prices on their bags. Azahar and others have pushed the norm toward disclosing what buyers pay and what growers receive.

This is genuinely good work. It is also solving the wrong problem first.

When I set out to build a comparison layer across 35 green coffee suppliers, price was the easy part. Every listing has a price. The hard part was everything else. Across those 35 suppliers, the number of structured product attributes per listing ranges from 1 to 13. The median is 7. Measured against the 17 attributes tracked across the dataset, the typical listing is missing more than half.

The industry built price transparency while the metadata gap grew unchecked. That gap is where the real information asymmetry lives. And unlike price, where the gap was a coordination failure, the metadata gap has beneficiaries.

The Wrong Transparency Got Solved First

Price transparency answers a specific question: is this transaction fair? That matters. But it does not answer the question that precedes every transaction: is this the right coffee to buy?

To answer that, you need product metadata. Processing method. Cultivar. Cup score. Arrival date. Farm provenance. These are the attributes that separate an informed purchase from a guess.

No standard requires sellers to disclose any of them. Each supplier decides which fields to populate, and those decisions vary wildly.

What 30 Suppliers Actually Disclose

I scraped and normalized product data from 30 live green coffee suppliers. The dataset tracks 17 unique product attributes. Here is how consistently each one appears.

Near-universal:

  • Country of origin: 93%
  • Processing method: 90%
  • Region: 77%
  • Cultivar/variety: 77%

Common but inconsistent:

  • Cupping notes: 67%
  • Grade: 63%
  • Roast recommendations: 43%

Decision-critical:

  • Farm provenance: 33%
  • Arrival date: 27%
  • Cup score: 17%

The pattern maps directly to the cost of generating the information. Country is printed on every bag. Cup scores require someone to cup the coffee and record a number. Arrival dates require tracking logistics. Farm provenance requires maintaining relationships and records that go beyond “we bought this from an exporter.”

The three attributes most relevant to purchasing decisions are the least consistently disclosed.
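
For readers who want to run the same measurement on their own data, here is a minimal Python sketch of how per-attribute disclosure rates like the ones above could be computed. It is not the pipeline behind these numbers. The record layout, field names, and sample values are all hypothetical.

```python
from collections import Counter

# Hypothetical normalized records: one dict per supplier, holding only the
# attributes that supplier populates. Values are illustrative, not real data.
suppliers = [
    {"country": "Ethiopia", "process": "Natural", "cup_score": 87},
    {"country": "Colombia", "process": "Washed"},
    {"country": "Kenya"},
    {"process": "Honey", "arrival_date": "2026-01"},
]


def field_coverage(records):
    """Return {field: share of records in which the field is populated}."""
    counts = Counter()
    for record in records:
        for field, value in record.items():
            # Treat None and empty strings as "not disclosed".
            if value not in (None, ""):
                counts[field] += 1
    total = len(records)
    return {field: n / total for field, n in counts.items()}


coverage = field_coverage(suppliers)
# Print the most widely disclosed fields first.
for field, share in sorted(coverage.items(), key=lambda kv: -kv[1]):
    print(f"{field}: {share:.0%}")
```

The rates listed above are consistent with whole supplier counts out of 30: 28 of 30 rounds to 93%, and 5 of 30 rounds to 17%.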

A Bazaar Economy in a Modern Market

Anthropologist Clifford Geertz studied Moroccan bazaars in the 1970s and described a market structure that feels uncomfortably familiar. In his analysis, information in bazaar economies is “poor, scarce, maldistributed, inefficiently communicated, and intensely valued.” Participants invest heavily in relationships precisely because relationships are the primary channel for reliable information. The market rewards the best networks, not necessarily the best products.

Green coffee distribution fits this pattern closely. The best purchasing intelligence does not live in product listings. It lives in phone calls, long-standing importer relationships, and the accumulated trust of doing business together for years. Which lots scored highest, what just landed, which farms are producing exceptional naturals this season: that information flows through relationship networks, not catalogs. If you know the right people, you operate in a fundamentally different market than the one visible on supplier websites.

The public catalog is the surface layer. The relationship network is where information density actually concentrates. Everyone outside the inner circle gets “Ethiopian Natural, $7/lb.”

The Real Estate Parallel

Real estate had a structurally similar problem, and what happened there is instructive.

For decades, the MLS (Multiple Listing Service) system and agent commission structures created what economists would call information rents: profits generated by controlling information flow rather than by adding proportional value. Levitt and Syverson’s 2008 study showed that real estate agents sold their own homes for roughly 3.7% more than client homes and kept them on the market about 9.5 days longer. Agents had information advantages and economic incentives that did not fully align with their clients’ interests.

The MLS was nominally an information-sharing tool, but in practice it controlled which data buyers could access and how. Agent positioning as the mandatory intermediary depended on being the gatekeeper to listing data. When the 2024 NAR antitrust settlement forced structural changes to commission practices, the core allegation was exactly this: the system maintained middleman positioning by controlling information flow.

Coffee importers and brokers are not running a cartel. But the structural dynamic is recognizable. When product metadata is scarce and relationship-gated, intermediaries who control information access capture margin that would otherwise flow to either producers or buyers. The information gap is not just a transparency problem. It is an economic structure that benefits those positioned between supply and demand.

Healthcare as the Cautionary Endpoint

Healthcare in the US shows where this pattern leads when it compounds unchecked.

For decades, patients could not determine the price of a procedure before receiving it, and they still largely cannot compare quality of care across providers. The information asymmetry runs in both directions simultaneously: you do not know what it costs, and you do not know if it is any good.

Coffee has partially solved the price side. The quality and metadata side remains almost completely opaque at the point of comparison. A buyer shopping across suppliers is in a structurally similar position to a patient choosing between hospitals: the listing looks like a functioning market, but the information needed to make an informed choice is missing, inconsistent, or gated behind relationships.

The difference is that healthcare’s information gap is now generating regulatory responses: the Hospital Price Transparency Rule (2021) and quality reporting mandates. Coffee has no equivalent regulatory pressure. The gap persists because nobody has made it visible at scale.

This Is the Lemons Problem

George Akerlof’s 1970 paper “The Market for ‘Lemons’” described a specific market failure: when sellers know more than buyers about product quality and disclosure is voluntary, the market gradually selects against quality.

Green coffee has a variant of this. The selection pressure acts on information density, not directly on coffee quality.

Consider two $7/lb Ethiopian naturals. One listing includes: Yirgacheffe Kochere origin, Heirloom cultivar, 87-point cup score, January 2026 arrival, Daye Bensa washing station, tasting notes of blueberry and jasmine. Eleven fields. The other listing says: “Ethiopian Natural.” Two fields. Same price.

The supplier who invested in cupping, tracking logistics, and documenting farm relationships gets no market premium for that work. Not because buyers would not value the information, but because the information is invisible at the point of comparison.

Jon Allen of Onyx Coffee Lab captured this in a Perfect Daily Grind interview: “No one really shares anything about buying green coffee… many people have really no idea what something is worth.” He was talking about price. The same sentence applies even more forcefully to product metadata.

Over time, this creates pressure against disclosure investment. Why cup 200 lots a year if nobody can see the scores in a comparison context? Why track arrival dates if the market does not reward freshness data? The lemons dynamic quietly erodes the incentive to generate information, even among suppliers who would benefit most from sharing it.

What This Actually Costs the Market

Mispriced inventory. Without arrival dates (available from only 27% of suppliers), buyers cannot distinguish fresh-crop coffee from inventory sitting in a warehouse for 18 months. In a market where freshness materially affects cup quality, two listings at the same price are often two different products.

Discovery suppression. New and small-production origins need information-rich listings to compete against established names. An unknown Ethiopian washing station cannot rely on brand recognition; it needs a cup score, a processing description, and a provenance story. The metadata gap systematically disadvantages discovery-oriented coffees.

Aggregation arbitrage. Any entity that normalizes this data gains structural pricing power over those who do not. This is the Zillow/Redfin playbook applied to a new vertical: standardize the listing, make comparison possible, let the best product win on the data. The question is whether the coffee industry builds this infrastructure itself or lets an outside platform capture the margin.

Opportunity Map

Three plays emerge from this data.

Disclosure scoring. Treat information density as a first-class supplier metric. A supplier providing 11 of 17 fields is operationally different from one providing 3. Surfacing a disclosure completeness score alongside price and cup score changes how buyers evaluate sellers. It also creates market incentive for suppliers to fill gaps, because their investment becomes visible for the first time.
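
One way a disclosure completeness score could work is sketched below in Python. Treat it as an illustration under stated assumptions, not a finished metric: the field names are hypothetical, and the tracked list holds only the ten attributes named in this post, not the full set of 17.

```python
# Assumption: a subset of the tracked attributes, using made-up field names.
TRACKED_FIELDS = (
    "country", "processing_method", "region", "cultivar", "cupping_notes",
    "grade", "roast_recommendations", "farm_provenance", "arrival_date",
    "cup_score",
)


def disclosure_score(listing, tracked=TRACKED_FIELDS):
    """Share of tracked fields that the listing actually populates (0.0 to 1.0)."""
    populated = sum(
        1 for field in tracked if listing.get(field) not in (None, "", [])
    )
    return populated / len(tracked)


# Two hypothetical listings at the same price with very different disclosure.
rich = {
    "country": "Ethiopia",
    "processing_method": "Natural",
    "region": "Yirgacheffe",
    "cultivar": "Heirloom",
    "cupping_notes": ["blueberry", "jasmine"],
    "farm_provenance": "Daye Bensa washing station",
    "arrival_date": "2026-01",
    "cup_score": 87,
}
sparse = {"country": "Ethiopia", "processing_method": "Natural"}

print(f"rich:   {disclosure_score(rich):.0%}")
print(f"sparse: {disclosure_score(sparse):.0%}")
```

An unweighted count is the simplest choice. A real score might weight the rarely disclosed fields (cup score, arrival date, farm provenance) more heavily, but any weighting would be a judgment call.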

Freshness as a filterable dimension. Arrival date is the sleeper field. Only 27% of suppliers provide it, but it directly affects cup quality. Any sourcing tool that can surface arrival freshness immediately differentiates itself from the catalog status quo.
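
A freshness filter can be sketched in a few lines of Python. The 12-month cutoff, the field name `arrival_date`, and the sample listings are assumptions made for illustration. The point of the sketch is that listings without an arrival date go into their own bucket and are not silently treated as fresh.

```python
from datetime import date


def months_since(arrival, today):
    """Whole calendar months between the arrival date and today."""
    return (today.year - arrival.year) * 12 + (today.month - arrival.month)


def split_by_freshness(listings, today, max_months=12):
    """Split listings into (fresh, stale, unknown) by arrival date.

    max_months is an arbitrary illustrative cutoff, not an industry standard.
    """
    fresh, stale, unknown = [], [], []
    for listing in listings:
        arrival = listing.get("arrival_date")
        if arrival is None:
            unknown.append(listing)  # no date disclosed: cannot be ranked
        elif months_since(arrival, today) <= max_months:
            fresh.append(listing)
        else:
            stale.append(listing)
    return fresh, stale, unknown


# Hypothetical listings; names and dates are made up.
listings = [
    {"name": "Lot A", "arrival_date": date(2026, 1, 15)},
    {"name": "Lot B", "arrival_date": date(2024, 12, 1)},
    {"name": "Lot C"},
]

fresh, stale, unknown = split_by_freshness(listings, today=date(2026, 6, 1))
print([l["name"] for l in fresh], [l["name"] for l in stale],
      [l["name"] for l in unknown])
```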

Aggregation as infrastructure. Individual supplier catalogs will never solve this. The value comes from normalization across sellers, which makes the gap visible and comparison possible. Suppliers with strong disclosure benefit most because their investment finally becomes legible. This is not competitive with suppliers. It is infrastructure that raises the floor for everyone.
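
Normalization is the unglamorous core of that infrastructure. Below is a minimal Python sketch of mapping supplier-specific field names onto one shared schema. The alias table and the raw records are hypothetical, and a real catalog would also need value-level cleanup (units, date formats, spelling variants) that this sketch leaves out.

```python
# Hypothetical alias table: supplier-specific labels -> canonical field names.
FIELD_ALIASES = {
    "origin": "country",
    "country of origin": "country",
    "process": "processing_method",
    "processing": "processing_method",
    "variety": "cultivar",
    "varietal": "cultivar",
    "score": "cup_score",
    "sca score": "cup_score",
}


def normalize_listing(raw):
    """Map one raw supplier record onto canonical field names.

    Empty values are dropped so that a blank field does not count as disclosed.
    """
    normalized = {}
    for key, value in raw.items():
        label = key.strip().lower()
        canonical = FIELD_ALIASES.get(label, label)
        if isinstance(value, str):
            value = value.strip()
        if value in (None, ""):
            continue
        # Keep the first populated value if two aliases collide.
        normalized.setdefault(canonical, value)
    return normalized


# Two made-up suppliers describing similar coffees with different labels.
supplier_a = {"Origin": "Ethiopia", "Process": " Natural ", "Score": 87}
supplier_b = {"country": "Ethiopia", "Varietal": "Heirloom", "processing": ""}

print(normalize_listing(supplier_a))
print(normalize_listing(supplier_b))
```

Once every record shares the same keys, coverage rates, disclosure scores, and freshness filters can all be computed across suppliers in one pass.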

The Reframe

Before you can fix information asymmetry, you have to make the asymmetry visible.

Green coffee has a functioning price layer and a broken metadata layer. The relationship networks that actually drive quality sourcing are invisible to anyone outside the inner circle. The public-facing catalog infrastructure rewards omission and penalizes disclosure investment.

The first step is not better data. It is showing buyers how much data they are missing, and showing suppliers that their investment in transparency has market value.

Price was the first transparency problem the industry chose to solve. Metadata is the harder one, and the one that actually determines whether a buyer is making an informed decision or rolling the dice.
