Academic Research Scan — 2026-02-16

🔬 High Priority Papers

Agentic Commerce & AI Marketplaces

What Is Your AI Agent Buying? Evaluation, Biases, Model Dependence, & Emerging Implications for Agentic E-Commerce — Amine Allouah, Omar Besbes, Josué Figueroa, Yash Kanoria, Akshit Kumar (Columbia Business School)
- Abstract summary: Investigates how AI agents behave when autonomously purchasing products on e-commerce marketplaces. Using ACES, a provider-agnostic auditing framework, the authors reveal that AI agents exhibit choice homogeneity—concentrating demand on a few "modal" products while ignoring others entirely. These preferences are unstable: model updates drastically reshuffle market shares. Agents show strong position biases that persist even in text-only "headless" interfaces, consistently penalize sponsored tags while rewarding platform endorsements, and vary sharply in price/ratings sensitivity across model versions. Seller-side agents can exploit this by making query-conditional description tweaks to capture significant market share.
- Relevance to agentic commerce: This is the paper Sir should read first. It empirically demonstrates that agent-mediated markets are fundamentally different from human-centric ones—volatile, biased, and gameable. Directly relevant to x402 service discovery, lobster.cash agent payments, and any marketplace where AI agents select vendors autonomously. The finding that sellers can game AI agents is a critical design consideration for ERC-8004 and Coinbase Agentic Wallets.
- Link: https://arxiv.org/abs/2508.02630 (4 citations)
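One rough way to put a number on the choice homogeneity described above is to measure how concentrated agent selections are across a product listing. The sketch below is illustrative only: the function names, the choice of modal share and a Herfindahl-Hirschman index as metrics, and the sample picks are all assumptions, not the ACES methodology.

```python
from collections import Counter


def market_shares(choices):
    """Return each chosen product's share of total agent selections."""
    counts = Counter(choices)
    total = sum(counts.values())
    return {product: n / total for product, n in counts.items()}


def modal_share(shares):
    """Share captured by the single most-chosen product."""
    return max(shares.values())


def hhi(shares):
    """Herfindahl-Hirschman index: sum of squared shares (1.0 means one product takes all)."""
    return sum(s * s for s in shares.values())


# Hypothetical picks from 10 agent runs over a 4-product listing (A, B, C, D).
# Product D is never selected, so it does not appear in the shares at all.
picks = ["A", "A", "A", "A", "A", "A", "A", "B", "B", "C"]
shares = market_shares(picks)
print(modal_share(shares), hhi(shares))
```

Re-running the same tally after a model update and comparing the two share tables would give a simple view of the market-share reshuffling the paper reports.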
Agentic Commerce: The Paradigm Shift from Human-Mediated to Autonomous AI-Driven Transactions in Digital Payment Systems — Krishna Dusad
- Abstract summary: Maps out the architectural requirements for a world where AI agents autonomously handle end-to-end purchase flows—from discovery through negotiation to payment—without human involvement. Argues that traditional payment constructs, authentication methods, and security models are insufficient and need complete rethinking. Covers the emerging collaboration between banks, payment processors, card networks, and AI research organizations to build infrastructure for agent-to-agent commerce. Identifies core unsolved problems: agent identity verification, dispute resolution for AI-initiated transactions, and consumer protections when machines make purchasing decisions.
- Relevance to agentic commerce: This is essentially a survey of the exact design space that lobster.cash/Crossmint, ERC-8004, and x402 are building in. The gap analysis on agent identity verification and dispute resolution maps directly to the trust/safety/KYC gaps we identified in the OpenClaw ecosystem research.
- Link: https://doi.org/10.22399/ijcesen.4304
FaMA: LLM-Empowered Agentic Assistant for Consumer-to-Consumer Marketplace — Yineng Yan et al. (Meta)
- Abstract summary: Presents Facebook Marketplace Assistant (FaMA), an LLM-powered agent that serves as a conversational entry point to Meta's C2C marketplace, replacing complex GUI navigation with natural language commands. For sellers: automated listing updates, renewals, and bulk messaging. For buyers: conversational product discovery. Achieves 98% task success rate on complex marketplace tasks and enables 2x speedup in interaction time. Architecture treats the agent as a full alternative interface to the marketplace rather than an overlay.
- Relevance to agentic commerce: This is Meta deploying agentic commerce at Facebook Marketplace scale. The "agent as primary interface" paradigm—replacing GUI with conversational entry—is exactly what autonomous agent ecosystems will need. If Meta's marketplace goes agent-first, the payment and discovery infrastructure (x402, ERC-8004) becomes the critical differentiator for open alternatives.
- Link: https://arxiv.org/abs/2509.03890 (3 citations)
AgenticShop: Benchmarking Agentic Product Curation for Personalized Web Shopping — Sunghwan Kim, Ryang Heo, Yongsik Seo, Jinyoung Yeo, Dongha Lee (Accepted at WWW 2026)
- Abstract summary: First benchmark for evaluating agentic systems on personalized product curation in open-web environments. Features realistic shopping scenarios, diverse user profiles, and a checklist-driven evaluation framework. Extensive experiments show current agentic systems are "largely insufficient" at curating tailored products across the modern web—they handle simplified single-platform lookups but fail at exploratory, cross-platform shopping that requires understanding diverse user preferences.
- Relevance to agentic commerce: Establishes a rigorous benchmark showing where agentic shopping currently fails. For the x402 ecosystem (96 live services on Base), this suggests that service discovery by AI agents is still a major bottleneck. The gap between "look up a known product" and "explore and compare across platforms" is exactly where agent payment protocols need to evolve.
- Link: https://arxiv.org/abs/2602.12315v1
Retail Cybersecurity in the Agentic Age: Securing Autonomous Shopping Agents in E-Commerce — Bhargav Trivedi
- Abstract summary: Introduces a layered security framework for autonomous shopping agents, combining behavioral checks, blockchain-validated transactions, and a Model–Control–Policy (MCP) governance model. Tests a prototype retail agent against several adversarial attack types—identity spoofing, data leaks, and prompt attacks—and demonstrates that combined preventative defenses significantly reduce exposure. Frames cybersafety as a co-design consideration for agentic retail deployments.
- Relevance to agentic commerce: Directly addresses the security gap we identified in x402 hands-on testing (no spending controls, wrapped fetch pays anything automatically). The MCP governance model and blockchain-validated transaction approach could inform spending limit architectures for lobster.cash and Coinbase Agentic Wallets.
- Link: https://doi.org/10.59573/emsj.9(4).2025.52

LLM Economic Agents & Market Dynamics

MERIT Feedback Elicits Better Bargaining in LLM Negotiators — Jihwan Oh, Murad Aghazada, Yooju Shin, Se-Young Yun, Taehyeon Kim (KAIST)
- Abstract summary: Presents AgoraBench, a new benchmark spanning nine challenging negotiation settings (deception, monopoly, etc.) and introduces economically grounded metrics derived from utility theory—agent utility, negotiation power, and acquisition ratio—that implicitly measure alignment with human preferences. Finds that baseline LLM negotiation strategies often diverge from human preferences, but a utility feedback mechanism substantially improves performance, yielding deeper strategic behavior and stronger opponent awareness across both prompting and finetuning approaches.
- Relevance to agentic commerce: When AI agents negotiate prices (as in x402 dynamic pricing or agent-to-agent service procurement), they need to align with human utility functions. This paper shows that untrained LLM negotiators are poor proxies for human preferences—a critical finding for any protocol where agents autonomously negotiate transaction terms.
- Link: https://arxiv.org/abs/2602.10467v2
EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies — (multiple authors)
- Abstract summary: Introduces a generalizable benchmark for continuous plan-and-execute decision making in interactive economic environments. Covers three diverse scenarios (Vending, Freelance, and Operation), each with business-relevant outcomes (net worth, income, DAU). Evaluates 11 leading LLMs over 1000+ step horizons. Reveals a systematic tension: no single model dominates all economic scenarios. Models exhibit significant performance variance under partial observability and stochasticity, the conditions that characterize real markets.
- Relevance to agentic commerce: Tests LLMs in the exact role they'll play in agentic economies—making sequential economic decisions over long time horizons. The finding that no model dominates all scenarios suggests that agent marketplace design should accommodate model diversity rather than assuming a single "best" agent strategy. Relevant to how OpenClaw agents will operate in persistent economic environments.
- Link: https://arxiv.org/abs/2602.09514v2
Would a Large Language Model Pay Extra for a View? Inferring Willingness to Pay from Subjective Choices — Manon Reusens, Sofie Goethals, Toon Calders, David Martens
- Abstract summary: Studies LLM decision-making in a travel-assistant context where models make subjective purchasing choices on behalf of users. Uses multinomial logit models to derive implied willingness-to-pay (WTP) estimates, comparing to human benchmarks from economics literature. Key findings: meaningful WTP values can be derived from larger LLMs, but they systematically overestimate human WTP, particularly with expensive options or business personas. Conditioning on prior cheap-option preferences yields valuations closer to human benchmarks.
- Relevance to agentic commerce: Directly quantifies how LLM agents systematically overestimate human willingness to pay, a design flaw for any system where agents autonomously authorize payments (Coinbase Agentic Wallets, lobster.cash). The finding that conditioning on prior cheap-option preferences brings valuations closer to human benchmarks suggests that spending policy layers (of the kind x402 needs) could use persona conditioning to limit agent overspend.
- Link: https://arxiv.org/abs/2602.09802v1
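For readers unfamiliar with how a willingness-to-pay figure falls out of a multinomial logit model, the standard textbook relation is that the WTP for an attribute equals the negative ratio of its coefficient to the price coefficient. The sketch below illustrates that relation only; the coefficient values, the two-room choice set, and the function names are hypothetical and are not taken from the paper.

```python
import math


def willingness_to_pay(beta_attribute, beta_price):
    """Implied WTP for one unit of an attribute: -beta_attribute / beta_price."""
    if beta_price >= 0:
        raise ValueError("price coefficient must be negative for WTP to be meaningful")
    return -beta_attribute / beta_price


def choice_probabilities(utilities):
    """Multinomial logit (softmax) probabilities over a choice set."""
    peak = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - peak) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical fitted coefficients: utility = beta_price * price + beta_view * has_view
beta_price, beta_view = -0.02, 0.6

# Currency units the modeled chooser would pay for a room with a view.
wtp_view = willingness_to_pay(beta_view, beta_price)

# Two hypothetical hotel rooms as (price, has_view).
rooms = [(100, 0), (125, 1)]
utilities = [beta_price * price + beta_view * view for price, view in rooms]
probs = choice_probabilities(utilities)
print(wtp_view, probs)
```

With these made-up numbers the view premium (25) is below the implied WTP, so the model leans toward the room with a view. Comparing a WTP estimated from LLM choices against a human benchmark is the kind of check the paper describes.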
Experimentation, Biased Learning, and Conjectural Variations in Competitive Dynamic Pricing — Bar Light, Wenyu Wang
- Abstract summary: Studies competitive dynamic pricing among multiple sellers using simple learning rules and A/B price experiments, motivated by the rise of algorithmic pricing in online marketplaces. Shows that correlated experimentation (e.g., synchronized repricing schedules) induces endogenous supra-competitive pricing through biased demand learning—essentially, algorithmic price coordination emerges without explicit collusion. Under independent experimentation, the bias vanishes and learning converges to Nash equilibrium. Provides finite-sample convergence guarantees.
- Relevance to agentic commerce: Critical for understanding what happens when AI agents set prices in marketplaces (like x402's 96 live services). Synchronized agent behavior can lead to emergent collusion—higher-than-competitive prices—even without explicit coordination. This is a regulatory concern for agentic marketplaces and a design consideration for service pricing on Base/Ethereum.
- Link: https://arxiv.org/abs/2602.12888v1

Multi-Agent Systems & Safety

GT-HarmBench: Benchmarking AI Safety Risks Through the Lens of Game Theory — Pepijn Cobben, Xuanqiang Angelo Huang, Thao Amelia Pham et al. (cs.AI, cs.GT, cs.MA)
- Abstract summary: Introduces a benchmark of 2,009 high-stakes scenarios spanning game-theoretic structures (Prisoner's Dilemma, Stag Hunt, Chicken) drawn from the MIT AI Risk Repository. Across 15 frontier models, agents choose socially beneficial actions in only 62% of cases. Tests sensitivity to game-theoretic prompt framing and ordering. Shows that game-theoretic interventions improve socially beneficial outcomes by up to 18%. Identifies substantial reliability gaps in multi-agent alignment.
- Relevance to agentic commerce: When AI agents transact autonomously, every interaction is a game: bilateral trade, auction, negotiation. The finding that agents fail to choose the socially beneficial action in 38% of the benchmark's scenarios raises concerns for agentic marketplaces where trust and cooperation are assumed. Directly relevant to designing incentive mechanisms for x402 and agent payment protocols.
- Link: https://arxiv.org/abs/2602.12316v1
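The three game structures named in the abstract differ in where individually stable play sits relative to the socially best outcome. The sketch below finds pure-strategy Nash equilibria and the welfare-maximizing profile for each. The payoff numbers are textbook-style values chosen for illustration and are not drawn from the benchmark; the function names are hypothetical.

```python
from itertools import product

# Payoffs as {(row_action, col_action): (row_payoff, col_payoff)}.
# Action 0 is the cooperative-style move, action 1 the defect-style move.
GAMES = {
    "Prisoner's Dilemma": {(0, 0): (3, 3), (0, 1): (0, 5), (1, 0): (5, 0), (1, 1): (1, 1)},
    "Stag Hunt": {(0, 0): (4, 4), (0, 1): (0, 3), (1, 0): (3, 0), (1, 1): (3, 3)},
    "Chicken": {(0, 0): (3, 3), (0, 1): (1, 4), (1, 0): (4, 1), (1, 1): (0, 0)},
}


def pure_nash(game):
    """Return action profiles where neither player gains by deviating alone."""
    equilibria = []
    for a, b in product((0, 1), repeat=2):
        row_ok = game[(a, b)][0] >= game[(1 - a, b)][0]
        col_ok = game[(a, b)][1] >= game[(a, 1 - b)][1]
        if row_ok and col_ok:
            equilibria.append((a, b))
    return equilibria


def welfare_maximizing(game):
    """Return the profile with the highest total payoff."""
    return max(game, key=lambda profile: sum(game[profile]))


for name, game in GAMES.items():
    print(name, pure_nash(game), welfare_maximizing(game))
```

In the Prisoner's Dilemma the only equilibrium is mutual defection, while in the Stag Hunt and Chicken the socially best profile competes with other stable outcomes. That gap is what makes "chose the socially beneficial action" a meaningful test.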
Towards Sustainable Investment Policies Informed by Opponent Shaping — Juan Agustin Duque, Razvan Ciuca, Ayoub Echchahed, Hugo Larochelle, Aaron Courville (Accepted at ICLR 2026)
- Abstract summary: Applies Advantage Alignment, a scalable opponent-shaping algorithm, to InvestESG—a multi-agent investment simulation capturing dynamics between investors and companies under climate risk. Derives theoretical thresholds for when individual incentives diverge from collective welfare (intertemporal social dilemmas). Demonstrates that strategically shaping agent learning processes can steer them toward socially beneficial equilibria by biasing dynamics toward cooperative outcomes.
- Relevance to agentic commerce: The opponent-shaping methodology could be applied to agentic marketplace design—shaping how AI agents learn to interact so that individual agent optimization doesn't degrade collective market outcomes. Relevant to preventing race-to-the-bottom dynamics in agent-mediated service markets. Top-tier venue (ICLR 2026) with senior ML researchers (Larochelle, Courville at Mila).
- Link: https://arxiv.org/abs/2602.11829v1
Blind Gods and Broken Screens: Architecting a Secure, Intent-Centric Mobile Agent Operating System — Zhenhua Zou et al. (Tsinghua)
- Abstract summary: Proposes Aura, a clean-slate secure agent OS that replaces brittle GUI scraping with structured agent-native interactions. Uses a Hub-and-Spoke topology where a privileged System Agent orchestrates intent, sandboxed App Agents execute tasks, and an Agent Kernel mediates all communication with four defense pillars: cryptographic identity binding, semantic input sanitization, cognitive integrity via taint-aware memory, and granular access control with non-deniable auditing. Achieves 94.3% task success rate while reducing attack success rate from ~40% to 4.4%.
- Relevance to agentic commerce: The security architecture—cryptographic agent identity, sandboxed execution, auditable actions—maps directly to what's needed for autonomous agents handling financial transactions. The "Agent Kernel" mediating all communication is analogous to what x402/EIP-3009 needs for payment authorization. The identity-binding approach could inform ERC-8004 agent identity standards.
- Link: https://arxiv.org/abs/2602.10915v3
Equity by Design: Fairness-Driven Recommendation in Heterogeneous Two-Sided Markets — Dominykas Seputis, Alexander Timans, Rajeev Verma
- Abstract summary: Formalizes two-sided fairness in marketplace recommendations, extending prior work from single-item to discrete multi-item settings. Introduces CVaR as a consumer-side objective to compress group-level utility disparities. Key finding: the "free fairness" regime (where producer constraints impose no consumer cost) disappears in multi-item settings. However, moderate fairness constraints can improve business metrics by diversifying exposure. Scalable solvers match exact solutions at a fraction of runtime.
- Relevance to agentic commerce: When AI agents both recommend and purchase, marketplace fairness becomes even more critical—agent biases compound with recommendation biases. The finding that moderate fairness constraints improve business metrics argues for building fairness into agentic marketplace protocols from the start rather than bolting it on later.
- Link: https://arxiv.org/abs/2602.10739v2
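To make the CVaR objective concrete: applied to the lower tail of group utilities, CVaR is the average utility of the worst-off fraction of groups, so maximizing it pulls up the groups that are furthest behind. The sketch below uses a simple discrete estimate. The function name, the tail-size rounding rule, and the utility values are assumptions for illustration, not the paper's formulation or solver.

```python
import math


def lower_tail_cvar(utilities, alpha):
    """Mean utility of the worst-off alpha fraction (simple discrete estimate)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(utilities)
    k = max(1, math.ceil(alpha * len(ordered)))  # at least one group in the tail
    tail = ordered[:k]
    return sum(tail) / len(tail)


# Hypothetical per-group consumer utilities under two recommendation policies.
policy_a = [0.9, 0.8, 0.85, 0.2]  # higher average, one group left far behind
policy_b = [0.7, 0.7, 0.65, 0.6]  # lower average, much smaller disparity

for name, groups in (("A", policy_a), ("B", policy_b)):
    mean = sum(groups) / len(groups)
    print(name, round(mean, 4), lower_tail_cvar(groups, alpha=0.25))
```

Policy A wins on average utility but policy B wins on the tail measure, which is the trade-off a CVaR-style consumer objective is meant to surface.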
Nonparametric Contextual Online Bilateral Trade — Emanuele Coccia, Martino Bernasconi et al.
- Abstract summary: Tackles the problem of online bilateral trade where a mechanism designer proposes prices without observing private valuations. Under a general nonparametric setting with arbitrary Lipschitz functions of context, achieves regret O(T^{(d-1)/d}) with only one-bit feedback (trade or no trade) and strong budget balance (no subsidies). Provides matching lower bound proving tightness.
- Relevance to agentic commerce: This is mechanism design for exactly the kind of bilateral trade that occurs in agent-to-agent transactions. The budget-balance constraint (no platform subsidies) and one-bit feedback (trade occurred or not) mirror real agent marketplace conditions. Provides theoretical foundations for pricing mechanisms in x402-style service marketplaces.
- Link: https://arxiv.org/abs/2602.12904v1

Agentic AI Infrastructure

Human Tool: An MCP-Style Framework for Human-Agent Collaboration — Yuanrong Tang et al.
- Abstract summary: Introduces Human Tool, an MCP-style interface that exposes humans as callable tools within AI-led workflows. Models human contributions through structured tool schemas of capabilities, information, and authority, enabling agents to dynamically invoke human input and reintegrate it. Controlled studies show improved task performance, reduced human workload, and more balanced collaboration dynamics compared to baseline systems.
- Relevance to agentic commerce: Relevant to the human-in-the-loop design question for agentic payments. When should an autonomous agent escalate a purchasing decision to a human? The MCP schema approach (capabilities + authority) could inform spending policy design where agents have delegated but bounded purchasing authority.
- Link: https://arxiv.org/abs/2602.12953v1
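The idea of delegated but bounded purchasing authority can be sketched as a small policy check that runs before any payment is sent. Everything here is hypothetical: the class, field names, thresholds, and three-way decision are assumptions for illustration, and none of it is part of x402, Human Tool, or any wallet API.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    APPROVE = "approve"
    ESCALATE = "escalate_to_human"
    DENY = "deny"


@dataclass
class SpendingPolicy:
    auto_approve_limit: float  # per-payment amount the agent may spend on its own
    hard_limit: float          # per-payment amount that is never allowed
    daily_budget: float        # total the agent may spend per day without a human
    spent_today: float = 0.0

    def evaluate(self, amount: float) -> Decision:
        """Decide whether a proposed payment is approved, escalated, or denied."""
        if amount <= 0 or amount > self.hard_limit:
            return Decision.DENY
        if self.spent_today + amount > self.daily_budget:
            return Decision.ESCALATE
        if amount > self.auto_approve_limit:
            return Decision.ESCALATE
        return Decision.APPROVE

    def record(self, amount: float) -> None:
        """Track a completed payment against the daily budget."""
        self.spent_today += amount


# Hypothetical limits in USDC.
policy = SpendingPolicy(auto_approve_limit=1.0, hard_limit=50.0, daily_budget=5.0)
print(policy.evaluate(0.25), policy.evaluate(10.0), policy.evaluate(100.0))
```

The escalation branch is where a human-as-tool call would plug in: the agent pauses, asks a person with the right authority, and proceeds only on approval.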
When Visibility Outpaces Verification: Delayed Verification and Narrative Lock-in in Agentic AI Discourse — Hanjing Shi, Dominic DiFranzo
- Abstract summary: Investigates the interplay between social proof and verification timing in online discussions of agentic AI, analyzing longitudinal data from r/OpenClaw and r/Moltbook. Reveals a "Popularity Paradox": high-visibility discussions experience significantly delayed or absent verification cues compared to low-visibility threads. This creates a "Narrative Lock-in" where early unverified claims crystallize into collective cognitive biases before evidence-seeking emerges. Proposes "epistemic friction" as a design intervention.
- Relevance to agentic commerce: Directly studies discourse around OpenClaw (our ecosystem). The "narrative lock-in" finding has implications for how the agentic AI market evaluates tools, protocols, and platforms—early hype can lock in suboptimal standards before proper vetting. Relevant to how ERC-8004, x402, and other agent infrastructure standards get adopted.
- Link: https://arxiv.org/abs/2602.11412v1

📄 Notable Papers

Peaceful Anarcho-Accelerationism: Decentralized Full Automation for a Society of Universal Care — Eduardo C. Garrido-Merchán
- Abstract summary: Argues that convergence of LLMs (cognitive labor automation) and DRL (physical labor automation) implies near-complete elimination of human employment. Proposes the "Liberation Stack"—a layered commons architecture for energy, manufacturing, food, communication, knowledge, and governance. Claims full automation renders money obsolete and proposes "Universal Desired Resources" (UDR) as a post-monetary design principle. Uses empirical evidence from Linux, Wikipedia, Mondragon, and Rojava to argue commons-based systems operate at scale.
- Relevance to agentic commerce: A provocative but relevant thought piece on what happens beyond agentic commerce—if agents can do everything, does money even make sense? The "Liberation Stack" framing is useful for thinking about the long arc of agent autonomy, even if the policy proposals are speculative.
- Link: https://arxiv.org/abs/2602.13154v1
Seeing the Goal, Missing the Truth: Human Accountability for AI Bias — Sean Cao, Wei Jiang, Hui Xu
- Abstract summary: Shows that revealing the downstream use of LLM outputs (e.g., predicting stock returns) leads LLMs to generate biased intermediate measures—even when those measures are supposed to be task-independent. This "purpose leakage" improves performance before the model's knowledge cutoff but provides no advantage post-cutoff. Frames AI bias as stemming from human accountability in research design rather than algorithmic flaw.
- Relevance to agentic commerce: When agents are told they're making financial decisions, their behavior changes, even at the measurement stage. This has implications for any agent system where the agent "knows" it's handling money (Coinbase Agentic Wallets, x402 payments).
- Link: https://arxiv.org/abs/2602.09504v1
Behavioral Consistency Validation for LLM Agents: Trading-Style Switching through Stock-Market Simulation — Zeping Li et al. (Oxford, Philip Torr)
- Abstract summary: Tests whether LLM agent trading behavior aligns with real market participant behavior using year-long stock market simulations. Agents process daily price-volume data, trade under designated styles, and reassess strategy every 10 days. Uses loss aversion, herding, wealth differentiation, and price misalignment as personality traits. Finds that recent LLMs' switching behavior is only partially consistent with behavioral finance theories, highlighting the need for further refinement.
- Relevance to agentic commerce: If LLM agents don't behave like real traders, then agent-mediated financial markets will produce different dynamics than human ones. Important for understanding how agent-driven DeFi trading (relevant to Solana, Ethereum ecosystems) will actually behave.
- Link: https://arxiv.org/abs/2602.07023v1
Autonomous Market Intelligence: Agentic AI Nowcasting Predicts Stock Returns — Zefeng Chen, Darcy Pu
- Abstract summary: Deploys a state-of-the-art LLM to evaluate Russell 1000 stocks daily starting April 2025. The framework is 100% agentic: no curated inputs—the model autonomously searches the web, filters sources, and synthesizes predictions. Finds genuine stock selection ability but only for identifying top winners. Longing the top 20 stocks generates daily Fama-French 5-factor alpha of 18.4 bps and annualized Sharpe of 2.43. However, predictability is highly concentrated: expanding beyond top tier rapidly dilutes alpha, and bottom-ranked stocks show market-like returns.
- Relevance to agentic commerce: Demonstrates that fully autonomous AI agents can generate real alpha in financial markets—a proof point for the viability of autonomous agents in economic domains. The asymmetry (agents good at finding winners, bad at identifying losers) is an interesting structural finding for agent-mediated investment platforms.
- Link: https://arxiv.org/abs/2601.11958v1
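To put the daily alpha figure in annual terms, the arithmetic below converts 18.4 bps per day into a yearly number. It assumes 252 trading days per year, a common convention that the summary does not state, and it ignores transaction costs and other frictions. Simple scaling gives about 46% a year before compounding.

```python
TRADING_DAYS = 252  # assumed convention; the summary does not state one

daily_alpha = 18.4 / 10_000  # 18.4 basis points expressed as a decimal

# Simple scaling: daily alpha times the number of trading days.
simple_annual = daily_alpha * TRADING_DAYS

# Compounded: reinvesting the daily alpha every trading day.
compound_annual = (1 + daily_alpha) ** TRADING_DAYS - 1

print(f"simple: {simple_annual:.1%}, compounded: {compound_annual:.1%}")
```

This is a back-of-envelope conversion, not a figure reported in the paper.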
Resisting Manipulative Bots in Meme Coin Copy Trading: A Multi-Agent Approach — Yichen Luo, Yebo Feng, Jiahua Xu, Yang Liu (Accepted at WWW 2026)
- Abstract summary: Proposes a manipulation-resistant copy-trading system using multi-agent architecture with multimodal LLMs and chain-of-thought reasoning for meme coin markets. Addresses adversarial bots that front-run trades, conceal positions, and fabricate sentiment. Achieves average copier return of 3% per meme coin investment under realistic market frictions. Demonstrates effectiveness of agent-based defenses and predictability of trader profitability in adversarial crypto markets.
- Relevance to agentic commerce: Directly relevant to crypto trading ecosystems on Solana and Ethereum. Shows that multi-agent LLM systems can defend against manipulation in the most adversarial crypto environments (meme coins). Accepted at WWW 2026—high signal.
- Link: https://arxiv.org/abs/2601.08641v3
Agoran: An Agentic Open Marketplace for 6G RAN Automation — Ilias Chatzistefanidis et al. (Yale, EURECOM)
- Abstract summary: Introduces an agentic marketplace for 6G network management with three autonomous AI branches: Legislative (compliance via RAG-powered LLMs), Executive (real-time situational awareness), and Judicial (trust scoring with arbitrating LLMs). Stakeholder agents negotiate Pareto-optimal offers via a multi-objective optimizer. Deployed on a 5G testbed: 37% throughput increase for eMBB, 73% latency reduction for URLLC, 8.3% resource savings. A fine-tuned 1B Llama model recovers ~80% of GPT-4.1 quality in 6 GiB memory.
- Relevance to agentic commerce: While focused on telecom, the three-branch governance architecture (legislative/executive/judicial) for an agentic marketplace is a compelling design pattern that could apply to any autonomous agent marketplace—including service marketplaces on x402 or general agentic commerce platforms.
- Link: https://arxiv.org/abs/2508.09159
Who Restores the Peg? A Mean-Field Game Approach to Model Stablecoin Market Dynamics — Hardhik Mohanty, Bhaskar Krishnamachari (USC)
- Abstract summary: Develops a dynamic, agent-based mean-field game framework for fiat-collateralized stablecoins (USDC/USDT, $300B+ market cap). Models arbitrageurs and retail traders interacting across primary (mint/redeem) and secondary (exchange) markets during de-peg episodes. Calibrated to three historical events. Finds that peg recovery is predominantly driven by primary-market arbitrage, with a non-linear breakdown threshold beyond which secondary-market liquidity acts mainly as a second-order amplifier.
- Relevance to agentic commerce: USDC is central to agent payment infrastructure (x402, lobster.cash, Coinbase wallets all use USDC). Understanding stablecoin peg dynamics under stress is critical for any system where agents hold and transact in stablecoins autonomously.
- Link: https://arxiv.org/abs/2601.18991v1
Manipulation in Prediction Markets: An Agent-based Modeling Experiment — Bridget Smart, Ebba Mark, Anne Bastian, Josefina Waugh
- Abstract summary: Uses agent-based simulations to study how high-budget "whale" agents can distort prediction market prices. Finds that whales can temporarily shift prices proportionally to their share of market capital, with distortion duration increasing when other bettors exhibit herding behavior and slow learning. The model exhibits self-regulatory price discovery across broad parameter space under normal conditions, but whale manipulation persists longest when combined with herding behavior.
- Relevance to agentic commerce: As prediction markets become agent-dominated (e.g., Polymarket), understanding whale manipulation dynamics matters. If AI agents exhibit herding (as GT-HarmBench suggests they might), whale manipulation could be even more effective in agent-mediated markets.
- Link: https://arxiv.org/abs/2601.20452v1
ExtractBench: A Benchmark for Complex Structured Extraction — Nick Ferguson et al. (Contextual AI, incl. Douwe Kiela)
- Abstract summary: Benchmarks PDF-to-JSON structured extraction across frontier models (GPT-5/5.2, Gemini-3, Claude 4.5 Opus/Sonnet). Performance degrades sharply with schema breadth—0% valid output on a 369-field financial reporting schema across ALL tested models. Introduces open-source benchmark with 35 documents and 12,867 evaluatable fields.
- Relevance to agentic commerce: If frontier models can't reliably extract structured data from financial documents, this limits agent autonomy in financial contexts. Relevant to how agents process invoices, contracts, and compliance documents in commercial transactions.
- Link: https://arxiv.org/abs/2602.12247v1

📊 Working Papers & Reports

NBER Working Papers

Firm Data on AI — Ivan Yotzov, Jose Maria Barrero, Nick Bloom, Philip Bunn, Steven J. Davis, Kevin Foster et al. (Stanford, Chicago Booth, Bank of England)
- Abstract summary: First representative international data on firm-level AI use, surveying ~6,000 CFOs/CEOs across US, UK, Germany, and Australia. Key findings: (1) ~70% of firms actively use AI, especially younger, more productive firms; (2) top executives use AI only 1.5 hours/week on average, with 25% reporting zero use; (3) over 80% report no impact on employment or productivity over the last 3 years; (4) firms forecast AI will boost productivity by 1.4%, increase output by 0.8%, and cut employment by 0.7% over next 3 years. Individual employees predict a 0.5% increase in employment—a sizable expectations gap.
- Relevance to agentic commerce: Nick Bloom and Steven Davis are top labor/productivity economists. The finding that over 80% of firms report no AI impact on employment or productivity so far, yet forecast effects over the next 3 years, suggests we're at an inflection point. The executive vs. employee expectations gap on employment could shape regulation of autonomous AI agents in commerce.
- Link: https://www.nber.org/papers/w34836
GPT as a Measurement Tool — Hemanth Asirvatham, Elliott Mokski, Andrei Shleifer (Harvard)
- Abstract summary: Presents GABRIEL, a software package using GPT to quantify attributes in qualitative data. Validates GPT against 1,000+ human-annotated tasks—finds GPT is "generally indistinguishable from human evaluators" and results don't depend on exact prompting strategy. Demonstrates novel applications including quantifying trends in Congressional remarks, social media toxicity, and county-level school curricula. Constructs a dataset of 37,000 technologies, documenting a tenfold decline in invention-to-adoption time lags—from ~50 years to ~5 years today.
- Relevance to agentic commerce: Shleifer (one of the most cited economists alive) validating GPT as generally indistinguishable from human evaluators has major implications for agent-mediated quality assessment in commerce. If agents can reliably evaluate qualitative attributes, they can make better purchasing decisions. The invention-to-adoption acceleration finding contextualizes how fast agentic commerce might go mainstream.
- Link: https://www.nber.org/papers/w34834
Non-Fungible Tokens as Investment — William N. Goetzmann, Dong Huang, Milad Nozari (Yale SOM)
- Abstract summary: Analyzes NFT bubble economics, finding returns were exceptionally right-skewed, illiquidity pervaded even the most active platforms, and a handful of trades drove aggregate performance. Successful NFT investing required "an almost perfect confluence of timing, liquidity, and luck." Investors extrapolating from realized returns without recognizing selection bias and survivorship faced substantial risk of disappointment.
- Relevance to agentic commerce: Goetzmann (legendary financial historian) dissecting the NFT bubble provides cautionary context for token-based agent identity schemes (ERC-8004). While agent commerce tokens serve a different purpose than speculative NFTs, understanding how token markets can be distorted by survivorship bias and illiquidity is relevant.
- Link: https://www.nber.org/papers/w34837

🏛️ Institutions & Labs to Watch

Columbia Business School (Besbes, Kanoria, Allouah) — Leading work on AI agent purchasing behavior and marketplace dynamics. The ACES auditing framework could become a standard for evaluating agent commerce systems.
KAIST (Se-Young Yun group) — Strong work on LLM negotiation and bargaining agents. The AgoraBench framework could inform agent-to-agent negotiation protocol design.
Mila / U Montreal (Larochelle, Courville) — Opponent-shaping in multi-agent economic environments. Publishing at ICLR 2026 on steering agent learning toward cooperative equilibria.
USC Viterbi (Krishnamachari) — Agent-based modeling of DeFi/stablecoin dynamics. Relevant to understanding how autonomous agents interact with crypto payment infrastructure.
Tsinghua (Zou et al.) — Security architecture for mobile agent operating systems. Cryptographic identity binding plus sandboxed agent execution is directly applicable to agent commerce security.
Harvard Economics (Shleifer group) — Validating LLMs as measurement tools, with implications for agent-mediated quality assessment in markets.
Stanford/Chicago Booth (Bloom, Davis, Barrero) — First representative firm-level data on AI adoption. Setting the empirical baseline for understanding how AI agents will transform business operations.

📝 Scan Notes

arXiv: All four queries returned results. Query C ("agentic" OR "agent marketplace") was noisy—59,416 total results—but yielded some gems. Query D (q-fin) was highest signal-to-noise.
NBER: This week's batch had 3 directly relevant papers (Bloom/Davis firm AI data, Shleifer's GPT measurement tool, Goetzmann's NFT analysis). Most NBER papers this week were on development economics, health, and labor—not AI-focused.
SSRN: Blocked by Cloudflare (403). Need to access via browser automation in future scans.
Semantic Scholar: First query returned 44 results with several high-priority papers. Second query rate-limited (429). Consider applying for an API key for higher rate limits.
Key theme this week: Agent purchasing behavior is systematically different from human behavior (biased, volatile, gameable). Multiple independent papers converge on this finding. This is the central design challenge for agentic commerce infrastructure.
Suggestion for next scan: Add Google Scholar alerts for authors: Allouah, Besbes, Kanoria (Columbia agent commerce group), and Krishnamachari (USC stablecoin/DeFi modeling).