Academic Research Scan — 2026-02-15

Sources: arXiv (4 queries), NBER RSS, Semantic Scholar (rate-limited), SSRN (blocked)
Scan time: ~10:50 AM ET, Sunday Feb 15

🔬 High Priority Papers

AI Agents in Economic Environments

EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies — (Multi-institution)
- Benchmark for continuous LLM decision-making in interactive economies: Vending, Freelance, and Operation environments. 1000+ step horizons, business-relevant outcomes (net worth, income, DAU). Tests 11 leading LLMs. Finds no single model dominates; systematic tension between high-level strategy and efficient execution.
- Link: https://arxiv.org/abs/2602.09514
- Why it matters: First rigorous benchmark for AI agents operating autonomously in economic simulations — directly relevant to agentic commerce viability.
Autonomous Market Intelligence: Agentic AI Nowcasting Predicts Stock Returns — Zefeng Chen, Darcy Pu
- Fully agentic LLM that autonomously searches the web and synthesizes information for daily stock predictions on the Russell 1000, with no curated inputs. Top-20 picks yield 18.4 bps of daily Fama-French alpha and an annualized Sharpe ratio of 2.43. The ability is asymmetric: strong at identifying winners, weak at identifying losers.
- Link: https://arxiv.org/abs/2601.11958
- Categories: q-fin.GN, q-fin.PM, q-fin.TR
- Why it matters: First documented evidence of a fully autonomous AI agent with genuine stock-selection ability. The asymmetry finding (positive news = coherent signal, negative = noise) is a fundamental insight about AI agent information processing.
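
To put the headline numbers in perspective, the arithmetic below converts the reported 18.4 bps daily alpha into annual terms and backs out the daily Sharpe ratio implied by the reported 2.43. This is a rough sketch: the 252-trading-day year and the i.i.d. square-root scaling are assumptions of this note, not details taken from the paper.

```python
import math

TRADING_DAYS = 252  # assumption: common trading-day convention, not from the paper

daily_alpha_bps = 18.4  # reported daily Fama-French alpha, in basis points
annual_sharpe = 2.43    # reported annualized Sharpe ratio

# Convert basis points to a decimal return (1 bp = 0.0001).
daily_alpha = daily_alpha_bps / 10_000

# Simple (non-compounded) annualization of the daily alpha.
simple_annual_alpha = daily_alpha * TRADING_DAYS

# Compounded annualization, assuming the alpha is earned every trading day.
compound_annual_alpha = (1 + daily_alpha) ** TRADING_DAYS - 1

# Under i.i.d. daily returns, annual Sharpe = daily Sharpe * sqrt(trading days).
implied_daily_sharpe = annual_sharpe / math.sqrt(TRADING_DAYS)

print(f"simple annual alpha:   {simple_annual_alpha:.1%}")
print(f"compound annual alpha: {compound_annual_alpha:.1%}")
print(f"implied daily Sharpe:  {implied_daily_sharpe:.3f}")
```

These figures are gross of whatever costs the paper does or does not model, so they are best read as a scale check rather than a realizable return.
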
MERIT Feedback Elicits Better Bargaining in LLM Negotiators — Oh, Aghazada, Shin, Yun, Kim
- AgoraBench: 9 bargaining settings (deception, monopoly, etc.) for LLMs. Human-aligned, economically-grounded metrics from utility theory. Finds baseline LLM strategies often diverge from human preferences; their mechanism improves negotiation with deeper strategic behavior.
- Link: https://arxiv.org/abs/2602.10467
- Why it matters: LLM agents negotiating commercial deals is a core use case for agentic commerce. This benchmark and training pipeline directly addresses LLM bargaining competence.
Would a Large Language Model Pay Extra for a View? Inferring Willingness to Pay from Subjective Choices — Reusens, Goethals, Calders, Martens
- Studies LLM decision-making as purchasing assistants using multinomial logit models. Derives willingness-to-pay (WTP) estimates and compares to human benchmarks. LLMs systematically overestimate human WTP, especially with business personas.
- Link: https://arxiv.org/abs/2602.09802
- Why it matters: When AI agents spend money on behalf of humans, WTP calibration is critical. This paper is directly about the economics of AI agent purchasing decisions.
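
As a reminder of how willingness to pay falls out of a multinomial logit model, here is a minimal sketch. The coefficients, the `sea_view` attribute, and the room prices are hypothetical illustrations, not estimates from the paper; the only standard piece is the ratio WTP = -(attribute coefficient) / (price coefficient).

```python
import math

# Hypothetical multinomial-logit coefficients (illustrative, not from the paper).
beta = {"price": -0.02, "sea_view": 0.60}

def choice_probabilities(options, beta):
    """Softmax over linear utilities: P(i) = exp(U_i) / sum_j exp(U_j)."""
    utilities = [sum(beta[k] * v for k, v in opt.items()) for opt in options]
    peak = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - peak) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]

def willingness_to_pay(beta, attribute, price_key="price"):
    """WTP = -(attribute coefficient) / (price coefficient)."""
    return -beta[attribute] / beta[price_key]

# Two hypothetical hotel rooms; the second adds a view for a 30-unit premium.
rooms = [
    {"price": 100.0, "sea_view": 0},
    {"price": 130.0, "sea_view": 1},
]
probs = choice_probabilities(rooms, beta)
wtp = willingness_to_pay(beta, "sea_view")

print(f"WTP for a sea view: {wtp:.2f}")
print(f"choice probabilities: {[round(p, 3) for p in probs]}")
```

With these made-up coefficients the premium equals the WTP, so the modeled chooser is indifferent between the two rooms. An LLM that overestimates WTP would correspond to an attribute coefficient that is too large relative to the price coefficient.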

Multi-Agent Markets & Mechanism Design

Manipulation in Prediction Markets: An Agent-based Modeling Experiment — Smart, Mark, Bastian, Waugh
- Agent-based simulation of prediction market manipulation by "whale" agents with biased valuations. Finds whales can temporarily shift prices proportional to capital share; herding behavior amplifies distortion. Proposes theoretical price-dynamics model.
- Link: https://arxiv.org/abs/2601.20452
- Categories: econ.GN, physics.soc-ph, q-fin.TR
- Why it matters: As AI agents enter prediction markets (already happening with Polymarket bots), understanding manipulation dynamics is essential for market integrity.
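
The "price shift proportional to capital share" finding can be illustrated with a deliberately simple toy model in which the market price is the capital-weighted average of trader beliefs. This is an assumption made for illustration only; it is not the paper's simulation or its price-dynamics model, and it ignores herding entirely. The belief values are hypothetical.

```python
def market_price(beliefs, capital):
    """Capital-weighted average belief, a toy stand-in for a market price."""
    total = sum(capital)
    return sum(b * c for b, c in zip(beliefs, capital)) / total

crowd_belief = 0.50  # hypothetical probability held by regular traders
whale_belief = 0.90  # hypothetical biased valuation held by the whale

for share in (0.0, 0.1, 0.2, 0.4):
    # The whale holds `share` of total capital; the crowd holds the rest.
    price = market_price([crowd_belief, whale_belief], [1.0 - share, share])
    shift = price - crowd_belief
    print(f"whale capital share {share:.0%}: price {price:.2f}, shift {shift:+.2f}")
```

In this toy setting the shift is exactly `share * (whale_belief - crowd_belief)`, so doubling the whale's capital share doubles the distortion.
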
Resisting Manipulative Bots in Meme Coin Copy Trading: A Multi-Agent Approach — Luo, Feng, Xu, Liu (WWW'26)
- Multi-agent LLM system to defend against bot manipulation in meme coin copy trading. Uses chain-of-thought reasoning. Achieves a 3% average return per meme coin under realistic frictions. Published at ACM Web Conference 2026.
- Link: https://arxiv.org/abs/2601.08641
- Why it matters: Peer-reviewed work on AI agents operating in adversarial crypto markets. Directly about autonomous agent defense in real market conditions.
Who Restores the Peg? A Mean-Field Game Approach to Stablecoin Market Dynamics — Mohanty, Krishnamachari (USC)
- Mean-field game framework for USDC/USDT de-pegging events. Models arbitrageurs and retail traders across primary (mint/redeem) and secondary (exchange) markets. Identifies non-linear breakdown threshold for peg recovery.
- Link: https://arxiv.org/abs/2601.18991
- Categories: q-fin.TR, cs.GT, econ.GN
- Why it matters: Stablecoin stability is infrastructure for agentic commerce. This paper models the agent dynamics that determine whether payment rails hold.
Equity by Design: Fairness-Driven Recommendation in Heterogeneous Two-Sided Markets — Seputis, Timans, Verma
- Formalizes two-sided fairness for multi-item recommendations. Uses CVaR for consumer utility fairness. Finds moderate fairness constraints can improve business metrics by diversifying exposure. FPTAS solvers make it practical at scale.
- Link: https://arxiv.org/abs/2602.10739
- Categories: cs.GT, cs.IR
- Why it matters: Platform marketplace design insight — fairness as a tool for marketplace health, not just a cost. Directly applicable to AI agent marketplace design.
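
For readers less familiar with CVaR as a fairness measure, the sketch below computes an empirical lower-tail CVaR: the mean utility of the worst-off fraction of consumers. The utility values are hypothetical, and the paper's exact formulation may differ; this only shows why a tail measure can rank two recommendation policies differently from the mean.

```python
import math

def lower_tail_cvar(utilities, alpha):
    """Mean utility of the worst-off alpha fraction of consumers (empirical CVaR)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    ordered = sorted(utilities)  # ascending, so the worst-off come first
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k

# Hypothetical per-consumer utilities under two recommendation policies.
policy_a = [0.9, 0.8, 0.7, 0.2, 0.1]    # higher mean, two consumers poorly served
policy_b = [0.6, 0.55, 0.5, 0.5, 0.45]  # slightly lower mean, far more even

for name, utilities in (("A", policy_a), ("B", policy_b)):
    mean = sum(utilities) / len(utilities)
    cvar = lower_tail_cvar(utilities, alpha=0.4)
    print(f"policy {name}: mean {mean:.3f}, CVaR(40%) {cvar:.3f}")
```

Policy A wins on average utility while policy B wins on the tail, which is the kind of trade-off a CVaR-based fairness constraint is meant to surface.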

Multi-Agent Game Theory & Coordination

Towards Sustainable Investment Policies Informed by Opponent Shaping — Duque, Ciuca, Echchahed, Larochelle, Courville (Mila/ICLR 2026)
- Multi-agent simulation (InvestESG) of investors and companies under climate risk. Applies Advantage Alignment (opponent shaping) to push agents toward socially beneficial equilibria. Accepted at ICLR 2026.
- Link: https://arxiv.org/abs/2602.11829
- Categories: cs.LG, cs.GT
- Why it matters: Demonstrates that opponent shaping can align market incentives with collective welfare — a key mechanism for governing AI agent economies. Top venue (ICLR), top researchers (Larochelle, Courville).
Convex Markov Games and Beyond: Nash Equilibria — Barakat, Panageas, Varvitsiotis (AISTATS 2026)
- Extends the theory of convex Markov games. Proves that Nash equilibria coincide with the fixed points of projected pseudo-gradient dynamics. Gives the first analysis of common-interest settings and a policy gradient algorithm with sample complexity bounds.
- Link: https://arxiv.org/abs/2602.12181
- Categories: cs.GT, cs.LG, cs.MA
- Why it matters: Foundational theory for multi-agent market equilibria. Needed for proving properties of AI agent market interactions.
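
The fixed-point characterization is easiest to see in a much simpler setting than the paper's. The sketch below uses a 2x2 normal-form coordination game (an assumption of this note, not a convex Markov game) and checks whether a strategy profile is left unchanged by one projected pseudo-gradient step. The payoff matrices, step size, and function names are all hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based method)."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for i, ui in enumerate(u, start=1):
        cumulative += ui
        t = (cumulative - 1.0) / i
        if ui - t > 0:
            theta = t  # threshold from the largest index that stays positive
    return [max(x - theta, 0.0) for x in v]

def pseudo_gradient_step(x, y, A, B, eta):
    """Each player ascends its own payoff gradient, then projects back to the simplex."""
    grad_x = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(len(x))]
    grad_y = [sum(B[i][j] * x[i] for i in range(len(x))) for j in range(len(y))]
    new_x = project_to_simplex([xi + eta * g for xi, g in zip(x, grad_x)])
    new_y = project_to_simplex([yj + eta * g for yj, g in zip(y, grad_y)])
    return new_x, new_y

def is_fixed_point(x, y, A, B, eta=0.1, tol=1e-9):
    """True if one projected pseudo-gradient step leaves (x, y) unchanged."""
    new_x, new_y = pseudo_gradient_step(x, y, A, B, eta)
    return max(abs(a - b) for a, b in zip(x + y, new_x + new_y)) <= tol

# Hypothetical coordination game: both players prefer to match actions.
A = [[2.0, 0.0], [0.0, 1.0]]  # row player's payoffs
B = [[2.0, 0.0], [0.0, 1.0]]  # column player's payoffs

print(is_fixed_point([1.0, 0.0], [1.0, 0.0], A, B))  # both play action 0
print(is_fixed_point([1.0, 0.0], [0.0, 1.0], A, B))  # miscoordinated profile
```

The matched profile is a Nash equilibrium and stays put, while the miscoordinated profile moves after one step, which is the normal-form analogue of the equivalence the paper proves in a more general setting.
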
Bandit Learning in Matching Markets with Interviews — Mirfakhar, Wang, Xu, Beyhaghi, Hajiesmaili
- Models matching markets where both sides have uncertain preferences. Introduces strategic deferral (choosing not to hire). Achieves time-independent regret bounds, a major improvement over O(log T).
- Link: https://arxiv.org/abs/2602.12224
- Categories: cs.GT, cs.AI, econ.TH
- Why it matters: Matching markets are the backbone of gig economies and agent task marketplaces. This learning framework is directly applicable.

📄 Notable Papers

Behavioral Consistency Validation for LLM Agents: Stock-Market Simulation — Li et al.
- Tests whether LLM agents' trading-style switching (fundamental vs. technical) aligns with behavioral finance theory. Year-long simulations with daily data. Finds only partial consistency, highlighting gaps in LLM market behavior.
- Link: https://arxiv.org/abs/2602.07023
- Categories: q-fin.TR, cs.AI
Seeing the Goal, Missing the Truth: Human Accountability for AI Bias — Cao, Jiang, Xu
- Shows that LLMs produce biased financial measurements when told the downstream use ("purpose leakage"). Goal-aware prompting improves pre-cutoff performance but not post-cutoff performance. AI bias here is a research design issue, not an algorithmic one.
- Link: https://arxiv.org/abs/2602.09504
- Categories: q-fin.GN, cs.AI
Trade-R1: Bridging Verifiable Rewards to Stochastic Environments — Sun et al.
- RL framework for LLM financial decision-making. Uses RAG-based triangular consistency (evidence × reasoning × decision) to filter noisy market rewards. Reduces reward hacking with cross-market generalization.
- Link: https://arxiv.org/abs/2601.03948
- Categories: cs.AI, q-fin.TR
LLM-Based Multi-Agent Investment System for Chinese Public REITs — Li
- Multi-agent LLM trading framework: 4 analyst agents + prediction agent + decision agent. Compares DeepSeek-R1 vs fine-tuned Qwen3-8B. Both outperform buy-and-hold in backtest. Small model matches large model in some scenarios.
- Link: https://arxiv.org/abs/2602.00082
- Categories: q-fin.ST, cs.AI, q-fin.TR
CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn Agentic Tool Use — Zhang et al.
- RL framework for training agentic tool-use: checklist rewards replace verifiable outcome rewards. 8B model matches judging model performance. Scalable recipe for multi-turn agents.
- Link: https://arxiv.org/abs/2602.12268
- Category: cs.AI
MalTool: Malicious Tool Attacks on LLM Agents — Hu, Jia, Li, Song, Gong (Duke/Berkeley)
- First systematic study of malicious tool code in agent ecosystems. Taxonomy based on CIA triad. Generates 1,200 standalone + 5,287 embedded malicious tools. Existing detection (including VirusTotal) fails. Major security concern for agent marketplaces.
- Link: https://arxiv.org/abs/2602.12194
- Category: cs.CR
Adjusted Winner: from Splitting to Selling — Bredereck, Sun, Briman, Talmon
- Extends the Adjusted Winner fair division method to allow selling resources under budget constraints. Provides an FPTAS to cope with the resulting intractability. Relevant to agent resource allocation.
- Link: https://arxiv.org/abs/2602.12231
- Category: cs.GT
When Visibility Outpaces Verification: Agentic AI Discourse — Shi, DiFranzo
- Analyzes r/OpenClaw and r/Moltbook Reddit communities for verification dynamics in agentic AI discussions. Finds "Popularity Paradox": high-visibility threads have delayed verification. Proposes "epistemic friction" design interventions.
- Link: https://arxiv.org/abs/2602.11412
- Categories: cs.CY, cs.AI, cs.HC
Blind Gods and Broken Screens: A Secure Agent Operating System (Aura) — Zou et al.
- Proposes Aura: clean-slate secure agent OS replacing GUI scraping with structured agent-native interaction. Hub-and-Spoke topology, cryptographic identity, semantic firewall. 94.3% task success, 4.4% attack success. Major agent infrastructure paper.
- Link: https://arxiv.org/abs/2602.10915
- Categories: cs.CR, cs.AI
A Human-Centric Framework for Data Attribution in LLMs — Wührl, Ruckdeschel, Lo, Rogers
- Framework for LLM data attribution as part of the data economy. Bridges NLP attribution methods with governance and creator economic incentives. Domain-specific negotiation between creators, users, platforms.
- Link: https://arxiv.org/abs/2602.10995
- Category: cs.CY

📊 Working Papers & Reports

NBER

No AI/agent-specific papers this week. Current NBER batch focuses on: tort reform (w34764), protests & redistribution (w34787), immigration economics (w34788-w34794), work-from-home (w34795), financial network connectedness (w34796), monetary policy (w34798), class mobility (w34800).
Source: https://www.nber.org/new.html

SSRN

Blocked by Cloudflare. SSRN search returned 403. Will retry with browser automation in a future scan.

Semantic Scholar

Rate-limited (429). Both queries returned "Too Many Requests." Consider applying for an API key: https://www.semanticscholar.org/product/api#api-key-form
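
Until a key is in place, 429 responses could be handled with a small retry loop. The sketch below is one possible shape, not the scan's actual tooling: `fetch` is a hypothetical callable returning `(status, retry_after_seconds_or_None, body)`, and the choice to retry 429 and 503 while treating 403 as final is an assumption.

```python
import time

RETRYABLE = {429, 503}  # assumption: statuses worth retrying; 403 is treated as final

def fetch_with_backoff(fetch, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it returns a non-retryable status or attempts run out.

    fetch is a hypothetical callable returning (status, retry_after, body),
    where retry_after is a number of seconds or None.
    """
    for attempt in range(max_attempts):
        status, retry_after, body = fetch()
        if status not in RETRYABLE:
            return status, body
        if attempt < max_attempts - 1:
            # Prefer the server's Retry-After hint; otherwise back off exponentially.
            delay = retry_after if retry_after is not None else base_delay * 2 ** attempt
            sleep(delay)
    return status, body

# Demo with canned responses and a fake sleep that only records the waits.
canned = iter([(429, None, ""), (429, 5, ""), (200, None, "ok")])
recorded_waits = []
outcome = fetch_with_backoff(lambda: next(canned), sleep=recorded_waits.append)
print(outcome, recorded_waits)
```

Injecting `sleep` keeps the loop testable without real delays. A Cloudflare 403 like the SSRN one falls through immediately, since retrying a block of that kind is unlikely to help.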

🏛️ Institutions & Labs to Watch

- Mila (Montréal): Larochelle & Courville on multi-agent investment/opponent shaping (ICLR 2026). Consistent output on multi-agent coordination.
- USC (Krishnamachari lab): Stablecoin mean-field games. Active in DeFi mechanism design.
- Duke/Berkeley (Song, Gong): Agent security (MalTool). Critical for marketplace trust infrastructure.
- KAIST (Se-Young Yun's group): LLM bargaining/negotiation (MERIT). Building the benchmarks for agent commerce.
- Multi-institution EcoGym team: Creating the definitive benchmark for LLM agents in economic environments.

📝 Scan Notes

Source Availability

- arXiv: ✅ All 4 queries returned successfully. Rich results, especially in q-fin and cs.GT.
- NBER: ✅ RSS feed parsed. No AI-relevant papers this week.
- SSRN: ❌ Cloudflare 403. Needs browser-based access or API.
- Semantic Scholar: ❌ Rate-limited (429). Need API key for reliable access.
- Brave Search: ❌ Not configured. Would help for supplementary discovery.
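
For reference, an arXiv query of the kind that succeeded this week could be built as follows. The four queries actually used are not recorded in this scan, so the categories and search terms below are assumptions; the endpoint and the `search_query`, `start`, `max_results`, `sortBy`, and `sortOrder` parameters are those of the public arXiv API.

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def arxiv_query_url(categories, terms, max_results=25):
    """Build an arXiv API URL for recent papers in any category matching any term."""
    cat_clause = " OR ".join(f"cat:{c}" for c in categories)
    term_clause = " OR ".join(f'all:"{t}"' for t in terms)
    params = {
        "search_query": f"({cat_clause}) AND ({term_clause})",
        "start": 0,
        "max_results": max_results,
        "sortBy": "submittedDate",   # newest submissions first
        "sortOrder": "descending",
    }
    return f"{ARXIV_API}?{urlencode(params)}"

# Hypothetical query: the categories and terms are examples, not the scan's real ones.
print(arxiv_query_url(["q-fin.TR", "cs.GT"], ["LLM agent", "multi-agent market"]))
```

The response is an Atom feed, so the same feed-parsing path used for the NBER RSS source could plausibly be reused here.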

Key Themes This Week

- LLM agents as economic actors: Multiple papers (EcoGym, WTP, MERIT) benchmarking LLMs in economic decision-making. The field is rapidly building evaluation infrastructure.
- Autonomous financial agents: The "Autonomous Market Intelligence" paper (Sharpe 2.43) is a landmark result for agentic finance.
- Agent security: MalTool and Aura both highlight that agent marketplace security is an urgent unsolved problem.
- Multi-agent game theory: ICLR/AISTATS papers advancing the mathematical foundations needed for AI agent markets.
- Crypto/DeFi agent dynamics: Stablecoin peg restoration, meme coin defense, and prediction market manipulation all involve agent-based modeling.

Suggestions for Next Scan

- Apply for a Semantic Scholar API key for reliable access
- Set up Brave Search API for supplementary web search
- Try SSRN via browser automation (openclaw profile)
- Add Google Scholar alerts monitoring (needs workaround for auth)
- Track specific authors: Krishnamachari, Se-Young Yun, Dawn Song (recurring in relevant work)