Academic Research Scan - 2026-02-15
Sources: arXiv (4 queries), NBER RSS, Semantic Scholar (rate-limited), SSRN (blocked)
Scan time: ~10:50 AM ET, Sunday Feb 15
High Priority Papers
AI Agents in Economic Environments
- EcoGym: Evaluating LLMs for Long-Horizon Plan-and-Execute in Interactive Economies - (Multi-institution)
  - Benchmark for continuous LLM decision-making in interactive economies: Vending, Freelance, and Operation environments. 1000+ step horizons, business-relevant outcomes (net worth, income, DAU). Tests 11 leading LLMs. Finds no single model dominates; systematic tension between high-level strategy and efficient execution.
  - Link: https://arxiv.org/abs/2602.09514
  - Why it matters: First rigorous benchmark for AI agents operating autonomously in economic simulations, directly relevant to agentic commerce viability.
- Autonomous Market Intelligence: Agentic AI Nowcasting Predicts Stock Returns - Zefeng Chen, Darcy Pu
  - Fully agentic LLM that autonomously searches the web and synthesizes information for daily stock predictions on Russell 1000. 100% agentic: no curated inputs. Top-20 picks yield 18.4 bps daily Fama-French alpha, annualized Sharpe ratio of 2.43. But ability is asymmetric: strong for identifying winners, weak for losers.
  - Link: https://arxiv.org/abs/2601.11958
  - Categories: q-fin.GN, q-fin.PM, q-fin.TR
  - Why it matters: First documented evidence of a fully autonomous AI agent with genuine stock-selection ability. The asymmetry finding (positive news = coherent signal, negative = noise) is a fundamental insight about AI agent information processing.
- MERIT Feedback Elicits Better Bargaining in LLM Negotiators - Oh, Aghazada, Shin, Yun, Kim
  - AgoraBench: 9 bargaining settings (deception, monopoly, etc.) for LLMs. Human-aligned, economically-grounded metrics from utility theory. Finds baseline LLM strategies often diverge from human preferences; their mechanism improves negotiation with deeper strategic behavior.
  - Link: https://arxiv.org/abs/2602.10467
  - Why it matters: LLM agents negotiating commercial deals is a core use case for agentic commerce. This benchmark and training pipeline directly addresses LLM bargaining competence.
- Would a Large Language Model Pay Extra for a View? Inferring Willingness to Pay from Subjective Choices - Reusens, Goethals, Calders, Martens
  - Studies LLM decision-making as purchasing assistants using multinomial logit models. Derives willingness-to-pay (WTP) estimates and compares to human benchmarks. LLMs systematically overestimate human WTP, especially with business personas.
  - Link: https://arxiv.org/abs/2602.09802
  - Why it matters: When AI agents spend money on behalf of humans, WTP calibration is critical. This paper is directly about the economics of AI agent purchasing decisions.
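To make the WTP entry above concrete: in a multinomial logit with utility linear in price, willingness to pay for one unit of an attribute is commonly read off as the negative ratio of the attribute coefficient to the price coefficient. The sketch below assumes that standard form; the function name and all coefficient values are hypothetical and are not taken from the paper.

```python
def willingness_to_pay(beta_attribute: float, beta_price: float) -> float:
    """WTP for one unit of an attribute under a linear-in-price logit utility.

    Assumes utility U = beta_price * price + beta_attribute * attribute + noise,
    so WTP = -beta_attribute / beta_price.
    """
    if beta_price >= 0:
        raise ValueError("beta_price is expected to be negative (higher price lowers utility)")
    return -beta_attribute / beta_price


# Hypothetical coefficients for illustration only (NOT from the paper).
human_wtp = willingness_to_pay(beta_attribute=0.8, beta_price=-0.02)
llm_wtp = willingness_to_pay(beta_attribute=1.2, beta_price=-0.02)

# A ratio above 1 would indicate the LLM overestimates human WTP.
overestimate_ratio = llm_wtp / human_wtp
print(human_wtp, llm_wtp, overestimate_ratio)
```

With a calibration check like this, an overestimate shows up as a ratio above 1 between the LLM-derived and human-derived WTP for the same attribute.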
Multi-Agent Markets & Mechanism Design
- Manipulation in Prediction Markets: An Agent-based Modeling Experiment - Smart, Mark, Bastian, Waugh
  - Agent-based simulation of prediction market manipulation by "whale" agents with biased valuations. Finds whales can temporarily shift prices proportional to capital share; herding behavior amplifies distortion. Proposes theoretical price-dynamics model.
  - Link: https://arxiv.org/abs/2601.20452
  - Categories: econ.GN, physics.soc-ph, q-fin.TR
  - Why it matters: As AI agents enter prediction markets (already happening with Polymarket bots), understanding manipulation dynamics is essential for market integrity.
- Resisting Manipulative Bots in Meme Coin Copy Trading: A Multi-Agent Approach - Luo, Feng, Xu, Liu (WWW'26)
  - Multi-agent LLM system to defend against bot manipulation in meme coin copy trading. Uses chain-of-thought reasoning. Achieves 3% avg return per meme coin under realistic frictions. Published at ACM Web Conference 2026.
  - Link: https://arxiv.org/abs/2601.08641
  - Why it matters: Peer-reviewed work on AI agents operating in adversarial crypto markets. Directly about autonomous agent defense in real market conditions.
- Who Restores the Peg? A Mean-Field Game Approach to Stablecoin Market Dynamics - Mohanty, Krishnamachari (USC)
  - Mean-field game framework for USDC/USDT de-pegging events. Models arbitrageurs and retail traders across primary (mint/redeem) and secondary (exchange) markets. Identifies non-linear breakdown threshold for peg recovery.
  - Link: https://arxiv.org/abs/2601.18991
  - Categories: q-fin.TR, cs.GT, econ.GN
  - Why it matters: Stablecoin stability is infrastructure for agentic commerce. This paper models the agent dynamics that determine whether payment rails hold.
- Equity by Design: Fairness-Driven Recommendation in Heterogeneous Two-Sided Markets - Seputis, Timans, Verma
  - Formalizes two-sided fairness for multi-item recommendations. Uses CVaR for consumer utility fairness. Finds moderate fairness constraints can improve business metrics by diversifying exposure. FPTAS solvers make it practical at scale.
  - Link: https://arxiv.org/abs/2602.10739
  - Categories: cs.GT, cs.IR
  - Why it matters: Platform marketplace design insight: fairness as a tool for marketplace health, not just a cost. Directly applicable to AI agent marketplace design.
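The CVaR measure mentioned in the Equity by Design entry can be illustrated with a small empirical version: the average utility of the worst-off fraction of consumers. This is a generic sketch of lower-tail CVaR, assuming equal-weight consumers; how the paper defines and optimizes it is not stated in this scan, and the utility values are made up.

```python
import math


def cvar_lower_tail(utilities: list[float], alpha: float) -> float:
    """Empirical lower-tail CVaR: mean of the worst ceil(alpha * n) utilities."""
    if not utilities:
        raise ValueError("utilities must be non-empty")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    k = math.ceil(alpha * len(utilities))
    worst = sorted(utilities)[:k]  # the k lowest utilities
    return sum(worst) / k


# Hypothetical per-consumer utilities from a recommendation policy.
consumer_utilities = [0.9, 0.8, 0.7, 0.2, 0.1]

mean_utility = sum(consumer_utilities) / len(consumer_utilities)
cvar_40 = cvar_lower_tail(consumer_utilities, alpha=0.4)
print(mean_utility, cvar_40)
```

A policy can look fine on mean utility while the CVaR at a small alpha stays low, which is the gap a CVaR-based fairness constraint targets.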
Multi-Agent Game Theory & Coordination
- Towards Sustainable Investment Policies Informed by Opponent Shaping - Duque, Ciuca, Echchahed, Larochelle, Courville (Mila/ICLR 2026)
  - Multi-agent simulation (InvestESG) of investors and companies under climate risk. Applies Advantage Alignment (opponent shaping) to push agents toward socially beneficial equilibria. Accepted at ICLR 2026.
  - Link: https://arxiv.org/abs/2602.11829
  - Categories: cs.LG, cs.GT
  - Why it matters: Demonstrates that opponent shaping can align market incentives with collective welfare, a key mechanism for governing AI agent economies. Top venue (ICLR), top researchers (Larochelle, Courville).
- Convex Markov Games and Beyond: Nash Equilibria - Barakat, Panageas, Varvitsiotis (AISTATS 2026)
  - Extends convex Markov games theory. Proves Nash equilibria = fixed points of projected pseudo-gradient dynamics. First analysis of common-interest settings. Policy gradient algorithm with sample complexity bounds.
  - Link: https://arxiv.org/abs/2602.12181
  - Categories: cs.GT, cs.LG, cs.MA
  - Why it matters: Foundational theory for multi-agent market equilibria. Needed for proving properties of AI agent market interactions.
- Bandit Learning in Matching Markets with Interviews - Mirfakhar, Wang, Xu, Beyhaghi, Hajiesmaili
  - Models matching markets where both sides have uncertain preferences. Introduces strategic deferral (choosing not to hire). Achieves time-independent regret bounds, a major improvement over O(log T).
  - Link: https://arxiv.org/abs/2602.12224
  - Categories: cs.GT, cs.AI, econ.TH
  - Why it matters: Matching markets are the backbone of gig economies and agent task marketplaces. This learning framework is directly applicable.
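For the Convex Markov Games entry in this section, the fixed-point claim can be written in the form it usually takes for games over convex strategy sets. This is the textbook statement, assuming each player's utility $u_i$ is differentiable and concave in that player's own strategy; the symbols $\mathcal{X}$, $\Pi_{\mathcal{X}}$, $v$ and $\eta$ are notation chosen here, and the paper's exact conditions for convex Markov games may differ.

```latex
% X = X_1 x ... x X_n is the joint (convex, compact) strategy set,
% \Pi_X is Euclidean projection onto X, and v stacks each player's own gradient.
\[
x^\star \text{ is a Nash equilibrium}
\iff
x^\star = \Pi_{\mathcal{X}}\bigl(x^\star + \eta\, v(x^\star)\bigr)
\quad \text{for any } \eta > 0,
\qquad
v(x) = \bigl(\nabla_{x_i} u_i(x)\bigr)_{i=1}^{n}.
\]
```

Read this way, a projected pseudo-gradient step leaves an equilibrium unchanged, which is why such dynamics are a natural basis for policy gradient methods.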
Notable Papers
- Behavioral Consistency Validation for LLM Agents: Stock-Market Simulation - Li et al.
  - Tests whether LLM agents' trading-style switching (fundamental vs. technical) aligns with behavioral finance theory. Year-long simulations with daily data. Finds only partial consistency, highlighting gaps in LLM market behavior.
  - Link: https://arxiv.org/abs/2602.07023
  - Categories: q-fin.TR, cs.AI
- Seeing the Goal, Missing the Truth: Human Accountability for AI Bias - Cao, Jiang, Xu
  - Shows LLMs produce biased financial measurements when told the downstream use ("purpose leakage"). Goal-aware prompting improves pre-cutoff performance but not post-cutoff. AI bias here is a research design issue, not algorithmic.
  - Link: https://arxiv.org/abs/2602.09504
  - Categories: q-fin.GN, cs.AI
- Trade-R1: Bridging Verifiable Rewards to Stochastic Environments - Sun et al.
  - RL framework for LLM financial decision-making. Uses RAG-based triangular consistency (evidence × reasoning × decision) to filter noisy market rewards. Reduces reward hacking with cross-market generalization.
  - Link: https://arxiv.org/abs/2601.03948
  - Categories: cs.AI, q-fin.TR
- LLM-Based Multi-Agent Investment System for Chinese Public REITs - Li
  - Multi-agent LLM trading framework: 4 analyst agents + prediction agent + decision agent. Compares DeepSeek-R1 vs fine-tuned Qwen3-8B. Both outperform buy-and-hold in backtest. Small model matches large model in some scenarios.
  - Link: https://arxiv.org/abs/2602.00082
  - Categories: q-fin.ST, cs.AI, q-fin.TR
- CM2: Reinforcement Learning with Checklist Rewards for Multi-Turn Agentic Tool Use - Zhang et al.
  - RL framework for training agentic tool-use: checklist rewards replace verifiable outcome rewards. 8B model matches judging model performance. Scalable recipe for multi-turn agents.
  - Link: https://arxiv.org/abs/2602.12268
  - Category: cs.AI
- MalTool: Malicious Tool Attacks on LLM Agents - Hu, Jia, Li, Song, Gong (Duke/Berkeley)
  - First systematic study of malicious tool code in agent ecosystems. Taxonomy based on CIA triad. Generates 1,200 standalone + 5,287 embedded malicious tools. Existing detection (including VirusTotal) fails. Major security concern for agent marketplaces.
  - Link: https://arxiv.org/abs/2602.12194
  - Category: cs.CR
- Adjusted Winner: from Splitting to Selling - Bredereck, Sun, Briman, Talmon
  - Extends the Adjusted Winner fair division method to allow selling resources under budget constraints. Provides an FPTAS to cope with intractability. Relevant to agent resource allocation.
  - Link: https://arxiv.org/abs/2602.12231
  - Category: cs.GT
- When Visibility Outpaces Verification: Agentic AI Discourse - Shi, DiFranzo
  - Analyzes r/OpenClaw and r/Moltbook Reddit communities for verification dynamics in agentic AI discussions. Finds "Popularity Paradox": high-visibility threads have delayed verification. Proposes "epistemic friction" design interventions.
  - Link: https://arxiv.org/abs/2602.11412
  - Categories: cs.CY, cs.AI, cs.HC
- Blind Gods and Broken Screens: A Secure Agent Operating System (Aura) - Zou et al.
  - Proposes Aura: clean-slate secure agent OS replacing GUI scraping with structured agent-native interaction. Hub-and-Spoke topology, cryptographic identity, semantic firewall. 94.3% task success, 4.4% attack success. Major agent infrastructure paper.
  - Link: https://arxiv.org/abs/2602.10915
  - Categories: cs.CR, cs.AI
- A Human-Centric Framework for Data Attribution in LLMs - Wührl, Ruckdeschel, Lo, Rogers
  - Framework for LLM data attribution as part of the data economy. Bridges NLP attribution methods with governance and creator economic incentives. Domain-specific negotiation between creators, users, platforms.
  - Link: https://arxiv.org/abs/2602.10995
  - Category: cs.CY
Working Papers & Reports
NBER
- No AI/agent-specific papers this week. Current NBER batch focuses on: tort reform (w34764), protests & redistribution (w34787), immigration economics (w34788-w34794), work-from-home (w34795), financial network connectedness (w34796), monetary policy (w34798), class mobility (w34800).
- Source: https://www.nber.org/new.html
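The NBER check above amounts to filtering feed items by keyword. A minimal sketch of that filter follows, assuming a standard RSS 2.0 layout (`item` elements with `title` and `link` children); the keyword list, the function name and the inline sample feed are all hypothetical, and the sample titles are invented for illustration rather than copied from NBER.

```python
import xml.etree.ElementTree as ET

# Hypothetical keyword list; the scan's real relevance criteria are not documented.
KEYWORDS = ("artificial intelligence", "llm", "language model", "agent")

# Invented sample feed standing in for the real RSS response.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>Tort Reform and Insurance Markets</title><link>https://example.org/a</link></item>
<item><title>Large Language Models and Labor Demand</title><link>https://example.org/b</link></item>
</channel></rss>"""


def filter_items(rss_text: str, keywords: tuple[str, ...] = KEYWORDS) -> list[tuple[str, str]]:
    """Return (title, link) pairs whose title contains any keyword, case-insensitively."""
    root = ET.fromstring(rss_text)
    hits = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if any(keyword in title.lower() for keyword in keywords):
            hits.append((title, link))
    return hits


matches = filter_items(SAMPLE_RSS)
print(matches)
```

Substring matching is crude (a short keyword such as "ai" would match unrelated words), so longer phrases are used here; an empty result corresponds to the "no AI/agent-specific papers" outcome reported for this week.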
SSRN
- Blocked by Cloudflare. SSRN search returned 403. Will retry with browser automation in a future scan.
Semantic Scholar
- Rate-limited (429). Both queries returned "Too Many Requests." Consider applying for API key: https://www.semanticscholar.org/product/api#api-key-form
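Until an API key is in place, 429 responses like the ones above are usually handled by retrying with exponential backoff. The sketch below shows the pattern in isolation: `fetch` is a hypothetical callable that wraps whatever HTTP client the scan uses and returns a `(status, body)` pair, and the delay schedule is an assumption. It does not honor a `Retry-After` header, which a production version probably should.

```python
import time
from typing import Callable, Tuple


def fetch_with_backoff(
    fetch: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call fetch() until it stops returning 429, doubling the wait each time."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = fetch()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... with the defaults
    return status, body  # still rate-limited after the final attempt


# Simulated responses: two rate-limit replies, then success.
responses = iter([(429, ""), (429, ""), (200, "ok")])
delays: list[float] = []
result = fetch_with_backoff(lambda: next(responses), sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the example testable without real waiting; in the scan itself the default `time.sleep` would apply.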
Institutions & Labs to Watch
- Mila (Montréal): Larochelle & Courville on multi-agent investment/opponent shaping (ICLR 2026). Consistent output on multi-agent coordination.
- USC (Krishnamachari lab): Stablecoin mean-field games. Active in DeFi mechanism design.
- Duke/Berkeley (Song, Gong): Agent security (MalTool). Critical for marketplace trust infrastructure.
- KAIST (Se-Young Yun's group): LLM bargaining/negotiation (MERIT). Building the benchmarks for agent commerce.
- Multi-institution EcoGym team: Creating the definitive benchmark for LLM agents in economic environments.
Scan Notes
Source Availability
- arXiv: All 4 queries returned successfully. Rich results, especially in q-fin and cs.GT.
- NBER: RSS feed parsed. No AI-relevant papers this week.
- SSRN: Cloudflare 403. Needs browser-based access or API.
- Semantic Scholar: Rate-limited (429). Need API key for reliable access.
- Brave Search: Not configured. Would help for supplementary discovery.
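For reference, an arXiv query of the kind that produced the q-fin and cs.GT results can be assembled against the public arXiv API endpoint. This is a sketch only: the four queries actually used in this scan are not recorded here, so the categories and search terms below are illustrative assumptions, and `build_arxiv_query` is a hypothetical helper name.

```python
from urllib.parse import parse_qs, urlencode, urlparse

ARXIV_API = "http://export.arxiv.org/api/query"


def build_arxiv_query(categories: list[str], terms: list[str], max_results: int = 25) -> str:
    """Build an arXiv API URL for recent papers in any category matching any term."""
    cat_clause = " OR ".join(f"cat:{c}" for c in categories)
    term_clause = " OR ".join(f'all:"{t}"' for t in terms)
    params = {
        "search_query": f"({cat_clause}) AND ({term_clause})",
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


# Illustrative categories and terms (assumed, not the scan's real queries).
url = build_arxiv_query(["q-fin.TR", "cs.GT"], ["LLM agent", "agentic"])
parsed = parse_qs(urlparse(url).query)
print(url)
```

Sorting by submission date in descending order keeps each scan focused on the newest papers; the response is an Atom feed that would still need parsing.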
Key Themes This Week
- LLM agents as economic actors: Multiple papers (EcoGym, WTP, MERIT) benchmarking LLMs in economic decision-making. The field is rapidly building evaluation infrastructure.
- Autonomous financial agents: The "Autonomous Market Intelligence" paper (Sharpe 2.43) is a landmark result for agentic finance.
- Agent security: MalTool and Aura both highlight that agent marketplace security is an urgent unsolved problem.
- Multi-agent game theory: ICLR/AISTATS papers advancing the mathematical foundations needed for AI agent markets.
- Crypto/DeFi agent dynamics: Stablecoin peg restoration, meme coin defense, and prediction market manipulation all involve agent-based modeling.
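To put the Autonomous Market Intelligence figures in scale, the reported 18.4 bps daily alpha and annualized Sharpe ratio of 2.43 can be converted with back-of-the-envelope arithmetic. The sketch assumes 252 trading days per year and the common square-root-of-time Sharpe annualization; the scan does not say which conventions the paper used, or whether the alpha is net of trading costs.

```python
import math

TRADING_DAYS = 252  # assumed convention, not confirmed by the paper summary

daily_alpha_bps = 18.4   # from the scan entry
annual_sharpe = 2.43     # from the scan entry

daily_alpha = daily_alpha_bps / 10_000  # basis points to a decimal return

# Simple (non-compounded) and compounded annual equivalents of the daily alpha.
simple_annual_alpha = daily_alpha * TRADING_DAYS
compounded_annual_alpha = (1 + daily_alpha) ** TRADING_DAYS - 1

# Daily Sharpe implied by the annualized figure under sqrt-of-time scaling.
implied_daily_sharpe = annual_sharpe / math.sqrt(TRADING_DAYS)

print(f"simple annual alpha:     {simple_annual_alpha:.1%}")
print(f"compounded annual alpha: {compounded_annual_alpha:.1%}")
print(f"implied daily Sharpe:    {implied_daily_sharpe:.3f}")
```

Under these assumptions the daily alpha corresponds to roughly 46% a year before compounding, which is why the result stands out and why checking it against transaction costs matters.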
Suggestions for Next Scan
- Apply for Semantic Scholar API key for reliable access
- Set up Brave Search API for supplementary web search
- Try SSRN via browser automation (openclaw profile)
- Add Google Scholar alerts monitoring (needs workaround for auth)
- Track specific authors: Krishnamachari, Se-Young Yun, Dawn Song (recurring in relevant work)