AI/ML Is Most Valuable When It Improves the Existing Investment Process

The real opportunity is not to replace the investment process, but to make it more robust, adaptive, and implementable.

Perhaps understandably, much of the energy, effort, and investment around artificial intelligence and machine learning (AI/ML) in asset management has focused on alpha generation. The first instinct is often to ask whether AI/ML can improve return forecasts, discover new signals, or identify patterns that traditional methods miss. That is a natural place to start, and in some settings it is productive.

But this framing is too narrow. The harder and often more valuable problem is not simply to build a better prediction engine, but to understand where AI/ML can improve the investment process itself. This distinction has become more important as powerful models, pretrained components, open-source tools, and scalable computing infrastructure have become broadly available. In many settings, the bottleneck is no longer access to sophisticated models; it is translating noisy information into robust, repeatable investment decisions.

Investing is not simply a prediction problem. It is a decision problem under uncertainty, constraints, frictions, and feedback. A forecast is only one input into a broader investment process that links data to signals, signals to portfolios, portfolios to trades, trades to realized implementation costs, and realized outcomes to attribution, monitoring, and future decisions. The goal, therefore, should not be to replace the existing investment process with an AI system. The more realistic, and often more valuable, opportunity is to improve the process that already exists.

This distinction matters. Many institutional investors already have meaningful expertise embedded in their portfolio construction frameworks, risk systems, execution infrastructure, mandate constraints, and accumulated market knowledge. The question is therefore not simply: "Can we use AI/ML to find alpha?" A better question is: "Where, specifically, can AI/ML improve the investment process, and how do we know that the improvement is real?"

This also means that not every component of an investment process needs to become AI/ML-driven. In many cases, the right answer is not to replace an existing method, but to identify where AI/ML can complement well-tested risk, portfolio construction, and execution tools.

Over the past several years, in my work with investment organizations incorporating AI/ML into their research and investment processes, I have found this reframing to be essential. The most useful conversations are rarely about whether a particular algorithm is fashionable. They are about where the process is fragile, slow, noisy, overly manual, insufficiently adaptive, or disconnected from implementation realities. They are about which decisions can be improved, what data are truly informative, how forecasts are translated into trades, and how the process behaves when the market environment changes.

In what follows, I focus on a few aspects of that broader discussion: the shift from prediction to decision-making; the importance of data selection and data use; the need for regime awareness; the central role of execution and trading frictions; the disciplined use of unstructured data; the role of AI/ML in research workflows; and why human judgment and organizational design become more important, not less. The common theme is simple: AI/ML creates durable value in investing when it improves a real decision process, not when it is treated as a standalone modeling exercise.

From Prediction to Decision

In many generic machine learning applications, the problem is naturally framed as a mapping from inputs to outputs: given data X, predict label Y. In investment management, this framing is too narrow. The relevant chain is closer to

data → forecast → portfolio → execution → risk monitoring.

The important point is that this is not a one-way pipeline. Risk monitoring, attribution, and realized implementation costs feed back into the next round of data selection, model development, portfolio construction, and execution. Each link in this chain matters. A forecast that looks impressive in isolation can be economically irrelevant if it cannot be translated into scalable positions after accounting for turnover, liquidity, nonlinear price impact, capacity, and risk constraints.

This is why model evaluation in finance should not stop at prediction accuracy. For many investment problems, the right test is not whether a model improves a statistical loss function, but whether it improves decisions. Does it lead to better portfolio outcomes? Does it reduce unnecessary turnover? Does it improve downside behavior? Does it preserve alpha after costs? Does it remain useful across market regimes? Does it scale?

In practice, the standard is often risk-adjusted performance after trading costs. A signal that increases forecast accuracy but also increases turnover, concentrates exposures, or trades in capacity-constrained markets may reduce rather than improve the investment process. Conversely, a signal with modest standalone predictive power may be valuable if it diversifies existing signals, improves risk control, or produces implementable positions with attractive net performance.

A useful AI/ML model in finance should therefore be judged by its contribution to an implementable investment process. In some settings, that contribution may come from better return forecasts. In others, it may come from improved risk forecasts, better regime diagnostics, more disciplined signal combination, or more adaptive execution. The value of AI/ML is therefore context-dependent. It depends on the decision being improved.

It follows that a modestly predictive model, properly integrated into portfolio construction and execution, can be worth more than a statistically sophisticated model that ignores implementation. By the same logic, a model with excellent backtest performance may have little value if its profits depend on excessive turnover, unstable exposures, or unrealistic assumptions about liquidity.

The practical implication is straightforward but important: investment teams should evaluate models through the process in which they will actually be used. A return forecast should be tested as part of a portfolio construction and execution workflow. A risk forecast should be tested by whether it improves sizing, hedging, stress behavior, or drawdown control. A text signal should be tested by whether it adds incremental information after accounting for existing signals and implementation constraints.

The Edge Is Often in the Data and the Framing

A second misconception is that the durable edge comes primarily from increasingly sophisticated model architectures. Complexity can be useful, especially when the input data are high-dimensional and unstructured. But in finance, model sophistication often matters less than which data you use, how you use them, and how the model integrates with the actual investment decision.

This is particularly true today because many modeling tools are increasingly commoditized. State-of-the-art architectures, pretrained models, open-source libraries, cloud infrastructure, and scalable data tools are widely available. Access to complex models alone is no longer a durable competitive advantage. The more durable edge lies in data selection, data quality, temporal integrity, economic framing, and disciplined research.

The phrase "data quality" is sometimes used too narrowly. It is not only about clean data. It is also about choosing the right data for the decision at hand. Does the dataset contain incremental information? Is it point-in-time? Does it have survivorship bias? Are the timestamps aligned with when the information was actually available? Is the coverage stable? Are missing observations informative? Can the data be mapped reliably to tradable assets?
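The timestamp question can be made concrete with a small sketch. It assumes a hypothetical record layout in which each observation carries both the period it describes and the date it was actually published; the dates and values are made up for illustration. The point-in-time lookup keys on the publication date, while the naive lookup keys on the period end and therefore leaks information.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical records: (period_end, available_at, value).
# available_at is when the figure was published, not the period it describes.
RECORDS = [
    (date(2023, 12, 31), date(2024, 2, 15), 1.10),
    (date(2024, 3, 31), date(2024, 5, 10), 1.25),
    (date(2024, 6, 30), date(2024, 8, 12), 0.95),
]


def point_in_time_value(records, decision_date):
    """Return the most recently published value known on decision_date, or None."""
    ordered = sorted(records, key=lambda r: r[1])  # sort by publication date
    published = [r[1] for r in ordered]
    # Number of records published on or before the decision date.
    idx = bisect_right(published, decision_date)
    return ordered[idx - 1][2] if idx > 0 else None


def naive_value(records, decision_date):
    """Look-ahead-prone version: selects on period_end instead of publication date."""
    known = [r for r in records if r[0] <= decision_date]
    return max(known, key=lambda r: r[0])[2] if known else None


decision = date(2024, 4, 15)
print(point_in_time_value(RECORDS, decision))  # 1.1: the March figure was not yet published
print(naive_value(RECORDS, decision))          # 1.25: uses data unavailable on the decision date
```

In a backtest, the difference between the two lookups is exactly the look-ahead bias that misaligned timestamps introduce.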

These questions are especially important for alternative and unstructured data. A dataset may be interesting, novel, and statistically related to future returns, yet still fail to add value to an investment process. The relationship may be unstable, too slow, too crowded, too expensive to implement, or too difficult to translate into scalable positions. In other cases, the same dataset may be valuable not because it directly predicts returns, but because it improves understanding of revenues, margins, demand trends, supply-chain stress, customer behavior, or risk exposures.

The actionable implication is that firms should evaluate data through a decision lens. A useful screening framework asks questions such as:

What investment decision could this data improve?
Is the information available at the time the decision is made?
Does the dataset add incremental information beyond what we already know?
Can the signal be implemented after costs, liquidity, and capacity constraints?
Does it remain robust across market environments?

This is a more demanding standard than asking whether a dataset produces a statistically significant backtest. But it is also a more useful one. Many datasets can produce interesting research results. Far fewer improve a live, repeatable, and scalable investment process.

Regime Awareness Is Not Optional

Markets are noisy, adaptive, and non-stationary. This is one of the central challenges in applying AI/ML to investment problems. Many models interpolate well within the historical environments on which they are trained, but extrapolate poorly when the environment changes.

Regime shifts can arise from many sources, including monetary policy changes, inflation shocks, liquidity stress, changes in market structure, crowding, leverage cycles, volatility spikes, or changes in investor behavior. A strategy that works in one environment may become ineffective or even dangerous in another. This is not necessarily because the model is poorly designed. It may be because the model is being asked to operate outside the conditions under which its learned relationships are valid.

For this reason, regime awareness should be viewed as a core component of AI/ML in investing, not as a decorative overlay. Regime-aware modeling can take many forms, such as latent-state models, regime-switching models, conditional factor models, mixture-of-experts architectures, state-dependent risk models, or simpler rule-based diagnostics that monitor volatility, liquidity, correlation, and macro conditions.

The point is not that any one regime model will solve non-stationarity. It will not. The point is that the investment process should explicitly recognize that the world changes. Forecasts, risk estimates, trading costs, liquidity, and correlations are all conditional on the environment. A model that ignores this conditioning may appear stable in a full-sample backtest while failing at precisely the moments when robustness matters most.

A practical approach is to separate the problem into several tasks, including:

identifying the current market environment;
estimating how forecasts, risks, and costs behave in that environment; and
deciding how the portfolio should adapt.
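The three tasks above can be sketched with the simplest kind of rule-based diagnostic. This is a toy illustration, not a recommended regime model: the volatility threshold, the window length, the exposure rule, and the return series are all hypothetical, and a real process would likely monitor liquidity, correlation, and macro conditions as well.

```python
from statistics import mean, pstdev


def rolling_vol(returns, window):
    """Trailing volatility at each date, using only returns up to and including that date."""
    return [
        pstdev(returns[i - window + 1 : i + 1]) if i >= window - 1 else None
        for i in range(len(returns))
    ]


def label_regime(vol, threshold):
    """Task 1: identify the environment with a simple threshold rule."""
    if vol is None:
        return "unknown"  # not enough history yet
    return "stressed" if vol > threshold else "calm"


def next_period_abs_return_by_regime(returns, regimes):
    """Task 2: average absolute next-period return, grouped by the regime known beforehand."""
    buckets = {}
    for t in range(len(returns) - 1):
        buckets.setdefault(regimes[t], []).append(abs(returns[t + 1]))
    return {regime: mean(values) for regime, values in buckets.items()}


def exposure_scale(regime, stressed_scale=0.5):
    """Task 3: adapt. Hypothetical rule: cut exposure when the regime is stressed."""
    return stressed_scale if regime == "stressed" else 1.0


# Made-up daily returns: a quiet stretch followed by a volatile one.
returns = [0.001, -0.002, 0.001, 0.000, 0.002, -0.001, 0.030, -0.040, 0.035, -0.030]
vols = rolling_vol(returns, window=4)
regimes = [label_regime(v, threshold=0.01) for v in vols]
conditional_risk = next_period_abs_return_by_regime(returns, regimes)
scales = [exposure_scale(r) for r in regimes]

print(regimes)
print(conditional_risk)
print(scales)
```

Keeping the three tasks in separate functions means each one can be validated, monitored, and replaced independently, for example by swapping the threshold rule for a latent-state model without touching the adaptation rule.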

The broader point is that the challenge is not merely to learn patterns. It is to understand which patterns remain relevant as market conditions evolve. AI/ML can help with this, but only when regime awareness is built into the research, validation, and monitoring process.

Execution and Frictions Are Part of the Model

The traditional investment pipeline often separates forecasting, portfolio construction, and execution. Research produces signals, portfolio construction converts signals into positions, and execution implements trades. This separation is convenient, but it can be suboptimal.

In real markets, execution and frictions are not secondary details. They determine how much of a forecast becomes realized performance. Transaction costs, market impact, liquidity, turnover, and capacity can change the economic value of a signal dramatically. Many apparent alpha signals are weakened or eliminated once these costs are incorporated honestly.

This is why execution-aware and friction-aware modeling is one of the most important directions for AI/ML in systematic investing. Forecasts should not be evaluated only as forecasts. They should be evaluated as inputs to a trading process. A signal that changes rapidly may look statistically attractive but generate excessive turnover. A signal that concentrates in illiquid names may be difficult to scale. A signal that performs well before costs may be unattractive after impact.

Modern methods can help by aligning modeling objectives more closely with implementation. For example, forecasts can be trained or regularized to reduce unnecessary turnover. Portfolio construction can be formulated as a multi-period problem that balances alpha decay against transaction costs. Execution data can be fed back into research to understand which signals survive trading and which do not. Reinforcement-learning-inspired and model-predictive-control approaches can be useful when the problem is genuinely sequential and when costs, constraints, and state variables are specified carefully.
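As a minimal sketch of the trade-off between alpha and transaction costs, consider a single asset and a single-period simplification of the multi-period problem. It assumes a quadratic trading cost, so that maximizing alpha * x - 0.5 * risk_aversion * variance * x^2 - 0.5 * cost_coeff * (x - prev)^2 has a closed-form solution: the new position is a weighted average of the previous position and the frictionless target. All parameter values are hypothetical.

```python
def cost_aware_position(alpha, prev_position, risk_aversion, variance, cost_coeff):
    """Closed-form maximizer of the one-period objective with quadratic trading cost."""
    frictionless = alpha / (risk_aversion * variance)
    # Higher trading cost puts more weight on staying at the previous position.
    weight_on_prev = cost_coeff / (risk_aversion * variance + cost_coeff)
    return weight_on_prev * prev_position + (1.0 - weight_on_prev) * frictionless


def simulate(alphas, cost_coeff, risk_aversion=5.0, variance=0.04):
    """Apply the rule to a sequence of forecasts; return the position path and total turnover."""
    position, turnover, path = 0.0, 0.0, []
    for alpha in alphas:
        new_position = cost_aware_position(alpha, position, risk_aversion, variance, cost_coeff)
        turnover += abs(new_position - position)
        position = new_position
        path.append(position)
    return path, turnover


# A made-up, rapidly reversing forecast.
alphas = [0.02, -0.01, 0.03, -0.02, 0.01]
frictionless_path, frictionless_turnover = simulate(alphas, cost_coeff=0.0)
cost_aware_path, cost_aware_turnover = simulate(alphas, cost_coeff=0.6)

print(frictionless_turnover, cost_aware_turnover)
```

With zero trading cost the rule chases every forecast reversal. With a positive cost it trades only partially toward the target, which reduces turnover for the same forecast sequence.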

The important qualification is that more advanced methods are not automatically better. Reinforcement learning, for example, is a natural framework for trading because trading is sequential. But it is also data-intensive, sensitive to the realism of the environment, and sometimes difficult to evaluate. In many cases, the most productive approach is hybrid: use well-understood forecasting and risk models upstream, and use dynamic decision methods to translate those inputs into trades under realistic frictions.

A practical insight for investment teams is to evaluate signals through a fixed implementation harness. Take candidate forecasts, run them through the same portfolio construction rule, apply the same cost and impact assumptions, and compare net performance, turnover, capacity, drawdowns, and exposure stability. This shifts the discussion from "Which model predicts best?" to "Which model improves the investment process?"
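A fixed implementation harness of this kind can be sketched in a few lines. The version below is deliberately simplistic and rests on stated assumptions: a shared portfolio rule that demeans forecasts and scales them to unit gross exposure, a linear cost per unit of turnover, and toy forecasts and returns for two hypothetical candidates named "fast" and "slow". A realistic harness would add nonlinear impact, capacity, drawdown, and exposure-stability metrics.

```python
def to_weights(forecast):
    """Shared portfolio rule: demean the forecasts and scale to unit gross exposure."""
    avg = sum(forecast) / len(forecast)
    centered = [f - avg for f in forecast]
    gross = sum(abs(c) for c in centered)
    return [c / gross for c in centered] if gross > 0 else [0.0] * len(forecast)


def run_harness(forecasts, realized, cost_per_unit_turnover):
    """forecasts[t] is formed before realized[t] is observed. Returns summary metrics."""
    prev = [0.0] * len(forecasts[0])
    gross_pnl, turnover = 0.0, 0.0
    for forecast, returns in zip(forecasts, realized):
        weights = to_weights(forecast)
        turnover += sum(abs(w - p) for w, p in zip(weights, prev))
        gross_pnl += sum(w * r for w, r in zip(weights, returns))
        prev = weights
    costs = cost_per_unit_turnover * turnover
    return {"gross": gross_pnl, "turnover": turnover, "net": gross_pnl - costs}


# Toy data: three assets, four periods.
realized = [
    [0.012, 0.0, -0.012],
    [-0.004, 0.0, 0.004],
    [0.012, 0.0, -0.012],
    [-0.004, 0.0, 0.004],
]
candidates = {
    "fast": [[1, 0, -1], [-1, 0, 1], [1, 0, -1], [-1, 0, 1]],  # flips every period
    "slow": [[1, 0, -1]] * 4,                                   # stable view
}

# Same rule and same cost assumption for every candidate.
results = {name: run_harness(f, realized, cost_per_unit_turnover=0.004)
           for name, f in candidates.items()}
for name, summary in results.items():
    print(name, summary)
```

In this toy example the fast candidate predicts better before costs but trades far more, so the slow candidate ends up with the higher net result. That reversal is the kind of finding that prediction accuracy alone would miss.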

This kind of implementation-aware evaluation is often where the most useful insights emerge. It reveals whether a signal is robust, whether it scales, whether it is too expensive to trade, and whether it adds value after interacting with the rest of the portfolio. In many cases, the improvement comes not from inventing a new signal, but from improving the way existing forecasts are sized, traded, monitored, and combined.

Unstructured Data Requires Discipline

Unstructured data is one of the areas where AI/ML can genuinely add value. Earnings calls, filings, news, transcripts, investor presentations, and other textual sources contain information that is difficult to capture with traditional structured datasets. Large language models can help extract meaning, classify information, summarize changes, and create stable representations.

The main challenge, however, is not simply generating text features. It is turning text into investable information without chasing transient noise. Naive sentiment scores are often too crude. More durable signals tend to come from economically framed features, including changes in guidance, uncertainty, risk disclosures, tone relative to a firm's own history, differences relative to peers, and language that indicates revisions to expectations or changes in business conditions. I find the following principles useful.

First, text should be interpreted in context. The same phrase may mean different things for different firms, sectors, or market environments. Comparing a company to its own history and to its peer group is often more informative than using raw sentiment.
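To illustrate why context matters, the sketch below standardizes a tone score against the firm's own history and against its peers. The tone scores are made-up numbers standing in for the output of some upstream text model, and the z-score is only one of several reasonable ways to express a relative comparison.

```python
from statistics import mean, pstdev


def zscore(value, reference):
    """Standardize value against a reference sample; 0.0 if the sample has no dispersion."""
    spread = pstdev(reference)
    return (value - mean(reference)) / spread if spread > 0 else 0.0


# Hypothetical tone scores in [0, 1]; higher means more positive language.
own_history = [0.60, 0.62, 0.58, 0.61]   # the firm's previous four calls
peer_current = [0.30, 0.35, 0.40]        # peers' calls this quarter
current_tone = 0.50

tone_vs_history = zscore(current_tone, own_history)
tone_vs_peers = zscore(current_tone, peer_current)

print(round(tone_vs_history, 2))  # negative: a sharp drop relative to the firm's own norm
print(round(tone_vs_peers, 2))    # positive: still more upbeat than peers
```

The same raw score reads as positive against peers and as a marked deterioration against the firm's own history, which is why a single raw sentiment number can mislead.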

Second, temporal alignment is critical. The feature must be based only on information available at the time of the decision. This becomes especially subtle with large language models because the model itself may have been trained on text published after the date being simulated, so its outputs can embed knowledge of later events. For systematic backtesting, one must be careful not to introduce look-ahead bias through the model or the data pipeline.

Finally, text signals should be evaluated based on novelty and incremental contribution to decisions. Do they contain information not already captured by existing signals, prices, analyst expectations, or firm fundamentals? Do they improve forecasts, risk diagnostics, regime identification, or interpretation of portfolio exposures?
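One simple way to probe incremental contribution is to remove what an existing signal already explains and test what is left. The sketch below, under the assumption of a single existing signal and a linear relationship, regresses a candidate text signal on the existing signal and correlates the residual with subsequent returns. All series are synthetic, and the names are illustrative.

```python
from math import sqrt
from statistics import mean


def ols_residuals(y, x):
    """Residuals from a one-variable least-squares regression of y on x (with intercept)."""
    mx, my = mean(x), mean(y)
    beta = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    intercept = my - beta * mx
    return [b - (intercept + beta * a) for a, b in zip(x, y)]


def correlation(a, b, eps=1e-12):
    """Pearson correlation; returns 0.0 when either series is (numerically) constant."""
    ma, mb = mean(a), mean(b)
    var_a = sum((v - ma) ** 2 for v in a)
    var_b = sum((v - mb) ** 2 for v in b)
    if var_a < eps or var_b < eps:
        return 0.0
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / sqrt(var_a * var_b)


def incremental_correlation(candidate, existing, next_returns):
    """Correlation between returns and the part of the candidate not explained by existing."""
    return correlation(ols_residuals(candidate, existing), next_returns)


# Synthetic example: returns depend on the existing signal and on a separate component.
existing = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
novel_part = [1.0, -1.0, -1.0, 1.0, 1.0, -1.0]
next_returns = [0.01 * e + 0.005 * n for e, n in zip(existing, novel_part)]

redundant_signal = [2.0 * e + 1.0 for e in existing]               # a rescaling of existing
informative_signal = [e + n for e, n in zip(existing, novel_part)]  # carries the new component

print(correlation(redundant_signal, next_returns))        # high raw correlation
print(incremental_correlation(redundant_signal, existing, next_returns))    # nothing left over
print(incremental_correlation(informative_signal, existing, next_returns))  # positive
```

The redundant signal looks predictive on its own but contributes nothing once the existing signal is accounted for, whereas the informative signal retains a positive relationship with returns after residualization.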

Used properly, AI/ML can make unstructured data more useful. Used casually, it can create persuasive but unstable narratives. The distinction is not necessarily the sophistication of the language model. It is the discipline of the investment process around it.

AI/ML as a Research Workflow Technology

One of the most immediate uses of AI/ML is not alpha generation, but research productivity. Large language models and related tools can accelerate coding, documentation, experiment design, diagnostic analysis, and interpretation. They can help researchers explore alternatives more quickly, summarize results, identify inconsistencies, and make research workflows more reproducible.

This does not replace domain expertise. On the contrary, it increases the leverage of domain experts. A skilled researcher can use these tools to test ideas faster, document assumptions more clearly, and examine failure modes more systematically. The benefit is not merely speed, but the possibility of a more disciplined research workflow.

Many investment organizations still rely heavily on informal research memory: notebooks, undocumented experiments, partially reproducible backtests, and the accumulated judgment of individual researchers. AI-assisted workflows can help institutionalize that knowledge. They can improve experiment tracking, summarize why models were rejected, document data assumptions, and create a clearer audit trail from idea generation to implementation.

This matters because overfitting in finance is not only a statistical problem. It is also a workflow problem. Researchers try many ideas, tune many parameters, test many datasets, and make many informal decisions. A disciplined workflow makes these choices more visible and therefore easier to govern.

The same logic applies after deployment. Monitoring, attribution, and model diagnostics should be part of the research loop rather than a separate operational layer. A model that degrades under certain market conditions, generates unexpected exposures, or becomes too expensive to trade is not merely a failed model. It is information. A mature investment process captures that information and uses it to improve the next generation of research.

The Human Element Becomes More Important

As models become more accessible, human judgment becomes more important, not less. This may sound paradoxical, but it reflects the nature of financial markets. AI/ML can identify patterns, process large datasets, and improve workflows, but it does not eliminate the need to define objectives, understand constraints, interpret failures, and decide when historical relationships are no longer reliable.

Human-in-the-loop systems are not a weakness. In many institutional settings, they are a robustness feature. Portfolio managers, traders, risk managers, data engineers, and researchers each see different parts of the process. The most effective AI/ML systems integrate these perspectives rather than attempting to replace them.

Organizational design matters. Firms that succeed with AI/ML tend to have cross-disciplinary teams, strong data infrastructure, rigorous validation standards, and clear links between research, risk, and execution. They also have realistic expectations. AI/ML does not remove uncertainty. It provides tools for managing it more systematically.

This is why cultural and workflow changes are often as important as technology. A firm that treats AI/ML as a separate initiative may produce interesting experiments, but it is less likely to produce durable investment improvements. A firm that embeds AI/ML into the way research is conducted, decisions are made, risks are monitored, and trades are executed is much more likely to create lasting value.

A Practical Starting Point

A useful starting point is not to ask, "Which AI/ML model should we use?" but to map the investment process into its key decision points: data selection, signal construction, portfolio construction, execution, risk monitoring, attribution, and research workflow. For each step, the practical questions are: where is the process slow, noisy, unstable, overly manual, or disconnected from implementation? Where would better information, better diagnostics, or better automation change the decision? Those are often the highest-value entry points for AI/ML.

For example, the highest-value use case may be improving a noisy regime dashboard, reducing unnecessary turnover in an existing signal, making alternative data evaluation more systematic, or building a more reliable bridge between forecasts and implementation. These may sound less glamorous than a new alpha model, but they often address the places where investment processes actually lose value.

The success metric should also be defined before the model is built. In some settings, success may mean higher risk-adjusted returns after trading costs. In others, it may mean lower turnover, more stable exposures, improved downside behavior, better capacity, faster research cycles, clearer attribution, or more reliable risk diagnostics. The metric should be tied to the decision the model is intended to improve. Otherwise, the organization risks optimizing a statistical proxy that has little connection to investment value.

This process-oriented perspective also changes the implementation roadmap. Rather than launching a broad AI/ML initiative, many firms are better served by selecting one or two concrete decision bottlenecks and building a disciplined evaluation framework around them. If the model improves the decision, scales operationally, survives realistic costs and constraints, and can be monitored over time, then it can be expanded.

Conclusion

AI/ML will remain central to the evolution of investment management. But the firms that benefit most will not necessarily be those with the most complex models. They will be the firms that understand where AI/ML fits into the investment process and how to evaluate its contribution.

The most valuable applications will often be those that improve existing processes: better data selection, more robust forecasts, clearer regime diagnostics, more realistic portfolio construction, more adaptive execution, stronger monitoring, and more disciplined research workflows. These improvements may be less glamorous than the idea of a fully automated alpha engine, but they are more likely to create durable value.

In investing, the real edge rarely comes from prediction alone; it comes from building better decision systems.

Written by Petter Kolm, Professor of Mathematics, Courant Institute School of Mathematics, Computing, and Data Science, New York University
