Read the Beforeitsnews.com story here. Advertise at Before It's News here.
Profile image
By Quantitative Trading (Reporter)
Contributor profile | More stories
Story Views
Now:
Last hour:
Last 24 hours:
Total:

Have LLMs improved over the last year? Comparing their responses to our books' prompts a year later.

% of readers think this story is Fact. Add your two cents.


The answer to this question may seem obvious if you read the breathless proclamations of AI luminaries, but good quantitative investors should be hype-immune. We want to carefully compare the ChatGPT’s unsatisfactory responses to a couple of the prompts outlined in our book (which are mostly generated in early 2024) to their responses now (April 2025). In addition, if ChatGPT’s response is still not satisfactory, we want to compare the responses of several popular LLMs, some of which weren’t even available in early 2024 or weren’t launched in production mode yet. Of course, to test all the LLMs (e.g. ChatGPT, Copilot, DeepSeek, Gemini, Grok, Perplexity, …) is very time consuming, so we also use Chatbot Arena to randomly pick 2 for each prompt (see https://www.bloomberg.com/news/articles/2025-02-18/before-deepseek-blew-up-one-website-announced-its-arrival).

Let’s try the first and very simple task: plotting annualized returns of an ETF. Shockingly, ChatGPT failed at this task previously, as you could read from our book. It committed the egregious error of annualizing returns by multiplying by sqrt(252).

User Prompt Write a Python program to download the adjusted closing prices of the ETF BILL, and plot its annualized daily returns chart. We only need YYYY as x-axis label.

Unfortunately, many LLMs (such as ChatGPT, Perplexity, gemma-3-4b-it by Google, command-a-03-2025 by Cohere) misinterpreted the question and produced some complicated calculations for the rolling 252-day compound returns. The only chatbot that correctly and simply multiplied the daily returns by 252 and displayed that is Microsoft’s Copilot. Bravo Mr. Nadella!

Now let’s proceed to another task that ChatGPT failed miserably before: summarizing a trading strategy based on Simon and Campasano (2014).

User Prompt Pretend you are a quantitative finance expert. Please summarize the trading strategies described in this paper. [Author Note: We attach the pdf.]

ChatGPT Response

Absolutely—here’s a summary of the trading strategies described in the paper “The VIX Futures Basis: Evidence and Trading Strategies” by David P. Simon and Jim Campasano:


🧠 Core Insight:

The VIX futures basis—the difference between VIX futures prices and the VIX spot index—does not predict changes in the VIX spot index, but does predict changes in VIX futures prices. This creates trading opportunities due to the convergence of futures prices to spot at expiry.


📊 Main Strategy:

Exploit the “Roll” in the VIX Futures Curve

  • Short VIX futures when the curve is in contango (futures > spot) beyond a threshold.

  • Buy VIX futures when the curve is in backwardation (futures

These trades take advantage of the fact that VIX futures converge toward the spot VIX price as they approach settlement.


🔧 Implementation Details:

  • Daily Roll Signal: The basis is divided by the number of business days to settlement.

    • Enter short if daily roll > $100

    • Enter long if daily roll

  • Exit criteria:

    • Daily roll falls below $50 (short) or rises above –$50 (long), or

    • Maximum holding period of 9 business days


⚖️ Hedging Equity Risk:

  • Because VIX futures prices move inversely to equity markets, trades are hedged using mini-S&P 500 futures.

    • Short VIX = Short S&P hedge

    • Long VIX = Long S&P hedge

  • Hedge ratios are computed using regressions of VIX futures price changes on S&P futures returns, updated daily out-of-sample.


💵 Profitability Summary (2007–2011):

Strategy Mean P&L Sortino Ratio Notes Short VIX (hedged) $792 1.26 62 trades, 2:1 win/loss Long VIX (hedged) $1,018 1.03 40 trades Unhedged versions Slightly higher P&L, but significantly higher downside risk

  • Transaction costs included full bid-ask spread + brokerage fees.

  • Strategies showed robust performance even out-of-sample and across sub-periods (pre/post-2009).


🧾 Key Takeaway:


The profitability is mainly driven by the roll yield (the premium or discount in futures relative to spot), and much of the risk can be hedged using equity index futures, enabling statistically significant and economically meaningful returns.


Let me know if you’d like this turned into a code simulation, visual walkthrough, or executive-style slide!


This is a much better response than the one recorded in our book. We truly can’t ask for a better answer. Let’s take ChatGPT up on its offer and turn it into code. The results can found here: https://chatgpt.com/canvas/shared/680a8810a5808191aaa35e4b31d0a813. Looks great, doesn’t it? What we don’t know, however, is whether ChatGPT has used our previous prompt as part of its training data for this version, so this isn’t a completely fair benchmark.

TL;DR

Yes, the answers got much better over the last year. But still not all LLMs give equally satisfactory answers – sometimes one to try quite a few to find the suitable one. We suggest you try out other prompts in our book and see if the answers improved!


Source: http://epchan.blogspot.com/2025/05/have-llms-improved-over-last-year.html



Before It’s News® is a community of individuals who report on what’s going on around them, from all around the world.

Anyone can join.
Anyone can contribute.
Anyone can become informed about their world.

"United We Stand" Click Here To Create Your Personal Citizen Journalist Account Today, Be Sure To Invite Your Friends.

Before It’s News® is a community of individuals who report on what’s going on around them, from all around the world. Anyone can join. Anyone can contribute. Anyone can become informed about their world. "United We Stand" Click Here To Create Your Personal Citizen Journalist Account Today, Be Sure To Invite Your Friends.


LION'S MANE PRODUCT


Try Our Lion’s Mane WHOLE MIND Nootropic Blend 60 Capsules


Mushrooms are having a moment. One fabulous fungus in particular, lion’s mane, may help improve memory, depression and anxiety symptoms. They are also an excellent source of nutrients that show promise as a therapy for dementia, and other neurodegenerative diseases. If you’re living with anxiety or depression, you may be curious about all the therapy options out there — including the natural ones.Our Lion’s Mane WHOLE MIND Nootropic Blend has been formulated to utilize the potency of Lion’s mane but also include the benefits of four other Highly Beneficial Mushrooms. Synergistically, they work together to Build your health through improving cognitive function and immunity regardless of your age. Our Nootropic not only improves your Cognitive Function and Activates your Immune System, but it benefits growth of Essential Gut Flora, further enhancing your Vitality.



Our Formula includes: Lion’s Mane Mushrooms which Increase Brain Power through nerve growth, lessen anxiety, reduce depression, and improve concentration. Its an excellent adaptogen, promotes sleep and improves immunity. Shiitake Mushrooms which Fight cancer cells and infectious disease, boost the immune system, promotes brain function, and serves as a source of B vitamins. Maitake Mushrooms which regulate blood sugar levels of diabetics, reduce hypertension and boosts the immune system. Reishi Mushrooms which Fight inflammation, liver disease, fatigue, tumor growth and cancer. They Improve skin disorders and soothes digestive problems, stomach ulcers and leaky gut syndrome. Chaga Mushrooms which have anti-aging effects, boost immune function, improve stamina and athletic performance, even act as a natural aphrodisiac, fighting diabetes and improving liver function. Try Our Lion’s Mane WHOLE MIND Nootropic Blend 60 Capsules Today. Be 100% Satisfied or Receive a Full Money Back Guarantee. Order Yours Today by Following This Link.


Report abuse

Comments

Your Comments
Question   Razz  Sad   Evil  Exclaim  Smile  Redface  Biggrin  Surprised  Eek   Confused   Cool  LOL   Mad   Twisted  Rolleyes   Wink  Idea  Arrow  Neutral  Cry   Mr. Green

MOST RECENT
Load more ...

SignUp

Login

Newsletter

Email this story
Email this story

If you really want to ban this commenter, please write down the reason:

If you really want to disable all recommended stories, click on OK button. After that, you will be redirect to your options page.