Original post: social.coop (Mastodon)
on the one hand, obviously this is not the way you should be using that function. On the other hand, I’m not sure how you even should.
I don’t think Microsoft even knows how you should
We should crowdsource the answer. Submit this as a CAPTCHA to 1000 computer-users and return the golden mean as a consensus response.
I’m so happy to know the world will crumble from LLM corpo bullshit rather than climate change. It looks MUCH more fun.
Clammy Sam: We need to build even bigger LLMs to fix the errors of our current LLMs
A large language model shouldn’t even attempt to do math, imo. They made an expensive hammer that is semi-good at one thing (parroting humans) and now they’re treating every query like it’s a nail.
Why isn’t OpenAI working more modularly, whereby the LLM will call up specialized algorithms once it has identified the nature of the question? Or is it already modular and they just suck at anything that cannot be calibrated purely with brute-force computing?
Why isn’t OpenAI working more modularly, whereby the LLM will call up specialized algorithms once it has identified the nature of the question?
Precisely because this is an LLM. It doesn’t know the difference between writing out a maths problem, a recipe for cake or a haiku. It transforms everything into the same domain and does fancy statistics to come up with a reply. It wouldn’t know that it needs to invoke the “Calculator” feature unless you hard-code that in, which is what ChatGPT and Gemini do, but it’s also easy to break.
Can’t it be trained to do that?
Sort of. There’s a relatively new type of LLM called a “tool-aware” LLM, which you can instruct to use tools like a calculator or some other external program. As far as I know, though, the LLM has to be told to go out and use that external thing; it can’t make that decision itself.
Can the model itself be trained to recognize mathematical input and invoke an external app, parse the result and feed that back into the reply? No.
Can you create a multi-layered system that uses some trickery to achieve this effect most of the time? Yes, that’s what OpenAI and Google are already doing by recognizing certain features of the users’ inputs and changing the system prompts to force the model to output Python code or Markdown notation that your browser then renders using a different tool.
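For illustration, the hard-coded route looks roughly like this. A minimal sketch using the OpenAI Python SDK’s function-calling interface; the model name and the add_numbers tool are made up for the example, not anything OpenAI actually ships:

```python
# Sketch of a tool-calling round trip (illustrative; not Copilot's internals).
# The model can only *request* the tool; our code has to run it and feed the
# exact result back before the model writes the final reply.
import json
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "add_numbers",  # hypothetical tool
        "description": "Add a list of numbers exactly.",
        "parameters": {
            "type": "object",
            "properties": {"numbers": {"type": "array", "items": {"type": "number"}}},
            "required": ["numbers"],
        },
    },
}]

messages = [{"role": "user", "content": "Sum the numbers 1, 2 and 3."}]
first = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
call = first.choices[0].message.tool_calls[0]  # assumes the model chose the tool

result = sum(json.loads(call.function.arguments)["numbers"])  # deterministic: 6

messages.append(first.choices[0].message)  # the assistant's tool request
messages.append({"role": "tool", "tool_call_id": call.id, "content": str(result)})
final = client.chat.completions.create(model="gpt-4o", messages=messages, tools=tools)
print(final.choices[0].message.content)  # e.g. "The sum is 6."
```

Note where this breaks: nothing forces the model to request the tool in the first place, which is exactly the failure mode in the screenshot.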
Yep. Instead of focusing on humans communicating more effectively with computers, which are good at answering questions that have correct, knowable answers, we’ve invented a type of computer that can be wrong because maybe people will like the vibes more? (And we can sell vibes)
In fairness, the example I’ve seen MS give was taking a bunch of reviews and determining from the text whether each review was positive or negative.
It was never meant to mangle numbers, but we all know it’s going to be used for that anyway, because people still want to believe in a future where robots help them, rather than just take their jobs and advertise to them.
I would rather not have it attempt something that it can’t do; no result is better than a wrong result, imo. Here it’s correctly identifying that it’s a calculation question, and instead of suggesting a formula, it tries to hallucinate a numerical answer itself. The creators of the model seem to have a mindset that the model must try to answer no matter what, instead of training it to not answer questions that it can’t answer correctly.
As far as I can tell, the copilot command has to be given a range of data to work with, so here it’s pulling a number out of thin air. It’d be nice if the output from this was just “please tell this command which data to use”, but as always it doesn’t know how to say “I don’t know”…
Mostly because it never knew anything to start with.
They have (well, Anthropic has)
It’s called MCP
Nowadays you just give the AI access to a calculator basically… or whatever other tools it needs. Including other models to help it answer something.
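For reference, the tool side of MCP can be tiny. A toy sketch using the MCP Python SDK’s FastMCP helper; the server name and the add() tool are invented for the example:

```python
# Minimal MCP server exposing one exact-arithmetic tool. A connected
# client advertises this tool to the model, which can then ask for add()
# to be run instead of guessing the sum itself.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("calculator")

@mcp.tool()
def add(numbers: list[float]) -> float:
    """Return the exact sum of a list of numbers."""
    return sum(numbers)

if __name__ == "__main__":
    mcp.run()  # speaks the protocol over stdio by default
```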
You have to admire the gall of naming your system after the evil AI villain of the original Tron movie.
The torment nexus strikes again.
I hadn’t heard of that protocol before, thanks. It holds a lot of promise for the future, but getting the input right for the tools is probably quite the challenge. And it also scares me because I now expect more companies to release irresponsible integrations.
Why isn’t OpenAI working more modularly, whereby the LLM will call up specialized algorithms once it has identified the nature of the question?
Because we vibes-coded the OpenAI model with OpenAI and it didn’t think this was the optimal way to design itself.
OpenAI already makes it write Python functions to do the calculations.
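Roughly this kind of throwaway snippet, presumably (illustrative; the real generated code isn’t shown anywhere):

```python
# The sort of disposable script the model emits for "sum 1, 2 and 3",
# so the arithmetic is executed by Python rather than predicted as text.
values = [1, 2, 3]
print(sum(values))  # 6
```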
So it’s going to write Python functions to calculate the answer where all the variables are stored in an Excel spreadsheet, a program that can already do the calculations? And how many forests did we burn down for that wonderful piece of MacGyvered software, I wonder.
The AI bubble cannot burst soon enough.
The AI bubble cannot burst soon enough.
Too Big To Fail. Sorry. 1+2+3=15 now.
“1”+(2+3) is “15” in JavaScript.
I’m wondering if this is a sleight-of-hand trick by the poster, then.
If they typed the “1” field as Text and left the 2 and 3 as numeric, then ran Copilot on that. In that case, it’s more an indictment of Excel than Copilot, strictly speaking. The screenshot doesn’t make clear which cells are numbers and which are text.
I don’t think there’s an explanation that doesn’t make this Copilot’s fault. Honestly, JavaScript shouldn’t allow math between numbers and strings in the first place. “1” + 1 is not a number, and there’s already a type for that: NaN.
Regardless, the sum should be 5 if the first cell is text, so it’s incorrect either way.
Honestly JavaScript shouldn’t allow math between numbers and strings in the first place.
You can coerce values explicitly with Number() or String(), or add static types via TypeScript, if you really don’t want concatenation to be the default. So, again, this feels like an oversight in integration rather than strict formal logic.
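To make the coercion theory concrete, here’s the suspected scenario in a strictly typed setting; Python is shown as the contrast because it refuses the mixed-type sum instead of guessing (the cell values are illustrative):

```python
# A1 stored as text, A2/A3 numeric: the scenario suspected above.
cells = ["1", 2, 3]

# JavaScript silently coerces: "1" + (2 + 3) becomes the string "15".
# Python raises instead:
try:
    total = cells[0] + cells[1] + cells[2]
except TypeError as err:
    print(err)  # can only concatenate str (not "int") to str

# An explicit conversion surfaces the badly typed cell and gives 5.
print(sum(int(c) for c in cells))  # 5
```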
1+2+3 has always been 15
- the Ministry of Truth brought to you by Carl’s Jr
This is the right answer.
LLMs have already become this weird mesh of different services tied together to look more impressive. OpenAI’s models can’t do math, so they farm it out to Python for accuracy.
Why isn’t OpenAI working more modularly, whereby the LLM will call up specialized algorithms once it has identified the nature of the question?
Wasn’t doing this part of the reason OpenSeek was able to compete with much smaller data sets and lower hardware requirements?
I assume you mean DeepSeek? And it doesn’t look like it, according to what I could find, their biggest innovation was “reinforcement learning to teach a base language model how to reason without any human supervision”. https://huggingface.co/blog/open-r1
Some others have replied that ChatGPT and Copilot are already modular: they use Python for arithmetic questions. But that apparently isn’t enough to be useful.
I feel like the best modular AI right now is from Gemini. It can take a scan of a document and turn it into a CSV, which I was surprised by.
I figure it must have multiple steps: OCR, text interpretation, recognizing a table, then piping the text to some sort of CSV tool.
the LLM will call up specialized algorithms once it has identified the nature of the question?
Because there is no “it” that “calls” or “identifies”, much less grasps the “nature of the question”.
This requires intelligence, not probability over the most likely next token.
You can do maths with words and you can write poems with numbers; either requires actual understanding, and that’s the linchpin, rather than parsing and providing an answer based on what has been written so far.
Sure, the model might string together tokens that sound very “appropriate” to the question, in the sense that they fit within the right vocabulary, and if the occurrence in its dataset was just frequent enough it might even be correct, but that’s still not understanding even one single word (or token) within either the question or the corpus.
An LLM can, by understanding some features of the input, predict a categorisation of the input and feed it to different processors. This already works; it doesn’t require anything beyond the capabilities LLMs actually have, and it isn’t perfect. It’s a very good question why this hasn’t happened here: an LLM can very reliably give you the answer to “how do you sum some data in Python”, so it only needs to be able to do that in Excel and put the result into the cell.
There are still plenty of pitfalls. This should not be one of them, so that’s interesting.
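For illustration, a sketch of that routing idea: a trivial classifier that tries to parse the input as plain arithmetic and only falls back to the model when that fails. The call_llm stand-in is hypothetical:

```python
# Route arithmetic to a deterministic evaluator, everything else to the LLM.
import ast
import operator as op

OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

def safe_eval(expr: str) -> float:
    """Evaluate +,-,*,/ expressions without exec'ing arbitrary code."""
    def walk(node):
        if isinstance(node, ast.Expression):
            return walk(node.body)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        raise ValueError("not plain arithmetic")
    return walk(ast.parse(expr, mode="eval"))

def call_llm(query: str) -> str:
    return f"(LLM answer to {query!r})"  # stand-in for the real model call

def route(query: str) -> str:
    try:
        return str(safe_eval(query))  # maths path: exact, computed answer
    except (ValueError, SyntaxError):
        return call_llm(query)        # everything else: the language model

print(route("1+2+3"))  # 6, calculated rather than predicted
```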
Didn’t an excel error cause Greece to adopt austerity measures in response to their debt crisis?
I thought the EU practically forced them to adopt austerity?
I seem to remember the Germans saying no bail out if you don’t fix your shit but can’t remember what the demands were at the time.
Schäuble said “no bailout, period”.
They probably sent a copy of “Braindrain 101” and “Why German cars are actually not shit”, since those are the only two things Germany cares about when it comes to the EU.
That definitely happened in the UK.
It equally fascinates and scares me how widely AI is already being adopted by companies, especially at this stage. I can understand playing around a little with AI, even if its energy requirements can pose an ethical dilemma, but actually implementing it in a workflow seems crazy to me.
Actually, I think the profit motive will correct the mistakes here.
If AI works in their workflow and uses less energy than before… well, that’s an improvement. If it uses more energy, they will revert back because it makes less economic sense.
This doesn’t scare me at all. Most companies strive to stay as profitable as possible and if a 1+1 calculation costs a company more money by using AI to do it, they’ll find a cheaper way… like using a calculator like they have before.
We’re just nearing the peak of the Gartner hype cycle, so it seems like everyone is doing it and it’s being sold at a loss. This will correct.
You put too much faith in people to make good decisions. This could decrease profits by a wide margin and they’d keep using it. Tbh some would stick with the decision even if it throws them into the red.
Personally I am stoked to see multiple multi-billion dollar business enterprises absolutely crater themselves into the dirt by jumping on the AI train. Walmart can no longer track their finances properly or in/out budget vs expenditures? I sleep. They were getting too big and stupid anyway.
You have more faith in people than I do.
I have managers that get angry if you tell them about problems with their ideas. So we have to implement their ideas despite the fact that they will cause the company to lose money in the long run.
Management isn’t run by bean counters (if it was, it wouldn’t be so bad); management is run by egos in suits. If they’ve staked their reputation on AI, they will dismiss any and all information that says their idea is stupid.
AI is strongly biased towards agreeing with the user.
Human: “That’s not right, 1+2+3=7”
AI: “Oh, my bad, yes I see that 1+2+3=15 is incorrect. I’ll make sure to take that on board. Thank you.”
Human: “So what’s 1+2+3?”
AI: “Well, let’s see. 1+2+3=15 is a good answer according to my records, but I can see that 1+2+3=7 is a great answer, so maybe we should stick with that. Is there anything else I can help you with today? Maybe you’d like me to summarise the findings in a chart?”
This is my theory as to why C-suites love it. It’s the ultimate “yes man”, the ultimate ego-stroking machine.
Yeah. Exactly. You put it so much clearer than I did.
I think the profit motive is what’s driving it in many cases. I.e. shareholders have interest in AI companies/products and are pressuring the other companies they have interest in to use the AI services; increasing their profit. Profit itself is inefficiency (i.e. in a perfect market, profit approaches zero).
The problem is how long it takes to correct against stupid managers. Most companies aren’t fully rational; it’s just that when you look at long-term averages, the various stupidities usually cancel out (unless they bankrupt the company).
unless they bankrupt the company
Even then it’s not a guarantee. They just get one of their government buddies to declare them too important to the economy (reality is irrelevant here), and get a massive bailout.
A problem right now is that most models are subsidized by investors. OpenAI’s 2024 numbers were something like 2bn in revenue vs 5bn in expenses. All in the hope of being the leader in a market that may not exist.
This doesn’t scare me at all. Most companies strive to stay as profitable as possible and if a 1+1 calculation costs a company more money by using AI to do it, they’ll find a cheaper way
This sounds like easy math, but it isn’t just about whether AI uses more or less energy. Its stated goal is to replace people, and people have a much, much more complicated cost formula full of subjective measures. An AI doesn’t need health insurance. An AI doesn’t make public comments on social media that might reflect poorly on your company. An AI won’t organize and demand a wage increase. An AI won’t sue you for wrongful termination. An AI can be held responsible for a problem and it can be written off as “growing pains”.
How long will the “potential” of the potential benefits encourage adopters to give it a little more time? How much damage will it do in the meantime?
A5: =COPILOT(“THANKS!”)
Nuclear power: intensifies.
Meanwhile, affable Hollywood personalities are like “huh?”
I’ve had some fun trying to open old spreadsheet files. It’s not been that painful. (Mostly because the people I had to help never discovered macros. In the optimal case, they didn’t even know about functions.) After all, you don’t have weird external data sources. The spreadsheet is a frozen pile of data with strict rules.
I would love to be a fly on the wall when, in 10 years, someone needs to open an Excel file with Copilot stuff in it and needs fully reproducible results.
Wait, you’re telling me it redoes all of the prompts every time you open the document? That’s such a bad way of doing it, it’s borderline criminal.
I did some digging. Apparently people are already raising eyebrows at the fact that it produces non-reproducible results, so much so that the official documentation says this:
The COPILOT function’s model will evolve and improve in the future and formula results may change over time, even with the same arguments. Only use the COPILOT function where this is acceptable. If you don’t want results to recalculate, consider converting them into values with Copy, Paste Values (select values, Ctrl + C, Ctrl + Shift + V).
…This is a note buried in the official help file. Gee gosh ding dilly damn, I hope the people who use this function read the docs thoroughly, just like they have done through the entire history of computing.
“Consider converting th-” [beer through her nose] CONSIDER???!? THIS IS A PRETTY CRUCIAL STEP.
They should apply the Balatro Misprint joker’s visual effect to all of the cells affected by this function. To signal to the users that they’ve entered the Undecidability Zone. Where Gödel only went on Thursdays, and 8-10 Polynesian time on Tuesdays (subject to change).
they can’t be THAT stupid, right? Right?
Imagine making 30,000 requests to a server on a different continent just to open the file and then close it immediately: “oops, it’s not what I was searching for”.
At the very least, why doesn’t Copilot just replace that prompt with the appropriate SUM(A1:A3) formula?
Then Microsoft can’t motivate you to keep paying for the subscription
Ok I see what happened here. You said the numbers “above” and it saw A in the column name. In hexadecimal that’s a 10. But you also said “numbers” plural, and “1” isn’t plural. So it took A + 2 + 3 = 15.
Makes perfect sense, maybe just write better prompts next time. /s
Doesn’t even need the /s. That is largely how those glorified search engines work.
No, it isn’t. The above is using logic, however bad it may be. LLMs are not. It’s just a statistical model. It doesn’t think.
Woah woah woah, stop it right there. I won’t stand for slander against actual search engines!
I doubt it. The column name isn’t part of the data.
However “the numbers above” is data.
3 letters + 7 letters + 5 letters = 15 letters.
Thank you for that. I have the “write better prompts” discussion with a friend all the time.
Looks like it’s just producing random numbers. If you remove ‘the’ and only say ‘sum numbers above’, it’ll result in 10. You can also do the same with no numbers anywhere in the sheet and the result will still be 15 and 10.
You can even tell it to ‘sum the numbers from A1, A2, A3’ and it’ll yet again produce 15. As per the documentation, you should give it specific cells as context, e.g., ‘COPILOT(“sum the numbers”, A1:A3)’. I can confirm this works, though I’m not sure why as the prompt sent to the model should be the exact same.
Definitely more convenient than =SUM(A1:A3)
No kidding. If I have to tell the stupid model what I’m talking about explicitly with a cell range, I might as well write the function myself. And if I’m an excel novice who doesn’t know SUM(), then I should go learn about it to solve my problem instead of offloading to a model whose result I can’t verify.
An LLM is not a replacement for knowledge and skill. It’s a tool that might, might be able to accelerate things you could already do.
That said… can copilot do conditional formatting? “Highlight in red all rows where cell 5 is No” would actually be useful. Conditional formatting rules are madness.
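For what it’s worth, that rule can be scripted today without Copilot. A sketch with openpyxl; the file name, cell range and column are invented to match the example:

```python
# Highlight rows whose column E says "No", via a formula-based rule.
from openpyxl import load_workbook
from openpyxl.formatting.rule import FormulaRule
from openpyxl.styles import PatternFill

wb = load_workbook("report.xlsx")  # hypothetical workbook
ws = wb.active
red = PatternFill(start_color="FFC7CE", end_color="FFC7CE", fill_type="solid")

# $E2 is column-absolute, row-relative: each row is judged by its own E cell.
ws.conditional_formatting.add("A2:F100", FormulaRule(formula=['$E2="No"'], fill=red))
wb.save("report.xlsx")
```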
I just really looked at the image. =COPILOT() is absolutely insane… Does that make a call to Microsoft from this sheet every time it’s opened? When the related cells change? When anything changes? If your document is ‘financial forecast fy 2026’, have you just given the whole doc as context to another company? Sure, sure, sure, they said they wouldn’t look at your stuff, but prove it.
From 10 to 33 keystrokes, what an improvement.
Without specifying cells, it might be pulling numbers from its system prompt instead.
Stunned that they’re fucking with their flagship Office product. Without Excel, everyone could simply drop Office.
Been a sysadmin at small companies for 10 years, and that means I’m the one vetting and purchasing software. Last shop was all in on Google for Business and Google for auth. Worked pretty well, but accounting and HR still had to have Excel.
It’s not even so much that other software can’t do simple Excel tasks, it’s the risk of your numbers getting lost in translation. In any case, nothing holds a candle to the power of Excel. And now they want to fuck with it?!
Excel is often used by people that don’t know what a database is, and you end up with thousands of rows of denormalised data just waiting for typos or extra white spaces to fuck up the precarious stack of macros and formulae. Never mind the update/insert anomalies and data corruption waiting to strike.
I have a passionate hate for Excel, but I understand that not everyone is willing to learn more robust data processing
It’s probably because Access fucking sucks, leaving Excel as the only database-adjacent program available to office workers. I would love to be able to use anything but Excel on my projects. Hell, Python and some CSVs would make my life so much easier, but I ain’t going through IT to let me have that, and it opens up a HUGE can of worms in my line of work if I start using homemade scripts. The execs can pay for a LIMS system if they want me more productive.
The precarious stack of macros and formulae that you also can’t version control properly because it’s a superficially-XML-ified memory dump, not textual source code.
Almost every nontrivial use of Excel would be better off as, if not a database, at least something like a Jupyter notebook with pandas.
I do bioanalysis without a sample management system. I recently had a 1000+ sample project with 6 analytes, all samples needing a few reanalyses due to everything in this whole project being complete shit.
I spent probably three weeks of time just tracking samples in Excel to figure out what needed what analysis. It’s so painful knowing that a proper Python script could do it in a few seconds.
What’s stopping you from using a proper python script?
I haven’t even thought of that. I might just try this for fun.
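If it helps, the core of that tracking job is a few lines of pandas. A sketch only, since the real sheet layout isn’t shown; the file and column names are invented:

```python
# Find every (sample, analyte) pair whose most recent run failed and
# therefore still needs reanalysis.
import pandas as pd

# expected columns: sample_id, analyte, run_date, run_ok (True/False)
df = pd.read_csv("samples.csv", parse_dates=["run_date"])

latest = df.sort_values("run_date").groupby(["sample_id", "analyte"]).tail(1)
needs_rerun = latest[~latest["run_ok"]]

needs_rerun.to_csv("reanalysis_worklist.csv", index=False)
print(f"{len(needs_rerun)} sample/analyte pairs still need a rerun")
```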
I don’t see the problem. Sometimes it’ll be fifteen, and then it will be perfect every time. This saves the user literal hours of time poring over documentation and agonizing over which esoteric function to use, which far outweighs the few times this number will be nine.
Ed Zitron wrote an article a while ago about Business Idiots. From what I recall, the people in charge of these big companies are out of touch with users and the product, and so they make nonsense decisions. Companies aren’t run by the best and brightest. They’re run by people who do best in that environment.
Microsoft seems to be full of business idiots.
WheresYourEd.At/the-era-of-the-business-idiot
So funny
To be clear, I am deeply unconvinced that Nadella actually runs his life in this way, but if he does, Microsoft’s board should fire him immediately.
Ed was just on an Ars Technica livestream 👌 (not posted [yet])
Relax bro, it’s just vibe working, it doesn’t have to be correct
Simple:
Whenever you see a bad, incorrect answer, always give the AI a shit ton of praise. If it gives a correct answer, chastise it to death.
Great. Now we’re going to need therapists for AIs.
I’ve seen Gemini straight up trying to kill itself
Proof? (I want to see the funny)
Great, it’s going to be bad for even longer