Gemma4, in my view, is good enough to do things similar to Gemini 2.5 Flash: if I point it at code and ask for help, and there is a problem with the code, it'll answer correctly in terms of suggestions. But it's not great at using all tools, or at one-shotting things that require a lot of context or "expert knowledge".
If, after a couple more iterations of this, say Gemma6 is as good as current Opus and runs completely locally on a Mac, I won't really bother with the cloud models.
I think the difference is that with LLMs, in a lot of cases you do see some diminishing returns.
I won't deny that the latest Claude models are fantastic at just one shotting loads of problems. But we have an internal proxy to a load of models running on Vertex AI and I accidentally started using Opus/Sonnet 4 instead of 4.6. I genuinely didn't know until I checked my configuration.
AI models will get to this point where for 99% of problems, something like Gemma is gonna work great for people. Pair it up with an agentic harness on the device that lets it open apps and click buttons and we're done.
I still can't fathom that we're in 2026 in the AI boom and I still can't ask Gemini to turn shuffle mode on in Spotify. I don't think model intelligence is as much of an issue as people think it is.
I mean, to me even the difference between Opus and Sonnet is as clear as day and night, and the same goes for Opus and the best GPT model. Opus 4.6 just seems much more reliable in terms of me asking it to do something and that actually happening.
It depends what you're asking it though. Sure, in a software development environment the difference between those two models is noticeable.
But think about the general user. They're using the free Gemini or ChatGPT. They're not using the latest and greatest. And they're happy using it.
And I am willing to bet that a lot of paying users would be served perfectly fine by the free models.
If a capable model is able to live on device and solve 99% of people's problems, then why would the average person ever need to pay for ChatGPT or Gemini?
"XYZ Corp" won't allow their developers to write their desktop app in Rust because they want to consume only 16MB RAM, then another implementation for mobile with Swift and/or Kotlin, when they can release a good enough solution with React + Electron consuming 4GB RAM and reuse components with React Native.
People get hung up on bad optimization. If you are working at a sufficiently large scale, yes, thinking about bytes might be a good use of your time.
But most likely, it's not. At a system level we don't want people to do that. It's a waste of resources. Making a virtue out of it is bad, unless you care more about bytes than humans.
These bytes are human lives. The bytes and the CPU cycles translate to software that takes longer to run, that is more frustrating, that makes people accomplish less in more time than they could, or should. Take too much, and you prevent them from using other software in parallel, compounding the problem. Or you're forcing them to upgrade hardware early, taking away money they could better spend in different areas of their lives. All this scales with the number of users, so for most software with any user base, not caring about bytes and cycles wastes far more people-hours than it saves in dev time.
Look at the whole history of computing. How many times has the pendulum swung from thin to fat clients and back?
I don't think it's even mildly controversial to say that there will be an inflection point where local models get Good Enough and this iteration of the pendulum shall swing to fat clients again.
Assuming improvements in LLMs follow a sigmoid curve, even if the cloud models are always slightly ahead in terms of raw performance it won't make much of a difference to most people, most of the time.
The local models have their own advantages (privacy, no -as-a-service model) that, for many people and orgs, will offset a small performance advantage. And, of course, you can always fall back on the cloud models should you hit something particularly chewy.
(All IMO - we're all just guessing. For example, good marketing or an as-yet-undiscovered network effect of cloud LLMs might distort this landscape).
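To make the sigmoid argument above a bit more concrete, here is a rough sketch. It assumes capability follows a logistic curve and that local models simply trail cloud models by a fixed lag; the lag of 1.5 "years", the curve parameters, and the function names are all made up for illustration, not measured from anything.

```python
import math


def capability(t, midpoint=0.0, steepness=1.0):
    """Logistic (sigmoid) capability curve, scaled to the range 0..1."""
    return 1.0 / (1.0 + math.exp(-steepness * (t - midpoint)))


def gap(t, lag=1.5):
    """Capability gap between a cloud model and a local model trailing it by `lag`."""
    return capability(t) - capability(t - lag)


if __name__ == "__main__":
    # Past the steep part of the curve, the gap keeps shrinking even though
    # the local model is always the same amount of time behind.
    for year in range(0, 7):
        print(f"t={year}  cloud={capability(year):.3f}  "
              f"local={capability(year - 1.5):.3f}  gap={gap(year):.3f}")
```

Under these assumptions the lag in time stays constant while the lag in capability shrinks toward zero, which is the "won't make much of a difference to most people" point. If improvements are not sigmoid-shaped, the conclusion does not follow.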
Yep, and to be honest we don't really need local models for intensive tasks. At least not yet. You can use OpenRouter (and others) to consume a wide variety of open models which are capable of using tools in an agentic workflow, close to the SOTA models, and which are essentially commodities: many providers, each serving the same model and competing with each other on uptime, throughput, and price. At some point we will be able to run them on commodity hardware, but for now the fact that we can have competition between providers is enough to ensure that rug pulls aren't possible.
Plus having Gemma on my device for general chat ensures I will always have a privacy respecting offline oracle which fulfils all of the non-programming tasks I could ever want. We are already at the point where the moat for these hyper scalers has basically dissolved for the general public's use case.
If I was OpenAI or Anthropic I would be shitting my pants right now and trying every unethical dark pattern in the book to lock in my customers. And they are trying hard. It won't work. And I won't shed a single tear for them.
Local models seem somewhere between 9 and 24 months behind. I'm not saying I won't be impressed with what online models will be able to do in two years, but I'm pretty satisfied with the prediction that I won't really need them in a couple of years.
A lot of people are making the mistake of noticing that local models have been 12-24 months behind SotA ones for a good portion of the last couple years, and then drawing a dotted line assuming that continues to hold.
It simply.. doesn't. The SotA models are enormous now, and there's no free lunch on compression/quantization here.
Opus 4.6 capabilities are not coming to your (even 64-128gb) laptop or phone in the popular architecture that current LLMs use.
Now, that doesn't mean that a much narrower-scoped model with very impressive results can't be delivered. But that narrower model won't have the same breadth of knowledge, and TBD if it's possible to get the quality/outcomes seen with these models without that broad "world" knowledge.
It also doesn't preclude a new architecture or other breakthrough. I'm simply stating it doesn't happen with the current way of building these.
edit: forgot to mention the notion of ASIC-style models on a chip. I haven't been following this closely, but last I saw the power requirements are too steep for a mobile device.
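For a sense of scale on the "not coming to your 64-128gb laptop" claim above, here is back-of-the-envelope arithmetic for weight storage alone. The parameter counts below are hypothetical round numbers (frontier model sizes are not public), and the 20% overhead reserved for the OS and KV cache is an assumption.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Memory needed for the weights alone, in GB (10^9 bytes)."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9


def fits(params_billions, bits_per_weight, ram_gb, overhead_fraction=0.2):
    """True if the weights fit while leaving `overhead_fraction` of RAM free."""
    usable_gb = ram_gb * (1 - overhead_fraction)
    return weight_memory_gb(params_billions, bits_per_weight) <= usable_gb


if __name__ == "__main__":
    # Hypothetical model sizes in billions of parameters, not real model specs.
    for params in (8, 70, 400, 1000):
        for bits in (16, 8, 4):
            size = weight_memory_gb(params, bits)
            print(f"{params:>5}B @ {bits:>2}-bit: {size:>7.1f} GB  "
                  f"fits in 128 GB: {fits(params, bits, 128)}")
```

This only counts raw weight bytes. It says nothing about how much quality is lost at lower bit widths, which is the "no free lunch" part of the argument.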
Yeah, but that's the current state of the art after decades of aggressive optimizations, there's no foreseeable future where we'll ever be able to cram several orders of magnitude more ram into a phone.
We already cram several orders of magnitude more flash storage into phones than RAM (e.g. my phone has 16 GB RAM but 1 TB storage); even now, with some smart coding, if you don't need all that data at the same time for random access at sub-millisecond speed, it's hard to tell the difference.
Pretty sure there's at least a couple orders of magnitude in purely algorithmic areas of LLM inference; maybe training, too, though I'm less confident here. Rationale: meat computers run on 20W, though pretraining took a billion years or so.
Yes, I agree that this is the right solution, because for a locally-hosted model I value more the quality of the output than the speed with which it is produced, so I prefer the models as they were originally trained, not with further quantizations.
While that paper praises the Apple advantage in SSD speed, which allows decent inference performance with huge models, nowadays SSD speeds equal to or greater than that can be achieved in any desktop PC that has dual PCIe 5.0 SSDs, or even one PCIe 5.0 and one PCIe 4.0 SSD.
Because I had also independently reached this conclusion, like I presume many others, I have just started to work a week ago on modifying llama.cpp to use in an optimal manner weights stored on SSDs, while also batching many tasks, so that they will share each pass through the SSDs. I assume that in the following months we will see more projects in this direction, so the local hosting of very large models will become easier and more widespread, allowing the avoidance of the high risks associated with external providers, like the recent enshittification of Claude Code.
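As a toy illustration of the idea described above (weights stay on the SSD, and a batch of tasks shares each pass over them), here is a minimal sketch using only the Python standard library. This is not how llama.cpp does it and not the commenter's implementation; the file layout, `write_layers`, and `forward_streamed` are hypothetical names for illustration.

```python
import mmap
import os
import struct
import tempfile


def write_layers(path, layers):
    """Write square matrices (lists of rows) back to back as raw little-endian float32."""
    with open(path, "wb") as f:
        for matrix in layers:
            for row in matrix:
                f.write(struct.pack(f"<{len(row)}f", *row))


def forward_streamed(path, n_layers, dim, batch):
    """Apply every layer to every vector in `batch`, reading each layer from disk once."""
    layer_bytes = dim * dim * 4  # float32 = 4 bytes
    with open(path, "rb") as f, \
            mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        for i in range(n_layers):
            # Only this layer's bytes are touched here; the OS pages them in on demand.
            flat = struct.unpack_from(f"<{dim * dim}f", mm, i * layer_bytes)
            rows = [flat[r * dim:(r + 1) * dim] for r in range(dim)]
            # All requests in the batch reuse the layer that was just read.
            batch = [[sum(w * x for w, x in zip(row, vec)) for row in rows]
                     for vec in batch]
    return batch


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "weights.bin")
        write_layers(path, [[[2.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [1.0, 0.0]]])
        print(forward_streamed(path, n_layers=2, dim=2, batch=[[1.0, 2.0], [3.0, 4.0]]))
```

The point of batching is that the cost of reading a layer from disk is paid once per pass, no matter how many requests are in the batch. A real implementation would also need nonlinearities, attention, a KV cache, and prefetching of the next layer while the current one is being computed.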
But that difference atm is the difference between it being OK on its own with a team of subagents given good enough feedback / review mechanisms or having to babysit it prompt by prompt.
By the time Gemma6 allows you to do the above, the proprietary models will supposedly already be on the next step change. It just depends on whether you need to ride the bleeding edge. But especially because it's "intelligence", there's an obvious advantage in using the best version, and it's easy to hype it up and generate FOMO.
> But that difference atm is the difference between it being OK on its own with a team of subagents given good enough feedback
Do people actually build meaningful things like that?
It's basically impossible to leave any AI agent unsupervised, even with an amazing harness (which is incredibly hard to build). The code slowly rots and drifts over time if not fully reviewed and refactored constantly.
Even if teams of agents working almost fully autonomously were reliable from a functional perspective (they would build a functional product), the end product would have ever increasing chaos structurally over time.
When that happens, you'll have fomo from not using opus 5.x. The numbers that they showed for Mythos show that the frontier is still steadily moving (and maybe even at a faster pace than before)
There is a cognitive ceiling for what you can do with smaller models. Animals with simpler neural pathways often outperform whatever we think they are capable of, but there's no substitute for scale. I don't think you'll ever get a 4B or 8B model equivalent to Opus 4.6. Maybe just for coding tasks, but certainly not with Opus' breadth.
The only thing that we are sure can't be highly compressed is knowledge, because you can only fit so much information into a given entropy budget without losing fidelity.
The minimal size limits of reasoning abilities are not clear at all. It could be that you don't need all that many parameters. In which case the door is open for small focused models to converge to parity with larger models in reasoning ability.
If that happens we may end up with people using small local models most of the time, and only calling out to large models when they actually need the extra knowledge.
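The "local most of the time, call out when needed" setup described above could look roughly like this. Both model functions are hypothetical stand-ins (no real model or API is called), and the `confident` flag is an assumption about what a local model could report.

```python
from dataclasses import dataclass


@dataclass
class Answer:
    text: str
    confident: bool
    source: str


def local_model(prompt):
    """Hypothetical stand-in for a small on-device model."""
    known = {"2+2": "4"}
    if prompt in known:
        return Answer(known[prompt], True, "local")
    return Answer("", False, "local")


def cloud_model(prompt):
    """Hypothetical stand-in for a large hosted model."""
    return Answer(f"[cloud answer to: {prompt}]", True, "cloud")


def answer(prompt, local=local_model, cloud=cloud_model):
    """Try the local model first; escalate only when it is not confident."""
    first = local(prompt)
    if first.confident:
        return first
    return cloud(prompt)


if __name__ == "__main__":
    print(answer("2+2"))
    print(answer("an obscure historical detail"))
```

The routing itself is trivial. The hard part, which this sketch does not solve, is getting a trustworthy `confident` signal out of a real model.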
> and only calling out to large models when they actually need the extra knowledge
When would you want lossy encoding of lots of data bundled together with your reasoning? If it is true that reasoning can be done efficiently with fewer parameters it seems like you would always want it operating normal data searching and retrieval tools to access knowledge rather than risk hallucination.
And re: this discussion of large data centers versus local models, do recall that we already know it's possible to make a pretty darn clever reasoning model that's small and portable and made out of meat.
> we already know it's possible to make a pretty darn clever reasoning model
There is a problem though: we know that it is possible, but we don't know how to do it (at least not yet, as far as I am aware). So we know the answer to the "what?" question, but we don't know the answer to the "how?" question.
I think you underestimate the amount of knowledge needed to deal with the complexities of language in general as opposed to specific applications. We had algorithms to do complex mathematical reasoning before we had LLMs, the drawback being that they require input in restricted formal languages. Removing that restriction is what LLMs brought to the table.
Once the difficult problem of figuring out what the input is supposed to mean was somewhat solved, bolting on reasoning was easy in comparison. It basically fell out with just a bit of prompting, "let's think step by step."
If you want to remove that knowledge to shrink the model, we're back to contorting our input into a restricted language to get the output we want, i.e. programming.
I think you are underestimating the strength a small model can get from tool use. There may be no substitute for scale, but that scale can live outside of the model and be queried using tools.
In the worst case a smaller model could use a tool that involves a bigger model to do something.
except you don't want knowledge in the model, and most of that "size" comes from "encoded knowledge", i.e. overfitting. The goal should be to only have language handling in the model, and the knowledge in a database you can actually update, analyze etc. It's just really hard to do so.
"world models" (for cars) maybe make sense for self driving, but they are also just a crude workaround to have a physics simulation to push understanding of physics. Though in contrast to most topics, basic physics tends not to change randomly and is based on observation of reality, so it probably can work.
Law, health advice, programming stuff etc., on the other hand, change all the time and are all based on what humans wrote about them. In some areas (e.g. law or health) that is very commonly outdated, wrong or at least incomplete in a dangerous way. And programming changes all the time.
Having this separation of language processing and knowledge sources is ... hard, language is messy and often interleaves with information.
But this is most likely achievable with smaller models. Actually it might even be easier with a small model. (Though whether the necessary knowledge bases can fit and run on a Mac is another topic...)
And this should be the goal of AI companies, as it's the only long term sustainable approach as far as I can tell.
I say "should" because it may not be: if they solve it that way and someone manages to clone their success, then they lose all their moat for specialized areas, as people can create knowledge bases for those areas with know-how OpenAI simply doesn't have access to. (Which would be a preferable outcome, as it means actual competition and a potentially fair, working market.)
It is recommended to enable the TLS 1.3 hybrid key exchange X25519MLKEM768 on servers that support it
last time I checked, AI didn't even list it when you asked it for a list of TLS 1.3 ciphers (though it has been widely supported since even before it was fully standardized..)
this isn't surprising as most input sources AI can use for training are outdated and also don't list it
maybe someone at OpenAI will spot this and feed it explicitly into the next training cycle, or people will cover it more and through this it is fed in implicitly
but what about all the niche but important information with just a handful of outdated Stack Overflow posts or similar? (which are unlikely to get updated now that everyone uses AI instead..)
The current "let's just train bigger models with more encoded data" approach just doesn't work. It can get you quite far, though, but then it hits a ceiling. And trying to fix it by also giving it additional knowledge "it can ask for if it doesn't know" has so far not worked, because it reliably fails to realize it doesn't know when it has enough outdated/incomplete/wrong information encoded in the model. Only by ensuring it doesn't have any specialized domain knowledge can you make sure that approach works, IMHO.
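A minimal sketch of the separation argued for above: the "language layer" never answers from memory and only phrases what an updatable store returns. The class and function names are hypothetical, and the stored lists and dates are illustrative placeholders, not a complete or authoritative list of TLS groups.

```python
from datetime import date


class KnowledgeBase:
    """Updatable fact store kept outside the model."""

    def __init__(self):
        self._facts = {}

    def update(self, topic, values, as_of):
        # Overwrites the old entry, so stale data does not linger.
        self._facts[topic] = (list(values), as_of)

    def lookup(self, topic):
        return self._facts.get(topic)


def answer(topic, kb):
    """Language layer: formats what the store returns, or admits there is no entry."""
    hit = kb.lookup(topic)
    if hit is None:
        return f"I have no entry for '{topic}'."
    values, as_of = hit
    return f"{topic}: {', '.join(values)} (as of {as_of.isoformat()})"


if __name__ == "__main__":
    kb = KnowledgeBase()
    topic = "tls 1.3 key exchange groups"
    # Illustrative, incomplete entries with placeholder dates.
    kb.update(topic, ["x25519", "secp256r1"], date(2020, 1, 1))
    print(answer(topic, kb))
    kb.update(topic, ["X25519MLKEM768", "x25519", "secp256r1"], date(2025, 1, 1))
    print(answer(topic, kb))
    print(answer("some niche library flag", kb))
```

Because the answer comes only from the store, updating one entry changes the output immediately, and a missing entry produces "I have no entry" instead of a stale guess. Getting a real language model to behave this strictly is the hard part, as said above.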
This is the classic Apple approach: wait to understand what the thing is capable of doing (aka let others make sunk investments), envision a solution that is way better than the competition, and then architect a path to building a leapfrog product that builds a large lead.
Pretty much it. That said, they did try to appease the markets by announcing 'Apple Intelligence' so they didn't appear to be behind everyone.
They did do the smart thing of not throwing too much capital behind it. Once the hype crumbles, they will be able to do something amazing with this tech. That will be a few years off but probably worth the wait.
For consumers AI has anti hype right now. It's off-putting to see consumer products slapped with a hundred AI labels. I see people talk about how you can turn off all of Apple Intelligence with one toggle rather than hundreds on Samsung.
Firefox is also marketing how easy it is to disable AI.
I think a lot of people are not hyped about AI in their toaster, but... I don't think people are generally turned off from deeper integration in their OS itself. Especially when, for some people, this represents ideas similar to how programmer-types get excited about Shortcuts.
Decently accessible automation and discovery, without having to go figure out a bunch of stuff
> Decently accessible automation and discovery, without having to go figure out a bunch of stuff
Sure, but is this actually happening? Last time I tried, Atlassian's heavily-pushed AI couldn't even turn a Jira ticket number in Confluence into a clickable link. Similarly, Windows has been actively moving away from providing locally-installed applications in the Start menu search towards offering random internet garbage.
I'm all for using a LLM to make something like Siri able to understand both "Siri, turn off the lights" and "Siri, make it dark!" - but that's not what's being pushed onto consumers, because there is no way anyone is going to pay $100/month for any version of that.
People like features, benefits, and outcomes. AI isn't a feature, it's a technology that can enable features. But it's being marketed as the only thing that matters.
The user does not give two shits if the new laptop "has AI". This is how Apple has been killing it lately: they market the MacBooks as being powerful, cheap, with long battery life, and a premium feel. Things the user cares about. Most of the stuff marketers are just blanket labeling "AI" will eventually be shuffled to the background and rebranded with a more specific term to highlight the feature being delivered rather than the fact it's "AI".
You're right, there is plenty of space for features that require AI to work but that are indistinguishable from "classical" features. Better autocompletion is a proven one, for example.
Yeah, exactly. The Apple Intelligence thing was pure BS to shut up people who kept saying Apple was going to get disrupted by missing out.
Apple seems to follow the values that Steve laid out. Tim isn't a visionary but he seems to follow the principles associated with being disciplined with cash quite well. They haven't done any stupid acquisitions either. Quite the contrast with OAI.
The competition has also attached it to a toxic brand and heavily integrated it with actively user-hostile applications. It doesn't matter if your tech is years ahead when people expect using it will mean your image content info will be sold to anyone willing to pay a cent for it.
I would have, and I work in tech. I'd guess that most people who use iOS have zero idea of what Android can and can't do, because they never use it and probably never will so what's the point of trying to find out.
The Vision Pro was a development kit, just like the first generation Apple Watch. It's not meant for consumers in general; it's meant for the developers among the consumers.
We will see if they ever release a new VisionOS device, but it's not the first time they did that; see also the Apple Watch.
Depending on price I would or would not buy an Apple car; but I am quite interested in options for a car that (1) is electric; (2) doesn't spy on me and sell my data; (3) doesn't take video of me and my passengers and do weird things with it; and (4) doesn't support Republicans / white supremacists / Elon Musk.
And I imagine that like-minded consumers are a pretty large market.
Will this strategy work every time? Maybe for AI it will work (the market is competitive and Apple just purchases the best model for its consumers).
But this approach may not work in other areas: e.g. building electric batteries, wireless modems, electric cars, solar cell technology, quantum computing etc.
Essentially Apple got lucky with AI, but it needs to keep investing in cutting edge technology in the various broad areas it operates in and not let others get too far ahead!
It works often enough for the company to be wildly successful. They can simply cut their losses and withdraw from industries where it hasn't, such as EVs.
I think their M chips are a good example. They ran on Intel for so long, then did the impossible by changing architecture on the Mac, without much transition pain.
Obviously that was built upon years of iPhone experience, but it shows they can lag behind, buy from other vendors, and still win when it becomes worth it to them.
How is changing the architecture of a platform that only you make hardware for doing the impossible?
They could change the architecture again tonight, and start releasing new machines with it. The users will adopt because there is literally no other choice.
Every machine they release will be fastest and most capable on the platform, because there is no other option
Rosetta 1 delivered 50-80% of the performance of native, during the PPC->Intel transition. It turns out, you can deliver not particularly impressive performance and still not ruin your app ecosystem, because developers have to either update to target your new platform, or leave your platform entirely.
You can also voluntarily cut off huge chunks of your own app ecosystem intentionally, by giving up 32bit support and requiring everything to be 64bit capable.
...because users have no other choice when only one vendor controls both the hardware and the software. They can either use the apps still available to them, or they can leave. And the cost of leaving for users is a lot higher.
It's also notably not the first time they switched. They did the Motorola 68k architecture, then IBM PowerPC, then Intel x86 (for a single generation, then x86_64) and now Apple M-Series.
They do the things they think they can do very well.
Why would they try to build electric batteries, wireless modems, electric cars, solar cells, or quantum computers, if their R&D hadn't already determined that they would likely be able to do so Very Well?
It's not like any of those are really in their primary lines of business anyway.
When have they done that since the first iPhone in 2007? The watch maybe? Though not sure that's "leapfrog" better than anyone else's smartwatch, but I don't have one so maybe I'm wrong.
The parent poster is saying (and I agree) that AirPods and AirTags are only superior because Apple anti-competitively privileges their integration with iPhones. It's not that they are better at the hardware level by itself.
And since iPhones form the largest single company's device network in the rich countries, that is a pretty big advantage.
> wait to understand what the thing is capable of doing
My parents use Android to ask "What are the 5 biggest towers in Chicago" or "Remove the people on my picture" while apparently iPhone is only capable of doing "Hey Siri start the Chronometer / There is no contact named Chronometer in your phone".
My iPhone is lagging a ridiculous 10 years behind. It's just that I don't trust Google with my credit card.
Apple's AI stuff also uses cloud features, though you can't use them on other platforms. The problem with Apple's new cloud features is that they generally just suck. I'm surprised iCloud works so well with how hard they're fumbling basic stuff like this.
I would argue that they are as bad as each other. I have to repeat most voice commands to Siri and Alexa rather than having them understood the first time. No experience with Google.
It's even more superpowered than previous implementations of this strategy.
When they made the iPhone, iPod, and Apple Watch they had no specific hardware advantage over competitors. Especially with early iPhone and iPod: no moat at all, make a better product with better marketing and you'll beat Apple.
Now? Good luck getting any kind of reasonably priced laptop or phone that can run local AI as well as the iPhone/MacBook. It doesn't matter that Apple Intelligence sucks right now, what matters is that every request made to Gemini is losing money and possibly always will.
This is especially true in 2026 where Windows laptops are climbing in price while MacBooks stay the same.
They're talking about free inference like Android and Google Home devices. No one is paying subscription fees for these and they're running their inference in the cloud. Apple Intelligence, for the most part, is running on the device.
Apple's advantage was that they did everything in house and had the marketing and distribution capabilities. And now you've got the ecosystem lock-in.
In hindsight it's obvious why they pulled it off: nobody else could do it. They all had pieces missing.
Apple aren't in the business of building chatbots to impress investors (other than some WWDC2024 vaporware they'd rather not talk about any more). They're in the business of consumer hardware.
Consumers want iPhones and (if Apple are right) some form of AR glasses in the next decade. That's their focus. There's a huge amount of machine learning and inference that's required to get those to work. But it's under the hood and computed locally. Hence their chips. I don't see what Apple have to gain by building a competitor to what OpenAI has to offer.
~25% of Apple's revenue came from services in FY25 (and 50% from iPhone, ~25% from other hardware). They made $415B in that year, so ~$100B from services alone!
Agreed, I've been a loyal iPhone user for a long time, and very few people I know use iMessage. I use it with my parents because they don't have any other messenger, and they don't even really know it's iMessage, they just think of it as texting. Everyone I know is using something else for messages, whether it's Discord, Instagram DMs, WhatsApp, or occasionally Telegram or Snapchat.
No one uses iMessage in my country. Yet iPhones are sought after. Some of us just really like iPhones for the experience - not everything is a conspiracy. People can have different tastes and are more free to choose than people on HN like to believe.
I totally buy this as someone located in the US, but what is everybody else using? It can't be WhatsApp? Is everyone sending all their connection graph data to Meta?
It's WhatsApp. No one thinks about sending data to Meta. The world is much bigger than the HN bubble, and outside it almost no one thinks about privacy implications.
Absolutely this. No one cares about privacy. 99.9% of the population has no clue how tech works. "Oh, it's an app on my phone." That's what the typical consumer understands. How text travels from one phone to another is something magical.
Got WhatsApp, because there is no other channel to communicate with customers. It's literally used by everyone without exceptions. Really scary.
What I don't get about Apple is when everyone else was giving up on yet another VR attempt, moving into AI, they decide AI isn't worth it, and it was the right time for a me too VR headset.
So no VR, given the price and lack of developer support, and late arrival into AI.
It is the same pattern: late on VR, late on AI. Both technologies have a pricing problem. I would guess that Apple is working to create the conditions to make them cheap enough to sell to everyone.
I don't like companies forcing their newest features on me noisily and constantly trying to ship new features and see what sticks so you can't trust whether a feature advertised one week will even be there the next.
However, I have even less patience for companies forcing paid-for third-party ads down my throat on a paid product. Slack at least doesn't sell my eyeballs. Facebook, Twitter, Google's ads are worse to me than new feature dialogues.
Which brings me to Apple. I pay for a $1k+ device, and yet the app store's first result is always a sponsored bit of spam, adware, or sometimes even malware (like the fake ledger wallet on iOS, that was a sponsored result for a crypto stealer). On my other devices, I can at least choose to not use ad-ridden BS (like on android you can use F-Droid and AuroraStore, on Linux my package manager has no ads), but on iOS it's harder to avoid.
Apple hasn't sunk to Google levels in terms of ads, but they've crossed a line.
I get it but... well I think of App Store as... a store. I don't have to go there.
I'm actually pretty disappointed in the lack of discovery available in the App Store, but I rarely go there. I'm fine with advertising being there. I wish it was better but I'm not offended that there is paid promotion in a store.
>comes up with other banks, BankNames US app (not the country you are in)
>revolut etc (can't use in the country you are in)
>ten minutes later
even worse when it's your telecom telling you to install their Official App so you can pay your bills or they will cut your cellular service, and you can't find it
I don't see what that has to do with (increased) advertising on the App Store (IMO search there never has been good) or the comment you replied to in which colechristensen said: "I'm actually pretty disappointed in the lack of discovery available".
I think paid advertising may even help improve discoverability on the App Store because, instead of making 10 or 20 to-do list apps and hoping to get them to rank high by a combination of sheer luck and SEO tricks, scammers may only make one, and pay to get that to the top of the list.
In supermarkets, product placement is affected by two factors: how much producers are willing to pay for a good spot (e.g. by offering lower wholesale prices if the product gets a more visible place) and vetting by the store owner.
I don't think different solutions exist in the App Store. Apple doesn't want to do much vetting, making advertising the only thing that may help (and yes, it would be awesome if there were a store that did do much vetting, but that requires a world where many different stores exist, and we aren't there (yet)).
As someone who recently moved to NL from the US I encounter this issue about once a week and it's blocking me from doing serious things like paying for parking, taxes, utilities or government services, all of which have apps that are only available on the Dutch app store.
I have a separate Dutch Apple ID I can switch to, but each time I log out I risk accidentally deleting all my data.
> all of which have apps that are only available on the Dutch app store.
This isn't really on Apple though. Blame the companies/developers for geo-gating their apps. It's a simple checkbox in the store to make it available for other countries.
I get an app recommendation from a friend, I go to the App Store and search for it. I have to be super careful about which link I'm actually clicking on and which app I'm installing, because the App Store is riddled with spam and malware.
I wouldn't mind, except that Apple charge 30% of everything with the justification that they are keeping the ecosystem free of spam and malware...
I've been installing apps from the App Store for more than a decade and have never ever accidentally downloaded spam or malware. I'm sure it's there but it's really not "riddled" with it in my experience searching for apps. What it's riddled with is subscription-based apps whose free tier is worthless.
I haven't noticed this at all and I wonder if you're mistaking curation for advertising? When I open up the App Store I get a panel titled "games we love" and a listing of indie games that are clearly not paid-for ads. The ads in search are visibly marked as ads, and while I don't particularly like ads in general, they are pretty easy to avoid.
Mine is Moneris Go, and the top review is titled "Garbage App!!!!" lol
Honestly the last time I remember using the App Store was years ago and I can't recall if they had ads or not. Imo it's distasteful and I wish they didn't have them. Still leagues better than the fucking ads in the start menu which caused me to give up on gaming and Windows forever.
If I open the app store and search "Gemini", the first result is "ChatGPT (advertisement)"
If I search for my bank, I get another bank. If I search for "Wordle", I get a bunch of ad-supported spamware (both the ad and non-ad results) before the real NYT Games app.
The app store has ads in search results. The primary way my technologically inept relatives end up with the wrong app installed, btw, is by searching, clicking the first result, and getting complete trash adware.
Apple should be ashamed of selling out their users.
Apple keeps nagging me to upgrade to godawful Tahoe. Every time there's a system update (which includes Safari, Safari TP, CLT etc. updates) Tahoe is always default checked. Even when I specifically click on a Sequoia point update, the Tahoe update is always checked instead of that point release. This has way more destructive potential than "try our new AI feature" in apps.
To add insult to injury, the one AI feature that I may want to evaluate, Claude Code integration in Xcode, is gated behind the Tahoe upgrade, even though there is absolutely no reason for it to be, given that every other IDE integrates AI features just fine on any recent OS.
Edit: Oh and I'm not getting bombarded in Slack at all, maybe because my company doesn't pay for any of the AI stuff there. Last time I got a banner or something like that was months ago.
> Think about the App Store. Apple didn't build the apps, they built the platform where apps ran best, and the ecosystem followed.
As far as I remember Apple basically got forced into opening the platform to 3rd party developers. Not by regulation but by public pressure. It wasn't their initial intention to allow it.
Nvidia restricts gamer cards in data centers through licensing, eventually they will probably release a cheaper consumer AI card to corner the local AI market that can't be used in data centers if they feel too much of a threat from Apple.
Imagine a future where Nvidia sells the exact same product at completely different prices, cheap for those using local models, and expensive for those deploying proprietary models in data centers.
[WSJ] sources expect.. first units in H1 2026, with GTC as the most likely unveiling stage.. NPU reportedly exceeds both Intel and AMD's current neural processing units.. If the integrated GPU delivers RTX 5070-class performance in a thin laptop form factor, it would eliminate the need for a separate GPU die, fundamentally changing how gaming laptops are designed.
If they can get Valve/Steam for an OS that handles most games well that could in fact be huge if the pricepoint is a bit lower initially but with plenty of unified RAM (both for AI but also games).
That said, gaming laptops cooling issues are so often around the GPU so it'd also require a seasoned manufacturer to make it correctly.
Thing is, Apple never considered racing against LLM runners. Apple's success comes from human-centered design, it is not trying to launch a me-too product just because it increases their stock price.
iPod was not the first MP3 player.
iPhone was not even 3G at launch -- in the middle of 3G marketing craze.
They sure got lucky that unified memory is well-suited for running AI, but they just focused on having cost- and energy-efficient computing power. They've been having glasses in sight for the last 10 years (when was Magic Leap's first product?) and these chips have been developed with that in mind. But not only the chips: nothing was forcing Apple to spend the extra money for blazing fast SSD -- but they did.
So yes, Apple is a hardware company. All the services it sells run on their hardware. They've just designed their hardware to support their users' workflows, ignoring distractions.
With that said, LLMs make the GPU + memory bandwidth fun again. NVidia can't do it alone, Intel can't do it alone, but Apple positioned itself for it. It reminds me how everyone was surprised when they introduced 64-bit ARM for everyone: very few people understood what they were doing.
Tbh there are NVidia GPUs that beat Apple perf 2x or 3x, but these are desktop or server chips consuming 10x the power. Now all Apple needs to do is keep delivering performance out of Apple Silicon at good prices and best energy efficiency. Local LLM make sense when you need it immediately, anywhere, privately -- hence you need energy efficiency.
Honestly, I think part of the reason Apple hasn't jumped deep into AI is due to two big reasons:
1) Apple is not a data company.
2) Apple hasn't found a compelling, intuitive, and most of all, consistent, user experience for AI yet.
Regarding point 2: I haven't seen anyone share a hands down improved UX for a user driven product outside of something that is a variation of a chat bot. Even the main AI players can't advertise anything more than, "have AI plan your vacation".
Put proper LLM into Siri. Encourage developers to expose the functionality of their apps as functions, allow Siri LLM to access those (and sprinkle some magic security dust over it).
Boom, you have an agent in the phone capable of doing all the stuff you can do with the apps. Which means pretty much everything in our life.
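To make the "expose app functionality as functions" idea concrete, here's a rough sketch of what that could look like. Everything in it is made up for illustration (the registry, the `expose` decorator, the `music.set_shuffle` name, the JSON call format); it is not how Siri, App Intents, or any real assistant works, and the "magic security dust" is reduced to refusing anything the app never registered.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: function name -> callable that an app chose to expose.
REGISTRY: Dict[str, Callable[..., str]] = {}

def expose(name: str):
    """Decorator an app would use to register one of its actions."""
    def wrap(fn: Callable[..., str]) -> Callable[..., str]:
        REGISTRY[name] = fn
        return fn
    return wrap

@expose("music.set_shuffle")
def set_shuffle(enabled: bool) -> str:
    # Stand-in for the real app action.
    return f"shuffle {'on' if enabled else 'off'}"

def dispatch(model_output: str) -> str:
    """Run a model-emitted call shaped like {"name": ..., "arguments": {...}}."""
    call = json.loads(model_output)
    fn = REGISTRY.get(call["name"])
    if fn is None:
        # Refuse anything the app never exposed.
        return f"unknown function: {call['name']}"
    return fn(**call.get("arguments", {}))

print(dispatch('{"name": "music.set_shuffle", "arguments": {"enabled": true}}'))
```

The model only ever produces the JSON; the OS decides whether the named function exists and is allowed to run.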
there are always three elements in the equations of business model:
1. marginal cost
2. marginal revenue
3. value created
for llm providers, i always believe the key is to focus on high value problems such as coding or knowledge work, because of the high marginal cost of taking on new customers (the tokens burnt) and the low marginal revenue if the problem is not valuable enough. in this sense no llm provider can scale like previous social media platforms without taking huge losses. no meaningful user stickiness can be built unless you have users' data, and there is no meaningful business model unless people are willing to pay a high price for the problem you solve, in the same way as paying for a saas.
i am really not optimistic about the llm providers other than anthropic. it seems that the rest are just burning money, and for what? there is no clear path for monetization.
and when local llms are powerful enough, those providers will soon be obsolete because of their cost and unsustainable business model. at the end of the day, i do agree that it is the consumer hardware provider that can win this game.
I am super bullish on Google, they are my best bet to earn from models. Mostly because they are vertically integrated (other revenue streams) + open to provide services to other companies (Apple deal).
I don't think I have unique insight on this but the common belief is they are desperately trying to reach AGI or at least have some halo model that will allow them to rise over the other companies. The problem is they have a hilariously large monthly burn paying for compute. If they don't produce something, they are in trouble if investors stop offering capital.
Apple is almost 2 years out from their announcement of Apple Intelligence. It has barely delivered on any of the hype. New Siri was delayed and barely mentioned in the last WWDC; none of the features are released in China.
In other news, people keep buying iPhones, and Apple just had its best quarter ever in China. AAPL is up 24% from last year.
i don't even care about apple intelligence. stays off, not sure anyone really cares about it who is also interested in what this ai shenanigans is about on a local device. i think people keep conflating apple intelligence with all these convos about how macs are kinda dope for joe consumer wanting to tinker with llms.
that's the other part of the story that matters, not apple intelligence. this writeup tries to touch on that, apple is uniquely positioned to do really well in this arena if/when local llms become commodities that can do really impressive stuff. we're getting there a lot faster than we thought, someone had a trillion parameter qwen3.5 model going on his 128gb macbook and now people are thinking of more creative ways to swap out what's in memory as needed.
Indeed, a lot of the people that bought iPhones are now buying Macs with a binned version of the chip they already bought. So much so that Apple is in danger of running out of them.
I think the article is missing a whole aspect of how Apple is making sure it doesn't face actual competition while it's "playing it safe":
Even if the investment is overblown, there is market-demand for the services offered in the AI-industry. In a competitive playing field with equal opportunities, Apple would be affected by not participating. But they are establishing again their digital market concept, where they hinder a level playing field for Apple users.
Like they did with the App Store (where Apple owns the marketplace but also competes in it), they are setting themselves up as the "bank always wins" gatekeeper in the Apple ecosystem for AI services, by making "Apple Intelligence" an ecosystem orchestration layer (and thus themselves the gatekeeper).
1. They made a deal with OpenAI to close Apple's competitive gap on consumer AI, allowing users to upgrade to paid ChatGPT subscriptions from within the iOS menu. OpenAI has to pay at least (!) the usual revenue share for this, but considering that Apple integrated them directly into iOS I'm sure OpenAI has to pay MORE than that. (also supported by the fact that OpenAI doesn't allow users to upgrade to the 200USD PRO tier using this path, but only the 20USD Plus tier) [1]
2. Apple's integration is set up to collect data from this AI digital market they created: Their legal text for the initial release with OpenAI already states that all requests sent to ChatGPT are first evaluated by "Apple Intelligence & Siri" and "your request is analyzed to determine whether ChatGPT might have useful results" [2]. This architecture requires(!) them to not only collect and analyze data about the type of requests, but also gives them first-right-to-refuse for all tasks.
3. Developers are "encouraged" to integrate Apple Intelligence right into their apps [3]. This will have AI-tasks first evaluated by Apple
4. Apple has confirmed that they are interested to enable other AI-providers using the same path [4]
--> Apple will be the gatekeeper to decide whether they can fulfill a task by themselves or offer the user to hand it off to a 3rd party service provider.
--> Apple will be in control of the "Neural Engine" on the device, and I expect them to use it to run inference models they created based on statistics of step#2 above
--> I expect that AI orchestration, including training those models and distributing/maintaining them on the devices, will be a significant part of Apple's AI strategy. This could cover a lot of text and image processing and already significantly reduce their datacenter cost for cloud-based AI-services. For the remaining, more compute-intensive AI-services they will be able to closely monitor (via above step#2) when it will be most economic to in-source a service instead of "just" getting revenue-share for it (via above step#1).
So the juggernaut Apple is making sure to get the reward from those taking the risk. I don't see the US doing much about this anti-competitive practice so far, but at least in the EU this strategy has been identified and is being scrutinized.
> Pure strategy, luck, or a bit of both? I keep going back and forth on this, honestly, and I still don't know if this was Apple's strategy all along, or they didn't feel in the position to make a bet and are just flowing as the events unfold maximising their optionality.
Maximizing the available options is in fact a "strategy", and often a winning one when it comes to technology. I would love to be reminded of a list of tech innovators who were first and still the best.
How do you rate Vision Pro? It was not the first one, but it was certainly the best one. Total dud though, while Meta Ray Bans are selling like hot cakes (irrespective of what you think of the company)
It's the same everywhere: great fundamentals pay off. It's true of martial arts, dance, and absolutely about software platforms. You just have to trust that process and invest in it, which Apple does (although frustratingly not enough!).
For the love of all that's holy - folks please stop using AI to publish smart sounding texts. While you may think you are "polishing" your text, you are just disrespecting your readers. Write in your own words.
This seems mistaken to me. The core idea is that LLMs are commoditizing and that the UI (Siri in this case) is what users will stick with.
But... what's the argument that the bulk of "AI value" in the coming decade is going to be... Siri Queries?! That seems ridiculous on its face.
You don't code with Siri, you don't coordinate automated workforces with Siri, you don't use Siri to replace your customer service department, you don't use Siri to build your documentation collation system. You don't implement your auto-kill weaponry system in Siri. And Siri isn't going to be the face of SkyNet and the death of human society.
Siri is what you use to get your iPhone to do random stuff. And it's great. But ... the world is a whole lot bigger than that.
> Won't be surprised for the re-introduction of Xserve again but for AI.
This means, Apple is gonna spend a lot of money standing up data centers (CapEx). And the article in question is essentially saying that Apple is smart not to spend any money.
It sounds like there's a bit of wishful thinking here: whatever Apple is doing is 4D chess. Apple not spending any money? That's genius. Apple re-introducing Xserve racks? Genius.
> Then Stargate Texas was cancelled, OpenAI and Oracle couldn't agree terms, and the demand that had justified Micron's entire strategic pivot simply vanished. Micron's stock crashed.
Well.. no. The Stargate expansion was cancelled; the originally planned 1.2 GW (!) datacenter is going ahead:
> The main site is located in Abilene, Texas, where an initial expansion phase with a capacity of 1.2 GW is being built on a campus spanning over 1,000 acres (approximately 400 hectares). Construction costs for this phase amount to around $15 billion. While two buildings have already been completed and put into operation, work is underway on further construction phases, the so-called Longhorn and Hamby sections. Satellite data confirms active construction activity, and completion of the last planned building is projected to take until 2029.
> The Stargate story, however, is also a story of fading ambitions. In March 2026, Bloomberg reported that Oracle and OpenAI had abandoned their original expansion plans for the Abilene campus. Instead of expanding to 2 GW, they would stick with the planned 1.2 GW for this location. OpenAI stated that it preferred to build the additional capacity at other locations. Microsoft then took over the planning of two additional AI factory buildings in the immediate vicinity of the OpenAI campus, which the data center provider Crusoe will build for Microsoft. This effectively creates two adjacent AI megacampus locations in Abilene, sharing an industrial infrastructure. The original partnership dynamics between OpenAI and SoftBank proved problematic: media reports described disagreements over site selection and energy sources as points of contention.
Apple's reality distortion field is really really strong. People love to claim Apple is doing 4D chess, when in reality Apple has certain strengths but AI is anything but.
Which is why they were completely caught off guard with the botched rollout of Apple Intelligence. Even when they were playing to their strengths, things have not gone well for them (Apple Vision Pro). Liquid Glass has had mixed reception, and that's often explained away as "Apple is setting up a world for Spatial Computing by unifying design language" and when the lead designer was fired it was like "Thank God Alan Dye is gone, he was bad for Apple anyway".
But why do I feel like the quality of the software from Apple declined sharply in recent years? The liquid glass design feels very unpolished and not well thought out almost everywhere... seems like even Apple can't resist falling victim to AI slop
I don't think it's AI slop. Even before modern generative AI, I've noticed a decline in Apple's software quality.
Rather, I feel that Apple has forgotten its roots. The Mac was "the computer for the rest of us," and there were usability guidelines backed by research. What made the Mac stand out against Windows during a time when Windows had 95%+ marketshare was the Mac's ease of use. The Mac really stood out in the 2000s, with Panther and Tiger being compelling alternatives to Windows XP.
I think Apple is less perfectionistic about its software than it was 15-20 years ago. I don't know what caused this change, but I have a few hunches:
0. There's no Steve Jobs.
1. When the competition is Windows and Android, and where there's no other commercial competitors, there's a temptation to just be marginally better than Windows/Android than to be the absolute best. Windows' shooting itself in the foot doesn't help matters.
2. The amazing performance and energy efficiency of Apple Silicon is carrying the Mac.
3. Many of the people who shaped the culture of Apple's software from the 1980s to the 2000s are retired or have even passed away. Additionally, there are not a lot of young software developers who have heard of people like Larry Tesler, Bill Atkinson, Bruce Tognazzini, Don Norman, and other people who shaped Apple's UI/UX principles.
4. Speaking of Bruce Tognazzini and Don Norman, I am reminded of this 2015 article (https://www.fastcompany.com/3053406/how-apple-is-giving-desi...) where they criticized Apple's design as being focused on form over function. It's only gotten worse since 2015. The saving grace for Apple is that the rest of the industry has gone even further in reducing usability.
I think what it will take for Apple to readopt its perfectionism is competition that forces it to.
Apple will just drip feed locally running models that enable minor conveniences. They will probably drop the Apple Intelligence label later and just have things with their own names like "magic eraser".
Apple have had Siri for well over a decade without any meaningful movement. If you think Apple is suddenly going to get better, that's just wishful thinking. Apple has neither the expertise nor the capability to do any of that. They'd have demonstrated it with Siri a long time back.
What Apple does is build beautiful hardware. The software has been in shambles for a really long time.
I like how we are acting like this market is so novel and emergent revering the luck of some while lamenting the failures of others when it was all "roadmapped" a decade ago. It's like watching a Shaanxi shadow puppet show with artificial folk lore about the origins of the industry. I hate reality television!
Gemma4 in my view is good enough to do things similar to Gemini 2.5 flash, meaning if I point it at code and ask for help and there is a problem with the code it'll answer correctly in terms of suggestions but it's not great at using all tools or one shooting things that require a lot of context or "expert knowledge".
If a couple more iterations of this, say gemma6 is as good as current opus and runs completely locally on a Mac, I won't really bother with the cloud models.
That's a problem.
For the others anyway.
similar vibes as "640k ought to be enough for anybody"
I think the difference is that with LLMs, in a lot of cases you do see some diminishing returns.
I won't deny that the latest Claude models are fantastic at just one shotting loads of problems. But we have an internal proxy to a load of models running on Vertex AI and I accidentally started using Opus/Sonnet 4 instead of 4.6. I genuinely didn't know until I checked my configuration.
AI models will get to this point where for 99% of problems, something like Gemma is gonna work great for people. Pair it up with an agentic harness on the device that lets it open apps and click buttons and we're done.
I still can't fathom that we're in 2026 in the AI boom and I still can't ask Gemini to turn shuffle mode on in Spotify. I don't think model intelligence is as much of an issue as people think it is.
I mean to me even difference between Opus and Sonnet is as clear as day and night, and even Opus and the best GPT model. Opus 4.6 just seems much more reliable in terms of me asking it to do something, and that to actually happen.
It depends what you're asking it though. Sure, in a software development environment the difference between those two models is noticeable.
But think about the general user. They're using the free Gemini or ChatGPT. They're not using the latest and greatest. And they're happy using it.
And I am willing to bet that a lot of paying users would be served perfectly fine by the free models.
If a capable model is able to live on device and solve 99% of people's problems, then why would the average person ever need to pay for ChatGPT or Gemini?
Well you can do a lot with 640k... if you try. We have 16G in base machines and very few people know how to try anymore.
The world has moved on, that code-golf time is now spent on ad algorithms or whatever.
Escaping the constraint delivered a different future than anticipated.
> you can do a lot with 640k... if you try.
it is economically not viable to try anymore.
"XYZ Corp" won't allow their developers to write their desktop app in Rust because they want to consume only 16MB RAM, then another implementation for mobile with Swift and/or Kotlin, when they can release good enough solution with React + Electron consuming 4GB RAM and reuse components with React Native.
People get hung up on bad optimization. If you are working at a sufficiently large scale, yes, thinking about bytes might be a good use of your time.
But most likely, it's not. At a system level we don't want people to do that. It's a waste of resources. Making a virtue out of it is bad, unless you care more about bytes than humans.
These bytes are human lives. The bytes and the CPU cycles translate to software that takes longer to run, that is more frustrating, that makes people accomplish less in longer time than they could, or should. Take too much, and you prevent them from using other software in parallel, compounding the problem. Or you're forcing them to upgrade hardware early, taking away money they could better spend in different areas of their lives. All this scales with the number of users, so for most software with any user base, not caring about bytes and cycles wastes many more people-hours than it saves in dev time.
The simple fact is that a 16 GB RAM stick costs much less than the development time to make the app run on less.
Especially if the 640k are "in your hand" and the rest is "in the cloud"
Look at the whole history of computing. How many times has the pendulum swung from thin to fat clients and back?
I don't think it's even mildly controversial to say that there will be an inflection point where local models get Good Enough and this iteration of the pendulum shall swing to fat clients again.
Assuming improvements in LLMs follow a sigmoid curve, even if the cloud models are always slightly ahead in terms of raw performance it won't make much of a difference to most people, most of the time.
The local models have their own advantages (privacy, no -as-a-service model) that, for many people and orgs, will offset a small performance advantage. And, of course, you can always fall back on the cloud models should you hit something particularly chewy.
(All IMO - we're all just guessing. For example, good marketing or an as-yet-undiscovered network effect of cloud LLMs might distort this landscape).
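A minimal sketch of the "local first, fall back to cloud for the chewy stuff" pattern, for what it's worth. The big assumption here is that the local model hands back some usable confidence score alongside its answer, which real models don't reliably do; `answer`, `fake_local`, `fake_cloud` and the 0.7 threshold are all hypothetical stand-ins, not any actual API.

```python
from typing import Callable, Tuple

def answer(prompt: str,
           local: Callable[[str], Tuple[str, float]],
           cloud: Callable[[str], str],
           min_confidence: float = 0.7) -> Tuple[str, str]:
    """Return (source, text). Try the local model first; escalate only if it is unsure."""
    text, confidence = local(prompt)
    if confidence >= min_confidence:
        return "local", text
    # The prompt leaves the device only on this path.
    return "cloud", cloud(prompt)

# Stub models so the sketch runs: the "local" one is only confident on short prompts.
def fake_local(prompt: str) -> Tuple[str, float]:
    return "local answer", 0.9 if len(prompt) < 40 else 0.2

def fake_cloud(prompt: str) -> str:
    return "cloud answer"

print(answer("what's 2 + 2?", fake_local, fake_cloud))
print(answer("refactor this 40k-line module and explain every change", fake_local, fake_cloud))
```

The privacy argument falls out of the structure: most prompts never hit the `cloud` branch at all.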
> it's not great at using all tools
Glad it wasn't just me - I was impressed with the quality of Gemma4 - it just couldn't write the changes to file 9/10 times when using it with opencode
https://huggingface.co/google/gemma-4-31B-it/commit/e51e7dcd...
There was an update to tool calling 3 days ago. I haven't tested it myself but hope it helps.
Hmm.. is there an updated onnx?
> it just couldn't write the changes to file 9/10 times when using it with opencode
You might want to give this a try, it dramatically improves Edit tool accuracy without changing the model: https://blog.can.ac/2026/02/12/the-harness-problem/
Yep, and to be honest we don't really need local models for intensive tasks. At least yet. You can use openrouter (and others) to consume a wide variety of open models which are capable of using tools in an agentic workflow, close to the SOTA models, which are essentially commodities - many providers, each serving the same model and competing with each-other on uptime, throughput, and price. At some point we will be able to run them on commodity hardware, but for now the fact that we can have competition between providers is enough to ensure that rug pulls aren't possible.
Plus having Gemma on my device for general chat ensures I will always have a privacy respecting offline oracle which fulfils all of the non-programming tasks I could ever want. We are already at the point where the moat for these hyper scalers has basically dissolved for the general public's use case.
If I was OpenAI or Anthropic I would be shitting my pants right now and trying every unethical dark pattern in the book to lock in my customers. And they are trying hard. It won't work. And I won't shed a single tear for them.
Local models seem somewhere between 9 and 24 months behind. I'm not saying I won't be impressed with what online models will be able to do in two years, but I'm pretty satisfied with the prediction that I won't really need them in a couple of years.
We still aren't going to be putting 200gb ram on a phone in a couple years to run those local models.
A lot of people are making the mistake of noticing that local models have been 12-24 months behind SotA ones for a good portion of the last couple years, and then drawing a dotted line assuming that continues to hold.
It simply.. doesn't. The SotA models are enormous now, and there's no free lunch on compression/quantization here.
Opus 4.6 capabilities are not coming to your (even 64-128gb) laptop or phone in the popular architecture that current LLMs use.
Now, that doesn't mean that a much narrower-scoped model with very impressive results can't be delivered. But that narrower model won't have the same breadth of knowledge, and TBD if it's possible to get the quality/outcomes seen with these models without that broad "world" knowledge.
It also doesn't preclude a new architecture or other breakthrough. I'm simply stating it doesn't happen with the current way of building these.
edit: forgot to mention the notion of ASIC-style models on a chip. I haven't been following this closely, but last I saw the power requirements are too steep for a mobile device.
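Back-of-envelope for the RAM point: the weights alone take roughly (parameter count) x (bits per weight) / 8 bytes. Parameter counts for the closed frontier models aren't public, so the sizes below are purely illustrative: 31B is the size of the Gemma checkpoint linked elsewhere in the thread, and 1000B is a hypothetical "trillion parameter" model. KV cache and activations come on top of this.

```python
def weight_gb(params_billions: float, bits_per_weight: int) -> float:
    """GB (10^9 bytes) needed for the weights alone; ignores KV cache and activations."""
    return params_billions * bits_per_weight / 8

for params in (31, 1000):        # illustrative sizes, in billions of parameters
    for bits in (16, 8, 4):      # 16-bit floats, 8-bit, and 4-bit quantization
        print(f"{params}B @ {bits}-bit: {weight_gb(params, bits):.1f} GB")
```

So a 31B model at 4-bit is about 15.5 GB and fits on a well-specced laptop, while a hypothetical 1000B model is still 500 GB at 4-bit, which is the gap being argued about here.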
Don't underestimate the march of technology. Just look at your phone, it has more FLOPS than there were in the entire world 40 years ago.
And I think it's very likely that with improved methods you could get opus 4.6 level performance on a wrist watch in a few years.
You needed a supercomputer to win at chess until you didn't.
Currently, local models' performance in natural language is much better than that of any algorithm running on a supercomputer cluster just a few years ago.
but it doesn't have that much more flops than it did a couple of years ago.
Yeah, but that's the current state of the art after decades of aggressive optimizations, there's no foreseeable future where we'll ever be able to cram several orders of magnitude more ram into a phone.
We already cram several orders of magnitude more flash storage into phone than RAM (e.g. my phone has 16 GB RAM but 1 TB storage); even now, with some smart coding, if you don't need all that data at the same time for random access at sub millisecond speed, it's hard to tell the difference.
Pretty sure there's at least a couple orders of magnitude in purely algorithmic areas of LLM inference; maybe training, too, though I'm less confident here. Rationale: meat computers run on 20W, though pretraining took a billion years or so.
There's been plenty of free lunch shrinking models thus far with regards to capability vs parameter count.
Contradicting that trend takes more than "It simply.. doesn't."
There's plenty of room for RAM sizes to double along with bus speed. It idled for a long time as a result of limited need for more.
We don't need 200gb of RAM on a phone to run big models. Just 200 GB of storage thanks to Apple's "LLM in a flash" research.
See: https://x.com/danveloper/status/2034353876753592372
Yes, I agree that this is the right solution, because for a locally-hosted model I value more the quality of the output than the speed with which it is produced, so I prefer the models as they were originally trained, not with further quantizations.
While that paper praises the Apple advantage in SSD speed, which allows a decent performance for inference with huge models, nowadays SSD speeds equal or greater than that can be achieved in any desktop PC that has dual PCIe 5.0 SSDs, or even one PCIe 5.0 and one PCIe 4.0 SSDs.
Because I had also independently reached this conclusion, like I presume many others, I have just started to work a week ago on modifying llama.cpp to use in an optimal manner weights stored on SSDs, while also batching many tasks, so that they will share each pass through the SSDs. I assume that in the following months we will see more projects in this direction, so the local hosting of very large models will become easier and more widespread, allowing the avoidance of the high risks associated with external providers, like the recent enshittification of Claude Code.
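To illustrate the "batch tasks so they share each pass through the SSD" idea, here's a toy sketch. It is not llama.cpp code and says nothing about how that project lays out or loads weights: the per-layer files, `save_layers`, `run_batch`, and the elementwise multiply standing in for a real layer are all assumptions made just to show the access pattern, where each layer is read from disk once and applied to every queued task before moving on.

```python
import os
import tempfile
from array import array
from typing import List, Tuple

def save_layers(directory: str, layers: List[List[float]]) -> List[str]:
    """Write each layer's weights to its own file, standing in for weights on SSD."""
    paths = []
    for i, weights in enumerate(layers):
        path = os.path.join(directory, f"layer_{i}.bin")
        with open(path, "wb") as f:
            array("d", weights).tofile(f)
        paths.append(path)
    return paths

def run_batch(paths: List[str], inputs: List[List[float]]) -> Tuple[List[List[float]], int]:
    """Stream layers from disk one at a time and apply each to every queued task."""
    states = [list(x) for x in inputs]
    reads = 0
    for path in paths:
        weights = array("d")
        with open(path, "rb") as f:
            weights.fromfile(f, os.path.getsize(path) // weights.itemsize)
        reads += 1  # one disk read per layer, however many tasks are queued
        # Toy "layer": elementwise multiply. A real layer would be a matmul.
        states = [[s * w for s, w in zip(state, weights)] for state in states]
    return states, reads

with tempfile.TemporaryDirectory() as d:
    layer_paths = save_layers(d, [[2.0, 3.0], [0.5, 1.0]])
    outputs, reads = run_batch(layer_paths, [[1.0, 1.0], [2.0, 4.0]])
    print(outputs, reads)
```

Running the two tasks one after the other would read every layer twice; batching them keeps it at one read per layer, which is the whole point when the SSD is the bottleneck.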
But that difference atm is the difference between it being OK on its own with a team of subagents given good enough feedback / review mechanisms or having to babysit it prompt by prompt.
By the time gemma6 allows you to do the above, the proprietary models will supposedly already be on the next step change. It just depends on whether you need to ride the bleeding edge, but especially because it's "intelligence", there's an obvious advantage in using the best version and it's easy to hype it up and generate fomo.
> But that difference atm is the difference between it being OK on its own with a team of subagents given good enough feedback
Do people actually build meaningful things like that?
It's basically impossible to leave any AI agent unsupervised, even with an amazing harness (which is incredibly hard to build). The code slowly rots and drifts over time if not fully reviewed and refactored constantly.
Even if teams of agents working almost fully autonomously were reliable from a functional perspective (they would build a functional product), the end product would have ever increasing chaos structurally over time.
I'd be happy to be proven wrong.
When that happens, you'll have fomo from not using opus 5.x. The numbers that they showed for Mythos show that the frontier is still steadily moving (and maybe even at a faster pace than before)
There is a cognitive ceiling for what you can do with smaller models. Animals with simpler neural pathways often outperform whatever we think they are capable of, but there's no substitute for scale. I don't think you'll ever get a 4B or 8B model equivalent to Opus 4.6. Maybe just for coding tasks but certainly not Opus' breadth.
The only thing that we are sure can't be highly compressed is knowledge, because you can only fit so much information in given entropy budget without losing fidelity.
The minimal size limits of reasoning abilities are not clear at all. It could be that you don't need all that many parameters. In which case the door is open for small focused models to converge to parity with larger models in reasoning ability.
If that happens we may end up with people using small local models most of the time, and only calling out to large models when they actually need the extra knowledge.
> and only calling out to large models when they actually need the extra knowledge
When would you want lossy encoding of lots of data bundled together with your reasoning? If it is true that reasoning can be done efficiently with fewer parameters it seems like you would always want it operating normal data searching and retrieval tools to access knowledge rather than risk hallucination.
And re: this discussion of large data centers versus local models, do recall that we already know it's possible to make a pretty darn clever reasoning model that's small and portable and made out of meat.
> we already know it's possible to make a pretty darn clever reasoning model
There is a problem though: we know that it is possible, but we don't know how to do it (at least not yet, as far as I am aware). So we know the answer to the "what?" question, but we don't know the answer to the "how?" question.
I would call brains with the needed support infrastructure small.
I think you underestimate the amount of knowledge needed to deal with the complexities of language in general as opposed to specific applications. We had algorithms to do complex mathematical reasoning before we had LLMs, the drawback being that they require input in restricted formal languages. Removing that restriction is what LLMs brought to the table.
Once the difficult problem of figuring out what the input is supposed to mean was somewhat solved, bolting on reasoning was easy in comparison. It basically fell out with just a bit of prompting, "let's think step by step."
If you want to remove that knowledge to shrink the model, we're back to contorting our input into a restricted language to get the output we want, i.e. programming.
I think you are underestimating the strength a small model can get from tool use. There may be no substitute for scale, but that scale can live outside of the model and be queried using tools.
In the worst case a smaller model could use a tool that involves a bigger model to do something.
Small models are bad at tool use. I have liquidai doing it in the browser but it's super fragile.
except you don't want knowledge in the model, and most of that "size" comes from "encoded knowledge", i.e. overfitting. The goal should be to only have language handling in the model, and the knowledge in a database you can actually update, analyze etc. It's just really hard to do so.
"world models" (for cars) maybe make sense for self driving, but they are also just a crude workaround for having a physics simulation to push understanding of physics. Though, unlike most topics, basic physics tends not to change randomly and is based on observation of reality, so it probably can work.
Law, health advice, programming stuff etc. on the other hand changes all the time and is all based on what humans wrote about it. Which in some areas (e.g. law or health) is very commonly outdated, wrong or at least incomplete in a dangerous way. And for programming changes all the time.
Having this separation of language processing and knowledge sources is ... hard, language is messy and often interleaves with information.
But this is most likely achievable with smaller models. Actually it might even be easier with a small model. (Though whether the necessary knowledge bases can fit and run on a Mac is another topic...)
And this should be the goal of AI companies, as it's the only long term sustainable approach as far as I can tell.
I say should because it may not be, because if they solve it that way and someone manages to clone their success then they lose all their moat for specialized areas, as people can create knowledge bases for those areas with know-how OpenAI simply doesn't have access to. (Which would be a preferable outcome as it means actual competition and a potential fair working market.)
as a concrete outdated case:
TLS cipher X25519MLKEM768 is recommended to be enabled on servers which do support it
last time I checked AI didn't even list it when you asked it for a list of TLS 1.3 ciphers (though it has been widely supported since even before it was fully standardized..)
this isn't surprising as most input sources AI can use for training are outdated and also don't list it
maybe someone at OpenAI will spot this and feed it explicitly into the next training cycle, or people will cover it more and through this it is fed in implicitly
but what about all the niche but important information with just a handful of outdated stack overflow posts or similar? (which are unlikely to get updated now that everyone uses AI instead..)
The current "let's just train bigger models with more encoded data" approach just doesn't work. It can get you quite far, though, but then it hits a ceiling. And trying to fix it by also giving the model additional knowledge "it can ask if it doesn't know" has so far not worked, because it reliably fails to realize it doesn't know when it has enough outdated/incomplete/wrong information encoded in the model. Only by ensuring it doesn't have any specialized domain knowledge can you make sure that approach works IMHO.
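The separation argued for in this comment (language handling in the model, facts in a store you can update) could be sketched roughly like this. It is only an illustration under stated assumptions: `KnowledgeBase`, `Fact` and `answer` are hypothetical names, the "language layer" here is a plain format string standing in for a model, and the date on the entry is a made-up placeholder:

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional


@dataclass
class Fact:
    text: str
    updated: date  # when the entry was last reviewed


class KnowledgeBase:
    """An updatable fact store that lives outside the model."""

    def __init__(self) -> None:
        self._facts: Dict[str, Fact] = {}

    def upsert(self, topic: str, text: str, updated: date) -> None:
        # Replacing an entry is cheap, unlike retraining a model.
        self._facts[topic] = Fact(text, updated)

    def lookup(self, topic: str) -> Optional[Fact]:
        return self._facts.get(topic)


def answer(topic: str, kb: KnowledgeBase) -> str:
    """Language layer only phrases what the store returns; it never guesses."""
    fact = kb.lookup(topic)
    if fact is None:
        return f"I have no entry for {topic!r}."
    return f"{fact.text} (entry last updated {fact.updated.isoformat()})"


kb = KnowledgeBase()
# The date below is an illustrative placeholder, not a real publication date.
kb.upsert(
    "X25519MLKEM768",
    "Recommended to be enabled on servers which support it.",
    date(2026, 1, 1),
)
```

Because the language layer holds no domain knowledge of its own, a missing entry produces an explicit "no entry" reply instead of a stale guess, which is the property the comment says is hard to get when outdated facts are baked into the weights.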
This is the classic apple approach - wait to understand what the thing is capable of doing (aka let others make sunk investments), envision a solution that is way better than the competition and then architect a path to building a leapfrog product that builds a large lead.
Pretty much it. That said, they did try to appease the markets by announcing 'Apple Intelligence' so they didn't appear to be behind everyone.
They did do the smart thing of not throwing too much capital behind it. Once the hype crumbles, they will be able to do something amazing with this tech. That will be a few years off but probably worth the wait.
For consumers AI has anti hype right now. It's off-putting to see consumer products slapped with a hundred AI labels. I see people talk about how you can turn off all of Apple Intelligence with one toggle rather than hundreds on Samsung.
Firefox is also marketing how easy it is to disable AI.
I think a lot of people are not hyped about AI in their toaster, but... I don't think people are generally turned off from deeper integration in their OS itself. Especially when for some people this represents ideas similar to how programmer-types get excited about Shortcuts.
Decently accessible automation and discovery, without having to go figure out a bunch of stuff
> Decently accessible automation and discovery, without having to go figure out a bunch of stuff
Sure, but is this actually happening? Last time I tried, Atlassian's heavily-pushed AI couldn't even turn a Jira ticket number in Confluence into a clickable link. Similarly, Windows has been actively moving away from providing locally-installed applications in the Start menu search towards offering random internet garbage.
I'm all for using an LLM to make something like Siri able to understand both "Siri, turn off the lights" and "Siri, make it dark!" - but that's not what's being pushed onto consumers, because there is no way anyone is going to pay $100/month for any version of that.
People like features, benefits, and outcomes. AI isn't a feature, it's a technology that can enable features. But it's being marketed as the only thing that matters.
The user does not give two shits if the new laptop "has AI". This is how Apple has been killing it lately: they market the MacBooks as being powerful, cheap, with long battery life, and a premium feel. Things the user cares about. Most of the stuff marketers are just blanket labeling "AI" will eventually be shuffled to the background and rebranded with a more specific term to highlight the feature being delivered rather than the fact it's AI.
You're right, there is plenty of space for features that require AI to work but that are indistinguishable from "classical" features. Better autocompletion is a proven one, for example.
Sentiment among my teenage kids and their peers is that AI can fuck right off. It's way over the line into actual hate of anything AI.
Yeah exactly the Apple Intelligence thing was pure BS to shut people up who kept saying apple was going to get disrupted by missing out.
Apple seems to follow the values that Steve laid out. Tim isn't a visionary but he seems to follow the principles associated with being disciplined with cash quite well. They haven't done any stupid acquisitions either. Quite the contrast with OAI.
Quietly they are doing things on-device. The OCR + copy/paste is genuine goodness - modestly functional.
That's also literally years behind the competition. https://www.androidpolice.com/2018/05/09/android-ps-new-rece...
The competition has also attached it to a toxic brand and heavily integrated it with actively user-hostile applications. It doesn't matter if your tech is years ahead when people expect using it will mean your image content info will be sold to anyone willing to pay a cent for it.
Remember when Google added Car Crash Detection to Pixel in early 2020? Nobody does.
But when Apple added it in iPhone 14 (2022)...
But everyone talks about it like it was Apple, and isn't that what matters (to Apple)?
I've never heard anybody (mis)attribute that to Apple.
I would have, and I work in tech. I'd guess that most people who use iOS have zero idea of what Android can and can't do, because they never use it and probably never will so what's the point of trying to find out.
Didn't they rush to integrate ChatGPT into their OS back in 2024? Reality doesn't seem to align with your description.
Yea, they nailed that with the Newton, Apple Pippin, and the Apple Vision Pro
The Vision Pro was a Development Kit; Just like the first generation Apple Watch. It's not meant for the consumers, it's meant for the developers among the consumers.
We will see if they ever release a new VisionOS device, but it's not the first time they did that; see also the Apple Watch.
You can explain away every failed product launch with "it's a developer product", not meant for consumers.
This wasn't like HoloLens or Google Glass. They marketed these devices to consumers and then sold these devices to consumers.
Apple learned to hang back from plowing the unsold Lisas into a landfill.
How amazing is that Apple car
Depending on price I would or would not buy an Apple car; but I am quite interested in options for a car that (1) is electric; (2) doesn't spy on me and sell my data; (3) doesn't take video of me and my passengers and do weird things with it; and (4) doesn't support Republicans / white supremacists / Elon Musk.
And I imagine that like-minded consumers are a pretty large market.
(5) Doesn't support a dictatorship with camps.
The Vision Pro is the best AR/VR product ever created.
All the king's horses and all the king's men couldn't come up with a killer app.
Will this strategy work every time? Maybe for AI it will work (market is competitive and Apple just purchases the best model for its consumers).
But this approach may not work in other areas: e.g. building electric batteries, wireless modems, electric cars, solar cell technology, quantum computing etc.
Essentially Apple got lucky with AI but it needs to keep investing in cutting edge technology in the various broad areas it operates in and not let others get too far ahead!
It works often enough for the company to be wildly successful. They can simply cut their losses and withdraw from industries where it hasn't, such as EVs.
I think their M chips are a good example. They ran on intel for so long, then did the impossible of changing architecture on Mac, even without much transition pain.
Obviously that was built upon years of iPhone experience, but it shows they can lag behind, buy from other vendors, and still win when it becomes worth it to them.
How is changing the architecture of a platform that only you make hardware for doing the impossible?
They could change the architecture again tonight, and start releasing new machines with it. The users will adopt because there is literally no other choice.
Every machine they release will be fastest and most capable on the platform, because there is no other option
The hard part is doing so without completely ruining the existing app ecosystem. Rosetta 2 is genuinely impressive.
Rosetta 1 delivered 50-80% of the performance of native, during the PPC->Intel transition. It turns out, you can deliver not particularly impressive performance and still not ruin your app ecosystem, because developers have to either update to target your new platform, or leave your platform entirely.
You can also voluntarily cut off huge chunks of your own app ecosystem intentionally, by giving up 32bit support and requiring everything to be 64bit capable.
...because users have no other choice when only one vendor controls both the hardware and the software. They can either use the apps still available to them, or they can leave. And the cost of leaving for users is a lot higher.
It's also notably not the first time they switched. They did the Motorola (I think MIPS?) architecture, then IBM PowerPC, then Intel x86 (for a single generation, then x86_64) and now Apple M-series.
Motorola chip was called 68000.
But Apple doesn't just try to do everything.
They do the things they think they can do very well.
Why would they try to build electric batteries, wireless modems, electric cars, solar cells, or quantum computers, if their R&D hadn't already determined that they would likely be able to do so Very Well?
It's not like any of those are really in their primary lines of business anyway.
When have they done that since the first iPhone in 2007? The watch maybe? Though not sure that's "leapfrog" better than anyone else's smartwatch, but I don't have one so maybe I'm wrong.
Their own chips, vertically integrating.
- AirPods
- Apple Watch
- AirTag
Those are a few that come to mind. All do multi-billions in revenue per year.
None of those are the best product in their category, and all are only huge sellers because Apple anti-competitively privileges them in its ecosystem.
What's better than AirPods and AirTags? I want them
The parent poster is saying (and I agree) that Airpods and Airtags are only superior because Apple anti-competitively privileges their integration with iPhones. It's not that they are better at the hardware level by itself.
And since iPhones form the largest single company's device network in the rich countries, that is a pretty big advantage.
> wait to understand what the thing is capable of doing
My parents use Android to ask "What are the 5 biggest towers in Chicago" or "Remove the people on my picture" while apparently iPhone is only capable of doing "Hey Siri start the Chronometer / There is no contact named Chronometer in your phone".
My iPhone is lagging a ridiculous 10 years behind. It's just that I don't trust Google with my credit card.
These are software/cloud features. You can install gemini on iphone if you want to talk about towers in Chicago.
The only reason to care about it being OS integrated is to interact with functions of the OS, which siri does fine.
Apple's AI stuff also uses cloud features, though you can't use them on other platforms. The problem with Apple's new cloud features is that they generally just suck. I'm surprised iCloud works so well with how hard they're fumbling basic stuff like this.
At least all of the ones I have tried work locally. I've entered airplane mode and things like magic eraser in images work fine.
Siri does not do it fine, it's literally the example the above commenter showed.
I want the reverse version of this, if Apple can promise me to 'lag behind' for another ten years I'll buy my first Apple device in ten years
Siri is one step below that for me, it still doesn't understand my accent, I feel like its voice recognition didn't improve from 2010...
"10 years behind" would be an improvement for Siri. It's actively broken much of the time in a way that Google Assistant or Alexa never has been.
I would argue that they are as bad as each other. I have to repeat most voice commands to Siri and Alexa rather than having them get it right the first time. No experience with Google.
It's even more superpowered than previous implementations of this strategy.
When they made the iPhone, iPod, and Apple Watch they had no specific hardware advantage over competitors. Especially with early iPhone and iPod: no moat at all, make a better product with better marketing and you'll beat Apple.
Now? Good luck getting any kind of reasonably priced laptop or phone that can run local AI as well as the iPhone/MacBook. It doesn't matter that Apple Intelligence sucks right now, what matters is that every request made to Gemini is losing money and possibly always will.
This is especially true in 2026 where Windows laptops are climbing in price while MacBooks stay the same.
How do you know Gemini is losing money on inference?
They're talking about free inference like Android and Google Home devices. No one is paying subscription fees for these and they're running their inference in the cloud. Apple Intelligence, for the most part, is running on the device.
Isn't some of Gemini's functionality on Android on-device?
> How do you know Gemini is losing money on inference?
It's not. People make this claim with zero evidence.
But Google made around $20B profit on Google search in 2025 Q4, and that includes AI search.
Apple's advantage was that they did everything in house and had the marketing and distribution capabilities. And now you've got the ecosystem lock-in.
In hindsight it's obvious why they pulled it off - nobody else could do it. They all had pieces missing.
Apple aren't in the business of building chatbots to impress investors (other than some WWDC2024 vaporware they'd rather not talk about any more). They're in the business of consumer hardware.
Consumers want iPhones and (if Apple are right) some form of AR glasses in the next decade. That's their focus. There's a huge amount of machine learning and inference that's required to get those to work. But it's under the hood and computed locally. Hence their chips. I don't see what Apple have to gain by building a competitor to what OpenAI has to offer.
~25% of Apple's revenue came from services in FY25 (and 50% from iPhone, ~25% from other hardware). They made $415B in that year, so ~$100B from services alone!
Consumers don't necessarily want iPhone. They don't want to be excluded from iMessage, which is a completely different motivation.
Yeah, that just doesn't pass the simplest sniff tests. I barely use iMessage, and yet I'm an iPhone user. Basically everyone around me is the same.
Agreed, I've been a loyal iPhone user for a long time, and very few people I know use iMessage. I use it with my parents because they don't have any other messenger, and they don't even really know it's iMessage, they just think of it as texting. Everyone I know is using something else for messages, whether it's Discord, Instagram DMs, WhatsApp, or occasionally Telegram or Snapchat.
US centric view, which I believe to be wrong. UK is predominantly WhatsApp, and the bulk of handsets sold are still iPhones.
Income is a much tighter correlation than messaging platform. Rack up those market shares by phone value and the scales tip even harder.
> the bulk of handsets sold are still iPhones
According to https://gs.statcounter.com/os-market-share/mobile/united-kin... it's closer to 50/50.
No one uses iMessage in my country. Yet iPhones are sought after. Some of us just really like iPhones for the experience - not everything is a conspiracy. People can have different tastes and are more free to choose than people on HN like to believe.
iMessage is AFAIK only really a big thing in the US.
I totally buy this as someone located in the US, but what is everybody else using? It can't be WhatsApp? Is everyone sending all their connection graph data to Meta?
You understand that Facebook and Instagram are also very popular yes?
People who care about privacy (very very few) use Signal, everyone else uses WhatsApp
in my country it's WhatsApp, and has been since before it was acquired by Meta
It's WhatsApp. No one thinks about sending data to Meta. The world is much bigger than the HN bubble, where almost no one thinks about privacy implications.
Absolutely this. No one cares about privacy. 99.9% of the population has no clue how tech works. "Oh, it's an app on my phone." That's what the typical consumer understands. How text travels from one phone to another is something magical.
Got WhatsApp, because there is no other channel to communicate with customers. It's literally used by everyone without exceptions. Really scary.
What I don't get about Apple is when everyone else was giving up on yet another VR attempt, moving into AI, they decide AI isn't worth it, and it was the right time for a me too VR headset.
So no VR, given the price and lack of developer support, and late arrival into AI.
It is the same pattern: late on VR, late on AI. Both technologies have a pricing problem. I would guess that Apple is working to create the conditions to make them cheap enough to sell to everyone.
I've had it turned off since Sequoia, and this I truly appreciate. It hasn't nagged me once to turn it or Siri on, and it isn't mandatory.
When I open up JIRA or Slack I am always greeted with multiple new dialogues pointing at some new AI bullshit, in comparison. We hates it precious
I don't like companies forcing their newest features on me noisily and constantly trying to ship new features and see what sticks so you can't trust whether a feature advertised one week will even be there the next.
However, I have even less patience for companies forcing paid-for third-party ads down my throat on a paid product. Slack at least doesn't sell my eyeballs. Facebook, Twitter, Google's ads are worse to me than new feature dialogues.
Which brings me to Apple. I pay for a $1k+ device, and yet the app store's first result is always a sponsored bit of spam, adware, or sometimes even malware (like the fake ledger wallet on iOS, that was a sponsored result for a crypto stealer). On my other devices, I can at least choose to not use ad-ridden BS (like on android you can use F-Droid and AuroraStore, on Linux my package manager has no ads), but on iOS it's harder to avoid.
Apple hasn't sunk to Google levels in terms of ads, but they've crossed a line.
It's best to avoid App Store and look for apps on Google (with ad blocker).
I get it but... well I think of App Store as... a store. I don't have to go there.
I'm actually pretty disappointed in the lack of discovery available in the App Store, but I rarely go there. I'm fine with advertising being there. I wish it was better but I'm not offended that there is paid promotion in a store.
>get letter from bank
>"to fix this, please install our app"
>search BankName
>comes up with other banks, BankNames US app (not the country you are in)
>revolut etc (cant use in the country you are in)
>ten minutes later
even worse when its your telecomm telling you to install their Official App so you can pay your bills or they will cut your cellular service, and you cant find it
I don't see what that has to do with (increased) advertising on the App Store (IMO search there never has been good) or the comment you replied to in which colechristensen said: "I'm actually pretty disappointed in the lack of discovery available".
I think paid advertising may even help improve discoverability on the App Store because, instead of making 10 or 20 to do list apps and hoping to get them to rank high by a combination of sheer luck and SEO tricks, scammers may only make one, and pay to get that to the top of the list.
In supermarkets product placement is affected by two factors: how much producers are willing to pay for a good spot (e.g. by offering lower wholesale prices if the product gets a more visible place) and vetting by the store owner.
I don't think different solutions exist in the App Store. Apple doesn't want to do much vetting, making advertising the only thing that may help (and yes, it would be awesome if there were a store that did do much vetting, but that requires a world where many different stores exist, and we aren't there (yet))
As someone who recently moved to NL from the US I encounter this issue about once a week and it's blocking me from doing serious things like paying for parking, taxes, utilities or government services, all of which have apps that are only available on the Dutch app store.
I have a separate Dutch Apple ID I can switch to, but each time I log out I risk accidentally deleting all my data.
> all of which have apps that are only available on the Dutch app store.
This isn't really on Apple though. Blame the companies/developers for geo gating their apps. It's a simple checkbox in the store to make it available for other countries.
That letter from the bank would probably include a QR code linking directly to their app oui?
Where do you install apps from then?
I get an app recommendation from a friend, I go to the App Store and search for it. I have to be super careful about which link I'm actually clicking on and which app I'm installing, because the App Store is riddled with spam and malware.
I wouldn't mind, except that Apple charge 30% of everything with the justification that they are keeping the ecosystem free of spam and malware...
I've been installing apps from the App Store for more than a decade and have never ever accidentally downloaded spam or malware. I'm sure it's there but it's really not "riddled" with it in my experience searching for apps. What it's riddled with is subscription-based apps whose free tier is worthless
I install a new app maybe once every 6 months. I agree that the app store is trash, littered with ads and casino games for kids.
I just don't find it hard to find the app I want, when I want something specific, and install, and then _get the hell out of that shithole_.
I thought the justification was that they curate an ecosystem of apps with loyal/paying customers
I haven't noticed this at all and I wonder if you're mistaking curation for advertising? When I open up the App Store I get a panel written "games we love" and a listing of indie games that are clearly not paid for ads. The ads in search are visibly marked as ads, and while I don't particularly like ads in general, they are pretty easy to avoid.
On iOS, if you open the App Store and click on the Today tab (it's the default tab if you kill and reopen), there's ads interspersed with curations.
For me, the second tile is an ad for Upside, some cashback app
Mine is Moneris Go, and the top review is titled "Garbage App!!!!" lol
Honestly the last time I remember using the App Store was years ago and I can't recall if they had ads or not. Imo it's distasteful and I wish they didn't have them. Still leagues better than the fucking ads in the start menu which caused me to give up on gaming and Windows forever.
If I open the app store and search "Gemini", the first result is "ChatGPT (advertisement)"
If I search for my bank, I get another bank. If I search for "Wordle", I get a bunch of ad-supported spamware (both the ad and non-ad results) before the real NYT Games app.
The app store has ads in search results. This is the primary way that my technologically inept relatives end up with the wrong app installed btw, is by searching and clicking the first result, and getting complete trash adware.
Apple should be ashamed of selling out their users.
Apple keeps nagging me to upgrade to godawful Tahoe. Every time there's a system update (which includes Safari, Safari TP, CLT etc. updates) Tahoe is always default checked. Even when I specifically click on a Sequoia point update, the Tahoe update is always checked instead of that point release. This has way more destructive potential than "try our new AI feature" in apps.
To add insult to injury, the one AI feature that I may want to evaluate (Claude Code integration in Xcode) is gated behind Tahoe upgrade, even though it has absolutely no reason to do so, given that every other IDE integrates AI features just fine on any recent OS.
Edit: Oh and I'm not getting bombarded in Slack at all, maybe because my company doesn't pay for any of the AI stuff there. Last time I got a banner or something like that was months ago.
> Think about the App Store. Apple didn't build the apps, they built the platform where apps ran best, and the ecosystem followed.
As far as I remember Apple basically got forced into opening the platform to 3rd party developers. Not by regulation but by public pressure. It wasn't their initial intention to allow it.
Nvidia restricts gamer cards in data centers through licensing, eventually they will probably release a cheaper consumer AI card to corner the local AI market that can't be used in data centers if they feel too much of a threat from Apple.
Imagine a future where Nvidia sells the exact same product at completely different prices, cheap for those using local models, and expensive for those deploying proprietary models in data centers.
Nvidia-Mediatek Arm laptops will compete with Qualcomm and Apple, https://www.forbes.com/sites/jonmarkman/2026/03/16/the-arm-i...
If they can get Valve/Steam for an OS that handles most games well that could in fact be huge if the pricepoint is a bit lower initially but with plenty of unified RAM (both for AI but also games).
That said, gaming laptops cooling issues are so often around the GPU so it'd also require a seasoned manufacturer to make it correctly.
Thereās long been professional segmentation for GPUs, long before people started running AI models on them
Having your cake and eating it too. Consumer goodwill and printing money.
My capex is even less than Apple, I can ship to user's Apple hardware and I can't access iPhone user photos either...so really I'm the winner.
Apple's accidental moat now is letting the rise in hardware prices due to AI eat into their margins and just expanding the Mac user base.
Thing is, Apple never considered racing against LLM runners. Apple's success comes from human-centered design, it is not trying to launch a me-too product just because it increases their stock price. iPod was not the first MP3 player. iPhone was not even 3G at launch -- in the middle of 3G marketing craze.
They sure got lucky that unified memory is well-suited for running AI, but they just focused on having cost- and energy-efficient computing power. They've had glasses in their sights for the last 10 years (when was Magic Leap's first product?) and these chips have been developed with that in mind. But not only the chips: nothing was forcing Apple to spend the extra money on blazing fast SSDs -- but they did.
So yes, Apple is a hardware company. All the services it sells run on their hardware. They've just designed their hardware to support their users' workflows, ignoring distractions.
With that said, LLMs make the GPU + memory bandwidth fun again. NVidia can't do it alone, Intel can't do it alone, but Apple positioned itself for it. It reminds me how everyone was surprised when they introduced 64-bit ARM for everyone: very few people understood what they were doing.
Tbh there are NVidia GPUs that beat Apple perf 2x or 3x, but these are desktop or server chips consuming 10x the power. Now all Apple needs to do is keep delivering performance out of Apple Silicon at good prices and best energy efficiency. Local LLMs make sense when you need it immediately, anywhere, privately -- hence you need energy efficiency.
Using the authorās logic, it is Google then that will lead.
Unlike Apple, they have even more devices in the field PLUS they have strong models PLUS Apple uses Google models.
Google is an advertisement company at the end of the day and that's a conflict of interest with user privacy.
Maybe they thought an investment in a product with lots of substitutes & high capital requirements wasn't very attractive.
Honestly, I think part of the reason Apple hasn't jumped deep into AI is due to two big reasons:
1) Apple is not a data company.
2) Apple hasn't found a compelling, intuitive, and most of all, consistent, user experience for AI yet.
Regarding point 2: I haven't seen anyone share a hands down improved UX for a user driven product outside of something that is a variation of a chat bot. Even the main AI players can't advertise anything more than, "have AI plan your vacation".
As for consistency, Apple's latest UI shows they don't give a damn any more.
Put proper LLM into Siri. Encourage developers to expose the functionality of their apps as functions, allow Siri LLM to access those (and sprinkle some magic security dust over it).
Boom, you have an agent in the phone capable of doing all the stuff you can do with the apps. Which means pretty much everything in our life.
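For what it's worth, a rough sketch of what "apps expose functions, the assistant calls them" could look like. Everything here (the `expose` decorator, the `TOOLS` registry, `dispatch`, the `music.set_shuffle` name) is made up for illustration; it is not Apple's API or any real Siri interface. The allow-list is a stand-in for the "security dust".

```python
from typing import Callable, Dict, Set

# Hypothetical registry of app actions; all names are illustrative only.
TOOLS: Dict[str, Callable[..., str]] = {}


def expose(name: str):
    """Decorator an app could use to publish one of its actions."""
    def wrap(fn: Callable[..., str]) -> Callable[..., str]:
        TOOLS[name] = fn
        return fn
    return wrap


@expose("music.set_shuffle")
def set_shuffle(enabled: bool) -> str:
    # A real app would change player state here; this just reports it.
    return f"shuffle {'on' if enabled else 'off'}"


def dispatch(call: dict, allowed: Set[str]) -> str:
    """Run a model-proposed call only if the user has allowed that tool."""
    name = call["name"]
    if name not in TOOLS:
        raise KeyError(f"unknown tool: {name}")
    if name not in allowed:
        raise PermissionError(f"tool not permitted: {name}")
    return TOOLS[name](**call.get("arguments", {}))
```

The model would only ever emit the `{"name": ..., "arguments": ...}` dict; whether anything runs is decided by `dispatch`, not by the model.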
there are always three elements in the equations of business model: 1. marginal cost 2. marginal revenue 3. value created
for llm providers, i always believe the key is to focus on high value problems such as coding or knowledge work, because of the high marginal cost of each new customer (the tokens burnt) and the low marginal revenue if the problem is not valuable enough. in this sense no llm provider can scale like previous social media platforms without taking huge losses. and no meaningful user stickiness can be built unless you have users' data. and there is no meaningful business model unless people are willing to pay a high price for the problem you solve, in the same way as paying for a saas.
i am really not optimistic about the llm providers other than anthropic. it seems that the rest are just burning money, and for what? there is no clear path for monetization.
and when the local llm is powerful enough, they will soon be obsolete for the cost, and the unsustainable business model. in the end of the day, i do agree that it is the consumer hardware provider that can win this game.
I am super bullish on Google, they are my best bet to earn from models. Mostly because they are vertically integrated (other revenue streams) + open to provide services to other companies (Apple deal).
> I am actually of the opinion that without some kind of bailout, OpenAI could be bankrupt in the next 18-24 months, but I am horrible at predictions
I find this intriguing.. Does anyone here have enough insight to speculate more?
1) Put data on X/Y chart 2) Find ruler and pencil 3) Draw line
Doing this you will make all kind of fun predictions.
I don't think I have unique insight on this but the common belief is they are desperately trying to reach AGI or at least have some halo model that will allow them to rise over the other companies. The problem is they have a hilariously large monthly burn paying for compute. If they don't produce something, they are in trouble if investors stop offering capital.
It's probably one of the biggest headlines right now. OpenAI has about $96 billion in debt and they don't have a revenue generating product yet.
I might be wrong but should you not have said profit generating? I pay them $20 a month so they have at least $20 of revenue
What I think was a wasted opportunity was not bringing the xserve back, being one of the few e2e solutions out there at scale.
In the larger scheme of things, the great winner will be open source, as we'll simply use AI to recreate the entire MacOS ecosystem :)
If AI coding does go anywhere and stays affordable, this would be a great outcome.
I think AI needs to greatly accelerate open hardware design and make advanced manufacturing more accessible to really make a dent.
User facing software is not the limiting factor in AI assisted replacement of Apple products.
Apple is almost 2 years out from their announcement of Apple Intelligence. It has barely delivered on any of the hype. New Siri was delayed and barely mentioned in the last WWDC; none of the features are released in China.
In other news, people keep buying iPhones, and Apple just had its best quarter ever in China. AAPL is up 24% from last year.
i dont even care about apple intelligence. stays off, not sure anyone really cares about it who is also interested in what this ai shenanigans is about on a local device. i think people keep conflating apple intelligence with all these convos about how macs are kinda dope for joe consumer wanting to tinker with llms.
that's the other part of the story that matters, not apple intelligence. this writeup tries to touch on that, apple is uniquely positioned to do really well in this arena if/when local llms become commodities that can do really impressive stuff. we're getting there a lot faster than we thought, someone had a trillion parameter qwen3.5 model going on his 128gb macbook and now people are thinking of more creative ways to swap out what's in memory as needed.
A lot of the people that bought iPhones are now buying Macs as well.
Indeed, a lot of the people that bought iPhones are now buying Macs with a binned version of the chip they already bought. So much so that Apple is in danger of running out of them.
It's almost like people don't actually want LLMs all over their core tools...
So Apple's AI acceleration and memory architecture is accidental, but nvidia's is not?
Nvidia has research papers on accelerating Machine Learning as far back as 2014: https://research.nvidia.com/publications?f%5B0%5D=research_a...
I just realized that next year Apple's Neural Engine will be 10 years old, just like the "NPUs will change AI forever!" puff pieces.
Here's to another 10 years of scuffed Metal Compute Shaders, I guess.
I think the article is missing a whole aspect on how Apple is ensuring to not face actual competition while they're "playing it safe":
Even if the investment is overblown, there is market-demand for the services offered in the AI-industry. In a competitive playing field with equal opportunities, Apple would be affected by not participating. But they are establishing again their digital market concept, where they hinder a level playing field for Apple users.
Like they did with the Appstore (where Apple is owning the marketplace but also competes in it) they are setting themselves up as the "the bank always wins" gatekeeper in the Apple ecosystem for AI services, by making "Apple Intelligence" an ecosystem orchestration layer (and thus themselves the gatekeeper).
1. They made a deal with OpenAI to close Apple's competitive gap on consumer AI, allowing users to upgrade to paid ChatGPT subscriptions from within the iOS menu. OpenAI has to pay at least (!) the usual revenue share for this, but considering that Apple integrated them directly into iOS I'm sure OpenAI has to pay MORE than that. (also supported by the fact that OpenAI doesn't allow users to upgrade to the 200USD PRO tier using this path, but only the 20USD Plus tier) [1]
2. Apple's integration is set up to collect data from this AI digital market they created: Their legal text for the initial release with OpenAI already states that all requests sent to ChatGPT are first evaluated by "Apple Intelligence & Siri" and "your request is analyzed to determine whether ChatGPT might have useful results" [2]. This architecture requires(!) them to not only collect and analyze data about the type of requests, but also gives them first-right-to-refuse for all tasks.
3. Developers are "encouraged" to integrate Apple Intelligence right into their apps [3]. This will have AI-tasks first evaluated by Apple
4. Apple has confirmed that they are interested to enable other AI-providers using the same path [4]
--> Apple will be the gatekeeper to decide whether they can fulfill a task by themselves or offer the user to hand it off to a 3rd party service provider.
--> Apple will be in control of the "Neural Engine" on the device, and I expect them to use it to run inference models they created based on statistics of step#2 above
--> I expect that AI orchestration, including training those models and distributing/maintaining them on the devices, will be a significant part of Apple's AI strategy. This could cover a lot of text and image processing and already significantly reduce their datacenter cost for cloud-based AI-services. For the remaining, more compute-intensive AI-services they will be able to closely monitor (via above step#2) when it will be most economic to in-source a service instead of "just" getting revenue-share for it (via above step#1).
So the juggernaut Apple is making sure to get the reward from those taking the risk. I don't see the US doing much about this anti-competitive practice so far, but at least in the EU this strategy has been identified and is being scrutinized.
[1] https://help.openai.com/en/articles/7905739-chatgpt-ios-app-...
[2] https://www.apple.com/legal/privacy/data/en/chatgpt-extensio...
[3] https://developer.apple.com/apple-intelligence/
[4] https://9to5mac.com/2024/06/10/craig-federighi-says-apple-ho...
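To make the gatekeeper point concrete, here is a toy sketch of an orchestration layer that sees every request first, keeps what it can handle on device, and only then offers a hand-off. The task names, the `ON_DEVICE_TASKS` set, the `seen` counter and the `route` function are all hypothetical; nothing here reflects Apple's actual implementation.

```python
from collections import Counter
from dataclasses import dataclass

# Assumption: task types the on-device model is taken to handle.
ON_DEVICE_TASKS = {"summarize", "rewrite"}

# The orchestrator observes every request type, handed off or not.
seen: Counter = Counter()


@dataclass
class Route:
    handler: str              # "on_device" or a third-party provider label
    needs_user_consent: bool  # hand-offs ask the user first


def route(task_type: str, third_party: str = "third_party_provider") -> Route:
    """Decide who serves a request; the gatekeeper always evaluates it first."""
    seen[task_type] += 1
    if task_type in ON_DEVICE_TASKS:
        return Route("on_device", needs_user_consent=False)
    return Route(third_party, needs_user_consent=True)
```

The point is structural: `seen` accumulates demand statistics for every task type, which is exactly the information you would want before deciding what to in-source next.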
> Pure strategy, luck, or a bit of both? I keep going back and forth on this, honestly, and I still don't know if this was Apple's strategy all along, or they didn't feel in the position to make a bet and are just flowing as the events unfold maximising their optionality.
Maximizing the available options is in fact a "strategy", and often a winning one when it comes to technology. I would love to be reminded of a list of tech innovators who were first and still the best.
Anyway, hasn't this always been Apple's strategy?
That's actually by design. Apple never jumps on the tech hype bandwagon.
they wait until the dust settles before making their well-thought-out moves.
Every time they've jumped the hype train too quickly it hasn't worked out, like Siri for example.
How do you rate Vision Pro? It was not the first one, but it was certainly the best one. Total dud though, while Meta Ray Bans are selling like hot cakes (irrespective of what you think of the company)
It's the same everywhere: great fundamentals pay off. It's true of martial arts, dance, and absolutely about software platforms. You just have to trust that process and invest in it, which Apple does (although frustratingly not enough!).
For the love of all that's holy - folks please stop using AI to publish smart sounding texts. While you may think you are "polishing" your text, you are just disrespecting your readers. Write in your own words.
Apple is just waiting for all the slop to inevitably crash to see what actually works
This seems mistaken to me. The core idea is that LLMs are commoditizing and that the UI (Siri in this case) is what users will stick with.
But... what's the argument that the bulk of "AI value" in the coming decade is going to be... Siri Queries?! That seems ridiculous on its face.
You don't code with Siri, you don't coordinate automated workforces with Siri, you don't use Siri to replace your customer service department, you don't use Siri to build your documentation collation system. You don't implement your auto-kill weaponry system in Siri. And Siri isn't going to be the face of SkyNet and the death of human society.
Siri is what you use to get your iPhone to do random stuff. And it's great. But ... the world is a whole lot bigger than that.
Apple never competed in the "AI race" in the first place, because they already knew they were already at the finish line.
This was really unsurprising [0].
[0] https://news.ycombinator.com/item?id=40278371
Your linked comment argues the opposite.
> Won't be surprised for the re-introduction of Xserve again but for AI.
This means, Apple is gonna spend a lot of money standing up data centers (CapEx). And the article in question is essentially saying that Apple is smart not to spend any money.
It sounds like there's a bit of wishful thinking here: whatever Apple is doing is 4D chess. Apple not spending any money? That's genius. Apple re-introducing Xserve racks? Genius.
> This is an obvious moat for Apple who can offer a cheaper alternative for training, inference AI server farms.
According to Bloomberg, Apple's inference server farms are a flop: https://9to5mac.com/2026/03/02/some-apple-ai-servers-are-rep...
Go a little bit deeper than what the media directly wants you to think.
> Then Stargate Texas was cancelled, OpenAI and Oracle couldn't agree terms, and the demand that had justified Micron's entire strategic pivot simply vanished. Micron's stock crashed.
Well.. no. The Stargate expansion was cancelled; the originally planned 1.2 GW (!) datacenter is going ahead:
> The main site is located in Abilene, Texas, where an initial expansion phase with a capacity of 1.2 GW is being built on a campus spanning over 1,000 acres (approximately 400 hectares). Construction costs for this phase amount to around $15 billion. While two buildings have already been completed and put into operation, work is underway on further construction phases, the so-called Longhorn and Hamby sections. Satellite data confirms active construction activity, and completion of the last planned building is projected to take until 2029.
> The Stargate story, however, is also a story of fading ambitions. In March 2026, Bloomberg reported that Oracle and OpenAI had abandoned their original expansion plans for the Abilene campus. Instead of expanding to 2 GW, they would stick with the planned 1.2 GW for this location. OpenAI stated that it preferred to build the additional capacity at other locations. Microsoft then took over the planning of two additional AI factory buildings in the immediate vicinity of the OpenAI campus, which the data center provider Crusoe will build for Microsoft. This effectively creates two adjacent AI megacampus locations in Abilene, sharing an industrial infrastructure. The original partnership dynamics between OpenAI and SoftBank proved problematic: media reports described disagreements over site selection and energy sources as points of contention.
https://xpert.digital/en/digitale-ruestungsspirale/
> Micron's stock crashed. [the link included an image of dropping to $320]
Micron's stock is back to $420 today
> One analysis found a max-plan subscriber consuming $27,000 worth of compute with their 200$ Max subscription.
Actually, no. They'd miscalculated and consumed $2700 worth of tokens.
The same place that checked that claim also points out:
> In fact, Anthropic's own data suggests the average Claude Code developer uses about $6 per day in API-equivalent compute.
https://www.financialexpress.com/life/technology-why-is-clau...
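Back-of-the-envelope version of the numbers above, in case anyone wants to check the orders of magnitude. The dollar figures are the ones quoted in this comment; the 30-day month of daily use is my assumption.

```python
SUBSCRIPTION_USD = 200   # monthly Max plan price quoted above
CLAIMED_USD = 27_000     # compute figure from the article
CORRECTED_USD = 2_700    # figure after the miscalculation was fixed
AVG_PER_DAY_USD = 6      # average API-equivalent use per developer per day
DAYS_PER_MONTH = 30      # assumption: daily use over a 30-day month

# How many times the subscription price each figure represents.
claimed_ratio = CLAIMED_USD / SUBSCRIPTION_USD
corrected_ratio = CORRECTED_USD / SUBSCRIPTION_USD

# Average user's monthly API-equivalent spend under the assumption above.
avg_monthly_usd = AVG_PER_DAY_USD * DAYS_PER_MONTH

print(f"claimed:   {claimed_ratio:.1f}x the subscription price")
print(f"corrected: {corrected_ratio:.1f}x the subscription price")
print(f"average:   ${avg_monthly_usd}/month vs ${SUBSCRIPTION_USD}/month paid")
```

So the outlier is off by a factor of ten in the article, and the average user lands slightly below what they pay.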
I like Apple's chips, but why do we put up with crappy analysis like this?
Apple's reality distortion field is really really strong. People love to claim Apple is doing 4D chess, when in reality Apple has certain strengths but AI is anything but.
Which is why they were completely caught off guard with the botched rollout of Apple Intelligence. Even when they were playing to their strengths, things have not gone well for them (Apple Vision Pro). Liquid Glass has had mixed reception, and that's often explained away as "Apple is setting up a world for Spatial Computing by unifying design language" and when the lead designer was fired it was like "Thank God Alan Dye is gone, he was bad for Apple anyway".
So essentially, Apple can do no wrong.
But why do I feel like the quality of the software from Apple declined sharply in recent years? The liquid glass design feels very unpolished and not well thought out throughout almost everywhere… seems like even Apple can't resist falling victim to AI slop
I don't think it's AI slop. Even before modern generative AI, I've noticed a decline in Apple's software quality.
Rather, I feel that Apple has forgotten its roots. The Mac was "the computer for the rest of us," and there were usability guidelines backed by research. What made the Mac stand out against Windows during a time when Windows had 95%+ marketshare was the Mac's ease of use. The Mac really stood out in the 2000s, with Panther and Tiger being compelling alternatives to Windows XP.
I think Apple is less perfectionistic about its software than it was 15-20 years ago. I don't know what caused this change, but I have a few hunches:
0. There's no Steve Jobs.
1. When the competition is Windows and Android, and where there's no other commercial competitors, there's a temptation to just be marginally better than Windows/Android than to be the absolute best. Windows' shooting itself in the foot doesn't help matters.
2. The amazing performance and energy efficiency of Apple Silicon is carrying the Mac.
3. Many of the people who shaped the culture of Apple's software from the 1980s to the 2000s are retired or have even passed away. Additionally, there are not a lot of young software developers who have heard of people like Larry Tesler, Bill Atkinson, Bruce Tognazzini, Don Norman, and other people who shaped Apple's UI/UX principles.
4. Speaking of Bruce Tognazzini and Don Norman, I am reminded of this 2015 article (https://www.fastcompany.com/3053406/how-apple-is-giving-desi...) where they criticized Apple's design as being focused on form over function. It's only gotten worse since 2015. The saving grace for Apple is that the rest of the industry has gone even further in reducing usability.
I think what it will take for Apple to readopt its perfectionism is if competition forced it to.
Software quality decline has been a recognised trend long before LLMs took the limelight. Apple included.
Don't worry, when apple introduce it, it'll be revolutionary and 10% thinner.
Apple will just drip feed locally running models that enable minor conveniences. They will probably drop the Apple Intelligence label later and just have things with their own names like "magic eraser".
Apple have had Siri for decades without any meaningful movement. If you think Apple is suddenly going to get better, that's just wishful thinking. Apple has neither the expertise nor the capability to do any of that. They'd have demonstrated that with Siri a long time ago.
What Apple does is build beautiful hardware. The software has been a shambles for a really long time.
I like how we are acting like this market is so novel and emergent revering the luck of some while lamenting the failures of others when it was all "roadmapped" a decade ago. It's like watching a Shaanxi shadow puppet show with artificial folk lore about the origins of the industry. I hate reality television!