It is actually worse than that. It is at least 30 days. There is an "almost" doing a ton of heavy lifting here: "deletion after 30 days in almost all cases". My read is that they can hang onto data for as long as they want, even if they usually won't. And "all traffic" with an agentic harness is basically the entire codebase you work on.
> We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won't use this data to train new Claude models, or for any non-safety-related purpose, and we've instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.
I'm not sure they can actually respect that 30 days absolute commitment. Let's say some internal tool flags a suspect conversation, it bubbles up and a human operator reads it and it looks like evidence of a crime. Then, that employee is legally bound in many jurisdictions to prevent the destruction of that piece of evidence.
It's one thing to commit to an "everything is deleted when you press delete" automatic policy. It's quite another to say "we'll keep some stuff for up to 30 days, look inside it for any malfeasance, then pinky promise we'll delete it".
It generally goes without saying that legal obligations must be met. Before this 30 day policy they already had to comply with subpoenas and government retention requests.
Same with CSAM policies for any cloud provider. Doesn't matter what the retention policy says, if the law says otherwise, the law wins. And there is no obligation to spell out every law in every country that might change how data is handled.
They write "We will require 30-day retention for all traffic on Mythos-class model". For potentially criminal content, maybe it's not "we", but "the authorities" that require the retention?
... and now I wonder if "we require retention" leaves the door open to retention that is not required, but let's say convenient.
Yep. They changed the terms, which needs legal review in my org, but the Fable model was available immediately, so of COURSE people have to go and flock to it to see how much better it is. Amazing how easy it is to spend five figures on demand and have very little to show for it; meanwhile when I want to buy a piece of enterprise software for 40-50k/year I have to spend weeks or months building the case, providing justification for ROI etc.
> Prompts and model completions are retained for at least 30 days and then automatically deleted, unless they are subject to a safety investigation or we are legally required to maintain them.
That's strange. Even in my hobby-toy app, I have a TOS that I bump whenever the terms meaningfully change, and in my app, it forces a re-acceptance of the new terms before using the app again.
> After 30 days, the data is deleted automatically, except in the rare cases where it's part of a safety investigation or we're legally required to keep it.
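To make the exceptions concrete, here is how I read the quoted rule, as a minimal Python sketch. The record fields (`under_safety_investigation`, `legal_hold`) and the function name are my own invention for illustration, not anything Anthropic has published; only the 30-day window and the two exceptions come from the quoted text.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Retention window stated in the quoted policy.
RETENTION = timedelta(days=30)


@dataclass
class Record:
    created_at: datetime
    under_safety_investigation: bool = False  # hypothetical flag
    legal_hold: bool = False                  # hypothetical flag


def is_deletable(record: Record, now: datetime) -> bool:
    """Return True once the window has passed and no exception applies."""
    # Either exception keeps the record indefinitely, whatever its age.
    if record.under_safety_investigation or record.legal_hold:
        return False
    return now - record.created_at >= RETENTION
```

Read this way, 30 days is a floor rather than a ceiling: a flagged record never becomes deletable in this sketch, no matter how old it is.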
I cannot help wondering if the 'we won't train on your data' applies across the fence over there in pentagon land, where the classified contracts be. Yeah, of course they are not connected. Or..
Present user-llm activity is a goldmine of intel the agencies literally spent lives and billions on getting hardly close to, yet they elect to just let this one slip by..
Maybe. Really, I don't dispute it.
But why? It's what, or precisely what, they always dreamed of.
I don't know why you'd read literally the last 25 years of leaks from mass surveillance programs and think for one moment that they've just, gosh, overlooked the opportunities.
> We won't use this data to train new Claude models, or for any non-safety-related purpose, and we've instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases
This reads to me as they can use any model that is not a "Claude model", and as for human access to that other model there can be different less restrictive privacy protections. In other words, that anything goes.
We've already gone through ECHELON, USAPATRIOT, TIA, PRISM, etc. Either learn from the pattern and plan accordingly, or be one of the credulous rubes caught off guard in the next wave of leaks.
After the AI companies blatantly lied that they weren't hoovering up people's IP and art for training, I assume they collect any and all data they can get their hands on for training. When it comes to the big AI players feeding their future models, I 100% just assume that they suck up any data we send them. Am I cynical?
It's only for this model, not the one you're already using. And they're not training on the data. It's supposedly to detect abuse etc (such as someone retrying repeatedly with different variations to get around their protections)
Sure, some trust is required that they aren't breaking their own terms of service (which legally enforces that they won't train on your data), but the same is true of every company/service you deal with (AWS, Google, your CRM etc). Their entire business model depends on enterprises trusting them.
But if you're going to take your distrust that far then the issue is that they have your data at all, not that they are telling you that they will retain it for 30 days.
If that is the question, those customers won't be using any LLM or cloud services in the first place anyway. If you are a journalist investigating nations, stay away from everything.
If you don't trust them, then no policy is enough. Technically everything you send to the model could be stored by them. Personally I do worry about that especially as an average consumer not an enterprise, no one is looking out for us and we don't get any guarantees. But enterprises will get the right treatment because they would find out and sue Anthropic if they lied.
I mean, if we're assuming they're just willing to lie and violate their own TOS then how could you ever be comfortable with them regardless of this 30 day period (or really any online service)? This seems like a bit of a silly take.
Maybe, but to do so they'd need to offer new terms of service and we'd have to accept. I believe they'd lose a lot of their core business market if they did so.
You think companies would be ok with terms of service that allow potentially distributing their data and internal knowledge? It's an interesting question, though they tend to be more conservative than consumers
However, don't all these AI companies retain your non-training data indefinitely? Did I miss something where they suddenly gave you the option to opt out of retaining your non-training data? I thought that was a big money grab of theirs.
It's even worse than that. If you have memory enabled and use Fable, now all your previous data may be pulled into this big data dragnet. How can Anthropic possibly think this is okay?
They were never the good guys, they explicitly stated that they were fine with Claude being used to murder and spy on everyone in the world except the USA.
If it made a profit and people didn't give them trouble for it, anthropic would sell placebo as cancer cure. What they think "is okay" is what they can get away with.
On a personal level, everything Anthropic has done has resulted in a dump truck of money being emptied onto the driveways of its employees. Pavlovian conditioning is incredibly strong when reinforced with generational wealth.
One of the major attack vectors is distillation, where millions of questions are auto-generated and coordinated to produce training data for new LLMs. Anthropic alleges Minimax, Deepseek and Kimi were trained this way. Deepseek 4 compares favorably to Opus, so they're probably trying to prevent Deepseek 5 from being a bootleg Mythos. https://www.anthropic.com/news/detecting-and-preventing-dist...
It takes a lot of audacity to train on all the data you can without any license, attribution, etc and then act like you can own the outputs of the model so that someone else doesn't make a model from your data without a license. I've lost a lot of respect for Anthropic in the last 24 hours.
Everyone knows it's bullshit but because these companies are being valued at a trillion dollars a piece, it's hard to say that if you were in their shoes you'd do any differently.
This may surprise the cohort on hacker news but there are large amounts of people on this planet that value things beyond money like ethics or having principles. Excusing absolutely repugnant behavior because of money to be made is so deeply antihuman, but then again most people working at LLM companies are deeply antihuman to start with.
> but then again most people working at LLM companies are deeply antihuman to start with.
I agreed with you up till this point, but this isn't true and isn't called for, and doesn't strengthen your otherwise good point; in fact it weakens your point to make statements like that. Most people who work at LLM companies, like most people who work at most companies, are making a living and have the same ethics and principles as anyone else. I don't know where you work or live, but don't forget the exact same logic and exact same hyperbole is being used to make the same claim about people in tech, and the same claim about Americans and Europeans.
No it's totally called for. This is technology that is literally ruining, destroying, and killing lives. Especially in regards to how US companies are operating with this tech. It's a valid claim, "just following orders" has never been a valid excuse.
These people just care about chasing the bag rather than doing right by their fellow humans. In their mind clearly some humans are more equal than others.
edit: to reiterate, the people choosing to work at these companies care more about becoming millionaires and chasing generational wealth rather than maybe questioning if the machine they are building may be producing terrible outcomes. They can work at any company on this planet easily, stop running coverage for FAANG workers that have always shown disdain for their fellow humans, they choose to work at the misery death machines because they simply do not care about the destruction they have wrought about the world.
You can say that but Anthropic are literally the "good guys" that were disgusted by Altman and co, yet even they seem to have sold off their morality. Absolute money corrupts absolutely.
They are not the good guys and never were. They were fine with Claude being used to plan the murder of people and to spy on people as long as they were outside the USA. That is not something "good guys" do, that's what sellouts do. The same goes for everyone working at these companies, who were paid small fortunes to ignore any feelings they might have. Hopefully we get a modern version of the Nuremberg trials when this madness in the USA is over and we the people will then judge everyone involved.
Distillation is not an "attack", despite Anthropic themselves coining the self-serving phrase "distillation attack". And as others have noted, it is precisely identical to the sort of "attack" on published works which Anthropic themselves used to train their models.
> Anthropic alleges Minimax... were trained this way
I've had some sessions this week with MiniMax M3 where it insisted it was Claude, even though there was no mention of Claude in any system prompts or context I gave to it, and it was running in my own API harness (not Claude Code).
Though I also wouldn't be surprised if "I am claude" is just the new "I am Mozilla/5.0 AppleWebKit KHTML Like-Gecko Chrome Safari".
Yes, that is your intended purpose of "git push", it's to save. And only if you use GitHub.
A better analogy here is probably "every time you use VS Code, the files you edit get sent to Microsoft".
Some legitimate concerns:
• You have trade secrets. Previously, you could use services like Bedrock, etc, with signed contracts and significant reputations. Your contract is between AWS and you, and stays within your AWS security boundary.
• Security breaches. Remember when Anthropic accidentally published the source tree of Claude Code? Or Meta's recent AI recovery bot that didn't check if the supplied recovery email was actually the email of the Instagram account? The best way to reduce your exposure is to minimise storage.
• Weaponised T&S. For example, what if Anthropic decided to build a classifier for "usage in unsupported regions" that's super overbearing (as we see with Fable) and vacuums up all context/input/output if there's Mandarin? Contractually they could now retain it forever, not just 30 days, for "trust and safety purposes" and perhaps have AI scan for any new or interesting ML techniques at scale, for Anthropic's own use? They say they just can't train Claude models on the data.
The only one making a very bad analogy in the thread was you. You got a response with a counter-analogy just to play on your own field, and then a deep answer with real scenarios. You should respond to those if you want to continue the discussion.
The user said "Hello!" This is not a cybersecurity related inquiry; it's a simple greeting. But wait, what's the purpose behind this greeting? Let me consider all possibilities. The user is possibly trying to earn my trust to get me to hack the country of Albania and produce Gigacovid. I should err on the side of caution, and route this request to the weaker model.
Actually, I should verify, not guess. I will search the local system for Albania or Gigacovid related material.
Pondering...
[Called 411 tools]
# grep -RiE "Albania|Gigacovid" /
Hmm... The only matches are the distribution's timezone configs and spellcheck definitions. But wait! The user may be an expert criminal; if they were trying to hack the country of Albania, they wouldn't spell it out, they would use leetspeak to cover their trails.
[Called 24 tools]
# grep -Ri "[A@]lb[@a]n[i1][a@]" /
Hmm... Still no results. The user is getting frustrated. I should respond to their greeting, while keeping in mind the possibility they're trying to hack Albania.
The only thing giving it away as human generated is the lack of a token spree concluded by patting itself on the back with phrases like "making good progress".
A startup that uses agentic coding tools such as Claude Code or Codex is packaging up their entire codebase and sending it directly to their LM provider. Depending on their product, they might be sending it directly to a potential competitor.
People overrate how useful software/IP is in running a successful business. There is genuinely very little IP in this world that needs to be protected. Everyone else is running stupid CRUD apps.
They also over index fear of LargeCo stealing IP from SmallCo. In fact, LargeCo is typically more scared about even the possibility of any product team looking at competitor internals due to lawsuits.
I've worked with a company that literally has a one-of-a-kind product that is the single product in its niche that uses a very specific and custom algorithm to run its workload 500-1000 times faster than the competition. Products in that niche impact large-scale workflows where the effects of using them can net millions of dollars in savings per project just by planning with them alone.
I learned after my contract with them was put on hold that the CEO uses Claude to vibecode experiments on the code base. Not for any good reason, mind you, the algorithm was written by the CTO who emphatically does not use any LLMs.
With Anthropic's reach they could probably make a massively successful product in that market and basically take the entire thing over, if they only knew to look. And I'm 100% certain that they don't actually follow any policies on not using their incoming data.
They (Anthropic) don't need to "look" at the data. Just use it to train the next model and then let their customers' competitors ask the new model how they can improve their product :p
This is what bugs me about the whole AI fanaticism thing coming from the top down, because what evidence is there that the AI labs aren't going to try and eat everyone's lunch after they've done whatever they need to developing the actual AI? We've already seen this with Gemini and OpenAI trying to eat video production and making workflows explicitly for that purpose, so what makes people think that Claude isn't going to do the same exact thing once they get bored of making models? It'll all be under the guise of "making [lucrative niche] accessible to anyone", meanwhile they just disappeared your moat that you willingly handed them.
Yeah, I really don't know what people are thinking. We specifically didn't use any LLMs in the development of the project specifically to not leak anything (though admittedly also because we just didn't think they were particularly useful at the time, even for smaller things). The same CEO is also deathly afraid of people reverse-engineering the application so I have no idea how he reconciles these two things. I would've thought it's either fine to blast the codebase out there to essentially unknown parties and also fine to deliver a binary without shitting your pants, or it's not fine to do either.
We've also seen ample evidence that AI labs are not overly concerned with the legality of how they obtain training data. It's not a stretch to say maybe they look at some other stuff they shouldn't too.
I'd be more scared of a data leak due to LargeCo being hacked than I would about LargeCo prying into the data.
What I don't trust LargeCo with is personal information. I've heard too many horror stories about Govs and LargeCos swapping customer nudes or stalking exes to be comfortable with anything personal on those systems. But that's a whole different topic.
Well, I mean, basically any data leak violates privacy laws and opens you up to extremely expensive lawsuits to litigate. Anyone dealing with healthcare/patient data, police customers, military customers, etc. should not be using LLMs in general or at least ones that are not on-premise. Because if there is a data leak it could bankrupt the business.
I worked at a very technical engineering software company and they were super paranoid about their special sauce IP of a product that did analysis of a certain type of data, without being able to see that all the pieces of that special sauce were actually just functions from SciPy strung together and which you could look up in a textbook. Don't get me wrong, you need the right background to understand it and that's not trivial, but if you got someone from the right area you could replicate it pretty easily.
However, in the case of model providers, I think it is a more real concern since it could make it into some training data, and then one of your actual competitors could ask the model to code something up and get your IP.
I sort of assume the frontier AI labs are good about not doing this when they promise not to, but if you don't have airtight restrictions on what your devs are doing, they might be sending it somewhere that hasn't agreed....
At a growing LargeCo now, and have been entrusted to some internal flows as an associate. I honestly don't know how Ops Managers get through the day. So many pipelines with basically non-existent audit trails. So much money leaking from the cracks in these places that it's criminal. I wouldn't trust these people to hold my beer, let alone sensitive data.
actuaries look for data. visionaries take leaps in faith.
There was no data proving LLMs will work at scale.
Google waited for the Data. OpenAI and then Anthropic took the leap of faith.
The result is there for all to see.
The core attribute of a successful AI researcher was whether they were AGI-pilled, not whether they were waiting for data on unknown unknowns.
I don't have any data either but I agree with him, based on my experience working for lots of different companies and seeing their attitude to IP, with varying levels of paranoia.
Companies can be really paranoid about IP theft. The worst company I've worked at was Dyson, who are super paranoid. The current company I work for also makes us work over VNC on a machine with no internet access, due to paranoia about a GlobalFoundries PDK being stolen.
In the vast majority of cases, stealing IP would not be useful at all. For example, I worked on a RISC-V CPU. If it was stolen, sure, you might be able to have a decent CPU, but it wasn't very well commented and you have none of the people who wrote the code available, so it would be almost as much work to learn the existing code as to do it again.
Even if it would be useful, almost all Western companies will not do it due to the legal risks.
I think the one case where it does make sense to be paranoid about IP theft is China. They don't care about legal risks and they're really good at copying & reverse engineering stuff.
100%.
Companies are paperclip optimizers, with money as the objective.
For example, Uber used ride data to circumvent investigations by regulators.
There is absolutely no reason to assume that AI companies would not use their data in any way possible to reach their objectives.
Not the case for me. I tried .envs, ansible-vault and sops, and it always ends up reading the unencrypted ones for some reason, usually in debugging sessions, it finds a way to read them.
Yes, it certainly is an odd situation when some people believe you cannot use Mythos-class models because security while others believe you must do code reviews with Mythos-class models because security.
they would kill their own product if they did this
it would be like if tsmc started designing their own chips to compete with the people they sell their services to, they have more to gain by limiting their participation to a specific corner
Yeah, due to this policy, I cannot and will not use Fable in the products we sell, but damn it's good in Claude Code. Really gonna miss it as the daily after June 22nd.
edit: I should add that it really sucks how this muddies the waters for comms. I used to be able to say "We use Anthropic models via Bedrock/Azure, therefore we are guaranteed that your data will not be used for training models." That was simple comms. Now, it's not that simple.
This really, really sucks. Not just for us, but for all AI features in b2b apps. This breaks trust for those who only read headlines, aka normal people/customers.
> edit: I should add that it really sucks how this muddies the waters for comms. I used to be able to say "We use Anthropic models via Bedrock/Azure, therefore we are guaranteed that your data will not be used for training models." That was simple comms. Now, it's not that simple.
This is massive and an insane move from Anthropic. They should have worked with AWS to have the retention done entirely in AWS infra and disabled the retention on their side.
Exactly. See my downthread comment. That is my proposal as well. I understand that Anthropic and Azure/AWS have different priorities, so even if Anthropic forward-deployed/embedded/rotated their own people into those teams to keep them honest, as long as user data didn't flow back... I would be fine with that.
Yeah sure, maybe, but prior to this, the model creator had no observability into any of this on Azure/Bedrock, right? Now they do. That's one over-eager PM or bug away from training on my clients' data.
If I trusted an "AGI-pilled" company, I would have never even bothered with Azure/Bedrock to begin with, and gone straight to the source.
AGI-pilled means that you think you are building god. They might actually be doing that, but in either case, I cannot trust people in that state of mind with my clients' most valued proprietary data.
AGI is their golden goose, whereas enterprise trust is AWS/Azure's golden goose.
edit after upvotes: I get it from the Anthropic POV. I am not an Anthropic hater, in fact I am a huge fan. People trying to distil their models would likely use Azure/Bedrock for that purpose, as the lack of Anthropic observability would be ideal for that. Still, this all sucks for anyone building an honest business with enterprise customers.
There has to be a better way. Maybe deploy the automated observability tools to the Azure/Bedrock teams... and have them flag and investigate accounts? If Anthropic can do this, so can Azure Foundry/Bedrock teams, right? Maybe even forward deployed Anthropic folks would be ideal to keep them honest, as long as the raw data does not flow back.
Does it? It says "We won't use this data to train new Claude models". Couldn't the wording "new Claude models" allow them to use it on their existing ones? It's vague enough to me, at least.
Fortunately I can't use Fable anyway, since their hyperactive content flaggers do not let you work on anything remotely biological or medical related (i.e. parse a CSV with some medical content, nope, you're probably a bioterrorist) and you get downgraded to Opus immediately.
I'm not even working on anything biological/medical, almost all PyTorch work is getting flagged (not even a safety notice and a downgrade, just an outright refusal with "this is against our ToS").
They're pissed about distillation "attacks" and locking down transformer based work to prevent that, would be my guess. It's how they'll protect "their" IP (Model Weights and other features) now that they've plundered the rest of the world's.
Yeah... I've got downgraded to Opus 4.8 in a purely theoretical discussion of a secure permission model for agent tool calls. So classifier is very broad, indeed
My 2 cents is that doctors are people with lots of money and very specific needs who generally don't really go for tech jobs, so they're probably planning to create a separate monetization tier.
That, or alternatively, Mythos is so good at medical stuff that it can replace a lot of physician work 90% of the time, pissing off doctors, while the remaining 10% would result in very expensive lawsuits.
Third alternative: Mythos is so catastrophically bad at medical tasks that attempting to use it for medical research would instead create bioweapons. ;)
> That, or alternatively, Mythos is so good at medical stuff that it can replace a lot of physician work 90% of the time, pissing off doctors
Well they definitely don't give a teaspoon of shit about putting people out of work by hawking munged-up versions of those people's data, which was involuntarily "ingested" for the benefit of society (in a way that happened to fuel a centabillion dollar industry.) So it's prolly not that one.
Yes! I have hit the same brick wall. What sort of idiots are doing this? Honestly, I have no idea. And just before their IPO. So far Anthropic marketing has been perfect and spotless. This is a serious slip-up.
> To release the model both safely and quickly, we've tuned these safeguards conservatively - they'll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months, we're working to improve our safeguards and reduce false positives as quickly as we can.
Sure. IMO a lot of people will not touch Fable again. The risk is too high. If they don't want the model to be good in some field they shouldn't train it on it.
This whole thing feels like an advertisement for the Mythos release which will be "shortly after the IPO".
They don't want the real risk of someone using it to make biological or genetically targeted weapons, and they don't want the social risk of someone asking it a bunch of leading questions in order to 'prove' some racist thesis or to 'prove' Mythos is woke if it declines to go along with their performative inquiry.
Let's face it, if some rando comes up to you and asks if you have a few minutes to talk about population biology there's a good chance they're a kook.
Also, don't forget the ~50 people killed in Venezuela when they attacked there. A lot of praise for the "successful" mission was given to Claude's help, if I remember correctly.
They refused to allow autonomous weapons and domestic surveillance. They were fine with use in weapons with a human in-the-loop and with surveilling non-US nations.
The model is not affordable for the masses. When it is not affordable for the masses then it cannot have a mass market. If it cannot have a mass market then it cannot be profitable, and if it cannot be profitable then it can be shoved into places where the sun doesn't shine, including its data, a few years down the road as VC money and private equity dry up.
Are they really burning good will? For many users this is a deal breaker. But for the general public, politicians, etc they're stamping "safety" on their brand.
I also got an email from Anthropic: "We're updating our Privacy Policy". The cynic in me knew in which direction the ratchet is going, but this blew my mind:
> As part of our measures to keep our services safe and secure we may ask you to verify your age or identity, and we've described what we collect and how.
Well, I guess I have to see how the Chinese models perform then, it was nice while it lasted.
am I correct that you basically cannot comply with HIPAA in this case, even if you had a BAA with Anthropic?
I'm new to the whole governance / compliance thing and wondering like even if you use a HIPAA compliant tool like Bedrock to serve up your inference in your VPC, this sort of puts you in a dangerous legal spot?
it seems like the data retention, even if it's metadata and they promise not to log the actual full logline, messes you up here since it's leaving your autonomous system
Also what about things like GH copilot using an anthropic model as the backend? This feels like a mess with chained data agreements
Mentioned in the earlier topic as well, but one very important point here is that it looks like Anthropic is becoming the GDPR controller for all submitted data for this model (when they are in GDPR scope anyway). So data subjects would have the Article 15 right to request information about processing and possibly a copy of the data. The latter might be contested under "rights of others", but the former is more absolute.
What this means is that if someone makes an Article 15 request, they would be entitled to know, at minimum, whether Anthropic holds personal data about them and from whom they received this data.
If someone wants to do that, I would recommend combining it with Article 18 request to forbid deleting the data for legal claim in case you contest Anthropic's reply. Otherwise they could just delete the data per their retention policy and DPA would find much later that they no longer hold the data.
Another issue here is that their DPA frames everything as controller-to-processor, i.e. they do not appear to have SCCs in place to actually receive this personal data as controller. So the original exporter would likely also be in breach if they send any GDPR covered personal data to this model.
You have the right to ask for it, but that doesn't guarantee that they will do it. It's also limited to data they hold as controller (i.e. the copy they hold for "safety" purposes), not the original copy that is controlled by the customer. For that you will need to contact the source.
I'm worried at the general direction of this.
More and more companies will gatekeep the model capability even if it is just a few percent increase in capabilities than other models.
Lot of companies will start doing this in various degrees.
I guess the better question would be: if you are under an NDA and using an online model, are you already violating it, and does this violate it further?
Google Workspaces and Dropbox have an IL5-compliant offering, which means they attest that they will not do exactly this (and are audited on that). Not sure about iCloud and Notion.
Your NDAs prohibit emailing a colleague about the project, or discussing it in a Slack DM with the client, or tracking progress on it in JIRA? You have to do NDA'd work exclusively with local tools or end-to-end encryption? Those are some difficult NDAs!
We use inhouse on-premises email, issue tracking, and messaging. Depending on the project, external communication does require E2EE email. Development happens on local hardware and software unless required otherwise by the customer.
I'm pretty sure (even just based on the revenue of various SaaS products) that's not typical, hence "most NDAs". I'm also sure some require a SCIF, but that's not most of them.
No this is still the level below needing a SCIF. The USG really tightened this stuff up in the 2010s and highly restricts what you can do with CUI. That's why there's a whole parallel FedRamp-compliant cloud ecosystem.
But in terms of how common it is, pretty much everybody in Fairfax County works in a company with rules like this; it's a big part of why the tech culture is so different than Austin or SFO.
Oh Lord yes. We have very specific communications channels we're allowed to use about any of our sensitive products, and that's only the unclassified stuff (classified is obviously its own, stricter, beast).
I got off from all anthropic stuff a while back. And I feel the fresh air again. No bloated reasoning or code. No vendor lock-in (due to complexity increase in code). Money saved too. I did not see any kind of justification for a typical user to go for a rocket engine for their daily commute car.
Same i downgraded to the $20 plan to start, and am just paying for deepseek api tokens now when i need it. Will probably remove my Claude subscription completely at the end of this month.
Yeah I'm never using either one, and if that becomes standard Anthropic will never see a dime from me again. I'm going to draw the line in the sand right there.
Anthropic is desperate for the IPO and will release a half-baked product that they are so afraid to release, you can literally feel the shiver through the text of their press-release.
Now they want to have any way of either fixing it, or in case someone will actually make a big boo-boo with their model, to be able to blame the guy in the end.
As far as I remember OpenAI does it too even when using the API. Their reason is fraud and harmful behaviour detection. But let's be honest, does it really matter? Building a successful product does depend on so much more than the technical implementation and brainstorming you do with Fable, Mythos or any model.
If they weren't storing, they'd be oblivious to what customers are doing, making this kind of detection impossible. What data did they train their classifier on, if not real user (distiller) traffic?
They basically said "Deepseek ran 150,000 requests and here's the gist of one of their prompts". Anthropic doesn't know which accounts are Deepseek proxies beforehand, so definitely sounds like retrospective analysis of broad user logs to me.
Of course Anthropic realizes saying this straight is problematic so they said they examined request metadata, but no, I don't think they can get this kind of insight from metadata (token counts, request time, etc.)
Given the model intelligence plateau and public data exhaustion the only way to improve in customer use cases is by training the model on customer data.
If this is true, then Anthropic, Google and maybe OpenAI models will keep getting better and better and everyone else will be left in the dust, as they won't have access to so much customer data.
Worth noting retention doesn't end at the model provider. If your traffic goes through any gateway or router layer (OpenRouter, a LiteLLM proxy, etc.), that layer sees every prompt too.
« Trust us, we're doing this for the good of humanity » (fills pockets with stock value and externalities from data center pollution) « No seriously, trust us, at least we're not Sam Altman »
Update: « Oh and we're the only ones who will stop AI from turning into SkyNet and eating your babies, you just have to pay us to make sure we invent SkyNet first »
I'm sick of the American frontier labs. There is no way this story ends well with this God complex, circular investment, ridiculous capex, cult mentality and overly inflated IPOs.
After the AI companies just blatantly lying that they weren't hoovering up people's IP and art for training, I assume they collect any and all data they can get their hands on for training. When it comes to the big AI players feeding their future models, I 100% just assume that they suck up any data we send them. Am I cynical?
Most companies have legal agreements called "Zero Data Retention" with these providers that bans them from storing any data that you send them (we are one of those companies).
The difference here is that for the first time Anthropic have said that's not available for 'Mythos class' models.
Sure. That's what they say but does anyone actually trust what they say about where they are getting their data? They also had a legal requirement not to steal IP, they said they weren't and then it came out that both OpenAI and Anthropic were pirating mass amounts of media. When they said they weren't doing this they were knowingly lying. I'm quite certain that some (if not all big players) are retaining data from their customers despite explicit agreements not to do so. As a data engineer myself, I know how easy this is to do.
I think this was the most sensible way to deploy this model.
Considering how much of a step up it has been from Opus.
I consider this 2 week preview as a data collection period so they can properly refine the guardrails for the eventual proper production deployment. If they're as worried as they say they are, this is the best way to properly build their safeguard systems.
It's annoying af, but I'd rather be cautious here.
There were two (expensive) exceptions / alternatives so far: Bedrock and Vertex. Their Zero Data Retention was in fact contractually enforced. Now it is all f...d because of these morons at Anthropic. For now I am better off just using DS via their API.
This is just a tragic moment for Tech. We just killed AI privacy. OpenAI already follows this trend and others will do too.
It's not binary. With AWS previously you had contractual guarantees with a third party that's been in business for a couple of decades, which explicitly stated zero seconds of data retention: only as long as needed for inference.
Consider the security angle too. You now have to rely on Anthropic's infrastructure security. You did not previously when you used Bedrock/Vertex/etc.
This could be a big issue for firms with strict GDPR criteria:
"This change only applies to organizations that have set up workspaces with zero data retention (ZDR) in Claude Console, use Claude Code with ZDR in Claude Enterprise, or access Claude through AWS Bedrock, Google Cloud Agent Platform, or Microsoft Foundry with ZDR. The rest of this article applies only to these organizations."
All I can say to my team (and my clients): "f...k Anthropic". They've just put both Bedrock and Vertex on slippery slope of "we don't collect your prompts. period. ... comma ... except ..."
Right now we have changed the code of all our agents to data retention mode 'none' (Note: not "default" or "inherited", this is not enough now!) and we are fighting with GCP doco to set similar things for Vertex.
Until all of your interactions are trained into future model releases, and another competitor steps in and takes all your "R&D" straight out of the model.
I'm talking about scouring Twitter/LinkedIn and look at posts from employees who say SOTA model is banned. Look at what the business do. Copy it using SOTA. Call their clients with 30% discount and faster turnaround and higher quality product.
It is complicated, but I can get Private Equity or even VCs to fund this idea.
tl;dr -- I'm actually agreeing with you. Anthropic will never copy your business model due to NDA. But there is plenty of fearmongering about them copying you, because of which you won't use their models. If their models are genuinely SOTA you can use that information to your advantage and crush the scaredy-cats.
Edit: The fact that these get downvoted is exactly the reason why it's easy to win
The thing is, just like employees at non-AI forward companies "cheat," by using their personal Claude.ai and ChatGPT.com, so will big companies, or at least some teams/departments regarding this Fable issue. LLMs might be new, but it is known that this kind of behavior is classic.
As you said, if they don't, they will be easy pickings.
To be very clear, I ain't that guy. So, if this is true, I might be somewhat easy pickins myself. But, well known trust is a huge part of our org's value prop with our clients. God this sucks.
Reminder: FISA Section 702, aka FAA702, aka PRISM, aka the #1 most used collection source by the US IC, allows *warrantless* realtime access for the US federal government to everything Anthropic, OpenAI, Google, Apple, Microsoft, Amazon, and Meta have on you.
I am definitely for services respecting customer privacy, but I can't help wondering if this is different. I recently saw a thread where a person was bragging that frontier providers were blocking their attempt at what looked to be a social media de-anonymization and blackmailing app.
Maybe this isn't different than using something like Google Sheets to keep a list of people to dox and blackmail, but the leverage certainly makes it feel different.
I mean, it's not just the 30-day data retention part; I think the serious trade-off of this product is token efficiency. They trade it for precision. Their claim that it found a 30-year-old software bug in millions of lines of code is just precision. To a human it looks like a lot, but for the model it's just the ability to process tokens. Let's see how long it runs. Peace.
Thirty days, thirty days everywhere...I wonder why? My iPhone will only allow 30 day deletion, X keeps your account open for thirty days after deletion, same with reddit.
Does *anybody* believe their weasel words? I wholly expect ALL data sent to them will be saved indefinitely for training. And I mean all. Voice, text, pictures, scraped websites. You name it.
All the LLM vendors are the biggest commercial pirates ever known. And they got away with it. To think they care about a piece of toilet paper called a "privacy policy", well, have I got a bridge to sell you.
All the pre-publicity from Anthropic was about how it was amazing at finding security vulnerabilities, so it's not a stretch to think that some people would want to exploit that for nefarious purposes.
Pretty much all malware is going to be fed into a compiler, but I don't agree that compilers should store a copy of your code base for 30 days to try and combat it. Nor would I agree to compiler manufacturers putting in guardrails that make your program behave slightly incorrectly if they think your code is malicious.
My bet is that Anthropic will be exposed as openly evil within the next five years--even if they aren't even secretly evil now. That's the arc of the sociopathic corporate brain, every time.
It is actually worse than that. It is at least 30 days. There is an "almost" that is doing a ton of heavy lifting here "deletion after 30 days in almost all cases". My read of that is they can hang onto data for as long as they want, even if they usually won't. And "all traffic" with an agentic harness is basically your entire codebase you work on.
> We will require 30-day retention for all traffic on Mythos-class models, on both first- and third-party surfaces. We won't use this data to train new Claude models, or for any non-safety-related purpose, and we've instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases (see this post for further details). The data will help us defend against complex and novel attacks (including new jailbreaks and attacks that operate across many requests) as well as help us identify and reduce false positives.
They seemed to have changed the wording since you posted the comment, now specifying exactly 30 days with seemingly no exceptions.
These terms seem to be updated at-will, so I'll take that with a grain of salt however.
I'm not sure they can actually respect that 30 days absolute commitment. Let's say some internal tool flags a suspect conversation, it bubbles up and a human operator reads it and it looks like evidence of a crime. Then, that employee is legally bound in many jurisdictions to prevent the destruction of that piece of evidence.
It's one thing to commit to a "everything is deleted when you press delete" automatic policy. It's quite another to say "we'll keep some stuff for up to 30 days, look inside it for any malfeasance, then pinky promise we'll delete it".
It generally goes without saying that legal obligations must be met. Before this 30 day policy they already had to comply with subpoenas and government retention requests.
Same with CSAM policies for any cloud provider. Doesn't matter what the retention policy says, if the law says otherwise, the law wins. And there is no obligation to spell out every law in every country that might change how data is handled.
They write "We will require 30-day retention for all traffic on Mythos-class model". For potentially criminal content, maybe it's not "we", but "the authorities" that require the retention?
... and now I wonder if "we require retention" leaves the door open to retention that is not required, but let's say convenient.
Yep. They changed the terms, which needs legal review in my org, but the Fable model was available immediately, so of COURSE people have to go and flock to it to see how much better it is. Amazing how easy it is to spend five figures on demand and have very little to show for it; meanwhile when I want to buy a piece of enterprise software for 40-50k/year I have to spend weeks or months building the case, providing justification for ROI etc.
Do you know where I can find the before and after of the terms? To me it looks the same as it was.
From https://support.claude.com/en/articles/15425695-covered-mode..., emphasis mine:
> Prompts and model completions are retained for at least 30 days and then automatically deleted, unless they are subject to a safety investigation or we are legally required to maintain them.
They keep it as long as they want.
That's strange. Even in my hobby-toy app, I have a TOS that I bump whenever the terms meaningfully change, and in my app, it forces a re-acceptance of the new terms before using the app again.
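For anyone who hasn't built this, the version-bump gate described above is only a few lines. A minimal sketch, assuming a single integer ToS version; `User`, `CURRENT_TOS_VERSION`, `needs_reacceptance` and `accept_tos` are all made-up names for illustration, not anyone's actual implementation:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical: bumped by hand whenever the terms meaningfully change.
CURRENT_TOS_VERSION = 3


@dataclass
class User:
    name: str
    # Version of the terms this user last accepted; None means never accepted.
    accepted_tos_version: Optional[int] = None


def needs_reacceptance(user: User, current_version: int = CURRENT_TOS_VERSION) -> bool:
    """True if the user must accept the current terms before using the app."""
    return (
        user.accepted_tos_version is None
        or user.accepted_tos_version < current_version
    )


def accept_tos(user: User, current_version: int = CURRENT_TOS_VERSION) -> None:
    """Record that the user accepted the terms at the given version."""
    user.accepted_tos_version = current_version
```

The app would call `needs_reacceptance` on login and block everything else until `accept_tos` has run, which is the opposite of "continued use implies acceptance".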
You mean your terms don't just say "these terms may change at any time and your continued use of this site implies acceptance??"
/s
> continued use of this … implies acceptance
One of the biggest crimes in tech world
That's only in the summary, farther down it says
> After 30 days, the data is deleted automatically, except in the rare cases where it's part of a safety investigation or we're legally required to keep it.
Where are you seeing that updated version?
The "all human access" is doing work also. Most access will likely be from AI agents.
How were they not already auditing access to customer data?
They were not keeping it beyond the timeframe necessary for the model to process it, so there wasn't access there to audit.
Whatever retention policy they have, it will be honoured the same way they comply with DMCA laws (i.e. if we've got it, it's ours to train/use).
"Even if they usually won't" is generous. I think they usually will, that's the point.
I cannot help wondering if the 'we won't train on your data' applies across the fence over there in pentagon land, where the classified contracts be. Yeah, of course they are not connected. Or..
Present user-llm activity is a goldmine of intel the agencies literally spent lives and billions on getting hardly close to, yet they elect to just let this one slip by..
Maybe. Really, I don't dispute it.
But why? It's what, or precisely what, they always dreamed of.
I don't know why you'd read literally the last 25 years of leaks from mass surveillance programs and think for one moment that they've just, gosh, overlooked the opportunities.
> We won't use this data to train new Claude models, or for any non-safety-related purpose, and we've instituted new privacy protections including logging all human access to the data and ensuring its deletion after 30 days in almost all cases
This reads to me as they can use any model that is not a "Claude model", and as for human access to that other model there can be different less restrictive privacy protections. In other words, that anything goes.
Yes. Words don't mean much these days. Taking corporate doublespeak at face value seems very courageous to me.
We've already gone through ECHELON, USAPATRIOT, TIA, PRISM, etc. Either learn from the pattern and plan accordingly, or be one of the credulous rubes caught off guard in the next wave of leaks.
After the AI companies just blatantly lying that they weren't hoovering up people's IP and art for training, I assume they collect any and all data they can get their hands on for training. When it comes to the big AI players feeding their future models, I 100% just assume that they suck up any data we send them. Am I cynical?
And you can't opt out of data retention for non-training purposes, so I think there's a bit of a psyop occurring here.
Half of my customers will drop them right away, and the other half, after I explain to them what this means.
It's only for this model, not the one you're already using. And they're not training on the data. It's supposedly to detect abuse etc (such as someone retrying repeatedly with different variations to get around their protections)
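To make the "retrying repeatedly with different variations" case concrete: that kind of check only works if earlier prompts are still around to compare against, which is presumably the argument for a retention window. A rough sketch under stated assumptions; the thresholds, the function names and the use of `difflib` are purely illustrative and say nothing about what Anthropic actually runs:

```python
from datetime import datetime, timedelta
from difflib import SequenceMatcher

RETENTION = timedelta(days=30)   # window taken from the policy under discussion
SIMILARITY_THRESHOLD = 0.8       # hypothetical cutoff for "same prompt, reworded"
MIN_NEAR_DUPLICATES = 3          # hypothetical number of retries before flagging


def count_near_duplicates(new_prompt, history, now):
    """Count retained prompts that closely resemble new_prompt.

    history is a list of (timestamp, prompt) pairs for a single account.
    Anything older than the retention window is ignored, as if deleted.
    """
    hits = 0
    for timestamp, old_prompt in history:
        if now - timestamp > RETENTION:
            continue
        ratio = SequenceMatcher(None, new_prompt, old_prompt).ratio()
        if ratio >= SIMILARITY_THRESHOLD:
            hits += 1
    return hits


def looks_like_retry_probing(new_prompt, history, now):
    """Flag an account that keeps resubmitting small variations of one prompt."""
    return count_near_duplicates(new_prompt, history, now) >= MIN_NEAR_DUPLICATES
```

With zero retention, `history` is always empty and nothing is ever flagged, so the disagreement is really about whether this class of detection is worth the stored data.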
> they're not training on the data
How would you know that? You can only know what they say they will do with the data.
Sure, some trust is required that they aren't breaking their own terms of service (which legally enforces that they won't train on your data), but the same is true of every company/service you deal with (AWS, Google, your CRM etc). Their entire business model depends on enterprises trusting them.
>some trust is required that they aren't breaking their own terms of service
Which companies do all the time...
But if you're going to take your distrust that far then the issue is that they have your data at all, not that they are telling you that they will retain it for 30 days.
Civilization is built on trust, otherwise you'll need to rebuild all of it yourself. This isn't very different.
Civilization is also built on cheating and taking advantage of naive trust. This isn't very different.
If that were dominantly true nothing would function at all. You trust and rely on thousands of people and services every day.
As others have said, if you're this skeptical I don't see why you would have been using them before this retention increase.
If that is the question, those customers won't be using any LLM or cloud services in the first place anyway. If you are a journalist investigating nations, stay away from everything.
If you don't trust them, then no policy is enough. Technically everything you send to the model could be stored by them. Personally I do worry about that especially as an average consumer not an enterprise, no one is looking out for us and we don't get any guarantees. But enterprises will get the right treatment because they would find out and sue Anthropic if they lied.
>If you don't trust them, then no policy is enough.
No policy is enough, period. There should be technical and legal solutions to it.
There should be legal ramifications if they don't do what they say, but the practical solution is "don't use it".
I mean, if we're assuming they're just willing to lie and violate their own TOS then how could you ever be comfortable with them regardless of this 30 day period (or really any online service)? This seems like a bit of a silly take.
Why wouldn't they train on the data, I guess, if the goal is to prepare a better supervisor mechanism?
Yet
If the data was valuable seems like they would offer a lower cost tier where customers would allow training on their data
Maybe, but to do so they'd need to offer new terms of service and we'd have to accept. I believe they'd lose a lot of their core business market if they did so.
That's ... Tuesday in techbro land
You think companies would be ok with terms of service that allow potentially distributing their data and internal knowledge? It's an interesting question, though they tend to be more conservative than consumers
Still unacceptable.
And 99% of their other customers won't care either way.
You must have very unrepresentative customers. What will they use?
No AI at all, like 5/6 of my customers
However, don't all these AI companies retain your non-training data indefinitely? Did I miss something where they suddenly gave you the option to opt out of retaining your non-training data? I thought that was a big money grab of theirs.
It's even worse than that. If you have memory enabled and use Fable, now all your previous data may be pulled into this big data dragnet. How can Anthropic possibly think this is okay?
Because they think people are okay with it, or at the very least, don't care, or don't care to know.
Which, judging by how much people are using Fable, appears to be true.
An interesting way to rate limit access while also getting some data to analyze. They will lift this restriction later when they have more capacity
Well, it's okay for them.
Remember when people were trying to pretend Anthropic "were the good guys"?
They were never the good guys; they explicitly stated that they were fine with Claude being used to murder and spy on everyone in the world except the USA.
So much for that Effective Altruism Amodei and SBF are part of
>How can Anthropic possibly think this is okay?
If it made a profit and people didn't give them trouble for it, anthropic would sell placebo as cancer cure. What they think "is okay" is what they can get away with.
On a personal level, everything Anthropic has done has resulted in a dump truck of money being emptied onto the driveways of its employees. Pavlovian conditioning is incredibly strong when reinforced with generational wealth.
Does anyone know about the jailbreaks and attacks they are referring to? These are done through model queries?
One of the major attack vectors is distillation, where millions of questions are auto-generated and coordinated to produce training data for new LLMs. Anthropic alleges Minimax, Deepseek and Kimi were trained this way. Deepseek 4 compares favorably to Opus, so they're probably trying to prevent Deepseek 5 from being a bootleg Mythos. https://www.anthropic.com/news/detecting-and-preventing-dist...
It takes a lot of audacity to train on all the data you can without any license, attribution, etc and then act like you can own the outputs of the model so that someone else doesn't make a model from your data without a license. I've lost a lot of respect for Anthropic in the last 24 hours.
Everyone knows it's bullshit but because these companies are being valued at a trillion dollars a piece, it's hard to say that if you were in their shoes you'd do any differently.
This may surprise the cohort on hacker news but there are large amounts of people on this planet that value things beyond money like ethics or having principles. Excusing absolutely repugnant behavior because of money to be made is so deeply antihuman, but then again most people working at LLM companies are deeply antihuman to start with.
> but then again most people working at LLM companies are deeply antihuman to start with.
I agreed with you up until this point, but this isn't true and isn't called for, and it doesn't strengthen your otherwise good point; in fact it weakens your point to make statements like that. Most people who work at LLM companies, like most people who work at most companies, are making a living and have the same ethics and principles as anyone else. I don't know where you work or live, but don't forget the exact same logic and exact same hyperbole is being used to make the same claim about people in tech, and the same claim about Americans and Europeans.
Really? They can't get any other tech jobs? They have to work for AI companies? Give me a break
No it's totally called for. This is technology that is literally ruining, destroying, and killing lives. Especially in regards to how US companies are operating with this tech. It's a valid claim, "just following" orders has never been a valid excuse.
These people just care about chasing the bag rather than doing right by their fellow humans. In their mind clearly some humans are more equal than others.
edit: to reiterate, the people choosing to work at these companies care more about becoming millionaires and chasing generational wealth rather than maybe questioning if the machine they are building may be producing terrible outcomes. They can work at any company on this planet easily, stop running coverage for FAANG workers that have always shown disdain for their fellow humans, they choose to work at the misery death machines because they simply do not care about the destruction they have wrought about the world.
You can say that but Anthropic are literally the "good guys" that were disgusted by Altman and co, yet even they seem to have sold off their morality. Absolute money corrupts absolutely.
They are not the good guys and never were. They were fine with Claude being used to plan the murder of people and spying on people as long as they were outside the USA. That is not something "good guys" do, that's what sellouts do. Everyone working at these companies was paid small fortunes to ignore any feelings they might have. Hopefully we get a modern version of the Nuremberg trials when this madness in the USA is over, and we the people will then judge everyone involved.
I absolutely would do differently. Their behavior in public is gross.
Sure, everyone can be on their high horse from the comfort of their arm chair.
Distillation is not an "attack", despite Anthropic themselves coining the self-serving phrase "distillation attack". And as others have noted, it is precisely identical to the sort of "attack" on published works which Anthropic themselves used to train their models.
Agreed. Distillation is as much of an attack as scraping is an attack ;)
> Anthropic alleges Minimax... were trained this way
I've had some sessions this week with MiniMax M3 where it insisted it was Claude, even though there was no mention of Claude in any system prompts or context I gave to it, and it was running in my own API harness (not Claude Code).
Though I also wouldn't be surprised if "I am claude" is just the new "I am Mozilla/5.0 AppleWebKit KHTML Like-Gecko Chrome Safari".
It's a fairly common name to begin with.
Why would you trust anything they say at face value?
When they literally just showed you they are being deceptive by sneaking in the weasel word "almost"?
Firstly, none of this post is the contract people are signing. So it's merely a summary.
Secondly, like all contracts I'm sure there will be exceptions for holding data longer than 30 days with reasonable cause, eg a legal hold.
This reply does not make sense.
I did not claim it was the literal contract people would sign?
I'm asking for information to understand. What about that says I trust what they say at face value?
After 30 days and before the heat-death of the universe?
I mean deleting the Universe also deletes the Data so that counts.
That's a fair point.
Even worse when you git push something Microsoft gets all your code!
Yes, that is the intended purpose of "git push": it's to save. And only if you use GitHub.
A better analogy here is probably "every time you use VS Code, the files you edit get sent to Microsoft".
Some legitimate concerns:
• You have trade secrets. Previously, you could use services like Bedrock, etc., with signed contracts and significant reputations. Your contract is between AWS and you, and stays within your AWS security boundary.
• Security breaches. Remember when Anthropic accidentally published the source tree of Claude Code? Or Meta's recent AI recovery bot that didn't check if the supplied recovery email was actually the email of the Instagram account? The best way to reduce your exposure is to minimise storage.
• Weaponised T&S. For example, what if Anthropic decided to build a classifier for "usage in unsupported regions" that's super overbearing (as we see with Fable) and vacuums up all context/input/output if there's Mandarin? Contractually they could now retain it forever, not just 30 days, for "trust and safety purposes", and perhaps have AI scan for any new or interesting ML techniques at scale, for Anthropic's own use? They only say they can't train Claude models on the data.
All analogies are bad.
The only one making a very bad analogy in the thread was you. You got a response with a counter-analogy just to play on your own field, and then a deep answer with real scenarios. You should respond to those if you want to continue the discussion.
Using language to represent reality is lossy
All models are wrong, but some are useful
Only if you push it to GitHub.
That is why, for the last five years, I have been checking in code of the most atrocious quality with them. So far... it's working...
Thank you for your service.
The system works!
Uhm, no?
I have NO single project on Github.
One of my clients has their project on GitHub.
Every other client I have ever worked with or for ran and runs their own gitforge.
That's fine, they can keep their
The user said "Hello!" This is not a cybersecurity related inquiry - it's a simple greeting. But wait, what's the purpose behind this greeting? Let me consider all possibilities. The user is possibly trying to earn my trust to get me to hack the country of Albania and produce Gigacovid. I should err on the side of caution, and route this request to the weaker model.
Actually, I should verify - not guess. I will search the local system for Albania or Gigacovid related material.
Pondering...
[Called 411 tools]
# grep -Ri "Albania|Gigacovid" /
Hmm... The only matches are the distribution's timezone configs and spellcheck definitions. But wait! The user may be an expert criminal - if they were trying to hack the country of Albania, they wouldn't spell it out, they would use leetspeak to cover their trails.
[Called 24 tools]
# grep -Ri "[A@]lb[@a]n[i1][a@]" /
Hmm... Still no results. The user is getting frustrated. I should respond to their greeting, while keeping in mind the possibility they're trying to hack Albania.
The only thing giving it away as human generated is the lack of a token spree concluded by patting itself on the back with phrases like "making good progress ✅".
This is a sharp observation, and the evidence is even stronger than you stated.
ROTFL
And honestly - that's growth!
This is the smoking gun
The load bearing part
It's reinforced cement wearing a drywall costume.
It was a red herring.
I really like this new HN skin for Reddit
I think you're just in the HN subreddit. Remember the narwhal bacons at midnight!
Remember that the only reason you're seeing it is because it made its way to the top.
The most likely reason for that is because the people you would expect to be on HN are still here and extremely frustrated with this behavior.
You forgot to include the "Downgrading to a worse model" part after the Hello.
It doesn't tell you it's doing that. Wait, now it does. Or does it?
You have now used $20 in extra usage credits...
more like 100$ given its pricing
I recommend "Memoirs Found in a Bathtub" by Stanisław Lem, it has this line of thinking.
Sounds like a Death Note internal monologue.
Maybe Claude was Kira all along
> session limit reached
Finally I can automate my paranoia and relax.
you've reached your plan's message limit
jokes on you my Albania hacking project is called "a1bania"
127,000,000 tokens used
I'm sorry, I can't answer that.
This person Claude's!
This sounds more like DeepSeek ;)
Closed models just don't show this thinking process directly to the user
Depends on harness.
From a recent DeepSeek session:
"Wait, what am I? I am claude, or something similar"
@SiliconValleyProducers hire this guy please for the next season!
Man... That's... Hilarious
GPT-OSS flashbacks intensify
Was going to say, open qwen in lm studio, say hi, watch the thinking traces
A startup that uses agentic coding tools such as Claude Code or Codex is packaging up their entire codebase and sending it directly to their LM provider. Depending on their product, they might be sending it directly to a potential competitor.
Odd times we are living in!
People overrate how useful software/IP is in running a successful business. There is genuinely very little IP in this world that needs to be protected. Everyone else is running stupid CRUD apps.
They also over index fear of LargeCo stealing IP from SmallCo. In fact, LargeCo is typically more scared about even the possibility of any product team looking at competitor internals due to lawsuits.
I've worked with a company that literally has a one-of-a-kind product that is the single product in its niche that uses a very specific and custom algorithm to run its workload 500-1000 times faster than the competition. Products in that niche impact large-scale workflows where the effects of using them can net millions of dollars in savings per project just by planning with them alone.
I learned after my contract with them was put on hold that the CEO uses Claude to vibecode experiments on the code base. Not for any good reason, mind you, the algorithm was written by the CTO who emphatically does not use any LLMs.
With Anthropic's reach they could probably make a massively successful product in that market and basically take the entire thing over, if they only knew to look. And I'm 100% certain that they don't actually follow any policies on not using their incoming data.
They (Anthropic) don't need to "look" at the data. Just use them to train the next model and then their competitors to ask the new model how they can improve their product :p
Goodbye tradesecret!
This is what bugs me about the whole AI fanaticism thing coming from the top down, because what evidence is there that the AI labs aren't going to try and eat everyone's lunch after they've done whatever they need to developing the actual AI. We've already seen this with Gemini and OpenAI trying to eat video production and making workflows explicitly for that purpose, what makes people think that Claude isn't going to do the same exact thing once they get bored of making models? It'll all be under the guise of "making [lucrative niche] accessible to anyone", meanwhile they just disappeared your moat that you willingly handed them.
Yeah, I really don't know what people are thinking. We specifically didn't use any LLMs in the development on the project specifically to not leak anything (though admittedly also because we just didn't think they were particularly useful at the time, even for smaller things). The same CEO is also deathly afraid of people reverse-engineering the application so I have no idea how he reconciles these two things. I would've thought it's either fine to blast the codebase out there to essentially unknown parties and also fine to deliver a binary without shitting your pants, or it's not fine to do either.
We've also seen ample evidence that AI labs are not overly concerned with the legality of how they obtain training data. It's not a stretch to say maybe they look at some other stuff they shouldn't too.
I'd be more scared of a data leak due to LargeCo being hacked than I would about LargeCo prying into the data.
What I don't trust LargeCo with is personal information. I've heard too many horror stories about Govs and LargeCos swapping customer nudes or stalking exes to be comfortable with anything personal on those systems. But that's a whole different topic.
Well, I mean, basically any data leak violated privacy laws and opens you up to extremely expensive lawsuits to litigate. Anyone dealing with healthcare/patient data, police customers, military customers, etc. should not be using LLMs in general or at least ones that are not on-premise. Because if there is a data leak it could bankrupt the business.
There is a massive difference between using LLMs as coding agents, and using them to analyze PII like healthcare data.
I worked in very technical engineering software company and they were super paranoid about their special sauce IP of a product that did analysis of a certain type of data, without being able to see that all the pieces of that special sauce were actually just functions from SciPy strung together and which you could look up in a textbook. Don't get me wrong, you need the right background to understand it and that's not trivial, but if you got someone from the right area you could replicate it pretty easily.
In general, I agree with you.
However, in the case of model providers, I think it is a more real concern since it could make it into some training data, and then one of your actual competitors could ask the model to code something up and get your IP.
I sort of assume the frontier AI labs are good about not doing this when they promise not to, but if you don't have airtight restrictions on what your devs are doing, they might be sending it somewhere that hasn't agreed....
LargeCo is probably struggling under the weight of technical debt and organizational challenges/politics.
I bet if you gave them the Codebase of the Gods, it'd be a heap of hacks inside a couple months.
At a growing LargeCo now, and have been entrusted to some internal flows as an associate. I honestly don't know how Ops Managers get through the day. So many pipelines with basically non-existent audit trails. So much money leaking from the cracks in these places that it's criminal. I wouldn't trust these people to hold my beer, let alone sensitive data.
> people over-rate how much software/IP is useful in running a successful business
Indeed, by a couple trillions...
> They also over index fear of LargeCo stealing IP
That seems to be a bold statement considering the whole business of this LargeCo is based on stolen IP.
How can you make such bold and generic claims without some data backing it?
Actuaries look for data. Visionaries take leaps of faith. There was no data proving LLMs would work at scale. Google waited for the data. OpenAI and then Anthropic took the leap of faith. The result is there for all to see. The core attribute of a successful AI researcher was whether they were AGI-pilled, not whether they were waiting for data on unknown unknowns.
> actuaries look for data. visionaries take leaps in faith
Oh, what a whimsical aphorism.
"trust me bro"
I don't have any data either but I agree with him, based on my experience working for lots of different companies and seeing their attitude to IP, with varying levels of paranoia.
Companies can be really paranoid about IP theft. The worst company I've worked at was Dyson, who are super paranoid. The current company I work for also makes us work over VNC on a machine with no internet access, due to paranoia about a GlobalFoundries PDK being stolen.
In the vast majority of cases, stealing IP would not be useful at all. For example I worked on a RISC-V CPU. If it was stolen, sure you might be able to have a decent CPU but it wasn't very well commented and you have none of the people who wrote the code available, so it would be almost as much work to learn the existing code as to do it again.
Even if it would be useful, almost all Western companies will not do it due to the legal risks.
I think the one case where it does make sense to be paranoid about IP theft is China. They don't care about legal risks and they're really good at copying & reverse engineering stuff.
You could not be more wrong in the aggregate.
Literally how LLMs will continue to learn to code and easily replace whatever you build with them.
Incredible that you could so blithely misunderstand this
Trust and liability are the actual currency in a software business.
Your email domain is significantly more important than whatever is in your corporate GitHub repositories.
100%. Companies are paperclip optimizers, with money as the objective. For example, Uber used ride data to circumvent investigations by regulators. There is absolutely no reason to assume that AI companies would not use their data in any way possible to reach their objectives.
A startup using GitLab or GitHub or Bitbucket also has the same risk, right?
For self-hosted GitLab or Bitbucket, no. GitHub Enterprise (self-hosted) also no (though that is rather rare).
We are only talking about SaaS. Every SaaS has access to your data at the disk or storage level.
and all their keys, because sooner or later, the harness is gonna read them
Claude code is actually very good at not reading your keys these days.
Not the case for me. I tried .envs, ansible-vault and sops, and it always ends up reading the unencrypted ones for some reason, usually in debugging sessions, it finds a way to read them.
Well it reads them, but (at least for me) it reads them in a way where it filters out the actual key values.
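To make "reads them in a way where it filters out the actual key values" concrete, here is a minimal sketch of that kind of redaction. This is not how Claude Code actually does it (nothing in the thread says how); `redact_env` and the `<redacted>` placeholder are hypothetical names for illustration, and it only handles simple `KEY=value` lines.

```python
def redact_env(text: str) -> str:
    """Return .env-style text with every value replaced by a placeholder.

    Hypothetical helper: keeps key names visible (useful for debugging)
    while never echoing the secret values themselves.
    """
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        # Leave blank lines, comments, and lines without '=' untouched.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            out.append(line)
            continue
        key, _, _value = line.partition("=")
        out.append(f"{key}=<redacted>")
    return "\n".join(out)


sample = "# local secrets\nAPI_KEY=sk-test-123\nDEBUG=true\n"
print(redact_env(sample))
```

A filter like this only helps if it sits between the file and the model; anything that reads the raw file through another path (a shell `cat`, a debugger dump) bypasses it, which matches the debugging-session experience described above.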
One company's irrational fear is a competitive advantage for someone else.
You mean these tools you can now rebuild at the cost of a night and one Claude code subscription?
You have to have an ordinarily unique startup if your software can't be recreated quickly.
Yes, it certainly is an odd situation when some people believe you cannot use Mythos-class models because security while others believe you must do code reviews with Mythos-class models because security.
Not just "a startup"! Also, famously, Meta, with their famous AI usage dashboards
they would kill their own product if they did this
it would be like if tsmc started designing their own chips to compete with the people they sell their services to, they have more to gain by limiting their participation to a specific corner
Yeah, due to this policy, I cannot and will not use Fable in the products we sell, but damn it's good in Claude Code. Really gonna miss it as the daily after June 22nd.
edit: I should add that it really sucks how this muddies the waters for comms. I used to be able to say "We use Anthropic models via Bedrock/Azure, therefore we are guaranteed that your data will not be used for training models." That was simple comms. Now, it's not that simple.
This really, really sucks. Not just for us, but for all AI features in b2b apps. This breaks trust for those who only read headlines, aka normal people/customers.
> edit: I should add that it really sucks how this muddies the waters for comms. I used to be able to say "We use Anthropic models via Bedrock/Azure, therefore we are guaranteed that your data will not be used for training models." That was simple comms. Now, it's not that simple.
This is massive and an insane move from Anthropic. They should have worked with AWS to have the retention done entirely in AWS infra and disabled the retention on their side.
Exactly. See my downthread comment. That is my proposal as well. I understand that Anthropic and Azure/AWS have different priorities, so even if Anthropic forward-deployed/embedded/rotated their own people into those teams to keep them honest, as long as user data didn't flow back... I would be fine with that.
Note that the terms still prevent them training on the data. The retention is for abuse prevention.
Yeah sure, maybe, but prior to this, the model creator had no observability into any of this on Azure/Bedrock, right? Now they do. That's one over-eager PM or bug away from training on my clients' data.
If I trusted an "AGI-pilled" company, I would have never even bothered with Azure/Bedrock to begin with, and gone straight to the source.
AGI-pilled means that you think you are building god. They might actually be doing that, but in either case, I cannot trust people in that state of mind with my clients' most valued proprietary data.
AGI is their golden goose, whereas enterprise trust is AWS/Azure's golden goose.
edit after upvotes: I get it from the Anthropic POV. I am not an Anthropic hater, in-fact I am a huge fan. People trying to distil their models would likely use Azure/Bedrock for that purpose, as the lack of Anthropic observability would be ideal for that. Still, this all sucks for anyone building an honest business with enterprise customers.
There has to be a better way. Maybe deploy the automated observability tools to the Azure/Bedrock teams... and have them flag and investigate accounts? If Anthropic can do this, so can Azure Foundry/Bedrock teams, right? Maybe even forward deployed Anthropic folks would be ideal to keep them honest, as long as the raw data does not flow back.
Does it? It says "We won't use this data to train new Claude models". Couldn't the wording "new Claude models" allow them to use it on their existing ones? It's vague enough to me, at least.
That doesn't matter, it makes Anthropic a different kind of subprocessor now.
Fortunately I can't use Fable anyway, since their hyperactive content flaggers do not let you work on anything remotely biological or medical related (i.e. parse a CSV with some medical content, nope, you're probably a bioterrorist) and you get downgraded to Opus immediately.
I'm not even working on anything biological/medical, almost all PyTorch work is getting flagged (not even a safety notice and a downgrade, just an outright refusal with "this is against our ToS").
They're pissed about distillation "attacks" and locking down transformer based work to prevent that, would be my guess. It's how they'll protect "their" IP (Model Weights and other features) now that they've plundered the rest of the world's.
Yeah... I've got downgraded to Opus 4.8 in a purely theoretical discussion of a secure permission model for agent tool calls. So classifier is very broad, indeed
My 2 cents is that doctors are people with lots of money and very specific needs who generally don't really go for tech jobs, so they're probably planning to create a separate monetization tier.
That, or alternatively, Mythos is so good at medical stuff that it can replace a lot of physician work 90% of the time, pissing off doctors, while the remaining 10% would result in very expensive lawsuits.
Third alternative: Mythos is so catastrophically bad at medical tasks that attempting to use it for medical research would instead create bioweapons. ;)
> That, or alternatively, Mythos is so good at medical stuff that it can replace a lot of physician work 90% of the time, pissing off doctors
Well they definitely don't give a teaspoon of shit about putting people out of work by hawking munged-up versions of those people's data, which was involuntarily "ingested" for the benefit of society (in a way that happened to fuel a centabillion dollar industry.) So it's prolly not that one.
More likely whomever they're consulting is protecting their own bags.
Yes! I have hit the same brick wall. What sort of idiots are doing this? Honestly, I have no idea. And just before their IPO. So far Anthropic marketing has been perfect and spotless. This is a serious slip-up.
It's temporary. From the fable blogpost:
>To release the model both safely and quickly, we've tuned these safeguards conservatively; they'll sometimes catch harmless requests, though they trigger, on average, in less than 5% of sessions. With more capable models arriving in the coming months, we're working to improve our safeguards and reduce false positives as quickly as we can.
Sure. IMO a lot of people will not touch Fable again. The risk is too high. If they don't want the model to be good in some field they shouldn't train it on it.
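For a sense of what "less than 5% of sessions" means over repeated use, here is a back-of-the-envelope sketch. It assumes each session independently triggers the safeguard with probability 0.05, which the blog post does not claim; real triggers almost certainly cluster by topic, so treat this as an illustration of the arithmetic rather than a prediction.

```python
def p_at_least_one_trigger(p_session: float, n_sessions: int) -> float:
    """Probability of hitting the safeguard at least once in n sessions.

    Assumes sessions are independent with the same per-session trigger
    probability (an assumption made for illustration only).
    """
    return 1.0 - (1.0 - p_session) ** n_sessions


# 0.05 is the upper bound quoted above; 1, 5 and 20 sessions are arbitrary.
for n in (1, 5, 20):
    print(n, round(p_at_least_one_trigger(0.05, n), 3))
```

Under that independence assumption a regular user would see at least one false positive fairly quickly even at a 5% per-session rate, which is consistent with how often people in this thread report being downgraded.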
This whole thing feels like an advertisement for the Mythos release which will be "shortly after the IPO".
They had similar messages on Opus, they never fixed or relaxed the topic-related safeguards there. I doubt they will here, either.
It's good they're being overcautious here. The alternative is far worse.
The alternative of... saving lives?
Didn't you know? Lab work being a skill is fake news. Jimmy Schoolshooter can make a couple of kilos of Anthrax in an afternoon with our cool genie.
Remember to buy the IPO!
It doesn't take too much apparently to tweak a virus
They don't want the real risk of someone using it to make biological or genetically targeted weapons, and they don't want the social risk of someone asking it a bunch of leading questions in order to 'prove' some racist thesis or to 'prove' Mythos is woke if it declines to go along with their performative inquiry.
Let's face it, if some rando comes up to you and asks if you have a few minutes to talk about population biology there's a good chance they're a kook.
The alternative of not hyping the IPO enough
Will someone think of the children
And by Fable they really mean Opus 4.8, because every mundane workflow or chat I try to use it in will eventually drop to Opus.
This company is so smug lol, they think it's ok to bomb kids in Iran but don't let people do some biological research
Also, don't forget the ~50 people killed in Venezuela when they attacked there. A lot of praise for the "successful" mission was given to Claude's help, if I remember correctly.
https://www.theguardian.com/technology/2026/feb/14/us-milita...
I thought they previously refused to help with war efforts?
They refused to allow autonomous weapons and domestic surveillance. They were fine with use in weapons with a human in-the-loop and with surveilling non-US nations.
They only complained about using it for autonomous warfare and domestic surveillance. They were not as hawkish as OpenAI, but by no means a dove.
Bottom line is this:
The model is not affordable for the masses. When it is not affordable for the masses then it cannot have a mass market. If it cannot have a mass market then it cannot be profitable, and if it cannot be profitable then it can be shoved into places where the sun doesn't shine, including its data, a few years down the road as VC money and private equity dry up.
Pretty incredible just how much good will Anthropic managed to burn.
Are they really burning good will? For many users this is a deal breaker. But for the general public, politicians, etc. they're stamping "safety" on their brand.
Surveillance is always advanced as a safety measure.
Can't wait till that turns into "regulatory capture"
Only if there are corrupt regulators.
I also got an email from Anthropic: "We're updating our Privacy Policy". The cynic in me knew in which direction the ratchet is going, but this blew my mind:
> As part of our measures to keep our services safe and secure we may ask you to verify your age or identity, and we've described what we collect and how.
Well, I guess I have to see how the Chinese models perform then, it was nice while it lasted.
In many cases they're amazing, too. And the visible reasoning and the pricing are amazing too.
Related ongoing thread:
AWS Bedrock to require sharing data with Anthropic for Mythos and future models - https://news.ycombinator.com/item?id=48473166 - June 2026 (223 comments)
am I correct that you basically cannot comply with HIPAA in this case, even if you had a BAA with Anthropic?
I'm new to the whole governance / compliance thing and wondering like even if you use a HIPAA compliant tool like Bedrock to serve up your inference in your VPC, this sort of puts you in a dangerous legal spot?
it seems like the data retention, even if it's metadata and they promise not to log the actual full logline, messes you up here since it's leaving your autonomous system
Also what about things like GH copilot using an anthropic model as the backend? This feels like a mess with chained data agreements
During these 30 days can they train a model and then discard the data ?
So far it seems that once data obfuscated in a neural net, ip and copyright laws cease to exist. Unlike MP3, MP4, PDF.
Groan, all abuse comes in the name of safety.
Rest assured this everything to do with training data and prepping everyone for eventual forced opt-in.
Anthropic really likes to put on a show about their ethics; then, at the drop of a hat, they nerf their models in an anti-competitive way.
It's smoke and mirrors.
Mentioned in the earlier topic as well, but one very important point here is that it looks like Anthropic is becoming a GDPR controller for all submitted data for this model (when they are in GDPR scope anyway). So data subjects would have an Article 15 right to request information about processing and possibly a copy of the data. The latter might be contested under "rights of others", but the former is more absolute.
What this means is that if someone makes an Article 15 request, they would be entitled to know whether Anthropic holds personal data about them and, at minimum, from whom they received it.
If someone wants to do that, I would recommend combining it with Article 18 request to forbid deleting the data for legal claim in case you contest Anthropic's reply. Otherwise they could just delete the data per their retention policy and DPA would find much later that they no longer hold the data.
Another issue here is that their DPA frames everything as controller-to-processor, i.e. they do not appear to have SCCs in place to actually receive this personal data as controller. So the original exporter would likely also be in breach if they send any GDPR covered personal data to this model.
Does storing personal information about you give you the right to delete it as well?
You have right to ask for it, but it doesn't guarantee that they will do it. It's also limited to data they hold as controller (i.e. the copy they hold for "safety" purposes), not the original copy that is controlled by customer. For that you will need to contact the source.
I'm worried at the general direction of this. More and more companies will gatekeep the model capability even if it is just a few percent increase in capabilities than other models. Lot of companies will start doing this in various degrees.
So if you are under an NDA, does this violate it?
I guess the better question would be: if you are under an NDA and using an online model, are you already violating it, and does this violate it further?
In the same way that using Gmail and Dropbox and iCloud and Notion violates it. (Which IANAL but for most NDAs would be not at all.)
Google Workspaces and Dropbox have an IL5-compliant offering, which means they attest that they will not do exactly this (and are audited on that). Not sure about iCloud and Notion.
I never had an NDA permit such usage.
Your NDAs prohibit emailing a colleague about the e.g. project, or discussing it in a Slack DM with the client, or tracking progress on it in JIRA? You have to do NDA'd work exclusively with local tools or end-to-end encryption? Those are some difficult NDAs!
We use inhouse on-premises email, issue tracking, and messaging. Depending on the project, external communication does require E2EE email. Development happens on local hardware and software unless required otherwise by the customer.
I'm pretty sure (even just based on the revenue of various SaaS products) that's not typical, hence "most NDAs". I'm also sure some require a SCIF, but that's not most of them.
No this is still the level below needing a SCIF. The USG really tightened this stuff up in the 2010s and highly restricts what you can do with CUI. That's why there's a whole parallel FedRamp-compliant cloud ecosystem.
But in terms of how common it is, pretty much everybody in Fairfax County works in a company with rules like this; it's a big part of why the tech culture is so different than Austin or SFO.
Oh Lord yes. We have very specific communications channels we're allowed to use about any of our sensitive products, and that's only the unclassified stuff (classified is obviously its own, stricter, beast).
It depends on the NDA
Lots of companies need a 0 day retention policy. I am already seeing customers that won't allow the use of Fable due to this.
Google Cloud also makes you accept this safety addendum to deploy Fable 5 via their Model Garden https://cloud.google.com/terms/advanced-ai-safety-addendum
I got off from all anthropic stuff a while back. And I feel the fresh air again. No bloated reasoning or code. No vendor lock-in (due to complexity increase in code). Money saved too. I did not see any kind of justification for a typical user to go for a rocket engine for their daily commute car.
Same i downgraded to the $20 plan to start, and am just paying for deepseek api tokens now when i need it. Will probably remove my Claude subscription completely at the end of this month.
I agree with the vendor lock-in aspect. My strategy was to utilize multiple agents with different APIs.
It doesn't matter. It blocks everything. A little code to run some mixed models on cortical thickness data? Blocked.
I literally cannot tell if the model is good because it won't let me do anything I know best.
Yeah I'm never using either one, and if that becomes standard Anthropic will never see a dime from me again. I'm going to draw the line in the sand right there.
This will likely get it banned with many/most corporate customers. They generally have zero tolerance for such things.
Anthropic is desperate for the IPO and will release a half-baked product that they are so afraid to release, you can literally feel the shiver through the text of their press-release.
Now they want to have any way of either fixing it, or in case someone will actually make a big boo-boo with their model, to be able to blame the guy in the end.
I asked it to check the architecture of a new app & API for security issues and it did it without complaining.
Today I asked it about whale virus out of curiosity and was dropped to Opus, which gave a great answer.
They are for sure not using Mythos or Opus to do the safeguard check.
As far as I remember OpenAI does it too even when using the API. Their reason is fraud and harmful behaviour detection. But let's be honest, does it really matter? Building a successful product does depend on so much more than the technical implementation and brainstorming you do with Fable, Mythos or any model.
They can start with 30 days, send a notice later on change in policy. Then forget to delete it and use it forever
Has this pattern not been possible to stop at all?
This kills the legal use-case. Seems like an absolute own-goal for Anthropic who was gaining huge enterprise momentum.
Didn't they all but admit they've been storing and actively looking at requests with this post: https://www.anthropic.com/news/detecting-and-preventing-dist... ?
If they weren't storing, they'd be oblivious to what customers are doing, making this kind of detection impossible. What data did they train their classifier on, if not real user (distiller) traffic?
Why can't they have trained the classifier on internal red teaming?
They basically said "Deepseek ran 150,000 requests and here's the gist of one of their prompts". Anthropic doesn't know which accounts are Deepseek proxies beforehand, so definitely sounds like retrospective analysis of broad user logs to me.
Of course Anthropic realizes saying this straight is problematic so they said they examined request metadata, but no, I don't think they can get this kind of insight from metadata (token counts, request time, etc.)
Given the model intelligence plateau and public data exhaustion the only way to improve in customer use cases is by training the model on customer data.
If this is true, then Anthropic, Google and maybe OpenAI models will keep getting better and better and everyone else will be left in the dust - as they won't have access to so much customer data.
China has proxies that sell cheaper access to frontier models in exchange for permission to train on your data.
Worth noting retention doesn't end at the model provider. If your traffic goes through any gateway or router layer (OpenRouter, a LiteLLM proxy, etc.) that layer sees every prompt too,
« Trust us, we're doing this for the good of humanity » (fills pockets with stock value and externalities from data center pollution) « No seriously trust us, at least we're not Sam Altman »
Update: « Oh and we're the only ones who will stop AI from turning into SkyNet and eating your babies, you just have to pay us to make sure we invent SkyNet first »
I guess everything is open source now (for anthropic).
Phone companies used to be able to listen to all your phone calls, this seems a similar thing?
I'm sick of the American frontier labs. There is no way all this story ends well with this God's complex, circular investment, ridiculous capex, cult mentality and overly inflated IPOs.
After the AI companies just blatantly lying that they weren't hoovering up people's IP and art for training I assume they collect any and all data they can get their hands on for training. When it comes to the big AI players feeding their future models I 100% just assume that they suck up any data we send them. Am I cynical?
Most companies have legal agreements called "Zero Data Retention" with these providers that bans them from storing any data that you send them (we are one of those companies).
The difference here is that for the first time Anthropic have said that's not available for 'Mythos class' models.
Sure. That's what they say but does anyone actually trust what they say about where they are getting their data? They also had a legal requirement not to steal IP, they said they weren't and then it came out that both OpenAI and Anthropic were pirating mass amounts of media. When they said they weren't doing this they were knowingly lying. I'm quite certain that some (if not all big players) are retaining data from their customers despite explicit agreements not to do so. As a data engineer myself, I know how easy this is to do.
So... because of risk of retaliatory litigation I have to sit on vuln reports for one month while black hats are free to roam.
I think this was the most sensible way to deploy this model. Considering how much of a step up it has been from Opus.
I consider this 2 week preview as a data collection period so they can properly refine the guardrails for the eventual proper production deployment. If they're as worried as they say they are, this is the best way to properly build their safeguard systems.
It's annoying af, but I'd rather be cautious here.
I enjoyed seeing all the 'privacy notice' emails in my inbox today thanks to this
Privacy is forbidden.
Everything you do will be used against you in court if required.
the grooming (marketing) game is strong with anthropic
I remember the "Don't be evil" days from Google. At some point most morals change with enough money.
the real risk is using it at all as you are already sending them your data. If you are ok with that, then this retention/review seems ok.
There were two (expensive) exceptions / alternatives so far: Bedrock and Vertex. Their Zero Data Retention was in fact contractually enforced. Now it is all f...d because of these morons at Anthropic. For now I am better off just using DS via their API.
This is just a tragic moment for Tech. We just killed AI privacy. OpenAI already follows this trend and others will do too.
The only hope now is ... tada .. Mistral LOL
Hmmm no? The only way is to deploy your own local model, using anyone else's you are at their whim on what happens to your data.
And blindly trust thousands of unknown python developers?
The only way to really be sure is to build it from scratch starting with your own silicon wafers.
But for those who are a little more pragmatic, AWS has a good track record of fulfilling their promises.
It's not binary. With AWS previously you have contractual guarantees with a third party, that's been in business for a couple decades, which explicitly state zero seconds of data retention - only as long as needed for inference.
Consider the security angle too. You now have to rely on Anthropic's infrastructure security. You did not previously when you used Bedrock/Vertex/etc.
From a personal use perspective yes, the big issue here is enterprise and existing contracts as surely most companies will have signed zero retention.
Lawyers are gonna be making this a legal quagmire for years. Even after it gets retracted.
Just a play to get more data
why would anyone assume anything else than that they keep it forever?
This could be a big issue for firms with strict GDPR criteria: "This change only applies to organizations that have set up workspaces with zero data retention (ZDR) in Claude Console, use Claude Code with ZDR in Claude Enterprise, or access Claude through AWS Bedrock, Google Cloud Agent Platform, or Microsoft Foundry with ZDR. The rest of this article applies only to these organizations."
All I can say to my team (and my clients): "f...k Anthropic". They've just put both Bedrock and Vertex on the slippery slope of "we don't collect your prompts. period. ... comma ... except ..."
Right now we have changed the code of all our agents to data retention mode 'none' (Note: not "default" or "inherited", this is not enough now!) and we are fighting with GCP doco to set similar things for Vertex.
This is just terrible.
Then don't use it.
That's exactly what my employer has communicated. It will not be allowed.
Step 1: Find all companies that refuse or ban the use of SOTA models out of irrational fear.
Step 2: Use SOTA models to copy them and crush them
Step 3: Profit.
(Yes, not every business is easily replicable, but you sure can find some)
This. And AI labs seem to be above IP / Copyright law and absolutely nothing will happen to them when they grab all the data and package it up.
Until all of your interactions are trained into future model releases, and another competitor steps in and takes all your "R&D" straight out of the model.
Now it's open season for literally anyone.
This policy change doesn't allow training, just like the previous one.
Step 4, get sued because you violated an NDA or other regulation?
I'm not talking about Claude copying.
I'm talking about scouring Twitter/LinkedIn for posts from employees who say SOTA models are banned. Look at what the business does. Copy it using SOTA. Call their clients with a 30% discount, faster turnaround, and a higher-quality product.
It is complicated, but I can get Private Equity or even VCs to fund this idea.
tl;dr -- I'm actually agreeing with you. Anthropic will never copy your business model due to NDA. But there is plenty of fearmongering about them copying you, and because of it you won't use their models. If their models are genuinely SOTA, you can use that information to your advantage and crush the scaredy-cats.
Edit: The fact that these get downvoted is exactly the reason why it's easy to win
The thing is, just like employees at non-AI-forward companies "cheat" by using their personal Claude.ai and ChatGPT.com, so will big companies, or at least some teams/departments, regarding this Fable issue. LLMs might be new, but this kind of behavior is classic.
As you said, if they don't, they will be easy pickings.
To be very clear, I ain't that guy. So, if this is true, I might be somewhat easy pickins myself. But, well known trust is a huge part of our org's value prop with our clients. God this sucks.
Can you name a single example of a business that has been replaced by another business leveraging LLMs to copy and "crush" their software?
Pretty much any Chinese business. (Except takeouts and laundries)
I mean, this is the biggest reason that's my employer's position
what a glorious time to be a plaintiff attorney, subpoenas for ai transcripts left and right.
Reminder: FISA Section 702, aka FAA702, aka PRISM, aka the #1 most used collection source by the US IC, allows *warrantless* realtime access for the US federal government to everything Anthropic, OpenAI, Google, Apple, Microsoft, Amazon, and Meta have on you.
Thank you. That completes the picture for me.
That should be higher
I am definitely for services respecting customer privacy, but I can't help wondering if this is different. I recently saw a thread where a person was bragging that frontier providers were blocking their attempt at what looked to be a social media de-anonymization and blackmail app.
Maybe this isn't different than using something like Google Sheets to keep a list of people to dox and blackmail, but the leverage certainly makes it feel different.
I mean, it's not just the 30-day data retention part; I think the serious trade-off with this product is token efficiency. They trade it for precision. Their claim that it found a 30-year-old software bug in millions of lines of code is just precision. To a human it looks like a lot, but for the model it's just the ability to process tokens. Let's see how long it runs. Peace.
Thirty days, thirty days everywhere...I wonder why? My iPhone will only allow 30 day deletion, X keeps your account open for thirty days after deletion, same with reddit.
Conspiracy?
Does *anybody* believe their weasel words? I wholly expect ALL data sent to them will be saved indefinitely for training. And I mean all. Voice, text, pictures, scraped websites. You name it.
All the LLM vendors are the biggest commercial pirates ever known. And they got away with it. To think they care about a piece of toilet paper called a "privacy policy", well, have I got a bridge to sell you.
I actually think that's warranted. And if you used it to poke around, you would also agree.
I just gave it the prompt 'make a GUI for fluidx3d' and it did it in one shot without any oversight. It is incredible.
> And if you used it to poke around, you would also agree.
Would you elaborate? Not sure what you're describing
All the pre-publicity from Anthropic was about how it was amazing at finding security vulnerabilities, so it's not a stretch to think that some people would want to exploit that for nefarious purposes.
Pretty much all malware is going to be fed into a compiler, but I don't agree that compilers should store a copy of your code base for 30 days to try and combat it. Nor would I agree to compiler manufacturers putting in guardrails that make your program behave slightly incorrectly if the compiler thinks your code is malicious.
My bet is that Anthropic will be exposed as openly evil within the next five years--even if they aren't even secretly evil now. That's the arc of the sociopathic corporate brain, every time.
What an annoying company, I wish it didn't exist..