5 comments

  • kanzure a day ago

    Anyone can generate an alternative chain of SHA-256 hashes. Perhaps you should consider timestamping, e.g. OpenTimestamps (https://opentimestamps.org/). As for what the regulation says, I haven't looked, but perhaps it doesn't require the system to be actually tamper-proof.

    • systima a day ago

      Thanks for the thoughts and feedback.

      Fair point on the reconstruction attack.

      The library is deliberately scoped as tamper-evident, not tamper-proof; it detects modification but does not prevent wholesale chain reconstruction by someone with storage access. The design assumes defence-in-depth: S3 Object Lock (Compliance mode) at the infrastructure layer, hash chain verification at the application layer.
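
      A minimal sketch of that distinction, in Python purely for illustration. The function names and entry layout are hypothetical and are not the library's API; the point is only that an in-place edit breaks verification while a wholesale rebuild does not.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder predecessor for the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with a canonical JSON payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def build_chain(payloads: list[dict]) -> list[dict]:
    """Link each payload to the hash of the entry before it."""
    chain, prev = [], GENESIS
    for payload in payloads:
        digest = entry_hash(prev, payload)
        chain.append({"prev": prev, "payload": payload, "hash": digest})
        prev = digest
    return chain


def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return True


original = build_chain([{"decision": "approve"}, {"decision": "deny"}])

# In-place edit of one payload: the stored hash no longer matches.
tampered = [dict(e) for e in original]
tampered[0] = {**tampered[0], "payload": {"decision": "deny"}}

# Wholesale rebuild from altered payloads: internally consistent.
rebuilt = build_chain([{"decision": "deny"}, {"decision": "deny"}])

print(verify_chain(original), verify_chain(tampered), verify_chain(rebuilt))
```

      The rebuilt chain verifies on its own terms. It only differs from the original in its head hash, which is why an anchor outside the attacker's reach (immutable storage or an external timestamp) matters.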

      External timestamping (OpenTimestamps, RFC 3161) would definitely add independent temporal anchoring and is worth considering as an optional feature. From what I can see, Article 12 does not currently prescribe specific cryptographic mechanisms (but of course the assurance level would increase with it).

      On the regulatory question: Article 12 requires "automatic recording" that enables monitoring and reconstruction, and current regulatory guidance does not require tamper-proof storage (only trustworthy, auditable records). The hash chain plus immutable storage is designed to meet that bar, but the point you raise is a good one.

  • sathishmg 5 hours ago

    Good one. Quick question: if you store it hash-chained, how are you handling GDPR erasure requests? Isn't erasure supposed to happen within 30 days under GDPR, rather than 180? Do you recreate the chain, use some sort of pseudonymization, or something else?

    • systima an hour ago

      Great question.

      voxic11 is right that the AI Act creates a legal obligation that provides a lawful basis for processing under GDPR Article 6(1)(c).

      To add to that, Article 17(3)(b) specifically carves out an exemption to the right to erasure where retention is necessary to comply with a legal obligation.

      (So the defence works at both levels; you have a lawful basis to retain, and erasure requests don’t override it during the mandatory retention period).

      That said, GDPR data minimisation (Article 5(1)(c)) still constrains what you log.

      The library addresses this at write-time today, in that the pii config lets you SHA-256 hash inputs/outputs before they hit the log and apply regex redaction patterns, so personal data need never enter the chain in the first place.

      This enables the pattern of “Hash by default, only log raw where necessary for Article 12”.
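
      Roughly, the write-time step looks like the sketch below. This is Python for illustration only: the function names, the `log_raw` flag and the single email pattern are assumptions, not the library's actual pii config.

```python
import hashlib
import re

# Hypothetical redaction patterns; a real deployment would tune these.
REDACTION_PATTERNS = [
    re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),  # email addresses
]


def redact(text: str) -> str:
    """Replace every match of every pattern with a fixed marker."""
    for pattern in REDACTION_PATTERNS:
        text = pattern.sub("[REDACTED]", text)
    return text


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def prepare_for_log(prompt: str, output: str, log_raw: bool = False) -> dict:
    """Hash by default; keep (redacted) raw text only when explicitly requested."""
    if log_raw:
        return {"input": redact(prompt), "output": redact(output)}
    return {"input_sha256": sha256_hex(prompt), "output_sha256": sha256_hex(output)}


hashed = prepare_for_log("Contact jane@example.com", "ok")
raw = prepare_for_log("Contact jane@example.com", "ok", log_raw=True)
```

      In the default path only digests reach the log entry; in the raw path the text is redacted before it is written.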

      For cases where raw content must be logged (e.g., full decision reconstruction for a regulator), we’re planning a dual-layer storage approach. The hash chain would cover a structural envelope (timestamps, decision ID, model ID, parameters, latency, hash pointers), while the actual PII-bearing content (input prompts, output text) would live in a separate referenced object.

      Erasure would then mean deleting the content object, and the chain would stay intact because it never hashed the raw content directly.

      The regulator would also therefore see a complete, tamper-evident chain of system activity.
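
      Since this is still planned rather than shipped, here is only a rough sketch of the idea in Python. Every name is hypothetical, the envelope is trimmed to a few fields, and an in-memory dict stands in for the separate content store.

```python
import hashlib
import json

GENESIS = "0" * 64
content_store: dict[str, str] = {}  # stand-in for a separate, erasable object store
chain: list[dict] = []


def sha256_hex(data: str) -> str:
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def append_decision(decision_id: str, model_id: str, prompt: str) -> None:
    """Store raw content separately; chain only the structural envelope."""
    content_ref = f"content/{decision_id}"
    content_store[content_ref] = prompt
    envelope = {
        "decision_id": decision_id,
        "model_id": model_id,
        "content_ref": content_ref,
        "content_sha256": sha256_hex(prompt),  # hash pointer, not the content
    }
    prev = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps(envelope, sort_keys=True)
    chain.append({"prev": prev, "envelope": envelope, "hash": sha256_hex(prev + body)})


def verify() -> bool:
    """Verification reads envelopes only, never the content store."""
    prev = GENESIS
    for entry in chain:
        body = json.dumps(entry["envelope"], sort_keys=True)
        if entry["prev"] != prev or entry["hash"] != sha256_hex(prev + body):
            return False
        prev = entry["hash"]
    return True


def erase(decision_id: str) -> None:
    """Erasure deletes the content object and leaves the chain untouched."""
    content_store.pop(f"content/{decision_id}", None)


append_decision("d1", "model-a", "prompt containing personal data")
append_decision("d2", "model-a", "another prompt")
erase("d1")
print(verify(), "content/d1" in content_store)
```

      Whether a plain content hash may stay in the envelope after erasure is a separate question this sketch does not settle.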

    • voxic11 3 hours ago

      GDPR permits retention where necessary for compliance with a legal obligation (Article 6(1)(c)).

      The AI Act qualifies as such a legal obligation.