I don't understand the need for this level of engineering. It appears we are going for an opaque bearer token here. The checksum is pointless because an entire 512-bit token still fits in an x86 cache line. Comparing the whole sequence won't show up in any profiler session you will ever care about.
If you want aspects of the token to be inspectable by intermediaries, then you want JSON Web Tokens or a similar technology. You do not want to conflate these ideas. JWTs would solve the stated database concern. All you need to store in a JWT scheme are the private/public keys. Explicit tracking of the session is not required.
Hello bob! The checksum is for offline secret scanning and also for rejecting API keys that might have a typo (a niche case)
I was just confused about the JWT approach, since from the research I did I saw that an API key is supposed to be a unique string and that's it!
I may be naive, but I can't imagine anyone typing an API key by hand. Optimizing for it sounds like premature optimization; surely stopping the fewer than one-in-a-million HTTP requests with a hand-typed API key from reaching the DB isn't worth anything.
If not for typos, then I can use it for secret scanning :)
Good point!
The neat thing about JWT is that there are no secrets to scan for. Your secret material ideally lives inside an HSM and never leaves. Scanning for these private keys is a waste of energy if they were generated inside the secure context.
But JWTs are usually used as bearer tokens when doing API authentication. Those are definitely secrets that need to be scanned for.
Or are you suggesting that the API requests are signed with a private key stored in an HSM, and the JWT certifies the public key? Is that common?
> are you suggesting that the API requests are signed with a private key stored in an HSM, and the JWT certifies the public key? Is that common?
Very. The thing that certifies the public key is called a JWK.
https://datatracker.ietf.org/doc/html/rfc7517
This is typically hosted at a special URL that enables seamless key rotation and discovery.
https://auth0.com/docs/secure/tokens/json-web-tokens/json-we...
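For reference, a JWKS document is just JSON listing the server's public keys; here is a minimal sketch of the shape (field names are from RFC 7517, the values are hypothetical):

```python
# Roughly what a JWKS endpoint serves; "kid" lets verifiers pick the
# right key during rotation without any downtime.
jwks = {
    "keys": [
        {
            "kty": "RSA",              # key type
            "use": "sig",              # intended for signature verification
            "kid": "2024-rotation-1",  # hypothetical key ID
            "alg": "RS256",
            "n": "<base64url-encoded modulus>",  # placeholder, not a real key
            "e": "AQAB",               # public exponent 65537
        }
    ]
}
```

A verifier matches the `kid` from an incoming JWT's header against this set to pick the verification key.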
That's how JWT is designed to work
Ideally an API key shouldn't contain anything regarding the account, or any other info, right? It's meant to be an opaque string, is what I found in most of the other articles I read. Please do let me know if I am wrong about this assumption.
JWT operates on a different principle; the user's private key (API key) never leaves the user's device. Instead, the stated "role" and other JSON data are signed with the server's private key, then verified against the corresponding public key, granting the permissions that role allows.
Look at the JWT standard, it usually contains things like claims, roles, user ids, etc.
"for rejecting api keys which might have a typo" - won't they get rejected anyway?
it's just an added benefit; I don't have to make a DB call to verify that :)
> The checksum is pointless because an entire 512 bit token still fits in an x86 cache line
I suppose it’s there to avoid round-trip to the DB. Most of us just need to host the DB on the same machine instead, but given sharding is involved, I assume the product is big enough this is undesirable.
You need to support revocation, so I'm not sure it's ever possible to avoid the need for a round trip to verify the token.
The point of the checksum is to just drop obviously wrong keys. No need to handle revocation or do any DB access if checksum is incorrect, the key can just be rejected.
That sounds like it's only helpful for ddos mitigation, in which case the attacker could trivially synthesize a correct checksum.
You don't have to use a publicly documented checksum.
If you use a cryptographically secure hashing algorithm, mix in a secret salt and use a long enough checksum, attackers would find it nearly impossible to synthesise a correct checksum.
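Something like this, for example (a sketch; all names and lengths are made up): the checksum is a truncated HMAC keyed with a server-side secret, so outsiders can't synthesize a checksum that passes.

```python
import hashlib
import hmac
import secrets

# Server-side secret; never published, so the checksum is unforgeable.
CHECKSUM_SECRET = secrets.token_bytes(32)

def checksum(key_body: str, length: int = 8) -> str:
    # Truncated HMAC-SHA256 over the key body.
    mac = hmac.new(CHECKSUM_SECRET, key_body.encode(), hashlib.sha256)
    return mac.hexdigest()[:length]

def quick_reject(api_key: str) -> bool:
    # Cheap gate before any DB access: True means "plausibly valid".
    body, _, check = api_key.rpartition("-")
    return hmac.compare_digest(checksum(body), check)

key_body = "sk_live_" + secrets.token_hex(16)
api_key = f"{key_body}-{checksum(key_body)}"
```

Anything with a garbage checksum is dropped without touching the database; only keys that pass this gate proceed to the real lookup.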
I don't follow. The checksum is in "plain text" in every key. It's trivial to find the length of the checksum and the checksum is generated from the payload.
Others have pointed out that the checksum is for offline secret scanning, which makes a lot more sense to me than ddos mitigation.
> I assume the product is big enough
Experience tells otherwise
> I suppose it’s there to avoid round-trip to the DB.
That assumption is false. The article states that the DB is hit either way.
From the article:
> The reason behind having a checksum is that it allows you to verify first whether this API key is even valid before hitting the DB,
This is absurdly redundant. Caching DB calls is cheaper and simpler to implement.
If this were a local validation check, where the API key's signature is checked against a secret to avoid a DB round trip, then I could see the value in it. But that's already well into the territory of an access token, which would then be reason enough to reject the whole idea.
If I saw a proposal like that in my org I would reject it on the grounds of being technically unsound.
JWTs solve some problems but then come with a lot of their own. I do not think they should be the goto solution.
Hey OP, sorry for the negativity, I think most of these commenters right now are pretty off-base. My company is building a lot of API infrastructure and I thought this was a great write up!
It is alright, I am learning a lot from them as well; healthy criticism is always useful :) I am very glad that you found this a great write-up ^_^
While it's true that API keys are basically prefix + base32Encode(ID + secret), you will want a few more things to make secure API keys: at least versioning and hashing metadata to avoid confused deputy attacks.
Here is a detailed write-up on how to implement production API keys: https://kerkour.com/api-keys
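That "prefix + base32Encode(ID + secret)" shape, with a version byte added, could look something like this (a sketch; the prefix, field sizes, and layout are my assumptions, not the article's exact format):

```python
import base64
import secrets
import struct

PREFIX = "myco"  # hypothetical product slug
VERSION = 1      # bump when the key format changes

def generate_api_key(key_id: int) -> str:
    # version byte + 64-bit key ID + 128-bit random secret, base32-encoded
    payload = struct.pack(">BQ", VERSION, key_id) + secrets.token_bytes(16)
    return PREFIX + "_" + base64.b32encode(payload).decode().lower()

def parse_api_key(key: str) -> tuple[int, int, bytes]:
    prefix, _, encoded = key.partition("_")
    assert prefix == PREFIX
    payload = base64.b32decode(encoded.upper())
    version, key_id = struct.unpack(">BQ", payload[:9])
    return version, key_id, payload[9:]  # trailing bytes are the secret

key = generate_api_key(42)
```

The embedded key ID lets the server find the row directly, and the version byte means you can change the hashing or layout later without breaking old keys.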
Interesting read, I do have some questions though and hope you could answer them:
1. Why do you use the API key ID AND the organization ID, and not just one of them, to prevent the confused deputy problem?
2. Why is it not necessary to use something like Argon2id for hashing? You say "our secret is already cryptographically secure", but what does this mean exactly? Is it because the secret already has very high entropy, so that cracking it, even with much faster hash functions like the ones mentioned in your article, would be practically impossible even post-quantum with highly parallelized hardware?
Anyways, very interesting read, thank you!
I don’t understand your explanation on mitigating the confused deputy. If the attacker has access to the database, can’t they just read the IDs for the target row they are overriding first so they can generate the correct hash?
The attack would be like: attacker has read/write access to the database but not to the code of the backend service. Attacker swaps the hash of a targeted API key with the hash of their own API key. Attacker has now access to the resources of the targeted organization when using their own API key.
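If I understand the mitigation correctly, the stored hash commits to the row's own identifiers, not just the secret, so a transplanted hash no longer verifies. A hypothetical sketch:

```python
import hashlib

def stored_hash(secret: bytes, key_id: str, org_id: str) -> bytes:
    # Mix the row's own IDs into the hash so it can't be moved to another row.
    material = key_id.encode() + b"\x00" + org_id.encode() + b"\x00" + secret
    return hashlib.sha256(material).digest()

# Attacker's hash, computed for the attacker's own row:
h_attacker = stored_hash(b"attacker-secret", "key_666", "org_evil")
# After swapping it into the victim's row, the server recomputes with the
# victim's row IDs and the attacker's presented secret:
h_recomputed = stored_hash(b"attacker-secret", "key_123", "org_victim")
```

Because the recomputed hash uses the victim row's IDs, it doesn't match the transplanted value, and the swapped key is rejected.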
Thank you! I will definitely look into it!
The purpose of the checksum is to help secret scanners avoid false positives, not to optimize the (extremely rare) case where an API key has a typo.
I suppose there could be two checksums, or two hashes: the public spec that can be used by API key scanners on the client side to detect leaks, and an internal hash with a secret nonce that is used to validate that the API key is potentially valid before needing to look it up in the database.
That lets clients detect leaks, but malicious clients can't generate lots of valid-looking keys to spam your API endpoint and create database load just from looking up API keys.
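That split could look like this (a sketch; CRC32 stands in for whatever public checksum spec you'd document, and the internal part is keyed):

```python
import hashlib
import hmac
import secrets
import zlib

SERVER_SECRET = secrets.token_bytes(32)  # internal; never documented

def public_checksum(body: str) -> str:
    # Published spec: secret scanners can compute this offline.
    return format(zlib.crc32(body.encode()), "08x")

def internal_checksum(body: str) -> str:
    # Keyed: outsiders can't mint keys that pass this pre-DB gate.
    return hmac.new(SERVER_SECRET, body.encode(), hashlib.sha256).hexdigest()[:8]

def mint_key() -> str:
    body = secrets.token_hex(16)
    return ".".join([body, public_checksum(body), internal_checksum(body)])

def plausibly_valid(key: str) -> bool:
    body, _public, internal = key.split(".")
    return hmac.compare_digest(internal_checksum(body), internal)

key = mint_key()
```

Scanners only ever use the public part; the server checks the keyed part first and skips the DB entirely for synthesized keys.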
thank you so much ram chip :) I didn't know that!
Side note: the slug prefix is not primarily intended for the end-user / developer to figure out which kind of key it is, but for security scanners to detect when they are committed to code / leaked and invalidate them.
Ahhhh I see, I didn't think about it that way either, this could help us a lot yea!!!
I don't like giving away any information whatsoever in an API key, and would lean towards a UUIDv7 string, just trying to avoid collisions.
Even the random hex with checksum component seems overkill to me, either the API key is correct or it isn't.
GitHub introduced checksums to their tokens to aid offline secret scanning. AFAIK it’s mostly an optimization for that use case. But the checksums also mean you can reveal a token’s prefix and suffix to show a partially redacted token, which has its benefits.
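The partial-redaction trick is trivial but handy; a sketch of the display side (lengths are arbitrary):

```python
def redact(token: str, head: int = 8, tail: int = 4) -> str:
    # Enough of the token for a user to recognize it, not enough to use it.
    return token[:head] + "..." + token[-tail:]

# Hypothetical token, shaped like a GitHub-style prefixed key:
shown = redact("ghp_AbCdEfGhIjKlMnOpQrStUvWxYz123456")
```

The prefix identifies the token type and the suffix disambiguates between a user's keys, while the high-entropy middle stays hidden.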
Identifying an opaque value is useful for security analysis. You can use regex to see when they are committed to repos accidentally, for example.
Hello everyone this is my third blog, I am still a junior learning stuff ^_^
Hey, welcome to HN!
Reading “hex” pointing to a clearly base62-ish string was a bit interesting :-)
Also, could we shard based on a short hash of account_id, and store the same hash in the token? This way we can lose the whole api_key → account_id lookup table in the metashard altogether.
Hello, thanks for reading through my blog :D Coming to your question, yes! That is possible, I mentioned it in my second approach!
But when I mentioned it to my senior he wanted me to default with the random string approach :)
I NEVER THOUGHT I WOULD BE IN THE MAIN PAGE OF HACKERNEWS THANK YOU SO MUCH GUYS (╥﹏╥)
I've always been interested in the technical distinction between an API "key" and an API "token". And the terminology of "key" used to confuse me, because I associated that with cryptography, and I thought an API key would be used to sign or encrypt something. But it seems that in many cases it's basically just a long, random password.
A bit over-engineered, but it was fun to read about observations on industry standard API keys. I agree it would be nice with more discussion around API keys and qualities one would want from them.
I don't even understand what approach 3 is doing. They ended up hashing the random part of the API key with a hash function that produces a small hash and stored that in the metashard server, is that it?
yea... sorry, I am still not the best explainer, but that is the approach; I just wanted to have a shorter hash in the metashard, that is it. Approach 3 is an attempt by me to write my own base62/base70 encoder ;-;
Presumably because API keys are n bytes of random data vs. a shitty user-generated password we don’t have to bother using a salt + can use something cheap to compute like SHA256 vs. a multi-round bcrypt-like?
Correct.
Even a million rounds of hashing only adds 20 bits of security. No need if your secret is already 128 bits.
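The arithmetic behind that claim: n rounds of hashing multiplies the attacker's work by n, which is the same as adding log2(n) bits of security.

```python
import math

# A million rounds of key stretching:
rounds_bits = math.log2(1_000_000)  # just under 20 bits
# Next to a 128-bit random secret, that's a rounding error:
total = 128 + rounds_bits
```

Stretching matters for ~40-bit human passwords; it does nothing meaningful for a 128-bit random secret.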
I can't understand what you are trying to say :o
How are you storing the API key in your database?
hash of the API key just like passwords
I think they are saying passwords are salted and hashed with multiple rounds to prevent rainbow tables and slow down brute-forcing the password (in case of a DB leak). We don't need to do that for long randomized strings (like API keys); no one is guessing a 32-character random string, so no salt is needed and we don't need multiple rounds of hashing.
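In code, the whole contrast is this (a sketch): for a high-entropy random key, a single unsalted SHA-256 is enough, since a rainbow table would have to cover an astronomically large input space anyway.

```python
import hashlib
import hmac
import secrets

def new_api_key() -> tuple[str, str]:
    key = secrets.token_urlsafe(32)  # 32 random bytes: unguessable
    # One fast, unsalted hash is fine here; store only the digest.
    return key, hashlib.sha256(key.encode()).hexdigest()

def verify(presented: str, stored_digest: str) -> bool:
    digest = hashlib.sha256(presented.encode()).hexdigest()
    return hmac.compare_digest(digest, stored_digest)

key, stored = new_api_key()
```

Compare that with a password, where you would reach for bcrypt/Argon2id with a per-user salt precisely because the input has so little entropy.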
What if the "slug" was a prefix for the API key revocation URL, so the API key was actually a valid URL that revoked itself if fetched/clicked? :)
Hey - this was a great blog! I liked how you used the birthday paradox here.
PS: I too am working on APIs. Take a look here: https://voiden.md/
It's a bit confusing that the "Random hex" example contains characters such as "q" and "p".
I don't understand your question :o
Hex is 0-9, a-f. P and q are outside that character set.
yes, you are right onei, it is supposed to be a random string instead of hex, I am sorry I made that mistake
fixed it in the blog, thanks for pointing it out amelius ;-;
> I didn't proceed with this approach since I don't want the API keys to have any info regarding the account, but hey it is all just a matter of preference and opinion.
Well I would have done that and saved half the blog post.
I know sometimes people just like to try things out, but for the love of god do not implement encryption related functionality yourself. Use JWT tokens and OpenSSL or another established library to sign them. This problem is solved. Not essentially solved, solved. Creating your own API key system has a high likelihood of fucking things up for good!
You don't need any encryption or signing for API keys. Using JWTs is probably more dangerous here, and more annoying for people using the API since you now have to handle refreshing tokens.
Plain old API keys are straightforward to implement. Create a long random string and save it in the DB. When someone connects to the API, check if the API key is in your DB and use that to authenticate them. That's it.
> Plain old API keys are straightforward to implement
This is pretty much just plain-old-api-keys, at least as far as the auth mechanism is concerned.
The prefix slug and the checksum are just there so your vulnerability scanner can find and revoke all the keys folks accidentally commit to GitHub.
yes this is the approach!
I would add the capability to be able to seamlessly rotate keys.
But otherwise, yes, for love of everything holy - keep it simple.
We don't store it in plain text, right? Store them hashed, as always.
The security here comes from looking the key up in the DB, not from any crypto shenanigans.
This is a very good example of premature optimization.
Everything about this is over engineered. Just KISS.
Is this running in a production environment yet? If so, do you have an email address to disclose a vulnerability?
no this is just a POC, I haven't implemented any of it
Ok, then for everyone: don't save tokens in a database; selects are vulnerable to timing attacks. You want a token to include an ID and a signature. The ID is used to look up the scope or user attached to the token, while the signature is recreated from the ID, the server secret, and some salt. The resulting signature is double-checked against the provided signature with a constant-time comparison.
An attacker will be able to identify valid keys, but won't be able to sign them.
You can either split the values like AWS or join them with a separator.
Good idea with the slug though, makes it easier to report leaked tokens to the issuer.