I think it's more accurate to call this known string a protocol namespace separator. It ensures that another protocol can't create a colliding signature on the same message. "Salt" has long been used to mean this, but never with precise semantics, because it overlaps with another concept that is usually called a "nonce".
The reason the "salt", or as I prefer "protocol namespace separator", is critical is that it increases the size of the field in which secrets can be used on a given protocol, so that one protocol is not competing in the same space as another.
Loosely speaking, a finite field is a fixed-size set of numbers (in cryptography, usually the integers modulo a prime), each of which can be identified by a plain scalar. In the picture I'm using, the field is anchored on one specific value that you define as the zero, and hashing that value gives you the field's zero point.
When you take any other input, you use a hash function to turn it into a member of the field, and by including this "salt" or "protocol namespace separator" in the value you hash, the member you derive is specific to your protocol. You can think of the result as the member of that field's number set which corresponds to the input the hash was derived from.
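To make the namespace idea concrete, here is a minimal sketch using only SHA-256 from the Python standard library. The function name `tagged_hash` and both namespace strings are made up for illustration, and the length prefix is my own assumption about how to keep the tag and the message from blurring into each other; it is not taken from any particular protocol.

```python
import hashlib

def tagged_hash(namespace: bytes, message: bytes) -> bytes:
    """Hash `message` under a protocol-specific namespace tag (illustrative)."""
    # Length-prefix the tag so bytes cannot be shifted between the tag
    # and the message to produce the same hash input.
    prefix = len(namespace).to_bytes(2, "big") + namespace
    return hashlib.sha256(prefix + message).digest()

msg = b"pay 1 coin to alice"
digest_a = tagged_hash(b"example-protocol-a/v1", msg)  # hypothetical namespace
digest_b = tagged_hash(b"example-protocol-b/v1", msg)  # hypothetical namespace

print(digest_a.hex())
print(digest_b.hex())
print("same message, different namespaces, digests differ:", digest_a != digest_b)
```

The same message lands on unrelated values under the two namespaces, so anything computed over one digest (a signature, a derived key) says nothing about the other.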
If you don't add a specific base value as the zero in your field, the risk of leaking the secret rises with the number of competing users of the same hash function. Like mining bitcoin, finding these solutions comes down to sheer numbers of attempts, so the more distinct values you make available, the lower the chance of a collision, as more attempts must be made. The number of possibilities doubles with every added bit, so it grows exponentially, which is also why people talk about "bits of security" in encryption.
On the other hand, the in-message random value, the nonce, has to be sent in cleartext, whereas the salt can simply be a protocol feature, assumed to have been part of the construction of the encryption secret.
I'm not sure why it got that name, as it is slang for pedophile in British English; it always makes me squirm a bit to use it.
The reason for the confusion is the vague way the word "salt" has been used. IMO it helps a lot to call the known value that is not in the message the "protocol namespace separator" and the random value carried with the encrypted message the "nonce". Then there is no confusion. Salt is an old, old name for adding a value before applying a hash function. It dates back to before the concept of a hash function was even clear, when hashing was primarily a data integrity mechanism, e.g. CRC32 and such. Hash functions became important after the invention of the hash map, which is a mechanism for speeding up the search for records in a database.
Replies (2)
Thanks. I have been reading an HKDF paper and was starting to see that. I don't think using a fresh random number every time here breaks anything, but it is not needed.
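For anyone following along, this is roughly what HKDF (RFC 5869) looks like when written out with the standard library's `hmac`, with a fixed namespace string as the salt and the per-message nonce fed in as the `info` input. Treat it as a sketch: the namespace string, the placeholder shared secret and the choice to put the nonce into `info` are my assumptions for illustration, not a description of any specific protocol.

```python
import hashlib
import hmac
import os

HASH_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256

def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    # RFC 5869: an empty salt is treated as HASH_LEN zero bytes.
    if not salt:
        salt = bytes(HASH_LEN)
    return hmac.new(salt, ikm, hashlib.sha256).digest()

def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        # T(n) = HMAC(PRK, T(n-1) || info || n)
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]

shared_secret = bytes(range(32))   # placeholder for a real shared secret
namespace = b"example-protocol/v1"  # hypothetical fixed, public salt

# Extract once per conversation: the salt is a constant of the protocol.
prk = hkdf_extract(namespace, shared_secret)

# Expand once per message: the nonce is random and travels in cleartext.
nonce = os.urandom(32)
message_key = hkdf_expand(prk, nonce, 32)
print(message_key.hex())
```

The fixed salt does the namespace separation and the nonce supplies the per-message uniqueness, which is why the salt itself does not need to be random each time.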
I think of nonce as never-more-than-once.
Not more than ONCE :D yes, it makes sense, I sorta got the inkling that uniqueness was something about the concept but not the word "once".
On a side note, since malice and error are indistinguishable without evidence of intent, you could say that pedophiles represent high entropy humans, that replicate their entropy onto immature other humans, that replication part being the most fundamental element of their intent (normalization).
Random values in messages are definitely needed for optimal security, unless there is inherent entropy in the message. 32 bits of timestamp don't constitute strong entropy. Maybe if it were a 64-bit nanosecond timestamp we'd be starting to talk about something with low collision potential.
Keep in mind that this encryption is not just for the relatively low volume of human-readable text. It will inevitably be needed to encrypt massive amounts of data, and the bigger the amount of data, the more likely a collision becomes. For this reason GCM, for example, should not keep using the same secret indefinitely: NIST's guidance caps a key at 2^32 messages when the nonces are random, and a single message at roughly 64 GB. This is directly related to the permutations created by the entropy of the data you are encrypting and the size of the nonce, which for GCM is 12 bytes, or 96 bits.
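The collision risk here is the usual birthday bound, and it is easy to put rough numbers on it. The snippet below is a back-of-envelope sketch that assumes the values are uniformly random; the function name and the sample counts are just ones I picked for illustration.

```python
import math

def collision_probability(samples: int, bits: int) -> float:
    """Approximate chance that `samples` uniform random `bits`-bit values repeat."""
    # Birthday approximation: p ~ 1 - exp(-n(n-1) / (2 * 2^bits)).
    exponent = -samples * (samples - 1) / (2 * 2**bits)
    # expm1 keeps precision when the probability is tiny.
    return -math.expm1(exponent)

# 32-bit values, e.g. a seconds timestamp treated as if it were random:
p32 = collision_probability(100_000, 32)
# 96-bit random nonces, 2^32 messages under one key:
p96 = collision_probability(2**32, 96)

print(f"32-bit, 100k samples: {p32:.3f}")
print(f"96-bit, 2^32 samples: {p96:.3e}")
```

With 32 bits a repeat is more likely than not after only a hundred thousand values, while 96-bit nonces stay around one in 2^33 even after 2^32 messages, which is where a per-key message cap of that size comes from.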
SSL/TLS uses 128 and 256 bit sizes for the secrets because, at those sizes, reversing them becomes unlikely within the time window over which the security of the protocol must survive. The cheaper computation gets, and the more effective parallelisation becomes, the more likely it is that we will need to scale up to 384 and 512 bits. For now, it is very safe to bet on 256.
There have been several recent instances of people messing around with dice-roll entropy to generate bitcoin addresses and having them almost immediately hacked. Computing a private key by iterating through the counting numbers will find a short seed within very few iterations.
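A toy version of that attack, to show how little work a short seed buys you. Everything here is hypothetical: `fingerprint_from_seed` is a stand-in for whatever derivation turns a seed into a public address, and the seed is deliberately limited to 16 bits so the loop finishes instantly.

```python
import hashlib

def fingerprint_from_seed(seed: int) -> bytes:
    # Stand-in derivation: hash the seed twice, as if the first hash were the
    # private key and the second the public address an attacker can see.
    private_key = hashlib.sha256(seed.to_bytes(8, "big")).digest()
    return hashlib.sha256(private_key).digest()

def brute_force(target: bytes, max_bits: int):
    """Walk the counting numbers below 2^max_bits until the fingerprint matches."""
    for candidate in range(2**max_bits):
        if fingerprint_from_seed(candidate) == target:
            return candidate
    return None

weak_seed = 48_613                      # fits in 16 bits
public_fingerprint = fingerprint_from_seed(weak_seed)

found = brute_force(public_fingerprint, 16)
print("recovered seed:", found)
```

At 16 bits that is at most 65,536 hashes. Every extra bit of seed entropy doubles the loop, which is the whole difference between this and a 256-bit key.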
The purpose of Password-Based Key Derivation Functions (is HKDF a term used in papers now?) is to increase the amount of time required to test each candidate. It is like raising the difficulty on proof of work: it functions on the same basis as everything I've discussed previously, except for the number of repetitions and the expansion of the output values. (PBKDFs like Argon/Argon2 include a memory usage factor that adds memory hardness to the derivation, similar to how the old Ethereum PoW used a huge key, which was essentially a type of salt, to lower the optimization potential.)
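The "raising the difficulty" effect is easy to see with PBKDF2, which ships in Python's `hashlib`. I'm using PBKDF2 here only because it is in the standard library; it has an iteration count but no memory-hardness factor, so it is not a substitute for Argon2. The password, the salt size and the iteration counts are arbitrary values chosen for illustration.

```python
import hashlib
import os
import time

password = b"correct horse battery staple"  # example password
salt = os.urandom(16)                       # random per-password salt

results = {}
for iterations in (1_000, 100_000):
    start = time.perf_counter()
    key = hashlib.pbkdf2_hmac("sha256", password, salt, iterations, dklen=32)
    elapsed_ms = (time.perf_counter() - start) * 1000
    results[iterations] = key
    print(f"{iterations:>7} iterations: {elapsed_ms:.1f} ms per guess")
```

The cost per guess scales with the iteration count, and an attacker pays it for every candidate password, exactly like a miner paying for every nonce they try.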