Just an additional thought on this (sorry to invade your replies lately): if an adversary knew which model you were using (or could guess it), and your secret messages were of a relatively consistent length (e.g., always kept to a small number of words), then observing the varying length of the messages produced could tip them off to the fact that hidden messages were being transmitted, since messages encoded with many high-probability words would tend to be longer-winded. So the smart thing to do would be to vary the length of the secret message by padding it with some "random" filler words each time.
I realize I'm kind of overthinking a toy model here...
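For what it's worth, here is a minimal Python sketch of the padding idea, just to make it concrete. Everything in it is an assumption for illustration and not part of the original scheme: the `END_MARKER` token, the `FILLER_WORDS` list, and the `pad_secret` / `unpad_secret` helpers are all hypothetical names. It also assumes the secret itself never contains the marker.

```python
import random

# Hypothetical end-of-message marker; any token both parties agree on would do.
END_MARKER = "//"

# Hypothetical filler vocabulary, chosen arbitrarily for illustration.
FILLER_WORDS = ["alpha", "bravo", "charlie", "delta", "echo", "foxtrot"]


def pad_secret(secret, max_filler=8, rng=None):
    """Append the end marker plus a random number (0..max_filler) of filler words."""
    if rng is None:
        rng = random.Random()
    count = rng.randint(0, max_filler)
    filler = [rng.choice(FILLER_WORDS) for _ in range(count)]
    return " ".join([secret, END_MARKER] + filler)


def unpad_secret(padded):
    """Recover the secret by dropping the end marker and everything after it."""
    secret, _, _ = padded.partition(" " + END_MARKER)
    return secret
```

The sender would call `pad_secret` before encoding and the receiver `unpad_secret` after decoding. Note that `random.Random` is not a cryptographic source of randomness; something like `random.SystemRandom` would probably be a better fit if this were more than a toy.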
Replies (1)
Valid thoughts; I guess you cannot make absolute statements about the security of the system without making assumptions about the distribution of the covertext, i.e., how you 'normally' use the LLM.