[Date Prev][Date Next]
[Chronological]
[Thread]
[Top]
Re: DIGEST-MD5 and {nonce,cnonce}
On Mon, 25 Oct 1999, Kurt D. Zeilenga wrote:
> At 09:01 PM 10/25/99 +0300, Mihai Ibanescu wrote:
> > Okay, I already have seen propositions on some options
> >(/dev/random, which seems the most reliable, and a PRNG started with a
> >seed derived from gettimeofday, all of them hashed with MD5 or something
> >like that). Any other alternatives?
> > I'd like to implement it.
>
>
> If you hash the value read from /dev/random, you will LOSE entropy.
> That is, you could read 128 bits of entropy from /dev/random.
> Hashing it to a 64-bit quanity will result in at least half these
> bits being lost. Even where you read exactly 64-bits from /dev/random,
> hashing the value can only reduce the amount of entropy.
Yes, but this is a quote from
/usr/src/linux/drivers/char/random.c, the source for /dev/random:
Copyright Theodore Ts'o, 1994, 1995, 1996, 1997, 1998
...
* Computers are very predictable devices. Hence it is extremely hard
* to produce truly random numbers on a computer --- as opposed to
* pseudo-random numbers, which can easily generated by using a
* algorithm. Unfortunately, it is very easy for attackers to guess
* the sequence of pseudo-random number generators, and for some
* applications this is not acceptable. So instead, we must try to
* gather "environmental noise" from the computer's environment, which
* must be hard for outside attackers to observe, and use that to
* generate random numbers. In a Unix environment, this is best done
* from inside the kernel.
*
* Sources of randomness from the environment include inter-keyboard
* timings, inter-interrupt timings from some interrupts, and other
* events which are both (a) non-deterministic and (b) hard for an
* outside observer to measure. Randomness from these sources are
* added to an "entropy pool", which is mixed using a CRC-like function.
* This is not cryptographically strong, but it is adequate assuming
* the randomness is not chosen maliciously, and it is fast enough that
* the overhead of doing it on every interrupt is very reasonable.
* As random bytes are mixed into the entropy pool, the routines keep
* an *estimate* of how many bits of randomness have been stored into
* the random number generator's internal state.
*
* When random bytes are desired, they are obtained by taking the SHA
* hash of the contents of the "entropy pool". The SHA hash avoids
* exposing the internal state of the entropy pool. It is believed to
* be computationally infeasible to derive any useful information
* about the input of SHA from its output. Even if it is possible to
* analyze SHA in some clever way, as long as the amount of data
* returned from the generator is less than the inherent entropy in
* the pool, the output data is totally unpredictable. For this
* reason, the routine decreases its internal estimate of how many
* bits of "true randomness" are contained in the entropy pool as it
* outputs random numbers.
My point here is that I should hash the content, not for the
entropy's sake, but to hide the real contents of /dev/random, just in
case... AFAIK SHA1 is a 128-bit hash, and after all I can diversify the
generation (have no idea about SHA1, but with MD5 I can do something like:
int lutil_entropy(char *buf, int nbytes)
{
struct lutil_MD5Context ctx;
int i, retval;
int rndfd;
struct timeval tval;
char digest[16];
char buffer[16];
/* I won't initialize digest[], this may be a source of randomness */
for (i = 0; i <= nbytes / 16; i++) {
lutil_MD5Init(&ctx);
lutil_MD5Update(&ctx, digest, 16);
rndfd = open("/idev/urandom", O_RDONLY);
if (rndfd < 0)
rndfd = open("/idev/random", O_RDONLY);
if (rndfd > 0) {
retval = read(rndfd, buffer, sizeof(buffer));
lutil_MD5Update(&ctx, buffer, retval);
close(rndfd);
}
retval = gettimeofday(&tval, NULL);
if (retval == 0)
lutil_MD5Update(&ctx, (char *)&tval, sizeof(struct timeval));
lutil_MD5Final(digest, &ctx);
memcpy(buf + 16 * i, digest, (16 * i < nbytes) ? 16 : nbytes - 16
* i);
}
if (rndfd > 0)
close(rndfd);
return nbytes;
}
The idea is that I first try to read random bytes from
/dev/urandom or /dev/random. I compute the digest and store it in the
first 16 bytes of the result. The 16-bytes digest is used to further
compute the next digest, with another call to a read from /dev/random.
Just in case there is no /dev/urandom or /dev/random available, I append
the result of gettimeofday(). While one call to gettimeofday() may be
predictable, I assume a series of calls to gettimeofday are not that
predictable anymore, since a context change may increase the difference
between two calls.
I cannot be sure how much randomness is in the method described
above; however, I wrote 16383 bytes obtained via lutil_entropy and gzipped
(-9); the result was 16411 bytes, which make me think the method is pretty
reliable. The result was the same without reading bytes from /dev/random
(just hashing the results of gettimeofday()).
Since there is a SHA1 implementation in liblutil, I may change MD5
to SHA1. This should be better, IMHO.
If it's OK with you, I'll send the source.
> Hashing should only be applied to values that have some entropy
> but where you are unsure how the entropy is distributed within
> the value. That is, a crytographic hash is used as a means to
> extract bits of entropy from the input. You must, of course,
> be sure not to assume the hash has more bits of entropy than
> provided by the input.
>
> I would suggest that your initial implementation provide:
> /dev/random
> highly-portable fallback
>
> Others can then contribute codes which extend the implementation
> to support other mechanisms.
Cheers,
Misa