Why on earth would someone want to generate files of random content (not files with random names)?
Well, there is one big reason to do it: generating incompressible files. This may seem a small reason, but there are a number of usage scenarios (apart from proving that random content is incompressible), most of which involve transmitting files.
Although it is transparent to most people, some tools compress data in the background, namely HTTPS, IPsec and SSL VPNs, etc. Measuring real-world performance over them therefore requires incompressible content.
First, how to generate it (assuming you can talk *NIX)?
dd if=/dev/urandom of=random.file bs=1m count=100
Where:
if – Input file, in this case the virtual file /dev/urandom
of – Output file, the name of the destination file
bs – Block size. The default block size for dd is 512 bytes, which makes sense when copying files but is not very useful when creating a file of a given size. In this case it is 1 MB (1,048,576 bytes)
count – number of blocks to be copied
In this case, I needed to create a 100MB file of random content.
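A note on portability: the bs=1m spelling works with the BSD/macOS dd that appears to have been used here, while GNU dd on Linux spells the same suffix 1M. A minimal sketch that sidesteps the difference by giving the block size in plain bytes, assuming /dev/urandom is available; the variable names SIZE_MB and OUT are illustrative and not part of the original command:

```shell
# Illustrative names (assumptions, not from the original command).
SIZE_MB=100
OUT=random.file

# Block size in plain bytes (1048576 = 1 MB as used above), so the same
# line is accepted by both GNU dd and BSD/macOS dd.
dd if=/dev/urandom of="$OUT" bs=1048576 count="$SIZE_MB"

# Print the resulting size in bytes; it should equal SIZE_MB * 1048576.
wc -c < "$OUT"
```

Changing SIZE_MB is enough to produce a file of a different size.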
The result:
> dd if=/dev/urandom of=random.file bs=1m count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 13.460029 secs (7790295 bytes/sec)
Now, how random is it? Hard to tell, but we can test how compressible it is:
> ls -la random.file
-rw-r--r-- 1 user staff 104857600 Jun 13 11:58 random.file
> gzip random.file
> ls -la random.file*
-rw-r--r-- 1 user staff 104874303 Jun 13 11:58 random.file.gz
As you can see, the compressed file is bigger than the uncompressed one. This is caused by the headers required by the compression scheme.
But for anyone wondering if gzip is not good enough, here’s with 7zip:
> 7za a random.file.7z random.file
7-Zip (A) 9.04 beta Copyright (c) 1999-2009 Igor Pavlov 2009-05-30
p7zip Version 9.04 (locale=utf8,Utf16=on,HugeFiles=on,2 CPUs)
Scanning
Creating archive random.file.7z
Compressing random.file
Everything is Ok
> ls -la random.file*
-rw-r--r-- 1 user staff 104857600 Jun 13 11:58 random.file
-rw-r--r-- 1 user staff 106280801 Jun 13 12:15 random.file.7z
As you can see, the 7zip version is even larger than the gzipped file. So, for practical purposes, this is a truly incompressible file.
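For contrast, the same check can be run against a file that compresses well. The following is only a sketch: the file names sample.random and sample.zeros are made up for illustration, /dev/zero is assumed to be available, and the samples are 10 MB rather than 100 MB to keep the run short.

```shell
# Hypothetical sample files: one random, one all zeros.
dd if=/dev/urandom of=sample.random bs=1048576 count=10 2>/dev/null
dd if=/dev/zero    of=sample.zeros  bs=1048576 count=10 2>/dev/null

for f in sample.random sample.zeros; do
    # tr strips the padding some wc implementations add around the number.
    orig=$(wc -c < "$f" | tr -d ' ')
    # gzip -c writes to stdout, so the original file is left in place.
    packed=$(gzip -c "$f" | wc -c | tr -d ' ')
    echo "$f: $orig bytes -> $packed bytes gzipped"
done
```

The random sample should come out slightly larger after gzip, as in the listing above, while the file of zeros should shrink to a small fraction of its size.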