Generating random files

Why on earth would someone want to generate files of random content (not files with random names)?

Well, there is one big reason to do it: generating incompressible files. This may seem like a small reason, but there are a number of usage scenarios (apart from proving that random content is incompressible), and most of them involve transmitting files.

Although it is transparent to most people, some tools can compress data in the background, namely HTTPS, IPsec and SSL VPNs, etc. Measuring real-world performance over them therefore requires incompressible content.

First, how do you generate it (assuming you can talk *NIX)?

dd if=/dev/urandom of=random.file bs=1m count=100

Where:

if – Input file, in this case the virtual file /dev/urandom

of – Output file, the name of the destination file

bs – Block size. The default block size for dd is 512 bytes, which makes sense when copying files but is not terribly useful when creating a file of a given size. In this case it is 1 MB (the lowercase 1m is the BSD/macOS spelling; GNU dd expects 1M)

count – number of blocks to be copied

In this case, I needed to create a 100MB file of random content.

The result:

> dd if=/dev/urandom of=random.file bs=1m count=100
100+0 records in
100+0 records out
104857600 bytes transferred in 13.460029 secs (7790295 bytes/sec)
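
Since the size suffix differs between dd flavours, here is a minimal sketch of a wrapper that avoids it by giving the block size in plain bytes. The function name make_random_file and its two arguments are my own invention for illustration, not something dd provides.

```shell
#!/bin/sh
# make_random_file NAME SIZE_MIB
# Hypothetical helper: writes SIZE_MIB blocks of 1 MiB of random data to NAME.
make_random_file() {
    name=$1
    size_mib=$2
    # bs is given in plain bytes (1048576 = 1 MiB), so no size suffix is
    # needed and the same command line works with both GNU and BSD dd.
    # dd prints its transfer statistics on stderr; they are discarded here.
    dd if=/dev/urandom of="$name" bs=1048576 count="$size_mib" 2>/dev/null
}
```

Calling make_random_file random.file 100 should produce the same 100MB file as the command above.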

Now, how random is it? Hard to tell, but we can test how compressible it is:

> ls -la random.file
-rw-r--r--  1 user  staff  104857600 Jun 13 11:58 random.file
> gzip random.file
> ls -la random.file*
-rw-r--r--  1 user  staff  104874303 Jun 13 11:58 random.file.gz

As you can see, the compressed file is bigger than the uncompressed one. This is caused by the headers and framing overhead that the compression format adds even when it cannot shrink the data.

But for anyone wondering if gzip is not good enough, here is the same test with 7zip:

> 7za a random.file.7z random.file 

7-Zip (A) 9.04 beta  Copyright (c) 1999-2009 Igor Pavlov  2009-05-30
p7zip Version 9.04 (locale=utf8,Utf16=on,HugeFiles=on,2 CPUs)
Scanning

Creating archive random.file.7z

Compressing  random.file
Everything is Ok
> ls -la random.file*
-rw-r--r--  1 user  staff  104857600 Jun 13 11:58 random.file
-rw-r--r--  1 user  staff  106280801 Jun 13 12:15 random.file.7z

As you can see, the 7zip archive is even larger than the gzipped file. So, for all practical purposes, this is a truly incompressible file.
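
If you want to repeat the check without eyeballing ls output, the gzip comparison can be scripted. This is only a sketch: is_incompressible is a hypothetical helper name, and it tests against gzip alone, so passing it says nothing about other compressors.

```shell
#!/bin/sh
# is_incompressible FILE
# Hypothetical helper: prints both sizes and succeeds (exit status 0)
# when gzip fails to make FILE smaller.
is_incompressible() {
    file=$1
    # $(( ... )) strips the leading spaces some wc implementations print.
    original=$(( $(wc -c < "$file") ))
    # gzip -c writes to stdout, so the original file is left untouched.
    compressed=$(( $(gzip -c "$file" | wc -c) ))
    echo "original: $original bytes, gzip: $compressed bytes"
    [ "$compressed" -ge "$original" ]
}
```

Running is_incompressible random.file should succeed for the file generated above, while a file full of zeros (for example one read from /dev/zero) should fail the check.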
