MANUAL FOR RLESTAT (BBC BASIC)

This file describes the program RLESTAT which analyses a group of files, then
calculates by how much they could be compressed using a variation of RLE
compression.
The manual and software are (C)2006 SPROW

INSTRUCTIONS-
At the BASIC prompt, type CHAIN"RLESTAT"

The program will then read in a set of 7 data files to be analysed. The
filenames can be changed by altering the DATA statements at the end of the
program listing. The format is
  DATA "Description text"
  DATA Filename1
  DATA Filename2
  DATA Filename3
  DATA Filename4
  DATA Filename5
  DATA Filename6
  DATA Filename7
To save having to repeatedly enter the filenames, simply add additional groups
of 7 filenames, then set the value of "choice%" at the start of the program
to select which group to use. Ideally, the data files should be around
100 kbytes in total.

While reading in the files, a counter is updated which represents how much
the data could be compressed. The RLE algorithm used is

  * Assign an 'escape' byte
    This should be a byte which doesn't occur very often in the input data
    and can be changed by altering the variable "escape%"
  * If the input byte is the 'escape' byte then output two bytes
    The escape code followed by the escape code itself
  * If the input byte is the same as last input byte
    Just count up how many times it has been repeated but don't output anything
  * If the input byte is different to the last input byte 
    Output that byte
    Should that also be the end of a repeated run, output the escape code 
    plus the run length too
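
The steps above can be sketched as a byte counter. This is an illustration
in Python rather than BBC BASIC and is not taken from the RLESTAT listing;
the name rle_size is hypothetical. It assumes the checks are applied in the
order listed (so an escape byte is always doubled and never counted as part
of a run), that the run length records the number of repeats after the first
occurrence, that runs are capped at 255 repeats, and that a run still pending
at the end of the input is flushed. The case of a run length equal to the
escape value is ignored here.

```python
def rle_size(data: bytes, escape: int) -> int:
    """Estimate how many bytes the RLE scheme would output for 'data'."""
    out = 0        # running count of output bytes
    last = None    # previous input byte, None before the first byte
    repeats = 0    # how many times 'last' has been repeated so far

    for b in data:
        if b != escape and b == last and repeats < 255:
            repeats += 1                  # same as last byte: count only
            continue
        if repeats:
            out += 2                      # end of a run: escape + run length
            repeats = 0
        out += 2 if b == escape else 1    # escape is doubled, others literal
        last = b

    if repeats:
        out += 2   # assumption: a run pending at end of input is flushed
    return out
```

Under these assumptions ten identical non-escape bytes count as 3 output
bytes (the byte, the escape code and the run length), while a run of only
two also costs 3 bytes, one more than the input.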

The maximum run length in this scheme is 255 characters, though the special
case of the run length being the same as the value of the escape character is
not permitted. This would need to be split into two shorter run lengths to
avoid accidentally outputting an 'escape/escape' byte pair.
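
A minimal sketch of the split, again in Python and not part of RLESTAT
(which does not handle this case). The name split_run is hypothetical, and
halving the run is an arbitrary choice; any two shorter lengths that add up
to the original would do. A run length of 1 that equals the escape value
cannot be split, so the sketch simply rejects it.

```python
def split_run(length: int, escape: int) -> list:
    """Return the run length(s) to output, none equal to the escape value."""
    if not 1 <= length <= 255:
        raise ValueError("run length must be in the range 1 to 255")
    if length != escape:
        return [length]                 # permitted as it stands
    if length == 1:
        raise ValueError("a run length of 1 cannot be split")
    first = length // 2                 # both halves are shorter than the
    return [first, length - first]      # original, so neither equals escape
```

For instance, with an escape value of 27 a run of 27 would be output as
runs of 13 and 14.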

The compression statistics are displayed after this step, expressed as
the percentage saved relative to the original size. For example, 40%
compression would mean that a 100k input file would occupy 60k once compressed.
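
The arithmetic behind that figure can be written out as follows. This is a
Python illustration only; the name compression_percent is hypothetical and
the exact expression used inside RLESTAT may differ.

```python
def compression_percent(original_size: int, compressed_size: int) -> float:
    """Percentage of the original size saved by compression."""
    if original_size <= 0:
        raise ValueError("original size must be positive")
    return 100.0 * (original_size - compressed_size) / original_size
```

With an original size of 100 and a compressed size of 60 this gives 40.0.
A negative result would mean the output is larger than the input.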
                                       
As its name suggests, run length encoding works best with data that contains
long runs of the same character. Graphics, for example, compress well where
large areas of the screen are one colour, whereas BASIC programs don't RLE
very well.

KNOWN PROBLEMS/FUTURE ENHANCEMENTS-
Run lengths with the same number of repeats as the value of the escape code
aren't accounted for in the calculation. Hopefully these should be infrequent
enough that the compression statistics aren't affected.
Should convert to use GBPB instead of multiple BGET commands.
No other known problems

HISTORY-
V1.00 Original