Copy Link
Add to Bookmark
Report

The Commodore ARC format

DrWatson's profile picture
Published in 
Commodore64
 · 15 Jan 2020

Disclaimer: The description below is a mere extrapolation from the files and programs I have encountered.

ARC is used to compress multiple files into a single archive. There is no signature at the beginning of the archives.

Normally archives start at the beginning of the file but there can be different extractors prepended to the archive. A reliable method to get the start offset of the archive is reading the BASIC line at the beginning of the extractor, subtracting 6 from the BASIC line number and multiplying it by 254. In case the argument of the SYS instruction starts with the digit 7, you have to subtract 1 from the start offset.

The archives contain one or more blocks that consist of a file header and the compressed data of the file. The original file data is compressed using LZ compression bundled with a dynamic Huffman algorithm and is protected with an 16-bit checksum.

The file header has the following structure:

 
POSITION DESCRIPTION
$00 Entry version (1 or 2)
$01 Compression method (0-2 for version 1 and 0-5 for version 2)
$02-$03 Checksum of the file
$04-$06 Size of the original file data (warning, only three bytes)
$07-$08 Size of the packed file data in blocks
$09 Type of the file
$0A Length of the original file name
$0B-(N-1) Original name of the file


The following two entries only exist in version 2 entries:

 
N Record length of the original file
(N+1)-(N+2) Original date stamp of file (MS-DOS format)


Compression methods are the following: 0 means that the file is stored without any compression, 1 is for RLE (run length encoding), 2 is for Huffman algorithm, 3 is for LZ compression, 4 is for Huffman algorithm with RLE and 5 is for LZ compression in one pass.

The MS-DOS date stmap format packs the last modification date into a word:

 
BIT POSITION DESCRIPTION
0- 4 Day (1-31)
5- 9 Month (1-12)
10-15 Year minus 1980


The file name is a Pascal-style PETSCII string, its length being its first character.

You can read through an ARC archive with the following algorithm:

  1. Determine the start of the archive, with the above mentioned method.
  2. If you reached the end of the file, stop.
  3. Read in the first part of the header, until the length of the file name (11 bytes). Read in the file name (fetch its length from the header) and, if a version 2 entry, the record length and the date stamp (3 bytes), too. Now you can process the header.
  4. If the file is RLE-packed then read in the RLE control byte (1 byte). Now you can process the file.
  5. Add the block count of the packed file size, multiplied by 254, to the file position of the beginning of the header. By seeking there you get to the next header, goto step 2.

← previous
next →
loading
sending ...
New to Neperos ? Sign Up for free
download Neperos App from Google Play
install Neperos as PWA

Let's discover also

Recent Articles

Recent Comments

Neperos cookies
This website uses cookies to store your preferences and improve the service. Cookies authorization will allow me and / or my partners to process personal data such as browsing behaviour.

By pressing OK you agree to the Terms of Service and acknowledge the Privacy Policy

By pressing REJECT you will be able to continue to use Neperos (like read articles or write comments) but some important cookies will not be set. This may affect certain features and functions of the platform.
OK
REJECT