ANNOUNCING - CDROM support for linux (beta 0.2).

        CDROM support for linux is now ready for beta testing.  You
must have a CDROM drive, a SCSI adapter and a ISO9660 format disc
before this will be of any use to you.  You will also need to have the
source tree for linux 0.97 kernel sources available.  This will work with
either 0.97 or 0.97pl1, but pl1 is probably better mainly because it handles
memory better (unrelated to the CDROM, but still an important point).

        This project was a team effort.  The SCSI work was done by
David Giller rafetmad@cheshire.oxy.edu, and the filesystem was written
by Eric Youngdale eric@tantalus.nrl.navy.mil.  So far, the code has been
tested with an aha1542 SCSI card and both NEC and Sony CDROM drives.
A number of different discs have been tested.

        To install, unpack the archive in your linux kernel directory 
(usually /usr/src.  This will add a number of new files to the linux
source tree).  You will then need to apply the patches found in cdrom.diff
with the following command:

patch -p0 < cdrom.diff
 
and then build the kernel.  Once you have booted the system, you will need
to add a device with major=11, minor=0 for the first cdrom drive, minor=1 for
the second and so forth.  You can use a command something like:

        mknod -m 500 /dev/cdrom b 11 0

To mount a disc, use a command something like:

        mount -t iso9660 /dev/cdrom /mnt

I would be interested in hearing about any successes or failures with this
code.

CHANGES SINCE 0.1:

        Error detection/correction have been improved.  You should not
get any more multiply queued commands, and I increased the timeout
period such that the drive no longer times out.  My drive is fairly
fast, so other drives may have timeout problems.  I need to know this
so that I can increase the timeout period to a workable value for all
drives.  The error detection/correction should be pretty solid now.

        Support for Rock Ridge extensions has been added to the filesystem.
This means:

        * Longer filenames (My implementation limits it to 256 chars).
        * Mixed case filenames, Normal unix syntax availible. 
        * Files have correct modes, number of links, and uid/gid
        * Separate times for atime, mtime, and ctime.
        * Symbolic links.
        * Block and Character devices (Untested).
        * Deep directories (Untested).

        I was able to implement this because Adam Richter was kind
enough to lend me the Andrew Toolkit disc, which has the Rock Ridge
extensions.  I should point out that the block and character devices
and the deep directories have not been tested, since they do not
appear on the disc that I have.  If anyone has some pre-mastering software,
and could throw together a *very* small volume (i.e. one floppy disc)
that has some of these things, I could use the floppy to test and debug
these features.

        A single element cache was added that sits between the readdir
function and the lookup function.  Many programs that traverse the
directory tree (i.e. ls) also need to know the inode number and find
information about the file from the inode table.  For the CDROM this
is kind of silly, since all of the information is in one place, but we
have to make it look kind of like unix.  Thus the readdir function
returns a name, and then we do a stat, given that name and have to
search the same directory again for the file that we just extracted in
readdir.  On the Andrew toolkit disc, there is one directory that
contains about 700 files and is nearly 80kb long - doing an ls -l in
that directory takes several minutes, because each lookup has to
search the directory.  Since it turns out that we often call lookup
just after we read the directory, I added a one element cache to save
enough information so as to eliminate the need to search the directory
again.

        Scatter-gather for the cdrom is now enabled.  This should lead
to slightly faster I/O.


KNOWN PROBLEMS:

        None.

             ********************************************

Some general comments are in order:

        On some drives, there is a feature where the drive can be
locked under software control to essentially deactivate the eject
button.  The iso9660 filesystem activates this feature on drives so
equipt, so you may be unable to remove the disc while it is mounted.
The eject button will be re-enabled once the disc is dismounted.

        Since it is impossible to corrupt a CDROM, it is unlikely that
a bug in the iso9660 filesystem will lead to data corruption on your
hard disk, with the possible exception of files copied from the CDROM
to the hard disk.  Nonetheless, it is a good idea to have a backup or
your hard disk, just in case.  Then again, I really did not need to
say that, did I :-)

        There were several bugs in error handling in the scsi code.
Previously when a command failed, the higher level drivers would not
receive the correct sense data from the failed command, or would misinterpret
the data that they did get.  These has been fixed.

        Up until now, SCSI devices were either discs or tapes (and the
tapes have not been fully implemented).  CDROM drives are now a third
category.  There are several reasons why we do not want to treat then
the same as a regular hard disk, and it was cleaner to make a third
type of device.  One reason was that.....

        The CDROM has a sector size of 2048 bytes, but the buffer
cache has buffer sizes of 1024 bytes.  The SCSI high level driver for
the cdrom must perform buffering of all of the I/O in order to satisfy
the request.  At some point in the near future support will be present
in the kernel for buffers in the buffer cache which are != 1024 bytes,
at which time this code will be remove.

        Only the ISO 9660 filesystem is supported.  High Sierra is not
supported.  The High Sierra format is just an earlier version of
ISO9660, but there are minor differences between the two.  Sometimes
people use the two names interchangably, but nearly all newer discs
are the ISO9660 format.  It would not be that difficult to support HS,
but I doubt that there are very many HS discs are out there.  I will
add this if there is demand for it.

        The inode numbers for files are essentially just the byte
offset of the beginning of the directory record from the start of the
disc.  A disc can only hold about 660 MB, so the inode numbers will
be somewhere between about 60K and 660M.   Any tool that performs
a stat() on the CDROM obviously needs to be recompiled if it was
compiled before 32 bit inode support was in the kernel.

        A number of ioctl functions have been provided, some of which
are only of use when trying to play an audio disc.  An attempt has
been made to make the ioctls compatible with the ioctls on a Sun, but
we have been unable to get any of the audio functions to work.  My
NEC drive and David's Sony reject all of these commands, and we currently
believe that both of these drives implement the audio functions using
vendor-specific command codes rather than the universal ones specified
in the SCSI-II specifications.

        The filesystem has been tested under a number of conditions,
and has proved to be quite reliable so far.  This filesystem is
considerably simpler than a read/write filesystem (Files are
contiguous, so no file allocation tables need to be maintained, there
is no free space map, and we do not need to know how to rename, create
or delete files).

        Text files on a CDROM can have several types of line
terminators.  Lines can be terminated by LF, CRLF, or a CR.  The
filesystem scans the first 1024 bytes of the file, searching for out
of band characters (i.e. > 0x80 or some control characters), and if it
finds these it assumes the the file is a binary format.  If there are
no out of band characters the filesystem will assume that the file is
a text file (keeping track of whether the lines are terminated by a
CR, CRLF, or LF), and automatically converts the line terminators to a
LF, which is the unix standard.  In the case of CRLF termination, the
CR is converted to a ' '.


                  ***************************************
                  ***************************************

The remaining comments *only* apply to discs *without* the Rock Ridge
extensions:

        The find command does not work without the -noleaf switch.
The reason for this is that the number of links for each directory file
is not easily obtainable, so it is set to 2.  The default behavior for
the find program is to look for (i_links-2) subdirectories in each
directory, and it then assumes that the rest are regular files.  The
-noleaf switch disables this optimization.

        The filesystem currently has the execute permission set for
any non-directory file that does not have a period in its name.  This
is a crude assumption for now, but it kind of works.  There is not an
easy way of telling whether a file should be executable or not.
Theoretically it is possible to read the file itself and check for a
magic number, but this would considerably degrade performance.

        The filesystem does not support block or character devices,
fifos, or symbolic links.  Also, the setuid bit is never set for any
program.  The main reason for this is that there is no information in
the directory entry itself which could be used to indicate these
special types of files.

        Filenames under ISO9660 are normally all upper case on the
disc but the filesystem maps these to all lower case.  The filenames
on the disc also have a version number (like VMS) which appears at the
end of the filename, and is separated from the rest of the filename by
a ';' character.  The filesystem strips the version numbers from the
filename if the version number is 1, and replaces the ';' by a '.' if
the version number is >1.

eric@tantalus.nrl.navy.mil