Storing SQL Database Information In Comma-Delimited ASCII File Format

The SQL Standard doesn't get much into the details of how information should be stored -- that's up to the implementor, which in this case is Ocelot Computer Services Inc. We've chosen what we think is the simplest possible -- extended comma-delimited "ASCII". Simple for you, that is. Implementing all the SQL datatypes, and allowing for variable-size data, in a format that anyone can read -- well, guess why most vendors prefer to use vendor-friendly formats!

Suppose we make a table thus:

CREATE TABLE_1 (
    COLUMN_1 SMALLINT,
    COLUMN_2 CHARACTER VARYING (12345),
    COLUMN_3 DATE);

Here is an example of a typical file for this table:

"+TABLE_1", 00001,"Optimizing SQL","1994-01-05"
"*TABLE_1", 00002,"The Oracle At Delphi","1997-01-03"
"+TABLE_1",      ,"Ocelot","1997-01-03"

Subtly illustrated above are: (a) CHAR and datetime fields are stored within doublequotes; (b) fields which are not VARYING are fixed-size and stored in a canonical format; (c) rows in TABLE_1 always begin with "+TABLE_1" (the middle row in this example has been deleted, that's what the * means, and this row is inaccessible to SQL applications); (d) there may be all-blank fields which represent NULL. Between each instance of a row, not visible in the illustration, we have a carriage-return and a line feed.

We call our format "extended" because we have to allow for SQL_TEXT CHAR fields (which contain 16-bit characters rather than 8-bit characters), and for BIT fields (which contain a dword binary precision followed by undelimited byte sequences). But if you don't use the extensions, any text editor or BASIC program can read the file -- and that's a controversial point. The fact is, you shouldn't be able to get at data except via the SQL interface -- that's a basic relational rule and a very sensible rule. We used to enforce this rule by encrypting the data, but -- Ocelot has been in the database business for 20 years -- we got some experience. That experience taught us that salvage operations are sometimes necessary on personal computers. However, THE OCELOT SQL DBMS offers file encryption as an option for those who prefer to use it.

The metadata -- the descriptions of the tables themselves -- is stored in a CLUSTER file. (You may have heard the term "catalog" file before, but since we have an SQL-92 operation we can have multiple catalogs, stored in a single cluster.) This file is also easy to read -- it consists of nothing but the CREATE, ALTER, and GRANT statements needed to describe the database. In fact, when you CONNECT, THE OCELOT SQL DBMS opens a CLUSTER file and executes all the data-definition commands in it in order to recreate the database structure.

In Ocelot's regular (but non-default) file structure, a catalog is equivalent to a path, a schema is equivalent to a sub-path within a catalog, and a table is equivalent to a file within a schema. So if you have a catalog C which contains a schema S which contains a table T, the mapping is visible by doing a directory search with your operating system -- it will show you this file:
\C\S\T

A second file structure offered by THE OCELOT SQL DBMS is our Alternate File Structure, which is the default setting when you install using our SETUP.EXE installation program. The Alternate File Structure puts all database files in one directory and uses long file names with "." to support components. With this file structure, a directory search for the same example would show you this file:
\databasename\C.S.T

And that is all. We have described all files used by Ocelot's library. If you are at this point asking "what about the indexes?" we reply "there are no index files". We make indexes internally, on the fly, when you first select from a table. We drop them later based on various usefulness computations. You will therefore find that a search is sluggish the first time you do it, but it will be fast on subsequent occasions. We believe this is an improvement over the old indexing strategies, because (a) users cannot cause indexes to become out of synch with their tables, (b) expressions can be indexed which aren't handleable with the somewhat klunky CREATE INDEX syntax conventions, and (c) in any case, CREATE INDEX is not an SQL statement.

Return to Ocelot home page