Files provide long-term, high-capacity data storage that is essential to all large-scale computational undertakings. Since file input/output (file I/O) is necessarily a cornerstone of all ISIP software, the ISIP Foundation Classes (IFC's) require a standardized, efficient, and easy-to-use file I/O interface. Without such an interface, tremendous amounts of time would be wasted rewriting code to parse files for individual applications. ISIP's solution is the Signal Object File (Sof) class. This tutorial gives an overview of the Sof class as well as of four other IFC's involved in file I/O.

Signal Object File (Sof) class

Sophisticated file formats are an essential part of speech research, since we often augment raw speech data with auxiliary information such as recording conditions and annotations. Even more importantly, as the native file I/O format for all ISIP classes, Sof files provide a standardized way for all IFC's to write themselves to, and read themselves from, files. The standard interface makes file parsing for new applications simple and straightforward to write.

An Sof file is nothing more than an index, which records the location of each object stored in the file, together with the corresponding object data. Sof transparently supports two basic storage formats: text (useful for building human-readable files such as parameter files) and binary (useful for sampled data). Sof also transparently converts binary data between different architectures by performing the appropriate byte transformations as needed, so users do not have to concern themselves with these low-level details.

Example 1:
The third line of the file is an example of an object header. An object header has two components: a class name and an integral tag. The class name must be a single word; no spaces are allowed. The tag following the class name makes it possible to store multiple instances of the same class in a file. Every object written to an Sof file can be uniquely addressed by a (name, tag) pair. The first object in the file is an instance of the Long scalar class. The data space for (Long,0) begins on the line immediately following the object header and ends at the second newline character. The extra blank line is not necessary, but it makes the file more readable. Sof itself does not deal with this data space at all; it simply maintains the index of pointers and positions the file pointer so that the higher-level classes can read themselves from, and write themselves to, disk.
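To make the idea of an index addressed by (name, tag) pairs concrete, here is a minimal sketch in standard C++. It is not taken from the IFC sources: the class ObjectIndex, its methods, and the constants NO_TAG and NO_OFFSET are hypothetical names chosen for illustration, and the sketch assumes that tags are non-negative.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the index kept by an Sof file: each
// (class name, tag) pair maps to the byte offset of that object's data.
// None of these names come from the IFC sources.
class ObjectIndex {
public:
  static constexpr long NO_TAG = -1;     // assumed "no such tag" marker
  static constexpr long NO_OFFSET = -1;  // assumed "no such object" marker

  // register an object; a (name, tag) pair may appear only once
  void add(const std::string& name, long tag, long offset) {
    bool inserted = entries_.emplace(Key(name, tag), offset).second;
    assert(inserted && "duplicate (name, tag) pair");
    (void)inserted;
  }

  // return the stored offset for (name, tag), or NO_OFFSET
  long find(const std::string& name, long tag) const {
    auto it = entries_.find(Key(name, tag));
    return it == entries_.end() ? NO_OFFSET : it->second;
  }

  // return the smallest tag stored under this name, or NO_TAG
  long first(const std::string& name) const {
    auto it = entries_.lower_bound(Key(name, 0));
    return matches(it, name) ? it->first.second : NO_TAG;
  }

  // return the next larger tag stored under this name, or NO_TAG
  long next(const std::string& name, long tag) const {
    auto it = entries_.upper_bound(Key(name, tag));
    return matches(it, name) ? it->first.second : NO_TAG;
  }

private:
  typedef std::pair<std::string, long> Key;
  typedef std::map<Key, long> Table;

  // true if the iterator points at an entry with the given name
  bool matches(Table::const_iterator it, const std::string& name) const {
    return it != entries_.end() && it->first.first == name;
  }

  Table entries_;  // sorted by name, then by tag
};
```

Because the map is ordered by name and then by tag, walking all objects of one class reduces to a lower-bound lookup followed by repeated upper-bound lookups. The object data itself never passes through the index.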
// file: $isip/doc/examples/class/io/io_example_00/example.cc
// version: $Id: index.html,v 1.5 2006/06/30 22:40:01 ewt16 Exp $
//

// isip include files
//
#include <File.h>
#include <Sof.h>
#include <Long.h>
#include <Console.h>

// main program starts here:
//  this program reads long integer entries from a text Sof file and prints
//  each one found
//
int main(int argc, const char **argv) {

  // declare an Sof file object
  //
  Sof sof1;

  // open a file in read only mode:
  //  note that the Sof object determines whether the input file is text or
  //  binary automatically. in this example, it happens to be text.
  //
  String filename(L"./file.sof");
  sof1.open(filename, File::READ_ONLY);

  // declare a Long object used to read from the Sof file
  //
  Long j;

  // loop through all Long objects in the file, starting with the first
  // and ending when we have visited all objects with the given name
  //  note that the sof1 object is looking up the object based on its name
  //  which, in this case, is "Long". one could uniquely determine each Long
  //  object in the file by assigning each a different name and using that
  //  name to read in the object rather than the default name.
  //
  long tag = sof1.first(j.name());
  while (tag != Sof::NO_TAG) {

    // have the object read itself:
    //  this calls the Long::read method. each object in the math library
    //  and above knows how to read itself from an Sof file
    //
    j.read(sof1, tag, j.name());

    // output the object to the console
    //
    String output;
    output.assign(j);
    output.insert(L"I found the value ", 0);
    Console::put(output);

    // go to the next object
    //
    tag = sof1.next(j.name(), tag);
  }

  // close the input file
  //
  sof1.close();

  // exit gracefully
  //
  Integral::exit();
}

One subtlety of this code example is that it also works on binary files with absolutely no changes! This demonstrates the degree to which the details of file I/O are abstracted from the user. The Long::read() method branches on the mode of the file, switching between formatted text and direct binary input.
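The branching described above can be illustrated with a small, self-contained sketch in standard C++. This is not the Long::read() implementation: the Mode enumeration and the functions swap_bytes() and read_value() are hypothetical names, and the sketch assumes a 32-bit integer payload. It shows a reader that parses formatted text in one mode and copies raw bytes in the other, reversing the byte order when the file was written on a machine of the opposite endianness.

```cpp
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical file modes; the names are illustrative only.
enum class Mode { TEXT, BINARY };

// Reverse the byte order of a 32-bit value.
std::uint32_t swap_bytes(std::uint32_t v) {
  return ((v & 0x000000FFu) << 24) | ((v & 0x0000FF00u) << 8) |
         ((v & 0x00FF0000u) >> 8) | ((v & 0xFF000000u) >> 24);
}

// Read one 32-bit integer from the stream.
//  TEXT mode parses a formatted decimal number.
//  BINARY mode copies four raw bytes and, if swap is true, reverses them
//  to account for a file written with the opposite byte order.
// Returns false if the stream could not supply a value.
bool read_value(std::istream& in, Mode mode, bool swap, std::int32_t& value) {
  if (mode == Mode::TEXT) {
    return static_cast<bool>(in >> value);
  }
  std::uint32_t raw = 0;
  if (!in.read(reinterpret_cast<char*>(&raw), sizeof(raw))) {
    return false;
  }
  if (swap) {
    raw = swap_bytes(raw);
  }
  std::memcpy(&value, &raw, sizeof(value));
  return true;
}
```

The caller uses the same function for both kinds of file; only the mode and swap arguments, which a file layer would determine when the file is opened, differ.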
From a high-level programmer's perspective, we have now covered the meat of Sof. A file is opened by passing a filename to one of the overloaded Sof::open() methods. Optional arguments specify the file access mode and the file type. The access mode should be File::READ_ONLY, File::WRITE_ONLY, File::READ_PLUS, or File::WRITE_PLUS. The first two are self-explanatory; File::READ_PLUS allows reading from and writing to an existing file, and File::WRITE_PLUS creates a new file whose data can also be read back. The file type is either File::TEXT or File::BINARY and matters only for newly created files (File::WRITE_ONLY or File::WRITE_PLUS), since existing files already have a type. When a file is opened in a write mode, a lock is obtained automatically.

Once an Sof file is open, the program need only navigate the (name, tag) pairs before asking an object to read itself, or choose a tag before asking an object to write itself. The functions first() and last() return the first and last tags of a specified class name in the file. Additionally, next() and prev() allow iteration through items of the same name, and number() returns the number of instances. Objects can be deleted from a file one at a time, one class name at a time, or all at once with the three delete() methods. The entire file can be deleted from disk with delete_file(). Hooks are also available to copy data from one Sof file to another, changing the byte mode for binary files. Every other aspect of I/O is left to the objects.

File class

The File class abstracts file manipulations, which are operating system specific, and provides a general interface that all IFC's use internally to access files. The File class capabilities include, but are not limited to:
Filename class

The Filename class possesses a few file manipulation capabilities. Rather than building one directory at a time, as the File class does, the Filename class can build entire paths at once. It can also retrieve a file's extension, directory, or operating system of origin.

AudioFile class

The AudioFile class inherits from the File class, which is to say it possesses all of the capabilities of the File class plus some added abilities of its own. The AudioFile class was created to handle external (non-IFC) audio data formats. It can read and write raw, Microsoft wav, and NIST SPHERE formats in addition to the IFC standard Sof format, and it can be modified to support other data formats. Another useful feature is the ability to read data from a file for a specified sample index or sample time range.

FeatureFile class

The FeatureFile class also inherits from the File class. It is another IFC that must work with external data formats; it is used to manage feature data, obtained either in Sof form from files produced by isip_transform or from other sources in raw binary or raw text formats.

Refer to the online manual pages for complete descriptions of the classes mentioned above. To see detailed examples of the usage of these classes, refer to any of the utilities provided in our software distribution.
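As an illustration of what reading a sample index or sample time range involves for raw audio, the following sketch uses only the C++ standard library. It is not the AudioFile interface: the functions time_to_sample() and read_range() are hypothetical, and the sketch assumes headerless 16-bit samples stored in the host's byte order on a single channel.

```cpp
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Convert a time in seconds to the nearest sample index at the given
// sample rate (hypothetical helper, not part of the IFC's).
std::size_t time_to_sample(double seconds, double sample_rate_hz) {
  return static_cast<std::size_t>(seconds * sample_rate_hz + 0.5);
}

// Read samples [first, last) from a stream of raw 16-bit audio.
// Assumes no header, one channel, and host byte order.
// Returns fewer samples than requested if the stream ends early.
std::vector<std::int16_t> read_range(std::istream& in,
                                     std::size_t first,
                                     std::size_t last) {
  std::vector<std::int16_t> samples;
  if (last <= first) {
    return samples;
  }
  samples.resize(last - first);

  // position the stream at the first requested sample
  in.seekg(static_cast<std::streamoff>(first * sizeof(std::int16_t)),
           std::ios::beg);

  // read the requested span and keep only what was actually available
  in.read(reinterpret_cast<char*>(samples.data()),
          static_cast<std::streamsize>(samples.size() * sizeof(std::int16_t)));
  samples.resize(static_cast<std::size_t>(in.gcount()) / sizeof(std::int16_t));
  return samples;
}
```

A time range is handled by converting both endpoints with time_to_sample() and passing the resulting indices to read_range(). Formats with headers, such as wav or SPHERE, would additionally require skipping the header and honoring its declared sample format.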