Sections of this Guide:

  1. Introduction
  2. It All Begins With The File Tag
    Tag Covered: <file>
  3. Numeric Field Types
    Tags Covered:
    <byte>, <unsigned-byte>, <short>, <unsigned-short>, <int>, <unsigned-int>, <long>, <unsigned-long>,
    <float>, <double>, <packed-decimal>, <zoned-decimal>, <fixed-byte>, <fixed-short>, <fixed-int>, <fixed-long>.
  4. The String Field Type
    Tag Covered: <string>
  5. Bit Flag Field Types
    Tags Covered: <flag-byte>, <flag-short>, <flag-int>, <flag-long>.
  6. Field Groups
    Tag Covered: <group>
  7. Field Blocks
    Tag Covered: <block>
  8. Field Arrays
    Tag Covered: <array>
  9. Data Field Type
    Tag Covered: <data>
  10. ASCII Int Field Type
    Tag Covered: <ascii-int>
  11. Label Field Type
    Tag Covered: <label>
  12. Calculated Fields
  13. Color Field Type
    Tag Covered: <color>
  14. Time and Date Field Types
    Tags Covered: <time>, <date>
  15. Tabbed Field Groups
    Tag Covered: <tab-group>
  16. Displaying Fields as Lists
  17. OnUpdate Events for Fields
  18. Conditional Existence of Fields
  19. JavaScript Expressions
  20. Including XML from Other Files
  21. Re-usable Type Definitions
  22. File Filters
  23. Command Line Arguments
  24. Environment Variables
  25. Frequently Asked Questions

1. Introduction

The file format definitions that allow FileCarver to edit or create files of various types are written in the Extensible Markup Language (XML). XML files are simply text files, conventionally with the .xml extension, that follow the XML syntax. Although this guide explains in detail, with many examples, how XML file format definitions work, it helps to be at least a little familiar with the general syntax and structure of XML documents. Information on the XML standard, for both beginners and experts, is available at http://www.xml.com/.

In addition to this guide, an XML Schema file named 'xml_schema.xsd' is included with FileCarver and specifies the structure of file format definitions. FileCarver automatically uses this schema to validate file format definitions on startup, and reports any errors it encounters. You can also use an external XML editor to validate file format definitions against the provided XML Schema as you edit them. One example of such a program is the EditiX XML editor.

2. It All Begins With The File Tag

Each file format definition file should begin with a <file> tag. This tag acts as a container for the different fields that are specified in the file. It can have the optional attribute name, which customizes how the file type is displayed in the list by FileCarver. Once all elements of the file have been placed after the opening tag, the file tag must be closed. The order in which the field tags are placed inside the file tag is the order in which FileCarver reads and writes those fields from/to the binary file.

Example:
<file name="PC Info File">
  <int name="Some Value"/>
  <string name="Some Other Value" length="20"/>
</file>

With the optional attribute format on the <file> tag, you may specify the default byte order of numeric fields in the file. Possible values are 'big-endian' and 'little-endian'. If the format attribute is not present on the file tag, the numeric fields will use the 'big-endian' byte order by default. Note: You may always override the byte order on a field-by-field basis.

In the following example, Field1 is big-endian while Field2 is little-endian:
<file name="PC Info File" format="little-endian">
  <int name="Field1" format="big-endian"/>
  <long name="Field2"/>
</file>

3. Numeric Field Types

The following numeric types are available for use in FileCarver:

Field           Bytes     Type            Min Value             Max Value
byte            1         integer         -128                  127
unsigned-byte   1         integer         0                     255
short           2         integer         -32768                32767
unsigned-short  2         integer         0                     65535
int             4         integer         -2147483648           2147483647
unsigned-int    4         integer         0                     4294967295
long            8         integer         -9223372036854775808  9223372036854775807
unsigned-long   8         integer         0                     18446744073709551615
float           4         floating point  Negative Real         Positive Real
double          8         floating point  Negative Real         Positive Real
fixed-byte      1         fixed point     Negative Real         Positive Real
fixed-short     2         fixed point     Negative Real         Positive Real
fixed-int       4         fixed point     Negative Real         Positive Real
fixed-long      8         fixed point     Negative Real         Positive Real
packed-decimal  variable  decimal         Unbounded             Unbounded
zoned-decimal   variable  decimal         Unbounded             Unbounded

Common attributes for all numeric types:

Attribute Description Example
name The name of the field, as will be labeled in the user interface. Can consist of any characters allowed in an XML attribute value. <int name="Strength"/>
id The id of the field, which can be used to reference this field from other fields. Can consist of a number of lower-case characters, digits and underscores. Must start with a lower-case character. <byte id="c_age23"/>
value The value that an instance of this field will contain when a new file is created. The default value, when not specified, is 0 for numeric types. The specified value must be within the range for the field type. For integer field types (see table), the value may be specified in hexadecimal by prepending the 0x prefix, for example value="0xFF" (the digits are padded on the left with as many 0's as necessary to fill the field's byte width). <float value="2.3"/>
editable Specifies whether this field is editable. If unspecified, the default is always true. You may specify false to make a field read-only. <short editable="false"/>
hidden Specifies whether this field will be hidden from viewing and editing. If unspecified, the default value is always false. You may specify true to make the field hidden (the value in that field will still be preserved when opening and saving the file). <double hidden="true"/>
format Specifies the endianness of the field. Not applicable for packed-decimal and zoned-decimal fields. There are two possible values for this attribute for numeric types: big-endian and little-endian. The default value corresponds to the value that was set for the format attribute on the file tag, or big-endian if no such attribute has been set. If your file format stores values in the little-endian format (for example, the same way that they would be stored in RAM on an Intel x86 CPU), then you should specify the little-endian format. This attribute is not applicable for byte fields. <int format="little-endian"/>

Additional attributes only available on integer types:

Attribute Description Example
min When specified, limits the minimum value of the field (as can be set in the user interface), to the specified value. This value must be in the range of valid values for this field, and must be less than or equal to the max value, if specified. <int min="-273"/>
max When specified, limits the maximum value of the field (as can be set in the user interface), to the specified value. This value must be in the range of valid values for this field, and must be greater than or equal to the min value, if specified. <byte max="100"/>
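
The common and integer-only attributes can be combined freely on a single tag. The following is only a sketch that applies the rules from the tables above; the file name, field names, ids and values are made up for illustration.

Example:
<file name="Character Sheet" format="little-endian">
  <!-- Hypothetical fields; names, ids and values are illustrative only -->
  <!-- Read-only 2-byte signature; the hex default 0x4353 lies within 0..65535 -->
  <unsigned-short name="Signature" id="sig" value="0x4353" editable="false"/>
  <!-- Editable level, limited in the user interface to 1..99 -->
  <unsigned-byte name="Level" id="level" min="1" max="99" value="1"/>
  <!-- Preserved when saving, but never shown -->
  <int name="Reserved" hidden="true"/>
  <!-- Overrides the file-wide byte order for this one field -->
  <float name="Speed" format="big-endian" value="2.5"/>
</file>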

The zoned-decimal and packed-decimal field types are commonly used in mainframes and in COBOL applications. These fields can consist of multiple bytes and can have an implicit decimal point positioned at a specific location. The zoned-decimal type generally uses one byte per digit, while the packed-decimal type stores two digits in a single byte. The following additional attributes are available for zoned-decimal and packed-decimal fields:

Attribute Description Example
length The length attribute is required. Specifies the length in bytes of the field. <packed-decimal length="3"/>
decimals Specifies the number of decimal digits that this field has. This effectively places a decimal point at that position, when the field's value is displayed in the user interface. The default value is 0. <zoned-decimal length="5" decimals="2"/>
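
As a rough illustration of how length and decimals interact, consider the two hypothetical fields below. In the conventional COBOL packed-decimal (COMP-3) layout, the last half-byte holds the sign, so a 3-byte field holds five digits and a stored 12345 with decimals="2" would be shown as 123.45. Whether FileCarver follows exactly that layout is an assumption here, not something this guide states.

Example:
<!-- Hypothetical field: 3-byte packed decimal displayed with two decimal places -->
<packed-decimal name="Balance" length="3" decimals="2"/>
<!-- Hypothetical field: 5-byte zoned decimal, no decimal places (the default) -->
<zoned-decimal name="Account Number" length="5"/>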

A note about the 'long' type: The 'long' type is an 8-byte integer, which corresponds to C's 'long a;' only when the C code has been compiled in 64-bit mode for 64-bit processors. On 32-bit machines, 'long a;' is actually a 4-byte integer, and is identical to 'int a;'. An 8-byte integer may be declared in C on 32-bit machines using 'long long a;'. This corresponds to FileCarver's 'long' type, which is also identical to Java's 'long' type. FileCarver's 'unsigned-long' type has no primitive equivalent in Java. More information on C data types is available at this location, while general information about integer types can be found at this page on Wikipedia.

4. The String Field Type

The <string> tag is used to denote a field in the file that will hold a string of characters. This field type supports a number of attributes to specify the exact format of the string.

Strings support the following standard attributes:

Attribute Description Example
name The name of the field, as will be labeled in the user interface. Can consist of any characters allowed in an XML attribute value. <string name="First Name" ... />
id The id of the field, which can be used to reference this field from other fields. Can consist of a number of lower-case characters, digits and underscores. Must start with a lower-case character. <string id="f_name" ... />
editable Specifies whether this field is editable. If unspecified, the default is always true. You may specify false to make a field read-only. <string editable="false" ... />
hidden Specifies whether this field will be hidden from viewing and editing. If unspecified, the default value is always false. You may specify true to make the field hidden (the value in that field will still be preserved when opening and saving the file). <string hidden="true" ... />

In addition to the standard attributes listed above, the string field supports a number of special attributes that are described below:

Attribute Description
value This attribute specifies the value that this field will contain when a new file is created. This attribute also supports escape-sequences for special characters, such as \n for newline, \t for tab, and so forth. In addition, since these values are specified in an XML file, you can use some XML codes to get characters such as the double quote to show up. Here are some useful examples of different types of codes and escape sequences you can use:
  • \\ -> backslash
  • \n -> newline
  • \t -> tab
  • &quot; -> double quote
  • &amp; -> ampersand
Examples:
<string value="Jerry's Hotel" ... />
<string value="Don &quot;The Yellow Dart&quot; Smith" ... />
<string value="Twenty years had passed,\nyet the incident was still unforgotten." ... />
length

For string fields, the length attribute is required. It can have one of three values. If it is a fixed-length field (that is, if the size in bytes to store the string in the file is always the same), then the length attribute should specify the field's length in bytes.

Example:
<string length="32"/>

When you specify 'variable' as the value of the length attribute, the length in bytes of the string field may vary from file to file, depending on some other factor (read on). When this is the case, the format attribute must be set to either 'terminated' or 'length-specified' (these are described under 'format' later in this table).

Examples:
<string length="variable" format="terminated"/>
<string length="variable" format="length-specified" lengthfield="f_len"/>

Finally, and only if this is the last field in the file or enclosing block, you may set the value of the length attribute to 'remainder', in which case the remainder of the file or block will be treated as part of the string. If this is the case, you must also set the format attribute to 'length-specified' (see below).

Example:
<string length="remainder" format="length-specified"/>
format The format of the field can be set to one of the following:
terminated:
This specifies that the string is terminated by a specific character, set by the terminator attribute. The default terminator is the character '\0', which corresponds to the integer value 0 and mimics the behaviour of a C-style string. For a fixed-length field, the longest string that can be stored is always one character shorter than the field could otherwise hold, since one character is always used as the terminator. When the string takes up fewer bytes than the specified length, the extra space at the end (after the terminator) is filled with zeroes. If, on the other hand, this is a variable-length string, it takes as much space as necessary in the file, with the last character always being the terminating character.

padded:
This is equivalent to left-padded (see below).

left-padded:
The string is fixed-length, and the length must be specified explicitly. If there are fewer characters than the specified length, the remaining space from the end of the string up to the end of the field is set to the character specified by the padchar attribute (the default is a space).

right-padded:
The string is fixed-length, and the length must be specified explicitly. If there are fewer characters than the specified length, the string is moved all the way to the right of the field, and all characters from the start of the field up to the beginning of the string are set to the character specified by the padchar attribute (the default is a space).

length-specified:
If the format of the string is set to length-specified, then the length of the string will be determined based on some external factor: either the value of some other field, or the length of the file. If the 'length' attribute is set to 'remainder', then the length of the string is determined by the number of bytes left in the file. If the 'length' attribute is set to 'variable', then the 'lengthfield' attribute must specify how many bytes are taken up by this string. Finally, if 'length' was set to an explicit integer value, for example 32, then that many bytes will be read/written from/to the file, but the 'lengthfield' will be used to determine how many of those bytes are actually used for the characters in the string (the remaining bytes are treated as padding.)
terminator Can be used when the format attribute is set to 'terminated'. Specifies the character to use to mark the end of the string. The default value is '\0' which is equivalent to the integer value 0. The value specified may be any single character, or one of the following sequences for special values:
  • \0 -> null character
  • \\ -> backslash
  • \n -> newline
  • \r -> carriage return
  • \t -> horizontal tab
  • &quot; -> double quote (following XML rules)
padchar Should be used when the format attribute is set to either 'padded', 'left-padded', or 'right-padded'. Specifies the character used for padding the string. If this attribute is not present, string padding will be done with the space character. The value specified may be any single character, or one of the following sequences for special values:
  • \0 -> null character
  • \\ -> backslash
  • \n -> newline
  • \r -> carriage return
  • \t -> horizontal tab
  • &quot; -> double quote (following XML rules)
lengthfield Should be used when the length attribute is set to 'variable' and the format attribute is set to 'length-specified'. In these circumstances, this attribute must be set to the same value that is specified in the id attribute of the field that will store the length of the string. That field must come before the string field for which it specifies the value. Field ids are scoped, thus the first field with the matching id in the parent container of the string field (file, group, or array element) will be used when found. Such a field must exist.

NOTE: Currently, FileCarver does not support more than one field relying on the same length field.
encoding This attribute specifies the character encoding that the string is using. The following values are supported:
  • US-ASCII
  • UTF-8
  • UTF-16BE
  • UTF-16LE
  • ISO-8859-1
  • ISO-8859-2
  • ISO-8859-3
  • ISO-8859-4
  • ISO-8859-5
  • ISO-8859-6
  • ISO-8859-7
  • ISO-8859-8
  • ISO-8859-9
  • KOI8-R
  • MacRoman
  • windows-1250
  • windows-1251
  • windows-1252
  • windows-1253
  • windows-1254
  • windows-1255
  • windows-1256
  • windows-1257
  • windows-1258
  • Cp1047
  • Cp1140
Support for additional character encodings can be added on request.
display The display attribute for a string field can have two values: 'text-field' and 'text-area'. Setting it to 'text-field' is the equivalent of not including the attribute, as this is its default value. When set to 'text-field', the string will be displayed on a single line in the user interface, whereas the 'text-area' setting will make it span multiple lines, which is desirable for longer strings.
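
To see how the string attributes described in this section fit together, here is a small sketch of a definition that uses several formats side by side. The file name, field names and ids are hypothetical, and the sketch relies only on the attributes documented in the tables above.

Example:
<file name="String Samples">
  <!-- Hypothetical names and ids throughout -->
  <!-- Fixed 16-byte field, text left-aligned, unused bytes set to '.' -->
  <string name="Code" length="16" format="left-padded" padchar="."/>
  <!-- Variable-length line that ends at a newline instead of '\0' -->
  <string name="Title" length="variable" format="terminated" terminator="\n" encoding="UTF-8"/>
  <!-- The length field must come before the string that refers to it -->
  <unsigned-short name="Notes Length" id="notes_len"/>
  <string name="Notes" length="variable" format="length-specified" lengthfield="notes_len" display="text-area"/>
  <!-- Last field in the file: takes whatever bytes remain -->
  <string name="Trailer" length="remainder" format="length-specified"/>
</file>

Note that 'remainder' is only valid here because the Trailer field is the last field in the file.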

5. Bit Flag Field Types

FileCarver supports variants of the byte, short, int and long fields that are used specifically for storing a set of boolean values, i.e. flags. Within the graphical user interface, such fields are displayed as sets of checkboxes (each flag being a separate checkbox). These fields support the same attributes as their normal counterparts, and can additionally contain <bit> tags.

The following such field types are available:

Type        Bytes  Bits  Min Bit Pos  Max Bit Pos
flag-byte   1      8     0            7
flag-short  2      16    0            15
flag-int    4      32    0            31
flag-long   8      64    0            63

To define specific bits within a flag type, the <bit> tag is used. This tag supports the following attributes:

Attribute Description Example
name The name of this flag bit, which will be the name displayed next to the checkbox for this flag in the user interface. This attribute is required. <bit name="Can Fly" ... />
position The position of the bit within the field. Must be a value between 0 and the maximum bit position supported by the field type (see previous table). No two bits within a single field should specify the same position. This attribute is required. <bit position="2" ... />
value Specifies the initial value of the bit, either 1 (set) or 0 (not set) when a new file is created. The default value (when not specified) is 0. <bit value="1" ... />
editable Specifies whether this flag bit is editable. If unspecified, the default is always true. You may specify false to make the flag read-only. <bit editable="false" ... />
hidden Specifies whether this flag bit will be hidden from viewing and editing. If unspecified, the default value is always false. You may specify true to make the bit flag hidden (the value in that field will still be preserved when opening and saving the file). <bit hidden="true" ... />

For example, a flag field can be used thus:
  <flag-byte name="Characteristics">
    <bit position="0" name="Is Evil" value="1"/>
    <bit position="1" name="Can Fly" value="0"/>
    <bit position="2" name="Bipedal" hidden="true"/>
    <bit position="3" name="Humanoid" editable="false"/>
  </flag-byte>

6. Field Groups

Fields can be grouped together with the <group> tag. This works by simply placing the tag as a container around any number of field tags, or other group or array (see later) tags.

Example:
<group>
  <int name="Some Value"/>
  <string name="Some Other Value" length="20"/>
</group>

Grouping tags together has several advantages. First, by using the optional name attribute on a group tag, you can put a titled border around a set of elements.

Example:
<group name="Values">
  <int name="Some Value"/>
  <string name="Some Other Value" length="20"/>
</group>

The optional display attribute on a group tag currently has five choices for values: 'normal', 'horizontal', 'collapsable', 'collapsed' and 'window'.

Setting the display attribute to 'normal' is equivalent to not specifying the attribute at all (in other words, it is the default value). The 'normal' display type will lay out the fields one after the other, just as fields are laid out normally in the file tag.

The 'horizontal' value on the display attribute will cause the group to lay out its fields horizontally, one after the other, rather than vertically as would be done otherwise.

The 'collapsable' and 'collapsed' values will display the field group in a way that lets the user click on a triangle to hide and show the fields in the group (the difference between the two values is the initial state).

Finally, display="window" will display the fields of a group in a separate window, with a button on the main window to open it.

Example:
<group name="Values" display="window">
  <int name="Some Value"/>
  <string name="Some Other Value" length="20"/>
</group>

Another advantage of the group tag is that id attributes of fields contained by the group will be in the scope of that group, which allows flexibility when referencing fields by their ids from other tags. Group tags can also be nested easily.

Example:
<group name="Values" display="window">
  <int name="Some Value"/>
  <string name="Some Other Value" length="20"/>
  <group name="More Values">
    <int name="A Third Value"/>
    <string name="Something Else" length="20"/>
  </group>
</group>

The <group> tag also supports the attributes editable and hidden in the same way that other fields do.
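
The display values other than 'window' are used in the same way. The sketch below uses hypothetical group and field names; it lays out one group horizontally and makes a second group collapsible and read-only. This guide does not say which of 'collapsable' and 'collapsed' starts out closed, so the comment on the second group is an assumption based on the names.

Example:
<!-- Hypothetical names; X and Y appear side by side -->
<group name="Position" display="horizontal">
  <int name="X"/>
  <int name="Y"/>
</group>
<!-- Assumed to start collapsed; its fields cannot be edited -->
<group name="Debug Info" display="collapsed" editable="false">
  <int name="Build Number"/>
  <string name="Build Tag" length="20"/>
</group>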

7. Field Blocks

Field blocks are like field groups, but have an additional length attribute that specifies the exact number of bytes a field block takes up in the file. You may refer to the previous section on field groups for information that is common to both field groups and field blocks.

The length of the block can be specified in two ways - both using the length attribute. The first way is to specify a literal positive integer as the length of the block field.

Example:
<block length="128">
  <string name="First Value" length="variable" format="terminated"/>
  <string name="Second Value" length="variable" format="terminated"/>
</block>

In the above example, the block will always take up 128 bytes in the file. Within the block are two null-terminated strings of variable length. If the combined length of the two strings is less than 128 bytes, then the remaining length will be skipped when reading the file, and will be filled with 0's when writing to the file. If the combined length of the two strings is greater than 128 bytes when writing, the contents of the block field will be truncated to 128 bytes.

The second way to specify the length of the block field is with a JavaScript expression that will be evaluated when reading the block field and when writing the block field to determine its length. This is done by placing the expression inside the parentheses of eval(), and setting this as the value of the length attribute.

Example:
<int name="Twice the Block Length" id="twicelen"/>
<block length="eval(twicelen/2)">
  <string name="First Value" length="variable" format="terminated"/>
  <string name="Second Value" length="variable" format="terminated"/>
</block>

In the above example, the JavaScript expression twicelen/2 will be evaluated to determine the length of the block prior to reading the block as well as prior to writing the block. Please note that the fields referenced by the JavaScript expression to specify the length of the block do NOT get automatically updated as the contents of the block change, and must instead be updated manually (if desired), using onupdate events on the contents of the block whose length may change.

Finally, any fields that support setting the length attribute to the value remainder (strings, arrays, data fields, etc), when placed as the last element of a block, will be read until the end of the block, but not further. This allows for greater flexibility in limiting the length for such fields.
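
As a sketch of the last point, the hypothetical definition below stores a block size in front of a block whose final field uses length="remainder". The field names and the id are made up for illustration. The Comment string stops at the end of the block, so the Checksum field that follows the block is still read correctly.

Example:
<!-- Hypothetical names and id -->
<unsigned-short name="Comment Block Size" id="blocksize"/>
<block length="eval(blocksize)">
  <unsigned-byte name="Comment Type"/>
  <!-- Last field in the block: reads up to the end of the block only -->
  <string name="Comment" length="remainder" format="length-specified"/>
</block>
<int name="Checksum"/>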

8. Field Arrays

The <array> tag is similar to the <group> tag, in the sense that it acts as a container around other field tags, group tags and array tags. The difference is that an array tag allows you to have more than one set of the fields it contains. Seven basic types of arrays are supported:

  1. Fixed length array with a pre-defined length:
    This type of array is defined by setting the 'length' attribute to a numeric value which corresponds to the number of elements in the array.
  2. Variable length array in which the length is specified by another field:
    This type of array is defined by setting the 'length' attribute to 'variable', and setting the 'lengthfield' attribute to the id of a previously defined field that specifies the length of this array.
  3. Fixed length array in which the number of active elements is defined by another field:
    This type of array is defined by setting the 'length' attribute to a numeric value which corresponds to the number of elements in the array as read/written to file, and setting the 'lengthfield' attribute to the id of a previously defined field that specifies the number of active elements in the array (these elements take up that many positions in the array, starting from the first element).
  4. Variable length array with the length determined by a terminating condition:
    This type of array is defined by setting the 'length' attribute to 'variable', and setting the 'termwhen' attribute to an expression that should evaluate to false on regular elements, and true when the terminating element is encountered. The expression is evaluated in the scope of the implicit field group corresponding to each element in the array. The default values that you set for an element in such an array must cause the terminating condition to evaluate to true.
  5. Fixed length array in which the number of active elements is determined by a terminating condition:
    This type of array is defined by setting the 'length' attribute to a numeric value which will correspond to the number of elements in the array as read/written to the file, and setting the 'termwhen' attribute to an expression that should evaluate to false on regular elements, and true when the terminating element is encountered. The expression is evaluated in the scope of the implicit field group corresponding to each element in the array. The default values that you set for an element in such an array must cause the terminating condition to evaluate to true.
  6. Variable length array with length dependent on size of file:
    This type of array is only valid when the array is the last element in the file definition or an enclosing block. It is defined by setting the 'length' attribute to 'remainder'.
  7. Variable or fixed length array with the same length as another array:
    This type of array is also known as a parallel array, because each element in this array must correspond to an existing element in the other array. An array of this type is defined by not including the 'length' attribute and setting the 'parallelto' attribute to the id of a previously defined array field to which this array will be parallel. In this mode, this array will always be of the same length as the array it is parallel to. An array may not be parallel to another parallel array, but two or more arrays may be parallel to a single main array. Multiple parallel arrays may be displayed in a single array editor, using display="merge" on an array that is parallel to another array.
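
The third type in the list above (a fixed length array whose number of active elements comes from another field) could be written as in the sketch below. The names and the id are hypothetical. Each element takes 3 bytes (a 2-byte unsigned-short plus a 1-byte unsigned-byte), so the array always occupies 8 x 3 = 24 bytes in the file, regardless of how many slots are in use.

Example:
<!-- Hypothetical names and id -->
<unsigned-byte name="Slots Used" id="used"/>
<!-- Always 8 elements in the file; only the first 'used' elements are active -->
<array name="Inventory" length="8" lengthfield="used" rowname="Slot">
  <unsigned-short name="Item Id"/>
  <unsigned-byte name="Quantity"/>
</array>
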
Arrays support the following attributes:

Attribute Description
name The name of the array, if specified, will be used to put a titled border around it.
length

The length attribute is required unless this is a parallel array. For parallel arrays, this attribute must be omitted. The attribute can take three kinds of values. For a fixed-length array (that is, if the number of elements in the array is some constant number and the array cannot be shrunk or expanded), the length attribute should specify this number of elements.

Example:
<array length="4">
  <int name="Width"/>
  <int name="Height"/>
</array>

If the number of elements in the array can vary, there are two possible values you can set the length attribute to, depending on what determines the length of the array. If you specify 'variable' as the value of the length attribute, then you must also either set the lengthfield attribute to the id of a previously declared field that determines the length of this array, or set the termwhen attribute (described later in this table). The array can then be shrunk or expanded.

Example:
<unsigned-byte name="Length" id="len"/>
<array length="variable" lengthfield="len">
  <int name="Width"/>
  <int name="Height"/>
</array>

Finally, and only if this array is the last tag in the file or enclosing block, you can set the value of the length attribute to 'remainder', in which case the remaining length in the file or block specifies the number of elements in the array.

Example:
<array length="remainder">
  <int name="Width"/>
  <int name="Height"/>
</array>
lengthfield Must be specified when the length attribute is set to 'variable', unless the termwhen attribute is used instead. This attribute must be set to the same value that is specified in the id attribute of the field that will store the length of the array. That field must come before the array field for which it specifies the value. Field ids are scoped, which means the first field with the matching id in the parent container of this array field (file, group, or array element) will be used when found. Such a field must exist.

NOTE: Currently, FileCarver does not support more than one field relying on the same length field.
parallelto When this attribute is specified, the 'length' attribute must be omitted, and the value of this attribute must equal the value specified in the id attribute of the array field to which this array will be parallel. A parallel array may (but does not have to) have its display attribute set to 'merge'.
termwhen This attribute specifies a JavaScript Expression that is evaluated, in the scope of each element of the array, to determine when the array is terminated. When the expression evaluates to true, the array is considered terminated. The expression must evaluate to true on the default values (as specified in the XML definition) of an element in this array, as these values will be used to terminate the array, when writing to a file. Two types of terminated array are possible.

Terminated arrays of variable length must have their 'length' attribute set to 'variable'. When read from file, each element from such an array is read and the 'termwhen' expression is evaluated. When the expression evaluates to true, no further elements of the array are read from the file, and the next field after the array is then read.
Example:
<array length="variable" termwhen=" (width == -1) ">
  <int name="Width" id="width" />
  <int name="Height"/>
</array>
Terminated arrays of fixed length must have their 'length' attribute specify a positive integer as the maximum number of elements they are to contain. When read from file, there are always that many elements (as specified by the length) read, but the number of active elements is determined by the 'termwhen' expression.
Example:
<array length="12" termwhen=" (width == -1) ">
  <int name="Width" id="width" />
  <int name="Height"/>
</array>
rowname Specifies the name of each row in the array, as will be displayed in the user interface. If unspecified, the default value is 'Element', producing a user interface that labels its rows in the following manner: 'Element 0', 'Element 1', 'Element 2', etc. If this attribute is present, the value of the attribute will be used as a basis for labeling the rows of the array.
Example: (will produce rows named: 'Size 0', 'Size 1', 'Size 2')
<array length="3" rowname="Size">
  <int name="Width"/>
  <int name="Height"/>
</array>
It is also possible to specify a JavaScript expression that will be evaluated to calculate the name of the array row. To do this, simply wrap the JavaScript expression between the parentheses of eval() and set that as the rowname attribute. The JavaScript will be executed in the scope of the specific row of the array, so you may refer to specific fields simply by their IDs. Also, you may find out the index of the current array element by using this.element_index. You can use this to have array elements be numbered from 1 instead of zero in the list, as shown in the example below.
Example: (will produce rows named: 'Size 1', 'Size 2', 'Size 3')
<array length="3" rowname="eval('Size ' + (this.element_index+1))">
  <int name="Width"/>
  <int name="Height"/>
</array>
Example: (will name rows according to value of the name field)
<array length="3" rowname="eval(name)">
  <string name="Name" length="12" value="Unknown" id="name"/>
  <int name="Age"/>
</array>
Note: JavaScript expressions for array row names will only be re-evaluated when a field in the array at that row is updated. This means that if you make the expression take the value from another field outside the array, the row name may not be updated when that field (outside the array) changes in value.
display This attribute may be omitted. If it is omitted, or if set to 'normal', then the array will display normally. However, and only if this is a parallel array with the parallelto attribute present and the length attribute omitted, you may set the display attribute to 'merge'. When this is done, instead of this array getting a dedicated user interface control, the fields of each element of this array will be appended to the corresponding element in the master array, and will be displayed in the same user interface as the element of the master array.
onupdate This attribute, when specified, allows a JavaScript expression to be executed when the number of elements in the array changes. For example, this could be used to update a calculated field that displays the number of elements in the array. For more information, refer to the sections in this guide that cover onupdate events and calculated fields.
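
The two termwhen modes described above, for variable-length and fixed-length terminated arrays, can be sketched in plain JavaScript. This is only an illustration and not FileCarver's implementation: readVariableTerminated and readFixedTerminated are hypothetical names, the records array stands in for elements already decoded from the file, and treating the terminating element as inactive in the fixed-length case is an assumption that this guide does not confirm.

```javascript
// Variable length: read one element at a time and stop as soon as
// the termwhen condition is true for the element just read.
function readVariableTerminated(records, termWhen) {
  const read = [];
  for (const element of records) {
    read.push(element);
    if (termWhen(element)) break; // no further elements are read
  }
  return read;
}

// Fixed length: always read `length` elements, then use termwhen
// to decide how many of them are active.
// Assumption: the element that satisfies termwhen is not active.
function readFixedTerminated(records, length, termWhen) {
  const read = records.slice(0, length);
  const firstTerminator = read.findIndex(termWhen);
  const active = firstTerminator === -1 ? read.length : firstTerminator;
  return { read, active };
}

const records = [{ width: 10 }, { width: 20 }, { width: -1 }, { width: 30 }];
const isTerminator = (element) => element.width === -1;

console.log(readVariableTerminated(records, isTerminator).length); // 3
console.log(readFixedTerminated(records, 4, isTerminator).active); // 2
```

In the variable-length case the field after the array starts right after the third element, while in the fixed-length case all four elements are consumed from the file regardless of where the terminator appears.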

9. Data Field Type

FileCarver supports a <data> field type tag, which specifies that a portion of the file is data of fixed or variable length. The editor for this field type in the graphical user interface is a hex editor. This tag can be used for editing data which is in complex formats that FileCarver does not yet support, or (when set to be hidden) to just skip portions of the file that should not be edited.

The following attributes are supported by the <data> tag:

Attribute Description
name The name of the field, used to label it in the user interface.
Example:
<data name="Extras" ... />
id The id of the field, which can be used to reference this field from other fields. Can consist of a number of lower-case characters, digits and underscores. Must start with a lower-case character.
Example:
<data id="xt_data" ... />
editable Specifies whether this field is editable. If unspecified, the default is always true. You may specify false to make a field read-only.
Example:
<data editable="false" ... />
hidden Specifies whether this field will be hidden from viewing and editing. If unspecified, the default value is always false. You may specify true to make the field hidden (the value in that field will still be preserved when opening and saving the file).
Example:
<data hidden="true" ... />
value Specifies the starting value of this data field, when a new file is first created. If unspecified, every byte in the data will be zero. Otherwise, you may enter a hexadecimal value that will be set when a new file is made. The value is limited to the characters 0-9, a-f, and A-F, with an optional 0x prefix. If the prefix 0x is specified, zeroes will be inserted before the hex value to satisfy the specified length; otherwise zeroes will be inserted after the hex value. If the field is of variable length, and the specified value is an odd number of characters, then a single zero will be appended to make the number of digits even, and therefore divisible into whole bytes.
Examples:
<data value="0xEEFF" ... />
<data value="6df13e80d9f822e3" ... />
length

The length attribute is required. It can have three possible types of values. If the data is fixed-length (that is, the data always takes up the same number of bytes), then the length attribute should specify the number of bytes it occupies.

Example:
<data length="128"/>

If the size of the data can vary, there are two possible values you can set the length attribute to, depending on what determines the length. If you specify 'variable' as the value of the length attribute, then you must also set the lengthfield attribute to specify the id of a previously-declared field that will be used for determining the length of the data (in bytes), and the data can then be shrunk or expanded.

Example:
<unsigned-byte name="Length" id="len"/>
<data length="variable" lengthfield="len"/>

Finally, and only if this data field is the last tag in the file or enclosing block, you can set the value of the length attribute to 'remainder', in which case the data field will contain the remaining bytes in the file or block.

Example:
<data length="remainder"/>

lengthfield Must be specified when the length attribute is set to 'variable'. This attribute must be set to the same value that is specified in the id attribute of the field that will store the length of the data field. That field must come before the data field for which it specifies the value. Field ids are scoped, which means the first field with the matching id in the parent container of this data field (file, group, or array element) will be used when found. Such a field must exist.

NOTE: Currently, FileCarver does not support more than one field relying on the same length field.
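
The zero-padding rules given for the value attribute above can be illustrated with a small JavaScript sketch. padDataValue is a hypothetical helper written for this guide, not a FileCarver function. It takes the attribute text and the field length in bytes (or null for a variable-length field) and returns the hex digits that would result. How values longer than the field are handled is not described in this guide, so the sketch simply leaves them unchanged.

```javascript
// Hypothetical helper illustrating the documented padding rules.
// lengthInBytes: a number for fixed-length fields, null for variable length.
function padDataValue(value, lengthInBytes) {
  const hasPrefix = /^0x/i.test(value);
  const digits = hasPrefix ? value.slice(2) : value;
  if (!/^[0-9a-fA-F]*$/.test(digits)) {
    throw new Error("value may only contain 0-9, a-f and A-F");
  }
  if (lengthInBytes === null) {
    // Variable length: an odd digit count gets a single trailing zero.
    return digits.length % 2 === 1 ? digits + "0" : digits;
  }
  const width = lengthInBytes * 2; // two hex digits per byte
  // With the 0x prefix zeroes go in front, without it they go at the end.
  return hasPrefix ? digits.padStart(width, "0") : digits.padEnd(width, "0");
}

console.log(padDataValue("0xEEFF", 4)); // 0000EEFF
console.log(padDataValue("6df1", 4));   // 6df10000
console.log(padDataValue("abc", null)); // abc0
```

In other words, a 0x-prefixed value behaves like a right-aligned number, while an unprefixed value behaves like a left-aligned run of bytes.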

10. ASCII Int Field Type

FileCarver supports a special variation of the 4-byte <int> field: the <ascii-int> field. This works in the same way as the <int> field, except for how the data is displayed and edited. Unlike the <int> field, which treats the 4 bytes as a binary number that is then edited in decimal, the <ascii-int> field treats those 4 bytes as four ASCII characters, and edits them as text. All tag attributes are the same as for the <int> field, except for the value attribute, which must be in the form of 4 characters, if specified. As an exception, entering 'NONE' as the value will actually produce the integer value -1, and not the characters 'N', 'O', 'N', 'E'.
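
The relationship between the four characters and the underlying integer can be sketched in JavaScript as follows. The function names are hypothetical, and the sketch assumes big-endian byte order (first character in the most significant byte); the actual order depends on the endianness in effect for the field.

```javascript
// Convert the 4-character text of an ascii-int into its 32-bit integer value.
// Assumption: big-endian byte order.
function asciiIntToNumber(text) {
  if (text === "NONE") return -1; // documented special case
  if (text.length !== 4) throw new Error("value must be exactly 4 characters");
  let n = 0;
  for (let i = 0; i < 4; i++) {
    const code = text.charCodeAt(i);
    if (code > 0x7f) throw new Error("value must be ASCII");
    n = (n << 8) | code;
  }
  return n;
}

// Convert a 32-bit integer back into four characters (same byte order).
function numberToAsciiText(n) {
  return String.fromCharCode(
    (n >>> 24) & 0xff, (n >>> 16) & 0xff, (n >>> 8) & 0xff, n & 0xff);
}

console.log(asciiIntToNumber("RIFF") === 0x52494646); // true
console.log(numberToAsciiText(0x52494646));           // RIFF
console.log(asciiIntToNumber("NONE"));                // -1
```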

11. Label Field Type

The <label> tag can be used to insert a descriptive string of text into the graphical user interface that will be presented to the user. This field is ignored when reading or writing from/to the file, and serves merely to present information to the user. The message displayed by the label is specified by setting the 'name' attribute.

Example:
<label name="Please fill in the following information."/>

12. Calculated Fields

FileCarver supports calculated fields - these are fields that are neither read from the file, nor written to the file. Instead, the values of these fields are calculated at run-time from the values of other fields.

To specify that a field is calculated, simply set the format attribute on the field tag to the value 'calculated'. You may also wish to set the editable attribute to 'false' on the field.

The value of a calculated field should be set by another field's onupdate event. In the example below, a calculated field will display the total of the values from the two fields before it:

Example:
<group>
  <double name="Material Cost" id="material" onupdate="this.parent.total.value = this.value + this.parent.labor.value" />
  <double name="Labor Cost" id="labor" onupdate="this.parent.total.value = this.value + this.parent.material.value" />
  <double name="Total Cost" format="calculated" editable="false" id="total"/>
</group>

13. Color Field Type

The <color> tag represents a calculated field that can be used to select a color, using a color picker control. Since colors may be stored in a multitude of different formats, the color field is a calculated field that is not actually stored in the file, but gets its value from other fields.

The color field, thus, expects its value to be set by one or more onupdate events on other fields, while the onupdate on the color field should be used to set the appropriate values back to the fields storing the color information.

The color field exposes three attributes that can be retrieved and set with JavaScript: red, green and blue. Each of these attributes is a floating point value between 0.0 and 1.0, corresponding to the respective RGB component of this color.

Example:
<group>
  <unsigned-byte name="Red" id="cred" hidden="true" onupdate="this.parent.col.red = this.value/255.0" />
  <unsigned-byte name="Green" id="cgreen" hidden="true" onupdate="this.parent.col.green = this.value/255.0" />
  <unsigned-byte name="Blue" id="cblue" hidden="true" onupdate="this.parent.col.blue = this.value/255.0" />
  <color name="Color" id="col" onupdate="
    cred.value = this.red*255;
    cgreen.value = this.green*255;
    cblue.value = this.blue*255;
    "/>
</group>
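
The conversion between stored bytes (0 to 255) and color components (0.0 to 1.0) used in the example above can be written as a pair of plain JavaScript functions. The names are hypothetical. The sketch adds clamping and rounding, which the example above does not do, so that a component such as 0.5 maps to a whole byte value.

```javascript
// Stored byte (0..255) to color component (0.0..1.0).
function byteToComponent(b) {
  return b / 255.0;
}

// Color component (0.0..1.0) to stored byte (0..255).
// Clamping and rounding are additions made by this sketch.
function componentToByte(c) {
  const clamped = Math.min(1, Math.max(0, c));
  return Math.round(clamped * 255);
}

console.log(componentToByte(0.5));                  // 128
console.log(componentToByte(byteToComponent(200))); // 200
```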

14. Time and Date Field Types

The <time> and <date> tags represent calculated fields that can be used to display and manipulate times and dates. Since times and dates may be stored in a multitude of different formats, these fields are calculated and are not actually stored in the file, but rather they receive their values from other fields.

These fields thus expect their values to be set by one or more onupdate events on other fields, while the onupdate event on the <time> and <date> fields should be used to set the appropriate values back to the fields storing the time or date information.

The <time> field exposes three attributes that can be retrieved and set with JavaScript:

JavaScript Property Description
hour The number of hours. Values are integers ranging from 0 to 23.
minute The number of minutes. Values are integers ranging from 0 to 59.
second The number of seconds. Values are integers ranging from 0 to 59.

The <date> field exposes three attributes that can be retrieved and set with JavaScript:

JavaScript Property Description
year The year as an integer (ex: 2007).
month The month as an integer value from 1 to 12 (January to December).
day The day of the month, ranging from 1 to 31.

In the example below, the <time> field is used to display the time stored in a packed 2-byte DOS Time value (as is used in ZIP files, for example):

Example:
<unsigned-short name="Last mod file time" id="dt" hidden="true"
    onupdate="
      this.parent.mtime.second = (this.value &amp; 0x1F) * 2;
      this.parent.mtime.minute = (this.value &amp; 0x7E0) / 0x20;
      this.parent.mtime.hour = (this.value &amp; 0xF800) / 0x800;
    "/>
<time name="File Last Modified on (Time)" id="mtime"
    onupdate="
      this.parent.dt.value = ((this.hour)&lt;&lt;11) + ((this.minute)&lt;&lt;5) + Math.floor(this.second/2);
    "/>

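The bit arithmetic in the DOS Time example can be checked outside of FileCarver with a small JavaScript sketch. The function names are hypothetical. The layout is the one the example relies on: hours in bits 11 to 15, minutes in bits 5 to 10, and seconds divided by two in bits 0 to 4.

```javascript
// Split a packed 16-bit DOS Time value into its parts.
function unpackDosTime(value) {
  return {
    hour: (value >> 11) & 0x1f,
    minute: (value >> 5) & 0x3f,
    second: (value & 0x1f) * 2, // stored with 2-second resolution
  };
}

// Build a packed 16-bit DOS Time value from its parts.
function packDosTime(hour, minute, second) {
  return (hour << 11) | (minute << 5) | (second >> 1);
}

const packedTime = packDosTime(13, 45, 30);
console.log(packedTime.toString(16));                      // 6daf
console.log(unpackDosTime(packedTime));                    // hour 13, minute 45, second 30
console.log(unpackDosTime(packDosTime(23, 59, 59)).second); // 58
```

The last line shows that odd seconds cannot be represented: 59 seconds is stored as 29 and reads back as 58.
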
In the example below, the <date> field is used to display the date stored in a packed 2-byte DOS Date value (as is used in ZIP files, for example):

Example:
<unsigned-short name="Last mod file date" id="dd" hidden="true"
    onupdate="
      this.parent.mdate.day = (this.value &amp; 0x1F);
      this.parent.mdate.month = (this.value &amp; 0x1E0) / 0x20;
      this.parent.mdate.year = 1980 + (this.value &amp; 0xFE00) / 0x200;
    "/>
<date name="File Last Modified on (Date)" id="mdate"
    onupdate="
      this.parent.dd.value = ((this.year - 1980)&lt;&lt;9) + ((this.month)&lt;&lt;5) + this.day;
    "/>

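The DOS Date packing can be checked in the same way with a short JavaScript sketch. The function names are hypothetical. The layout is the one the example relies on: years since 1980 in bits 9 to 15, the month in bits 5 to 8, and the day in bits 0 to 4.

```javascript
// Split a packed 16-bit DOS Date value into its parts.
function unpackDosDate(value) {
  return {
    year: 1980 + ((value >> 9) & 0x7f),
    month: (value >> 5) & 0x0f,
    day: value & 0x1f,
  };
}

// Build a packed 16-bit DOS Date value from its parts.
function packDosDate(year, month, day) {
  return ((year - 1980) << 9) | (month << 5) | day;
}

const packedDate = packDosDate(2007, 6, 15);
console.log(packedDate);                // 14031
console.log(unpackDosDate(packedDate)); // year 2007, month 6, day 15
```
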
15. Tabbed Field Groups

The <tab-group> tag allows a grouping of fields that will be presented in a tabbed interface. Unlike regular <group> tags, the <tab-group> may only contain inner tags of type <group>. The name of each such tag will then be used as the title of the tab for that group.

Example:
<tab-group>
  <group name="Tab 1">
    ...
  </group>
  <group name="Tab 2">
    ...
  </group>
  <group name="Tab 3">
    ...
  </group>
</tab-group>

16. Displaying Fields as Lists

For all numeric field types, the ascii-int field type, and the string field type, FileCarver supports a display mode in which the user will select the value for that field from a popup list. This is done by setting the display attribute on the tag element to the value 'list', and then nesting special <option> tags inside the tag that will be displayed in this way. Each option tag takes a required value attribute, which supports the same range of values as the value attribute on the parent field tag. The description of the option to be displayed to the user should be placed between the starting and ending tags of the option.

Example:
<int name="Occupation" display="list">
  <option value="0">Farmer</option>
  <option value="1">Carpenter</option>
  <option value="2">Blacksmith</option>
</int>

This will present the user with a drop-down list containing the three choices: Farmer, Carpenter, or Blacksmith. When the user selects one choice, the number specified in the value attribute of the option selected will be set for that field.

If you want the user to be able to enter any custom value that was not necessarily included in the definition, you can set the display attribute to 'combo'.

Example:
<int name="Occupation" display="combo">
  <option value="0">Farmer</option>
  <option value="1">Carpenter</option>
  <option value="2">Blacksmith</option>
</int>

This will allow the user to either select one of the three occupations, or enter a value other than 0, 1 or 2 directly into the textbox.

In addition to the required value attribute, option tags also support the following other attributes.

Attribute Description Example
selectable Specifies whether this option is selectable. If set to false, it will be greyed out in the user interface. When not specified, the value defaults to true. <option value="1" selectable="false">Red</option>
hidden Specifies whether this option is hidden. If set to true, it will not be displayed at all as one of the choices in the user interface. When not specified, the value defaults to false. <option value="0" hidden="true">Humanoid</option>

For fields that have options, it is possible to retrieve the label of an option via JavaScript, based on the option value. This can be done by indexing into the associative options array on the field.

Example:
<int name="Occupation" display="list" onupdate="this.parent.occtxt.value=this.options[this.value]">
  <option value="0">Farmer</option>
  <option value="1">Carpenter</option>
  <option value="2">Blacksmith</option>
</int>
<string name="Occupation Text" id="occtxt" length="variable" format="calculated"/>

Additionally, it is also possible to display integer fields with an up/down spinner control, by setting the display attribute to 'spinner'. The following fields can be displayed in this manner: byte, unsigned-byte, short, unsigned-short, int, unsigned-int, long and unsigned-long.

This will allow the user to increment and decrement the value of the field with a set of up and down arrows.

Example:
<int name="Age" display="spinner" value="20" />

17. OnUpdate Events for Fields

FileCarver supports user-configurable onupdate events for fields. These are JavaScript code snippets that are specified on the onupdate attribute for a field, that perform a certain action when the value of the field is updated.

The JavaScript code may perform normal JavaScript calculations, as per the regular JavaScript rules, and may also update the values of other fields. A good use of JavaScript onupdate events is to set values for calculated fields.

Example:
<flag-byte name="Info" id="info">
  <bit position="0" name="Is Old" value="0"/>
  <bit position="1" name="Has Children" value="0"/>
</flag-byte>
<int name="Age" id="age" value="20" onupdate="
  if (this.value &gt; 80) {
    info.bits[0] = 1;
  } else {
    info.bits[0] = 0;
  }
"/>

For more information and examples of onupdate events, see the section on calculated fields.


18. Conditional Existence of Fields

There are times when the existence of a field in a binary file depends on a previous field having some specific value. For example, you may have a flag field which, when a certain bit is set, indicates that there is an extra field following it.

FileCarver offers full support for such formats, with the existsif attribute which is available on all top level fields. Top level fields are all tags that can be put directly under a <file> tag.

The value of the existsif attribute is treated as a JavaScript expression (more below) that is executed in the scope of the parent field to evaluate to either true or false. When it evaluates to true, the field exists; when it does not, the field does not exist. When a field does not exist, it is neither read from the file, nor written to the file.

Examples:
<flag-byte name="Info" id="info">
  <bit position="0" name="Has Values" value="1"/>
  <bit position="1" name="Has Description" value="0"/>
</flag-byte>
<group existsif="info.bits[0] != 0" name="Values">
  <unsigned-int name="Salary" id="salary"/>
  <string name="Occupation" length="32" existsif=" ( salary != 0 ) " />
</group>
<string name="Description" display="text-area" length="256" existsif=" ( info.bits[1] == 1 ) " />

19. JavaScript Expressions

FileCarver uses an embedded JavaScript interpreter to process various expressions that may be specified in file format definitions. FileCarver uses JavaScript to evaluate the existence of fields using the existsif attribute, to execute update events on fields that have the onupdate attribute set, as well as to determine the last element of a terminated array by evaluating the JavaScript in the termwhen attribute.

It is also possible to embed a global set of re-usable functions in a file format definition by specifying them in a script tag at the beginning of the definition.

Example:
<file>
  <script>
    function convert(value) {
      return (value % 20 + 15);
    }
  </script>
  . . .
</file>

There are numerous resources available on the web that explain the rules of JavaScript expressions, and therefore they are not covered in this guide. Only the specific details of their implementation in FileCarver are explained.

Accessing Field Data

To access the data inside a field, and use it in a JavaScript expression, that field must have an id attribute specified, which will act as a variable name in JavaScript. For most types of fields, when you refer to the field by its id in a JavaScript expression, the value of that field is used.

Example:
<unsigned-int name="Salary" id="salary"/>
<string name="Occupation" length="32" existsif=" ( salary != 0 ) " />

Field ids are scoped by their containing tag, such as <group> or <file>. For example, if you have two tags with the same id, one in the same group as the expression, and the other one defined in a parent scope/group, the one in the same group takes priority.

Example:
<unsigned-int name="Outer Salary" id="salary"/>
<group name="Values">
  <unsigned-int name="Inner Salary" id="salary"/>
  <string name="Occupation" length="32" existsif=" ( salary != 0 ) " />
</group>

In the above example, the value of the "Inner Salary" field is used in the evaluation of the expression.
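
The scoping rule can be pictured as a lookup that starts in the scope containing the expression and walks outward until a matching id is found. The JavaScript below is a minimal model of that idea with hypothetical names (makeScope, declare, resolve); it is not how FileCarver is implemented internally.

```javascript
// A scope holds its own field ids and a link to the enclosing scope.
function makeScope(parent) {
  return { parent: parent, fields: new Map() };
}

function declare(scope, id, value) {
  scope.fields.set(id, value);
}

// Look up an id, innermost scope first.
function resolve(scope, id) {
  for (let s = scope; s !== null; s = s.parent) {
    if (s.fields.has(id)) return s.fields.get(id);
  }
  throw new ReferenceError("no field with id '" + id + "' is in scope");
}

const fileScope = makeScope(null);
declare(fileScope, "salary", 1000);  // stands in for "Outer Salary"
const valuesGroup = makeScope(fileScope);
declare(valuesGroup, "salary", 0);   // stands in for "Inner Salary"

console.log(resolve(valuesGroup, "salary")); // 0
console.log(resolve(fileScope, "salary"));   // 1000
```

An expression inside the group sees the inner value, while an expression at file level sees only the outer one.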

Additionally, fields may only refer to other fields that have been declared before them in the file definition. This is a necessary mechanism, as FileCarver needs to be able to determine if a field exists before reading it from the file, which must be done using only previously-defined fields.

If a field has been previously defined, but is located in a scope that is nested deeper than the scope of the expression, then any references to it must be done through that field's parent group.

Example:
<group name="Values" id="values">
  <unsigned-int name="Salary" id="salary"/>
</group>
<unsigned-int name="Job Security" existsif=" ( values.salary != 0 ) " />

References to individual bits of a flag-byte, flag-short, flag-int, or flag-long field may be done by referencing the bits member array of that field. The index of the array corresponds to the position attribute on the bit tag. The value of a bit is either 1, if it is set, or 0 otherwise.

Example:
<flag-byte name="Info" id="info">
  <bit position="3" name="Has Salary" value="1"/>
</flag-byte>
<unsigned-int name="Salary" existsif="info.bits[3] == 1"/>

For array tags, you may query their current length (number of active elements), using the length property.

Example:
<array name="Pencils" length="variable" termwhen=" ( pname == '' ) " id="pencils">
  <string length="16" id="pname" name="Pencil Name"/>
</array>
<string name="Who stole my pencils?" length="variable" existsif="pencils.length == 0"/>

Note: In the above example, it is valid to reference pname in the termwhen condition of the array, as that condition is only evaluated after an element in the array is read from the file (the array is considered terminated when an empty Pencil Name has been read).

Array elements may also be referenced by index, in order to access their children. Each array element can be considered as an implicit field group.

Example:
<array name="Pencils" length="variable" termwhen=" ( pname == '' ) " id="pencils">
  <string length="16" id="pname" name="Pencil Name"/>
</array>
<group existsif="pencils.length == 1">
  <string name="Why is the first pencil broken?" length="variable" existsif=" pencils[0].pname == 'broken' "/>
</group>

Operators and XML Representations

Since expressions are evaluated using a standards-compliant JavaScript interpreter, they may use any features allowed in JavaScript expressions, such as boolean operators, parentheses, etc. However, since these expressions are stored in XML, it is necessary to follow XML syntax rules. Specifically, the following substitutions must be made:

Character XML Representation
< &lt;
> &gt;
& &amp;

So, conditions like "a < b" must be written as "a &lt; b" and "(a == 2) && (a == 3)" as "(a == 2) &amp;&amp; (a == 3)".

This syntax is unfortunately necessary, in order to maintain the integrity of the XML format used for file format definitions. However, you can avoid having to substitute the above characters if your script resides in a global script tag. If so, you can either enclose your script body between <![CDATA[ and ]]> or use xi:include to import it as text, as described in the following section of the guide.

Example:
<file>
  <script><![CDATA[
    function isValid(value) {
      return (value >= 0 && value < 16);
    }
  ]]></script>
  . . .
</file>
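
For expressions that must live in an attribute such as existsif or onupdate, the escaping can be done mechanically. The JavaScript helper below is a hypothetical utility, not part of FileCarver, that converts a plain expression into a form that is safe inside a double-quoted XML attribute. It also escapes double quotes, which standard XML requires inside a double-quoted attribute value.

```javascript
// Escape a JavaScript expression for use inside a double-quoted XML attribute.
function escapeForXmlAttribute(expression) {
  return expression
    .replace(/&/g, "&amp;")  // must run first, so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

console.log(escapeForXmlAttribute("(a < b) && (a >= 2)"));
// (a &lt; b) &amp;&amp; (a &gt;= 2)
```

The order of the replacements matters: escaping & last would turn an already produced &lt; into &amp;lt;.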

Additional JavaScript Functions

Some additional JavaScript functions are provided by FileCarver. The functions provided are: alert() and prompt() which behave like functions of the same name in JavaScript used on web pages.

The alert() function takes one string parameter, and displays a dialog box with the specified message and an "OK" button.

Example:
<unsigned-int name="Min Height" id="min_h"/>
<unsigned-int name="Max Height" id="max_h" onupdate="
  if (this.value &lt; min_h.value) {
    alert('Max Height cannot be less than Min Height.');
    this.value = min_h.value;
  }
"/>

The prompt() function allows you to display an input prompt to the user. The first parameter is a string which specifies the message that will be presented to the user. A second parameter, which is optional, specifies the initial value of the input field. The prompt() function will return the string that the user entered, or null if the user clicked Cancel.

Example:
<unsigned-int name="Min Height" id="min_h"/>
<unsigned-int name="Max Height" id="max_h" onupdate="
  while (this.value &lt; min_h.value) {
    var value = prompt('Max Height cannot be less than Min Height. ' +
      'Please enter new value:', min_h.value);
    if (value == null) {
      this.value = min_h.value;
      break;
    }
    this.value = value;
  }
"/>

20. Including XML from Other Files

FileCarver supports the ability to modularize a file format definition into multiple separate files, and include the XML from those other files in the main XML definition. This is done by using the xi:include tag with an href attribute. The href attribute specifies a path to a file that will be included; the path may either be relative to the location of the file containing the xi:include tag or it may be an absolute path.

For example, given the following two files:

definitions/bookshelf.xml:
<file name="Bookshelf" xmlns:xi="http://www.w3.org/2001/XInclude">
  <array length="remainder" rowname="Book">
    <xi:include href="includes/book.xml"/>
  </array>
</file>
definitions/includes/book.xml:
<group>
  <string name="Title" length="variable" format="terminated"/>
  <string name="Author" length="variable" format="terminated"/>
  <unsigned-short name="Number of Pages" display="spinner"/>
</group>

FileCarver will then treat definitions/bookshelf.xml as if it was:

<file name="Bookshelf">
  <array length="remainder" rowname="Book">
    <group>
      <string name="Title" length="variable" format="terminated"/>
      <string name="Author" length="variable" format="terminated"/>
      <unsigned-short name="Number of Pages" display="spinner"/>
    </group>
  </array>
</file>

There are several important things to notice in the above example. First, it is necessary to specify xmlns:xi="http://www.w3.org/2001/XInclude" as an attribute on the file tag, if the definition will include sub-files. Second, each sub-file must be a well-formed XML document, and thus everything must be nested within a single top-level tag - in this case the group tag. While FileCarver itself does not require the extra redundant group tag in the definition, the include mechanism does.

It is also possible for included files to include other files, and there is no limit on how deep this nesting can go. Only circular inclusion (file A includes file B which includes, directly or indirectly, file A) is not allowed.

Finally, it is also possible to specify the parse="text" attribute on the xi:include tag to include a file as plain text. This is most useful for including external JavaScript code, which can then be written without needing to XML-escape characters such as > and &.

Example:
<file xmlns:xi="http://www.w3.org/2001/XInclude">
  <script>
    <xi:include parse="text" href="scripts/functions.js"/>
  </script>
  . . .
</file>

21. Re-usable Type Definitions

In addition to including XML from other files, FileCarver allows you to define specific field types at the top of the file format definition, and re-use these field types throughout the rest of the file. This is done by placing a defined-types tag at the top of the format definition, containing one or more type tags. Each type tag must have a unique id attribute, and encapsulates the actual type definition. You can then refer to these global types using a type-ref element.

Example:
<file name="Bookshelf">
  <defined-types>
    <type id="book">
      <string name="Title" length="variable" format="terminated"/>
      <string name="Author" length="variable" format="terminated"/>
      <unsigned-short name="Number of Pages" display="spinner"/>
    </type>
  </defined-types>
  <array length="remainder" rowname="Book">
    <type-ref id="book"/>
  </array>
</file>

This will produce the same result as if the elements in the defined type book were declared inside the actual array. The advantage is that you may have many type-ref fields that refer to the same type, without duplicated code.

An important distinction between defined types and the XML include mechanism covered in the previous section is that type-refs are replaced by their corresponding types only when they exist. That is, they will not be expanded if their own existsif condition, or that of any of their ancestors, evaluates to false. This allows for implementing nested, recursive format definitions - where a type essentially includes itself - but conditionally.

The defined-types must be located after a global script tag, if both exist in a file format definition.


22. File Filters

FileCarver supports user-defined file input and output filters. An input filter allows you to specify a transformation that is to be done on the file data as it is read, before it is processed by FileCarver. Meanwhile, an output filter specifies the transformation that is to be done on the data when it is saved from FileCarver, prior to it being written to disk.

Example uses of input and output filters may include performing compression or encryption on the file, or some other custom processing on the data, as required by the format.

Filters are implemented in Java, each filter being a Java class. The input filter class must be a subclass of the java.io.InputStream class, and must have a constructor that takes an existing InputStream as a parameter. Extending the java.io.FilterInputStream class is suggested. The fully qualified class name of the input filter class should be set as the value of the inputfilter attribute on the file tag.

The output filter class must be a subclass of the java.io.OutputStream class, and must have a constructor that takes an existing OutputStream as a parameter. Extending the java.io.FilterOutputStream class is suggested. The fully qualified class name of the output filter class should be set as the value of the outputfilter attribute on the file tag.

Compiled filter classes may be placed in the folder filters which should be put in the same directory as FileCarver, in order to be available for use. Filter classes may also come from the standard Java class library.

For example, you may use the built-in Java classes java.util.zip.GZIPInputStream and java.util.zip.GZIPOutputStream as input and output filters respectively, as they conform to FileCarver's requirements for filter classes. This will allow loading and saving of GZIP-compressed files.

Example:
<file inputfilter="java.util.zip.GZIPInputStream" outputfilter="java.util.zip.GZIPOutputStream">
  ...
</file>

It is not difficult to create a custom file filter as a Java class. For instance, if you wish to create an input filter that changes all lowercase ASCII characters that are read to uppercase, you can write a class like the following, extending java.io.FilterInputStream and overriding all the read methods:

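import java.io.*;

// Input filter that converts lowercase ASCII letters to uppercase as data is read.
public class ToUpperInputFilter extends FilterInputStream {

  // FileCarver requires a constructor that takes the InputStream to wrap.
  public ToUpperInputFilter(InputStream in) {
    super(in);
  }

  @Override
  public int read() throws IOException {
    int c = in.read();
    return (c >= 'a' && c <= 'z') ? c + ('A' - 'a') : c;
  }

  @Override
  public int read(byte[] b, int off, int len) throws IOException {
    int n = super.read(b, off, len);
    // n is -1 at end of stream, in which case the loop body never runs.
    for (int i = off; i < off + n; i++) {
      if (b[i] >= 'a' && b[i] <= 'z') {
        b[i] += ('A' - 'a');
      }
    }
    return n;
  }

  @Override
  public int read(byte[] b) throws IOException {
    return read(b, 0, b.length);
  }
}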

Then, compile this code, which would be saved in ToUpperInputFilter.java, to produce the file ToUpperInputFilter.class. Place the class file into the filters folder in the same directory as FileCarver (you may need to create it). This will make the ToUpperInputFilter available for use, as shown below.

Example:
<file inputfilter="ToUpperInputFilter">
  ...
</file>

Please note that the above example is simplified, and does not take into consideration the format of the file. It may be necessary, for some filters, to only process specific regions of the file, which can be done by keeping track of the current offset into the file.


23. Command Line Arguments

Command line arguments may be used to modify how FileCarver behaves when it is launched.

To have FileCarver open one or more files when launched, simply specify the file paths of those files as command line arguments. Since FileCarver does not know which file format definition you wish to use to open those files, you will be presented with a prompt allowing you to select the file format definition to use for each file.

Alternatively, you may specify the path to a single file format definition which will be used to open the data file(s) immediately. This is done by using the -d command line option, followed by the path to the definition file, before any data file paths.

For example, if the following command line arguments are given to FileCarver:

-d /path/to/some/file1.xml /path/to/some/other/file2.dat

Then, FileCarver will launch, loading only the file1.xml file format definition, and will immediately open the data file file2.dat using that file format definition. More than one data file may be specified.


24. Environment Variables

Two environment variables can be used to control the behaviour of FileCarver: FC_DEFINITIONS_DIR and FC_DEFAULT_FILE_DIR.

The FC_DEFINITIONS_DIR environment variable specifies the path to the directory that FileCarver will use to read file format definitions from. When FC_DEFINITIONS_DIR is unspecified, FileCarver will look in a directory named 'definitions' in the same directory where FileCarver is located.

The FC_DEFAULT_FILE_DIR environment variable specifies the default directory shown in FileCarver's Open File and Save File dialog boxes. When this environment variable is not set, both dialog boxes start in the directory where FileCarver is located.
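
The fallback rules for both variables can be summarised by the sketch below. It only mirrors the behaviour described above and is not taken from FileCarver itself. The class and method names are hypothetical, and treating an empty value the same as an unset one is an assumption.

```java
import java.io.File;
import java.util.Map;

// Illustrative only: mirrors the fallback rules described above.
class FileCarverDirs {
  // installDir is the directory where FileCarver is located.
  static File definitionsDir(Map<String, String> env, File installDir) {
    String value = env.get("FC_DEFINITIONS_DIR");
    if (value == null || value.isEmpty()) {   // assumption: empty means unset
      return new File(installDir, "definitions");
    }
    return new File(value);
  }

  static File defaultFileDir(Map<String, String> env, File installDir) {
    String value = env.get("FC_DEFAULT_FILE_DIR");
    if (value == null || value.isEmpty()) {   // assumption: empty means unset
      return installDir;
    }
    return new File(value);
  }
}
```

In a real program the first argument would typically be System.getenv(); taking a Map instead keeps the sketch easy to try with made-up values.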

Setting environment variables is specific to each operating system, and is thus not covered in this guide.


25. Frequently Asked Questions

I have defined a file format for FileCarver that corresponds field-for-field to a C struct that I am using in my program. Why do files that work fine in my program not display correctly when I open them with FileCarver?

There are two possibilities. First, check that the correct endianness is set. If you are using an x86 computer (Intel, AMD, etc), then you want to set the format attribute to 'little-endian' on the file tag, which is the same as setting it on every numeric field in the definition. This will ensure that fields are read with the correct endianness.

If you are certain that you have set the endianness correctly on fields in the file format definition, then you are most likely experiencing a problem with implicit struct padding by your C compiler. The compiler inserts unnamed padding bytes into the struct during compilation, so that the offsets of its members and the total size of the struct satisfy the alignment requirements of the target machine. A simple example is the C compiler adding one hidden byte to struct { short a; char b; } to make it 4 bytes long, instead of 3.

With many compilers, you can set options that will make the compiler issue warnings when this is happening. For example, if you are using GCC, pass the flag -Wpadded on the command line to the compiler to receive the appropriate warnings. Once you have identified that padding is being implicitly added by the compiler, it is best to make it explicit as named members in the C struct, and add matching hidden fields to FileCarver's file definition.
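
Both effects can be seen by decoding raw bytes by hand. The sketch below assumes a typical x86 compiler with a 2-byte short, and that struct { short a; char b; } was written with a = 0x1234 and b = 'A'. The class name and the byte values are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Illustrative only: the 4 bytes assumed to be on disk for
// struct { short a; char b; } with a = 0x1234 and b = 'A'.
class PaddedStructDemo {
  // a (2 bytes, low byte first), b (1 byte), padding (1 byte).
  // The padding value is unspecified in C; zero is used here.
  static final byte[] RAW = { 0x34, 0x12, 0x41, 0x00 };

  // Reads field a using the given byte order.
  static short readA(ByteOrder order) {
    return ByteBuffer.wrap(RAW).order(order).getShort(0);
  }

  public static void main(String[] args) {
    System.out.printf("little-endian a = 0x%04X%n", readA(ByteOrder.LITTLE_ENDIAN));
    System.out.printf("big-endian    a = 0x%04X%n", readA(ByteOrder.BIG_ENDIAN));
    System.out.println("b = " + (char) RAW[2]);
    System.out.println("record size = " + RAW.length + " bytes, not 3");
  }
}
```

Only the little-endian read recovers 0x1234. A definition that omits the fourth byte would read every following field, or every following record, one byte too early.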

I want to be able to add new fields later on to my file formats, and re-open my old files and convert them to the new formats with extra fields. Can this be done with FileCarver?

Absolutely. Your format must have a field that specifies the current version of the file format. Then, you can use the 'existsif' attribute on the new fields that you add.

For example, suppose you have the following format:

<file>
  <unsigned-byte name="Format Version" id="version" value="1" editable="false"/>
  <int name="Field 1"/>
  <string name="Field 2" length="12"/>
</file>
Later, if you decide to add an extra field to the format, you can update the file format definition to the following:
<file>
  <unsigned-byte name="Format" id="version" display="list" value="2">
    <option value="1">Version 1</option>
    <option value="2">Version 2</option>
  </unsigned-byte>
  <int name="Field 1"/>
  <string name="Field 2" length="12"/>
  <byte name="Field 3" existsif=" ( version == 2 ) "/>
</file>
The new definition supports both version 1 and version 2 file formats. So, how would a version 1 file be converted to version 2? Very easily. You would open the version 1 file with the new format definition and select "Version 2" from the drop-down list for the Format field. "Field 3" will appear in the user interface, where you can now enter a value. When you save the file, it will be using the new version 2 format. You're done!
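
At the byte level, the conversion described above amounts to changing the version byte and appending one byte for the new field. The sketch below shows this under two assumptions: the fields are stored back to back in definition order, and "Field 2" occupies exactly 12 bytes. The class name VersionUpgrade is hypothetical.

```java
import java.util.Arrays;

// Illustrative only: the byte-level effect of converting version 1 to version 2.
class VersionUpgrade {
  static final int V1_SIZE = 1 + 4 + 12;  // Format Version + Field 1 + Field 2

  static byte[] toVersion2(byte[] v1, byte field3) {
    if (v1.length != V1_SIZE || v1[0] != 1) {
      throw new IllegalArgumentException("not a version 1 file");
    }
    byte[] v2 = Arrays.copyOf(v1, V1_SIZE + 1);  // one extra byte for Field 3
    v2[0] = 2;               // the version field now reads 2
    v2[V1_SIZE] = field3;    // Field 3 exists because version == 2
    return v2;
  }
}
```

Field 1 and Field 2 keep their original bytes and offsets, which is why old files remain readable with the new definition.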

FileCarver reports a java.io.UnsupportedEncodingException when loading a file format definition containing a string field that uses a specific character encoding. However, this character encoding is listed as supported in this guide. How can I resolve this problem?

You will need to re-install Java on your machine and enable "support for additional languages" during the installation process.

If you are using Windows, it is possible to activate this functionality without re-installing Java. Go to "Add or Remove Programs" in the Control Panel, select the Java Runtime Environment, click "Change", and activate support for additional languages in the wizard that appears.


© 2006-2008 Alexei Svitkine. All Rights Reserved.