Document Metadata

Along with the Flash document file itself, Print2Flash can create a metadata
file that describes the document. You can use this information, for example,
to build an index of your documents for your search engine.
You can turn on metadata file creation:
- when converting documents manually, by setting the Create Metadata File
option and choosing the file format in the Metadata File Format field on the
Output tab of the Document Options window;
- when converting documents programmatically, with the CreateMetaDataFile,
MetaDataFileName and MetaDataFileFormat properties of the Profile object, or
with the CreateMetaDataFile, MetaDataFileName and MetaDataFileFormat options
of Enhanced Batch Processing.
Metadata file formats
There are two supported metadata file formats:
- XML format;
- Plain text format.
The format is chosen with the Metadata File Format field or the
MetaDataFileFormat property mentioned above. Each format is described below.
XML format
The XML format provides a convenient and extensible way to describe the
document. Print2Flash produces XML documents like the following sample:
<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE Print2FlashDoc>
<Print2FlashDoc xmlns="http://print2flash.com">
  <pages pagenum="10">
    <page num="1" width="1632" height="2112" resolution="192">
      <text>Text of page 1</text>
    </page>
    <page num="2" width="1587" height="2245" resolution="192">
      <text>Text of page 2</text>
    </page>
    ...
  </pages>
</Print2FlashDoc>
The generated XML file is encoded in UTF-8. You can parse it with your own
code or with third-party libraries or components. For example, on Windows you
can use Microsoft XML Core Services (MSXML).
Each tag of the XML metadata file is described below.
Print2FlashDoc tag
The root element of the XML document is the Print2FlashDoc tag.
pages tag
Nested inside the root element is the pages tag, which encloses the
descriptions of all document pages. The pages tag has a pagenum attribute,
which contains the total number of pages in the document.
page tag
Nested inside the pages tag are a number of page tags, each corresponding to
a single document page. The page tag has the following attributes:
- num - ordinal page number;
- width - page width in pixels (dots);
- height - page height in pixels (dots);
- resolution - resolution at which the page is rendered, in dots per inch
(DPI). You can calculate the physical page dimensions (in inches) by
dividing the page width and height by this resolution value.
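As an illustration of the dimension calculation described for the resolution attribute, here is a minimal Python sketch. The function name page_size_inches is hypothetical and not part of Print2Flash. The input values are taken from the first page of the sample above.

```python
def page_size_inches(width_px, height_px, resolution_dpi):
    """Convert page dimensions in pixels to inches using the page resolution.

    page_size_inches is an illustrative helper, not a Print2Flash API.
    """
    return width_px / resolution_dpi, height_px / resolution_dpi


# First page of the sample: 1632 x 2112 pixels at 192 DPI.
width_in, height_in = page_size_inches(1632, 2112, 192)
print(width_in, height_in)  # 8.5 11.0
```

With these values the page comes out at 8.5 by 11 inches.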
text tag
The text tag is nested in a page tag. It contains the text that appears on
the page. The text is in the order in which the printing application printed
it. Note that printing applications may send some text as images rather than
as text. Such text is not present in the text tag.
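The tag structure described above can be read with any XML parser. The following is a minimal sketch using Python's standard xml.etree.ElementTree module, assuming the file follows the sample layout shown earlier. The names read_pages, NS and SAMPLE are hypothetical, and the embedded sample is shortened to two pages for illustration.

```python
import xml.etree.ElementTree as ET

# The sample declares a default namespace, so every tag must be looked up
# with it. The "p2f" prefix is an arbitrary local alias.
NS = {"p2f": "http://print2flash.com"}

# Shortened two-page sample modeled on the documented format (assumption).
SAMPLE = b"""<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE Print2FlashDoc>
<Print2FlashDoc xmlns="http://print2flash.com">
  <pages pagenum="2">
    <page num="1" width="1632" height="2112" resolution="192">
      <text>Text of page 1</text>
    </page>
    <page num="2" width="1587" height="2245" resolution="192">
      <text>Text of page 2</text>
    </page>
  </pages>
</Print2FlashDoc>
"""


def read_pages(xml_bytes):
    """Return (total page count, list of per-page dicts) from metadata XML."""
    root = ET.fromstring(xml_bytes)
    pages_el = root.find("p2f:pages", NS)
    pages = []
    for page in pages_el.findall("p2f:page", NS):
        text_el = page.find("p2f:text", NS)
        pages.append({
            "num": int(page.get("num")),
            "width": int(page.get("width")),
            "height": int(page.get("height")),
            "resolution": int(page.get("resolution")),
            # A page with no extractable text may have an empty text tag.
            "text": (text_el.text or "") if text_el is not None else "",
        })
    return int(pages_el.get("pagenum")), pages


total, pages = read_pages(SAMPLE)
for page in pages:
    print(page["num"], page["text"])
```

To process a real metadata file, read it in binary mode and pass the bytes to read_pages, so that the parser applies the encoding declared in the file.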
Plain text format
This format is simpler than the XML format but less flexible. A file
generated in this format is a plain Unicode (UTF-16) text file in
little-endian byte order. It contains the text of all pages merged together,
from the first page to the last. This format gives no way to tell which text
belongs to which page. If you need that, use the XML format.
At the beginning of a plain text metadata file, Print2Flash writes a byte
order mark consisting of two bytes (0xFF followed by 0xFE). This mark
designates little-endian byte order.
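A plain text metadata file can be decoded by removing the byte order mark and treating the rest as UTF-16 little-endian. The following is a minimal Python sketch under that assumption. The function name decode_text_metadata is hypothetical.

```python
import codecs


def decode_text_metadata(data):
    """Decode the bytes of a plain text metadata file into a string.

    Strips the leading byte order mark (0xFF 0xFE) if present, then decodes
    the remaining bytes as UTF-16 little-endian.
    """
    if data.startswith(codecs.BOM_UTF16_LE):  # b"\xff\xfe"
        data = data[len(codecs.BOM_UTF16_LE):]
    return data.decode("utf-16-le")


# Simulated file contents: byte order mark followed by UTF-16LE text.
raw = b"\xff\xfe" + "Text of page 1".encode("utf-16-le")
print(decode_text_metadata(raw))  # Text of page 1
```

For a file on disk, read the bytes with open(path, "rb").read() and pass them to the function. Opening the file in text mode with encoding="utf-16" should work as well, because that codec detects and strips the byte order mark itself.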
Metadata file naming
The metadata file is named according to a name template. For example, you
may create a metadata file with this name:
mydoc.xml
In programmatic conversion, the file name is controlled by the
MetaDataFileName property of the Profile object or by the MetaDataFileName
option of Enhanced Batch Processing. The file name contains two placeholders
for inserting the output document file name and file extension. For example,
to create the file mentioned above when the output document name is
mydoc.swf, set the MetaDataFileName property to this value:
%name%.%ext%
When converting documents manually, the file is named in a similar way and
stored in the folder you specify when saving the Flash document in the
Print2Flash Application with the Save All button.