Text Formatting Settings
When you print to Miraplacid Text Driver,
the Preview window pops up.
If you click the "Settings" button on the Preview window toolbar,
the Settings dialog opens.
The Settings dialog has several tabs.
This document describes
the Text Formatting Settings.
- Character set - New versions of Windows print text in Unicode.
You can keep it in Unicode or convert it to a legacy 8-bit encoding. Select
your character set from the dropdown list.
- Insert Unicode prefix - Some text editors add the bytes 0xFF 0xFE (the UTF-16 little-endian byte order mark) to the beginning of a Unicode text file.
We recommend adding the prefix unless you are sure your software does not support it.
- End of line style - You can choose between Windows (CR LF) and Unix (LF) style.
We recommend using Windows style unless you use legacy Unix software.
- Insert page breaks - Adds a page break symbol to the output.
As an alternative, you can add {PAGE} to the output file name to save pages to individual files.
- Formatting Style
Miraplacid Text Driver can format extracted text in different ways. Unfortunately, there is no way
to make it look exactly like the original document. Plain text files do not support
different font types and sizes and cannot condense or expand characters. However,
the Miraplacid Text Driver text formatting plug-ins do a good job in most cases.
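For readers who post-process the output file in their own scripts, the following is a minimal Python sketch of how the Unicode prefix, end of line style, and page break settings affect reading the file. It is not part of Miraplacid Text Driver. The function name read_driver_output and the fallback encoding cp1252 are hypothetical, and the sketch assumes the page break symbol is the form feed character (0x0C), which this document does not state.

```python
import codecs
from typing import List


def read_driver_output(raw: bytes, fallback_encoding: str = "cp1252") -> List[str]:
    """Decode raw output bytes and split them into pages (illustrative helper)."""
    if raw.startswith(codecs.BOM_UTF16_LE):
        # The 0xFF 0xFE prefix marks UTF-16 little-endian text; drop it and decode.
        text = raw[len(codecs.BOM_UTF16_LE):].decode("utf-16-le")
    else:
        # No prefix: assume an 8-bit character set (hypothetical default).
        text = raw.decode(fallback_encoding)
    # Normalize Windows (CR LF) line ends to Unix (LF) so both styles read alike.
    text = text.replace("\r\n", "\n")
    # Assumption: pages are separated by a form feed character.
    return text.split("\f")
```

For example, calling read_driver_output(open("output.txt", "rb").read()) on a file saved with the prefix and page breaks turned on would return one string per page, provided the form feed assumption holds for your output.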
Formatting Styles
If you need your text to look like the original document, select Formatted text.
If you need just the text without formatting, select Plain text.
If you are familiar with XML processing, you can try XML output.
It saves textboxes with their text, size, location, and
font information. It also contains page size and DPI settings.
The RSS formatting style saves information in the Web content syndication format (RSS)
for further use in news exchange services.
Text with Layout is similar to Formatted text, but is based on the previous version of
the text formatting algorithm.
We recommend using "Formatted text" unless "Text with Layout" works better for you.
Formatting Style Settings
- Formatted text and Text with Layout - You can turn Print margins on or off.
When turned on, Miraplacid Text Driver adds blank borders to the formatted text. Border sizes are
calculated to match the print margin settings of the document you are extracting text from.
- Plain text - This formatting style simply merges all pieces of text in each line.
By default it adds a blank character between them, but you can change it by updating the Delimiter
value.
- XML - The Optimize output option merges individual textboxes into whole words when words have been split into several pieces.
The bounding coordinates of the merged textboxes are merged as well when this option is turned on.
Whitespace is removed from the output.
- RSS - The Link and Description settings correspond to the matching fields of the RSS
channel attributes. Additionally, the Add <BR> to EOL option adds a line break tag at the end of each line so the text looks formatted.
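To illustrate what the Optimize output option for XML does conceptually, here is a minimal Python sketch. It is not the driver's actual algorithm, and the TextBox structure and function names are hypothetical. The sketch assumes the textboxes arrive in reading order, that consecutive non-whitespace boxes belong to the same word, and that a whitespace-only box marks a word boundary.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TextBox:
    # Hypothetical structure; the real XML output may use different fields.
    text: str
    left: float
    top: float
    right: float
    bottom: float


def merge_boxes(a: TextBox, b: TextBox) -> TextBox:
    """Join two pieces of a word and take the union of their bounds."""
    return TextBox(
        a.text + b.text,
        min(a.left, b.left),
        min(a.top, b.top),
        max(a.right, b.right),
        max(a.bottom, b.bottom),
    )


def optimize(boxes: List[TextBox]) -> List[TextBox]:
    """Merge split word pieces and drop whitespace-only boxes."""
    words: List[TextBox] = []
    current: Optional[TextBox] = None
    for box in boxes:
        if not box.text.strip():
            # Whitespace ends the current word and is itself removed.
            if current is not None:
                words.append(current)
                current = None
            continue
        current = box if current is None else merge_boxes(current, box)
    if current is not None:
        words.append(current)
    return words
```

Under these assumptions, the pieces "Hel" and "lo" followed by a space and "world" become two textboxes, "Hello" and "world", where the bounds of "Hello" cover both original pieces.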
See also: