There are two types of HTML files: structured documents that use headings (H1, H2, etc.), which HTMLDOC calls "books", and unstructured documents that do not use headings, which HTMLDOC calls "web pages".
A very common mistake is to try converting a web page using:
htmldoc -f filename.pdf filename.html
which will likely produce a PDF file with no pages. To convert web page files you must use the --webpage option on the command line or choose Web Page in the input tab of the GUI.
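For scripted conversions, the book/web page choice can be made explicit when the command is assembled. The Python sketch below is only an illustration: the helper name `htmldoc_command` and its parameters are hypothetical, and it uses only the `-f` and `--webpage` options mentioned above.

```python
def htmldoc_command(html_path, pdf_path, webpage=False):
    """Build an htmldoc argument list (hypothetical helper, not part of HTMLDOC)."""
    args = ["htmldoc"]
    if webpage:
        # Unstructured documents without headings need --webpage.
        args.append("--webpage")
    # -f names the output file; the input file comes last.
    args += ["-f", pdf_path, html_path]
    return args


if __name__ == "__main__":
    print(" ".join(htmldoc_command("filename.html", "filename.pdf", webpage=True)))
```

On a system where `htmldoc` is installed, the returned list can be passed to `subprocess.run(..., check=True)`.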
HTMLDOC does not support HTML 4.0 elements, attributes, stylesheets, or scripting.
Element | Version | Supported? | Notes |
---|---|---|---|
!DOCTYPE | 3.0 | Yes | DTD is ignored |
A | 1.0 | Yes | See Below |
ACRONYM | 2.0 | Yes | No font change |
ADDRESS | 2.0 | Yes | |
AREA | 2.0 | No | |
B | 1.0 | Yes | |
BASE | 2.0 | No | |
BASEFONT | 1.0 | No | |
BIG | 2.0 | Yes | |
BLINK | 2.0 | No | |
BLOCKQUOTE | 2.0 | Yes | |
BODY | 1.0 | Yes | |
BR | 2.0 | Yes | |
CAPTION | 2.0 | Yes | See Below |
CENTER | 2.0 | Yes | |
CITE | 2.0 | Yes | Italic/Oblique |
CODE | 2.0 | Yes | Courier |
DD | 2.0 | Yes | |
DEL | 2.0 | Yes | Strikethrough |
DFN | 2.0 | Yes | Helvetica |
DIR | 2.0 | Yes | |
DIV | 3.2 | Yes | |
DL | 2.0 | Yes | |
DT | 2.0 | Yes | Italic/Oblique |
EM | 2.0 | Yes | Italic/Oblique |
EMBED | 2.0 | Yes | HTML Only |
FONT | 2.0 | Yes | See Below |
FORM | 2.0 | No | |
FRAME | 3.2 | No | |
FRAMESET | 3.2 | No | |
H1 | 1.0 | Yes | Boldface, See Below |
H2 | 1.0 | Yes | Boldface, See Below |
H3 | 1.0 | Yes | Boldface, See Below |
H4 | 1.0 | Yes | Boldface, See Below |
H5 | 1.0 | Yes | Boldface, See Below |
H6 | 1.0 | Yes | Boldface, See Below |
HEAD | 1.0 | Yes | |
HR | 1.0 | Yes | See Below |
HTML | 1.0 | Yes | |
I | 1.0 | Yes | |
IMG | 1.0 | Yes | See Below |
INPUT | 2.0 | No | |
INS | 2.0 | Yes | Underline |
ISINDEX | 2.0 | No | |
KBD | 2.0 | Yes | Courier Bold |
LI | 2.0 | Yes | |
LINK | 2.0 | No | |
MAP | 2.0 | No | |
MENU | 2.0 | Yes | |
META | 2.0 | Yes | See Below |
MULTICOL | N3.0 | No | |
NOBR | 1.0 | No | |
NOFRAMES | 3.2 | No | |
OL | 2.0 | Yes | |
OPTION | 2.0 | No | |
P | 1.0 | Yes | |
PRE | 1.0 | Yes | |
S | 2.0 | Yes | Strikethrough |
SAMP | 2.0 | Yes | Courier |
SCRIPT | 2.0 | No | |
SELECT | 2.0 | No | |
SMALL | 2.0 | Yes | |
SPACER | N3.0 | Yes | |
STRIKE | 2.0 | Yes | |
STRONG | 2.0 | Yes | Boldface Italic/Oblique |
SUB | 2.0 | Yes | Reduced Fontsize |
SUP | 2.0 | Yes | Reduced Fontsize |
TABLE | 2.0 | Yes | See Below |
TD | 2.0 | Yes | |
TEXTAREA | 2.0 | No | |
TH | 2.0 | Yes | Boldface Center |
TITLE | 2.0 | Yes | |
TR | 2.0 | Yes | |
TT | 2.0 | Yes | Courier |
U | 1.0 | Yes | |
UL | 2.0 | Yes | |
VAR | 2.0 | Yes | Helvetica Oblique |
WBR | 1.0 | No | |
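<!-- HALF PAGE --> breaks to the next half page.
<!-- PAGE BREAK --> breaks to the next page.
<!-- NEW PAGE --> breaks to the next page.
<!-- NEW SHEET --> breaks to the next sheet.
<!-- NEED length --> breaks if there is less than length units left on the current page. The length value defaults to points but can be suffixed by in, mm, or cm to convert from the corresponding units.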
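The unit handling for the NEED length value can be illustrated with a short Python sketch. It assumes the usual 72 points per inch; the function name `need_length_to_points` and the accepted syntax (an optional decimal part, optional whitespace) are assumptions made for this example, and HTMLDOC's own parser may differ.

```python
import re

# Points per unit, assuming 72 points per inch.
POINTS_PER_UNIT = {
    "": 1.0,            # no suffix: value is already in points
    "in": 72.0,
    "mm": 72.0 / 25.4,
    "cm": 72.0 / 2.54,
}


def need_length_to_points(text):
    """Convert a length such as '36', '2in', '10mm' or '1.5cm' to points."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(in|mm|cm)?\s*", text)
    if match is None:
        raise ValueError(f"unrecognized length: {text!r}")
    number, unit = match.groups()
    return float(number) * POINTS_PER_UNIT[unit or ""]
```

For example, `need_length_to_points("2in")` gives 144.0 points, so `<!-- NEED 2in -->` asks for the same space as `<!-- NEED 144 -->`.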
Requested Font | Actual Font |
---|---|
Arial | Helvetica |
Courier | Courier |
Helvetica | Helvetica |
Monospace | Courier |
Sans-Serif | Helvetica |
Serif | Times |
Symbol | Symbol |
Times | Times |
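The substitution table above amounts to a simple lookup. The Python sketch below restates it as a dictionary; the function name `actual_font` is hypothetical, and the case-insensitive matching and the `None` result for unlisted names are assumptions, since this section does not say how other font names are handled.

```python
# Requested font name (lowercased) -> font actually used, from the table above.
FONT_MAP = {
    "arial": "Helvetica",
    "courier": "Courier",
    "helvetica": "Helvetica",
    "monospace": "Courier",
    "sans-serif": "Helvetica",
    "serif": "Times",
    "symbol": "Symbol",
    "times": "Times",
}


def actual_font(requested):
    """Return the substituted font, or None if the name is not in the table."""
    # Case-insensitive matching is an assumption of this sketch.
    return FONT_MAP.get(requested.strip().lower())
```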
All chapters start with a top-level heading (H1) markup. Any headings within a chapter must be of a lower level (H2 to H6). Each chapter starts a new page or the next odd-numbered page if duplexing is selected.
The headings you use within a chapter must start at level 2 (H2). If you skip levels the heading will be shown under the last level that was known. For example, if you use the following hierarchy of headings:
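<H1>Chapter Heading</H1>
...
<H2>Section Heading 1</H2>
...
<H2>Section Heading 2</H2>
...
<H3>Sub-Section Heading 1</H3>
...
<H4>Sub-Sub-Section Heading 1</H4>
...
<H4>Sub-Sub-Section Heading 2</H4>
...
<H3>Sub-Section Heading 2</H3>
...
<H2>Section Heading 3</H2>
...
<H4>Sub-Sub-Section Heading 3</H4>
...
the table-of-contents that is generated will show:
Chapter Heading
  Section Heading 1
  Section Heading 2
    Sub-Section Heading 1
      Sub-Sub-Section Heading 1
      Sub-Sub-Section Heading 2
    Sub-Section Heading 2
  Section Heading 3
    Sub-Sub-Section Heading 3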
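The placement rule for skipped levels can be sketched in a few lines of Python. This is an illustration of the rule as stated in this section, not HTMLDOC's actual implementation; the function name `toc_depths` is hypothetical.

```python
def toc_depths(levels):
    """Map heading levels (1 for H1, 2 for H2, ...) to table-of-contents depths.

    A heading that skips levels is placed directly under the last
    heading level that was known.
    """
    depths = []
    open_levels = []  # heading levels currently open, outermost first
    for level in levels:
        # Close every open heading at the same or a deeper level.
        while open_levels and open_levels[-1] >= level:
            open_levels.pop()
        depths.append(len(open_levels))
        open_levels.append(level)
    return depths


if __name__ == "__main__":
    # H1, H2, H2, H3, H4, H4, H3, H2, H4 from the example hierarchy.
    print(toc_depths([1, 2, 2, 3, 4, 4, 3, 2, 4]))
    # prints [0, 1, 1, 2, 3, 3, 2, 1, 2]
```

The final H4 follows an H2 with no H3 in between, so it receives depth 2 and appears directly under Section Heading 3.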
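Numbered lists support the following attributes:
VALUE="#" sets the number of the current list item.
TYPE="1" numbers items with decimal numbers (1, 2, 3, ...).
TYPE="a" numbers items with lowercase letters (a, b, c, ...).
TYPE="A" numbers items with uppercase letters (A, B, C, ...).
TYPE="i" numbers items with lowercase Roman numerals (i, ii, iii, ...).
TYPE="I" numbers items with uppercase Roman numerals (I, II, III, ...).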
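As an illustration of what the TYPE values above select, the following Python sketch formats an item number in each style. The function names are hypothetical, and the behaviour past 26 letters (aa, ab, ...) is an assumption of this sketch rather than documented HTMLDOC behaviour.

```python
ROMAN = [(1000, "M"), (900, "CM"), (500, "D"), (400, "CD"), (100, "C"),
         (90, "XC"), (50, "L"), (40, "XL"), (10, "X"), (9, "IX"),
         (5, "V"), (4, "IV"), (1, "I")]


def to_roman(number):
    """Uppercase Roman numeral for a positive integer."""
    parts = []
    for value, symbol in ROMAN:
        while number >= value:
            parts.append(symbol)
            number -= value
    return "".join(parts)


def to_letters(number):
    """Lowercase letters: 1 -> a, 26 -> z, 27 -> aa (assumed continuation)."""
    parts = []
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        parts.append(chr(ord("a") + remainder))
    return "".join(reversed(parts))


def list_label(number, list_type="1"):
    """Format a list item number according to a TYPE attribute value."""
    if list_type == "1":
        return str(number)
    if list_type == "a":
        return to_letters(number)
    if list_type == "A":
        return to_letters(number).upper()
    if list_type == "i":
        return to_roman(number).lower()
    if list_type == "I":
        return to_roman(number)
    raise ValueError(f"unsupported TYPE value: {list_type!r}")
```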
External URL links are fully supported for HTML and PDF output, and internal links (#target and filename.html) are supported in HTML and PDF output.
When generating PDF files, local PDF file links will be converted to external file links for the PDF viewer instead of URL links. That is, you can directly link to another local PDF file from your HTML document with:
<A HREF="filename.pdf">...</A>
HTMLDOC supports the following META attributes for the title page and document information:
<META NAME="AUTHOR" CONTENT="...">
<META NAME="COPYRIGHT" CONTENT="...">
<META NAME="DOCNUMBER" CONTENT="...">
<META NAME="GENERATOR" CONTENT="...">
<META NAME="KEYWORDS" CONTENT="...">
The BREAK attribute is still supported by the HR element:
<HR BREAK>
Support for the BREAK attribute is deprecated and will be removed in a future release of HTMLDOC.
The maximum number of columns in a table is set by the MAX_COLUMNS constant in the config.h file included with the source code.
HTMLDOC supports HTML 3.0 tables with the following exceptions:
The CAPTION element is always shown at the top of the table.
HTMLDOC does not support HTML 4.0 table elements or attributes, such as TBODY, THEAD, TFOOT, or RULES.