
Digital Hub File Alerts

The Digital Hub automatically checks uploaded .docx, .sam, and .scml files for validity and an assortment of known issues.

Because the range of potential errors is wide, alerts often cannot identify the exact issue or the line in the file where it occurs. This guide explains how to interpret and resolve the issues listed by the Digital Hub. Some errors cascade: resolving the initial problem can also clear other errors in the list. Reupload your file or validate it locally after each correction. For further assistance, contact your WFDW support member directly or through scribenet.com’s support page.

Note

A Hub alert does not necessarily mean that the issue is an error, only that it is a potential error. A user can determine in some circumstances that a Hub alert may be disregarded.

In the majority of cases, however, the alert indicates a critical issue to which a user should attend to get the desired output.

Reference

Digital Hub Conversion Settings documentation.

DTD Validation Errors

File Is Not Well-Formed

Message

The file is not well-formed.

A file is not well-formed if there is a basic XML syntax error. Check for opening tags that do not close, closing tags that do not open, lines that are missing paragraph tags, or tags that have been mistyped.

In order for a file to be well-formed, it must contain proper nesting and character encoding.

  • Check that all tags open and close in order.
  • Check that ampersands are encoded as &amp; (including when ampersands appear in URLs).
  • Check that literal less-than and greater-than characters are encoded as &lt; and &gt;.
  • Check that all commented-out text is formatted with the proper syntax (e.g., <!-- text here -->).
  • Check that attributes are only listed once (e.g., the height of an image is not defined twice).
  • Check that attributes include the proper syntax (e.g., quotation marks open and close).
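
When validating locally, any XML parser can report the first well-formedness error. The following is a minimal sketch using Python's standard library; the function name is hypothetical, and this is not the Digital Hub's own checker. The parser does not load the DTD, so it tests syntax only, not DTD validity.

```python
import xml.etree.ElementTree as ET

def first_syntax_error(xml_text):
    """Return None if xml_text is well-formed; otherwise return the
    (line, column) position of the first error the parser reports."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return err.position
    return None

if __name__ == "__main__":
    # An unencoded ampersand is reported as a syntax error.
    print(first_syntax_error("<p>Smith & Sons</p>"))
    # The encoded form parses cleanly.
    print(first_syntax_error("<p>Smith &amp; Sons</p>"))
```

Because some errors cascade, it is usually enough to fix the first reported position and run the check again.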

sam or ScML Does Not Follow DTD

Message

Element sam/scml content does not follow the DTD, expecting [. . .].

Validation errors indicate any violation of the grammatical rules laid out for a file type by its DTD.

For example, this can include when <p> paragraphs fall outside of <chapter> tags in an ScML file or when any text has been found outside of paragraph tags.

Often, the cause is a character style that falls outside a paragraph style. (Validate the file using a text program and sort the validation list of “found” styles, removing duplicates, to see if any character styles are present.)

At other times, it may be a random space or character outside of paragraph tags. (If this is the case, the validation error message will likely indicate “CDATA” was found.)

Some causes of this include unstyled content in an InDesign file that did not map to the .sam DTD before export and manual errors introduced by a person manipulating a file locally before processing it.

  • Check that all character styles appear within paragraph styles.
  • Check that all text is contained within paragraph styles.
  • Check that all paragraphs in .scml appear within <chapter> tags.

Message

Element [style name] was declared #PCDATA but contains non text nodes.

This error indicates improper nesting. In a .sam file, for example, there can be no nesting of character styles, so one character style needs to close before any other can begin.

Message

Syntax of value for attribute id of chapter is not valid.

A character that is not considered a "name character" is in an attribute value or the characters are not appearing in an acceptable order. Only letters, numbers, and underscores should be used in attribute values, and the value must start with a letter (not a number or underscore).
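
The rule above can be tested before upload. This is a minimal sketch that encodes only the rule as stated in this guide (a letter first, then letters, numbers, and underscores); the DTD itself may permit other name characters, and the function name is hypothetical.

```python
import re

# Assumption: the guide's rule, not the full XML name grammar.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")

def is_safe_id(value):
    """Return True if value starts with a letter and contains only
    letters, numbers, and underscores."""
    return ID_PATTERN.fullmatch(value) is not None

if __name__ == "__main__":
    for candidate in ("ch01", "1ch", "_ch", "ch 1"):
        print(candidate, is_safe_id(candidate))
```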

DTD Not Found

Message

The DTD could not be found.

This message indicates that there is no doctype declaration at the top of the file pointing to the DTD or the doctype references a DTD location that cannot be found. This could happen if you’ve made your own XML outside of the Digital Hub or have changed a file that had been created in the Hub.

File Expectations for .sam

Message

File does not match expectations for flat sam.

ScML files contain information such as <book> and <chapter> division tags that should not appear in .sam files, which are considered to be “flat” XML files.

File Expectations for ScML

Message

File does not match expectations for structured ScML.

ScML files contain more information than .sam files, such as <book> and <chapter> division tags. This message will alert you if these necessary tags are missing.

File Alerts

Blank Paragraphs

Message

Contains blank paragraphs.

A blank paragraph can either be a self-closing tag or a paragraph that contains only a page ID (e.g., <p/> or <p><page id="p3"/></p>).

Check for a paragraph that contains only a page ID. A page ID is not considered content; it is simply a marker of location. If found, place the page ID into the appropriate paragraph and delete the empty line.

Check for any self-closing paragraph tags.

In Word, check for any blank paragraphs within the .docx file.

To find blank paragraphs in a text-only file (.sam, .scml), search for the following regular expressions:

    <([a-z\d]+)><page id="p([a-z\d]+)"/></([a-z\d]+)>

and

    <([a-z\d]+)/>
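
The two search patterns above can also be run from a short script. The sketch below applies them line by line and reports the line number of each match; the function name is hypothetical. The second pattern matches any self-closing tag without attributes, so review each hit before deleting it.

```python
import re

# The two patterns given in this guide.
PAGE_ONLY = re.compile(r'<([a-z\d]+)><page id="p([a-z\d]+)"/></([a-z\d]+)>')
SELF_CLOSING = re.compile(r'<([a-z\d]+)/>')

def find_blank_paragraphs(text):
    """Return (line_number, matched_text) pairs for every match."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        for pattern in (PAGE_ONLY, SELF_CLOSING):
            for match in pattern.finditer(line):
                hits.append((number, match.group(0)))
    return hits

if __name__ == "__main__":
    sample = '<p>Text</p>\n<p/>\n<p><page id="p3"/></p>\n'
    for number, found in find_blank_paragraphs(sample):
        print(number, found)
```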

Page IDs Out of Order

Message

Page markers are out of order beginning at the referenced page.

This warning is generated when page IDs appear out of sequence or if there are duplicate page IDs. If there is more than one file in the assets window, the Digital Hub’s checker searches across those files. The checker looks only for roman numerals and arabic numbers.

A page marker order alert will be triggered in the following scenarios within a single file or across multiple files:

  • An expected page ID is missing, including page 1, when page IDs transition from roman numerals to arabic numbering (e.g., if <page id="p2"/> follows a roman numeral page ID).
  • A page ID appears out of sequence (e.g., <page id="p1"/> follows <page id="p2"/>).
  • A page ID appears more than once.
  • Letters, rather than arabic or roman numerals, have been used for page numbering (e.g., <page id="pa"/> and <page id="pb"/>).

Files can begin at any page and not generate a page marker order alert if the page ID numbering proceeds consistently from that point. Because the Digital Hub checks only for the consistent page ID sequence within a file or set of files, a page marker order alert will not be triggered in the following scenarios:

  • The only page ID missing is the first page expected in the file (e.g., <page id="pi"/> or <page id="p1"/>).
  • Page locators appear out of order (e.g., <page locator="pv"/>).

This message indicates only the first instance found. After resolving it, recheck to confirm that no other page ID errors are present.

Note: Page IDs being out of order may be acceptable in certain instances due to the placement of figures. See also Page ID Placement and Reading Order.
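
The scenarios above can be approximated locally. The following is a rough sketch, not the Digital Hub's actual checker: it assumes page IDs are written exactly as `<page id="p…"/>`, treats any string made only of the letters i, v, x, l, c, d, and m as a roman numeral, and returns the first page ID that breaks the sequence. All function names are hypothetical. To check several files, concatenate their text in reading order first.

```python
import re

PAGE_ID = re.compile(r'<page id="p([^"]+)"/>')
ROMAN_VALUES = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}

def roman_to_int(text):
    """Return the value of a lowercase roman numeral, or None."""
    if not text or any(ch not in ROMAN_VALUES for ch in text):
        return None
    total, largest = 0, 0
    for ch in reversed(text):
        value = ROMAN_VALUES[ch]
        if value >= largest:
            total += value
            largest = value
        else:
            total -= value  # subtractive form, as in "iv"
    return total

def parse_page_id(value):
    """Return ("arabic", n), ("roman", n), or None for other text."""
    if re.fullmatch(r"[0-9]+", value):
        return ("arabic", int(value))
    number = roman_to_int(value)
    return ("roman", number) if number is not None else None

def first_page_order_problem(text):
    """Return the first out-of-order page ID (e.g. "p2"), or None."""
    previous = None
    for value in PAGE_ID.findall(text):
        current = parse_page_id(value)
        if current is None:
            return "p" + value  # letters rather than numerals
        if previous is not None:
            if current[0] == previous[0]:
                if current[1] != previous[1] + 1:
                    return "p" + value  # gap, duplicate, or reversal
            elif current != ("arabic", 1):
                return "p" + value  # numbering must restart at page 1
        previous = current
    return None
```

Page locators (`<page locator="…"/>`) are ignored because the pattern matches only the `id` attribute, and a file may start at any page because the first ID found is never compared against anything.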

Footnote/Endnote Numbering Pattern

Message

Note does not match expected pattern of “[note marker] [note content].”

For the Digital Hub to link <fn> or <en> paragraphs with the appropriate <fnref> or <enref> markers within the text, the <fnnum> or <ennum> must appear with a specific pattern. If an <fnnum> or <ennum> is not located at the beginning of its <fn> or <en> paragraph, the note cannot be linked.
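
To locate notes that would fail this check, a script can list every note paragraph that does not begin with its number. This is a minimal sketch under two assumptions that the guide does not confirm: that `<fn>` and `<en>` tags carry no attributes, and that whitespace before the number is tolerated. The function name is hypothetical.

```python
import re

# Matches <fn>...</fn> and <en>...</en>; <fnnum> and <enref> do not match
# because the tag name must be followed immediately by ">".
NOTE = re.compile(r"<(fn|en)>(.*?)</\1>", re.DOTALL)

def notes_missing_leading_number(text):
    """Return the note paragraphs whose content does not start with
    <fnnum> (for <fn>) or <ennum> (for <en>)."""
    problems = []
    for match in NOTE.finditer(text):
        kind, body = match.group(1), match.group(2)
        if not body.lstrip().startswith("<" + kind + "num>"):
            problems.append(match.group(0))
    return problems
```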

Footnote/Endnote Numbering Mismatch

Message

The number of fnref character styles in the file does not match the number of fnnum styles for footnotes.
The number of enref character styles in the file does not match the number of ennum styles for endnotes.

For the Digital Hub to link <fn> or <en> paragraphs with their references in the body text, the number of <fnnum> or <ennum> tags must match the number of <fnref> or <enref> tags. If the number of note numbers and note references does not match, the notes cannot be linked.
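
A quick count of each tag shows whether the totals agree before the file is uploaded. The sketch below counts opening tags only; the function name is hypothetical, and the Hub's own counting method is not documented here.

```python
import re

def note_count_mismatches(text):
    """Return {reference_tag: (reference_count, number_count)} for each
    note type whose counts differ; an empty dict means they match."""
    mismatches = {}
    for ref_tag, num_tag in (("fnref", "fnnum"), ("enref", "ennum")):
        # "[\s>]" after the name keeps closing tags and longer names out.
        refs = len(re.findall(rf"<{ref_tag}[\s>]", text))
        nums = len(re.findall(rf"<{num_tag}[\s>]", text))
        if refs != nums:
            mismatches[ref_tag] = (refs, nums)
    return mismatches
```

Matching counts do not guarantee that every reference points to the right note, only that the totals agree.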

Footnotes within Tables

Message

Footnotes detected inside of table cells.

Adobe InDesign does not support notes embedded within tables, so footnotes inside table cells cannot be embedded in IDTT output. Move the footnotes outside of the table cells.

Cover Division Not Found

Message

Cover division could not be found.

When converting to an e-book, the cover should be listed in its own division tag within the .scml file.

Specifically, the cover should be listed in this way (with the appropriate file name):

    <chapter id="cvi">
      <cover>
        <fig><img src="scr-filename.jpg"/></fig>
      </cover>
    </chapter>

Word File Is Not .docx

Message

File does not appear to be a conformant *.docx file. Resave the file from MS Word.

The Digital Hub can only work with Word documents saved in the .docx format. If a file has been given the .docx extension but was not actually saved in that format, open the file in Word (you may need to open it in recovery mode) and resave it as .docx.
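
A renamed file can often be spotted before upload, because a genuine .docx file is a ZIP archive. The following is a rough sketch, not the Hub's conformance test: it checks that the file is a ZIP archive containing `[Content_Types].xml` and the conventionally named `word/document.xml` part. The function name is hypothetical.

```python
import zipfile

def looks_like_docx(path):
    """Return True if the file at path is a ZIP archive containing the
    parts a Word-saved .docx normally has."""
    if not zipfile.is_zipfile(path):
        return False  # e.g. an .rtf or .doc file that was only renamed
    with zipfile.ZipFile(path) as archive:
        names = set(archive.namelist())
    return "[Content_Types].xml" in names and "word/document.xml" in names
```

A file that fails this check should be opened in Word and resaved; a file that passes may still be rejected by the Hub for other reasons.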

Tables within Tables

Message

File contains tables nested within table cells. First instance is listed.

Nesting tables within other tables is not supported for .docx files. This message will flag the first instance of this. Go to the source file and adjust all tables as needed.

Table Grid Error

Message

At least one row of your table does not contain the correct number of cells. Text from that row is included.

Review the table to determine where the incorrect number of cells is occurring. Table errors such as this typically occur when a table has been manipulated in .sam or .scml.

Hidden Text

Message

This *.docx has at least one instance of hidden text. First instance is listed.

Use the SAI’s “Potentially hidden text” tool to find any content that may not be visible to the user but would interfere with conversion in the Digital Hub.

Unstyled Content

Message

File contains unstyled content. First instance is listed.

This message indicates that there is content in the .docx file that does not use an ScML style. The first instance will be listed, but there may be more. Check globally to confirm all content has an ScML style applied to it.

Only ScML styles can be used when converting files through the Digital Hub. Refer to the ScML list to determine the proper style to use in place of the listed non-ScML style.

Note: ScML styles must always be lowercase. This error will also be generated if capital letters have been used. When this error occurs, the file will also receive an error message about not being DTD valid.

Private Use Area Unicode

Message

Instances of Private Use Area Unicode encountered.

These codepoints do not have characters assigned, and you may need a specialized font to render them.

Some fonts purposefully use “unmapped” portions of the Unicode range (hence “Private Use Area”). They assign glyphs not supported by Unicode to those codepoints within the context of the font. These fonts are still considered Unicode compliant, but they will require the specialized font to be embedded in order to render properly.
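
To find the affected characters in a text-only file, a script can list every codepoint in the Unicode "Private Use" category (general category Co, which covers U+E000 to U+F8FF and the two supplementary private use planes). This is a minimal sketch with a hypothetical function name; it only locates the characters and cannot tell which font they were intended for.

```python
import unicodedata

def private_use_characters(text):
    """Return (index, "U+XXXX") for each Private Use Area character."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"
    ]

if __name__ == "__main__":
    print(private_use_characters("a\ue000b"))
```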

Structure Indicator Syntax

Message

A structure indicator has incorrect syntax.

Structure indicators require specific formatting (e.g., {~?~ST: begin sidebar} or {~?~ST: end sidebar}). Review the formatting for proper use of curly brackets, tildes, and spacing.

Structure Indicator Mismatch

Message

A structure indicator does not close before the end of the file.

For each opening structure indicator (e.g., {~?~ST:begin sidebar}), there must be a corresponding closing structure indicator (e.g., {~?~ST:end sidebar}). Review the file to determine which structure indicators may be improperly paired.

Structure Indicator Cannot Be Resolved

Message

A structure indicator is placed in a way that cannot be resolved.

Structure indicators must be closed in order. If structure indicators close out of the sequence in which they open, the Hub cannot correctly interpret them to create nested structures in ScML.

This could mean that an "end" tag does not have a corresponding "begin" tag. Search for the text indicated as "nearby text" in the validation note to find the unpaired structure indicator.
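
Pairing and ordering problems can be found locally with a stack: each "begin" is pushed, and each "end" must match the most recent unclosed "begin". The following is a rough sketch, not the Hub's own logic. It assumes the indicator syntax shown in this guide and accepts the colon with or without a following space, since both forms appear in the examples. The function name is hypothetical.

```python
import re

INDICATOR = re.compile(r"\{~\?~ST:\s*(begin|end)\s+([^}]+?)\s*\}")

def check_structure_indicators(text):
    """Return a list of problem descriptions; empty means balanced."""
    problems = []
    open_stack = []
    for match in INDICATOR.finditer(text):
        action, name = match.group(1), match.group(2)
        if action == "begin":
            open_stack.append(name)
        elif not open_stack:
            problems.append(f"end {name}: no matching begin")
        elif open_stack[-1] != name:
            problems.append(
                f"end {name}: closes out of order, expected end {open_stack[-1]}"
            )
        else:
            open_stack.pop()
    # Anything still open was never closed before the end of the file.
    problems.extend(f"begin {name}: never closed" for name in open_stack)
    return problems
```

Indicators with malformed syntax (for example, a missing tilde or curly bracket) are not matched by the pattern at all, so they will not appear in the results and need to be found separately.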

Unresolved Images

Message

The following text is tagged as an image but needs to be resolved to an image element.

This message will appear if <img> tags are used on nonimage text. Review any <img> tags to determine whether the tag has been applied incorrectly to live text or if the image callout needs to be adjusted for proper formatting.