Digital Hub File Alerts

The Digital Hub automatically checks uploaded .docx, .sam, and .scml files for validity and an assortment of known issues.

Because errors vary widely, an alert often cannot identify the exact issue or the line in the file where it occurs. This guide explains how to interpret and resolve the issues listed by the Digital Hub. Some errors cascade, so resolving the initial problem may clear other listed errors. Reupload your file or validate it locally after each correction. For further assistance, contact your WFDW support member directly or through scribenet.com’s support page.

Note

A Hub alert does not necessarily mean that the issue is an error, only that it is a potential error. A user can determine in some circumstances that a Hub alert may be disregarded.

In the majority of cases, however, the alert indicates a critical issue to which a user should attend to get the desired output.

Reference

Digital Hub Conversion Settings documentation.

DTD Validation Errors

File Is Not Well-Formed

Message

The file is not well-formed.

A file is not well-formed if there is a basic XML syntax error. Check for opening tags that do not close, closing tags that do not open, lines that are missing paragraph tags, or tags that have been mistyped.

In order for a file to be well-formed, it must contain proper nesting and character encoding.

  • Check that all tags open and close in order.
  • Check that ampersands are encoded as &amp; (including when ampersands appear in URLs).
  • Check that literal less-than and greater-than angle brackets in text are encoded as &lt; and &gt;.
  • Check that all commented-out text is formatted with the proper syntax (e.g., <!-- text here -->).
  • Check that attributes are only listed once (e.g., the height of an image is not defined twice).
  • Check that attributes include the proper syntax (e.g., quotation marks open and close).
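
To validate locally before reuploading, a short script can report where the first syntax error occurs. The following is a minimal sketch using Python's standard library parser; it is not the Digital Hub's own checker, and the function name check_well_formed is hypothetical. It checks XML syntax only and does not validate the file against a DTD.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text: str):
    """Return None if xml_text parses, otherwise (line, column, message)."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        line, column = err.position  # line is 1-based, column is 0-based
        return line, column, str(err)
    return None


# An encoded ampersand parses; a bare ampersand does not.
good = "<chapter><p>Smith &amp; Sons</p></chapter>"
bad = "<chapter><p>Smith & Sons</p></chapter>"

print(check_well_formed(good))
print(check_well_formed(bad))
```

The parser stops at the first error, so rerun the check after each correction.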

sam or ScML Does Not Follow DTD

Message

Element sam/scml content does not follow the DTD, expecting [. . .].

Validation errors indicate any violation of the grammatical rules laid out for a file type by its DTD.

For example, this can include when <p> paragraphs fall outside of <chapter> tags in an ScML file or when any text has been found outside of paragraph tags.

Often, the cause is a character style that appears outside of a paragraph style. At other times, it may be a stray space or character outside of paragraph tags. (If this is the case, the validation error message will indicate that “CDATA” was found.)

Some causes of this include unstyled content in an InDesign file that did not map to the .sam DTD before export and manual errors introduced by a person manipulating a file locally before processing it.

  • Check that all character styles appear within paragraph styles.
  • Check that all text is contained within paragraph styles.
  • Check that all paragraphs in .scml appear within <chapter> tags.

Regular expressions can be used in Sublime Text 3 to find where these errors occur.

Message

Element [style name] was declared #PCDATA but contains non text nodes.

This error indicates improper nesting. In a .sam file, for example, there can be no nesting of character styles, so one character style needs to close before any other can begin.

Message

Syntax of value for attribute id of chapter is not valid.

A character that is not considered a "name character" is in an attribute value or the characters are not appearing in an acceptable order. Only letters, numbers, and underscores should be used in attribute values, and the value must start with a letter (not a number or underscore).
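
The rule above can be expressed as a single pattern. This is a sketch only: it assumes that "letters" means unaccented A to Z in either case, which the guide does not state, and the function name is_valid_id is hypothetical.

```python
import re

# Assumption: ASCII letters only. First character is a letter, then any
# mix of letters, digits, and underscores.
ID_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_id(value: str) -> bool:
    """True if the whole attribute value follows the rule described above."""
    return ID_PATTERN.fullmatch(value) is not None


for candidate in ["ch01", "cvi", "1ch", "_ch", "ch 1"]:
    print(candidate, is_valid_id(candidate))
```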

DTD Not Found

Message

The DTD could not be found.

This message indicates that there is no doctype declaration at the top of the file pointing to the DTD or the doctype references a DTD location that cannot be found. This could happen if you’ve made your own XML outside of the Digital Hub or have changed a file that had been created in the Hub.

File Expectations for .sam

Message

File does not match expectations for flat sam.

ScML files contain information such as <book> and <chapter> division tags that should not appear in .sam files, which are considered to be “flat” XML files.

File Expectations for ScML

Message

File does not match expectations for structured ScML.

ScML files contain more information than .sam files, such as <book> and <chapter> division tags. This message will alert you if these necessary tags are missing.

File Alerts

Blank Paragraphs

Message

Contains blank paragraphs or paragraphs that include only page IDs.

A blank paragraph can either be a self-closing tag or a paragraph that contains only a page ID (e.g., <p/> or <p><page id="p3"/></p>).

Check for a paragraph that contains only a page ID. A page ID is not considered content; it is simply a marker of location. If found, place the page ID into the appropriate paragraph and delete the empty line.

Check for any self-closing paragraph tags.

In Word, check for any blank paragraphs within the .docx file.

To find blank paragraphs in a text-only file (.sam, .scml), search for the following:

<([a-z\d]+)><page id="p([a-z\d]+)"/></([a-z\d]+)>

and

<([a-z\d]+)/>
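
The same two searches can be run from a script to list every matching line at once. This is a sketch that reuses the patterns above unchanged; the function name find_blank_paragraphs is hypothetical. Note that the second pattern matches any self-closing tag without attributes, not only paragraph tags.

```python
import re

# The two search patterns from this guide, unchanged.
PAGE_ONLY = re.compile(r'<([a-z\d]+)><page id="p([a-z\d]+)"/></([a-z\d]+)>')
SELF_CLOSING = re.compile(r'<([a-z\d]+)/>')


def find_blank_paragraphs(text: str):
    """Return (line number, line) for each line matching either pattern."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if PAGE_ONLY.search(line) or SELF_CLOSING.search(line):
            hits.append((number, line.strip()))
    return hits


sample = '<p>Text</p>\n<p/>\n<p><page id="p3"/></p>\n<p><page id="p4"/>More</p>'
for number, line in find_blank_paragraphs(sample):
    print(number, line)
```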

Page IDs Out of Order

Message

Page markers are out of order beginning at the referenced page.

This warning is generated when page IDs appear out of sequence or if there are duplicate page IDs. If there is more than one file in the assets window, the Digital Hub’s checker searches across those files. The checker looks only for roman numerals and arabic numbers.

A page marker order alert will be triggered in the following scenarios within a single file or across multiple files:

  • An expected page ID is missing, including page 1, when page IDs transition from roman numerals to arabic numbering (e.g., if <page id="p2"/> follows a roman numeral page ID).
  • A page ID appears out of sequence (e.g., <page id="p1"/> follows <page id="p2"/>).
  • A page ID appears more than once.
  • Letters, rather than arabic or roman numerals, have been used for page numbering (e.g., <page id="pa"/> and <page id="pb"/>).

Files can begin at any page and not generate a page marker order alert if the page ID numbering proceeds consistently from that point. Because the Digital Hub checks only for the consistent page ID sequence within a file or set of files, a page marker order alert will not be triggered in the following scenarios:

  • The only page ID missing is the first page expected in the file (e.g., <page id="pi"/> or <page id="p1"/>).
  • Page locators appear out of order (e.g., <page locator="pv"/>).

This message will only indicate the first found instance. When resolved, recheck to confirm no other page ID errors are present.
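
The scenarios above can be approximated locally. The sketch below is one interpretation of those rules, not the Digital Hub's actual checker: it assumes each page ID must be exactly one greater than the previous one, that arabic numbering restarts at 1 after roman numerals, and that roman numerals never follow arabic numbers. The function names are hypothetical.

```python
import re

PAGE_ID = re.compile(r'<page id="p([a-z\d]+)"/>')  # ignores <page locator=.../>
ROMAN = {"i": 1, "v": 5, "x": 10, "l": 50, "c": 100, "d": 500, "m": 1000}


def page_value(label: str):
    """Return ("arabic", n) or ("roman", n), or None for any other label."""
    if label.isdigit():
        return ("arabic", int(label))
    if label and all(ch in ROMAN for ch in label):
        total = 0
        # A smaller numeral before a larger one is subtracted (iv = 4).
        for ch, nxt in zip(label, label[1:] + " "):
            value = ROMAN[ch]
            total += -value if ROMAN.get(nxt, 0) > value else value
        return ("roman", total)
    return None  # letters such as "a" or "b" are not page numbers


def first_page_order_problem(text: str):
    """Return the label of the first suspicious page ID, or None."""
    previous = None
    for match in PAGE_ID.finditer(text):
        label = match.group(1)
        current = page_value(label)
        if current is None:
            return label
        if previous is not None:
            kind, number = current
            prev_kind, prev_number = previous
            if kind == prev_kind:
                expected = prev_number + 1
            elif prev_kind == "roman":
                expected = 1  # assumption: arabic restarts at page 1
            else:
                return label  # assumption: roman never follows arabic
            if number != expected:
                return label
        previous = current
    return None


print(first_page_order_problem('<page id="pii"/><page id="p2"/>'))
```

Like the Hub message, this reports only the first problem found, so rerun it after each fix. Single letters that are also roman numerals (such as c or d) are read as numerals here.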

Note: Page IDs being out of order may be acceptable in certain instances due to the placement of figures. See also Page ID Placement and Reading Order.

Footnote/Endnote Numbering Pattern

Message

Note does not match expected pattern of “[note marker] [note content].”

For the Digital Hub to link <fn> or <en> paragraphs with the appropriate <fnref> or <enref> markers within the text, the <fnnum> or <ennum> must appear with a specific pattern. If an <fnnum> or <ennum> is not located at the beginning of its <fn> or <en> paragraph, the note cannot be linked.

Note Count

Message

The numbers of footnote references (fnref), footnote numbers (fnnum), and footnote paragraphs (fn) in the file do not match.
The numbers of endnote references (enref), endnote numbers (ennum), and endnote paragraphs (en) in the file (or set of files uploaded as a group) do not match. Numbers listed here pertain to the whole file set being checked, of which this file is the last.

For the Digital Hub to link <fn> or <en> paragraphs with their references in the body text, the number of <fnnum> or <ennum> tags must match the number of <fnref> or <enref> tags. If the note numbers and note references do not match, the notes cannot be linked.
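
To see which count is off before reuploading, you can tally the opening tags yourself. This is a minimal sketch that counts tags with a regular expression; it assumes notes are tagged exactly as <fnref>, <fnnum>, <fn>, <enref>, <ennum>, and <en>, and the function names are hypothetical.

```python
import re
from collections import Counter

# Opening tags only. The trailing [\s>] keeps <en from matching <endnote>.
NOTE_TAG = re.compile(r"<(fnref|fnnum|fn|enref|ennum|en)[\s>]")


def note_counts(text: str) -> Counter:
    """Count opening note tags by tag name."""
    return Counter(NOTE_TAG.findall(text))


def counts_match(counts: Counter, prefix: str) -> bool:
    """prefix is "fn" for footnotes or "en" for endnotes."""
    return counts[prefix + "ref"] == counts[prefix + "num"] == counts[prefix]


sample = ('<p>A<fnref>1</fnref> B<fnref>2</fnref></p>\n'
          '<fn><fnnum>1</fnnum> Note.</fn>')
counts = note_counts(sample)
print(dict(counts), counts_match(counts, "fn"))
```

For endnotes uploaded as a group of files, run the count over the concatenated text of the whole set.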

Footnotes within Tables

Message

Footnotes detected inside of table cells.

Because Adobe InDesign does not support embedded notes within tables, notes cannot be embedded in IDTT output if footnotes are present inside table cells. Move the footnotes outside of the table cells.

Note Style Mix

Message

Footnote and endnote tags are being used within the same note. Attempting to embed notes will cause issues.

An <fnnum> should never be contained within an <en>, and an <fn> should never be contained within an <endnote>. An <ennum> should never be contained within an <fn>, and an <en> should never be contained within a <footnote>. This problem can be caused by incorrect scribing. Trying to embed notes in a file where the footnote and endnote styles are mixed in this way can produce unexpected results.

If you notice this during scribing and the notes are embedded, a few SAI tools can quickly correct this. If the paragraph style is wrong, the issue can be resolved in the Word file using the SAI tool Cleanup > Scribe Paragraphs > fn/en. If the character style is wrong, Cleanup > Scribing Cleanup > Clean note marker will correct it.

You can also search a sam or ScML file for any individual note that may be triggering the warning.

<footnote[^>]*>\s*<en

<endnote[^>]*>\s*<fn

<fn[^>]*>[^\n]*<ennum

<en[^>]*>[^\n]*<fnnum
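
The four searches above can also be run together from a script. This sketch reuses the patterns unchanged and reports the line on which each match starts; the function name find_mixed_notes is hypothetical.

```python
import re

# The four search patterns from this guide, unchanged.
MIXED_NOTE_PATTERNS = [
    re.compile(r"<footnote[^>]*>\s*<en"),
    re.compile(r"<endnote[^>]*>\s*<fn"),
    re.compile(r"<fn[^>]*>[^\n]*<ennum"),
    re.compile(r"<en[^>]*>[^\n]*<fnnum"),
]


def find_mixed_notes(text: str):
    """Return sorted (line number, matched text) pairs for every pattern."""
    hits = []
    for pattern in MIXED_NOTE_PATTERNS:
        for match in pattern.finditer(text):
            line = text.count("\n", 0, match.start()) + 1
            hits.append((line, match.group(0)))
    return sorted(hits)


sample = ('<fn><fnnum>1</fnnum> ok</fn>\n'
          '<fn><ennum>2</ennum> wrong</fn>\n'
          '<footnote>\n'
          '<en><ennum>3</ennum> x</en></footnote>')
for line, text in find_mixed_notes(sample):
    print(line, repr(text))
```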

Note Reference out of Numeric Order

Message

Numbered note references are out of numeric order in the document. First instance is listed.

If a file uses numeric footnote or endnote references, the Digital Hub will check each note reference against the note immediately preceding it. If one <fnref> is <fnref>1</fnref> and the next is <fnref>3</fnref>, the Hub will direct you to the line where <fnref>3</fnref> occurs.

If a number is skipped, look for a note reference that may have been missed between the two scribed note references. Note references out of order can potentially point to other note integrity issues or can be a result of manually numbered notes that were not updated after a note was added or removed. It can also point to issues where the note reference style is applied to text that it should not be applied to.
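
A local version of this comparison might look like the following. It is a sketch under two assumptions that the guide does not confirm: each reference must be exactly one greater than the reference before it, and numbering does not restart partway through the file. The function name is hypothetical; swap fnref for enref to check endnotes.

```python
import re

FNREF = re.compile(r"<fnref>(\d+)</fnref>")


def first_out_of_order_ref(text: str):
    """Return (line number, reference number) for the first break, or None."""
    previous = None
    for number, line in enumerate(text.splitlines(), start=1):
        for match in FNREF.finditer(line):
            value = int(match.group(1))
            # Assumption: references increase by exactly one each time.
            if previous is not None and value != previous + 1:
                return number, value
            previous = value
    return None


sample = "<p>First<fnref>1</fnref></p>\n<p>Second<fnref>3</fnref></p>"
print(first_out_of_order_ref(sample))
```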

Note Reference/Number Text Mismatch

Message

The linked note numbers and references indicate that some of these notes may not match. First instance of mismatched note ref/num is listed.

When converting files to ScML, the Digital Hub links note references to note numbers in the order they appear in the document. If the markers in the note reference and note number do not match, the Hub will list the problematic note ID and provide additional information about the location of the note reference and note number.

The issue may be related to the note reference and number that are noted. However, it also may be the result of an extra or missing note marker or reference. Review the previous note reference and note number styles as well.

Cover Division Not Found

Message

Cover division could not be found.

When converting to an e-book, the cover should be listed in its own division tag within the .scml file.

Specifically, the cover should be listed in this way (with the appropriate file name):

    <chapter id="cvi">
      <cover>
        <fig><img src="scr-filename.jpg"/></fig>
      </cover>
    </chapter>

Word File Is Not .docx

Message

File does not appear to be a conformant *.docx file. Resave the file from MS Word.

The Digital Hub can only work with Word documents saved in the .docx format. If a file has been given the .docx extension but was not saved in that way, open the file in Word (you may need to open in recovery mode) and resave the file as .docx.
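
A .docx file is a ZIP package, so a renamed legacy file can usually be spotted before upload. The sketch below is a quick local check, not the Hub's own conformance test. It assumes that finding the two part names Word conventionally writes is sufficient evidence; the function name looks_like_docx is hypothetical.

```python
import io
import zipfile

# Assumption: these two parts are enough for a quick local check.
REQUIRED_PARTS = {"[Content_Types].xml", "word/document.xml"}


def looks_like_docx(data: bytes) -> bool:
    """True if data is a ZIP archive containing the expected Word parts."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as package:
            return REQUIRED_PARTS.issubset(package.namelist())
    except zipfile.BadZipFile:
        return False  # not a ZIP archive, for example a renamed .doc or .rtf


# Build a tiny stand-in package in memory to show the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as package:
    package.writestr("[Content_Types].xml", "<Types/>")
    package.writestr("word/document.xml", "<document/>")

print(looks_like_docx(buffer.getvalue()))
print(looks_like_docx(b"plain text saved with a .docx extension"))
```

To check a real file, pass its bytes, for example the result of reading the file in binary mode.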

Tables within Tables

Message

File contains tables nested within table cells. First instance is listed.

Nesting tables within other tables is not supported for .docx files. This message will flag the first instance of this. Go to the source file and adjust all tables as needed.

Table Grid Error

Message

At least one row of your table does not contain the correct number of cells. Text from that row is included.

Review the table to determine where the incorrect number of cells is occurring. Table errors such as this typically occur when a table has been manipulated in .sam or .scml.

Hidden Text

Message

This *.docx has at least one instance of hidden text. First instance is listed.

Use the SAI’s “Potentially hidden text” tool to find any content that may not be visible to the user but would interfere with conversion in the Digital Hub.

Unstyled Content

Message

File contains unstyled content. First instance is listed.

This message indicates that there is content in the .docx file that does not use an ScML style. The first instance will be listed, but there may be more. Check globally to confirm all content has an ScML style applied to it.

Only ScML styles can be used when converting files through the Digital Hub. Refer to the ScML list to determine the proper style to use in place of the listed non-ScML style.

Note: ScML styles must always be lowercase. This error will also be generated if capital letters have been used. When this error occurs, the file will also receive an error message about not being DTD valid.

Private Use Area Unicode

Message

Instances of Private Use Area Unicode encountered.

These codepoints do not have characters assigned, and you may need a specialized font to render them.

Some fonts purposefully use “unmapped” portions of the Unicode range (hence “Private Use Area”). They assign glyphs not supported by Unicode to those codepoints within the context of the font. These fonts are still considered Unicode compliant, but they will require the specialized font to be embedded in order to render properly.
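
To locate these characters in a text file, you can test each character's Unicode general category. This is a minimal sketch using Python's standard unicodedata module, where category "Co" means private use; the function name find_private_use is hypothetical.

```python
import unicodedata


def find_private_use(text: str):
    """Return (index, codepoint) for each Private Use Area character."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Co"  # "Co" = private use
    ]


print(find_private_use("a\ue000b"))
```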

Structure Indicator Syntax

Message

A structure indicator has incorrect syntax.

Structure indicators require specific formatting (e.g., {~?~ST: begin sidebar} or {~?~ST: end sidebar}). Review the formatting for proper use of curly brackets, tildes, and spacing.

Structure Indicator Mismatch

Message

A structure indicator does not close before the end of the file.

For each opening structure indicator (e.g., {~?~ST:begin sidebar}), there must be a corresponding closing structure indicator (e.g., {~?~ST:end sidebar}). Review the file to determine which structure indicators may be improperly paired.

Structure Indicator Cannot Be Resolved

Message

A structure indicator is placed in a way that cannot be resolved.

Structure indicators must be closed in order. If structure indicators close out of the sequence in which they open, the Hub cannot correctly interpret them to create nested structures in ScML.

This could mean that an "end" tag does not have a corresponding "begin" tag. Search for the text indicated as "nearby text" in the validation note to find the unpaired structure indicator.
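
Pairing and nesting can be checked locally with a stack: each "begin" is pushed, and each "end" must match the most recent open "begin". The sketch below is an approximation, not the Hub's own logic. It assumes indicators look like {~?~ST: begin sidebar}, with the space after the colon optional, and the function name is hypothetical.

```python
import re

# Assumption: {~?~ST: begin NAME} or {~?~ST: end NAME}, space after the
# colon optional.
INDICATOR = re.compile(r"\{~\?~ST:\s*(begin|end)\s+([^}]+?)\s*\}")


def check_structure_indicators(text: str):
    """Return a list of problems; an empty list means all pairs nest."""
    problems = []
    open_stack = []
    for match in INDICATOR.finditer(text):
        action, name = match.group(1), match.group(2)
        if action == "begin":
            open_stack.append(name)
        elif open_stack and open_stack[-1] == name:
            open_stack.pop()  # closes the most recently opened structure
        else:
            problems.append(f"unexpected end {name}")
    problems.extend(f"unclosed begin {name}" for name in open_stack)
    return problems


print(check_structure_indicators("{~?~ST: begin sidebar}Text{~?~ST: end sidebar}"))
print(check_structure_indicators("{~?~ST:begin sidebar}Text"))
```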

Unresolved Images

Message

The following text is tagged as an image but needs to be resolved to an image element.

This message will appear if <img> tags are used on nonimage text. Review any <img> tags to determine whether the tag has been applied incorrectly to live text or if the image callout needs to be adjusted for proper formatting.

Special Character Issues

Message

The following special characters may be incorrect. Review and replace these characters as needed.

Many special Unicode characters get used incorrectly in place of much more commonly used characters. These special characters may not be available in a given font, may not match the appearance of the correct character, or may interrupt the accessibility of e-books. For this reason, using the correct character is always important. Check whether you need the special character or if you intended to use the recommended character instead.

Table of Contents Order

Message

Links in the table of contents do not match the order in which the linked content appears. Check that the table of contents is in the correct order and that the links are correct.

A table of contents should typically list the content in the order it appears in the book. This warning can come up for multiple reasons:

  • The links in the table of contents might be going to the wrong content.
  • The paragraphs in the table of contents might be out of order.
  • The chapters of the book might be out of order.

The Digital Hub may link the table of contents incorrectly for books that have multiple chapters or heads with very similar text. Check the links first, then review the order of the content itself.