SAI Cleanup

Performs various cleanup tasks on a document (or documents). Cleanups can be run at different times throughout the composition and copyediting stages. Before running, review all files to confirm global changes will not result in the loss of critical content or information.


Select the cleanup actions to be run. The File Selection Summary shows the information selected in the File Selection tab.

Composition Cleanup

Performs cleanup tasks to preserve and modify styles.

  • Rendering: Replaces Word’s character rendering with ScML character styles where applicable.
  • Overwrite Non-ScML Character Styles: Applies rendering cleanup in non-ScML character styles. Removes non-ScML character styles where no special character formatting is applied.
  • Associate Styles: Loads the Associate Styles dialog to map non-ScML styles to ScML styles globally. See the Associate Styles header below for additional feature documentation.
  • Apply url Style: Identifies text patterns that resemble a hyperlink (such as words that begin with http://) and applies the url style or url-i if italic.
  • Underline to Italic: Replaces underline formatting with italic formatting.
  • Compose Index Fields: Composes Word index entries as idx, preserving character styles with angle-bracket markup.
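
As a rough illustration of the Apply url Style option, the following Python sketch finds text that resembles a hyperlink and picks url or url-i depending on whether the text is italic. The pattern, the function names, and the trailing-punctuation trimming are assumptions made for illustration; the actual rules used by the cleanup are not documented here.

```python
import re

# Assumed pattern: runs that begin with http://, https://, or www.
URL_PATTERN = re.compile(r"\b(?:https?://|www\.)[^\s<>\"]+", re.IGNORECASE)


def find_url_spans(text):
    """Return (start, end, url) for each URL-like run in the text."""
    spans = []
    for match in URL_PATTERN.finditer(text):
        # Trim sentence punctuation that trails the link (an assumption).
        url = match.group().rstrip(".,;:!?)")
        spans.append((match.start(), match.start() + len(url), url))
    return spans


def style_for(is_italic):
    """Pick the character style named in the option description."""
    return "url-i" if is_italic else "url"
```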

Punctuation Rendering

Adds or removes character styles (italic, bold, and bold-italic) from spaces and punctuation (comma, semicolon, colon, and period) following composed text.
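
The add and remove directions can be pictured with a small Python sketch that treats a paragraph as a list of (text, style) runs. The run representation, the style names i, b, and bi, and both function names are hypothetical; this is only a model of the behavior described above, not the tool's own logic.

```python
import re

STYLED = {"i", "b", "bi"}               # assumed names for italic, bold, bold-italic
LEADING = re.compile(r"^[ ,;:.]+")      # spaces and punctuation at the start of a run
TRAILING = re.compile(r"[ ,;:.]+$")     # spaces and punctuation at the end of a run


def add_style_to_punctuation(runs):
    """Extend a styled run over the spaces and punctuation that follow it."""
    result = []
    for text, style in runs:
        if result and style is None and result[-1][1] in STYLED:
            match = LEADING.match(text)
            if match:
                prev_text, prev_style = result[-1]
                result[-1] = (prev_text + match.group(), prev_style)
                text = text[match.end():]
        if text:
            result.append((text, style))
    return result


def remove_style_from_punctuation(runs):
    """Split trailing spaces and punctuation out of each styled run."""
    result = []
    for text, style in runs:
        match = TRAILING.search(text) if style in STYLED else None
        if match and match.start() > 0:
            result.append((text[:match.start()], style))
            result.append((match.group(), None))
        else:
            result.append((text, style))
    return result
```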

Convert to Live Text

Preserves dynamic and linked Microsoft Word content as live text.

  • List Leaders: Converts Word characters that begin list items (numbers, letters, symbols) into literal characters. Run before paragraph styles are applied or Associate Styles is run.
  • Fields: Converts fields to text.

Punctuation and Character Cleanup

Performs cleanup and standardization tasks on punctuation and other characters.

  • Periods to Ellipses: Replaces three periods (spaced or not spaced) with the ellipsis character. Also moves the fourth period before the ellipsis character, if applicable. (Mutually exclusive with Ellipses to Spaced Periods.)
  • Ellipses to Spaced Periods: Replaces the ellipsis character with periods separated by nonbreaking spaces. (Mutually exclusive with Periods to Ellipses.)
  • Periods to Spaced Periods: Replaces three unspaced periods (plus additional surrounding punctuation) with periods separated by nonbreaking spaces.
  • Latin Ligatures to Character Pairs: Replaces common ligatures with their two- or three-character letter equivalents.
  • Single Quotation Marks: Replaces single straight quotation marks with smart (or “curly”) quotation marks.
  • Double Quotation Marks: Replaces double straight quotation marks with smart (or “curly”) quotation marks.
  • Dashes: Replaces double hyphens, spaced hyphens, and spaced en dashes with em dashes. Also replaces unspaced hyphens between numbers with en dashes. Dashes and hyphens in phone numbers, ISBNs, and text composed with the url or url-i character styles are preserved.
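
The ellipsis and dash replacements above can be approximated with a few regular expressions. The Python sketch below is a simplified model under stated assumptions: the function names are hypothetical, a "space" between periods is taken to be a regular or nonbreaking space, and the protections for phone numbers, ISBNs, and url-styled text are deliberately left out.

```python
import re

ELLIPSIS = "\u2026"
NBSP = "\u00a0"
EM_DASH = "\u2014"
EN_DASH = "\u2013"

_GAP = "[ \u00a0]?"  # optional regular or nonbreaking space between periods
FOUR_DOTS = re.compile(r"\.(?:" + _GAP + r"\.){3}")
THREE_DOTS = re.compile(r"\.(?:" + _GAP + r"\.){2}")


def periods_to_ellipses(text):
    """Replace three periods with an ellipsis; keep a fourth period in front."""
    text = FOUR_DOTS.sub("." + ELLIPSIS, text)
    return THREE_DOTS.sub(ELLIPSIS, text)


def ellipses_to_spaced_periods(text):
    """Replace the ellipsis character with periods joined by nonbreaking spaces."""
    return text.replace(ELLIPSIS, NBSP.join("..."))


def convert_dashes(text):
    """Double hyphens, spaced hyphens, and spaced en dashes become em dashes;
    an unspaced hyphen between digits becomes an en dash.
    Phone numbers and ISBNs are NOT protected in this sketch."""
    text = re.sub("--| - | \u2013 ", EM_DASH, text)
    return re.sub(r"(?<=\d)-(?=\d)", EN_DASH, text)
```

Because the first two functions undo each other, running both in one pass would be pointless, which is consistent with the two options being mutually exclusive.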

Space and Break Cleanup

Performs cleanup tasks to remove extraneous spacing. As authors may use spaces and breaks to format tables, indicate block quotes, and so on, it is recommended that these cleanup tasks be run after paragraph styles have been applied.

  • Spaces: Eliminates multiple spaces. Also removes spaces from before and after paragraph breaks and tabs.
  • Paragraphs: Eliminates multiple paragraph breaks.
  • Tabs: Eliminates multiple tabs and tabs that appear at the beginning or end of a paragraph.
  • All Breaks: Selects all break options.
  • Lines: Replaces line breaks with paragraph breaks.
  • Column: Replaces column breaks with paragraph breaks.
  • Pages: Replaces page breaks with paragraph breaks.
  • Sections: Replaces section breaks with paragraph breaks.
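
For readers who want to see the effect of the Spaces, Paragraphs, and Tabs options on plain text, here is a minimal Python sketch. It assumes a paragraph break is represented by a newline character, which is a simplification of how Word stores paragraphs, and the function names are hypothetical.

```python
import re


def clean_spaces(text):
    """Collapse multiple spaces; drop spaces beside paragraph breaks and tabs."""
    text = re.sub(r" {2,}", " ", text)
    return re.sub(r" *([\n\t]) *", r"\1", text)


def clean_paragraphs(text):
    """Collapse runs of paragraph breaks into a single break."""
    return re.sub(r"\n{2,}", "\n", text)


def clean_tabs(text):
    """Collapse multiple tabs; remove tabs at the start or end of a paragraph."""
    text = re.sub(r"\t{2,}", "\t", text)
    return re.sub(r"(?m)^\t+|\t+$", "", text)
```

The sketch also shows why these options are best run after paragraph styles are applied: once the extra spaces, tabs, and breaks are gone, any layout the author expressed with them can no longer be recovered.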

Clean File Construction

Fixes common tagging issues prior to upload to the Digital Hub.

  • Clean Note Markers: Fixes common note issues, which include removing note character styles from white space. Runs on an entire document only.
  • Clean Structure Indicators: Removes spaces before structure indicators, removes character styles from indicators, and applies structure style to indicators.
  • Clean Figures: Removes spaces around image callouts and adds img and fig styles as necessary.
  • Clean Queries: Converts legacy Scribe callouts to current, curly-bracketed callouts.
  • Remove Character Styles from Paragraph Breaks: Removes character styles from paragraph breaks.

Compose Paragraphs

Automatically composes certain paragraphs based on formatting and context. With the exception of white space to p, it is strongly recommended that these tools be applied only after all character styles and formatting have been composed. Paragraphs that meet certain formatting criteria can be automatically converted to p, pcon, list styles (nl/bl/ul), and note paragraphs (fn/en). These tools won’t find all instances but can tag several paragraphs at a time based on what’s already composed.

  • Compose Heads: If toc styles have already been applied, composing heads will attempt to find the uncomposed paragraph or paragraphs that correspond to the toc entries and tag them appropriately. Heads that could not be found and tagged will be reported. Heads should always be checked after running this tool to confirm that the correct instance of the text was tagged.

Mark for Editing

Inserts an internal query near items for editorial review.

Mark for Editing

  • Long run-in quotations: Marks run-in quotations that appear to be longer than the selected style guide (CMS, APA, or MLA) recommends. Select a style guide in the User Settings prior to running this cleanup option.
  • Short block quotations: Marks composed block quotations that appear to be shorter than the selected style guide (CMS, APA, or MLA) recommends. Select a style guide in the User Settings prior to running this cleanup option.
  • Unmatched paired punctuation: Marks paragraphs that have an uneven number of opening and closing parentheses, brackets, and other paired punctuation characters.
  • Missing paragraph-ending punctuation: Marks paragraphs that don’t appear to end with a legal punctuation character. Ignores certain styles like heads and senselines and recognizes punctuation before a note marker.
  • Potentially hidden text: Marks text that has been made very small or white or has been placed in a very small text box.
  • Potentially incorrect inclusive numbers: Marks en dash–separated number ranges that don’t match the rules for inclusive number ranges as described in the Chicago Manual of Style.
  • Unmatched Styles in Paired Punctuation: Marks matched double quotes and various matched brackets that don’t have the same character style.
  • Title case issues in italics/quotes: Marks quoted and italic text that does not follow the SAI’s title case rules. To limit false hits, only certain italic and quoted text is checked. Quotes over 150 characters and text that contains only one capitalized word will NOT be checked. In some books, this check may still return too many false hits. For that reason, it is not recommended to use this tool on novels with many quotes or on bibliographies with many book titles in foreign languages, as the title case tools are designed around English title case rules.
  • All caps used in small caps style: Marks the use of all caps in styles used for small caps (such as sm and tetr). Applying small caps styles to all caps text does not result in small caps letters and is usually an error.
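
The unmatched paired punctuation check described above amounts to counting opening and closing characters in each paragraph. The following Python sketch shows one way to do that; the set of pairs and the function name are assumptions, since the full list of characters the tool checks is not given here.

```python
# Assumed set of paired characters: parentheses, brackets, braces,
# and curly double quotation marks.
PAIRS = {"(": ")", "[": "]", "{": "}", "\u201c": "\u201d"}


def has_unmatched_pairs(paragraph):
    """True when any opening/closing pair occurs an uneven number of times."""
    return any(
        paragraph.count(opening) != paragraph.count(closing)
        for opening, closing in PAIRS.items()
    )
```

A count-based check like this does not verify nesting order, so a paragraph such as ")(" would pass; it only flags paragraphs where the totals differ.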

Mark for Composition Review

Inserts queries near items for composition review.

  • Prohibited characters in url styles: Queries certain characters in URL styles that are not typically permitted.
  • Missing space around style: Queries possible missing spaces between character styles and other text. (This is typically fixed by running the “Make white space surrounding…” composition cleanup.)
  • Potentially dropped spaces: Queries select text patterns in which spaces may be missing around punctuation.

These tools can take some time to complete, so it is always recommended to save the file before running them.
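
To give a sense of what a dropped-space pattern might look like, the Python sketch below flags sentence punctuation followed directly by a capital letter, and a comma or semicolon wedged between two letters. These two patterns and the function name are hypothetical examples; the text patterns the tool actually queries are not listed in this document.

```python
import re

# Hypothetical patterns: "ended.Then" and "again,it"
DROPPED_SPACE = re.compile(r"[a-z][.!?][A-Z]|[A-Za-z][,;][A-Za-z]")


def find_dropped_spaces(text):
    """Return (offset, matched_text) for each suspicious spot."""
    return [(m.start(), m.group()) for m in DROPPED_SPACE.finditer(text)]
```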

Mark Index Entries for Editing (Windows Only)

Inserts queries in places where entries or page numbers appear to be out of order. Regularly formatted subentries are checked for both as well. The index must be composed before running.

When checking page order, only the first page in a span is checked. If the same page is listed twice in a row, it does not get flagged as an error.
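
The page-order rule can be sketched in Python as follows. The locator format (comma-separated page numbers and spans) and the function names are assumptions for illustration; the sketch reads only the first page of each span and treats a repeated page as in order, as described above.

```python
import re


def first_pages(locators):
    """Parse '12, 15-18, 20' into the first page of each locator."""
    pages = []
    for part in locators.split(","):
        match = re.match(r"\s*(\d+)", part)
        if match:
            pages.append(int(match.group(1)))
    return pages


def out_of_order(locators):
    """Return (previous, current) pairs where the page number goes down."""
    pages = first_pages(locators)
    return [(a, b) for a, b in zip(pages, pages[1:]) if b < a]
```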

These features work with both indented and run-in indexes, so long as they are properly composed and formatted. Alphabetical order can be checked on a letter-by-letter or word-by-word basis. This will also check cross-references at the end of a paragraph.
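
The difference between the two alphabetizing modes can be shown with two sort keys. This Python sketch is a simplification: the key functions are hypothetical, and real index sorting rules (for example, how punctuation and leading articles are treated) are more detailed than what is modeled here.

```python
import re


def letter_by_letter_key(entry):
    """Ignore spaces and punctuation; compare the remaining characters."""
    return re.sub(r"[^a-z0-9]", "", entry.lower())


def word_by_word_key(entry):
    """Compare the first word, then the second, and so on."""
    return re.findall(r"[a-z0-9]+", entry.lower())


entries = ["New York", "Newark"]
by_letter = sorted(entries, key=letter_by_letter_key)
by_word = sorted(entries, key=word_by_word_key)
```

Here by_letter places Newark first and by_word places New York first, so the same index can be in order under one mode and flagged under the other.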

Associate Styles

Maps user-defined styles to ScML styles. Accessible only when running Rendering Cleanup from the Composition tab.

Two lists will appear: the left one lists the styles found in the document; the right one lists the ScML styles. To associate a document style with an ScML style, select each style from its respective list and click Associate.

In the left list, non-ScML paragraph styles that are not defined as bold or italic will also have a bold (B) and an italic (I) listing. This is to aid in styling headers and other content that may be indicated via local rendering. For example, if a-heads have not been distinguished with a style (they’re all “Normal”) but, rather, are all bold, you can associate “Normal” (B) with “ah” from the ScML styles list. If “Normal” a-head text is only partially bold or partially italic (e.g., if a book title is mentioned in the a-head), then the bold or italic will be retained but the paragraph will be mapped as whatever style you have associated with “Normal.” After associating styles, you can then apply the correct style (ah) to that paragraph manually.

Click Associate All with P to map all remaining unmapped paragraph styles to the ‘p’ style.

  • If you would prefer not to see or use the bold and italic style options, click the button to Remove unmapped ‘B’ and ‘I’ styles.
  • Options are also available to hide/display ScML styles found in the document, paragraph styles, and character styles. Note that only styles visible when ‘OK’ is clicked will be mapped.

When finished, click OK. The document styles will be recomposed as their associated ScML styles.

Styles in the document pane will be indicated as P (paragraph style) or C (character style). You will only be allowed to map a style to another style of the same type.

  • Load Association: Use the drop-down menu to select from previously saved associations. Click this button to load the relevant mapping into the current document. This is useful when working on multiple book chapters that were authored using similar style sets.
  • Save Association: Save an association with a memorable name for use in other documents.
  • Delete Association: Use the drop-down menu to select from previously saved associations. Click this button to remove an association from the list when it is no longer needed.
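
Conceptually, an association is a mapping from document style names to ScML style names, restricted to pairs of the same type. The Python sketch below models that idea. The class, its methods, and the JSON storage format are hypothetical, and the document style names in the usage lines are invented examples; only the ScML names ah, p, and url come from this documentation.

```python
import json


class StyleAssociations:
    """Toy model of the Associate Styles dialog (not the actual implementation)."""

    def __init__(self, document_styles, scml_styles):
        # Each dict maps a style name to "P" (paragraph) or "C" (character).
        self.document_styles = dict(document_styles)
        self.scml_styles = dict(scml_styles)
        self.mapping = {}

    def associate(self, document_style, scml_style):
        """Map one style to another, allowing only same-type pairs."""
        if self.document_styles[document_style] != self.scml_styles[scml_style]:
            raise ValueError("styles must be of the same type")
        self.mapping[document_style] = scml_style

    def associate_all_with_p(self):
        """Map every remaining unmapped paragraph style to 'p'."""
        for name, kind in self.document_styles.items():
            if kind == "P" and name not in self.mapping:
                self.mapping[name] = "p"

    def save(self):
        """Serialize the mapping so it can be reused in another document."""
        return json.dumps(self.mapping, sort_keys=True)

    def load(self, saved):
        """Merge a previously saved mapping into the current one."""
        self.mapping.update(json.loads(saved))


assoc = StyleAssociations(
    {"Heading 1": "P", "Normal": "P", "Hyperlink": "C"},
    {"ah": "P", "p": "P", "url": "C"},
)
assoc.associate("Heading 1", "ah")
assoc.associate_all_with_p()
```

Saving the mapping under a name and loading it later mirrors the workflow for chapters that were authored with similar style sets.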

File Selection

Select the file(s) and parts of the file(s) on which the cleanup options will run.

  • File Scope: Select whether cleanup will be run on the active document or on multiple documents. If cleanup is being run on multiple documents, options are available to either automatically save each file or prompt the user to save.
  • Document Scope: Select the parts of the document on which cleanup will run: main text, footnotes, endnotes, and/or queries.
  • Document Saving: Active when Selected Files is designated. Next to the Selected Files box, choose to have Word save the cleaned-up documents automatically or prompt you to save each file.