Reviewing Italic Terms, Phrases, and Titles

Many Scribe procedures include a step to perform text checks in a sam or ScML file using regular expressions. The results of these searches can point to errors in a file. This page explains how and why to review files for errors and inconsistencies related to italic terms, phrases, and titles.

Internal Consistency

The <(i|tnw|cite)>([^<]*)</\1> search can be found on the Regular Expressions Resource Page and will return results as part of the Sublime Text Regular Expression Result Counts package.

Using a sam or ScML file, run this regular expression to pull all the italic words and phrases from a file. Sort and review the results. In addition to the i style for italics, this search includes the accessibility styles tnw (title, name, or work) and cite (citation).

This search identifies italic text in the file, some of which could represent titles or names of newspapers, academic journals, and other publications. Other italic text may be foreign or scientific terms, emphasized phrases, or ship names.

There are many reasons to handle elements consistently in publications: Deviations make it difficult to find information, tripping up searches in environments such as e-readers that may not be as sophisticated as Google. While some programs can compensate for certain variations, differences in spelling, punctuation, or the use of ligatures can change what a search returns.

Regularized, consistent treatments make it easier to augment information at later times. The promise of XML is an archival, extensible file. With this in mind, variation (whether in structure or in content) works against that promise. Indeed, Scribe would make the case that consistency across an imprint is even more desirable than consistency within a single book. Thus, for example, we would suggest index and abbreviation canons for publishers to employ across all their titles.

Solidifying a manuscript by cleaning up all possible errors prior to pages is a requirement for the successful deployment of the ScML2PDF process. Additionally, all our data show that the earlier in the workflow an error is fixed, the less effort the fix requires.

Of course, the reader’s experience should be considered as well. An inconsistency in the presentation of titles (e.g., using abbreviated titles that do not reflect the full intent of the book title) can be jarring and distracting. Anything that distracts a reader or requires an extra act of interpretation can be detrimental to the reading experience and hinder the goal of the publication.

  1. Using Sublime Text on a sam or ScML file, search for <(i|tnw|cite)>([^<]*)</\1> using Find All.

  2. Copy the selected results and paste them into a new document.

  3. Use the Permute (Unique) function to remove duplicate entries (Edit > Permute Lines > Unique).

  4. Use the Sort Lines function to place the results in alphabetical order (Edit > Sort Lines).
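
For files that need to be checked outside Sublime Text, the four steps above can be approximated in a short script. The following is a minimal Python sketch, not part of any Scribe tool: the function name unique_sorted_matches is hypothetical, and the case-insensitive sort order is an assumption rather than a guarantee of matching Sublime Text's output line for line.

```python
import re

# The search pattern from this page: an opening i, tnw, or cite tag,
# text that contains no "<", and the matching closing tag.
ITALIC_PATTERN = re.compile(r"<(i|tnw|cite)>([^<]*)</\1>")


def unique_sorted_matches(text):
    """Return each distinct tagged match once, in alphabetical order.

    Hypothetical helper: it mirrors Find All, Permute Lines > Unique,
    and Sort Lines, but it is only an approximation of those commands.
    """
    # A set drops duplicate matches (the "Unique" step).
    matches = {m.group(0) for m in ITALIC_PATTERN.finditer(text)}
    # Sort without regard to case; the original string breaks ties.
    return sorted(matches, key=lambda s: (s.casefold(), s))
```

Because the full match is kept, the results are grouped by style name first (cite, then i, then tnw) and alphabetically within each style.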

Review the Results

Scroll through the results and take note of any terms, phrases, or titles that may be incorrect or inconsistent within the file.

Possible Errors

  • Spelling errors

  • Shortened titles that do not make sense or do not match the corresponding part of the full title

  • Punctuation errors (e.g., quotation marks, em dashes, parentheses, or brackets whose opening and closing marks should share the same italic or roman treatment; periods on abbreviations and acronyms that belong inside the italic treatment but were left outside it, as in “<i>120 lb</i>.” or “<i>The Man from U.N.C.L.E</i>.”)

  • Capitalization errors

  • Spacing errors (e.g., the treatment of initials)

  • Inconsistencies between new content (e.g., indexes, praise pages) and the existing, edited material

Valid Mismatches

In some cases, apparent inconsistencies are intentional and correct.

  • Citation formatting vs. the presentation in the body of the book (e.g., one section may use sentence case while another uses title case)

  • Shortened titles

  • Instances within quoted material

  • Occurrences within the book that discuss the different terms or treatments specifically

Small Caps, Bold, and Other Styles

While italics are the most common place to find inconsistencies in the presentation of terms, phrases, and titles, some books may use small caps, bold, or other character styles that should be reviewed for these issues.

If needed, pull those phrases by replacing the style names in the first set of parentheses of the italics search.

  • <(sm)>([^<]*)</\1>

  • <(b)>([^<]*)</\1>
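
The same substitution can be made in a script by building the pattern from a list of style names. This is a minimal Python sketch under that assumption; style_pattern and tagged_text are hypothetical names, and sm and b are simply the styles shown above.

```python
import re


def style_pattern(*style_names):
    """Build the search from this page for any set of style names."""
    # re.escape guards against style names that contain regex characters.
    alternation = "|".join(re.escape(name) for name in style_names)
    return re.compile(rf"<({alternation})>([^<]*)</\1>")


def tagged_text(text, *style_names):
    """Return (style, content) pairs for every match, in document order."""
    pattern = style_pattern(*style_names)
    return [(m.group(1), m.group(2)) for m in pattern.finditer(text)]
```

For example, tagged_text(text, "sm", "b") collects small caps and bold text in one pass, while tagged_text(text, "sm") reproduces the first search alone.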

Examples of Possible Errors

These examples show possible errors. Differing results could instead represent completely different books, updated editions, subsequent volumes, and so on. That cannot be determined without context, but the search provides a good basis for further investigation.

Example 1: Inconsistent Pluralization

Civil War West: Testing the Limits of the United States
Civil War Wests: Testing the Limits of the United States

Example 2: Inconsistent Punctuation

Portland Oregon: Its History and Builders
Portland, Oregon, Its History and Builders
Portland, Oregon: Its History and Builders

Example 3: Inconsistent Capitalization

Report of the Adjutant General of the State of Oregon, For the Years 1865–6
Report of the Adjutant General of the State of Oregon, for the Years 1865–6

Example 4: Inconsistent Treatment of Quotation Marks

“<i>indirect empathy”</i> for participants
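
A case like this one, where only one of a pair of quotation marks falls inside the tags, can be flagged by counting the marks inside each match. The following Python sketch is an illustration only: the function name is hypothetical, and it checks curly double quotation marks alone, so straight quotes, single quotes, parentheses, and brackets would each need their own check.

```python
import re

TAGGED = re.compile(r"<(i|tnw|cite)>([^<]*)</\1>")

OPEN_QUOTE = "\u201c"   # left curly double quotation mark
CLOSE_QUOTE = "\u201d"  # right curly double quotation mark


def unbalanced_quote_matches(text):
    """Return tagged matches whose content has unpaired curly double quotes."""
    flagged = []
    for m in TAGGED.finditer(text):
        content = m.group(2)
        # A pair split across the tag boundary leaves the counts unequal.
        if content.count(OPEN_QUOTE) != content.count(CLOSE_QUOTE):
            flagged.append(m.group(0))
    return flagged
```

A flagged match is only a candidate for review; whether the quotation marks belong inside or outside the italic treatment is still an editorial decision.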

Example 5: Inconsistent Spelling

Music &amp; Letters
Music and Letters

Note: Ampersands (&) will appear as the coded &amp; in sam and ScML files.
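
In a long results list, variants like those in Examples 2, 3, and 5 can be easy to miss by eye. One possible aid is to group results that become identical once case, punctuation, and the ampersand are ignored. This is a minimal Python sketch with hypothetical function names; treating & as "and" is an assumption made for the comparison only, and the approach does not catch spelling or pluralization differences such as Example 1.

```python
import html
import re
from collections import defaultdict


def comparison_key(title):
    """Reduce a title to lowercase words for comparison purposes only."""
    text = html.unescape(title)           # the coded &amp; becomes &
    text = text.replace("&", " and ")     # assumption: & and "and" are equivalent
    text = text.casefold()                # ignore capitalization
    text = re.sub(r"[^\w\s]", " ", text)  # ignore punctuation
    return " ".join(text.split())         # collapse runs of whitespace


def group_variants(titles):
    """Return lists of distinct titles that share a comparison key."""
    groups = defaultdict(set)
    for title in titles:
        groups[comparison_key(title)].add(title)
    # Keep only keys with more than one distinct spelling.
    return [sorted(variants) for variants in groups.values() if len(variants) > 1]
```

Each returned group still needs the review described on this page, since some groups will turn out to be valid mismatches.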