
Regular Expressions Resource

Regular expressions from the checklists on scribenet.com are presented here with minimal context. These can be run on .sam and .scml files. For more details on how and when to use these searches, see the Regular Expressions Resource Supplement.
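As a rough illustration of running these searches outside a text editor, the sketch below uses Python's `re` module. It assumes the Find strings are written in a PCRE-like syntax, so escapes such as `\x{a0}` must be rewritten as `\u00a0` before Python will accept them. The helper names `to_python_pattern` and `run_find` are hypothetical and are not part of any Scribe tool.

```python
import re

def to_python_pattern(find):
    # Rewrite PCRE-style \x{hhhh} escapes into the \uXXXX / \UXXXXXXXX
    # forms that Python's re module understands.
    def replace_escape(match):
        code_point = int(match.group(1), 16)
        if code_point <= 0xFFFF:
            return f"\\u{code_point:04x}"
        return f"\\U{code_point:08x}"
    return re.sub(r"\\x\{([0-9A-Fa-f]+)\}", replace_escape, find)

def run_find(find, text):
    # Return (line_number, matched_text) for every match in the text.
    # MULTILINE lets ^ and $ anchor at each line, as in an editor search.
    pattern = re.compile(to_python_pattern(find), re.MULTILINE)
    hits = []
    for match in pattern.finditer(text):
        line_number = text.count("\n", 0, match.start()) + 1
        hits.append((line_number, match.group(0)))
    return hits

# A regular space followed by a nonbreaking space on the second line.
sample = "ok line\nbad \u00a0space\n"
print(run_find(r"( )(\x{a0})|(\x{a0})( )", sample))
```

Searching the whole text at once keeps patterns that contain `\n` usable; the trade-off is that negated classes such as `[^”]*` can then run across line breaks, so some searches may behave better line by line.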

Quotation Marks, Parentheses, and Brackets

Find: ^[^“]*\”|\“[^”]*\“|\”[^“]*\”|\“[^”]*$

Find: “—|”—|—“|—”

Find: ^[^(]*\)|\([^)]*\(|\)[^(]*\)|\([^)]*$

Find: ^[^[]*\]|\][^[]*\]|\[[^]]*$|\[[^]]*\[

Find: ”([^ \)\<\]\:;\?\&\/—])|([^ \(\>\[—])“
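To show what the first search in this section catches, here is a minimal sketch that applies the unbalanced double quotation mark pattern line by line in Python. It assumes each paragraph sits on its own line; the sample sentences and the function name `unbalanced_quote_lines` are made up for illustration.

```python
import re

# The first Find pattern of this section, unchanged.
UNBALANCED_QUOTES = re.compile(r"^[^“]*\”|\“[^”]*\“|\”[^“]*\”|\“[^”]*$")

def unbalanced_quote_lines(lines):
    # Return the 1-based numbers of lines with a suspicious quote sequence.
    return [
        number
        for number, line in enumerate(lines, start=1)
        if UNBALANCED_QUOTES.search(line)
    ]

sample = [
    "“Balanced,” she said.",
    "“Missing close, she said.",
    "Missing open,” she said.",
    "No quotation marks here.",
]
print(unbalanced_quote_lines(sample))  # [2, 3]
```

The parenthesis and bracket searches follow the same shape, so the same loop should work for them with the pattern swapped.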

Punctuation

Find: ([\!:;,\.\?\–])\1

Find: ([\[\(“‘])( )|( )([\.,:;\?!\)\]’”])([^\x{a0}])

Find: ([ \x{a0}])\.([ \x{a0}])\.([ \x{a0}])\.([ \x{a0}])\.

Find: \.([\x{00A0} ])\.([\x{00A0} ])\.’([A-Za-zÀ-ÿ]+)

Find: ([^A-Z][\.\?\!”])([A-Z])|([,:;\)])([A-Za-z“])|([”])([a-z])|([a-z\>])\(|,”,|Scribe, Inc|<crt(.*)(<url>www\.zondervan\.com</url>\. The)

Find: ([a-zÀ-ÿ0-9])</(p|pf|psec|paft|pcon|rf|rf1|rf2|rff)>\n|</([ib])></(rf|rf1|rf2|rff)>\n|([;,\–\-\—])</([a-z]+)h

Find: <([^>/]*)>[^A-Za-z0-9\n]?</\1>

Find: ,</i>

Find: </i>,

Find: \.</i>

Find: </i>\.
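The doubled punctuation search at the top of this section relies on a backreference: `\1` must repeat whatever character the first group captured. A small Python sketch of that behavior, with an invented sample sentence and a hypothetical function name:

```python
import re

# First Find pattern of the Punctuation section, unchanged.
DOUBLED_PUNCTUATION = re.compile(r"([\!:;,\.\?\–])\1")

def doubled_punctuation(text):
    # Return every run of two identical punctuation marks.
    return [match.group(0) for match in DOUBLED_PUNCTUATION.finditer(text)]

print(doubled_punctuation("Wait.. what?? Fine, really;; yes."))
# ['..', '??', ';;']
```

Mixed pairs such as `?!` are not reported, because the backreference only matches a repeat of the same character. A three-dot ellipsis typed as periods would be reported, so hits still need a human look.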

Unexpected Character Patterns

Find: --|([a-z0-9]+)\||- -|'|“ | ”|^( *)(<[^\n]*?)( ){2,}|\),[0-9]

Find: ([\d]+)([\x{2013}\x{2014}-])([\d])([\x{2013}\x{2014}-])([\d]+)([\x{2013}\x{2014}-])([\d]+)([\x{2013}\x{2014}-])([\d])

Find: [A-Za-z]<i>[A-Za-z]|[A-Za-z]</i>[A-Za-z]

Find: </(ct|ctfm|ctbm|cs|cn|pn|pt|ps|ut|un|us|ept|au|au1)>\n([ ]*)<ah[^a]|<structure>([^{])

Find: ([0-9]+)</toc|([ \t])([0-9ivxl]+)</tocfm|([ \t])([0-9ivxl]+)</tocill

Spaces

Find: ( )(\x{a0})|(\x{a0})( )|(\x{a0})<([\/a-z0-9\-]+)>( )|( )<([\/a-z0-9\-]+)>(\x{a0})

Find: ([ \x{a0}])(\t)|(\t)([ \x{a0}])|\x{00A0} | \x{00A0}

Find: ( )<([ef]nref)([^>]*>[^<]*</\2>)

Find: ^( *)(<[^>]*>)( )|( )$

Find: ([^ ]\||\|[^ ])

Find: ^( *)(<[^\n]*?)( ){2,}
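As an illustration of the search for a space directly after an opening tag at the start of a line, or a space at the end of a line (`^( *)(<[^>]*>)( )|( )$`), the following sketch checks a few made-up lines in Python. The `<p>` tag is taken from the tag lists elsewhere on this page; the sample text and the function name are assumptions.

```python
import re

# Leading space after a line-initial tag, or a trailing space.
EDGE_SPACES = re.compile(r"^( *)(<[^>]*>)( )|( )$")

def lines_with_edge_spaces(text):
    # Return 1-based numbers of lines with a leading or trailing stray space.
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if EDGE_SPACES.search(line)
    ]

sample = "<p> Leading space.</p>\n<p>Trailing space.</p> \n<p>Clean line.</p>"
print(lines_with_edge_spaces(sample))  # [1, 2]
```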

Incorrect Line Breaks

Find: ^[ ]*<[^>]*>[a-z]

URLs

Find: (<url( href[^>]*)?>[^<]*)([\x{2013}\x{2014} ])

Find: <url>([ \.\(\[])|([ ,\.\)\]])</url>

Find: ([A-Za-z0-9\.\-:/]+\.(?!jpg|tif|eps|png|svg|jpeg)[A-Za-z]{2,})([^ <"\n]*[^ ><"”'’\)\],;:\.–\n—\?])?

Find: ([^ \<\"\>])http

Find: ([ ><"“'‘\(\[–\n—])(@[a-zA-Z0-9_]{1,15})
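The last search in this list appears to target social media handles: an `@` followed by up to 15 word characters, but only when the `@` comes after a space, tag bracket, quotation mark, bracket, dash, or line break. A minimal Python sketch of that reading, with a hypothetical handle and email address as sample data:

```python
import re

# Last Find pattern of the URLs section, unchanged.
HANDLE = re.compile(r"""([ ><"“'‘\(\[–\n—])(@[a-zA-Z0-9_]{1,15})""")

def find_handles(text):
    # Group 2 holds the handle itself; group 1 is the character before it.
    return [match.group(2) for match in HANDLE.finditer(text)]

print(find_handles("Follow @example_user or write to info@example.com."))
# ['@example_user']
```

The email address is skipped because its `@` follows a letter, which is not in the first character class. Note that the pattern needs a preceding character, so a handle at the very start of the text would not be found.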

Process the file to ePub 3 in the Digital Hub.

Open the e-book in Kindle Previewer.

Go to File > Run Quality Checks.

Click Open Report.

Special Characters

ISBNs and Zip Codes

Angle Brackets

Find: &#x3e;|&#x3c;|&#62;|&#60;|&gt;|&lt;|<<|>>

Typesetter Spaces

Find: &#173;|&#819[2-9];|&#820[0-4];|&#8239;

Find: &#x00AD;|&#x200[0-9A-C];|&#x202F;

Find: [\x{ad}\x{2000}-\x{2009}\x{200a}-\x{200c}\x{202f}]

Find: [^\.](&#160;|&#x00A0;|\x{a0})[^\.]|(&#8205;|&#x200D;|\x{200d})
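The third search in this section looks for the literal characters rather than their entities. Because these characters are usually invisible, it can help to print their code points. The sketch below does that in Python, assuming the `\x{...}` escapes are rewritten as `\u....` for the `re` module; the function name and sample string are illustrative only.

```python
import re

# Soft hyphen, U+2000 through U+200C, and narrow no-break space,
# written with Python escapes instead of \x{...}.
TYPESETTER_SPACES = re.compile(r"[\u00ad\u2000-\u2009\u200a-\u200c\u202f]")

def typesetter_space_code_points(text):
    # Return a readable U+XXXX label for each typesetter space found.
    return [
        f"U+{ord(match.group(0)):04X}"
        for match in TYPESETTER_SPACES.finditer(text)
    ]

print(typesetter_space_code_points("thin\u2009space, soft\u00adhyphen"))
# ['U+2009', 'U+00AD']
```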

Hyphen Spacing

Find: - | -

Incorrect Hyphenation

Find: [A-Za-zÀ-ÿ]+-[A-Za-zÀ-ÿ]+

Missing Spaces around Tags and Commas

Find: (</[^>]+>)([A-Za-zÀ-ÿ]+)|([A-Za-zÀ-ÿ]+)<(?![eft]nref|page)([^/][^>]*)>

Find: ,([a-zÀ-ÿ0-9]+)([^ \n]*)

Find: (<in[12f]*>)(.*),[a-zÀ-ÿ0-9]

Composition/Articulation

Small Caps

Find: <[^>]*sm[^>]*>[^<]*</[^>]*sm[^>]*>

Tetragrammaton

Find: <[^>]*tetr[^>]*>[^<]*</[^>]*tetr[^>]*>

Self-Closing Note Reference Tags

Find: <([fe])nref/>|<([fe])nnum/>

Self-Closing and Unnecessary Tags (.sam/.scml)

Find: <(?!cell|img|page)[^<]*/>|</([^>]*)>[^A-Za-z0-9\n]?<\1>|<([^>/]*)>[^A-Za-z0-9\n]?</\2>

Index Section (.scml files)

Position of Tags and Spaces (.sam/.scml)

Find: ^( *)(<[^\n]*?)( )(</[^>]*>)|(<[^/|^>]*>)( )

Page IDs (.sam/.scml)

Find: (<xref.*?>.*?)(<page id=".*?"/>)(.*?</xref>)|(</url>)(<page id=".*?"/>)(<url>)

Find: <([^/>]*)>(<page[^>]*>)</\1>

Find: [a-z]+<page id="([^<]*)"/>[a-z]+
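The last search above flags a page tag that lands in the middle of a word. As a small Python sketch of how the capture group can be used to list the affected page IDs (the ID format `p12` and the sample text are invented for the example):

```python
import re

# Page tag with lowercase letters directly on both sides.
PAGE_INSIDE_WORD = re.compile(r'[a-z]+<page id="([^<]*)"/>[a-z]+')

def page_ids_inside_words(text):
    # Group 1 is the value of the id attribute.
    return [match.group(1) for match in PAGE_INSIDE_WORD.finditer(text)]

sample = 'exam<page id="p12"/>ple and <page id="p13"/>next'
print(page_ids_inside_words(sample))  # ['p12']
```

The second tag is not reported because a space precedes it, so it sits between words rather than inside one.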

Page references (.scml)

Find: [^</][Pp]age

Single-Chapter Bible Books (.scml)

Find: (<xbr t=")(Ob|Phm|2Jn|3Jn|Jud|Pr Az|Bel|Sus|Pr Man|LJe)( )([2-9]|[0-9]{2,})(:)([0-9-]+")
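Read literally, this search flags a reference to a single-chapter book whose chapter number is anything other than 1: the fourth group accepts a single digit from 2 to 9 or any number with two or more digits. A hedged Python sketch of that reading follows; the sample `<xbr>` fragments are hypothetical and only include the attribute form the pattern itself implies.

```python
import re

# Find pattern for single-chapter books, unchanged.
SINGLE_CHAPTER = re.compile(
    r'(<xbr t=")(Ob|Phm|2Jn|3Jn|Jud|Pr Az|Bel|Sus|Pr Man|LJe)'
    r'( )([2-9]|[0-9]{2,})(:)([0-9-]+")'
)

def suspicious_chapters(text):
    # Return (book, chapter) pairs where the chapter is not 1.
    return [
        (match.group(2), match.group(4))
        for match in SINGLE_CHAPTER.finditer(text)
    ]

sample = '<xbr t="Jud 1:5"> <xbr t="Phm 2:3"> <xbr t="Ob 12:1-4">'
print(suspicious_chapters(sample))  # [('Phm', '2'), ('Ob', '12')]
```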

Blind Notes Pairs

Italics in Italic or Small Caps Base Font Paragraphs

DTD Validation Troubleshooting (.sam/.scml)