
Vetting Guide

Vetting is the act of assessing materials to determine the state of the content and actions to be performed.

Consider the following:

  • What is the scope of the project?
  • What accessibility requirements will there be?
  • Will the project involve editorial tasks?
  • Will it be e-book only?
  • Will it be typeset only?
  • In what order will the final products be produced?

Word Scribing Vet

Use the following to check a Word document for errors, problems, and issues that will affect the scribing of the file and to determine how long it will take to scribe the file.

1. Files

Are all files present? (Check against the table of contents.)

Do all files open?

Do the files contain the content that the file names indicate?

2. Styles

Does the document use ScML styles?

If so:

  • Are ScML styles applied correctly?
  • Are all ScML styles current?

If not, does the file use styles or other indicators to indicate structure?

3. Structure

Does the document have a clear structure?

Is the structure consistent throughout?

4. Letter Casing

Do elements use consistent letter casing (title case, sentence case, all-caps) throughout?

5. Elements

What kinds of specialized elements are present throughout the document (sidebars, figures, tables, equations, etc.)?

How complex are these elements?

6. Character Styles

Is there any specialized use of character styles (e.g., bold, italic, underline, superscript, subscript, small caps)?

Do any characters rendered with a style such as bold or small caps need to be set apart to convey meaning (e.g., dispk for dialogue speaker or gt for glossary term)?

7. Special Characters

Do all special characters render within the document?

Are all characters unique Unicode entities, or are they rendered with a legacy font?

Are combining Unicode characters used?

Are all special characters used properly?

Use the Digital Hub to obtain a list of special characters used in the document:

  1. Upload your file to the Digital Hub.
  2. In the Word section under the Assets tab, click the Stats icon to see a list of all Non-ASCII characters.
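
When the Digital Hub is not at hand, a rough inventory can be approximated locally. The following is a minimal sketch, not the Digital Hub's method: it assumes the document text has already been exported to a plain string, and the function name non_ascii_inventory is hypothetical.

```python
import unicodedata
from collections import Counter


def non_ascii_inventory(text):
    """Return (character, code point, Unicode name, count) for each non-ASCII character."""
    counts = Counter(ch for ch in text if ord(ch) > 127)
    rows = []
    for ch, n in sorted(counts.items()):
        # Characters without a Unicode name (e.g., private-use code points) are flagged as UNNAMED.
        name = unicodedata.name(ch, "UNNAMED")
        rows.append((ch, f"U+{ord(ch):04X}", name, n))
    return rows


if __name__ == "__main__":
    sample = "Caf\u00e9 na\u00efve, the author\u2019s caf\u00e9"
    for row in non_ascii_inventory(sample):
        print(row)
```

Rows reported as UNNAMED deserve a closer look, since private-use code points are often a sign that a legacy font is supplying the glyph.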

8. Graphics

Does the document contain embedded images?

If so, refer to the Images Vet documentation.

9. Hidden Text

Does the document contain hidden text?

Use the SAI’s Cleanup tool to mark potentially hidden text.

10. Line Breaks and Tabs

Are line breaks and tabs used to indicate spacing or to delineate special elements?

11. Notes

Are footnotes and endnotes embedded?

Is there an equal number of fnnum, fnref, and fn? Check endnotes as well (ennum, enref, and en).
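
The count comparison can be sketched as follows. This is only an illustration: it assumes a list of applied style names (one entry per styled occurrence) has already been gathered from the file by some other means, and the function name is hypothetical.

```python
from collections import Counter


def note_counts_balanced(style_names, group):
    """Tally the styles named in group and report whether their counts are all equal."""
    counts = Counter(style_names)
    tally = {name: counts[name] for name in group}
    balanced = len(set(tally.values())) == 1
    return balanced, tally


if __name__ == "__main__":
    # Hypothetical list of styles found in a document, in reading order.
    styles = ["fnref", "fn", "fnnum", "fnref", "fn", "fnnum", "fnref"]
    print(note_counts_balanced(styles, ("fnnum", "fnref", "fn")))
    print(note_counts_balanced(styles, ("ennum", "enref", "en")))
```

A mismatch in the tally usually points to a note reference without a note (or the reverse) and is worth resolving before scribing begins.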

Do the footnote and endnote windows contain any content other than notes, such as heads?

Are there endnotes in sections that will fall outside of the InDesign text flow (sidebars, tables, boxes, etc.)?

  • If so, these should be changed to footnotes.
  • If it is unclear whether an element will be outside of the text flow, add a note to the typesetter or designer.

Copyediting Vet

Use the following to check a Word document for errors, problems, and issues that will affect the copyediting of the file and to determine how long it will take to copyedit the file.

1. Files

Are all files present?

Do all files contain complete content?

Do all files open?

Do the files contain the content that the file names indicate?

Are all elements consistent throughout?

Are any unusual elements present?

2. File Statistics

Obtain statistics for the following:

  • Character count
  • Word count
  • Number of non-ASCII special characters

Use the Digital Hub to obtain a list of file statistics for the document:

  1. Upload your file to the Digital Hub.
  2. In the Word section under the Assets tab, click the Stats icon to see the file statistics.
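
For a quick cross-check of those three figures, a minimal sketch is given below. It assumes the manuscript text is available as a plain string and counts words by splitting on whitespace, so its numbers may differ from those reported by Word or the Digital Hub; the function name is hypothetical.

```python
def file_statistics(text):
    """Return character count, whitespace-delimited word count, and non-ASCII character count."""
    return {
        "characters": len(text),
        "words": len(text.split()),
        "non_ascii": sum(1 for ch in text if ord(ch) > 127),
    }


if __name__ == "__main__":
    print(file_statistics("na\u00efve caf\u00e9 menu"))
```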

3. Specifications

What level of editing is required?

Will any editorial tasks for this project deviate from the standard procedure?

Will any other materials be forthcoming that will contain text not included in this copyedit (e.g., a cover or jacket)?

4. Notes

Does this project contain footnotes or endnotes?

If so:

  • How many notes are there (of each type)? Is there an equal number of fnnum, fnref, and fn? Is there an equal number of ennum, enref, and en?
  • What style is to be followed for the notes (CMS, APA, MLA, etc.)?
  • Does the current formatting of the notes match the required final formatting?
  • Are the bibliographic details complete in the notes?
  • Do the notes’ bibliographic details match the bibliography?
  • If there are blind notes, have the key phrases been identified, and do they match the phrases used in the note paragraphs?

5. Parenthetical Citations

Does this project contain parenthetical citations?

If so:

  • What style is to be followed for the parenthetical citations (CMS, APA, MLA, etc.)?
  • Does the current formatting of the citations match the required final formatting?
  • Do the citations match the reference list?

6. Bibliography and Reference List

How many entries are there?

Are all references complete?

Are there any missing entries?

What style is to be followed for the references (CMS, APA, MLA, etc.)?

Does the current formatting of the references match the required final formatting?

Are the references consistently formatted?

Does the bibliography need to be converted to a reference list (or vice versa)?

Are there aspects that will require a specialist’s input or expertise?

What decisions can be made by the person scribing, and what decisions will need specific instruction?

7. Quotations

Do all quotations include attributions?

What style is to be followed for the quotations (CMS, APA, MLA, etc.)?

Does their current formatting match the required final formatting?

When checking the accuracy of quotations, what are the expected requirements (particularly for Bible quotations)?

8. Figures and Tables

Are there figures or tables?

If so:

  • How many figures or tables are there?
  • What are the formatting requirements?

9. Editing Level

1. Language

Check the following to find any issues with language:

Was the content created in the author’s native language?

Is there any non-English language material?

If so:

  • How much material is non-English language?
  • Is the non-English language material set off from the main text or mixed in with English material?
  • Is the non-English material translated into English?
  • When checking the non-English language material, what are the requirements?
  • Are there aspects that will require a specialist’s input or expertise?
  • What decisions can be made by the person scribing, and what decisions will need specific instruction?

2. Author Style

Is punctuation used consistently (correctly or incorrectly)?

To what degree does the unedited manuscript conform to the style guide?

To what degree should changes be made to conform to a style versus the author’s style?

Does the writing contain any nuances that affect the readability of the manuscript?

3. Audience

What is the intended audience?

Are any industry-specific terms used?

If so:

  • Will the terms need to be spelled out in the text, or are they understood as a default by the intended audience?
  • Will one need to give more or less consideration to a specific audience?

Proofreading Vet

Use the following to check a PDF document or Word file for errors, problems, and issues that will affect the proofreading of the file and to determine how long it will take to proofread the file.

1. Files

Are all materials present?

What is the character count or page count of the files to be proofread?

2. Tolerances

What are the editorial tolerances at this stage?

How many rounds of proofreading are expected?

3. Format

How will the proofread be performed: on paper, in a Word document, or in a PDF?

Will changes be provided using comments in Word or Acrobat, or will changes be provided in a separate Word document?

4. Previous Edit

Have the files been edited previously?

If so:

  • Is there an established stylesheet?
  • Are there any unresolved editorial notes or queries?

5. Review

Will an author/editor review the file before, during, or after the proofread?

If there is a conflict between different sets of feedback, which should take precedence?

Design and Typesetting Vet

Use the following to check the source files, which can include images, Word documents, fonts, and so on, for errors, problems, and issues that will affect the design and typesetting of the project and to determine how long both of those components will take.

1. Files

Are all materials present?

Are any materials to be supplied at a later date? If so, should placeholder content be added to account for this?

2. Design Origin

Is a new design being created?

Is the design based on an existing design?

If so:

  • In what format does the original design exist (hard copy, PDF, InDesign, or a different program)?
  • Should the original design be matched exactly?
  • Is this book part of a series?

If not:

  • What tone should be conveyed in the design?
  • Are there similar books to be partially matched or reviewed for inspiration?
  • Is there a cover image or other reference file available to guide the design?
  • Are there any design aspects specifically to be avoided?

3. Specifications

What are the design specifications?

  • Trim size
  • Output requirements (printer requirements; crop marks; color or grayscale; bleed)
  • Typographical requirements

4. Software

Are there file version requirements (e.g., InDesign CS6, InDesign CC)?

Is the use of any program other than InDesign required?

Is the use of any specialized plug-in required?

5. Content

Does the content or structure of the document raise any questions that would impact design considerations?

Are there any elements the typesetter should be aware of (e.g., special fonts, equations)? If so, list them.

6. Equations

If there are any equations, in what format are the equations (MathType, Equation Editor, etc.)?

Note the number of equations.

7. Tables

If there are any tables, how complex are the tables?

Note the number of tables.

8. Figures

If there are any figures, do the figures require special treatment?

Note the number of figures.

9. Non-English Characters

Are there any non-English characters?

If so, what non-English characters are present?

Does any text read right-to-left, like Hebrew?

Does any non-English text require a special font?

Are all special characters Unicode?
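
Right-to-left text can be located by Unicode bidirectional class. The following is a minimal sketch under the assumption that the text is available as a plain string; it splits on whitespace only, and the function name is hypothetical.

```python
import unicodedata


def rtl_words(text):
    """Return whitespace-delimited words that contain right-to-left characters."""
    # Bidirectional classes "R" (e.g., Hebrew) and "AL" (Arabic letters) mark right-to-left characters.
    return [
        word
        for word in text.split()
        if any(unicodedata.bidirectional(ch) in ("R", "AL") for ch in word)
    ]


if __name__ == "__main__":
    print(rtl_words("The word \u05e9\u05dc\u05d5\u05dd means peace"))
```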

10. Fonts

Are all necessary fonts present?

Can the base fonts render every character present in the book?

11. Formatting Instructions

Are there any specialized formatting instructions? If so, list these instructions.

12. Import/Export Considerations

Will this book require special attention to accommodate the importing or exporting of XML content?

Are elements present that will require manual overrides to content or “-alt” styles while typesetting?

13. Possible Problems

Are there any factors present in the document that will increase typesetting time?

Is there anything unusual in the materials?

Extraction Vet

Use the following to check the source files, which can include InDesign, Quark, PDF, web, and FrameMaker files, for errors, problems, and issues that will affect the extraction of the files and to determine how long the extraction will take.

General

1. Source Files

What format are the source files?

Do all files open?

Were the source files produced using the WFDW?

Is there a reference PDF? If so, do the source files match the print version or reference PDF? If not, can an accurate PDF be generated from the source files?

2. Extraction Output

To what format will the extracted files be exported (IDTT, XML, XTG, etc.)?

3. Fonts

Are the fonts used in the source files available?

4. Inconsistencies

Are there any inconsistencies in style usage?

Are there any inconsistencies in text used, e.g., text appearing on book covers or images compared with the book’s interior?

5. Text Flow

Does the content flow correctly?

Are all text boxes/stories properly linked? Is everything in one text flow?

If images or other boxes outside of the main text flow need to be placed manually, will that affect the overall time estimate in a significant way?

6. Order

Is the content in the intended order?

7. List Items

Are numbers and bullets in lists automatically generated by the typesetting program?

8. Notes

Are there footnotes or endnotes?

9. Images

Are all images present?

Are any images masked/cropped in the typesetting program?

Are images named in any regular/sequential way?

Refer to the Images Vet documentation.

10. Directional Language

Is any directional language being used, e.g., "Figure 1 (left)"?

Will it be necessary to break up captions or alter directional language when converting to a reflowable ePub?
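
A first pass for directional language can be sketched as a keyword search over the captions. The word list below is an assumption made for illustration and should be adjusted per project; the function name is hypothetical.

```python
import re

# Assumed starter list of directional terms; extend or trim it for the project at hand.
DIRECTIONAL = re.compile(
    r"\b(left|right|above|below|top|bottom|opposite|overleaf)\b",
    re.IGNORECASE,
)


def find_directional_language(captions):
    """Return (caption, matched words) for each caption containing a directional term."""
    hits = []
    for caption in captions:
        words = [m.group(0).lower() for m in DIRECTIONAL.finditer(caption)]
        if words:
            hits.append((caption, words))
    return hits


if __name__ == "__main__":
    sample = [
        "Figure 1 (left)",
        "Figure 2. A copyright notice",
        "Figure 3. Top: the mill. Bottom: the weir.",
    ]
    for hit in find_directional_language(sample):
        print(hit)
```

Whole-word matching keeps "copyright" from being flagged, but phrases such as "the right to vote" will still appear, so every hit needs a manual review.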

11. Special Characters

Are Unicode characters used, or are special characters rendered by legacy fonts?

12. Plug-Ins

Are any specialized plug-ins being used?

13. Returns

Have hard and soft returns been used properly?

The following GREP search in InDesign will find soft returns that are not preceded by a space. The replacement expression will add a space.

Find: ([^ ])\n
Replace with: $1 \n

Note: Do not replace all if this will affect URLs or other soft returns that should not have a space in front of them.
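
The same expression can be tried outside InDesign before committing to a replace-all. The sketch below assumes the text has been exported to a string in which soft returns appear as "\n", and it uses an assumed, simplistic test for URL-like text; neither assumption comes from InDesign itself.

```python
import re

# Same expression as the InDesign search: a non-space character followed by a soft return.
NO_SPACE_BEFORE_BREAK = re.compile(r"([^ ])\n")
# Assumption for this sketch: a line that ends in a URL-like token should be left alone.
URL_TAIL = re.compile(r"(https?://|www\.)\S*$")


def add_missing_spaces(text):
    """Insert a space before each soft return that lacks one, except after URL-like text."""

    def fix(match):
        # Look at the line that ends at the matched character.
        last_line = text[: match.end(1)].rsplit("\n", 1)[-1]
        if URL_TAIL.search(last_line):
            return match.group(0)  # leave a break inside a URL untouched
        return match.group(1) + " \n"

    return NO_SPACE_BEFORE_BREAK.sub(fix, text)


if __name__ == "__main__":
    sample = "first line\nsecond \nsee https://example.com/very/\nlong/path"
    print(repr(add_missing_spaces(sample)))
```

Note that Python writes the captured group as match.group(1) (or \1 in a replacement string) where InDesign uses $1.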

14. Spaces

Have typesetter spaces (hair spaces, thin spaces) been applied properly? (Typically these are removed from extracted text so that they don’t become regular spaces. Nonbreaking spaces are usually kept.)
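
The removal described above can be sketched as follows, assuming the extracted text is available as a plain string. Only thin spaces (U+2009) and hair spaces (U+200A) are removed here; nonbreaking spaces (U+00A0) are left in place. The function name is hypothetical.

```python
# Typesetter spaces to strip; nonbreaking spaces (U+00A0) are deliberately not listed.
TYPESETTER_SPACES = {
    "\u2009": "THIN SPACE",
    "\u200a": "HAIR SPACE",
}


def strip_typesetter_spaces(text):
    """Return the text without thin and hair spaces, plus a count of what was removed."""
    removed = {name: text.count(ch) for ch, name in TYPESETTER_SPACES.items()}
    table = str.maketrans("", "", "".join(TYPESETTER_SPACES))
    return text.translate(table), removed


if __name__ == "__main__":
    cleaned, removed = strip_typesetter_spaces("p.\u200912\u00a0mm\u200a\u2019")
    print(repr(cleaned), removed)
```

Reporting the counts alongside the cleaned text gives the vetter a quick sense of how heavily these spaces were used in the source.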

InDesign Source

1. Style Usage

Are styles used properly and consistently?

Are ScML styles present?

Have GREP and nested styles been applied correctly? If not, text may get identified incorrectly in the exported XML. If the source files were not produced using the WFDW, check for modified styles. Detail how styles should be mapped to ScML.

2. Master Pages

Do master pages contain content that needs to be extracted?

3. Notes

Are InDesign notes present (Window > Editorial > Notes)? If so, the extracted IDTT will currently contain the note text with no indicator, so the notes will be mixed in with live text.

4. Sample

Does an extraction sample reveal any problems? (If working with files produced with the WFDW, extract the files to XML and use the Digital Hub to convert them to .sam.)

Do the images or their captions get anchored when running Scribe tools?

5. Special Conditions

Are layers being used?

Is there anything in the structure pane that should not be included in the XML output (e.g., pre-tagged images)?

6. Paragraph Marks

Have any pilcrows, or paragraph marks, been identified with the tocnum or tso style? If so, these will be deleted during export and the paragraphs will be combined.

Quark Source

1. Extension

Do the source files have extensions? (Zip the files for transmission, because files without extensions tend to become corrupted.)

2. Style Usage

Are styles properly used in the Quark files?

Are ScML styles present?

If the source files were not produced using the WFDW, check for modified styles. Detail how styles should be mapped to ScML.

PDF Source

1. OCR

Is the text selectable, or is the source PDF an image-only PDF?

Will OCR be required to extract content from the PDF?

How will the OCR output be verified?

2. Output

Can the PDF be saved as a Word document, or will content need to be copied and pasted?

Are there bad line breaks, combined words, separated words, and so on in the output?

3. Images

Are all image files present?

If not, will the images be extracted from the PDF?

Refer to the Images Vet documentation.

Web Source

1. Output

Can the web page be printed to text?

When printing to text, are any styles lost, such as bold or italics?

FrameMaker Source

1. Conversion

Will the FrameMaker files be saved to any other format before extracting?

If so, save the files down to MIF for ease of conversion to InDesign or other formats.

Images Vet

Use the following to check the image files (.jpg, .tiff, .png, and so on) for errors, problems, and issues that will affect how images are handled and to determine how long the image work will take.

1. Files

Are all materials present?

Are any materials to be supplied at a later date? If so, should placeholder content be added to account for this?

If images are embedded within a Word document, follow these steps to extract the images:

  • PC: Using a program such as 7-Zip, extract the content from the .docx file.
  • Mac: Change the .docx extension to .zip and double-click the file to extract the content.
  • Navigate to the media folder inside the word folder.
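
Because a .docx file is a zip archive, the steps above can also be scripted. The following is a minimal sketch using only the Python standard library; the function name and the output folder are hypothetical.

```python
import zipfile
from pathlib import Path


def extract_docx_images(docx_path, out_dir):
    """Copy every file under word/media/ in a .docx into out_dir; return the file names."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    names = []
    with zipfile.ZipFile(docx_path) as archive:
        for member in archive.namelist():
            # Skip directory entries and anything outside the media folder.
            if member.startswith("word/media/") and not member.endswith("/"):
                target = out / Path(member).name
                target.write_bytes(archive.read(member))
                names.append(target.name)
    return sorted(names)
```

Calling extract_docx_images("manuscript.docx", "images") (both names are placeholders) would write the embedded images to an images folder and return their file names.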

2. Number of Images

How many images are there, and how many of each type (e.g., charts, equations, photographs)?

3. Captions

Are captions present for all images that require them?

4. Image Resolution/DPI

What is the DPI of each image?

Does the current DPI of each image match the requirements of the final output?
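
What matters for the second question is the effective resolution at the size the image will be placed, which is the pixel dimension divided by the placed dimension in inches. A minimal sketch of that arithmetic follows; the 300 DPI default is an assumption for illustration and should be replaced with the project's actual requirement.

```python
def effective_dpi(pixel_width, pixel_height, placed_width_in, placed_height_in):
    """Return the lower of the horizontal and vertical pixels per inch at the placed size."""
    return min(pixel_width / placed_width_in, pixel_height / placed_height_in)


def meets_requirement(pixel_width, pixel_height, placed_width_in, placed_height_in,
                      required_dpi=300):
    # required_dpi=300 is an assumed default for this sketch, not a stated requirement.
    dpi = effective_dpi(pixel_width, pixel_height, placed_width_in, placed_height_in)
    return dpi >= required_dpi


if __name__ == "__main__":
    # A 1500 x 1200 pixel image placed at 5 x 4 inches, then at 6 x 4 inches.
    print(effective_dpi(1500, 1200, 5, 4), meets_requirement(1500, 1200, 5, 4))
    print(effective_dpi(1500, 1200, 6, 4), meets_requirement(1500, 1200, 6, 4))
```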

5. Work Required

What type of image work will be required? Will images need to be recreated or resized?

Do all images meet expectations?

Is any content cut off?

Does the content need to be cropped?

6. Format

What format are the images?

Are the images currently in the format to be used in the final output?

Are images presented in a way that matches the descriptions in their captions (e.g., side-by-side images whose caption refers to the left and right portions)?

7. Text Editing

Will images with text require copyediting or proofreading?

Are there any spelling errors?

Can text changes be applied to the images?

8. Permissions

Have permissions been obtained?

9. Accessibility

Will the images need alt text to be created/included?

Will the use of color affect color-blind readers in a way that obscures the intended purposes of the images?

Indexing Vet

If an index is required, use the following to check the source files, which can include Word and PDF files, for errors, problems, and issues that will affect the indexing of the files and to determine how long the indexing will take.

1. Files

Are all materials present?

Are any materials to be supplied at a later date? If so, should placeholder content be added to account for this?

2. Index Specifications

Does this index have any specific needs (e.g., focus on sports teams or locations)?

What style guide will the index follow (CMS, APA, MLA, etc.)?

Will the index be run-in?

How many sublevels will be included?

Does the index have a required length?

3. Multiple Indexes

Will multiple indexes be needed?

4. Type

What type of index(es) will be needed (subject, author names, Bible citations)?

E-books Vet

Use the following to check the source files, which can include InDesign, Quark, PDF files, and so on, for errors, problems, and issues that will affect the conversion of the files and to determine how long the conversion will take.

1. Files

Are all materials present?

Are any materials to be supplied at a later date? If so, should placeholder content be added to account for this?

Are the reference files present (e.g., PDF)?

Will the files be converted to ePub 3 (the current standard) or an older standard like ePub 2? Does the distribution method correspond with the desired output?

What source files will be used for e-book conversion?

How will the e-book be checked? Do all involved have access to the same programs to use when checking the e-book?

See the Extraction Vet documentation.

2. Metadata

Is the metadata present?

This includes the following:

  • Creator information
  • BISAC category
  • eISBN

3. Accessibility

Are there any accessibility requirements?

Will alt text need to be created/included?

4. Special Characters

Do all special characters render within the document?

Are all characters unique Unicode entities, or are they rendered with a legacy font?

Are combining Unicode characters used?

Are all special characters used properly?

Will any fonts need to be embedded to render special characters?

If so, are these fonts present?

Will any special characters need to be replaced with images?

Use the Digital Hub to obtain a list of special characters used in the document:

  1. Upload your file to the Digital Hub.
  2. In the Word section under the Assets tab, click the Stats icon to see a list of all Non-ASCII characters.
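
For the combining-character question in this section, it can help to know which sequences have a single precomposed equivalent and which do not, since the latter depend more heavily on font support. The sketch below assumes the text is available as a plain string and uses Unicode NFC normalization; the function name is hypothetical.

```python
import unicodedata


def combining_sequences(text):
    """Return (sequence, has_precomposed_form) for each base character followed by combining marks."""
    found = []
    i = 0
    while i < len(text):
        j = i + 1
        # Extend the sequence over any combining marks that follow the base character.
        while j < len(text) and unicodedata.combining(text[j]):
            j += 1
        if j - i > 1:
            seq = text[i:j]
            nfc = unicodedata.normalize("NFC", seq)
            found.append((seq, len(nfc) == 1))
        i = j
    return found


if __name__ == "__main__":
    for seq, precomposed in combining_sequences("cafe\u0301 q\u0303"):
        print(repr(seq), precomposed)
```

Sequences reported as having no precomposed form are the likelier candidates for an embedded font or, failing that, an image.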

5. Character Styles and Decorative Elements

Will small caps and dropcaps be retained?

What elements from the reference file should be included (e.g., section breaks, decorative ornaments)?

6. Internal Linking

Will the e-book need internal linking?

Will the index, if present, be linked?

7. Tables

Are there any tables present in the source files?

Are they complex?

8. Figures

Are there any complex figures?

Are there any text-heavy images or images that will not render well on e-readers?

Are the images present?

Will images (including cover images) need to be cropped, pulled from the PDF, or edited in any other way?

Do ornaments need to be retained?

9. CSS

Is there an existing CSS file, or will one need to be created?

To what degree does the e-book have to match a reference PDF?

10. Special Text Formatting

Will classes that deviate from ScML need to be added to the HTML file for rendering purposes (i.e., alt styles)?

11. Mobi and Web PDF

Will a Mobi file be needed?

Will a Web PDF be needed?