Preparing the document for translation:

1. What are tagged files?

What do I mean by “Preparing the text for translation”? For translation purposes, there are two types of tags:

• Tags that you may need to move or edit and that are, or could be, located in the middle of a segment

• Tags that you will almost never change and that are not (and should not be) in the middle of a segment

Overall, there are very few tags that you may need to delete during the translation process.

“Preparing files” means modifying the files so that they can be translated easily with a CAT tool. What follows is a description of a file prepared for Wordfast/Trados, known in translator lingo as a “tagged file”. Since Trados is (or was) widely used, most professional CAT tools can handle this type of file, with more or less success. However, if you use another CAT tool (SDLX, DV, …), please check its documentation. As you will use a CAT tool to work on the tagged file, I assume that you are familiar with the basic concepts. (If not, please read the following pages of this web site before going further: “What are CATs?” and “First translation”.)

A tagged file is an RTF file containing the source code (that is, tags + text) of the original HTML file. The tags are marked with two styles: tw4winInternal and tw4winExternal. Without getting into details, the tw4winInternal style is red, and the tw4winExternal style is light grey. Whenever you receive a file with tags in red and grey, it has almost certainly been tagged. Although the handling is very similar, beware that HTML files are not the only tagged files: many more exotic formats are tagged for use with CATs, such as SGML, XML, QuarkXPress, FrameMaker, etc.
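If you would rather check a file without opening it, the rule of thumb above can be sketched in a few lines of Python. This is only a rough heuristic, assuming the two style names appear as plain text in the RTF file's stylesheet; the function name looks_tagged and the sample string are made up for illustration, and the sample is far simpler than a real RTF file.

```python
def looks_tagged(rtf_text: str) -> bool:
    """Heuristic: a tagged file should mention both tw4win style names."""
    return "tw4winInternal" in rtf_text and "tw4winExternal" in rtf_text


# Hypothetical, heavily simplified RTF snippets (not real exported files).
sample = r"{\rtf1{\stylesheet{\cs1 tw4winInternal;}{\cs2 tw4winExternal;}}}"
plain = r"{\rtf1 Just some text}"

print(looks_tagged(sample))
print(looks_tagged(plain))
```

A positive result only tells you that the styles are declared, not that every tag has been styled correctly.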

All tags are protected against deletion by default, so that you do not delete one by mistake. Tags that you may need to move, like <b> (bold), are in tw4winInternal: “internal” because they will be included in the segment you have to translate. They are in red. Tags that you don't need to change or be concerned about during the translation process, like <p> (paragraph mark) or <body>, are in tw4winExternal and are in grey. A tag in tw4winExternal style ends a segment automatically.

Here is an example:

Correct: You are learning to translate <b>Web Sites</b></p>Bla bla bla

By now, you should know that “Web Sites” is in bold, and that the </p> marks the end of a paragraph. When you open that sentence with Wordfast (or Trados), the segment will end just after the </b>, although there is no period, because </p> is in tw4winExternal style.

Incorrect: You are learning to translate <b>Web Sites</b></p>Bla bla bla

(Here <b> is in tw4winExternal style, so the segment would stop right after “translate”.)

Incorrect: You are learning to translate <b>Web Sites</b></p>Bla bla bla

(Here </p> is not in tw4winExternal style, so the segment would include everything.)

Incorrect: You are learning to translate <b>Web Sites</b></p>bla bla bla

(Here none of the tags carries a tw4win style, so the segment would include everything and the tags are not protected.)
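Since the four examples differ only in how the tags are styled, it may help to see the rule as a small Python sketch. This is a toy model, not how Wordfast or Trados actually work: it assumes a sentence is a list of (text, style) runs, that the first run in external style ends the segment, and it ignores punctuation-based segmentation entirely. The names first_segment, unprotected_tags and the four sample lists are invented for illustration.

```python
from typing import List, Optional, Tuple

# A run is a piece of text plus its style: "internal", "external", or None.
Run = Tuple[str, Optional[str]]


def first_segment(runs: List[Run]) -> str:
    """Collect text up to (not including) the first external-style run."""
    parts = []
    for text, style in runs:
        if style == "external":
            break
        parts.append(text)
    return "".join(parts).strip()


def unprotected_tags(runs: List[Run]) -> List[str]:
    """List tags that carry no style, and so are not protected."""
    return [text for text, style in runs if style is None and text.startswith("<")]


def styled(b_style: Optional[str], p_style: Optional[str]) -> List[Run]:
    """Build the example sentence with the given styles on the tags."""
    return [
        ("You are learning to translate ", None),
        ("<b>", b_style), ("Web Sites", None), ("</b>", b_style),
        ("</p>", p_style), ("Bla bla bla", None),
    ]


correct = styled("internal", "external")
b_external = styled("external", "external")   # first incorrect case
p_internal = styled("internal", "internal")   # second incorrect case
unstyled = styled(None, None)                 # third incorrect case

for name, runs in [("correct", correct), ("b_external", b_external),
                   ("p_internal", p_internal), ("unstyled", unstyled)]:
    print(name, "->", repr(first_segment(runs)), unprotected_tags(runs))
```

In this model only the correct styling produces a segment that ends right after </b>, and only the unstyled variant reports unprotected tags.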

