
Preparing the document for translation:

1. What are tagged files?

What do I mean by “Preparing the text for translation”? For translation purposes, there are 2 types of tags:

• Tags that you may need to move or edit and that are/could be located in the middle of a segment

• Tags that you will almost never change and that are not (and should not be) in the middle of a segment

Overall, there are very few tags that you may need to delete during the translation process.

"Preparing files" means modifying the files so that they can be translated easily using a CAT. What follows is a description of a file prepared for Wordfast Classic, a “tagged file” in translator lingo. If you own and use another CAT (SDLX, DV,…), please check your CAT's documentation. As explained before, most modern CATs can prepare HTML files automatically, so the old way of preparing files is somewhat obsolete for HTML. I am leaving the explanation below because it is a good introduction to other formats that still need to be prepared.

A tagged file is an RTF file containing the source code (meaning tags + text) of the original HTML file. The tags are identified using 2 styles: tw4winInternal and tw4winExternal. Without getting into details, the tw4winInternal style is red, and the tw4winExternal style is light grey. Whenever you receive a file with tags in red and grey, it’s almost a given that the file has been tagged. Although the handling is very similar, beware that HTML files are not the only tagged files: many more exotic formats are tagged for use with CATs, like SGML, XML, QuarkXPress, FrameMaker, etc.

Today, files are often prepared straight to the XLIFF format, not tagged text, but tagged text is simple and serves as a good demonstration of the key concepts of file preparation.

All tags are protected against deletion by default, so that you do not delete one by mistake. Tags that you may need to move, like <b> (bold), are in tw4winInternal style: “internal” because they will be included in the segment you have to translate. They are in red. Tags that you don't need to change or to be concerned about during the translation process, like <p> (paragraph mark), <body>, …, are in tw4winExternal style and are in grey. A tag in tw4winExternal style will end a segment automatically.
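To make the two styles concrete, here is a minimal Python sketch that labels each tag in a line of HTML as tw4winInternal or tw4winExternal. It is not how Wordfast or any other CAT does it: the function name `classify_tags`, the regular expression and the short `INTERNAL_TAGS` list are assumptions made for illustration, and a real tool ships its own, much longer tag lists.

```python
import re

# Assumption: a short, illustrative list of inline tags that a translator
# may need to move. Everything else is treated as external.
INTERNAL_TAGS = {"b", "i", "u", "em", "strong", "a", "span"}

# Matches an opening or closing tag and captures its name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>")


def classify_tags(source):
    """Return (tag, style) pairs in the order the tags appear."""
    result = []
    for match in TAG_RE.finditer(source):
        name = match.group(1).lower()
        style = "tw4winInternal" if name in INTERNAL_TAGS else "tw4winExternal"
        result.append((match.group(0), style))
    return result


sample = "<p>Read the <b>manual</b> first.</p>"
for tag, style in classify_tags(sample):
    print(tag, style)
```

In this sketch the <b> and </b> tags come out as internal (red in a tagged file) and the <p> and </p> tags as external (grey).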

Here is an example:

Correct: You are learning to translate <b>Web Sites</b></p>Bla bla bla

By now, you should know that “Web Sites” is in bold, and that the </p> shows the end of a paragraph. In a correctly tagged file, <b> and </b> are in tw4winInternal style (red) and </p> is in tw4winExternal style (grey). When you open that sentence with Wordfast (or Trados), the segment will end just after the </b>, although there is no period, because </p> is in tw4winExternal style.

Incorrect: You are learning to translate <b>Web Sites</b></p>Bla bla bla

(Here the <b> tag is wrongly in tw4winExternal style, so the segment would stop right after “translate”).

Incorrect: You are learning to translate <b>Web Sites</b></p>Bla bla bla

(Here the </p> tag is not in tw4winExternal style, so nothing ends the segment and it would include everything).

Incorrect: You are learning to translate <b>Web Sites</b></p>Bla bla bla

(Here the tags have no style at all, so the segment would include everything and the tags are not protected).
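The effect of the tag styles on segmentation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `split_segments` is a hypothetical helper, not a Wordfast function, and it breaks segments at external tags only, ignoring the punctuation rules a real CAT also applies.

```python
import re

# Matches an opening or closing tag and captures its name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>")


def split_segments(source, external):
    """Split source into segments, ending one at every tag named in `external`.

    Tags not listed in `external` are treated as internal and stay in the segment.
    """
    segments, current, pos = [], "", 0
    for match in TAG_RE.finditer(source):
        current += source[pos:match.start()]
        if match.group(1).lower() in external:
            # An external tag closes the current segment and is left outside it.
            if current.strip():
                segments.append(current.strip())
            current = ""
        else:
            # An internal tag travels with the text it belongs to.
            current += match.group(0)
        pos = match.end()
    current += source[pos:]
    if current.strip():
        segments.append(current.strip())
    return segments


line = "You are learning to translate <b>Web Sites</b></p>Bla bla bla"

# Correct tagging: only </p> is external.
print(split_segments(line, {"p"}))
# ['You are learning to translate <b>Web Sites</b>', 'Bla bla bla']

# <b> wrongly external: the segment stops right after "translate".
print(split_segments(line, {"p", "b"}))
# ['You are learning to translate', 'Web Sites', 'Bla bla bla']

# Nothing external: the segment includes everything.
print(split_segments(line, set()))
# ['You are learning to translate <b>Web Sites</b></p>Bla bla bla']
```

The three calls reproduce the correct case and the first two incorrect cases above. The third incorrect case segments like the last call, with the added problem that unstyled tags can be deleted by accident.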
