translation memory - LabSpace


Dec 14, 2013


OT12 online seminar: Translation Memory Tools

Paul Filkin, Director of Client Communities, SDL Language Technologies


The Agenda… or things we’ll cover

How Trados was developed and established itself as industry leader

How translation memory tools work

What their benefits for open (and professional) translators are

What the particular distinguishing features of SDL Trados Studio are

What the future is for translation memory software

SDL Trados… a brief history


Translation Production

Content is either …

Translated by a professional translator

Or, the “occasional” translator
  Non-linguist, Subject matter specialist (reviewer), Crowd sourced, …

Or, left untranslated
  Not relevant, too costly, too much overhead involved, …

This presentation focuses on content produced by professional translators

Productivity Environments

Today, content workers utilize specialized productivity environment(s)

Content Worker              Application Class                 Prominent Example
Graphic Designers           Graphic tools                     Adobe Photoshop
Audio Producers, Musicians  DAW (Digital Audio Workstation)   Steinberg Cubase
Architects                  3D modeling program               Google SketchUp
Engineers                   CAD (Computer Aided Design)       Autodesk AutoCAD
Game Developer              Game Engine                       Epic Games Unreal Engine
Translators                 CAT (Computer Aided Translation)  SDL TRADOS TWB / SDL Studio

All mentioned trademarks are property of their respective owners.

Translation Editor is at the core of any CAT

Professional translation can be done

In principle, in any authoring editor (desktop/browser)
  However, with limited productivity (in the range of 800-1500 words per day) and high effort to maintain consistency and accuracy

Using Microsoft Word + Plug-ins
  Plug-in to translation productivity tool
  Difficult when dealing with structured content

Using a Dedicated Translation Editor (CAT or TEnT)
  Depending on various factors: productivity boost in the range of 2000 to 5000 words per day
  Well-established market for professionals


What is CAT Technology?

CAT: Computer-Aided Translation
  A generic term used to describe software which assists users during the localization/translation process
  Sometimes referred to as TEnT: Translation Environment Tool

Our CAT technology is an integrated toolset, offering:
  Translation Memory (TM)
  Termbase
  Editing environments
  Project Management functionality
  Software Localization
  OpenExchange

Public ProZ Poll, August 24: replies from 1670 translators
http://www.proz.com/polls/5474



What is CAT Technology?

CAT technology incorporates the concept of translation memory and termbase

Translation memory: a database consisting of translation units
  Translation unit: source and translated sentence or paragraph
  During translation, the technology searches for exact or similar matches to the current source segment for translation
  Matches found can be reused or edited

Termbase: multilingual database consisting of term entries
  Term entries: terms, synonyms, acronyms, etc.
  Contextual data: definition, part of speech, gender, etc.

Translators work with a translation memory and termbase to reuse previous translations and ensure consistency of terminology during translation


Translation Memory Overview

A translation memory is a searchable database containing source and translated sentences or paragraphs

A segment or phrase needs to be translated only once, because each translation is stored in the database

During a translation project, when the source segment recurs, the translation memory remembers the translation (by searching the database) and inserts it into the new document

The translator may accept the previous translation or edit the translation, if necessary
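The lookup described on this slide can be sketched in a few lines of Python. This is only an illustration, not how SDL Trados Studio implements its translation memory: the `TranslationMemory` class, its method names and the 0.75 fuzzy threshold are assumptions, and similarity is approximated with the standard library's `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: maps source segments to stored translations."""

    def __init__(self):
        # source segment -> target segment (one translation unit each)
        self.units = {}

    def store(self, source, target):
        """Save (or overwrite) the translation unit for a source segment."""
        self.units[source] = target

    def lookup(self, source, threshold=0.75):
        """Return (match_percent, stored_source, target) for the best match, or None."""
        if source in self.units:  # exact match: can be reused as is
            return 100, source, self.units[source]
        best = None
        for stored_source, target in self.units.items():
            score = SequenceMatcher(None, source, stored_source).ratio()
            if score >= threshold and (best is None or score > best[0]):
                best = (score, stored_source, target)
        if best is None:
            return None
        return int(best[0] * 100), best[1], best[2]


tm = TranslationMemory()
tm.store("Click the Save button.", "Klicken Sie auf die Schaltfläche Speichern.")

exact = tm.lookup("Click the Save button.")   # identical source segment
fuzzy = tm.lookup("Click the Open button.")   # similar segment, needs editing
nothing = tm.lookup("Restart the computer.")  # no usable match
```

An exact hit can be inserted directly, a fuzzy hit is offered for editing, and `None` means the translator starts from scratch.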

Terminology Management Overview


A termbase is a searchable database which contains a list of multilingual terms and contextual term data
  Term data gives details about the origin and use of the term, such as definition, gender, context, etc.

The termbase can be used in monolingual form during source content creation
  Ensure consistency of terminology in source documentation
  Facilitate translation for the global marketplace

The termbase can be used in bilingual form in conjunction with translation memory technology to increase translation accuracy
  Ensure consistency of terminology in translated documentation
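A termbase entry and its use during translation might be modelled as below. This is a rough sketch under stated assumptions: the `TermEntry` and `Termbase` names, the field layout and the naive substring matching are all hypothetical and do not reflect any particular termbase product.

```python
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    """One concept with its term per language plus contextual data."""
    terms: dict                                   # language code -> preferred term
    definition: str = ""
    part_of_speech: str = ""
    synonyms: dict = field(default_factory=dict)  # language code -> list of synonyms


class Termbase:
    def __init__(self, entries):
        self.entries = list(entries)

    def recognise(self, segment, source_lang, target_lang):
        """Return (source term, target term) pairs for entries found in the segment."""
        lowered = segment.lower()
        hits = []
        for entry in self.entries:
            candidates = [entry.terms.get(source_lang, "")]
            candidates += entry.synonyms.get(source_lang, [])
            for candidate in candidates:
                # Naive substring test; a real tool would match on word boundaries.
                if candidate and candidate.lower() in lowered:
                    hits.append((candidate, entry.terms.get(target_lang)))
                    break  # one hit per entry is enough
        return hits


termbase = Termbase([
    TermEntry(
        terms={"en": "translation memory", "de": "Translation Memory"},
        definition="Database of translation units",
        part_of_speech="noun",
        synonyms={"en": ["TM"]},
    ),
    TermEntry(terms={"en": "termbase", "de": "Terminologiedatenbank"}, part_of_speech="noun"),
])

hits = termbase.recognise("Open the termbase and the translation memory.", "en", "de")
```

Used in bilingual form like this, the recognised pairs can be shown next to the segment so the approved target term is used consistently.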

Key Productivity Accelerators

Topic Level: document, page, fragment, chunk, …

Segment Level: sentence, header, footnote, table cell, …

Subsegment Level: phrase, word, …

Exclusion from translation through markup

“Perfect Matching” utilizing bilingual representations

Translation Memory

Automated Translation

Auto-propagation

Auto-suggest (dictionary-based auto-completions)

Placeables, Terms

Concordance

Impact on effective handling of update translations

Impact on effective handling of new translations

Impact on effective handling of document-internal redundancies

Impact on consistency & quality

Topic (Document, …) Level

“Don’t translate if it hasn’t changed” (but show it to provide context for the text that has actually changed/added)

Significant productivity gains dependent on update frequency

Markup exclusions
  Use ITS / other convention to lock text
  Custom arrangements between CMS + Translation System

Perfect Matching
  Compare text with predecessor translation project and lock what hasn’t changed
  But, high overhead in managing corresponding projects
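The idea of comparing an updated document with its predecessor project and locking what has not changed can be illustrated with a small Python sketch. The function name `lock_unchanged` and the dictionary layout are assumptions made for this example; a real implementation compares bilingual project files and takes segment position and context into account.

```python
def lock_unchanged(previous_project, new_segments):
    """Pre-translate and lock segments whose source text is unchanged.

    previous_project: dict mapping source segment -> approved translation
    new_segments: list of source segments in the updated document
    """
    result = []
    for source in new_segments:
        if source in previous_project:
            # Unchanged: carry the approved translation over and lock it,
            # but keep it in the document so it still provides context.
            result.append({"source": source,
                           "target": previous_project[source],
                           "locked": True})
        else:
            # New or changed: left open for the translator.
            result.append({"source": source, "target": None, "locked": False})
    return result


previous = {"Welcome.": "Willkommen.", "Click Start.": "Klicken Sie auf Start."}
updated_doc = ["Welcome.", "Click Start now.", "Click Start."]
result = lock_unchanged(previous, updated_doc)
```

Only the second segment remains open here, which is why the gain depends on how often, and how much, the content is updated.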

Segment Level: TM

“Don’t re-translate if you can reuse an (approved) existing translation” (but adapt as you need)

Increasingly sophisticated match type differentiation
  100%, Fuzzies, Context Matches (CM), (ICE)
  Cascaded TMs, Ranking of TMs

Significant productivity gains dependent on
  Availability of relevant TMs
  Similar content produced again and again
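A simplified way to picture match type differentiation and cascaded TMs is sketched below. Everything here is an assumption for illustration: a context match is taken to mean "identical source segment and identical preceding segment", fuzzy scoring uses `difflib.SequenceMatcher` with a 0.75 threshold, TMs are plain lists of dictionaries searched in priority order, and ICE matches are not modelled.

```python
from difflib import SequenceMatcher

MATCH_RANK = {"CM": 3, "100%": 2, "fuzzy": 1}


def classify(unit, source, previous_source, threshold=0.75):
    """Return (match_type, score) for one translation unit, or None if below threshold."""
    if unit["source"] == source:
        same_context = (previous_source is not None
                        and unit.get("previous_source") == previous_source)
        return ("CM", 1.0) if same_context else ("100%", 1.0)
    score = SequenceMatcher(None, source, unit["source"]).ratio()
    if score >= threshold:
        return "fuzzy", score
    return None


def cascaded_lookup(tms, source, previous_source=None):
    """Search TMs in priority order; return (match_type, score, target, tm_name) or None."""
    best = None
    for name, units in tms:
        for unit in units:
            hit = classify(unit, source, previous_source)
            if hit is None:
                continue
            key = (MATCH_RANK[hit[0]], hit[1])
            # Only a strictly better hit replaces the current one,
            # so an earlier (higher-ranked) TM wins ties.
            if best is None or key > best[0]:
                best = (key, (hit[0], hit[1], unit["target"], name))
    return best[1] if best else None


project_tm = ("project", [
    {"source": "Press OK.", "target": "Drücken Sie OK.", "previous_source": "Select a file."},
])
master_tm = ("master", [
    {"source": "Press OK.", "target": "OK drücken."},
    {"source": "Select a folder.", "target": "Wählen Sie einen Ordner aus."},
])
tms = [project_tm, master_tm]

context_hit = cascaded_lookup(tms, "Press OK.", "Select a file.")
plain_hit = cascaded_lookup(tms, "Press OK.", "Something else.")
fuzzy_hit = cascaded_lookup(tms, "Select a file.")
```

Ranking the match type before the TM order means a context match from a lower-ranked TM still beats a plain 100% match from a higher-ranked one; whether that is the desired policy is a configuration choice.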

Segment Level: Automated Translations

“Adapt an automated translation proposal” (instead of translating from scratch)

Increasingly accepted by professional translators
  Especially using Statistical Machine Translation (SMT)

Significant productivity gains depending on
  SMT engine trained with sufficient, relevant (in-domain), high quality (professional translator output) data
  Translators are able to dynamically select “in-domain” trained engine [e.g. “Touchpoints”]
  Trust scores

Segment Level: Auto-propagation

“Auto-propagate translations for identical source segments” (and ripple through any changes when you change your translation)

Productivity gain if text has internal repetitions
  Simplifies updating identical segments throughout the content

Requires parameters to control behavior
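Auto-propagation, including one of the parameters that controls its behavior, can be sketched as follows. The segment dictionaries, the `auto_propagate` function and the `overwrite_confirmed` switch are hypothetical names chosen for this example, not settings of any specific tool.

```python
def auto_propagate(segments, index, new_target, overwrite_confirmed=False):
    """Apply new_target to segments[index] and to every other segment with the same source.

    Each segment is a dict with "source", "target" and "confirmed" keys.
    Returns the indexes of the segments updated by propagation.
    """
    source = segments[index]["source"]
    segments[index]["target"] = new_target
    segments[index]["confirmed"] = True
    updated = []
    for i, segment in enumerate(segments):
        if i == index or segment["source"] != source:
            continue
        if segment["confirmed"] and not overwrite_confirmed:
            continue  # this parameter decides whether confirmed work may be changed
        segment["target"] = new_target
        updated.append(i)
    return updated


segments = [
    {"source": "Click OK.", "target": "", "confirmed": False},
    {"source": "Enter your name.", "target": "", "confirmed": False},
    {"source": "Click OK.", "target": "Alt", "confirmed": True},
    {"source": "Click OK.", "target": "", "confirmed": False},
]

first_pass = auto_propagate(segments, 0, "Klicken Sie auf OK.")
# Changing the translation later ripples through the identical segments again.
second_pass = auto_propagate(segments, 0, "Auf OK klicken.", overwrite_confirmed=True)
```

The first call leaves the already confirmed segment untouched, while the second call overwrites it because the parameter explicitly allows that.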

Subsegment Level: Auto-suggest

“While I type, provide a list of relevant candidates so that I can quickly auto-complete this part of my translation”

Productivity gain highly dependent on available data sources and proposal strategy
  Optimal configurations reduce keystrokes by 30 to 50%

Avoidance of typos, impact on consistency
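Dictionary-based auto-completion can be approximated with a sorted phrase list and a prefix search. This sketch is an assumption about one possible proposal strategy: the `AutoSuggest` class, the `limit` and `min_chars` parameters and the sample phrases are invented for illustration.

```python
from bisect import bisect_left


class AutoSuggest:
    def __init__(self, phrases):
        # Sorted, case-folded keys allow a binary search for the typed prefix.
        self._entries = sorted((p.casefold(), p) for p in set(phrases))

    def suggest(self, typed, limit=5, min_chars=2):
        """Return up to `limit` phrases that start with the typed prefix."""
        prefix = typed.casefold()
        if len(prefix) < min_chars:
            return []  # too little input to make useful proposals
        start = bisect_left(self._entries, (prefix,))
        matches = []
        for key, phrase in self._entries[start:]:
            if not key.startswith(prefix):
                break  # keys are sorted, so no later entry can match
            matches.append(phrase)
            if len(matches) == limit:
                break
        return matches


suggester = AutoSuggest([
    "translation memory", "translation unit", "translator", "termbase",
])
candidates = suggester.suggest("trans")
```

Accepting a candidate after five typed characters replaces typing the full phrase, which is where the keystroke savings come from; how large they are depends on the phrase list and on `min_chars`.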

Subsegment Level: Placeables, terms

“While I type, make it easy for me to place tags, recognised terms and other placeables so I can focus on the translatable text.”

Productivity gain highly dependent on available data sources for terminology or translator diligence, and the complexity of the tags

Avoidance of typos, impact on consistency, robust target documents
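One way to think about placeables is as non-translatable items that must reappear in the target. The sketch below assumes that placeables are limited to simple inline tags and numbers and that they are copied unchanged; the regular expression and function names are illustrative, and real tools also handle dates, measurements and locale-specific number formats.

```python
import re

# Inline tags such as <b> or </b>, and numbers such as 12, 3.5 or 1,000.
PLACEABLE = re.compile(r"</?\w+[^>]*>|\d+(?:[.,]\d+)*")


def find_placeables(text):
    """Return tags and numbers in the order they appear."""
    return PLACEABLE.findall(text)


def missing_placeables(source, target):
    """Return placeables from the source that do not appear in the target."""
    remaining = find_placeables(target)
    missing = []
    for item in find_placeables(source):
        if item in remaining:
            remaining.remove(item)  # each target occurrence satisfies one source occurrence
        else:
            missing.append(item)
    return missing


source = "Press <b>Start</b> and wait 30 seconds."
found = find_placeables(source)
missing = missing_placeables(source, "Drücken Sie <b>Start</b> und warten Sie.")
```

A check like `missing_placeables` is what makes target documents more robust: a forgotten closing tag or number is reported before the file is delivered.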

Subsegment Level: Concordance

“Make it easy for me to search through Translation Memories, in either source or target text and from wherever I am in the document I’m translating”

Biggest impact is in being able to find things you’ve translated before that are similar to, or the same as, the current text and make them easy to reuse
  Impacts the quality of the work you deliver
  Impacts the time it takes to find the right words for complicated texts
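At its simplest, a concordance search is a substring search over either side of the translation units. The function below is a minimal sketch with assumed names (`concordance`, `search_in`) and a plain list of tuples standing in for a translation memory; it does no ranking or fuzzy matching.

```python
def concordance(units, query, search_in="source"):
    """Return translation units whose source or target text contains the query.

    units: list of (source, target) tuples
    search_in: "source" or "target"
    """
    if search_in not in ("source", "target"):
        raise ValueError("search_in must be 'source' or 'target'")
    column = 0 if search_in == "source" else 1
    needle = query.casefold()  # case-insensitive comparison
    return [unit for unit in units if needle in unit[column].casefold()]


units = [
    ("Click the Save button.", "Klicken Sie auf die Schaltfläche Speichern."),
    ("Save your work regularly.", "Speichern Sie Ihre Arbeit regelmäßig."),
    ("Close the window.", "Schließen Sie das Fenster."),
]

by_source = concordance(units, "save")
by_target = concordance(units, "fenster", search_in="target")
```

Searching the target side is useful when the translator remembers how something was phrased in the translation but not in the original.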


Key technology advances…

Whereas the key technology advances are in the area of subsegment reuse and statistical machine translation (SMT), the actual productivity gains for a Professional Translator relate to the ergonomics of how systems allow users to interact with, control and automate the various data sources:

  Access, creation, chaining, weighting and sharing of TMs
  Access to SMT pointing to specific engines
  Compilation of phrase dictionaries on the fly

What Happens When Teams Grow?


When teams of three or more work together, new factors must be considered to work effectively and properly collaborate

Translators, Reviewers, Project Managers

Typical Package-based Workflows

[Diagram: two alternative package-based workflows, each connecting Project Manager, Translator and Reviewer]

[Diagram: the Project Manager's workflow, repeated ...x 5 languages...]

Typical Project Workflow with SDL Studio GroupShare

1. Project Manager creates a project
  Performs analysis, pre-translation using SDL Trados Studio connected to a TM on TM Server

2. Project Manager publishes project
  Uses Publish command in Studio, selects server and location, and Studio takes care of the rest
  Contacts team via email, phone

Typical Project Workflow with SDL Studio GroupShare

3. Team Accesses Project
  Use Studio 2011 to open project
  Check out files as required for translation, review, or signoff
  Studio only gets files as needed
  Project Server tracks file versions
  Studio and Project Server synchronize metadata

Looking forward…

Current theme for CAT tools
  Reviewer productivity
  Inclusion of track changes and commenting mechanisms in translation editor
  Automation in the broader production chain

… and the Studio “Platform” which includes the OpenExchange

The SDL OpenExchange… current state of affairs

57 Apps on the OpenExchange
  42 are completely free

29,804 downloads (August 2012)

7,141 app users (August 2012)

396 developers (August 2012)


Copyright © 2008-2012 SDL plc. All rights reserved. All company names, brand names, trademarks, service marks, images and logos are the property of their respective owners.

This presentation and its content are SDL confidential unless otherwise specified, and may not be copied, used or distributed except as authorised by SDL.