Document Translation

To translate a document with the Lilt API, first upload the file to a Lilt project, wait for the import to finish, pretranslate it, and then download the translated file:

  1. Upload a file to a project
# Python; Note: This is just a stub
import json
import requests

def upload_document(filename, projid):
    payload = {"key": lilt_api_key}
    # Document metadata travels in the LILT-API header as a JSON string
    jsonData = {"name": filename, "project_id": projid}
    headers = {"LILT-API": json.dumps(jsonData), "Content-Type": "application/octet-stream"}
    # Read in binary mode so non-text formats are uploaded byte for byte
    with open(filename, 'rb') as fp:
        rawdata = fp.read()
    res = requests.post(lilt_api_url + "/documents/files", params=payload, data=rawdata, headers=headers, verify=False)
    return res.json()["id"]
  1. Check upload status and success. A document needs to be successfully imported before you can translate and export it. You can check if a document import is still in progress and whether it was successful.
# Python; Note: This is just a stub
from time import sleep
import requests

import_success = False
payload = {"key": lilt_api_key, "id": docid}
while True:
    res = requests.get(lilt_api_url + "/documents", params=payload, verify=False)
    if res:
        doc_stats = res.json()
        import_success = doc_stats["import_succeeded"]
        if not doc_stats["import_in_progress"]:
            break
    # Wait before polling again, also after a failed request
    sleep(2)
if import_success:
    translate_document(docid)
    download_document(filename, docid)
  1. Translate the document
# Python; Note: This is just a stub
import requests

def translate_document(docid):
    payload = {"key": lilt_api_key}
    jsonData = {"id": [docid]}  # can take a list of document IDs
    res = requests.post(lilt_api_url + "/documents/pretranslate", params=payload, json=jsonData, verify=False)
    # Raise on an HTTP error instead of silently continuing
    res.raise_for_status()
  1. Check translation progress and download the file
# Python; Note: This is just a stub
from time import sleep
import requests

def download_document(filename, docid):
    payload = {"key": lilt_api_key, "id": docid}
    # Poll until pretranslation has finished
    while True:
        res = requests.get(lilt_api_url + "/documents", params=payload, verify=False)
        if res and res.json()["status"]["pretranslation"] == "idle":
            break
        # Wait before polling again, also after a failed request
        sleep(5)
    # Request the translated file itself rather than an XLIFF export
    payload = {"key": lilt_api_key, "id": docid, "is_xliff": "false"}
    res = requests.get(lilt_api_url + "/documents/files", params=payload, verify=False)
    with open(filename, 'wb') as fp:
        fp.write(res.content)
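
Both status checks in the steps above follow the same pattern: request the document, inspect a field, wait, and repeat. The sketch below factors that pattern into one helper with a timeout, so a stuck import or pretranslation cannot block forever. The name poll_until and its parameters are hypothetical and not part of the Lilt API; the fetch callable stands in for a call such as requests.get(lilt_api_url + "/documents", ...) that returns the parsed JSON, or None when the request fails.

```python
import time


def poll_until(fetch, is_done, interval=2.0, timeout=600.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call fetch() until is_done(result) is true, then return that result.

    fetch may return None to signal a failed request, which is retried.
    Raises TimeoutError once `timeout` seconds have elapsed.
    """
    deadline = clock() + timeout
    while True:
        result = fetch()
        if result is not None and is_done(result):
            return result
        if clock() >= deadline:
            raise TimeoutError("document did not reach the expected state in time")
        sleep(interval)


# Stand-in for successive API responses; replace with a real request.
responses = iter([
    None,  # simulated failed request
    {"import_in_progress": True, "import_succeeded": False},
    {"import_in_progress": False, "import_succeeded": True},
])
stats = poll_until(lambda: next(responses),
                   lambda s: not s["import_in_progress"],
                   interval=0)
print(stats["import_succeeded"])
```

The same helper could wait for pretranslation by passing a predicate such as lambda s: s["status"]["pretranslation"] == "idle", with a longer interval.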

To adapt your machine translation engine and update the translation memory (TM) in your Memory, also add the new segments to the Memory together with their final, correct translations, once a translator or bilingual reviewer has verified them.

Segmentation

Lilt performs sentence segmentation on source segments for its internal representation. To bypass segmentation of XLIFF files, add a <seg-source> element, which indicates an already segmented source, with <mrk> markers inside it that specify the segment boundaries. Both <source> and <seg-source> must be present, even though the latter ultimately overrides the former. For example:

<trans-unit>
    <source>Segment this source. Any content here will be imported as multiple segments in Lilt. Try it out!</source>
    <seg-source><mrk mtype="seg">Do not segment this source. Any content here will be imported as one segment in Lilt. Try it out!</mrk></seg-source>
    <target />
</trans-unit>
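
If you generate XLIFF programmatically, a unit like the one above can be built with Python's standard xml.etree.ElementTree module. This is a minimal sketch, not an official Lilt utility: the function name build_trans_unit is hypothetical, the caller supplies the segment texts, and the id attribute is added because XLIFF 1.2 requires one on each <trans-unit>.

```python
import xml.etree.ElementTree as ET


def build_trans_unit(unit_id, segments):
    """Return a <trans-unit> whose <seg-source> marks each segment explicitly."""
    unit = ET.Element("trans-unit", {"id": str(unit_id)})
    # <source> carries the full, unsegmented text
    source = ET.SubElement(unit, "source")
    source.text = " ".join(segments)
    # <seg-source> carries one <mrk mtype="seg"> per intended segment
    seg_source = ET.SubElement(unit, "seg-source")
    for text in segments:
        mrk = ET.SubElement(seg_source, "mrk", {"mtype": "seg"})
        mrk.text = text
    # Empty <target> to be filled in by translation
    ET.SubElement(unit, "target")
    return unit


unit = build_trans_unit(1, ["First sentence.", "Second sentence."])
print(ET.tostring(unit, encoding="unicode"))
```

Passing a single-item list yields one <mrk> spanning the whole text, which matches the example above where the content is kept as one segment.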

To learn more, see the full API reference.

Last updated on 12th May 2019