---
source: sighthound-developer-portal
url: https://dev.sighthound.com/redactor/examples/speech-transcription/
markdown-url: https://dev.sighthound.com/redactor/examples/speech-transcription.md
title: "Speech Transcription"
description: "Use the Redactor API speech transcription feature and save transcript output with redaction state."
content-type: text/markdown
---

# Speech Transcription

<div class="portal-doc-hero portal-doc-hero--compact" markdown>
<p class="portal-doc-kicker">Redactor example guide</p>
<p class="portal-doc-intro">Generate speech transcription data through the Redactor API, optionally alongside detection and render operations.</p>
<div class="portal-doc-actions">
    [Review transcription setup](#speech-transcription)
    [Review API basics](../api-basics/)
</div>
</div>

<div class="portal-feature-grid">
    <div class="portal-feature-card">
        <h3><i class="portal-icon" data-lucide="message-square-text"></i> Transcript output</h3>
        <p>Save <code>speech.json</code> with the output state by enabling speech transcription data in the request body.</p>
    </div>
    <div class="portal-feature-card">
        <h3><i class="portal-icon" data-lucide="languages"></i> Language control</h3>
        <p>Specify a supported language code when the source media is in a language other than the English default.</p>
    </div>
    <div class="portal-feature-card">
        <h3><i class="portal-icon" data-lucide="file-json"></i> API-only mode</h3>
        <p>Run transcription by itself when you only need structured speech data and minimal metadata.</p>
    </div>
</div>

The Redactor API includes a basic speech transcription feature. When `SPEECH_TRANSCRIPTION` is added to the request's `features`, either by itself or along with other features, Redactor analyzes the video and attempts to detect the spoken words. The transcript is available in the Redactor UI for use in the editor and as a JSON file through the API.

To output the `speech.json` file, set `outputContext.speechTranscriptionData: true` in your request body; the file is saved alongside the other standard redaction state data. Provide a path to a __folder__ in `outputUri`, and __ensure the path ends in a trailing slash__.

=== "curl"

    ```sh
    curl --location --request POST 'http://localhost:9000/api/v1/videos:process' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer YourRedactorApiToken' \
    --data-raw '{
        "inputUri": "https://example.com/path/to/input.mp4",
        "features": ["HEAD_DETECTION", "SPEECH_TRANSCRIPTION", "MEDIA_RENDERING"],
        "outputUri": "file:///path/to/output/folder/",
        "outputContext": {
            "speechTranscriptionData": true
        }
    }'
    ```

=== "JSON Body"

    ```json
    {
        "inputUri": "https://example.com/path/to/input.mp4",
        "features": ["HEAD_DETECTION", "SPEECH_TRANSCRIPTION", "MEDIA_RENDERING"],
        "outputUri": "file:///path/to/output/folder/",
        "outputContext": {
            "speechTranscriptionData": true
        }
    }
    ```
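
Because a missing trailing slash on `outputUri` is easy to overlook, a script can normalize the value before building the request. The following is a minimal sketch; `ensure_trailing_slash` is a hypothetical helper name and is not part of the Redactor API.

```sh
# Hypothetical helper (not part of the Redactor API): append a trailing
# slash to a folder URI only when one is missing.
ensure_trailing_slash() {
    case "$1" in
        */) printf '%s\n' "$1" ;;   # already ends in a slash, pass through
        *)  printf '%s/\n' "$1" ;;  # add the missing slash
    esac
}

# Prints: file:///path/to/output/folder/
ensure_trailing_slash "file:///path/to/output/folder"
```

The result can then be used as the `outputUri` value in the request body.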

By default, English is used for the transcription. If a different language is being spoken in the media, specify that language code in `videoContext.speechTranscriptionConfig.languageCode`. The currently available language codes are:

- `en-us` - English (default if none specified)
- `cn` - 中文 (Chinese)
- `de` - Deutsch (German)
- `es` - Español (Spanish)
- `fr` - Français (French)
- `it` - Italiano (Italian)
- `ja` - 日本語 (Japanese)
- `nl` - Nederlands (Dutch)
- `pt` - Português (Portuguese)
- `ru` - Русский (Russian)

=== "curl"

    ```sh
    curl --location --request POST 'http://localhost:9000/api/v1/videos:process' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer YourRedactorApiToken' \
    --data-raw '{
        "inputUri": "https://example.com/path/to/input.mp4",
        "features": ["HEAD_DETECTION", "SPEECH_TRANSCRIPTION", "MEDIA_RENDERING"],
        "outputUri": "file:///path/to/output/folder/",
        "outputContext": {
            "speechTranscriptionData": true
        },
        "videoContext": {
            "speechTranscriptionConfig": {
                "languageCode": "es"
            }
        }
    }'
    ```

=== "JSON Body"

    ```json
    {
        "inputUri": "https://example.com/path/to/input.mp4",
        "features": ["HEAD_DETECTION", "SPEECH_TRANSCRIPTION", "MEDIA_RENDERING"],
        "outputUri": "file:///path/to/output/folder/",
        "outputContext": {
            "speechTranscriptionData": true
        },
        "videoContext": {
            "speechTranscriptionConfig": {
                "languageCode": "es"
            }
        }
    }
    ```
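
When the language code comes from user input or a configuration file, it may help to check it against the list above before sending the request. This is a sketch only: `is_supported_language` is a hypothetical helper, and it assumes the codes are matched exactly as listed (lowercase), which this page does not confirm.

```sh
# Hypothetical helper: succeed only for the language codes listed on this page.
# Assumption: codes are compared exactly as written (lowercase).
is_supported_language() {
    case "$1" in
        en-us|cn|de|es|fr|it|ja|nl|pt|ru) return 0 ;;
        *) return 1 ;;
    esac
}

# Check one listed code and one unlisted code.
for code in es en-gb; do
    if is_supported_language "$code"; then
        echo "$code: supported"
    else
        echo "$code: not in the documented list"
    fi
done
```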

If you only need a speech transcript, you can obtain it without performing any redactions on the media. The following example runs only the `SPEECH_TRANSCRIPTION` feature and configures `outputContext` to save the transcription data and skip saving the full redaction state. The `speech.json` file is saved to the output location along with some minimal metadata. Note that this additional metadata may not be output by default in future releases.

=== "curl"

    ```sh
    curl --location --request POST 'http://localhost:9000/api/v1/videos:process' \
    --header 'Content-Type: application/json' \
    --header 'Authorization: Bearer YourRedactorApiToken' \
    --data-raw '{
        "inputUri": "https://example.com/path/to/input.mp4",
        "features": ["SPEECH_TRANSCRIPTION"],
        "outputUri": "file:///path/to/output/folder/",
        "outputContext": {
            "speechTranscriptionData": true,
            "state": false
        },
        "videoContext": {
            "speechTranscriptionConfig": {
                "languageCode": "en-us"
            }
        }
    }'
    ```

=== "JSON Body"

    ```json
    {
        "inputUri": "https://example.com/path/to/input.mp4",
        "features": ["SPEECH_TRANSCRIPTION"],
        "outputUri": "file:///path/to/output/folder/",
        "outputContext": {
            "speechTranscriptionData": true,
            "state": false
        },
        "videoContext": {
            "speechTranscriptionConfig": {
                "languageCode": "en-us"
            }
        }
    }
    ```
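
For repeated transcription-only jobs, the request body above can be generated from a few arguments. The sketch below assumes a hypothetical `transcription_body` function; it inserts its arguments verbatim and does no JSON escaping, so the values must not contain double quotes or backslashes.

```sh
# Hypothetical helper: print a transcription-only request body.
# Arguments: input URI, output folder URI, language code (defaults to en-us).
# No JSON escaping is performed on the arguments.
transcription_body() {
    cat <<EOF
{
    "inputUri": "$1",
    "features": ["SPEECH_TRANSCRIPTION"],
    "outputUri": "$2",
    "outputContext": {
        "speechTranscriptionData": true,
        "state": false
    },
    "videoContext": {
        "speechTranscriptionConfig": {
            "languageCode": "${3:-en-us}"
        }
    }
}
EOF
}

# Print a body for German-language media.
transcription_body "https://example.com/path/to/input.mp4" "file:///path/to/output/folder/" "de"
```

The output can be piped to `curl` with `--data-binary @-` in place of the inline `--data-raw` body, keeping the endpoint and headers from the examples above.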

