Speech To Text

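from aibrary import AiBrary

aibrary = AiBrary()

# Open the audio file in binary mode; the context manager closes it afterwards.
with open("path/to/audio", "rb") as audio_file:
    transcription = aibrary.audio.transcriptions.create(
        model="whisper-1", file=audio_file
    )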

Generation

POST /v0/audio/transcriptions
Authorizations
Body
model (string, required)
ID of the model to use

language (string or null, optional)
Language in ISO-639-1 format

prompt (string or null, optional)
Optional text to guide the model's style or continue a previous audio segment
Default: ""

response_format (string enum, optional)
Format of the transcription response
Default: verbose_json

temperature (number or null, optional)
Sampling temperature for model responses (default is None).
Default: 0

timestamp_granularities (array or null, optional)
Granularities of timestamps to populate

file (binary string, required)
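
The body fields above are sent as multipart/form-data, not as JSON. The following is a minimal sketch of how such a request could be assembled with only the Python standard library. The helper names (`encode_multipart`, `build_request`, `send`), the `https` scheme, and the `application/octet-stream` part type are assumptions for illustration and are not part of the AiBrary SDK. The array-valued `timestamp_granularities` field is left out because this page does not say how arrays are encoded in the form.

```python
import uuid
import urllib.request

# Host and path come from the HTTP example on this page; the https scheme is an assumption.
API_URL = "https://api.aibrary.dev/v0/audio/transcriptions"


def encode_multipart(fields, file_name, file_bytes):
    """Encode scalar text fields plus one binary "file" part as multipart/form-data."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        header = f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"\r\n\r\n'
        parts.append(header.encode("utf-8") + str(value).encode("utf-8") + b"\r\n")
    file_header = (
        f'--{boundary}\r\n'
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'
        f'Content-Type: application/octet-stream\r\n\r\n'
    )
    parts.append(file_header.encode("utf-8") + file_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_request(token, file_name, file_bytes, model="whisper-1", **optional):
    """Build (but do not send) the POST request; optional fields set to None are skipped."""
    fields = {"model": model}
    fields.update({k: v for k, v in optional.items() if v is not None})
    body, content_type = encode_multipart(fields, file_name, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
    )


def send(request):
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

A call might look like `send(build_request("YOUR_SECRET_TOKEN", "audio.wav", audio_bytes, language="en"))`, where `audio_bytes` holds the file contents read in binary mode.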
Responses
200: Successful Response
Content type: application/json
Response (any of): object or string
POST /v0/audio/transcriptions HTTP/1.1
Host: api.aibrary.dev
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: multipart/form-data
Accept: */*
Content-Length: 146

{
  "model": "text",
  "language": "text",
  "prompt": "",
  "response_format": "verbose_json",
  "temperature": 0,
  "timestamp_granularities": [
    "word"
  ],
  "file": "binary"
}

Example response:

{
  "text": "text",
  "task": "text",
  "language": "text",
  "duration": 1,
  "words": [
    {
      "word": "text",
      "start": 1,
      "end": 1
    }
  ],
  "segments": [
    {
      "id": 1,
      "seek": 1,
      "start": 1,
      "end": 1,
      "text": "text",
      "tokens": [
        1
      ],
      "temperature": 1,
      "avg_logprob": 1,
      "compression_ratio": 1,
      "no_speech_prob": 1
    }
  ]
}
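
As an illustration of how a response of this shape might be consumed, the sketch below parses the example body and extracts segment and word timings. The field names are taken from the example response above. The helper names (`format_segments`, `word_timings`) are hypothetical, and the code assumes `words` and `segments` may be absent or null depending on the requested format and granularities.

```python
import json

# Example body copied from the response above (placeholder values).
SAMPLE_RESPONSE = """
{
  "text": "text",
  "task": "text",
  "language": "text",
  "duration": 1,
  "words": [{"word": "text", "start": 1, "end": 1}],
  "segments": [
    {
      "id": 1, "seek": 1, "start": 1, "end": 1, "text": "text",
      "tokens": [1], "temperature": 1, "avg_logprob": 1,
      "compression_ratio": 1, "no_speech_prob": 1
    }
  ]
}
"""


def format_segments(response):
    """Return one '[start - end] text' line per segment, or an empty list if none."""
    lines = []
    for segment in response.get("segments") or []:
        lines.append(
            f'[{segment["start"]:.2f}s - {segment["end"]:.2f}s] {segment["text"].strip()}'
        )
    return lines


def word_timings(response):
    """Return (word, start, end) tuples, or an empty list if word timestamps are missing."""
    return [(w["word"], w["start"], w["end"]) for w in response.get("words") or []]


response = json.loads(SAMPLE_RESPONSE)
print(response["text"])
for line in format_segments(response):
    print(line)
```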
