AiBrary

Speech To Text


Last updated 2 months ago

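from aibrary import AiBrary

aibrary = AiBrary()
# Open the audio in binary mode; the context manager closes the file afterwards.
with open("path/to/audio", "rb") as audio_file:
    transcription = aibrary.audio.transcriptions.create(
        model="whisper-1", file=audio_file
    )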

Generation

POST /v0/audio/transcriptions

Authorizations
Bearer token, sent in the Authorization header.

Body (multipart/form-data)

model (string, required)
ID of the model to use

language (string or null, optional)
Language in ISO-639-1 format

prompt (string or null, optional)
Optional text to guide the model's style or continue a previous audio segment
Default: ""

response_format (string enum, optional)
Format of the transcription response
Default: verbose_json

temperature (number or null, optional)
Sampling temperature for model responses
Default: 0

timestamp_granularities (array or null, optional)
Granularities of timestamps to populate

file (binary, required)
The audio file to transcribe
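
The body fields and their documented defaults can be collected in a small helper before a request is made. The following is a minimal sketch, not part of the AiBrary SDK: `TranscriptionParams` and `to_form_fields` are hypothetical names, and the two-letter check on `language` only reflects the shape of ISO-639-1 codes, not any validation the API is known to perform.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class TranscriptionParams:
    """Hypothetical helper mirroring the documented body fields (not an SDK class)."""

    model: str                             # required
    language: Optional[str] = None         # ISO-639-1 code such as "en"
    prompt: Optional[str] = ""             # documented default
    response_format: str = "verbose_json"  # documented default
    temperature: Optional[float] = 0       # documented default

    def to_form_fields(self) -> dict:
        """Return the non-file form fields as strings, skipping None values."""
        if not self.model:
            raise ValueError("model is required")
        # ISO-639-1 codes are two lowercase letters; reject anything else early.
        if self.language is not None and not re.fullmatch(r"[a-z]{2}", self.language):
            raise ValueError(f"expected a two-letter ISO-639-1 code, got {self.language!r}")
        fields = {
            "model": self.model,
            "language": self.language,
            "prompt": self.prompt,
            "response_format": self.response_format,
            "temperature": self.temperature,
        }
        return {name: str(value) for name, value in fields.items() if value is not None}


fields = TranscriptionParams(model="whisper-1", language="en").to_form_fields()
```

Leaving `language` unset simply omits it from the form, which corresponds to the "or null" alternative in the schema.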
Responses

200: Successful Response (application/json)
Either a transcription object or a string.

422: Validation Error (application/json)
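
Because a successful response may be either an object or a string, client code may want to normalise both shapes to plain text. This is a minimal sketch under the assumption that the object form carries the transcript in a `text` field, as in the example response on this page; `transcript_text` is a hypothetical helper name.

```python
import json


def transcript_text(body: str) -> str:
    """Return the transcript from a 200 response body (JSON object, JSON string, or raw text)."""
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return body  # not JSON at all: treat the body as the transcript itself
    if isinstance(payload, dict):
        return payload["text"]  # assumed field name, taken from the example response
    if isinstance(payload, str):
        return payload
    return body


from_object = transcript_text('{"text": "hello", "task": "transcribe"}')
from_string = transcript_text('"hello"')
```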
Example request
POST /v0/audio/transcriptions HTTP/1.1
Host: api.aibrary.dev
Authorization: Bearer YOUR_SECRET_TOKEN
Content-Type: multipart/form-data
Accept: */*
Content-Length: 146

{
  "model": "text",
  "language": "text",
  "prompt": "",
  "response_format": "verbose_json",
  "temperature": 0,
  "timestamp_granularities": [
    "word"
  ],
  "file": "binary"
}
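
The example request lists its fields as JSON, but the `Content-Type` header is `multipart/form-data`, so the fields are sent as form parts. Below is a minimal sketch of building such a request with only the Python standard library. The URL is assembled from the `Host` header and path shown above; `encode_multipart` and `build_request` are hypothetical helper names, and sending `timestamp_granularities` as a plain form field is an assumption about how the API accepts list values.

```python
import urllib.request
import uuid

# Assembled from the Host header and request path in the example request.
API_URL = "https://api.aibrary.dev/v0/audio/transcriptions"


def encode_multipart(fields, file_name, file_bytes):
    """Encode text fields plus one binary part named "file" as multipart/form-data."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields:
        chunks.append(f"--{boundary}\r\n".encode())
        chunks.append(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        chunks.append(f"{value}\r\n".encode())
    chunks.append(f"--{boundary}\r\n".encode())
    chunks.append(
        f'Content-Disposition: form-data; name="file"; filename="{file_name}"\r\n'.encode()
    )
    chunks.append(b"Content-Type: application/octet-stream\r\n\r\n")
    chunks.append(file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode())  # closing boundary
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_request(token, fields, file_name, file_bytes):
    """Build (but do not send) the POST request."""
    body, content_type = encode_multipart(fields, file_name, file_bytes)
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
    )


request = build_request(
    "YOUR_SECRET_TOKEN",
    [
        ("model", "whisper-1"),
        ("response_format", "verbose_json"),
        ("timestamp_granularities", "word"),  # assumed encoding for list values
    ],
    "audio.wav",
    b"\x00\x01",  # placeholder bytes standing in for real audio
)
# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

`Content-Length` is filled in by `urllib` from the encoded body when the request is sent, so it does not need to be set by hand.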

Example response (200)
{
  "text": "text",
  "task": "text",
  "language": "text",
  "duration": 1,
  "words": [
    {
      "word": "text",
      "start": 1,
      "end": 1
    }
  ],
  "segments": [
    {
      "id": 1,
      "seek": 1,
      "start": 1,
      "end": 1,
      "text": "text",
      "tokens": [
        1
      ],
      "temperature": 1,
      "avg_logprob": 1,
      "compression_ratio": 1,
      "no_speech_prob": 1
    }
  ]
}
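
The `segments` array in a `verbose_json` response carries enough timing information to build subtitles. The following is a minimal sketch that turns segments into SRT text; it assumes `start` and `end` are expressed in seconds, which this page does not state, and the `sample` payload is made up for illustration.

```python
def srt_timestamp(seconds: float) -> str:
    """Format a time offset in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(response: dict) -> str:
    """Convert the "segments" list of a verbose_json response into SRT blocks."""
    blocks = []
    for index, segment in enumerate(response.get("segments", []), start=1):
        start = srt_timestamp(segment["start"])  # assumed to be seconds
        end = srt_timestamp(segment["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{segment['text'].strip()}\n")
    return "\n".join(blocks)


# Hypothetical payload with the same field names as the example response.
sample = {
    "text": "Hello there.",
    "segments": [{"id": 0, "start": 0.0, "end": 1.5, "text": " Hello there."}],
}
srt = segments_to_srt(sample)
```

The same loop works over `words` instead of `segments` when word-level timestamps were requested, since each word entry has its own `start` and `end`.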