POST /documents
Ingest documents from files or URLs
import requests

url = "https://app.ubik-agent.com/api/v1/documents"

# Text form fields. Multipart values are sent as strings; omit a field to use its default.
payload = {
    "urls": "<string>",
    "workspace_ids": "<string>",
    "scraping_mode": "Simple Scraping",
    "crawl_depth": "2",
    "same_domain_only": "true",
    "limit": "10",
    "delay": "1",
    "youtube_download_format": "audio",
}
headers = {"X-API-KEY": "<api-key>"}

# Upload each file under the "files" form field; add more tuples to send several files.
with open("example-file", "rb") as fh:
    files = [("files", ("example-file", fh))]
    response = requests.post(url, data=payload, files=files, headers=headers)

print(response.text)
[
  {
    "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "name": "<string>",
    "status": "<string>",
    "created_at": "2023-11-07T05:31:56Z",
    "updated_at": "2023-11-07T05:31:56Z",
    "file_type": "<string>",
    "processing_pipeline": "<string>",
    "error_message": "<string>",
    "file_name": "<string>",
    "markdown_content": "<string>"
  }
]

Authorizations

X-API-KEY
string
header
required
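
A minimal sketch of building this header without hard-coding the key. The environment variable name `UBIK_API_KEY` and the helper `auth_headers` are assumptions made for illustration; neither is part of the API.

```python
import os


def auth_headers(env_var: str = "UBIK_API_KEY") -> dict[str, str]:
    """Build the X-API-KEY header from an environment variable.

    The variable name is a hypothetical choice for this sketch.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {"X-API-KEY": api_key}
```

The returned dictionary can be passed as the `headers` argument of the request.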

Body

multipart/form-data
files
file[] | null

A list of files to upload.

urls
string | null

A comma-separated list of URLs to scrape.

workspace_ids
string | null

A comma-separated list of workspace IDs to add the documents to.
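
Since `urls` and `workspace_ids` are single strings, a client that holds them as lists has to join them before sending. The sketch below shows one way to do that. The helper name `comma_join` and the example values are illustrative. How the server handles whitespace or a comma inside a single value is not documented here, so this helper trims whitespace and rejects embedded commas as a conservative assumption.

```python
from typing import Iterable, Optional


def comma_join(values: Optional[Iterable[str]]) -> Optional[str]:
    """Join values into one comma-separated form field, or None if empty."""
    if not values:
        return None
    cleaned = [v.strip() for v in values if v and v.strip()]
    for v in cleaned:
        if "," in v:
            raise ValueError(f"value contains a comma: {v!r}")
    return ",".join(cleaned) or None


payload = {
    "urls": comma_join(["https://example.com/a", "https://example.com/b"]),
    "workspace_ids": comma_join(["<workspace-id>"]),
}
# Drop unset fields so they are not sent as the string "None".
payload = {k: v for k, v in payload.items() if v is not None}
```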

scraping_mode
string
default: Simple Scraping

Scraping mode ('Simple Scraping' or 'Crawling').

crawl_depth
integer
default: 2

The maximum depth for crawling links.

same_domain_only
boolean
default: true

Whether to only crawl links on the same domain.

limit
integer
default: 10

The maximum number of pages to crawl.

delay
number
default: 1

The delay in seconds between requests.

youtube_download_format
string
default: audio

The format to download YouTube videos in ('audio' or 'video').
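
The option fields above are typed as integer, boolean, and number, but a multipart form carries them as strings. The sketch below encodes a crawl configuration that way. The helper `crawl_fields` is hypothetical, and the lowercase `"true"`/`"false"` encoding is an assumption based on the request sample; whether the server also accepts other spellings is not stated.

```python
def crawl_fields(
    crawl_depth: int = 2,
    same_domain_only: bool = True,
    limit: int = 10,
    delay: float = 1,
    youtube_download_format: str = "audio",
) -> dict[str, str]:
    """Encode crawl options as multipart form strings."""
    if youtube_download_format not in ("audio", "video"):
        raise ValueError("youtube_download_format must be 'audio' or 'video'")
    return {
        "scraping_mode": "Crawling",
        "crawl_depth": str(crawl_depth),
        # Lowercase mirrors the request sample; str(True) would send "True".
        "same_domain_only": "true" if same_domain_only else "false",
        "limit": str(limit),
        "delay": str(delay),
        "youtube_download_format": youtube_download_format,
    }


# Shallow, slow crawl of at most five pages.
fields = crawl_fields(crawl_depth=1, limit=5, delay=0.5)
```

The resulting dictionary can be merged with the `urls` field and passed as the form data of the request.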

Response

Successful Response: a JSON array of document objects, each with the following fields.

id
string<uuid>
required

The unique identifier for the document.

name
string
required

The display name of the document.

status
string
required

The current processing status of the document.

created_at
string<date-time>
required

The timestamp when the document was created.

updated_at
string<date-time>
required

The timestamp when the document was last updated.

file_type
string | null

The MIME type of the document file.

processing_pipeline
string | null

The name of the processing pipeline used for this document.

error_message
string | null

If processing failed, this field will contain the error message.

file_name
string | null

The original file name of the document.

markdown_content
string | null

The full content of the document converted to Markdown format.
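
A short sketch of reading the response, assuming it has already been decoded into a list of dictionaries with the fields above. The sample records, file names, and error text are made up for illustration, and the helper `split_by_error` is hypothetical. The possible `status` values are not listed on this page, so the sketch relies only on `error_message`; a document without an error may still be processing.

```python
import json

# Made-up sample in the documented shape (not real API output).
sample = json.loads("""
[
  {"id": "3c90c3cc-0d44-4b50-8888-8dd25736052a", "name": "a.pdf",
   "status": "<string>", "error_message": null, "markdown_content": "# A"},
  {"id": "3c90c3cc-0d44-4b50-8888-8dd25736052b", "name": "b.pdf",
   "status": "<string>", "error_message": "unsupported file",
   "markdown_content": null}
]
""")


def split_by_error(documents):
    """Return (ok, failed) lists based on whether error_message is set."""
    ok = [d for d in documents if not d.get("error_message")]
    failed = [d for d in documents if d.get("error_message")]
    return ok, failed


ok, failed = split_by_error(sample)
for doc in failed:
    print(f"{doc['id']}: {doc['error_message']}")
```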