Ingest documents from files or URLs

import requests

url = "https://app.ubik-agent.com/api/v1/documents"
headers = {"X-API-KEY": "<api-key>"}

# Form fields; multipart/form-data carries every value as a string.
payload = {
    "urls": "<string>",
    "workspace_ids": "<string>",
    "api_metadata": "<string>",
    "scraping_mode": "Simple Scraping",
    "crawl_depth": "2",
    "same_domain_only": "true",
    "limit": "10",
    "delay": "1",
    "youtube_download_format": "audio",
}

# Each upload goes under the "files" form field documented in the Body section.
with open("example-file", "rb") as f:
    files = [("files", ("example-file", f))]
    response = requests.post(url, data=payload, files=files, headers=headers)

print(response.text)

[
  {
    "id": "3c90c3cc-0d44-4b50-8888-8dd25736052a",
    "name": "<string>",
    "status": "<string>",
    "created_at": "2023-11-07T05:31:56Z",
    "updated_at": "2023-11-07T05:31:56Z",
    "file_type": "<string>",
    "processing_pipeline": "<string>",
    "error_message": "<string>",
    "api_metadata": {},
    "file_name": "<string>",
    "markdown_content": "<string>"
  }
]

Authorizations

X-API-KEY

string

header

required

Headers

X-End-User-ID

string | null
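As a minimal sketch of how the two headers above might be assembled: `build_headers` is a hypothetical helper, not part of any SDK, and omitting `X-End-User-ID` when no value is given is an assumption based only on the header being nullable.

```python
from typing import Dict, Optional


def build_headers(api_key: str, end_user_id: Optional[str] = None) -> Dict[str, str]:
    """Return request headers for the documents endpoint (hypothetical helper)."""
    if not api_key:
        raise ValueError("X-API-KEY is required")
    headers = {"X-API-KEY": api_key}
    # X-End-User-ID is nullable, so it is only sent when a value is given
    # (an assumption; the reference does not say how a missing header is treated).
    if end_user_id is not None:
        headers["X-End-User-ID"] = end_user_id
    return headers
```

The returned dictionary can be passed as the `headers` argument of `requests.post`.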

Body

multipart/form-data

files

file[] | null

A list of files to upload.
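Because `files` is a list, several uploads can likely be sent in one request by repeating the field name. The sketch below builds the list-of-tuples form that `requests` accepts for its `files` argument. `build_file_parts` is a hypothetical helper, and repeating the `files` field name is the usual multipart convention rather than something this reference states.

```python
from typing import List, Mapping, Tuple


def build_file_parts(contents: Mapping[str, bytes]) -> List[Tuple[str, Tuple[str, bytes]]]:
    """Map {file name: raw bytes} to multipart parts under the "files" field."""
    # Every part reuses the same field name so the parts form one list
    # (assumed convention for a file[] field).
    return [("files", (name, data)) for name, data in contents.items()]


parts = build_file_parts({"a.txt": b"first", "b.txt": b"second"})
```

`parts` can then be passed as `files=parts` to `requests.post`, alongside the other form fields in `data`.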

urls

string | null

A comma-separated list of URLs to scrape.

workspace_ids

string | null

A comma-separated list of workspace IDs to add the documents to.

api_metadata

string | null

Custom API metadata (JSON string).
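The three fields above are all plain strings on the wire: `urls` and `workspace_ids` are comma-separated and `api_metadata` is JSON. A minimal sketch of encoding them from Python values follows; `build_source_fields` is a hypothetical helper, and leaving out unset fields instead of sending empty strings is an assumption.

```python
import json
from typing import Any, Dict, Mapping, Optional, Sequence


def build_source_fields(
    urls: Optional[Sequence[str]] = None,
    workspace_ids: Optional[Sequence[str]] = None,
    api_metadata: Optional[Mapping[str, Any]] = None,
) -> Dict[str, str]:
    """Encode list and dict inputs into the string form fields."""
    fields: Dict[str, str] = {}
    if urls:
        # A URL that itself contains a comma cannot be represented this way.
        fields["urls"] = ",".join(urls)
    if workspace_ids:
        fields["workspace_ids"] = ",".join(workspace_ids)
    if api_metadata is not None:
        # The field carries a JSON string, not a nested form structure.
        fields["api_metadata"] = json.dumps(api_metadata)
    return fields
```

For example, `build_source_fields(urls=["https://example.com/a", "https://example.com/b"], api_metadata={"internal_app_ref": "REF-123"})` yields a dictionary suitable for the `data` argument of `requests.post`.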

scraping_mode

string

default: Simple Scraping

Scraping mode ('Simple Scraping' or 'Crawling').

crawl_depth

integer

default: 2

The maximum depth for crawling links.

same_domain_only

boolean

default: true

Whether to only crawl links on the same domain.

limit

integer

default: 10

The maximum number of pages to crawl.

delay

number

default: 1

The delay in seconds between requests.

youtube_download_format

string

default: audio

The format to download YouTube videos in ('audio' or 'video').
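The scraping options above can be gathered into one set of form fields. The sketch below uses the documented defaults and allowed values. `build_crawl_fields` is a hypothetical helper, and the numeric range checks are local sanity checks, not limits stated by this reference.

```python
from typing import Dict

SCRAPING_MODES = ("Simple Scraping", "Crawling")
YOUTUBE_FORMATS = ("audio", "video")


def build_crawl_fields(
    scraping_mode: str = "Simple Scraping",
    crawl_depth: int = 2,
    same_domain_only: bool = True,
    limit: int = 10,
    delay: float = 1,
    youtube_download_format: str = "audio",
) -> Dict[str, str]:
    """Validate the scraping options and encode them as form strings."""
    if scraping_mode not in SCRAPING_MODES:
        raise ValueError(f"scraping_mode must be one of {SCRAPING_MODES}")
    if youtube_download_format not in YOUTUBE_FORMATS:
        raise ValueError(f"youtube_download_format must be one of {YOUTUBE_FORMATS}")
    # Assumed sanity checks; the API may enforce different bounds.
    if crawl_depth < 0 or limit < 1 or delay < 0:
        raise ValueError("crawl_depth, limit, and delay must be non-negative (limit at least 1)")
    return {
        "scraping_mode": scraping_mode,
        "crawl_depth": str(crawl_depth),
        # Lowercase booleans match the request sample for this endpoint.
        "same_domain_only": "true" if same_domain_only else "false",
        "limit": str(limit),
        "delay": str(delay),
        "youtube_download_format": youtube_download_format,
    }
```

A crawl limited to 25 pages could then be requested with `build_crawl_fields(scraping_mode="Crawling", limit=25)` merged into the `data` dictionary.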

Response

Successful Response

id

string<uuid>

required

The unique identifier for the document.

name

string

required

The display name of the document.

status

string

required

The current processing status of the document.

created_at

string<date-time>

required

The timestamp when the document was created.

updated_at

string<date-time>

required

The timestamp when the document was last updated.
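The `created_at` and `updated_at` values in the response sample end in `Z`. A minimal sketch of turning them into timezone-aware `datetime` objects follows, assuming the API always returns ISO 8601 timestamps shaped like the sample; `parse_timestamp` is a hypothetical helper.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 timestamp such as "2023-11-07T05:31:56Z"."""
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11,
    # so it is rewritten as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


created = parse_timestamp("2023-11-07T05:31:56Z")
```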

file_type

string | null

The MIME type of the document file.

processing_pipeline

string | null

The name of the processing pipeline used for this document.

error_message

string | null

If processing failed, this field will contain the error message.

api_metadata

object

Custom API metadata resolved for the current user. Useful for retrieving stored external references (e.g. {'internal_app_ref': 'REF-123'}). Note: This value is resolved based on the request's external_user_id. A document shared globally but also scoped to a specific user may return different metadata depending on who is asking.

file_name

string | null

The original file name of the document.

markdown_content

string | null

The full content of the document converted to Markdown format.
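Since the response is a list of document objects, a caller will often want to separate documents that report an `error_message` from those that do not. The sketch below does this without assuming any particular `status` values, which this reference does not enumerate; `split_by_error` is a hypothetical helper.

```python
from typing import Any, Dict, List, Tuple

Document = Dict[str, Any]


def split_by_error(documents: List[Document]) -> Tuple[List[Document], List[Document]]:
    """Return (documents without an error_message, documents with one)."""
    ok: List[Document] = []
    failed: List[Document] = []
    for doc in documents:
        # error_message is nullable; a missing, null, or empty value counts as no error.
        if doc.get("error_message"):
            failed.append(doc)
        else:
            ok.append(doc)
    return ok, failed


sample = [
    {"id": "3c90c3cc-0d44-4b50-8888-8dd25736052a", "name": "a", "error_message": None},
    {"id": "3c90c3cc-0d44-4b50-8888-8dd25736052b", "name": "b", "error_message": "parse failed"},
]
ok_docs, failed_docs = split_by_error(sample)
```

With `requests`, the input would typically come from `response.json()` after checking `response.raise_for_status()`.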