Ingestion

Docket accepts files via multipart upload, or raw text as JSON, and processes them through a pipeline: validate, extract, classify (rich mode only), embed, store.

Upload a file

curl -X POST http://localhost:3000/ingest \
  -F "file=@tests/fixtures/bicycle.txt" \
  -F "async=false"
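
For clients that do not shell out to curl, the same multipart request can be assembled by hand. The following is a minimal sketch using only the Python standard library; the helper name `build_upload_request` is hypothetical, and the only things taken from this page are the `/ingest` endpoint and the `file` and `async` form fields. It builds the request without sending it.

```python
import mimetypes
import uuid
from urllib.request import Request


def build_upload_request(url, filename, data, async_mode=False):
    """Build (but do not send) a multipart/form-data POST for /ingest."""
    # Guess the MIME type from the file name, much as curl does for -F uploads.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    boundary = uuid.uuid4().hex
    parts = [
        # "file" part: headers, blank line, raw bytes.
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n".encode() + data + b"\r\n",
        # "async" part: a plain form field carrying "true" or "false".
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="async"\r\n\r\n'
        f"{'true' if async_mode else 'false'}\r\n".encode(),
        # Closing boundary.
        f"--{boundary}--\r\n".encode(),
    ]
    headers = {"Content-Type": f"multipart/form-data; boundary={boundary}"}
    return Request(url, data=b"".join(parts), headers=headers, method="POST")


req = build_upload_request(
    "http://localhost:3000/ingest", "bicycle.txt", b"A bicycle has two wheels."
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it to a running server.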

The file's MIME type is used as contentType. For large files or batch uploads, use async mode:

curl -X POST http://localhost:3000/ingest \
  -F "file=@video.mp4" \
  -F "async=true"

Response:

{
  "jobId": "job_xyz789",
  "status": "pending"
}

Async jobs are enqueued via the configured QueueAdapter. In v0.2.0, the in-memory queue only stores jobs; the workers that would process them are not yet wired up.
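
To illustrate the behaviour described above, here is a minimal sketch of an in-memory queue that stores jobs and returns the `jobId`/`status` pair shown in the response. The real QueueAdapter interface is not documented on this page, so the class name `InMemoryQueue` and its `enqueue` and `get` methods are assumptions made for illustration.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Job:
    job_id: str
    type: str
    payload: dict
    status: str = "pending"


class InMemoryQueue:
    """Hypothetical stand-in for a QueueAdapter: stores jobs, runs nothing."""

    def __init__(self):
        self._jobs = {}

    def enqueue(self, job_type, payload):
        # Generate an id shaped like the "job_xyz789" example above.
        job = Job(job_id=f"job_{uuid.uuid4().hex[:6]}", type=job_type, payload=payload)
        self._jobs[job.job_id] = job
        return {"jobId": job.job_id, "status": job.status}

    def get(self, job_id):
        return self._jobs.get(job_id)


queue = InMemoryQueue()
response = queue.enqueue("ingestion", {"filename": "video.mp4"})
print(response["status"])
```

Because no worker consumes the queue in this sketch, a job stays `pending` after it is enqueued.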

Upload raw text

curl -X POST http://localhost:3000/ingest \
  -H "Content-Type: application/json" \
  -d '{"text": "Rust uses ownership instead of GC", "contentType": "text/plain"}'
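
The same JSON request can be built programmatically. This is a minimal sketch using the Python standard library; `build_text_request` is a hypothetical helper, and passing optional fields such as `metadata` as top-level JSON keys is an assumption based on the parameter table rather than confirmed behaviour.

```python
import json
from urllib.request import Request


def build_text_request(url, text, content_type="text/plain", **optional):
    """Build (but do not send) a JSON POST carrying raw text for /ingest."""
    # Optional fields (e.g. metadata, sectorHint) are merged in as top-level
    # keys; this layout is assumed, not taken from the API reference.
    payload = {"text": text, "contentType": content_type, **optional}
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_text_request(
    "http://localhost:3000/ingest",
    "Rust uses ownership instead of GC",
    sectorHint="semantic",
)
print(req.data.decode("utf-8"))
```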

Parameters

Field	Type	Required	Description
`file`	File	Yes*	The file to ingest (multipart only)
`text`	string	Yes*	Raw text to ingest (JSON only)
`contentType`	string	Yes*	MIME type of the content. Required with `text`; inferred from the uploaded file for multipart uploads
`async`	boolean	No	`true` queues the job; `false` waits for completion
`metadata`	JSON	No	Arbitrary key-value metadata
`sectorHint`	string	No	Force sector: `episodic`, `semantic`, `procedural`, `emotional`, `reflective`
`validFrom`	datetime	No	ISO 8601 start of validity window (rich mode)
`validTo`	datetime	No	ISO 8601 end of validity window (rich mode)

*Either `file` (multipart) or `text` (JSON) is required. `contentType` must be supplied only with `text`.
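
The rules in the table can be checked on the client before a request is sent. The sketch below is one possible reading of those rules, not Docket's own validation: the function `validate_params` and its error messages are hypothetical, and `datetime.fromisoformat` accepts only a subset of ISO 8601.

```python
from datetime import datetime

SECTORS = {"episodic", "semantic", "procedural", "emotional", "reflective"}


def parse_iso(value):
    # Older Python versions reject a trailing "Z", so map it to an explicit offset.
    return datetime.fromisoformat(str(value).replace("Z", "+00:00"))


def validate_params(params, has_file=False):
    """Return a list of problems found in the request parameters."""
    errors = []
    if not has_file and "text" not in params:
        errors.append("either file or text is required")
    if not has_file and "contentType" not in params:
        errors.append("contentType is required when sending text")
    hint = params.get("sectorHint")
    if hint is not None and hint not in SECTORS:
        errors.append(f"unknown sectorHint: {hint}")
    for key in ("validFrom", "validTo"):
        if key in params:
            try:
                parse_iso(params[key])
            except ValueError:
                errors.append(f"{key} is not an ISO 8601 datetime")
    return errors


print(validate_params({"text": "hello", "contentType": "text/plain"}))
print(validate_params({"text": "hello", "contentType": "text/plain", "sectorHint": "bogus"}))
```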

Supported content types

Currently implemented:

text/plain, text/markdown, text/html, and other text/* types

Deferred to later phases:

Images (OCR)
PDF documents
Audio/video (Whisper transcription)
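
A client can apply the same rule before uploading. This is a minimal sketch, assuming that "supported" means any `text/*` media type as listed above; the function name `is_supported` is hypothetical, and the server may apply further checks.

```python
def is_supported(content_type):
    """Return True for text/* media types, the only ones currently implemented."""
    # Drop parameters such as "; charset=utf-8" and normalise case first.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("text/")


for candidate in ("text/markdown", "text/html; charset=utf-8", "application/pdf", "video/mp4"):
    print(candidate, is_supported(candidate))
```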

What happens during ingestion

1. Validate: check MIME type and size limits.
2. Store blob: save the raw file to BlobAdapter (multipart only).
3. Extract text: plain-text extraction (OCR/PDF/audio deferred).
4. Classify sector (rich mode only): the LLM decides between episodic, semantic, procedural, emotional, and reflective.
5. Generate embedding: send the extracted text or summary to EmbedderAdapter.
6. Generate summary: the LLM produces a short summary.
7. Store memory: save the record to StoreAdapter with metadata, access policy, and relations.
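
The steps above can be sketched as a single function. This is an illustration only, not Docket's implementation: the adapters are reduced to plain callables, the 10 MiB limit and the record field names are hypothetical, and the embedding is computed from the extracted text rather than the summary.

```python
MAX_BYTES = 10 * 1024 * 1024  # hypothetical size limit


def ingest(data, content_type, *, save_blob, embed, summarize, save_memory,
           classify=None, from_file=True, metadata=None):
    """Run the ingestion steps in the order listed above."""
    # Validate: only text/* is implemented, and the payload must fit the limit.
    if not content_type.lower().startswith("text/"):
        raise ValueError(f"unsupported content type: {content_type}")
    if len(data) > MAX_BYTES:
        raise ValueError("payload exceeds size limit")
    # Store blob: multipart uploads only.
    blob_key = save_blob(data) if from_file else None
    # Extract text: plain-text extraction is a UTF-8 decode in this sketch.
    text = data.decode("utf-8")
    # Classify sector: rich mode only, so skipped when no classifier is given.
    sector = classify(text) if classify is not None else None
    # Generate embedding, then summary.
    embedding = embed(text)
    summary = summarize(text)
    # Store memory: hand the assembled record to the store.
    record = {
        "text": text, "contentType": content_type, "blobKey": blob_key,
        "sector": sector, "embedding": embedding, "summary": summary,
        "metadata": metadata or {},
    }
    return save_memory(record)


stored = []


def save_memory(record):
    stored.append(record)
    return record


result = ingest(
    b"A bicycle has two wheels.", "text/plain",
    save_blob=lambda blob: "blob_1",         # stand-in for BlobAdapter
    embed=lambda text: [float(len(text))],   # stand-in for EmbedderAdapter
    summarize=lambda text: text[:20],        # stand-in for the LLM summary
    save_memory=save_memory,                 # stand-in for StoreAdapter
)
print(result["sector"], result["blobKey"])
```

Passing a `classify` callable corresponds to rich mode; leaving it out skips the sector step.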

When RBAC is enabled, the current principal becomes the memory owner unless `owner` or `accessPolicy` is provided explicitly.

Ingestion jobs

The queue processes these job types:

Type	Description
`ingestion`	Full pipeline for a new file
`extraction`	Re-run text extraction (e.g., after updating extractor)
`summarization`	Re-generate summary with a new prompt
`insight-generation`	Cross-memory pattern detection
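
A worker that consumes these jobs could route each one by its type. The sketch below is hypothetical: the `JobType` enum mirrors the table above, while the `dispatch` function, the job dictionary shape, and the `memoryId` field are assumptions made for illustration.

```python
from enum import Enum


class JobType(str, Enum):
    INGESTION = "ingestion"
    EXTRACTION = "extraction"
    SUMMARIZATION = "summarization"
    INSIGHT_GENERATION = "insight-generation"


def dispatch(job, handlers):
    """Route a job to the handler registered for its type."""
    # JobType(...) raises ValueError for a type that is not in the table;
    # a missing handler surfaces as KeyError.
    job_type = JobType(job["type"])
    return handlers[job_type](job)


handlers = {
    JobType.EXTRACTION: lambda job: f"re-extracting {job['memoryId']}",
    JobType.SUMMARIZATION: lambda job: f"re-summarizing {job['memoryId']}",
}

print(dispatch({"type": "extraction", "memoryId": "m1"}, handlers))
```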
