Documents

paperless.documents provides the full feature set of the Paperless-ngx document API, in addition to the standard CRUD operations available on all resources.


Fetching a document

document = await paperless.documents(42)

print(document.id)
print(document.title)
print(document.correspondent)   # int (id) or None
print(document.document_type)   # int (id) or None
print(document.tags)            # list[int]
print(document.created)         # datetime.date | None
print(document.content)         # extracted text content
print(document.page_count)
print(document.mime_type)
print(document.archive_serial_number)

Downloading file contents

Every document can be fetched in three modes: download (archived), preview and thumbnail. All return a DownloadedDocument instance.

download  = await paperless.documents.download(42)
preview   = await paperless.documents.preview(42)
thumbnail = await paperless.documents.thumbnail(42)

Or using an already-fetched document:

doc = await paperless.documents(42)

download  = await doc.get_download()
preview   = await doc.get_preview()
thumbnail = await doc.get_thumbnail()

The DownloadedDocument holds:

Attribute             Description
content               Raw binary file data
content_type          MIME type, e.g. "application/pdf"
disposition_type      "attachment" or "inline"
disposition_filename  Suggested filename from Content-Disposition
original              Whether the original file was requested
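
The attributes above are enough to store a download on disk. The following is a minimal sketch, not part of the library: write_download and DownloadStandIn are hypothetical names introduced here, and the stand-in class only mimics the content and disposition_filename attributes so the example runs without a Paperless server.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class DownloadStandIn:
    """Hypothetical stand-in with some DownloadedDocument attributes."""

    content: bytes
    content_type: Optional[str] = None
    disposition_filename: Optional[str] = None


def write_download(download, directory, fallback_name="document.bin"):
    """Write download.content into directory and return the file path."""
    # Keep only the last path component of the suggested filename so that
    # directory separators in it cannot redirect the write elsewhere.
    name = Path(download.disposition_filename or fallback_name).name
    if name in ("", ".", ".."):
        name = fallback_name
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / name
    target.write_bytes(download.content)
    return target
```

Assuming a real DownloadedDocument exposes the same attributes, the result of paperless.documents.download(42) could be passed to write_download in place of the stand-in.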

Requesting the original file

By default, the archived (processed) version is returned. Pass original=True to get the original uploaded file:

download = await paperless.documents.download(42, original=True)

Searching documents

async for document in paperless.documents.search("type:invoice"):
    print(document.title, document.search_hit.score)

You can also pass the query as a keyword argument:

async for document in paperless.documents.search(query="annual report"):
    ...

Custom field query

async for document in paperless.documents.search(
    custom_field_query='["amount", "gte", 10000]'
):
    ...

Custom field query syntax is documented in the Paperless-ngx API reference.
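
The query string in the example above is a JSON array, so it can be built with json.dumps instead of being written by hand, which avoids quoting mistakes. This is a small sketch; custom_field_query_string is a hypothetical helper, not a pypaperless function, and it only covers a single [field, operator, value] condition.

```python
import json


def custom_field_query_string(field, operator, value):
    """Serialize one [field, operator, value] condition as a JSON string."""
    return json.dumps([field, operator, value])


query = custom_field_query_string("amount", "gte", 10000)
print(query)  # ["amount", "gte", 10000]
```

The resulting string can be passed as custom_field_query=query to paperless.documents.search().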

Search hits

When a document is returned from a search, it carries a DocumentSearchHit in its search_hit attribute:

if document.has_search_hit:
    hit = document.search_hit
    print(hit.score)
    print(hit.highlights)
    print(hit.rank)
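
Because search results arrive through an async iterator, picking the best matches means collecting them first. The sketch below assumes that score is numeric and that a higher score means a better match. top_hits and fake_search are hypothetical names, and fake_search only imitates paperless.documents.search() so the example runs on its own.

```python
import asyncio
import heapq
from types import SimpleNamespace


async def top_hits(documents, limit=5):
    """Return up to `limit` documents with the highest search_hit.score."""
    scored = []
    async for document in documents:
        # Skip documents that carry no search hit or no score.
        if document.has_search_hit and document.search_hit.score is not None:
            scored.append(document)
    return heapq.nlargest(limit, scored, key=lambda d: d.search_hit.score)


async def fake_search():
    """Hypothetical stand-in for paperless.documents.search(...)."""
    for title, score in [("A", 0.2), ("B", 0.9), ("C", 0.5)]:
        hit = SimpleNamespace(score=score)
        yield SimpleNamespace(title=title, has_search_hit=True, search_hit=hit)


best = asyncio.run(top_hits(fake_search(), limit=2))
print([d.title for d in best])  # ['B', 'C']
```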

Find documents similar to a given document:

async for document in paperless.documents.more_like(42):
    print(document.title)

Metadata

meta = await paperless.documents.metadata(42)

# or via a fetched document
doc = await paperless.documents(42)
meta = await doc.get_metadata()

The returned DocumentMeta object includes embedded metadata from the file (e.g. EXIF or PDF metadata):

for entry in meta.original_metadata:
    print(entry.namespace, entry.key, entry.value)
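
A flat list of entries is easier to query once it is grouped by namespace. The following sketch relies only on the namespace, key and value attributes used in the loop above. group_by_namespace is a hypothetical helper, and the sample entries are made up for illustration.

```python
from collections import defaultdict
from types import SimpleNamespace


def group_by_namespace(entries):
    """Build {namespace: {key: value}} from metadata entries."""
    grouped = defaultdict(dict)
    for entry in entries:
        # A repeated key within one namespace overwrites the earlier value.
        grouped[entry.namespace][entry.key] = entry.value
    return dict(grouped)


# Made-up entries standing in for meta.original_metadata
entries = [
    SimpleNamespace(namespace="pdf", key="Producer", value="ExampleScan"),
    SimpleNamespace(namespace="pdf", key="Pages", value="3"),
    SimpleNamespace(namespace="xmp", key="CreateDate", value="2024-01-15"),
]

grouped = group_by_namespace(entries)
print(grouped["pdf"]["Producer"])  # ExampleScan
```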

Suggestions

Paperless-ngx can suggest classifications (correspondents, document types, tags and storage paths) for a document:

suggestions = await paperless.documents.suggestions(42)

# or via a fetched document
doc = await paperless.documents(42)
suggestions = await doc.get_suggestions()

print(suggestions.correspondents)   # list[int]
print(suggestions.document_types)   # list[int]
print(suggestions.tags)             # list[int]
print(suggestions.storage_paths)    # list[int]

Notes

Every document can have a list of notes attached to it.

# Fetch notes for a document
notes = await paperless.documents.notes(42)

# or via a fetched document
doc = await paperless.documents(42)
notes = await doc.notes()

for note in notes:
    print(note.note, note.created)

Adding a note

# Pass the document pk as the first positional argument
draft = paperless.documents.notes.draft(42, note="This needs review")
note_id, doc_id = await paperless.documents.notes.save(draft)

Or via a fetched document (the document pk is bound automatically):

doc = await paperless.documents(42)
draft = doc.notes.draft(note="This needs review")
note_id, doc_id = await doc.notes.save(draft)

Deleting a note

note = notes[0]
await paperless.documents.notes.delete(note)

Next available ASN

Request the next free archive serial number from Paperless-ngx:

next_asn = await paperless.documents.get_next_asn()
print(f"Next ASN: {next_asn}")

Uploading a document

Use draft() to construct a document upload and save() to submit it. The document content must be provided as bytes.

import datetime

# Read the file as bytes; draft() expects raw binary content
with open("invoice.pdf", "rb") as f:
    content = f.read()

draft = paperless.documents.draft(
    document=content,
    filename="invoice.pdf",
    title="Invoice 2024-01",
    created=datetime.datetime(2024, 1, 15),
    correspondent=3,
    document_type=2,
    tags=[1, 5],
    archive_serial_number=1042,
)

task_id = await paperless.documents.save(draft)
print(f"Upload queued as task: {task_id}")

Note

Unlike other resources, save() for documents returns a task ID string, not an integer ID. The document is processed asynchronously by Paperless-ngx. Use paperless.tasks to monitor the task.

Document draft fields

Field                  Description
document               Required. Raw file content
filename               Optional original filename
title                  Document title
created                Document creation date
correspondent          Correspondent ID
document_type          Document type ID
storage_path           Storage path ID
tags                   list[int]
archive_serial_number  ASN
custom_fields          DocumentCustomFieldList

Uploading with custom fields

You can attach custom fields in two ways:

As a list of field IDs (Paperless assigns null as the value of each field):

draft = paperless.documents.draft(
    document=content,
    custom_fields=[3, 8],
)

As a DocumentCustomFieldList (with explicit values):

from pypaperless.models.documents import DocumentCustomFieldList
from pypaperless.models.custom_fields import CustomFieldValue

cf_list = DocumentCustomFieldList(paperless, data=[])
cf_list += CustomFieldValue(field=3, value="ACME Corp")
cf_list += CustomFieldValue(field=8, value=42)

draft = paperless.documents.draft(document=content, custom_fields=cf_list)

See Custom fields for the full custom field API.


Monitoring upload tasks

After uploading a document, use paperless.tasks to check the status:

import asyncio

task_id = await paperless.documents.save(draft)

# Poll every 2 seconds, for at most 30 attempts (about one minute)
for _ in range(30):
    await asyncio.sleep(2)
    task = await paperless.tasks(task_id)
    if task.status in ("SUCCESS", "FAILURE"):
        break

print(task.status, task.result)
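
The polling loop can be wrapped in a reusable coroutine that fails loudly when the task never finishes. This is a sketch under assumptions: wait_for_task is a hypothetical helper, not part of pypaperless, and it treats only the "SUCCESS" and "FAILURE" statuses from the example above as final.

```python
import asyncio

FINISHED = ("SUCCESS", "FAILURE")


async def wait_for_task(fetch_task, task_id, attempts=30, interval=2.0):
    """Poll fetch_task(task_id) until the task reaches a final status.

    fetch_task is any callable returning an awaitable task object with a
    `status` attribute. Raises TimeoutError if no final status is seen.
    """
    for _ in range(attempts):
        task = await fetch_task(task_id)
        if task.status in FINISHED:
            return task
        # Wait before asking again so the server is not flooded.
        await asyncio.sleep(interval)
    raise TimeoutError(f"task {task_id} not finished after {attempts} attempts")
```

Since paperless.tasks(task_id) is awaited directly in the loop above, passing paperless.tasks as fetch_task should work, for example task = await wait_for_task(paperless.tasks, task_id).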

Sending documents by e-mail

You can send one or more documents as attachments to one or more e-mail addresses:

await paperless.documents.email(
    [23, 42],
    addresses="alice@example.com, bob@example.com",
    subject="Your requested documents",
    message="Please find the documents attached.",
)

A single document can also be passed as an integer:

await paperless.documents.email(
    42,
    addresses="alice@example.com",
    subject="Invoice",
    message="See attachment.",
    use_archive_version=False,  # send original instead of archived version
)

Parameter            Default  Description
documents            -        Document ID(s) to send
addresses            -        Comma-separated recipient e-mail addresses
subject              -        E-mail subject
message              -        E-mail body text
use_archive_version  True     Send archived version; False for original

Raises SendEmailError if the Paperless server rejects the request.
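
Since addresses is a single comma-separated string, recipients kept in a list need to be joined first. A minimal sketch follows; join_addresses is a hypothetical helper that trims whitespace and drops empty or duplicate entries, and it does not validate the addresses.

```python
def join_addresses(addresses):
    """Join e-mail addresses into one comma-separated string."""
    cleaned = []
    for address in addresses:
        address = address.strip()
        # Drop blanks and repeated recipients, keeping the first occurrence.
        if address and address not in cleaned:
            cleaned.append(address)
    return ", ".join(cleaned)


recipients = join_addresses(
    ["alice@example.com", " bob@example.com ", "alice@example.com"]
)
print(recipients)  # alice@example.com, bob@example.com
```

The result can be passed as addresses=recipients to paperless.documents.email().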