Documents

paperless.documents exposes the full Paperless-ngx document API on top of the standard CRUD operations available on all resources.


Fetching a document

document = await paperless.documents(42)

print(document.id)
print(document.title)
print(document.correspondent)
print(document.document_type)
print(document.tags)
print(document.created)
print(document.content)
print(document.page_count)
print(document.mime_type)
print(document.archive_serial_number)

Downloading file contents

Every document can be fetched in three modes: download (archived), preview and thumbnail. All return a DownloadedDocument instance.

download  = await paperless.documents.download(42)
preview   = await paperless.documents.preview(42)
thumbnail = await paperless.documents.thumbnail(42)

DownloadedDocument gives you the raw bytes plus everything from the response headers you'd need to save or serve the file:

# save to disk using the filename suggested by the API
with open(download.disposition_filename, "wb") as f:
    f.write(download.content)

print(download.content_type)       # e.g. "application/pdf"
print(download.disposition_type)   # "attachment" or "inline"
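
When the target directory is fixed, it can help to reduce the suggested filename to its last path component before writing. The following is a minimal sketch, not part of the library: save_download and its fallback parameter are hypothetical names, and it assumes disposition_filename may be empty or contain directory parts. It only relies on the content and disposition_filename attributes shown above.

```python
from pathlib import Path


def save_download(download, directory, fallback="document.bin"):
    """Write download.content into `directory` and return the resulting path."""
    # Assumption: the header-derived filename may be missing; use the fallback then.
    name = getattr(download, "disposition_filename", None) or fallback
    # Keep only the last path component so the file stays inside `directory`.
    safe_name = Path(name).name
    if safe_name in ("", ".", ".."):
        safe_name = fallback
    target = Path(directory) / safe_name
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(download.content)
    return target
```

With the download from above, this could be called as save_download(download, "exports").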

Requesting the original file

By default, the archived (processed) version is returned. Pass original=True to get the original uploaded file:

download = await paperless.documents.download(42, original=True)

Searching documents

async for document in paperless.documents.search("type:invoice"):
    print(document.title, document.search_hit.score)

You can also pass the query as a keyword argument:

async for document in paperless.documents.search(query="annual report"):
    ...

Search hits

When a document is returned from a search, it carries a DocumentSearchHit. Use has_search_hit to branch on it, or the walrus operator to check and bind in one step:

if document.has_search_hit:
    print(f"{document.title} matched the query")

if hit := document.search_hit:
    print(hit.score)
    print(hit.highlights)
    print(hit.note_highlights)
    print(hit.rank)

search_hit is None for documents fetched directly (e.g. paperless.documents(42)).
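
To keep only the best matches of a search, the hits can be collected and ranked on the client. This is a sketch under assumptions: top_hits is a hypothetical helper, and it assumes score is numeric with higher values meaning better matches. It accepts any async iterable of documents, such as the one returned by paperless.documents.search(...).

```python
import heapq


async def top_hits(results, limit=5):
    """Return up to `limit` (score, title) pairs, best score first."""
    scored = []
    async for document in results:
        hit = document.search_hit
        if hit is None:
            # Not a search result (e.g. fetched directly); nothing to rank.
            continue
        scored.append((hit.score, document.title))
    # nlargest returns the pairs sorted from highest to lowest score.
    return heapq.nlargest(limit, scored, key=lambda pair: pair[0])
```

Usage could look like best = await top_hits(paperless.documents.search("type:invoice"), limit=3).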

Custom field query

For building expressions in a type-safe way, see Custom field query.


Similar documents

Find documents similar to a given document:

async for document in paperless.documents.more_like(42):
    print(document.title)

Metadata

meta = await paperless.documents.metadata(42)

The returned DocumentMeta object includes embedded metadata from the file (e.g. EXIF or PDF metadata):

for entry in meta.original_metadata:
    print(entry.namespace, entry.key, entry.value)

Suggestions

Paperless-ngx can suggest classifiers (correspondent, document type, tags) for a document:

suggestions = await paperless.documents.suggestions(42)

print(suggestions.correspondents)
print(suggestions.document_types)
print(suggestions.tags)
print(suggestions.storage_paths)
print(suggestions.dates)
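
Suggestions are only proposals; nothing changes until the document is updated. The sketch below fills classifier fields that are still empty with the first suggestion of each kind. It is an illustration under assumptions: apply_first_suggestions is a hypothetical helper, the suggestion lists are assumed to hold IDs with the most relevant entry first, and the document is assumed to expose correspondent, document_type and storage_path as writable attributes.

```python
def apply_first_suggestions(document, suggestions):
    """Fill empty classifier fields from the first suggestion; return changed field names."""
    candidates_by_field = [
        ("correspondent", suggestions.correspondents),
        ("document_type", suggestions.document_types),
        ("storage_path", suggestions.storage_paths),
    ]
    changed = []
    for field, candidates in candidates_by_field:
        # Never overwrite a value that is already set on the document.
        if getattr(document, field, None) is None and candidates:
            setattr(document, field, candidates[0])
            changed.append(field)
    return changed
```

If the returned list is not empty, the document can then be persisted with await paperless.documents.update(document).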

Notes

Every document can have notes attached to it. When a document is fetched from the API, its notes are already embedded in the response, so calling doc.notes() returns them immediately from an in-memory cache without a second HTTP request.

doc = await paperless.documents(42)

notes = await doc.notes()  # served from cache, no HTTP request
for note in notes:
    print(note.id, note.note, note.created)

To force a fresh fetch from the API and refresh the cache, pass force_request=True:

notes = await doc.notes(force_request=True)

The standalone service always sends a request to the API:

notes = await paperless.documents.notes(42)

Adding a note

# Pass the document pk as the first positional argument
draft = paperless.documents.notes.create(42, note="This needs review")
note_id = await paperless.documents.notes.save(draft)

Or via a fetched document (the document pk is bound automatically):

doc = await paperless.documents(42)
draft = doc.notes.create(note="This needs review")
note_id = await doc.notes.save(draft)

After save() the cache is updated automatically - the next doc.notes() call returns the latest state without an extra request.

Deleting a note

notes = await doc.notes()
await doc.notes.delete(notes[0])

After a successful delete the cache is updated in-place.


Share links

Every document can have share links attached to it. They are read-only through the document sub-service; to create or delete share links, use paperless.share_links.

# Fetch share links for a document
links = await paperless.documents.share_links(42)

# or via a fetched document
doc = await paperless.documents(42)
links = await doc.share_links()

for link in links:
    print(link.slug, link.expiration)

Next available ASN

Request the next free archive serial number from Paperless-ngx:

next_asn = await paperless.documents.get_next_asn()
print(f"Next ASN: {next_asn}")

Updating & deleting a document

Modify fields on a fetched document and persist them with update(), or remove the document with delete(). Both can be called on the service directly or via the client-level dispatcher:

doc = await paperless.documents(42)
doc.title = "Invoice 2024-01"
doc.correspondent = 3

# service
await paperless.documents.update(doc)
await paperless.documents.delete(doc)

# dispatcher — no need to reference the service explicitly
await paperless.update(doc)
await paperless.delete(doc)

See Resources — Updating items and Resources — Deleting items for full options (only_changed, silent_fail).


Uploading a document

Use create() to construct a document upload and save() to submit it. The document content must be provided as bytes. All fields except document are optional.

import datetime

with open("invoice.pdf", "rb") as f:
    content = f.read()

draft = paperless.documents.create(
    document=content,           # required - raw file bytes
    filename="invoice.pdf",     # original filename
    title="Invoice 2024-01",
    created=datetime.datetime(2024, 1, 15),
    correspondent=3,            # correspondent ID
    document_type=2,            # document type ID
    storage_path=1,             # storage path ID
    tags=[1, 5],                # tag IDs
    archive_serial_number=1042,
    custom_fields=[3, 8],       # custom field IDs (Paperless assigns null values)
)

task_id = await paperless.documents.save(draft)
print(f"Upload queued as task: {task_id}")

Note

Unlike other resources, save() for documents returns a task ID string, not an integer ID. The document is processed asynchronously by Paperless-ngx. Use paperless.tasks to monitor the task.

Uploading with custom field values

To set explicit values on custom fields at upload time, use DocumentCustomFieldList:

from pypaperless.models.documents import DocumentCustomFieldList
from pypaperless.models.custom_fields import CustomFieldValue

cf_list = DocumentCustomFieldList(paperless, data=[])
cf_list += CustomFieldValue(field=3, value="ACME Corp")
cf_list += CustomFieldValue(field=8, value=42)

draft = paperless.documents.create(document=content, custom_fields=cf_list)

See Custom fields for the full custom field API.


Monitoring upload tasks

After uploading a document, use paperless.tasks to check the status:

import asyncio

task_id = await paperless.documents.save(draft)

for _ in range(30):
    await asyncio.sleep(2)
    task = await paperless.tasks(task_id)
    if task.status in ("SUCCESS", "FAILURE"):
        break

print(task.status, task.result)
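
The polling loop can be wrapped in a reusable helper that fails loudly instead of silently giving up. This is a minimal sketch, not a library function: wait_for_task, interval and timeout are hypothetical names. It takes the fetch function as a parameter, so paperless.tasks can be passed in directly, and it treats "SUCCESS" and "FAILURE" as the final states, as in the loop above.

```python
import asyncio


async def wait_for_task(fetch_task, task_id, *, interval=2.0, timeout=60.0):
    """Poll `fetch_task(task_id)` until the task is final or `timeout` seconds pass."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout
    while True:
        task = await fetch_task(task_id)
        if task.status in ("SUCCESS", "FAILURE"):
            return task
        # Stop before sleeping if the next poll would land past the deadline.
        if loop.time() + interval > deadline:
            raise TimeoutError(f"task {task_id} still {task.status!r} after {timeout}s")
        await asyncio.sleep(interval)
```

Usage could look like task = await wait_for_task(paperless.tasks, task_id, timeout=120). Raising on timeout makes a stuck upload distinguishable from a finished one.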

Checking if a document is deleted

The is_deleted property returns True when the document is currently in the trash:

doc = await paperless.documents(42)
print(doc.is_deleted)  # False for active documents

# Documents returned from paperless.trash also have this set
async for doc in paperless.trash:
    print(doc.id, doc.is_deleted, doc.deleted_at)

Sending documents by e-mail

You can send one or more documents as attachments to one or more e-mail addresses:

await paperless.documents.email(
    [23, 42],
    addresses="alice@example.com, bob@example.com",
    subject="Your requested documents",
    message="Please find the documents attached.",
)

A single document can also be passed as an integer:

await paperless.documents.email(
    42,
    addresses="alice@example.com",
    subject="Invoice",
    message="See attachment.",
    use_archive_version=False,  # send original instead of archived version
)

| Parameter | Default | Description |
|---|---|---|
| documents | - | Document ID(s) to send |
| addresses | - | Comma-separated recipient e-mail addresses |
| subject | - | E-mail subject |
| message | - | E-mail body text |
| use_archive_version | True | Send archived version; False for original |

Raises SendEmailError if the Paperless server rejects the request.
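
Since addresses is a single comma-separated string, recipients kept in a list need to be joined first. A small sketch of that step follows; join_addresses is a hypothetical helper, not part of the library, and it does not validate address syntax.

```python
def join_addresses(addresses):
    """Build the comma-separated `addresses` string from an iterable of recipients."""
    cleaned = []
    for address in addresses:
        address = address.strip()
        # Skip blanks and duplicates while keeping the original order.
        if address and address not in cleaned:
            cleaned.append(address)
    if not cleaned:
        raise ValueError("at least one recipient address is required")
    return ", ".join(cleaned)
```

The result can be passed as addresses=join_addresses(recipients) in the calls above.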


Audit history

Every change to a document is recorded as an audit-log entry. Use document.history() or the service directly to retrieve the full history of a document.

# Via a fetched document (document pk is bound automatically)
doc = await paperless.documents(42)
entries = await doc.history()

for entry in entries:
    print(entry.timestamp, entry.action, entry.actor.username if entry.actor else "-")
    print(entry.changes)   # dict of changed fields

# Via the service, passing the document pk explicitly
entries = await paperless.documents.history(42)
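
The entries can be summarised on the client, for example to see who changed a document and how often. The sketch below uses only the action and actor.username attributes shown above; count_actions_by_user is a hypothetical helper name.

```python
from collections import Counter


def count_actions_by_user(entries):
    """Count audit-log entries per (username, action) pair."""
    counts = Counter()
    for entry in entries:
        # Entries without an actor are grouped under "-".
        username = entry.actor.username if entry.actor else "-"
        counts[(username, entry.action)] += 1
    return counts
```

For instance, count_actions_by_user(entries).most_common(3) lists the three most frequent combinations.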

Bulk editing

paperless.documents.bulk_edit lets you apply operations to many documents at once in a single API call.

Metadata

await paperless.documents.bulk_edit.set_correspondent([1, 2, 3], 5)
await paperless.documents.bulk_edit.set_document_type([1, 2], 3)
await paperless.documents.bulk_edit.set_storage_path([1, 2], 4)

# clear correspondent
await paperless.documents.bulk_edit.set_correspondent([1, 2, 3], None)

Tags

await paperless.documents.bulk_edit.add_tag([1, 2, 3], 7)
await paperless.documents.bulk_edit.remove_tag([1, 2, 3], 7)

# Add and remove in one call
await paperless.documents.bulk_edit.modify_tags(
    [1, 2, 3],
    add_tags=[5, 6],
    remove_tags=[2],
)

Custom fields

await paperless.documents.bulk_edit.modify_custom_fields(
    [1, 2],
    add_custom_fields={3: "open"},   # {pk: value} or list of PKs
    remove_custom_fields=[4],
)

Permissions

from pypaperless.models.types import Permissions

await paperless.documents.bulk_edit.set_permissions(
    [1, 2, 3],
    owner=1,
    permissions=Permissions(view_users=[2, 3], change_users=[1]),
    merge=False,  # True merges with existing instead of replacing
)

Document operations

# Move to trash
await paperless.documents.bulk_edit.delete([10, 11, 12])

# Re-run OCR
await paperless.documents.bulk_edit.reprocess([1, 2, 3])

# Rotate pages
await paperless.documents.bulk_edit.rotate([1, 2], 90)

# Merge into a new single document
await paperless.documents.bulk_edit.merge(
    [10, 11, 12],
    metadata_document_id=10,   # whose metadata to use for the result
    delete_originals=True,     # move source documents to trash after merging
)

All bulk edit operations raise BulkEditError (a ResponseError subclass) when the API returns a non-OK result.
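
For very long ID lists, it can be convenient to split the work into several smaller calls, so that one failing batch does not affect the others. Whether this is necessary depends on the server and is an assumption here. The sketch is not part of the library: bulk_in_batches and batch_size are hypothetical names, and the helper works with any bulk_edit method that takes the list of document IDs as its first argument.

```python
async def bulk_in_batches(operation, document_ids, *args, batch_size=100, **kwargs):
    """Await `operation(batch, *args, **kwargs)` for consecutive slices of the IDs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ids = list(document_ids)
    for start in range(0, len(ids), batch_size):
        # Each slice becomes its own API call; an error stops the remaining batches.
        await operation(ids[start:start + batch_size], *args, **kwargs)
```

Usage could look like await bulk_in_batches(paperless.documents.bulk_edit.add_tag, all_ids, 7, batch_size=50), where all_ids is a list of document IDs.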