
neurolinker-sdk-python

A Python SDK for the NeuroLinker API from Ainexxo S.R.L. The SDK provides sync and async clients to submit documents, track extraction jobs, and retrieve processed results.

You can find more info about the repo here.

Download

pip install neurolinker-sdk

Usage

Preferably, set credentials in environment variables (the API key is required).

NEUROLINKER_API_KEY (required): generate it on the official NeuroLinker website (NeuroLinker.com). Log in and go to the API KEY section.

NEUROLINKER_BASE_URL (optional): when set, it becomes the default API endpoint for the SDK. If not set, the SDK defaults to https://neurolinker.api.ainexxo.com.

export NEUROLINKER_API_KEY="your_token"
# Optional (override default API endpoint)
export NEUROLINKER_BASE_URL="https://neurolinker.api.ainexxo.com"

Quick start

  • sync
from neurolinker_sdk import NeuroLinker

with NeuroLinker(token="nl_****") as client:
    tasks = client.tasks.list()
  • with .env (sync):
from neurolinker_sdk import NeuroLinker

with NeuroLinker.from_env() as client:
    tasks = client.tasks.list()
  • async
from neurolinker_sdk import AsyncNeuroLinker

async with AsyncNeuroLinker(token="nl_****") as client:
    tasks = await client.tasks.list()
  • with .env (async):
from neurolinker_sdk import AsyncNeuroLinker

async with AsyncNeuroLinker.from_env() as client:
    tasks = await client.tasks.list()
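
In a regular script, async with is only valid inside a coroutine, so the async snippets above need an entry point such as asyncio.run. The following is a minimal sketch that assumes only what the quick start shows (a client usable as an async context manager that exposes tasks.list()). The helper name list_tasks and its make_client parameter are hypothetical and not part of the SDK.

```python
import asyncio


async def list_tasks(make_client):
    """Open a client, list the tasks, and close the client again.

    make_client is a hypothetical parameter: any zero-argument callable that
    returns an async context manager, for example AsyncNeuroLinker.from_env.
    """
    async with make_client() as client:
        return await client.tasks.list()


# With the SDK installed and NEUROLINKER_API_KEY set, a script could run:
#   from neurolinker_sdk import AsyncNeuroLinker
#   tasks = asyncio.run(list_tasks(AsyncNeuroLinker.from_env))
```

Passing the factory instead of a ready client keeps the helper independent of how the client is configured (explicit token or environment).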

For more usage examples, refer to the tests inside the repository at SDK tests.

SDK functionality (minimal usage + parameters)

These are the ways to define a client before it is used.

Client constructors

NeuroLinker(
    token,
    base_url=None,
    timeout_s=600.0,
    poll_interval_s=2.0,
    poll_max_interval_s=10.0,
    http_client=None,
)

Minimal sync client constructor. Only token is required; the other parameters are optional. If base_url is not provided, the SDK uses NEUROLINKER_BASE_URL when set, otherwise it defaults to https://neurolinker.api.ainexxo.com.

AsyncNeuroLinker(
    token,
    base_url=None,
    timeout_s=600.0,
    poll_interval_s=2.0,
    poll_max_interval_s=10.0,
    http_client=None,
)

Minimal async client constructor. Only token is required; the other parameters are optional. If base_url is not provided, the SDK uses NEUROLINKER_BASE_URL when set, otherwise it defaults to https://neurolinker.api.ainexxo.com.
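
The base URL precedence described for both constructors (explicit argument, then NEUROLINKER_BASE_URL, then the built-in default) can be sketched as below. This is an illustration of the documented behaviour, not the SDK's own code: the function name resolve_base_url and its environ parameter are hypothetical, and treating an empty string as "not set" is an assumption.

```python
import os

DEFAULT_BASE_URL = "https://neurolinker.api.ainexxo.com"


def resolve_base_url(base_url=None, environ=None):
    """Illustrative only: mirrors the documented precedence, not SDK internals."""
    env = os.environ if environ is None else environ
    if base_url:  # 1. explicit constructor argument wins
        return base_url
    env_url = env.get("NEUROLINKER_BASE_URL")
    if env_url:  # 2. environment variable, when set (empty counts as unset here)
        return env_url
    return DEFAULT_BASE_URL  # 3. documented default endpoint
```

Calling resolve_base_url() with no arguments reads the real process environment; passing a dict as environ makes the precedence easy to check in isolation.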

Alternatively, if you define a .env file, you can create the client with from_env and still override these parameters:

NeuroLinker.from_env(timeout_s=None, poll_interval_s=None, poll_max_interval_s=None)

Each of these parameters falls back to its default value when it is not set (None); pass a value to override it.

AsyncNeuroLinker.from_env(timeout_s=None, poll_interval_s=None, poll_max_interval_s=None)

Async version of from_env.
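
The "None means use the default" convention in the from_env signature can be sketched as follows. The helper effective_settings is hypothetical, and it assumes that from_env falls back to the same defaults as the constructor signatures (600.0, 2.0, 10.0), which the source does not state explicitly.

```python
# Assumed defaults, copied from the constructor signatures shown earlier.
DEFAULTS = {"timeout_s": 600.0, "poll_interval_s": 2.0, "poll_max_interval_s": 10.0}


def effective_settings(timeout_s=None, poll_interval_s=None, poll_max_interval_s=None):
    """Return the value in effect for each setting: the override, else the default."""
    requested = {
        "timeout_s": timeout_s,
        "poll_interval_s": poll_interval_s,
        "poll_max_interval_s": poll_max_interval_s,
    }
    return {
        name: DEFAULTS[name] if value is None else value
        for name, value in requested.items()
    }
```

For example, effective_settings(timeout_s=120.0) changes only the timeout and leaves both polling intervals at their assumed defaults.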

Available methods

The following methods are available. Async equivalents exist for every resource; they take the same parameters and are called with await.

Note: To simplify the workflow, the SDK offers polling helpers, since many actions succeed only once processing of the document has completed.
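
To make the polling parameters concrete, here is a generic sketch of a polling loop driven by timeout_s, poll_interval_s and poll_max_interval_s. It is not the SDK's implementation: the function poll_until_terminal, the check_status callback, the doubling backoff, and the default set of terminal statuses are all assumptions made for illustration, and the transient 404 handling mentioned for the built-in helper is left out.

```python
import time


def poll_until_terminal(
    check_status,
    timeout_s=600.0,
    poll_interval_s=2.0,
    poll_max_interval_s=10.0,
    terminal=("completed", "failed"),  # assumed terminal statuses
    sleep=time.sleep,
    clock=time.monotonic,
):
    """Call check_status() until it returns a terminal status or time runs out."""
    deadline = clock() + timeout_s
    interval = poll_interval_s
    while True:
        status = check_status()
        if status in terminal:
            return status
        # Give up if the next wait would run past the deadline.
        if clock() + interval > deadline:
            raise TimeoutError(f"status {status!r} not terminal within {timeout_s}s")
        sleep(interval)
        # Assumed backoff: double the wait, capped at poll_max_interval_s.
        interval = min(interval * 2, poll_max_interval_s)
```

The sleep and clock parameters exist only so the loop can be exercised without real waiting; in normal use they can be left at their defaults.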

| Method | Description |
| --- | --- |
| client.tasks.list() | List the processing tasks available in the system. |
| client.extract.extract(documents=[("file.pdf", b"...")], urls=None, alias=None, description=None) | Upload PDFs from bytes. documents and urls are mutually exclusive. |
| client.extract.extract(documents=None, urls=["https://..."], alias="optional", description="optional") | Submit a URL-based extraction job. |
| client.status.request(request_id) | Check the status of an extraction request by request ID. |
| client.status.document(document_id) | Check the status of a single document by document ID. |
| client.extract_request_uid(extract_response) | Extract request_uid from the extract response (supports both top-level and nested data payloads). |
| client.extract_document_ids(status_response) | Extract document IDs from a request-status response. |
| client.wait_for_request_completion(request_uid, timeout_s=None, poll_interval_s=None, poll_max_interval_s=None) | Built-in polling helper that waits for a terminal status (completed, failed, pending), handling transient 404 responses during early processing. |
| client.documents.markdown(document_ids, content_types=None) | Retrieve markdown results for document IDs. content_types can be a list of ContentType values or strings. |
| client.documents.json(document_ids, content_types=None) | Retrieve JSON results for document IDs, with optional content type filtering. |
| client.documents.images(document_ids) | Retrieve extracted image metadata for document IDs. |
| client.documents.page_summaries(document_ids) | Retrieve per-page summaries. |
| client.documents.section_summaries(document_ids) | Retrieve summaries grouped by detected sections. |
| client.documents.document_summary(document_ids, summary_type="page" \| "section") | Retrieve a single consolidated summary. summary_type is required and supports page or section. |
| from neurolinker_sdk.resources.documents import ContentType | Use ContentType.TEXT, ContentType.FORMULA, ContentType.TABLES, ContentType.IMAGES to filter the content returned by the markdown/json endpoints. |
| client.zip.make_zip(job_uid="...", document_uid=None, local_images=False, content_types=None) | Request a ZIP archive for a completed extraction job (entire job or a single document). job_uid identifies the extraction job as a whole; document_uid, when set, selects a specific document to download. With local_images=True, JSON/Markdown references are rewritten to local relative image paths. content_types is optional (example: ["text"] or other ContentType values) and filters the JSON/Markdown content included in the ZIP. |
| NeuroLinkerAPIError, NeuroLinkerConfigError | Exceptions raised for non-2xx API responses or missing/invalid configuration. |
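
The methods listed above can be chained into a submit, wait, fetch workflow. The sketch below uses only method names from the list, but the way they are combined is an assumption: it presumes that status.request accepts the ID returned by extract_request_uid, and it does not inspect whether the terminal status was a failure. The function name extract_to_markdown is hypothetical.

```python
def extract_to_markdown(client, filename, pdf_bytes):
    """Submit one PDF, wait for processing, and return its markdown result.

    client is expected to be a sync NeuroLinker client (or any object exposing
    the same methods).
    """
    # Upload the PDF from bytes; documents and urls are mutually exclusive.
    response = client.extract.extract(documents=[(filename, pdf_bytes)])
    request_uid = client.extract_request_uid(response)
    # Block until the request reaches a terminal status.
    client.wait_for_request_completion(request_uid)
    # Assumption: status.request accepts the same ID returned above.
    status = client.status.request(request_uid)
    document_ids = client.extract_document_ids(status)
    return client.documents.markdown(document_ids)
```

With the SDK installed, this would be called inside a with NeuroLinker.from_env() as client: block, passing the file name and the bytes read from disk.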