
Full-Text Ingestion

ingest_paper_fulltext accepts any of:

  • doi
  • paper_url
  • pdf_url
  • local_pdf_path

You must pass at least one.
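The "at least one" rule can be checked on the client side before the tool is called. The sketch below is illustrative only: `build_ingest_args` is a hypothetical helper, not part of ScholarMCP, and the DOI in the usage line is a placeholder.

```python
from typing import Optional

# Input fields listed above for ingest_paper_fulltext.
INPUT_FIELDS = ("doi", "paper_url", "pdf_url", "local_pdf_path")


def build_ingest_args(**fields: Optional[str]) -> dict:
    """Hypothetical helper: keep only the inputs that were supplied and
    reject a call that supplies none of them."""
    unknown = set(fields) - set(INPUT_FIELDS)
    if unknown:
        raise ValueError(f"unknown input field(s): {sorted(unknown)}")
    # Drop fields that are None or empty strings.
    args = {name: value for name, value in fields.items() if value}
    if not args:
        raise ValueError("pass at least one of: " + ", ".join(INPUT_FIELDS))
    return args


# Placeholder DOI, for illustration only.
args = build_ingest_args(doi="10.1000/example", pdf_url=None)
```

The resulting dictionary can then be sent as the tool arguments by whatever MCP client you use.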

The parser mode can be one of:

  • auto (default): tries each parser in the fallback chain in order
  • grobid: forces the GROBID parser
  • simple: forces the basic fallback parser

In auto mode, ScholarMCP now attempts grobid first, then falls back to simple.
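The mode selection and fallback order can be pictured roughly as follows. This is a minimal sketch of the general pattern, not ScholarMCP's actual implementation: `parse_with_mode` and the parser callables passed to it are hypothetical stand-ins, and the assumption that any exception triggers the fallback is ours.

```python
from typing import Callable, Dict, List, Tuple

# A parser takes raw PDF bytes and returns extracted text (assumed shape).
Parser = Callable[[bytes], str]

VALID_MODES = ("auto", "grobid", "simple")


def parse_with_mode(pdf: bytes, mode: str, parsers: Dict[str, Parser]) -> Tuple[str, str]:
    """Return (parser_name, text) from the first parser that succeeds."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    # auto walks the chain grobid -> simple; a forced mode tries one parser only.
    chain: List[str] = ["grobid", "simple"] if mode == "auto" else [mode]
    errors: List[str] = []
    for name in chain:
        try:
            return name, parsers[name](pdf)
        except Exception as exc:  # assumption: any failure moves to the next parser
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all parsers failed: " + "; ".join(errors))
```

With a forced mode there is no fallback, so a failure surfaces immediately instead of being masked by a lower-quality parse.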

Common failure cases and fixes:

  • DOI page has no downloadable PDF URL:
    • retry with pdf_url or local_pdf_path
  • Remote downloads disabled:
    • set RESEARCH_ALLOW_REMOTE_PDFS=true
  • Local ingestion disabled:
    • set RESEARCH_ALLOW_LOCAL_PDFS=true
  • Throttling or timeout pressure:
    • increase SCHOLAR_REQUEST_DELAY_MS and/or RESEARCH_TIMEOUT_MS
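The environment variables above can be assembled in one place before the server process is started. The helper below is a hypothetical sketch: only the variable names and the value `true` come from this page, while the function name, its parameters, and the numeric values in the usage line are illustrative assumptions.

```python
import os
from typing import Mapping, Optional


def ingestion_env(
    base: Optional[Mapping[str, str]] = None,
    *,
    allow_remote_pdfs: bool = False,
    allow_local_pdfs: bool = False,
    request_delay_ms: Optional[int] = None,
    timeout_ms: Optional[int] = None,
) -> dict:
    """Hypothetical helper: copy an environment and apply the ingestion settings."""
    env = dict(os.environ if base is None else base)
    # Flags are only ever set to "true"; when not requested they are left untouched.
    if allow_remote_pdfs:
        env["RESEARCH_ALLOW_REMOTE_PDFS"] = "true"
    if allow_local_pdfs:
        env["RESEARCH_ALLOW_LOCAL_PDFS"] = "true"
    # Environment values must be strings, so the numbers are converted.
    if request_delay_ms is not None:
        env["SCHOLAR_REQUEST_DELAY_MS"] = str(request_delay_ms)
    if timeout_ms is not None:
        env["RESEARCH_TIMEOUT_MS"] = str(timeout_ms)
    return env


# Illustrative values only; pick numbers that suit your network conditions.
env = ingestion_env({}, allow_remote_pdfs=True, timeout_ms=60000)
```

The returned dictionary can be passed as the `env` argument of `subprocess.Popen` or `subprocess.run` when launching the server.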