DOCUMENT
Status: PRODUCTION
A DOCUMENT link type represents common document formats, including:
- PDF files (.pdf)
- Microsoft Word documents (.doc, .docx)
- Rich Text Format files (.rtf)
- OpenDocument Text files (.odt)
- Microsoft Excel spreadsheets (.xls, .xlsx)
- Microsoft PowerPoint presentations (.ppt, .pptx)
- XML documents (.xml)
Beyond the basic metadata (title, description, image), DOCUMENT links provide detailed information such as page count, author, and optional raw text access. Where available, you can also retrieve responsive images for visual previews and inspect individual pages of the document.
What It Includes
When type is DOCUMENT, the response includes a document object with fields like:
- title: Title of the document.
- type: The document format (e.g., pdf, docx, xlsx, etc.).
- description: A brief summary if available or inferred.
- estimatedReadingTime: Approximate reading time in minutes based on the document’s length.
- rawTextUrl: A URL to fetch the document’s raw text.
- image: A responsive image object providing different resolutions of a representative page image (e.g., cover page).
- pages: An array of page-level metadata (if available).
- pageCount: Total number of pages in the document.
- author: The author or creator’s name.
- isEncrypted: Indicates whether the document is encrypted or password-protected.
- lastModified: Timestamp of the document’s last modification.
- language: The primary language of the document’s content.
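The field list above can be mirrored in a small typed model on the client side. The following is a minimal sketch, not an official client: the class name DocumentInfo, the helper parse_document, and the Python types are assumptions, and every field is treated as optional because this section does not say which ones are guaranteed to be present.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional


@dataclass
class DocumentInfo:
    """Hypothetical client-side mirror of the `document` object.

    Field types are assumptions; the source only names the fields.
    """
    title: Optional[str] = None
    type: Optional[str] = None            # e.g. "pdf", "docx", "xlsx"
    description: Optional[str] = None
    estimated_reading_time: Optional[float] = None  # minutes
    raw_text_url: Optional[str] = None
    image: Optional[Dict[str, Any]] = None          # responsive image object
    pages: List[Dict[str, Any]] = field(default_factory=list)
    page_count: Optional[int] = None
    author: Optional[str] = None
    is_encrypted: Optional[bool] = None
    last_modified: Optional[str] = None
    language: Optional[str] = None


def parse_document(response: Dict[str, Any]) -> Optional[DocumentInfo]:
    """Return a DocumentInfo for DOCUMENT links, or None for other link types."""
    if response.get("type") != "DOCUMENT":
        return None
    doc = response.get("document") or {}
    return DocumentInfo(
        title=doc.get("title"),
        type=doc.get("type"),
        description=doc.get("description"),
        estimated_reading_time=doc.get("estimatedReadingTime"),
        raw_text_url=doc.get("rawTextUrl"),
        image=doc.get("image"),
        pages=doc.get("pages") or [],
        page_count=doc.get("pageCount"),
        author=doc.get("author"),
        is_encrypted=doc.get("isEncrypted"),
        last_modified=doc.get("lastModified"),
        language=doc.get("language"),
    )
```

Reading every key with dict.get keeps the parser tolerant of documents whose metadata is sparse, which matters for fields described as available only in some cases.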
Example Request
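As a rough illustration, a request for a document link might be built as follows. This is a sketch under stated assumptions: the endpoint URL, the bearer-token Authorization header, and the link body field are not given in this section and should be checked against the API reference; build_preview_request and fetch_preview are hypothetical helper names.

```python
import json
import urllib.request

# Assumptions: the endpoint, the auth scheme, and the "link" body field are
# not stated in this section; verify them against the API reference.
API_URL = "https://api.peekalink.io/"


def build_preview_request(link: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request asking for a link preview."""
    body = json.dumps({"link": link}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def fetch_preview(link: str, api_key: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_preview_request(link, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Separating request construction from sending makes the request easy to inspect or log before any network call is made.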
Example Response
Special Notes
- Fallback Strategies: As with PAGE links, if the document’s metadata is limited, Peekalink uses AI-driven techniques to infer a missing title or description.
- Estimated Reading Time: Calculated based on the extracted text’s length.
- Page-Level Images & Data: Each page may have its own responsive image set, letting you show previews of individual pages.
- Format & Encryption: The type field helps identify the file format (e.g., pdf), and isEncrypted warns if the document is protected.
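Two of the notes above lend themselves to a short client-side sketch. The reading-time function shows one common way to derive minutes from text length; the 200 words-per-minute rate is an assumption, and the formula Peekalink actually uses is not documented here. The second helper skips raw text retrieval when isEncrypted is set, on the assumption that protected documents should not be read without further handling. Both function names are hypothetical.

```python
import math
from typing import Any, Dict, Optional

# Assumption: a typical adult reading speed; not taken from the source.
WORDS_PER_MINUTE = 200


def estimate_reading_minutes(text: str, wpm: int = WORDS_PER_MINUTE) -> int:
    """Estimate reading time in whole minutes from the word count, rounding up."""
    words = len(text.split())
    return math.ceil(words / wpm) if words else 0


def raw_text_url_if_readable(document: Dict[str, Any]) -> Optional[str]:
    """Return rawTextUrl unless the document is flagged as encrypted."""
    if document.get("isEncrypted"):
        return None
    return document.get("rawTextUrl")
```

In practice the estimatedReadingTime value returned by the API should be preferred; a local estimate like this is only useful as a fallback when working from raw text.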