Keep up to date with the latest changes to the Dotprod API.

Coming in next release

Implemented enhancements:

v0.7.0 (2024-06-11)

Implemented enhancements:

Experiment with visually describing files using a Visual Language Model
New /v1/metadata endpoint to easily extract relevant information (MIME type, creation date, number of pages)
Add a page parameter to the /v1/preview endpoint, for example to generate a preview of specific slides
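
As a rough illustration of the two endpoint changes above, the sketch below builds request URLs for /v1/metadata and /v1/preview. Only the endpoint paths and the parameter name page come from this changelog. The base URL, the helper endpoint_url, and the choice to send page as a query parameter are assumptions made for the example; check the API reference for the actual request format.

```python
from urllib.parse import urlencode, urljoin

# Hypothetical base URL; replace it with the real Dotprod API host.
BASE_URL = "https://api.example.com"


def endpoint_url(path, base_url=BASE_URL, **params):
    """Build a URL for an API path, dropping parameters set to None."""
    query = urlencode({k: v for k, v in params.items() if v is not None})
    url = urljoin(base_url, path)
    return f"{url}?{query}" if query else url


# Assumption: `page` is sent as a query parameter; the changelog only names it.
metadata_url = endpoint_url("/v1/metadata")
preview_url = endpoint_url("/v1/preview", page=3)
```

Leaving page unset produces the plain /v1/preview URL, which matches a request made before the parameter existed.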

Implemented enhancements:

Fixed bugs:

Implemented enhancements:

Improve the title detection
Support rotated files
Support Microsoft Word .doc files
Fail early with a 503 status code when the service is at capacity
Add a filter that detects scanned images at 99% recall, to increase the parsing speed
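
Because the API can now answer 503 when it is at capacity, clients may want to retry with a delay. The following is a minimal sketch of one way to do that, not part of the Dotprod API itself: call_with_retry and its send argument are hypothetical names, and the backoff values are arbitrary defaults.

```python
import time


def call_with_retry(send, max_attempts=4, base_delay=0.5, sleep=time.sleep):
    """Call send() until it returns a status other than 503.

    `send` is a hypothetical zero-argument callable that performs the
    request and returns a (status_code, body) tuple.
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != 503:
            return status, body
        if attempt < max_attempts - 1:
            # Exponential backoff: 0.5 s, 1 s, 2 s, ...
            sleep(base_delay * 2 ** attempt)
    # Still at capacity after the last attempt: hand the 503 back to the caller.
    return status, body
```

Passing the request as a callable keeps the sketch independent of any particular HTTP library, and the sleep argument can be swapped out in tests.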

Implemented enhancements:

Support reading PDFs with text inside figure constructs, typically older PDF files generated by EvoPDF
Track various processing times to measure our improvements
Experiment with processing PDF pages in parallel

Fixed bugs:

Low-resolution images in PDF files were upscaled to A4 size, which slowed down processing
Reading order was broken when text lines overlapped

Implemented enhancements:

First commit: