🚗

Auto Chunking, Slides, Bar Graph Detection, and Latency Improvements

Version
Date
Usage Tracking
You can now track your page volume/usage information at: https://app.reducto.ai/profile
 
Automatic Chunking
If a chunk size is omitted in requests to our API, we will now automatically chunk documents based on layout. Chunk sizes will range between 2000 and 8000 characters. Chunk boundaries are based on layout elements within the document, such as titles and section headers, and we avoid splitting long lists where indentation is difficult. In our early RAG benchmarks, we’ve seen improvements across the board with this method. We’ll be able to share more detailed benchmarking results soon!
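To make the idea concrete, here is a minimal sketch of layout-aware chunking. It is not Reducto's implementation; it only illustrates how a 2000 to 8000 character range can be combined with heading-based boundaries. The `Block` type, its `kind` labels, and the `auto_chunk` function are all hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import List

MIN_CHARS = 2000  # lower bound quoted in the release note
MAX_CHARS = 8000  # upper bound quoted in the release note


@dataclass
class Block:
    # Hypothetical layout labels: "title", "section_header", "text", "list".
    kind: str
    text: str


def auto_chunk(blocks: List[Block],
               min_chars: int = MIN_CHARS,
               max_chars: int = MAX_CHARS) -> List[List[Block]]:
    """Greedily group layout blocks into chunks (illustrative only).

    Blocks are atomic, so a list block is never split across chunks.
    """
    chunks: List[List[Block]] = []
    current: List[Block] = []
    size = 0
    for block in blocks:
        at_heading = block.kind in ("title", "section_header")
        too_big = size + len(block.text) > max_chars
        # Close the chunk at a heading once it is large enough,
        # or when adding this block would exceed the upper bound.
        if current and ((at_heading and size >= min_chars) or too_big):
            chunks.append(current)
            current, size = [], 0
        current.append(block)
        size += len(block.text)
    if current:
        chunks.append(current)
    return chunks
```

In this sketch the final chunk can fall below the lower bound, and a single block longer than the upper bound becomes its own oversized chunk. A production chunker would need a policy for both cases.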
 
Slides Endpoint
We’ve released a slides endpoint, available here in beta:
This endpoint creates one chunk per slide, based on page number. By default, it also makes a secondary call to a vision LLM for every page to incorporate visual information into the document response (e.g. capturing arrows between objects on the slides in the response text). Vision enrichment, if enabled, will be billed as an extra page for each slide.
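As a quick worked example of the billing rule, the sketch below estimates billed pages for a deck. It assumes each slide counts as one page before enrichment, which the note implies but does not state outright; `billed_pages` is a hypothetical helper, not part of the API.

```python
def billed_pages(num_slides: int, vision_enrichment: bool = True) -> int:
    """Estimate billed pages for a slide deck (illustrative only).

    Assumption: each slide is billed as one page, and vision enrichment
    adds one extra page per slide. Enrichment defaults to on, matching
    the endpoint's default behavior described in the release note.
    """
    if num_slides < 0:
        raise ValueError("num_slides must be non-negative")
    extra = num_slides if vision_enrichment else 0
    return num_slides + extra
```

Under these assumptions, a 20-slide deck is billed as 40 pages with vision enrichment and 20 pages without it.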
 
Bar Graph Detection
If figure_summarization is enabled in your chunking request, we now support extracting charts and plots from the document in a tabular format.
 
Latency Improvements
We’ve been rolling out latency improvements.