Details
- Type: Improvement
- Status: Closed
- Priority: Major
- Resolution: Fixed
Description
Retrieving, sorting and writing out documents in ExportWriter are three aspects of the /export handler that can be further optimized.
SOLR-14470 introduced some level of caching in StringValue. Further options for caching and speedups should be explored.
Currently the sort/retrieve and write operations run sequentially, but they could be parallelized because they block on different resources: sorting and retrieving are bound by index reads and CPU, while writing is bound by the receiving end because it uses blocking IO. Sorting and retrieving the values for the next batch could therefore run in parallel with writing out the current batch of results.
One possible approach is "double buffering": one buffered batch that is ready (already sorted and retrieved) is written out while the other batch is prepared in a background thread, and when both are done the buffers are swapped. This shouldn't complicate the current code much, and it should give up to 2x higher throughput, with the largest gain when preparing and writing a batch take about the same time.
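To make the double-buffering idea concrete, here is a minimal sketch in plain Java. It is not Solr's ExportWriter code: the class `DoubleBufferSketch`, the method `export`, and the `prepare` and `write` callbacks are hypothetical names introduced only for illustration. The sketch assumes that batches can be prepared independently by index and that writing happens on the calling thread.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;
import java.util.function.IntFunction;

// Hypothetical illustration of double buffering; not part of Solr.
final class DoubleBufferSketch {

  /**
   * Writes batches 0..numBatches-1 in order. While batch i is being written
   * on the calling thread, batch i+1 is prepared on a background thread.
   *
   * @param prepare stands in for "sort and retrieve" (index/CPU bound)
   * @param write   stands in for "write out" (bound by blocking IO)
   */
  static <T> void export(int numBatches,
                         IntFunction<List<T>> prepare,
                         Consumer<List<T>> write) {
    if (numBatches <= 0) {
      return;
    }
    ExecutorService background = Executors.newSingleThreadExecutor();
    try {
      // Start preparing the first batch before entering the loop.
      Callable<List<T>> first = () -> prepare.apply(0);
      Future<List<T>> next = background.submit(first);
      for (int i = 0; i < numBatches; i++) {
        // Wait until the batch being prepared is ready; it becomes the "front" buffer.
        List<T> ready = next.get();
        if (i + 1 < numBatches) {
          // Kick off preparation of the following batch (the "back" buffer).
          final int n = i + 1;
          Callable<List<T>> task = () -> prepare.apply(n);
          next = background.submit(task);
        }
        // Write the ready batch while the next one is prepared concurrently.
        write.accept(ready);
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } catch (ExecutionException e) {
      throw new RuntimeException(e.getCause());
    } finally {
      background.shutdownNow();
    }
  }
}
```

In this sketch the "swap" is implicit: the `Future` returned for batch i+1 becomes the ready buffer on the next loop iteration. At most two batches are in memory at a time, and output order is preserved because all writes happen sequentially on the calling thread.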