Details
- Type: Improvement
- Status: Closed
- Priority: Blocker
- Resolution: Fixed
Description
1. Solr's current approach (the problem)
Below is a diagram describing how a request is currently handled.
The main thread that handles a search request submits n requests (n = the number of shards) to an executor, so each request occupies one thread. After sending its request, that thread does nothing but wait for the response from the other side. The thread gets swapped out and the CPU handles another thread (this is a context switch: the CPU saves the context of the current thread and switches to another one). When some data (not all of it) comes back, the thread is woken up to parse it, then it waits again for more data. So there is a lot of context switching in the CPU, which is an inefficient way to use threads. Basically we want fewer threads, and we want most of them to be busy all the time, because threads are not free and neither are context switches. That is the main idea behind everything that follows.
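For concreteness, a minimal sketch of that pattern is below; solrClient, shardUrls and req are placeholders, not the actual HttpShardHandler code, and exception handling is elided.

ExecutorService executor = Executors.newFixedThreadPool(32);
List<Future<NamedList<Object>>> futures = new ArrayList<>();
for (String shardUrl : shardUrls) {
    // One task per shard; each thread spends most of its life blocked on I/O.
    futures.add(executor.submit(() -> solrClient.request(req, shardUrl)));
}
for (Future<NamedList<Object>> f : futures) {
    NamedList<Object> shardRsp = f.get(); // the merging thread blocks as well
    // merge shardRsp into the overall response ...
}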
2. Async calls with Jetty HttpClient
Jetty HttpClient offers an async API like this:
httpClient.newRequest("http://domain.com/path")
        // Add request hooks
        .onRequestQueued(request -> { ... })
        .onRequestBegin(request -> { ... })
        // Add response hooks
        .onResponseBegin(response -> { ... })
        .onResponseHeaders(response -> { ... })
        .onResponseContent((response, buffer) -> { ... })
        .send(result -> { ... });
Therefore, after calling send() the thread returns immediately without blocking. When the client receives the headers from the other side, it calls the onHeaders() listeners. When it receives some bytes (not the whole response), it calls the onContent(buffer) listeners. When everything has finished, it calls the onComplete listeners. One thing that must be noticed here: all listeners should finish quickly. If a listener blocks, no further data for that request will be handled until the listener finishes.
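So if the content needs any heavy processing, the listener should hand the bytes off to another thread and return immediately. A sketch, assuming a worker thread drains an unbounded queue (the queue is not part of the Jetty API):

BlockingQueue<ByteBuffer> queue = new LinkedBlockingQueue<>();
httpClient.newRequest("http://domain.com/path")
        .onResponseContent((response, buffer) -> {
            // Copy the buffer (Jetty may reuse it) and hand it off without blocking.
            ByteBuffer copy = ByteBuffer.allocate(buffer.remaining());
            copy.put(buffer).flip();
            queue.offer(copy);
        })
        .send(result -> { /* signal completion so the worker can stop */ });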
3. Solution 1: Send requests async, but spin up one thread per response
Jetty HttpClient already provides several listener implementations; one of them is InputStreamResponseListener. This is how it is used:
InputStreamResponseListener listener = new InputStreamResponseListener();
client.newRequest(...).send(listener);

// Wait for the response headers to arrive
Response response = listener.get(5, TimeUnit.SECONDS);
if (response.getStatus() == 200) {
    // Obtain the input stream on the response content
    try (InputStream input = listener.getInputStream()) {
        // Read the response content
    }
}
In this case, there will be 2 threads:
- one thread reading the response content from the InputStream
- one thread (a short-lived task) feeding content into the above InputStream whenever some bytes are available. Note that if this thread is unable to feed data into the InputStream, it will block.
Using this listener, the model of HttpShardHandler can be rewritten into something like this:
handler.sendReq(req, is -> {
    executor.submit(() -> {
        try (is) {
            // Read the content from the InputStream
        }
    });
});
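A slightly fuller, hypothetical sketch of what sendReq could do internally; it occupies a pooled thread only once the response headers have arrived. parser, onDone and onError are assumed names, not existing Solr code:

InputStreamResponseListener listener = new InputStreamResponseListener() {
    @Override
    public void onHeaders(Response response) {
        super.onHeaders(response);
        // The response has started to arrive; only now occupy a pooled thread.
        executor.submit(() -> {
            try (InputStream is = getInputStream()) {
                onDone.accept(parser.parse(is)); // blocking reads happen here
            } catch (Exception e) {
                onError.accept(e);
            }
        });
    }
};
client.newRequest(shardUrl).send(listener);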
The first diagram then becomes this:
Notice that although “sending req to shard1” looks wide in the diagram, it does not take long, since sending a request is a very quick operation. With this approach, handling threads are not spun up until the first bytes come back. Notice also that we still have active threads waiting for more data from the InputStream.
4. Solution 2: Buffer the data and handle it inside Jetty's threads
Jetty has another listener called BufferingResponseListener. This is how it is used:
client.newRequest(...).send(new BufferingResponseListener() {
    @Override
    public void onComplete(Result result) {
        byte[] response = getContent();
        // handle the response
    }
});
On receiving data, Jetty (one of its threads) calls the listener with the received data (just a byte[] representing part of the response). The listener buffers those bytes in an internal buffer. When all the data has been received, Jetty calls the listener's onComplete(), and inside that method we can access the entire response.
Using this listener, the model of HttpShardHandler can be rewritten into something like this:
handler.sendReq(req, bytes -> {
    // handle the response bytes here
});
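A hypothetical implementation of that callback on top of BufferingResponseListener (parser, onDone and onError are again assumed names):

client.newRequest(shardUrl).send(new BufferingResponseListener() {
    @Override
    public void onComplete(Result result) {
        // Runs on a Jetty thread once the whole body has been buffered.
        if (result.isFailed()) {
            onError.accept(result.getFailure());
            return;
        }
        try (InputStream in = getContentAsInputStream()) {
            onDone.accept(parser.parse(in));
        } catch (Exception e) {
            onError.accept(e);
        }
    }
});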
The first diagram then becomes this:
Pros:
- We don't need an additional thread per request → fewer threads
- No threads actively waiting for data from an InputStream → threads stay busy
Cons:
- The data must be fully buffered before it can be parsed → roughly double the memory is used to parse a response.
5. Solution 3: Why not both?
Solution 1 is good for parsing very large, or sometimes unbounded (like in StreamingExpression), responses.
Solution 2 is good for parsing small responses (maybe < 10KB), since the overhead is small.
Should we combine both solutions? After all, what HttpSolrClient returns for every request is a NamedList<>, so as long as we can return a NamedList<>, it does not matter to users whether Solution 1 or Solution 2 produced it.
Therefore the idea here is to branch on the “Content-Length” response header: if the response body is smaller than a certain size we go with Solution 2, otherwise with Solution 1.
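A sketch of that dispatch, made once the response headers are available; the 10KB threshold is just the figure mentioned above, not a measured value:

long len = response.getHeaders().getLongField("Content-Length"); // -1 if absent
if (len >= 0 && len < 10 * 1024) {
    // Small body: Solution 2, buffer everything and parse it on Jetty's thread.
} else {
    // Large or unknown length: Solution 1, stream via InputStreamResponseListener.
}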
Note: Solr does not seem to return Content-Length accurately; this needs more investigation.
6. Further improvement
The best approach to this problem: instead of converting an InputStream into a NamedList, why not parse the bytes as they arrive and make the parser resumable? Like this:
Parser parser = new Parser();

public void onContent(ByteBuffer buffer) {
    parser.parse(buffer);
}

public void onComplete() {
    NamedList<Object> result = parser.getResult();
}
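To make “resumable” concrete, here is a toy parser for newline-delimited text. It keeps partial state (a half-read line) between parse() calls and never blocks; Solr's real ResponseParsers would need the same shape, which is the expensive part:

class LineParser {
    private final StringBuilder partial = new StringBuilder();
    private final List<String> lines = new ArrayList<>();

    // Consume whatever bytes have arrived; may end mid-line.
    void parse(ByteBuffer buffer) {
        while (buffer.hasRemaining()) {
            char c = (char) buffer.get(); // assumes a single-byte charset for the toy
            if (c == '\n') {
                lines.add(partial.toString());
                partial.setLength(0);
            } else {
                partial.append(c);
            }
        }
    }

    List<String> getResult() {
        return lines;
    }
}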
This way there are no blocking operations inside the parser, making for a very efficient model. But doing this requires a ton of changes in Solr: all the ResponseParsers would have to be rewritten, not to mention the flow described here. It is not clear that it is worth doing.
Attachments
Issue Links
- is related to SOLR-17211 HttpJdkSolrClient: Support Async (Closed)
- relates to SOLR-14763 SolrJ Client Async HTTP/2 Requests (Closed)