
Do not run tests synchronously in wikilambda_perform_test
Closed, Declined · Public

Description

Currently, viewing any function page results in a front-end wikilambda_perform_test request, which runs every combination of implementations (connected or not) against every test (connected or not). I summarized the issues several months ago:

Some issues that have already occurred:

  • Rate limit - we have a rate limit on function invocation, and just viewing some function pages (which run all tests, connected and unconnected, for all implementations, connected and unconnected) may hit this rate limit.
  • When viewing a page, tests are invoked via synchronous web requests, without any job system or deduplication. First, expensive synchronous web requests on page view are already an antipattern. Second, the user cannot immediately see the result of a test, not even a cached one. Third, since the requests are synchronous, users with unstable internet connections may not see the correct result. Fourth, running tests on every page view of functions, without any backend job system or deduplication, is prone to DoS or DDoS.

Proposed:

  • Introduce a cache system to persistently cache the result of a test, keyed by test ID, test revision ID, implementation ID and implementation revision ID.
  • Introduce a job system so that requests to run tests can be deduplicated. Running the jobs should not itself be rate-limited per user (i.e. if we want to prevent the job system from being misused, we should limit the sending of jobs rather than their execution, and the limit should be large enough that it cannot be hit by viewing a single function page). Currently the wikilambda_perform_test request is affected by the rate limit.
  • wikilambda_perform_test should be changed to only send jobs to the job system and to read the (previous, not current) result from the cache.
  • When viewing a function page, there should be an indication that test results may not be instantaneous and may be outdated. Potentially we can provide a button to refresh the test results (which, of course, should not run tests synchronously either).
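The first two bullets could be sketched as follows. This is a minimal in-process illustration only: TestKey, TestJobQueue and request_test are made-up names, not WikiLambda APIs; in production the cache would be a database table and the queue would be the MediaWiki job queue.

```python
# Hypothetical sketch of the proposed cache key + job deduplication.
# None of these names exist in WikiLambda; they only illustrate the idea.
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set


@dataclass(frozen=True)
class TestKey:
    """The proposed cache key: test + implementation, each pinned to a revision."""
    test_id: str
    test_rev: int
    impl_id: str
    impl_rev: int


class TestJobQueue:
    def __init__(self) -> None:
        self.pending: Set[TestKey] = set()          # stands in for a job queue
        self.result_cache: Dict[TestKey, str] = {}  # stands in for a DB table

    def request_test(self, key: TestKey) -> Optional[str]:
        """Return the cached result if any; otherwise enqueue at most one job."""
        if key in self.result_cache:
            return self.result_cache[key]
        self.pending.add(key)  # set membership deduplicates repeated requests
        return None

    def run_pending(self, runner: Callable[[TestKey], str]) -> None:
        """Executed by the backend worker, never by a page view."""
        for key in list(self.pending):
            self.result_cache[key] = runner(key)
        self.pending.clear()
```

Because the key includes both revision IDs, editing a test or an implementation naturally produces a new key, so stale results for the old revision simply stop being looked up rather than needing explicit invalidation.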

Event Timeline

@Bugreporter Following the bullets above…

  • It is my understanding that, in general, the test results shown when viewing a function page are cached results. In fact, we have a problem that timeouts and other intermittent failures are persisted and we need to edit something to get rid of the cached results.
  • I think we should consider an asynchronous approach under T369953. The test results will most often be affected by an edit to an implementation or test case, or by connections and disconnections. Re-testing the whole function after an edit to the function object’s keys is almost always going to be unnecessary; the odd exception could be picked up asynchronously. The case where a test fails for an intermittent reason is more interesting. It may be that an implementation fails to be listed as the first (preferred) as a result of such a failure, so it is worth repeating the test once or twice at a later time. It should not be necessary to repeat all the tests for a particular implementation in this case, but I think it might be sensible. If the implementation then compares favourably with the currently preferred implementation, that too should be re-tested. If its new results differ materially from its previous results, then it would make sense to re-test the function as a whole.
  • If an implementation or test is changed, its re-test could be synchronous. In fact, this should probably be required before Publish (with Publish still being possible after failures). I believe this is technically a separate evaluation, however (all tests are re-run after Publish).
  • Yes, I think some suitably vague indication of the age of cached results would be helpful, particularly if they are not all roughly the same age. It would also be worth considering highlighting the newer results somehow, in the case where the whole function has not been re-tested (I’m not sure we would have a straightforward definition of “newer”, but it is something like: since the latest contribution to the function, or any of its implementations or tests).

For clarification, what I mean by "caching" is different from the current cache of evaluation results. Currently, if you view a function page that has not been viewed before, it runs all test combinations synchronously and caches each evaluation result; when the page is viewed again, the cached evaluation results are used. What I would prefer instead is a persistent cache of test results.

For "get rid of the cached results", see also T343615: Provide a user-facing mechanism to evict a cached result of a function call so it is recalculated. Though doing multiple evaluations synchronously in a web request is still a bad idea.

For clarification […]

Thank you for the clarification. I don’t understand the specifics of how evaluation results are cached but I would be wary of storing an evaluation result for a given combination of implementation and test using the identifier of the test, rather than the arguments to the function it tests. That having been said, I’m happy to leave it to those with greater expertise to consider!
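The concern above, keying the cache on the actual function call rather than on the test's identifier, could look something like this. This is illustrative only: call_cache_key is a made-up helper, and real ZObject canonicalization is considerably more involved than plain JSON.

```python
# Hypothetical: derive a cache key from the function call itself
# (function ZID + argument values) rather than from the test's identifier,
# so two tests exercising the same call can share one cached evaluation.
import hashlib
import json


def call_cache_key(function_zid: str, args: list) -> str:
    # Canonical JSON (sorted keys, fixed separators) keeps the key stable
    # regardless of key ordering or whitespace in the serialized call.
    payload = json.dumps(
        {"function": function_zid, "args": args},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Under this scheme, renaming or re-numbering a test would not orphan its cached evaluation, since the key depends only on what is actually executed.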

For clarification, what I mean by "caching" is different from the current cache of evaluation results. Currently, if you view a function page that has not been viewed before, it runs all test combinations synchronously and caches each evaluation result; when the page is viewed again, the cached evaluation results are used. What I would prefer instead is a persistent cache of test results.

I do not understand. We have a persistent cache of test results, in the database. It is only cleared when the result is no longer valid, because you have edited the page. It does not wait for the page to be viewed.

If you mean "don't run tests if they're not current, until a user manually requests it", I worry that that would make Functions even harder to understand for non-technical users.

Introduce a job system so that requests to run tests can be deduplicated. Running the jobs should not itself be rate-limited per user (i.e. if we want to prevent the job system from being misused, we should limit the sending of jobs rather than their execution, and the limit should be large enough that it cannot be hit by viewing a single function page). Currently the wikilambda_perform_test request is affected by the rate limit.

Moving the requests from an async API to an async Job won't improve the de-duplication, because in practice we very rarely have different users triggering the API, but we could do. Not over-loading the back-end is a feature, not a bug.

The issue is that the API request will time out. E.g. when viewing https://www.wikifunctions.org/wiki/Z10786, there is a synchronous API request to https://www.wikifunctions.org/w/api.php?action=wikilambda_perform_test&format=json&wikilambda_perform_test_zfunction=Z10786&wikilambda_perform_test_zimplementations=Z10789%7CZ10807&wikilambda_perform_test_ztesters=Z10787%7CZ10805%7CZ10808%7CZ10860%7CZ10861%7CZ14225%7CZ14266%7CZ14267%7CZ14268%7CZ14269%7CZ14270%7CZ14271%7CZ14272%7CZ14273%7CZ14274&uselang=en, which cannot be completed within one minute and is thus killed by the server. If the requests were run asynchronously with a persistent cache, you would at least see a result immediately the next time.

I am also concerned that attackers could use wikilambda_perform_test to DoS the system.

If you mean "don't run tests if they're not current, until a user manually requests it", I worry that that would make Functions even harder to understand for non-technical users.

What I propose: when a user views a function page without cached results, (1) the page shows which tests do not yet have a result and provides a button to refresh, or shows each result once it is available; and (2) a request is sent to run the tests asynchronously (via a job system). Optionally, we could use some push technology (such as polling or WebSockets) so that results are shown on the function page as they become available, but such requests must only query results, never invoke the tests themselves.

I.e. tests must be invoked by the backend job system, so they are not affected by any timeout settings of the app servers.
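The results-only polling described above can be sketched like this. This is a hedged illustration: fetch_results stands in for a hypothetical read-only results API, and by construction it must never trigger test execution itself.

```python
# Hypothetical client-side polling loop: repeatedly query a results-only
# endpoint until every requested test has a result, or give up.
import time
from typing import Callable, Dict, List, Optional


def poll_results(
    fetch_results: Callable[[str], Optional[str]],  # read-only lookup, never executes tests
    keys: List[str],
    interval: float = 2.0,
    max_polls: int = 30,
) -> Dict[str, str]:
    """Return whichever results became available within max_polls rounds."""
    results: Dict[str, str] = {}
    for _ in range(max_polls):
        for key in keys:
            if key not in results:
                value = fetch_results(key)
                if value is not None:
                    results[key] = value
        if len(results) == len(keys):
            break
        time.sleep(interval)
    return results
```

In a real page this would more likely be a front-end loop (or a WebSocket push replacing the loop entirely), but the invariant is the same: the client only reads; all execution happens in backend jobs.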

We don't think that the major product changes this idea proposes are based in evidence of what user change we'd want to see, as written.

If you have specific outcomes you think we should focus on, such as "I want Function pages to load for me faster" or "I want the site to cache the test results of changed Implementations before publishing so they're immediately visible", please re-open this task with those.