Verification
- A probabilistic algorithm: https://github.com/imapp-pl/golem_rd/tree/master/lottery-distributed-verification
Currently, subtasks are verified by a method called after the subtask results are sent back to the requester. This may change in the future, when we add a different verification method in which each subtask is computed by a few computing nodes and their results are compared.
Tasks are verified after the last subtask results are received (after the task's finished_computation method returns True). For rendering tasks, no full-task verification methods are currently implemented.
Computations can be deterministic or non-deterministic. Apart from that, we can divide them into the following groups:
- Easily verifiable PoW-like tasks.
- Results can be verified by running a computation that is time-consuming, but an order of magnitude less difficult than the original.
- Results can be verified, but only by repeating the full computation.
- Results cannot be verified at all.
Tasks of the last type are not typical Golem tasks and should be run only on nodes with the highest reputation and signed trusted certificates.
Tasks in the third group can be handled with redundant verification.
Tasks in the second group can be handled by running the verification method on the user's machine. This can also be partially combined with the redundant approach used for the third group. For example, subtasks sent to different nodes may partially overlap, and the overlapping parts are then compared.
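The overlap idea can be sketched as follows. This is only an illustration, not Golem's implementation: the names `overlap` and `overlap_matches`, the representation of a result as a list of rows, the half-open row ranges and the `tolerance` parameter are all assumptions made for the example.

```python
def overlap(range_a, range_b):
    """Return the shared half-open (start, end) row range, or None if disjoint."""
    start = max(range_a[0], range_b[0])
    end = min(range_a[1], range_b[1])
    return (start, end) if start < end else None


def overlap_matches(result_a, range_a, result_b, range_b, tolerance=0.0):
    """Compare two subtask results on the rows that both of them cover.

    Hypothetical data layout: each result is a list of rows (lists of floats),
    and each range is the half-open (start, end) row interval it covers.
    """
    shared = overlap(range_a, range_b)
    if shared is None:
        raise ValueError("subtasks do not overlap, nothing to compare")
    for row in range(shared[0], shared[1]):
        row_a = result_a[row - range_a[0]]
        row_b = result_b[row - range_b[0]]
        if len(row_a) != len(row_b):
            return False
        if any(abs(x - y) > tolerance for x, y in zip(row_a, row_b)):
            return False
    return True


# Two nodes compute rows 0..5 and rows 4..9, so rows 4 and 5 are shared.
node_a = [[float(r)] * 4 for r in range(0, 6)]
node_b = [[float(r)] * 4 for r in range(4, 10)]
agree = overlap_matches(node_a, (0, 6), node_b, (4, 10))
```

For non-deterministic computations an exact comparison would be too strict, which is why the sketch takes a tolerance instead of requiring equality.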
It is difficult to verify images from Luxrender because they are non-deterministic. To validate the results, one has to calculate a metric against a reference image. No matter which metric we use, there will always be some mismatch because of the non-determinism. To overcome this issue, the requester's machine renders two sample images and calculates the metric between them. Since the result depends on image resolution and quality (sampling rate), we have to formulate a self-adjusting validation mechanism:
if Metric(sample1, providers_results) < relaxation_coeff * Metric(sample1, sample2)
    validation success
else
    validation failure
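A minimal runnable sketch of the rule above, under stated assumptions: mean squared error stands in for `Metric` (the text does not fix a metric), images are flat lists of pixel values, and the default `relaxation_coeff` of 2.0 is an arbitrary illustrative value. The function names are hypothetical.

```python
def mse(image_a, image_b):
    """Mean squared error between two equally sized flat pixel lists."""
    if not image_a or len(image_a) != len(image_b):
        raise ValueError("images must be non-empty and of equal size")
    return sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)


def validate(sample1, sample2, provider_result, relaxation_coeff=2.0, metric=mse):
    """Accept the provider's result if it is not much further from sample1
    than a second local render (sample2) is.

    The baseline metric(sample1, sample2) captures how much two honest
    renders differ at this resolution and sampling rate.
    """
    baseline = metric(sample1, sample2)
    return metric(sample1, provider_result) < relaxation_coeff * baseline


# Two local sample renders of the same region, differing only by noise.
sample1 = [0.50, 0.52, 0.48, 0.51]
sample2 = [0.51, 0.50, 0.49, 0.52]
accepted = validate(sample1, sample2, [0.49, 0.53, 0.49, 0.50])
rejected = validate(sample1, sample2, [0.90, 0.10, 0.90, 0.10])
```

Note that with a strict inequality, two identical samples give a baseline of zero and every provider result is rejected, so the rule as written presupposes a non-deterministic renderer.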
Users should be able to choose among different verification methods, depending on how they weigh the probability of a correct computation against time and price. For rendering tasks there should be at least:
- Simple verification - check only the format, size and name of the final picture and the format of some log files. Partially implemented (logs needed).
- Advanced singular verification - choose a random smaller part of a subtask and render this part on the requester's machine. Compare the received results with PSNR or a similar metric. The user may choose whether this operation should be done for every subtask, for the first subtask received from a specific node, or for random subtasks (with a specific probability). Partially implemented (boundary conditions for different renderers and renderer settings must be tested).
- Advanced verification - send the same subtask to two or more computing nodes and compare the received results with PSNR or a similar metric. Not implemented.
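The building blocks shared by the two advanced methods (a PSNR comparison, picking a random smaller region, and verifying with a given probability) can be sketched as below. The function names, the flat-list image layout, the 8-bit `max_value` and the 30 dB threshold are assumptions made for illustration; they are not taken from Golem's code.

```python
import math
import random


def psnr(image_a, image_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means more similar."""
    if not image_a or len(image_a) != len(image_b):
        raise ValueError("images must be non-empty and of equal size")
    mse = sum((a - b) ** 2 for a, b in zip(image_a, image_b)) / len(image_a)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def results_agree(image_a, image_b, min_psnr_db=30.0):
    """Treat two renders as matching if PSNR reaches an assumed threshold."""
    return psnr(image_a, image_b) >= min_psnr_db


def pick_region(width, height, region_w, region_h, rng=random):
    """Pick a random (left, top, right, bottom) box inside the subtask image."""
    left = rng.randint(0, width - region_w)
    top = rng.randint(0, height - region_h)
    return left, top, left + region_w, top + region_h


def should_verify(probability, rng=random):
    """Decide whether this subtask is verified, with the given probability."""
    return rng.random() < probability


received = [100, 100, 100, 100]
rerendered = [101, 99, 100, 100]
ok = results_agree(received, rerendered)
```

A real threshold would have to be tuned per renderer and per settings, which is the boundary-condition testing mentioned in the list above.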
There is quite a large class of tasks in which the computation consists of many small steps run sequentially. In fact, we can model every program as a sequential task that executes code instructions one after another.
We are now focusing on a subclass of sequential tasks in which the program state can be dumped and later restored to continue the computation. This is possible, for example, for machine learning algorithms, because the state is usually represented explicitly as a collection of matrices and seeds for rand() functions. It is quite hard for a web server, where the state is represented implicitly by incoming connections, data stored in databases and so on.
Ideas for verifying this type of task are described here.
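To make the dump-and-restore property concrete, here is a toy sketch of a sequential task whose whole state (a weight vector, a step counter and the random generator state) is explicit. The class `SequentialTask` and its methods are hypothetical and only illustrate the property; they do not represent the verification ideas referred to above.

```python
import pickle
import random


class SequentialTask:
    """Toy sequential computation with fully explicit state."""

    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.weights = [0.0, 0.0, 0.0]
        self.step_count = 0

    def step(self):
        """One small step: perturb every weight with seeded randomness."""
        self.weights = [w + self.rng.uniform(-1.0, 1.0) for w in self.weights]
        self.step_count += 1

    def dump(self):
        """Serialize the complete state, including the generator state."""
        return pickle.dumps((self.rng.getstate(), self.weights, self.step_count))

    @classmethod
    def restore(cls, blob):
        """Rebuild a task from a dump so it continues exactly where it left off."""
        rng_state, weights, step_count = pickle.loads(blob)
        task = cls(seed=0)  # the seed is irrelevant, the state is overwritten
        task.rng.setstate(rng_state)
        task.weights = list(weights)
        task.step_count = step_count
        return task


original = SequentialTask(seed=42)
for _ in range(5):
    original.step()
checkpoint = original.dump()  # state after step 5
for _ in range(5):
    original.step()

replay = SequentialTask.restore(checkpoint)
for _ in range(5):
    replay.step()
same = replay.weights == original.weights
```

Because the generator state is part of the dump, replaying steps 6 to 10 from the checkpoint reproduces the original run exactly. `pickle` is used only for brevity: unpickling data from an untrusted party is unsafe, so a state exchanged between nodes would need a safer serialization format.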