Commit d7401a8

xuniq and lenkis authored
Add recommendations on non-idempotent operations to vshard docs (#5252)
* Add recommendations on non-idempotent operations
* Add the `request_timeout` parameter

Fixes #5242
Fixes #5286

Co-authored-by: Elena Shebunyaeva <elena.shebunyaeva@gmail.com>
1 parent 106258d commit d7401a8

2 files changed: +135 -0 lines changed

doc/platform/sharding/vshard_admin.rst

Lines changed: 98 additions & 0 deletions
@@ -551,6 +551,104 @@ In a router application, you can define the ``put`` function that specifies how
Learn more at :ref:`vshard-process-requests`.

.. _vshard-deduplication:

Deduplication of non-idempotent requests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

**Idempotent requests** produce the same result no matter how many times they are executed.
For example, a data read request and a multiplication by one are both idempotent.
In contrast, incrementing by one is a non-idempotent operation:
when it is applied a second time, the field value increases by 2 instead of just 1.
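
The difference can be illustrated with a small plain-Lua sketch that does not use Tarantool.
The ``apply_twice`` helper and both operations are hypothetical names introduced for illustration only;
the helper applies an operation to a value twice, as a retry would.

.. code-block:: lua

    -- Hypothetical helper: applies the same operation twice, as a retry would.
    local function apply_twice(operation, value)
        return operation(operation(value))
    end

    -- Idempotent: multiplying by one leaves the value unchanged.
    local function multiply_by_one(value)
        return value * 1
    end

    -- Non-idempotent: every application adds one more.
    local function increment(value)
        return value + 1
    end

    local after_multiply = apply_twice(multiply_by_one, 10)
    local after_increment = apply_twice(increment, 10)
    print(after_multiply)  -- the same as after a single application
    print(after_increment) -- 2 more than the original value
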

.. note::

    Any write requests that are intended to be executed repeatedly (for example, retried after an error) should be idempotent.
    Idempotency ensures that repeating the request has the same effect as applying the change **only once**.

A request may need to be run again if an error occurs on the server or client side.
In this case:

- Read requests can be executed repeatedly.
  For this purpose, :ref:`vshard.router.call() <router_api-call>` (with ``mode=read``) uses the ``request_timeout`` parameter
  (since ``vshard`` 0.1.28).
  It is necessary to pass the ``request_timeout`` and ``timeout`` parameters together, with the following requirement:

  .. code-block:: text

      timeout > request_timeout

  For example, if ``timeout = 10`` and ``request_timeout = 2``,
  the router can make up to 5 attempts (2 seconds each) within 10 seconds, sending the request to different replicas
  until the request finally succeeds.

- Write requests (:ref:`vshard.router.callrw() <router_api-callrw>`) generally **cannot be re-executed** without verifying
  that they have not been applied before.
  Lack of such a check might lead to duplicate records or unplanned data changes.

  For example, a client has sent a request to the server. The client is waiting for a response within a specified timeout.
  If the server sends a successful response after this time has elapsed,
  the client won't see this response due to the timeout and will consider the request failed.
  When this request is re-executed without an additional check, the operation may be applied twice.

  A write request can be executed repeatedly without a check in two cases:

  - The request is idempotent.

  - It's known for sure that the previous request raised an error before executing any write operations.
    For example, ER_READONLY was thrown by the server.
    In this case, we know that the request couldn't complete because the server is in read-only mode.
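
For the read request case in the list above, the relation between the two timeouts can be sketched as follows.
The ``read_options`` helper is a hypothetical name, not part of ``vshard``;
it only validates the requirement and estimates an upper bound on the number of attempts.
The commented-out call assumes a configured router, a ``bucket_id``, and a stored function named ``get_band``,
none of which come from the text above.

.. code-block:: lua

    -- Hypothetical helper, not part of vshard: builds the options table
    -- and estimates how many attempts fit into the total timeout.
    local function read_options(timeout, request_timeout)
        if timeout <= request_timeout then
            error('timeout must be greater than request_timeout')
        end
        local max_attempts = math.floor(timeout / request_timeout)
        return {timeout = timeout, request_timeout = request_timeout}, max_attempts
    end

    local opts, max_attempts = read_options(10, 2)

    -- On a router, the options would be passed to a read call, for example:
    -- local result, err = vshard.router.call(bucket_id, 'read', 'get_band', {key}, opts)

The estimate is only an upper bound: an attempt that fails faster than ``request_timeout`` leaves more time for further attempts.
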

**Deduplication examples**

To ensure that the write requests (INSERT, UPDATE, UPSERT, and autoincrement) are idempotent,
you should implement a check that the request is applied for the first time.

.. note::

    There is no built-in deduplication check in Tarantool.
    Currently, deduplication can only be implemented by the user in the application code.

For example, when you add a new tuple to a space, you can use a unique insert ID to check whether the request has already been applied.
In the example below, within a single transaction:

1. The code checks whether a tuple with the ``key`` ID exists in the ``bands`` space.
2. If there is no tuple with this ID in the space, the tuple is inserted.

.. code-block:: lua

    box.begin()
    -- Insert the tuple only if no tuple with this key exists yet
    if box.space.bands:get{key} == nil then
        box.space.bands:insert{key, value}
    end
    box.commit()

For update and upsert requests, you can create a *deduplication space* where the request IDs will be saved.
A *deduplication space* is a user space that contains a list of unique identifiers.
Each identifier corresponds to one applied request.
This space can have any name; in the example, it is called ``deduplication``.
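
A deduplication space could be created as in the sketch below.
The field name ``id``, its ``string`` type, and the index name ``primary`` are assumptions made for illustration;
the text above fixes only the space name.

.. code-block:: lua

    -- Assumed schema: a single string field holding the request ID.
    local space = box.schema.space.create('deduplication', {if_not_exists = true})
    space:format({
        {name = 'id', type = 'string'},
    })
    -- The primary index is unique, so each request ID can be stored only once.
    space:create_index('primary', {parts = {'id'}, if_not_exists = true})
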

In the example below, within a single transaction:

1. The code checks whether the ``deduplication_key`` request ID exists in the ``deduplication`` space.
2. If there is no such ID, the ID is added to the deduplication space.
3. If the request hasn't been applied before, the specified field in the ``bands`` space is incremented by one.

This approach ensures that each data modification request will be executed **only once**.

.. code-block:: lua

    function update_1(deduplication_key, key)
        box.begin()
        -- Apply the update only if this request ID has not been recorded yet
        if box.space.deduplication:get{deduplication_key} == nil then
            -- Record the request ID and apply the change in the same transaction
            box.space.deduplication:insert{deduplication_key}
            box.space.bands:update(key, {{'+', 'value', 1}})
        end
        box.commit()
    end
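
On the router side, the client would generate the request ID once and reuse it on every retry,
so that a repeated call is recognized by ``update_1``.
The sketch below assumes this usage; ``call_with_retries`` and its ``router`` argument are hypothetical names.
The router is passed in as an argument so that the sketch does not depend on a running cluster;
in a real application it would be ``vshard.router``.

.. code-block:: lua

    -- Hypothetical helper: retries a write request, reusing the same request ID.
    -- `router` is expected to provide callrw() with the vshard.router.callrw() signature.
    local function call_with_retries(router, bucket_id, request_id, key, max_attempts)
        local result, err
        for _ = 1, max_attempts do
            -- The same request_id is sent on every attempt, so update_1
            -- can skip the update if an earlier attempt was already applied.
            result, err = router.callrw(bucket_id, 'update_1', {request_id, key}, {timeout = 1})
            if err == nil then
                return true
            end
        end
        return nil, err
    end

In Tarantool, the request ID could be generated once per logical request, for example with ``require('uuid').str()``.
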

.. _vshard-maintenance:

Sharded cluster maintenance

doc/reference/reference_rock/vshard/vshard_router.rst

Lines changed: 37 additions & 0 deletions
@@ -132,6 +132,15 @@ Router public API
* ``timeout`` — a request timeout, in seconds. If the ``router`` cannot identify a
  shard with the specified ``bucket_id``, it will retry until the timeout is reached.

* ``request_timeout`` (since ``vshard`` 0.1.28) — a timeout, in seconds, that protects against hung replicas.
  The parameter is used in read requests only (``mode=read``).
  It is necessary to pass the ``request_timeout`` and ``timeout`` parameters together, with the following requirement:
  ``timeout > request_timeout``.

  The ``request_timeout`` parameter controls how much time a single request attempt may take.
  When this time is over (the ``TimedOut`` error is raised), the router retries this request on the next replica as long
  as the ``timeout`` value has not elapsed.

* other :ref:`net.box options <net_box-options>`, such as ``is_async``,
  ``buffer``, ``on_push`` are also supported.

@@ -163,6 +172,16 @@ Router public API
optional attribute containing a message with the human-readable error description,
and other attributes specific for the error code.

.. reference_vshard_note_start

.. note::

    Any write requests that are intended to be executed repeatedly (for example, retried after an error) should be idempotent.
    Idempotency ensures that repeating the request has the same effect as applying the change **only once**.
    Read more: :ref:`Deduplication of non-idempotent requests <vshard-deduplication>`.

.. reference_vshard_note_end

**Examples:**

To call ``customer_add`` function from ``vshard/example``, say:
@@ -199,6 +218,13 @@ Router public API
* ``timeout`` — a request timeout, in seconds. If the ``router`` cannot identify a
  shard with the specified ``bucket_id``, it will retry until the timeout is reached.

* ``request_timeout`` (since ``vshard`` 0.1.28) — a timeout, in seconds, that protects against hung replicas.
  It is necessary to pass the ``request_timeout`` and ``timeout`` parameters together, with the following requirement:
  ``timeout > request_timeout``.
  The ``request_timeout`` parameter controls how much time a single request attempt may take.
  When this time is over (the ``TimedOut`` error is raised), the router retries this request on the next replica as long
  as the ``timeout`` value has not elapsed.

* other :ref:`net.box options <net_box-options>`, such as ``is_async``,
  ``buffer``, ``on_push`` are also supported.

@@ -248,6 +274,10 @@ Router public API
optional attribute containing a message with the human-readable error description,
and other attributes specific for this error code.

.. include:: /reference/reference_rock/vshard/vshard_router.rst
    :start-after: reference_vshard_note_start
    :end-before: reference_vshard_note_end

.. _router_api-callre:

.. function:: vshard.router.callre(bucket_id, function_name, {argument_list}, {options})
@@ -267,6 +297,13 @@ Router public API
* ``timeout`` — a request timeout, in seconds. If the ``router`` cannot identify a
  shard with the specified ``bucket_id``, it will retry until the timeout is reached.

* ``request_timeout`` (since ``vshard`` 0.1.28) — a timeout, in seconds, that protects against hung replicas.
  It is necessary to pass the ``request_timeout`` and ``timeout`` parameters together, with the following requirement:
  ``timeout > request_timeout``.
  The ``request_timeout`` parameter controls how much time a single request attempt may take.
  When this time is over (the ``TimedOut`` error is raised), the router retries this request on the next replica as long
  as the ``timeout`` value has not elapsed.

* other :ref:`net.box options <net_box-options>`, such as ``is_async``,
  ``buffer``, ``on_push`` are also supported.
