Consistent Hashing

soul_departed

AB Bank IUT 8th ICT Fest...

0/1/8

Statement Statistics Discussion

HTML

Limits 3s, 512 MB

In modern software development, hashing is an important concept. It involves generating unique integer key for objects (instance of struct/class) where different values for each key is distributed over some range and unique. It helps to store and lookup these objects for different kinds of future queries. In distributed systems things are bit harder where we need to transfer these objects (thus distributing data and workload among connected workstations) when new workstation can join or existing ones can leave due to network disconnection or workstation shutdown. Consistent hashing is a concept used in distributed systems, so that we can move objects within workstations that affects the system less. To apply consistent hashing algorithm, we assume workstations in a distributed system are connected as a ring network and objects with their keys are stored in between these workstations. Here every object are stored to the workstation which comes first in clockwise direction. For the sake of simplicity let’s assume that we have sequential keys starting from 0 to N-1 (for N keys) and are distributed in a circular fashion.

There are several variations of consistent hashing algorithm. In our case, we implement this algorithm as follows.

In above image, objects with keys 0 to 3 are stored to workstation S2. When a new workstation (let WS1) joins to the ring network, it chooses its expected position (let P) right before (from clockwise direction) the workstation (let WS2) which has maximum number of objects (in case of a tie, which workstation has minimum object key number under it), and divides the workload by sharing objects from WS2. (If there are no workstation in the system, the newly joined workstation will have all the object keys). The position P will be taken such that, if WS2 has X number of object keys, then WS1 will take responsibility for first (in clockwise direction) ceil(X/2) object keys from WS2. This expected position P can be shifted by some drift number D (in clockwise for positive or anticlockwise for negative).

For example, in our above image, there are 5 workstations (S1 to S5) with total 20 number of objects with keys 0 to 19 in clockwise direction. Each workstation has 4 objects. So, if any new workstation S6 wants to join, it will join before S2 (which has 4 maximum objects and minimum 0 key for its object). Now S6 will have object with keys (0 and 1) while S2 will only have object with keys (2 and 3) if the position drift is zero. If the position drift would’ve been 1, then S6 would’ve object with keys (0, 1 and 2) or only (0) if position drift would’ve been -1.

In our algorithm, there are three kind of queries. These are for joining new node, removing a node, and query with an object key to locate it’s position within workstations. At any moment. it might happen that there are no workstation in the system.
There are several sets of inputs. In each input set, the first line of the input is N and Q (number of object-keys and queries). Then Q line follows, each of the following type.

“A” followed by “some string S” followed by “an integer D” standing for, add workstation with name S with position drift D. If it can’t be added before choosen workstation (exceeds boundary in either direction) then output “NA” (for not added). If any workstation with this label already exist then output “AE” (for already exist).

“D” followed by “some string S” standing for, delete workstation with label S. If no workstation with this label exist then output DE (for doesn’t exist)

“Q” followed by “some integer INDEX” standing for, output workstation name which currently stores object with key INDEX or -1 if it’s under no workstation. Remind that initially all objects are under no workstation.

Input

First line of the input set is the number of test cases T (1 ≤ T ≤ 1011). Each input starts with N and Q. Then there are Q queries.

In the input sets, for at most 10 cases we have:

0 < N ≤ 50000
0 ≤ Q ≤ 50000

And for the remaining cases:

0 < N ≤ 500
0 ≤ Q ≤ 500

And for all cases:

-10 ≤ D ≤ 10
1 < Length of string S < 6
0 ≤ INDEX < N

Output

For each input case, output case number and corresponding output for queries (if any) in the given format.

Sample

Input	Output
1 11 22 Q 0 A S1 0 Q 0 Q 10 A S2 0 Q 0 Q 5 Q 6 Q 10 A S3 -2 A S3 -1 A S4 4 D S4 Q 0 Q 1 Q 5 Q 6 D S3 Q 0 Q 1 Q 5 Q 6	Case 1: -1 S1 S1 S2 S2 S1 S1 AE NA DE S3 S2 S2 S1 S2 S2 S2 S1

As stated, “if there are no workstation in the system, the newly joined workstation will have all the object keys”, so the newly joined workstation will have object keys from 0 to N-1 (inclusive) and you can assume the position P is in between 0 and N-1, unaffected by drift D.

Factors

	CPU	Memory	Source
Bash 5.2	1×	1×	1×
Brainf*ck	1×	1×	1×
C# Mono 6.0	1×	1×	1×
C++17 GCC 13.2	1×	1×	1×
C++20 Clang 16.0	1×	1×	1×
C++20 GCC 13.2	1×	1×	1×
C++23 GCC 13.2	1×	1×	1×
C11 GCC 13.2	1×	1×	1×
C17 GCC 13.2	1×	1×	1×
C23 GCC 13.2	1×	1×	1×
Common Lisp SBCL 2.0	1×	1×	1×
D8 11.8	1×	1×	1×
Erlang 22.3	1×	1×	1×
Free Pascal 3.0	1×	1×	1×
Go 1.22	1×	1×	1×
Grep 3.7	1×	1×	1×
Haskell 8.6	1×	1×	1×
Java 1.8	1×	1×	1×
Kotlin 1.1	1×	1×	1×
Lua 5.4	1×	1×	1×
Node.js 10.16	1×	1×	1×
Perl 5.30	1×	1×	1×
PHP 8.3	1×	1×	1×
PyPy 7.1 (3.6)	1×	1×	1×
Python 3.12	1×	1×	1×
Ruby 3.2	1×	1×	1×
Rust 1.57	1×	1×	1×
Swift 5.3	1×	1×	1×
Whitespace	1×	1×	1×

Submit

Statistics

0% Solution Ratio