Transformers are permutation-invariant, so they need to be told token positions in a sequence. RoPE (Rotary Position Embedding) injects positional information by rotating query and key vectors by angles that depend on their position.
For the token at position $i$ and query/key dimension pair $2k-1{:}2k$, rotate by the angle:

$$\theta_{i,k} = i \cdot b^{-2(k-1)/d}$$

where $i$ is the token position, $b$ is a constant (like 10000), and $d$ is the dimension of the query/key vectors.
This is the rotation matrix for each position $i$ and dimension pair $k$:

$$R_{\theta_{i,k}} = \begin{pmatrix} \cos\theta_{i,k} & -\sin\theta_{i,k} \\ \sin\theta_{i,k} & \cos\theta_{i,k} \end{pmatrix}$$
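A minimal NumPy sketch of this rotation, assuming consecutive (even, odd) entries of the vector form each dimension pair; the function name `rope_rotate` is illustrative, not from a library:

```python
import numpy as np

def rope_rotate(x, pos, base=10000.0):
    """Rotate each 2D dimension pair of vector x by pos * base^(-2k/d).

    x:    1-D query or key vector of even dimension d
    pos:  integer token position
    base: the constant b (commonly 10000)
    """
    d = x.shape[-1]
    assert d % 2 == 0, "RoPE pairs dimensions, so d must be even"
    k = np.arange(d // 2)                      # pair index (0-based)
    theta = pos * base ** (-2.0 * k / d)       # rotation angle per pair
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[0::2], x[1::2]                  # the two halves of each pair
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin            # apply the 2x2 rotation matrix
    out[1::2] = x1 * sin + x2 * cos
    return out
```

Because rotations compose by adding angles, the dot product between a rotated query and key depends only on the relative offset between their positions, which is the property that makes RoPE useful for attention.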
9/15/25