Romulus Spec Round2
v1.2
Webpage: https://romulusae.github.io/romulus/
1. Introduction
This document specifies Romulus, an authenticated encryption with associated data (AEAD) scheme
based on a tweakable block cipher (TBC) Skinny. Romulus consists of two families, a nonce-based
AE (NAE) Romulus-N and a nonce misuse-resistant AE (MRAE) Romulus-M.
The notion of a TBC was introduced by Liskov et al. at CRYPTO 2002 [30]. Since their inception, TBCs
have been acknowledged as a powerful primitive in that they can be used to construct simple and
highly secure NAE/MRAE schemes, including ΘCB3 [28] and SCT [36]. While these schemes
are computationally efficient (in terms of the number of primitive calls) and have high security,
lightweight applications are not their primary use case, and they are not particularly
suitable for small devices. With this in mind, Romulus aims at lightweight, efficient, and highly
secure NAE and MRAE schemes based on a TBC.
The overall structure of Romulus-N shares similarity in part with a (TBC-based variant of) the
block cipher mode COFB [13, 14], yet we make numerous refinements to achieve our design goal.
Romulus-N generally requires fewer TBC calls than ΘCB3 thanks to the faster MAC
computation for associated data, while the hardware implementation is significantly smaller than
ΘCB3 thanks to the reduced state size and inverse-freeness (i.e., the TBC inverse is not needed). In
fact, Romulus-N’s state size is essentially what is needed for computing the TBC. Moreover, it encrypts
an n-bit plaintext block with just one call of the n-bit block TBC, hence there is no efficiency
loss. Romulus-N is extremely efficient for small messages, which is particularly important in many
lightweight applications, requiring for example only 2 TBC calls to handle one associated data
block and one message block (in comparison, other designs like ΘCB3, OCB3, TAE, CCM require
from 3 to 5 TBC calls in the same situation). Romulus-N achieves these advantages without a
security penalty, i.e., Romulus-N has the full n-bit security, which is a similar security bound to
ΘCB3.
If we compare Romulus-N with other size-oriented and n-bit secure AE schemes, such as
conventional permutation-based AEs using a 3n-bit permutation with an n-bit rate, the state size is
comparable (3n to 3.5n bits). Our advantage is that the underlying cryptographic primitive is
expected to be much more lightweight and/or faster because of its smaller output size (n bits for
the TBC versus 3n bits for the permutation).
In addition, the n-bit security of Romulus-N is proved under the standard model, which provides a
high-level assurance for security not only quantitatively but also qualitatively. To elaborate a bit
more, with a security proof in the standard model, one can precisely connect the security status
of the primitive to the overall security of the mode that uses this primitive. In our case, for each
of the members of Romulus, the best attack on it implies a chosen-plaintext attack (CPA) in the
single-key setting against Skinny, i.e., unless Skinny is broken by CPA adversaries in the single-key
setting, Romulus indeed maintains the claimed n-bit security. Such a guarantee is not possible
with non-standard models and it is often not easy to deduce the impact of a found “flaw” of the
primitive to the security of the mode. In a more general context, this gap between the proof and
the actual security is best exemplified by “uninstantiable” Random Oracle-Model schemes [5, 11].
To evaluate the security of Romulus, with the standard model proof, we can focus on the security
evaluation of Skinny, while this type of focus is not possible in schemes with proofs in non-standard
models.
Another interesting feature of Romulus-N is that it can reduce area depending on the use case,
without harming security. If it is enough to have a relatively short nonce or a short counter (or
both), which is common in low-power networks, we can save area by truncating the tweak
length. This is possible because Skinny allows the area to be reduced if a part of its tweak is never
used. A member of Romulus-N (Romulus-N2) particularly benefits from this feature. Note that this
type of area reduction is not possible with conventional permutation-based AE schemes: they only
offer a throughput/security trade-off.
Romulus-M follows the general construction of MRAE called
SIV [39]. Romulus-M reuses the components of Romulus-N as much as possible, and Romulus-M
is simply obtained by processing the message twice by Romulus-N. This allows a faster and smaller
operation than the TBC-based MRAE scheme SCT, yet we maintain the strong security features of SCT. That
is, Romulus-M achieves n-bit security against nonce-respecting adversaries and n/2-bit security
against nonce-misusing adversaries. Moreover, Romulus-M enjoys a useful feature called graceful
degradation, introduced with SCT. This ensures that the full n-bit security is almost retained if the
number of nonce repetitions at encryption is limited. Thanks to the shared components, most of
the advantages of Romulus-N mentioned above also hold for Romulus-M.
We present a detailed comparison of Romulus with other AE candidates in Section 6.
As the underlying TBC, we adopt Skinny proposed at CRYPTO 2016 [2]. The security of this
TBC has been extensively studied, and it has attractive implementation characteristics.
Organization of the document. In Section 2, we first introduce the basic notations and the
notion of tweakable block cipher, followed by the list of parameters for Romulus, the recommended
parameter sets, and the specification of TBC Skinny. In the last part of Section 2, we specify two
families of Romulus, Romulus-N and Romulus-M. We present our security claims in Section 3 and
show our security analysis including the provable security bounds and the status of computational
security of Skinny in Section 4. In Section 5, we describe the desirable features of Romulus. The
design rationale behind our schemes, including some details of the modes and the choice of the TBC, is
presented in Section 6. Finally, we show some implementation aspects of Romulus in Section 7.
2. Specification
2.1 Notations
Let $\{0,1\}^*$ be the set of all finite bit strings, including the empty string $\varepsilon$. For $X \in \{0,1\}^*$, let $|X|$
denote its bit length. Here $|\varepsilon| = 0$. For integer $n \ge 0$, let $\{0,1\}^n$ be the set of $n$-bit strings, and let
$\{0,1\}^{\le n} = \bigcup_{i=0,\dots,n} \{0,1\}^i$, where $\{0,1\}^0 = \{\varepsilon\}$. Let $[\![n]\!] = \{1,\dots,n\}$ and $[\![n]\!]_0 = \{0,1,\dots,n-1\}$.
For two bit strings $X$ and $Y$, $X \,\|\, Y$ is their concatenation. We also write this as $XY$ if it is
clear from the context. Let $0^i$ ($1^i$) be the string of $i$ zero bits ($i$ one bits), and for instance we write
$10^i$ for $1 \,\|\, 0^i$. Bitwise XOR of two variables $X$ and $Y$ is denoted by $X \oplus Y$, where $|X| = |Y| = c$
for some positive integer $c$. We write $\mathrm{msb}_x(X)$ (resp. $\mathrm{lsb}_x(X)$) to denote the truncation of $X$ to
its $x$ most (resp. least) significant bits. See the “Endian” paragraph below.
Padding. For $X \in \{0,1\}^{\le l}$ whose length is a multiple of 8 (i.e., a byte string), let
$$\mathrm{pad}_l(X) = \begin{cases} X & \text{if } |X| = l, \\ X \,\|\, 0^{l-|X|-8} \,\|\, \mathrm{len}_8(X) & \text{if } 0 \le |X| < l, \end{cases}$$
where $\mathrm{len}_8(X)$ denotes the one-byte encoding of the byte-length of $X$. Here, $\mathrm{pad}_l(\varepsilon) = 0^l$. When
$l = 128$, $\mathrm{len}_8(X)$ has 16 variations (i.e., byte length 0 to 15), and we encode it to the last 4 bits of
$\mathrm{len}_8(X)$ (for example, $\mathrm{len}_8(11) = 00001011$). The case $l = 64$ is treated similarly, by using the
last 3 bits.
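The padding rule above can be sketched in C as follows. This is a minimal illustration, not the reference implementation: the function name `pad_l`, its signature, and the choice to express lengths in bytes (`l_bytes` = l/8) are assumptions made here, and the buffer is assumed to hold the byte string in the order it is written.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: pad_l for byte strings.
 * len     = byte length of x (0 <= len <= l_bytes)
 * l_bytes = l / 8, e.g. 16 for l = 128 and 8 for l = 64
 * out     = buffer of l_bytes bytes receiving pad_l(x) */
static void pad_l(const uint8_t *x, size_t len, size_t l_bytes, uint8_t *out)
{
    assert(len <= l_bytes);
    if (len == l_bytes) {             /* |X| = l: returned unchanged */
        memcpy(out, x, l_bytes);
        return;
    }
    memset(out, 0, l_bytes);          /* zero filler; also gives pad_l(eps) = 0^l */
    if (len > 0)
        memcpy(out, x, len);          /* X occupies the first len bytes */
    out[l_bytes - 1] = (uint8_t)len;  /* len_8(X): byte length in the last byte */
}
```

For a 3-byte input and l = 128, the output is the 3 input bytes, 12 zero bytes, and a final byte equal to 3.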
Parsing. For $X \in \{0,1\}^*$, let $|X|_n = \max\{1, \lceil |X|/n \rceil\}$. Let $(X[1],\dots,X[x]) \xleftarrow{n} X$ be the parsing
of $X$ into $n$-bit blocks. Here $X[1] \,\|\, X[2] \,\|\, \dots \,\|\, X[x] = X$ and $x = |X|_n$. When $X = \varepsilon$, we have
$X[1] \xleftarrow{n} X$ and $X[1] = \varepsilon$. Note in particular that $|\varepsilon|_n = 1$.
Alternating Parsing. Let $n$ and $t$ be positive integers larger than 8. For $X \in \{0,1\}^*$, let
$(X[1],\dots,X[x]) \xleftarrow{n,t} X$ be the parsing of $X$ into $n$-bit blocks and $t$-bit blocks in an alternating order.
That is, we have $X[1] \,\|\, X[2] \,\|\, \dots \,\|\, X[x] = X$, where $|X[i]| = n$ for any odd $i \in \{1,\dots,x-1\}$,
$|X[i]| = t$ for any even $i \in \{1,\dots,x-1\}$, $|X[x]| \in [\![n]\!]$ if $x$ is odd, and $|X[x]| \in [\![t]\!]$ if $x$ is even.
When $X \ne \varepsilon$, $x$ is determined as
$$x = \begin{cases} 2\lfloor |X|/(n+t) \rfloor & \text{if } |X| > 0 \text{ and } |X| \bmod (n+t) = 0, \\ 2\lfloor |X|/(n+t) \rfloor + 1 & \text{if } 1 \le |X| \bmod (n+t) \le n, \\ 2\lfloor |X|/(n+t) \rfloor + 2 & \text{if } n < |X| \bmod (n+t) < n+t. \end{cases}$$
When $X = \varepsilon$, $X[1] \xleftarrow{n,t} X$ (thus $x = 1$) and $X[1] = \varepsilon$.
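The two block-count rules above can be checked with a short C sketch. The function names `num_blocks` and `num_blocks_alt` are hypothetical, and all lengths are taken in bits, as in the definitions.

```c
#include <stddef.h>

/* |X|_n = max{1, ceil(|X|/n)}: number of n-bit blocks of a `bits`-bit string. */
static size_t num_blocks(size_t bits, size_t n)
{
    size_t x = (bits + n - 1) / n;   /* ceiling division */
    return x == 0 ? 1 : x;           /* the empty string counts as one block */
}

/* Number of blocks x when parsing alternately into n-bit and t-bit blocks. */
static size_t num_blocks_alt(size_t bits, size_t n, size_t t)
{
    size_t q = bits / (n + t);       /* number of complete (n, t) pairs */
    size_t r = bits % (n + t);       /* leftover bits after the pairs */
    if (bits == 0) return 1;         /* X = eps is parsed as one empty block */
    if (r == 0)    return 2 * q;
    if (r <= n)    return 2 * q + 1; /* leftover fits in one n-bit block */
    return 2 * q + 2;                /* leftover spills into a t-bit block */
}
```

For example, with n = 128 and t = 96, a 232-bit string gives one full (n, t) pair plus 8 leftover bits, so x = 3.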
Galois Field. An element $a$ in the Galois field $\mathrm{GF}(2^n)$ will be interchangeably represented as an
$n$-bit string $a_{n-1} \dots a_1 a_0$, a formal polynomial $a_{n-1}x^{n-1} + \dots + a_1 x + a_0$, or an integer
$\sum_{i=0}^{n-1} a_i 2^i$.
Matrix. Let G be an n × n binary matrix defined over GF(2). For X ∈ {0, 1}n , let G(X) denote
the matrix-vector multiplication over GF(2), where X is interpreted as a column vector. We may
write G · X instead of G(X).
Endian. We employ little endian for byte ordering: an $n$-bit string $X$ is received as
$$X_7 X_6 \dots X_0 \,\|\, X_{15} X_{14} \dots X_8 \,\|\, \dots \,\|\, X_{n-1} X_{n-2} \dots X_{n-8},$$
where $X_i$ denotes the $(i+1)$-st bit of $X$ (for $i \in [\![n]\!]_0$). Therefore, when $c$ is a multiple of 8 and
$X$ is a byte string, $\mathrm{msb}_c(X)$ and $\mathrm{lsb}_c(X)$ denote the last (rightmost) $c/8$ bytes of $X$ and the first
(leftmost) $c/8$ bytes of $X$, respectively. For example, $\mathrm{lsb}_{16}(X) = (X_7 X_6 \dots X_0 \,\|\, X_{15} X_{14} \dots X_8)$ and
$\mathrm{msb}_8(X) = (X_{n-1} X_{n-2} \dots X_{n-8})$ with the above $X$. Since our specification is defined over byte
strings, we only consider the above case for the msb and lsb functions (i.e., the subscript $c$ is always a
multiple of 8).
2.2 Parameters
Romulus has the following parameters:
• Nonce length nl ∈ {96, 128}.
• Key length k = 128.
• Message block length n = 128.
• Counter bit length d ∈ {24, 56, 48}.
• AD block length n + t, where t ∈ {96, 128}.
• Tag length τ = 128.
• A TBC $\widetilde{E} : \mathcal{K} \times \bar{\mathcal{T}} \times \mathcal{M} \to \mathcal{M}$, where $\mathcal{K} = \{0,1\}^k$, $\mathcal{M} = \{0,1\}^n$, and $\bar{\mathcal{T}} = \mathcal{T} \times \mathcal{B} \times \mathcal{D}$. Here,
$\mathcal{T} = \{0,1\}^t$, $\mathcal{D} = [\![2^d - 1]\!]_0$, and $\mathcal{B} = [\![256]\!]_0$ for parameters $t$ and $d$, and $\mathcal{B}$ is also represented
as a byte (see Section 2.5.1). For tweak $\bar{T} = (T, B, D) \in \bar{\mathcal{T}}$, $T$ is always assumed to be a byte
string including $\varepsilon$, and $t$ is a multiple of 8. $\widetilde{E}$ is either Skinny-128-384 or Skinny-128-256 with
appropriate tweakey encoding functions as described in Section 2.4.
For $\widetilde{E}$, $T$ is used to process the nonce or an AD block, $D$ is used for the counter, and $B$ is for domain
separation, i.e., deriving a small number of independent instances.
While our submission fixes τ = 128, a tag for NAE schemes can be truncated if needed, at the
cost of decreased security against forgery. See Section 4.
NAE and MRAE families. Romulus has two families, Romulus-N and Romulus-M, and each
family consists of several members (the sets of parameters). The former implements nonce-based
AE (NAE) secure against Nonce-respecting adversaries, and the latter implements nonce Misuse-
resistant AE (MRAE) introduced by Rogaway and Shrimpton [39]. The name Romulus stands for
the set of two families.
Table 2.1: Members of Romulus.
Family      Name         Ẽ                k     nl    n     t     d    τ
Romulus-N   Romulus-N1   Skinny-128-384   128   128   128   128   56   128
            Romulus-N2   Skinny-128-384   128   96    128   96    48   128
            Romulus-N3   Skinny-128-256   128   96    128   96    24   128
Romulus-M   Romulus-M1   Skinny-128-384   128   128   128   128   56   128
            Romulus-M2   Skinny-128-384   128   96    128   96    48   128
            Romulus-M3   Skinny-128-256   128   96    128   96    24   128
Skinny Versions.
The lightweight block ciphers of the Skinny family have 64-bit and 128-bit block versions. However,
we will only use the n = 128 bit versions here. The internal state is viewed as a 4 × 4 square array
of cells, where each cell is a byte. We denote by $IS_{i,j}$ the cell of the internal state located at Row i
and Column j (counting starting from 0). One can also view this 4 × 4 square array of cells as a
vector of cells by concatenating the rows. Thus, we denote with a single subscript $IS_i$ the cell of
the internal state located at Position i in this vector (counting starting from 0) and we have that
$IS_{i,j} = IS_{4 \cdot i + j}$.
Skinny follows the TWEAKEY framework from [25] and thus takes a tweakey input instead of
a key or a key/tweak pair. The Skinny family of lightweight block ciphers has three main tweakey
size versions, but we will use only two of them: for a block size n, we will use versions with tweakey
size t = 2n and t = 3n. We denote by z = t/n the tweakey size to block size ratio. The tweakey state
is also viewed as a collection of z 4 × 4 square arrays of cells. We denote these arrays TK1 and
TK2 when z = 2, and TK1, TK2 and TK3 when z = 3. Moreover, we denote by $TKz_{i,j}$ the cell of
the tweakey state located at Row i and Column j of the z-th cell array. As for the internal state,
we extend this notation to a vector view with a single subscript: $TK1_i$, $TK2_i$ and $TK3_i$. Moreover,
we define the adversarial model SK (resp. TK1, TK2 or TK3) where the attacker cannot (resp.
can) introduce differences in the tweakey state.
Initialization.
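The cipher receives a plaintext $m = m_0 \,\|\, m_1 \,\|\, \cdots \,\|\, m_{14} \,\|\, m_{15}$, where the $m_i$ are bytes. The initialization
of the cipher’s internal state is performed by simply setting $IS_i = m_i$ for $0 \le i \le 15$:
$$IS = \begin{pmatrix} m_0 & m_1 & m_2 & m_3 \\ m_4 & m_5 & m_6 & m_7 \\ m_8 & m_9 & m_{10} & m_{11} \\ m_{12} & m_{13} & m_{14} & m_{15} \end{pmatrix}$$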
This is the initial value of the cipher internal state and note that the state is loaded row-wise
rather than in the column-wise fashion we have come to expect from the AES; this is a more
hardware-friendly choice, as pointed out in [34].
The cipher receives a tweakey input $tk = tk_0 \,\|\, tk_1 \,\|\, \cdots \,\|\, tk_{16z-1}$, where the $tk_i$ are 8-bit
cells. The initialization of the cipher’s tweakey state is performed by simply setting for $0 \le i \le 15$:
$TK1_i = tk_i$ and $TK2_i = tk_{16+i}$ when z = 2, and $TK1_i = tk_i$, $TK2_i = tk_{16+i}$ and
$TK3_i = tk_{32+i}$ when z = 3. We note that the tweakey states are loaded row-wise.
Figure 2.1: The Skinny round function applies five different transformations: SubCells (SC),
AddConstants (AC), AddRoundTweakey (ART), ShiftRows (SR) and MixColumns (MC).
SubCells. An 8-bit Sbox is applied to every cell of the cipher internal state. The action of this
Sbox is given in hexadecimal notation by the following Table 2.2.
Note that S8 can also be described with eight NOR and eight XOR operations, as depicted
in Figure 2.2. If x0 , . . ., x7 represent the eight input bits of the Sbox (x0 being the least
significant bit), each iteration applies the NOR/XOR step of Figure 2.2 followed by the bit
permutation
(x7 , x6 , x5 , x4 , x3 , x2 , x1 , x0 ) −→ (x2 , x1 , x7 , x6 , x4 , x0 , x3 , x5 ),
repeating this process four times, except for the last iteration where the permutation is replaced
by just a bit swap between x1 and x2 .
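As an illustration of this bit-level description, the following C sketch computes S8 without the lookup table. The NOR/XOR taps (x4 ⊕= NOR(x7, x6) and x0 ⊕= NOR(x3, x2)) are not spelled out in the text above; they are taken here from the original Skinny description and should be treated as an assumption. The function name `skinny_sbox8` is hypothetical.

```c
#include <stdint.h>

/* Sketch: S8 as four rounds of two NOR/XOR gates plus a bit permutation. */
static uint8_t skinny_sbox8(uint8_t x)
{
    for (int round = 0; round < 4; round++) {
        /* Assumed taps: x4 ^= NOR(x7, x6) and x0 ^= NOR(x3, x2). */
        uint8_t nor76 = (uint8_t)(~((x >> 7) | (x >> 6)) & 1);
        uint8_t nor32 = (uint8_t)(~((x >> 3) | (x >> 2)) & 1);
        x ^= (uint8_t)(nor76 << 4);
        x ^= nor32;
        if (round < 3) {
            /* (x7..x0) -> (x2, x1, x7, x6, x4, x0, x3, x5) */
            x = (uint8_t)(((x & 0x04) << 5) | ((x & 0x02) << 5) |
                          ((x & 0x80) >> 2) | ((x & 0x40) >> 2) |
                          ((x & 0x10) >> 1) | ((x & 0x01) << 2) |
                          ((x & 0x08) >> 2) | ((x & 0x20) >> 5));
        } else {
            /* Last iteration: only swap bits x1 and x2. */
            x = (uint8_t)((x & 0xF9) | ((x & 0x02) << 1) | ((x & 0x04) >> 1));
        }
    }
    return x;
}
```

The sketch can be compared entry by entry against Table 2.2; for instance the inputs 0x00 and 0x01 map to 0x65 and 0x4c, the first two table entries.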
AddConstants. A 6-bit affine LFSR, whose state is denoted (rc5 , rc4 , rc3 , rc2 , rc1 , rc0 ) (with rc0
being the least significant bit), is used to generate round constants. Its update function is
defined as:
(rc5 ||rc4 ||rc3 ||rc2 ||rc1 ||rc0 ) → (rc4 ||rc3 ||rc2 ||rc1 ||rc0 ||rc5 ⊕ rc4 ⊕ 1).
Table 2.2: 8-bit Sbox S8 used in Skinny when s = 8.
uint8_t S8[256] = {
  0x65,0x4c,0x6a,0x42,0x4b,0x63,0x43,0x6b,0x55,0x75,0x5a,0x7a,0x53,0x73,0x5b,0x7b,
  0x35,0x8c,0x3a,0x81,0x89,0x33,0x80,0x3b,0x95,0x25,0x98,0x2a,0x90,0x23,0x99,0x2b,
  0xe5,0xcc,0xe8,0xc1,0xc9,0xe0,0xc0,0xe9,0xd5,0xf5,0xd8,0xf8,0xd0,0xf0,0xd9,0xf9,
  0xa5,0x1c,0xa8,0x12,0x1b,0xa0,0x13,0xa9,0x05,0xb5,0x0a,0xb8,0x03,0xb0,0x0b,0xb9,
  0x32,0x88,0x3c,0x85,0x8d,0x34,0x84,0x3d,0x91,0x22,0x9c,0x2c,0x94,0x24,0x9d,0x2d,
  0x62,0x4a,0x6c,0x45,0x4d,0x64,0x44,0x6d,0x52,0x72,0x5c,0x7c,0x54,0x74,0x5d,0x7d,
  0xa1,0x1a,0xac,0x15,0x1d,0xa4,0x14,0xad,0x02,0xb1,0x0c,0xbc,0x04,0xb4,0x0d,0xbd,
  0xe1,0xc8,0xec,0xc5,0xcd,0xe4,0xc4,0xed,0xd1,0xf1,0xdc,0xfc,0xd4,0xf4,0xdd,0xfd,
  0x36,0x8e,0x38,0x82,0x8b,0x30,0x83,0x39,0x96,0x26,0x9a,0x28,0x93,0x20,0x9b,0x29,
  0x66,0x4e,0x68,0x41,0x49,0x60,0x40,0x69,0x56,0x76,0x58,0x78,0x50,0x70,0x59,0x79,
  0xa6,0x1e,0xaa,0x11,0x19,0xa3,0x10,0xab,0x06,0xb6,0x08,0xba,0x00,0xb3,0x09,0xbb,
  0xe6,0xce,0xea,0xc2,0xcb,0xe3,0xc3,0xeb,0xd6,0xf6,0xda,0xfa,0xd3,0xf3,0xdb,0xfb,
  0x31,0x8a,0x3e,0x86,0x8f,0x37,0x87,0x3f,0x92,0x21,0x9e,0x2e,0x97,0x27,0x9f,0x2f,
  0x61,0x48,0x6e,0x46,0x4f,0x67,0x47,0x6f,0x51,0x71,0x5e,0x7e,0x57,0x77,0x5f,0x7f,
  0xa2,0x18,0xae,0x16,0x1f,0xa7,0x17,0xaf,0x01,0xb2,0x0e,0xbe,0x07,0xb7,0x0f,0xbf,
  0xe2,0xca,0xee,0xc6,0xcf,0xe7,0xc7,0xef,0xd2,0xf2,0xde,0xfe,0xd7,0xf7,0xdf,0xff
};
[Figure 2.2: construction of the Sbox S8; input and output bits are labelled from MSB to LSB.]
The six bits are initialized to zero, and updated before use in a given round. The bits from
the LFSR are arranged into a 4 × 4 array (only the first column of the state is affected by the
LFSR bits), depending on the size of internal state:
$$\begin{pmatrix} c_0 & 0 & 0 & 0 \\ c_1 & 0 & 0 & 0 \\ c_2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix},$$
The round constants are combined with the state, respecting array positioning, using bitwise
exclusive-or. The values of the (rc5 , rc4 , rc3 , rc2 , rc1 , rc0 ) constants for each round are given
in the table below, encoded to byte values for each round, with rc0 being the least significant
bit.
Rounds Constants
1 - 16 01,03,07,0F,1F,3E,3D,3B,37,2F,1E,3C,39,33,27,0E
17 - 32 1D,3A,35,2B,16,2C,18,30,21,02,05,0B,17,2E,1C,38
33 - 48 31,23,06,0D,1B,36,2D,1A,34,29,12,24,08,11,22,04
49 - 62 09,13,26,0C,19,32,25,0A,15,2A,14,28,10,20
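The round-constant table can be regenerated from the LFSR update rule given for AddConstants. The sketch below is illustrative; the names `rc_next` and `round_constant` are hypothetical, and rounds are counted from 1 as in the table.

```c
#include <stdint.h>

/* One update of the 6-bit affine LFSR:
 * (rc5..rc0) -> (rc4, rc3, rc2, rc1, rc0, rc5 ^ rc4 ^ 1). */
static uint8_t rc_next(uint8_t rc)
{
    uint8_t fb = (uint8_t)(((rc >> 5) ^ (rc >> 4) ^ 1) & 1);
    return (uint8_t)(((rc << 1) & 0x3F) | fb);
}

/* Constant used in round `round` (1-based): the state starts at zero
 * and is updated before use in each round. */
static uint8_t round_constant(int round)
{
    uint8_t rc = 0;
    for (int r = 0; r < round; r++)
        rc = rc_next(rc);
    return rc;
}
```

Calling `round_constant` for rounds 1 to 6 yields 01, 03, 07, 0F, 1F, 3E, matching the first entries of the table.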
AddRoundTweakey. The first and second rows of all tweakey arrays are extracted and bitwise
exclusive-ored to the cipher internal state, respecting the array positioning. More formally,
for $i \in \{0, 1\}$ and $j \in \{0, 1, 2, 3\}$, we have:
• $IS_{i,j} = IS_{i,j} \oplus TK1_{i,j} \oplus TK2_{i,j}$ when z = 2,
• $IS_{i,j} = IS_{i,j} \oplus TK1_{i,j} \oplus TK2_{i,j} \oplus TK3_{i,j}$ when z = 3.
Figure 2.3: The tweakey schedule in Skinny. Each tweakey word TK1, TK2 and TK3 (if any)
follows a similar transformation update, except that no LFSR is applied to TK1.
Then, the tweakey arrays are updated as follows (this tweakey schedule is illustrated in
Figure 2.3). First, a permutation $P_T$ is applied on the cell positions of all tweakey arrays:
for all $0 \le i \le 15$, we set $TK1_i \leftarrow TK1_{P_T[i]}$ with
$$P_T = [9, 15, 8, 13, 10, 14, 12, 11, 0, 1, 2, 3, 4, 5, 6, 7],$$
and similarly for TK2 when z = 2, and for TK2 and TK3 when z = 3. This corresponds to
the following reordering of the matrix cells, where indices are taken row-wise:
$$(0, \dots, 15) \overset{P_T}{\longmapsto} (9, 15, 8, 13, 10, 14, 12, 11, 0, 1, 2, 3, 4, 5, 6, 7)$$
Finally, every cell of the first and second rows of TK2 and TK3 (for the Skinny versions
where TK2 and TK3 are used) is individually updated with an LFSR. The LFSRs used are
given in Table 2.3 (x0 stands for the LSB of the cell).
Table 2.3: The LFSRs used in the Skinny tweakey schedule. The TK parameter gives
the number of tweakey words in the cipher.
TK    s   LFSR
TK2   8   (x7 ||x6 ||x5 ||x4 ||x3 ||x2 ||x1 ||x0 ) → (x6 ||x5 ||x4 ||x3 ||x2 ||x1 ||x0 ||x7 ⊕ x5 )
TK3   8   (x7 ||x6 ||x5 ||x4 ||x3 ||x2 ||x1 ||x0 ) → (x0 ⊕ x6 ||x7 ||x6 ||x5 ||x4 ||x3 ||x2 ||x1 )
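One tweakey-schedule update (cell permutation followed by the per-cell LFSRs on the first two rows) can be sketched as below. The permutation values are those of the row-wise reordering stated above; the function names and the `which` parameter (1, 2 or 3, selecting TK1, TK2 or TK3) are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Cell permutation P_T, row-wise indices: TK_i <- TK_{P_T[i]}. */
static const uint8_t PT[16] = {9, 15, 8, 13, 10, 14, 12, 11, 0, 1, 2, 3, 4, 5, 6, 7};

/* TK2 LFSR: (x7..x0) -> (x6, x5, x4, x3, x2, x1, x0, x7 ^ x5). */
static uint8_t lfsr_tk2(uint8_t x)
{
    return (uint8_t)((x << 1) | (((x >> 7) ^ (x >> 5)) & 1));
}

/* TK3 LFSR: (x7..x0) -> (x0 ^ x6, x7, x6, x5, x4, x3, x2, x1). */
static uint8_t lfsr_tk3(uint8_t x)
{
    return (uint8_t)((x >> 1) | (((x ^ (x >> 6)) & 1) << 7));
}

/* One schedule update of a single 16-cell tweakey array. */
static void tweakey_update(uint8_t tk[16], int which)
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = tk[PT[i]];
    for (int i = 0; i < 8; i++) {          /* first and second rows only */
        if (which == 2) tmp[i] = lfsr_tk2(tmp[i]);
        if (which == 3) tmp[i] = lfsr_tk3(tmp[i]);
    }
    memcpy(tk, tmp, 16);
}
```

A quick sanity check on the two LFSRs is that they are inverses of each other, so `lfsr_tk3(lfsr_tk2(x))` returns `x` for every byte.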
ShiftRows. As in AES, in this layer the rows of the cipher state cell array are rotated, but here they
are rotated to the right. More precisely, the second, third, and fourth cell rows are rotated by 1, 2
and 3 positions to the right, respectively. In other words, a permutation $P$ is applied on the
cell positions of the cipher internal state cell array: for all $0 \le i \le 15$, we set $IS_i \leftarrow IS_{P[i]}$
with
$$P = [0, 1, 2, 3, 7, 4, 5, 6, 10, 11, 8, 9, 13, 14, 15, 12].$$
MixColumns. Each column of the cipher internal state array is multiplied by the following binary
matrix M:
$$M = \begin{pmatrix} 1 & 0 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 1 & 0 & 1 & 0 \end{pmatrix}.$$
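The two linear layers can be sketched together in C. The permutation `P` below is derived from the stated right rotations of rows 1, 2 and 3 by 1, 2 and 3 cells; the function names are hypothetical and the state is held as 16 bytes in row-wise order.

```c
#include <stdint.h>
#include <string.h>

/* ShiftRows as a cell permutation: IS_i <- IS_{P[i]}. */
static const uint8_t P[16] = {0, 1, 2, 3, 7, 4, 5, 6, 10, 11, 8, 9, 13, 14, 15, 12};

static void shift_rows(uint8_t is[16])
{
    uint8_t tmp[16];
    for (int i = 0; i < 16; i++)
        tmp[i] = is[P[i]];
    memcpy(is, tmp, 16);
}

/* MixColumns: multiply each column (a0, a1, a2, a3)^T by the binary matrix M.
 * Rows of M: (1 0 1 1), (1 0 0 0), (0 1 1 0), (1 0 1 0). */
static void mix_columns(uint8_t is[16])
{
    for (int j = 0; j < 4; j++) {
        uint8_t a0 = is[j], a1 = is[4 + j], a2 = is[8 + j], a3 = is[12 + j];
        is[j]      = (uint8_t)(a0 ^ a2 ^ a3);
        is[4 + j]  = a0;
        is[8 + j]  = (uint8_t)(a1 ^ a2);
        is[12 + j] = (uint8_t)(a0 ^ a2);
    }
}
```

Because M is binary, the multiplication reduces to three byte XORs per column, which is part of what makes the layer cheap in hardware and software.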
The final value of the internal state array provides the ciphertext with cells being unpacked in
the same way as the packing during initialization. Test vectors for Skinny-128-256 or Skinny-128-384
are provided below.
/* Skinny-128-256 */
Key:        009cec81605d4ac1d2ae9e3085d7a1f3
            1ac123ebfc00fddcf01046ceeddfcab3
Plaintext:  3a0c47767a26a68dd382a695e7022e25
Ciphertext: b731d98a4bde147a7ed4a6f16b9b587f
Note that all nonce-respecting modes have b5 = 0 and all nonce-misuse resistant modes have
b5 = 1.
- b4 is set to 1 once we have handled the last block of data (AD and message chains are treated
separately), to 0 otherwise.
- b3 is set to 1 when we are performing the authentication phase of the operating mode (i.e., when
no ciphertext data is produced), to 0 otherwise. In the special case where b5 = 1 and b4 = 1
(i.e., last block for the nonce-misuse mode), b3 will instead denote if the number of AD
blocks is even (b3 = 1 if that is the case, 0 otherwise).
- b2 is set to 1 when we are handling a message block, to 0 otherwise. Note that in the case of
the misuse-resistant modes, the message blocks will be used during the authentication phase (in
which case we will have b3 = 1 and b2 = 1). In the special case where b5 = 1 and b4 = 1 (i.e.,
last block for the nonce-misuse mode), b2 will instead denote if the number of message blocks
is even (b2 = 1 if that is the case, 0 otherwise).
- b1 is set to 1 when we are handling a padded AD block, to 0 otherwise.
- b0 is set to 1 when we are handling a padded message block, to 0 otherwise.
The reader can refer to Table ?? in the Appendix to obtain the exact specifications of the
domain separation values depending on the various cases.
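As a worked illustration of the bit assignments, the sketch below assembles the domain separation byte from named flags. The enum names and the function `ds_byte` are hypothetical; `variant_bits` stands for (b7, b6), which select the parameter set and are 0 for the Romulus-N1 and Romulus-M1 working examples.

```c
#include <stdint.h>

/* Flag names are illustrative; bit positions follow the list above. */
enum {
    DS_PADDED_M  = 1u << 0,  /* b0: padded message block */
    DS_PADDED_AD = 1u << 1,  /* b1: padded AD block */
    DS_MESSAGE   = 1u << 2,  /* b2: message block (or "M even" in the last M call) */
    DS_AUTH      = 1u << 3,  /* b3: authentication phase (or "AD even" in the last M call) */
    DS_LAST      = 1u << 4,  /* b4: last block */
    DS_MISUSE    = 1u << 5   /* b5: nonce-misuse resistant mode */
};

/* Assemble B = b7 b6 b5 b4 b3 b2 b1 b0. */
static uint8_t ds_byte(unsigned variant_bits, unsigned flags)
{
    return (uint8_t)(((variant_bits & 3u) << 6) | (flags & 0x3Fu));
}
```

With `variant_bits` = 0 this reproduces the constants appearing in the Romulus-N and Romulus-M algorithm listings: 8 and 4 for intermediate AD and message blocks, 24/26 and 20/21 for the final Romulus-N calls, and 40, 44, 36 and the base value 48 for Romulus-M.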
[Bit layout, b7 to b0: b7 b6 = parameter sets; b4 = last block; b3 = auth. (or AD even); b2 = message block (or M even); b1 = padded AD; b0 = padded M.]
Figure 2.4: Domain separation when using the tweakable block cipher
LFSR. We use LFSRs for counters. For a positive integer $c$, $\mathrm{lfsr}_c$ is a one-to-one mapping
$\mathrm{lfsr}_c : [\![2^c - 1]\!]_0 \to \{0,1\}^c \setminus \{0^c\}$ defined as follows. Let $F_c(x)$ be the
lexicographically-first polynomial among the irreducible degree-$c$ polynomials with a minimum
number of coefficients. Specifically, $F_c(x)$ for $c \in \{56, 24\}$ are
$$F_{56}(x) = x^{56} + x^7 + x^4 + x^2 + 1, \qquad F_{24}(x) = x^{24} + x^4 + x^3 + x + 1,$$
and
$$\mathrm{lfsr}_c(D) = 2^D \bmod F_c(x).$$
Note that we use $\mathrm{lfsr}_c(D)$ as a block counter, so most of the time $D$ changes incrementally
with a step of 1, and this enables $\mathrm{lfsr}_c(D)$ to generate a sequence of $2^c - 1$ pairwise-distinct
values. From an implementation point of view, it should be implemented in the sequence form,
$x_{i+1} = 2 \cdot x_i \bmod F_c(x)$.
Let $(z_{c-1} \,\|\, z_{c-2} \,\|\, \dots \,\|\, z_1 \,\|\, z_0)$ denote the state of the $c$-bit LFSR. In our modes, these LFSRs are
initialized to $1 \bmod F_c(x)$, i.e., $(0^7 1 \,\|\, 0^{c-8})$, in little-endian format. Incrementation of LFSRs is
defined as follows: for $c = 56$,
$$z_i \leftarrow z_{i-1} \text{ for } i \in [\![56]\!]_0 \setminus \{7, 4, 2, 0\}, \quad z_7 \leftarrow z_6 \oplus z_{55}, \quad z_4 \leftarrow z_3 \oplus z_{55}, \quad z_2 \leftarrow z_1 \oplus z_{55}, \quad z_0 \leftarrow z_{55}.$$
Our LFSRs are also called doubling over $\mathrm{GF}(2^c)$ in the context of modes [37].
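The counter update is a doubling in GF(2^c), which the sketch below implements on an integer state. The tap constants 0x95 and 0x1B encode the low-order terms x^7 + x^4 + x^2 + 1 and x^4 + x^3 + x + 1; they are assumptions taken from the published Romulus specification and should be checked against it. The function names are hypothetical, and the little-endian byte layout of the state is not modelled here.

```c
#include <stdint.h>

/* Low-order terms of the feedback polynomials (assumed, see lead-in). */
#define F56_TAPS 0x95ULL   /* x^7 + x^4 + x^2 + 1 */
#define F24_TAPS 0x1BULL   /* x^4 + x^3 + x + 1   */

/* One LFSR step: z -> 2*z mod F_c(x), with the state in the low c bits (c < 64). */
static uint64_t lfsr_step(uint64_t z, unsigned c, uint64_t low_taps)
{
    uint64_t mask = (1ULL << c) - 1;
    uint64_t msb  = (z >> (c - 1)) & 1;  /* bit shifted out by the doubling */
    z = (z << 1) & mask;
    if (msb)
        z ^= low_taps;                   /* reduce modulo F_c(x) */
    return z;
}

/* lfsr_c(D) = 2^D mod F_c(x), computed in sequence form from the state 1. */
static uint64_t lfsr_counter(uint64_t d, unsigned c, uint64_t low_taps)
{
    uint64_t z = 1;
    while (d--)
        z = lfsr_step(z, c, low_taps);
    return z;
}
```

In an implementation only `lfsr_step` would be called, once per processed block; `lfsr_counter` is included to make the relation to the closed form explicit.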
Tweakey Encoding. We specify the following tweakey encoding functions for implementing the
TBC $\widetilde{E} : \mathcal{K} \times \bar{\mathcal{T}} \times \mathcal{M} \to \mathcal{M}$ using Skinny-128-256 or Skinny-128-384. The tweakey encoding is a
function
$$\mathrm{encode}_{m,t} : \mathcal{K} \times \bar{\mathcal{T}} \to \mathcal{KT},$$
where $\mathcal{KT} = \{0,1\}^m$ is the tweakey space for either Skinny-128-256 with m = 256 or Skinny-128-384
with m = 384. As defined earlier, $\bar{\mathcal{T}} = \mathcal{T} \times \mathcal{B} \times \mathcal{D}$, $\mathcal{K} = \{0,1\}^k$ and $\mathcal{T} = \{0,1\}^t$, $\mathcal{D} = [\![2^d - 1]\!]_0$,
$\mathcal{B} = [\![256]\!]_0$.
• Case (m, t) = (384, 128): this variant is used for Romulus-N1 and Romulus-M1. The
encode function is defined as follows:
• Case (m, t) = (384, 96): this variant is used for Romulus-N2 and Romulus-M2. The encode
function is defined as follows:
For plaintext $M \in \{0,1\}^n$ and tweak $\bar{T} = (T, B, D) \in \mathcal{T} \times \mathcal{B} \times \mathcal{D}$, $\widetilde{E}_K^{(T,B,D)}(M)$ denotes
encryption of $M$ with $m$-bit tweakey state $\mathrm{encode}_{m,t}(K, T, B, D)$. Tweakey encode is always
implicitly applied, hence the counter $D$ is never arithmetic in the tweakey state. To avoid confusion,
we may write $\overline{D}$ (in particular when it appears in a part of tweak) in order to emphasize that this
is indeed an LFSR counter. One can interpret $\overline{D}$ as a state of the LFSR when clocked $D$ times (but in
that case it is a part of the tweakey state and not a part of the input of encode).
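where 0 here represents the 8 × 8 zero matrix, and $G_s$ is an 8 × 8 binary matrix, defined as
$$G_s = \begin{pmatrix} 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}.$$
Alternatively, let $X \in \{0,1\}^n$, where $n$ is a multiple of 8; then the matrix-vector multiplication
$G \cdot X$ can be represented as
$$G \cdot X = (G_s \cdot X[0], G_s \cdot X[1], \dots, G_s \cdot X[n/8 - 1]),$$
where
$$G_s \cdot X[i] = (X[i][1], X[i][2], X[i][3], X[i][4], X[i][5], X[i][6], X[i][7], X[i][7] \oplus X[i][0])$$
for all $i \in [\![n/8]\!]_0$, such that $(X[0], \dots, X[n/8 - 1]) \xleftarrow{8} X$ and $(X[i][0], \dots, X[i][7]) \xleftarrow{1} X[i]$, for all
$i \in [\![n/8]\!]_0$.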
The state update function $\rho : \{0,1\}^n \times \{0,1\}^n \to \{0,1\}^n \times \{0,1\}^n$ and its inverse $\rho^{-1} :
\{0,1\}^n \times \{0,1\}^n \to \{0,1\}^n \times \{0,1\}^n$ are defined as
$$\rho(S, M) = (S', C),$$
where $C = M \oplus G(S)$ and $S' = S \oplus M$, and
$$\rho^{-1}(S, C) = (S', M),$$
where $M = C \oplus G(S)$ and $S' = S \oplus M$. We note that we abuse the notation by writing $\rho^{-1}$, as this
function is only the inverse of $\rho$ with respect to its second argument. For any $(S, M) \in \{0,1\}^n \times \{0,1\}^n$,
if $\rho(S, M) = (S', C)$ holds then $\rho^{-1}(S, C) = (S', M)$. Besides, we remark that $\rho(S, 0^n) = (S, G(S))$
holds.
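The state update function and its inverse can be sketched bytewise for n = 128. Two assumptions are made here: X[i][0] is taken to be the least significant bit of byte i, so that G_s becomes a right shift with bit 7 replaced by x7 ⊕ x0, and ρ is taken as C = M ⊕ G(S), S' = S ⊕ M, consistent with the stated inverse. The names `g_byte`, `rho` and `rho_inv` are hypothetical.

```c
#include <stdint.h>

#define BLOCK_BYTES 16   /* n = 128 bits */

/* G_s on one byte: (x0..x7) -> (x1, ..., x7, x7 ^ x0), with x0 assumed to be the LSB. */
static uint8_t g_byte(uint8_t x)
{
    return (uint8_t)((x >> 1) ^ (x & 0x80) ^ (x << 7));
}

/* rho(S, M) = (S ^ M, M ^ G(S)); s is updated in place, c receives the output block. */
static void rho(uint8_t s[BLOCK_BYTES], const uint8_t m[BLOCK_BYTES], uint8_t c[BLOCK_BYTES])
{
    for (int i = 0; i < BLOCK_BYTES; i++) {
        uint8_t mi = m[i];
        c[i] = (uint8_t)(mi ^ g_byte(s[i]));
        s[i] ^= mi;
    }
}

/* rho^{-1}(S, C) = (S ^ M, M) with M = C ^ G(S); s is updated in place. */
static void rho_inv(uint8_t s[BLOCK_BYTES], const uint8_t c[BLOCK_BYTES], uint8_t m[BLOCK_BYTES])
{
    for (int i = 0; i < BLOCK_BYTES; i++) {
        uint8_t mi = (uint8_t)(c[i] ^ g_byte(s[i]));
        m[i] = mi;
        s[i] ^= mi;
    }
}
```

Starting from the same state, `rho` followed by `rho_inv` on the produced block recovers the message and leaves both copies of the state equal, which is the property decryption relies on.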
2.5.3 Romulus-N nonce-based AE mode
The specification of Romulus-N is shown in Figure 2.5. Figure 2.6 shows the encryption of Romulus-N.
For completeness, the definition of ρ is also included. Note that the algorithm always assumes
t = nl.
Algorithm Romulus-N.Enc_K(N, A, M)
 1. S ← 0^n
 2. (A[1], . . . , A[a]) ←(n,t) A
 3. if a mod 2 = 0 then u ← t else n
 4. if |A[a]| < u then wA ← 26 else 24
 5. A[a] ← pad_u(A[a])
 6. for i = 1 to ⌊a/2⌋
 7.     (S, η) ← ρ(S, A[2i − 1])
 8.     S ← Ẽ_K^(A[2i],8,2i−1)(S)
 9. end for
10. if a mod 2 = 0 then V ← 0^n else A[a]
11. (S, η) ← ρ(S, V)
12. S ← Ẽ_K^(N,wA,a)(S)
13. (M[1], . . . , M[m]) ←(n) M
14. if |M[m]| < n then wM ← 21 else 20
15. for i = 1 to m − 1
16.     (S, C[i]) ← ρ(S, M[i])
17.     S ← Ẽ_K^(N,4,i)(S)
18. end for
19. M′[m] ← pad_n(M[m])
20. (S, C′[m]) ← ρ(S, M′[m])
21. C[m] ← lsb_|M[m]|(C′[m])
22. S ← Ẽ_K^(N,wM,m)(S)
23. (η, T) ← ρ(S, 0^n)
24. C ← C[1] ‖ . . . ‖ C[m − 1] ‖ C[m]
25. return (C, T)

Algorithm Romulus-N.Dec_K(N, A, C, T)
 1. S ← 0^n
 2. (A[1], . . . , A[a]) ←(n,t) A
 3. if a mod 2 = 0 then u ← t else n
 4. if |A[a]| < u then wA ← 26 else 24
 5. A[a] ← pad_u(A[a])
 6. for i = 1 to ⌊a/2⌋
 7.     (S, η) ← ρ(S, A[2i − 1])
 8.     S ← Ẽ_K^(A[2i],8,2i−1)(S)
 9. end for
10. if a mod 2 = 0 then V ← 0^n else A[a]
11. (S, η) ← ρ(S, V)
12. S ← Ẽ_K^(N,wA,a)(S)
13. (C[1], . . . , C[m]) ←(n) C
14. if |C[m]| < n then wC ← 21 else 20
15. for i = 1 to m − 1
16.     (S, M[i]) ← ρ^−1(S, C[i])
17.     S ← Ẽ_K^(N,4,i)(S)
18. end for
19. S̃ ← (0^|C[m]| ‖ msb_(n−|C[m]|)(G(S)))
20. C′[m] ← pad_n(C[m]) ⊕ S̃
21. (S, M′[m]) ← ρ^−1(S, C′[m])
22. M[m] ← lsb_|C[m]|(M′[m])
23. S ← Ẽ_K^(N,wC,m)(S)
24. (η, T∗) ← ρ(S, 0^n)
25. M ← M[1] ‖ . . . ‖ M[m − 1] ‖ M[m]
26. if T∗ = T then return M else ⊥
Figure 2.5: The Romulus-N nonce-based AE mode. Lines of [if (statement) then X ← x else
x′] are shorthand for [if (statement) then X ← x else X ← x′]. The dummy variable η is
always discarded. We use Romulus-N1 as working example. For other Romulus-N members, the
values of the bits b7 and b6 in the domain separation need to be adapted accordingly.
[Figure panels: "Case a is even" and "Case a is odd" (AD processing, with wA ∈ {24, 26}), followed by the encryption chain.]
Figure 2.6: The Romulus-N nonce-based AE mode. (Top) process of AD with even AD blocks
(Middle) process of AD with odd AD blocks (Bottom) Encryption. We use Romulus-N1 as working
example. For other Romulus-N members, the values of the bits b7 and b6 in the domain separation
need to be adapted accordingly.
Algorithm Romulus-M.Enc_K(N, A, M)
 1. S ← 0^n
 2. (X[1], . . . , X[a]) ←(n,t) A
 3. if a mod 2 = 0 then u ← t else n
 4. (X[a + 1], . . . , X[a + m]) ←(n+t−u,u) M
 5. if m mod 2 = 0 then v ← u else n + t − u
 6. w ← 48
 7. if |X[a]| < u then w ← w ⊕ 2
 8. if |X[a + m]| < v then w ← w ⊕ 1
 9. if a mod 2 = 0 then w ← w ⊕ 8
10. if m mod 2 = 0 then w ← w ⊕ 4
11. X[a] ← pad_u(X[a])
12. X[a + m] ← pad_v(X[a + m])
13. x ← 40
14. for i = 1 to ⌊(a + m)/2⌋
15.     (S, η) ← ρ(S, X[2i − 1])
16.     if i = ⌊a/2⌋ + 1 then x ← x ⊕ 4
17.     S ← Ẽ_K^(X[2i],x,2i−1)(S)
18. end for
19. if a mod 2 = m mod 2 then
20.     (S, η) ← ρ(S, 0^n)
21. else
22.     (S, η) ← ρ(S, X[a + m])
23. S ← Ẽ_K^(N,w,a+m)(S)
24. (η, T) ← ρ(S, 0^n)
25. if M = ε then return (ε, T)
26. S ← T
27. (M[1], . . . , M[m′]) ←(n) M
28. z ← |M[m′]|
29. M[m′] ← pad_n(M[m′])
30. for i = 1 to m′
31.     S ← Ẽ_K^(N,36,i−1)(S)
32.     (S, C[i]) ← ρ(S, M[i])
33. end for
34. C[m′] ← lsb_z(C[m′])
35. C ← C[1] ‖ . . . ‖ C[m′ − 1] ‖ C[m′]
36. return (C, T)

Algorithm Romulus-M.Dec_K(N, A, C, T)
 1. if C = ε then M ← ε
 2. else
 3.     S ← T
 4.     (C[1], . . . , C[m′]) ←(n) C
 5.     z ← |C[m′]|
 6.     C[m′] ← pad_n(C[m′])
 7.     for i = 1 to m′
 8.         S ← Ẽ_K^(N,36,i−1)(S)
 9.         (S, M[i]) ← ρ^−1(S, C[i])
10.     end for
11.     M[m′] ← lsb_z(M[m′])
12.     M ← M[1] ‖ . . . ‖ M[m′ − 1] ‖ M[m′]
13. S ← 0^n
14. (X[1], . . . , X[a]) ←(n,t) A
15. if a mod 2 = 0 then u ← t else n
16. (X[a + 1], . . . , X[a + m]) ←(n+t−u,u) M
17. if m mod 2 = 0 then v ← u else n + t − u
18. w ← 48
19. if |X[a]| < u then w ← w ⊕ 2
20. if |X[a + m]| < v then w ← w ⊕ 1
21. if a mod 2 = 0 then w ← w ⊕ 8
22. if m mod 2 = 0 then w ← w ⊕ 4
23. X[a] ← pad_u(X[a])
24. X[a + m] ← pad_v(X[a + m])
25. x ← 40
26. for i = 1 to ⌊(a + m)/2⌋
27.     (S, η) ← ρ(S, X[2i − 1])
28.     if i = ⌊a/2⌋ + 1 then x ← x ⊕ 4
29.     S ← Ẽ_K^(X[2i],x,2i−1)(S)
30. end for
31. if a mod 2 = m mod 2 then
32.     (S, η) ← ρ(S, 0^n)
33. else
34.     (S, η) ← ρ(S, X[a + m])
35. S ← Ẽ_K^(N,w,a+m)(S)
36. (η, T∗) ← ρ(S, 0^n)
37. if T∗ = T then return M else ⊥
Figure 2.7: The Romulus-M misuse-resistant AE mode. Lines of [if (statement) then X ← x
else x′] are shorthand for [if (statement) then X ← x else X ← x′]. The dummy variable η is
always discarded. We use Romulus-M1 as working example. For other Romulus-M members, the
values of the bits b7 and b6 in the domain separation need to be adapted accordingly. Note that in
the case of an empty message, no encryption call has to be performed in the encryption part.
[Figure panels: authentication for the cases (a, m) = (even, even) with w ∈ {60, . . . , 63}, (even, odd) with w ∈ {56, . . . , 59}, (odd, even) with w ∈ {52, . . . , 55} and (odd, odd) with w ∈ {48, . . . , 51}, followed by the encryption chain starting from T.]
Figure 2.8: The Romulus-M misuse-resistant AE mode. (Top) process of AD with even/even,
even/odd, odd/even, odd/odd AD and M blocks respectively (Bottom) Encryption. We use
Romulus-M1 as working example. For other Romulus-M members, the values of the bits b7 and b6
in the domain separation need to be adapted accordingly.
3. Security Claims
Attack Models. We consider two models of adversaries: nonce-respecting (NR) and nonce-
misusing (NM)¹. In the former model, nonce values in encryption queries (the tuples (N, A, M ))
may be chosen by the adversary but they must be distinct. In the latter, nonce values in encryption
queries can repeat. Basically, an NM adversary can arbitrarily repeat a nonce, hence even using the
same nonce for all queries is possible. We can further specify NM by the distribution of a nonce,
such as the maximum number of repetitions of a nonce in the encryption queries.
For both models, adversaries can use any nonce values in decryption queries (the tuples
(N, A, C, T )): they can collide with a nonce in an encryption query or with other decryption queries.
Security Claims. Our security claims are summarized in Table 3.1. The variables in the table
denote the required workload, in terms of data complexity, of an adversary to break the cipher, in
logarithm base 2. The data complexity of the attacker consists of the number of queries and the total
amount of processed message blocks. If it reaches the suggested number, then there is no security
guarantee anymore, and the cipher can be broken. For simplicity, small constant factors, which
are determined from the concrete security bounds, are neglected in this table. A more detailed
analysis is given in Section 4.
We claim these numbers hold as long as Skinny is a tweakable pseudorandom permutation, that
is, it is computationally hard to distinguish Skinny from the set of uniform random permutations
(URP) indexed by the tweak (a tweakable URP or TURP), using chosen-plaintext queries in the
single-key setting.
Table 3.1: Security claims of Romulus. NR denotes Nonce-Respecting adversary and NM denotes
Nonce-Misusing adversary.
For all the members of Romulus-N, Table 3.1 shows n-bit security for privacy and authenticity against an NR adversary. For all the members of Romulus-M, Table 3.1 shows n-bit security for privacy and authenticity against an NR adversary and, in addition, n/2-bit security for privacy and authenticity against an NM adversary. The n/2-bit security assumes that the NM adversary has full control over the nonce, but in practice nonce repetition can happen accidentally, and it is conceivable that a nonce is repeated only a few times. As we present in Section 4, the security bounds of Romulus-M show the notable property of graceful security degradation with respect to the number of nonce repetitions [36]. This property is similar to that of SCT: if the number of nonce repetitions is limited, the actual security bound is close to the full n-bit security.
¹ Also known as Nonce Repeating or Nonce Ignoring. We chose “Nonce Misuse” for notational convenience of using acronyms, NR for nonce-respecting and NM for nonce-misuse.
Table 3.1 does not show the time complexity. We claim that an attacker needs k-bit time complexity against all the members of Romulus that use Skinny with k-bit keys, which is common to schemes having security proofs in the standard model. This also indicates that the time complexity of key recovery is k bits, i.e., key recovery is no easier than attacking Skinny itself, under the single-key setting. Note that all members have k = 128. See Table 3.2.
Table 3.2: Security claims of Romulus against key recovery.
4. Security Analysis
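Security Notion for TBC. The security of a TBC $\widetilde{E} : \mathcal{K} \times \mathcal{T} \times \mathcal{M} \to \mathcal{M}$ is defined by its indistinguishability from an ideal object, the tweakable uniform random permutation (TURP), denoted by $\widetilde{P}$, using chosen-plaintext, chosen-tweak queries. It is a set of independent uniform random permutations (URPs) over $\mathcal{M}$ indexed by the tweak $T \in \mathcal{T}$. Let $\mathrm{Adv}^{\mathrm{tprp}}_{\widetilde{E}}(\mathcal{A})$ denote the TPRP advantage of the TBC $\widetilde{E}$ against adversary $\mathcal{A}$. It is defined as
$$\mathrm{Adv}^{\mathrm{tprp}}_{\widetilde{E}}(\mathcal{A}) \stackrel{\mathrm{def}}{=} \Pr\left[K \stackrel{\$}{\leftarrow} \mathcal{K} : \mathcal{A}^{\widetilde{E}_K(\cdot,\cdot)} \Rightarrow 1\right] - \Pr\left[\mathcal{A}^{\widetilde{P}(\cdot,\cdot)} \Rightarrow 1\right].$$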
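Security Notions for MRAE. We adopt the security notions of MRAE following the same security definitions as above, with the exception that the adversary can now repeat nonces. We write the corresponding privacy advantage as
$$\mathrm{Adv}^{\mathrm{nm\text{-}priv}}_{\Pi}(\mathcal{A}) \stackrel{\mathrm{def}}{=} \Pr\left[K \stackrel{\$}{\leftarrow} \mathcal{K} : \mathcal{A}^{\Pi.E_K(\cdot,\cdot,\cdot)} \Rightarrow 1\right] - \Pr\left[\mathcal{A}^{\$(\cdot,\cdot,\cdot)} \Rightarrow 1\right],$$
and the authenticity advantage as
$$\mathrm{Adv}^{\mathrm{nm\text{-}auth}}_{\Pi}(\mathcal{A}) \stackrel{\mathrm{def}}{=} \Pr\left[K \stackrel{\$}{\leftarrow} \mathcal{K} : \mathcal{A}^{\Pi.E_K(\cdot,\cdot,\cdot),\,\Pi.D_K(\cdot,\cdot,\cdot,\cdot)} \text{ forges}\right].$$
We note that while NM adversaries can repeat nonces, we assume without loss of generality that they do not repeat the same query. See also [39] for reference.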
4.2 Security of Romulus-N
For $A \in \{0,1\}^*$, we say A has a AD blocks if it is parsed as $(A[1], \ldots, A[a]) \xleftarrow{n,t} A$. Let $\tilde{a} = \lfloor a/2 \rfloor + 1$, which is a bound on the actual number of primitive calls for AD. Similarly, for a plaintext $M \in \{0,1\}^*$, we say M has m message blocks if $|M|_n \stackrel{\mathrm{def}}{=} \lceil |M|/n \rceil = m$. The same applies to a ciphertext C. For an encryption query (N, A, M ) or a decryption query (N, A, C, T ) of a AD blocks and m message blocks, the total number of TBC calls is at most $\tilde{a} + m$, which is called the number of effective blocks of a query.
Let $\mathcal{A}$ be an NR adversary against Romulus-N using q encryption queries with time complexity $t_{\mathcal{A}}$ and with total number of effective blocks $\sigma_{\mathrm{priv}}$. Moreover, let $\mathcal{B}$ be an NR adversary using $q_e$ encryption queries and $q_d$ decryption queries, with total number of effective blocks for encryption and decryption queries $\sigma_{\mathrm{auth}}$, and time complexity $t_{\mathcal{B}}$. Then
$$\mathrm{Adv}^{\mathrm{priv}}_{\text{Romulus-N}}(\mathcal{A}) \le \mathrm{Adv}^{\mathrm{tprp}}_{\widetilde{E}}(\mathcal{A}'),$$
$$\mathrm{Adv}^{\mathrm{auth}}_{\text{Romulus-N}}(\mathcal{B}) \le \mathrm{Adv}^{\mathrm{tprp}}_{\widetilde{E}}(\mathcal{B}') + \frac{3q_d}{2^n} + \frac{2q_d}{2^\tau}$$
hold for some $\mathcal{A}'$ using $\sigma_{\mathrm{priv}}$ chosen-plaintext queries with time complexity $t_{\mathcal{A}} + O(\sigma_{\mathrm{priv}})$, and for some $\mathcal{B}'$ using $\sigma_{\mathrm{auth}}$ chosen-plaintext queries with time complexity $t_{\mathcal{B}} + O(\sigma_{\mathrm{auth}})$. These bounds hold for all the members of Romulus-N. Note that (n, τ ) = (128, 128) holds for all the members. If 1 ≤ τ < n (which is not a part of our submission), the scheme still keeps n-bit privacy and τ-bit authenticity.
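As a rough numeric illustration of the bounds above, the following Python sketch counts effective blocks as ã + m and evaluates the information-theoretic part of the authenticity bound, 3q_d/2^n + 2q_d/2^τ. The function names and the sample workload of 2^64 decryption queries are illustrative assumptions, and the TPRP advantage term of Skinny is not modelled.

```python
from fractions import Fraction


def effective_blocks(a: int, m: int) -> int:
    """Upper bound on TBC calls for a query with a AD blocks and m message blocks."""
    a_tilde = a // 2 + 1  # a~ = floor(a/2) + 1
    return a_tilde + m


def auth_bound_term(q_d: int, n: int = 128, tau: int = 128) -> Fraction:
    """Exact value of 3*q_d/2^n + 2*q_d/2^tau (TPRP term excluded)."""
    return Fraction(3 * q_d, 2 ** n) + Fraction(2 * q_d, 2 ** tau)


# Hypothetical workload: 2^64 decryption queries with (n, tau) = (128, 128).
term = auth_bound_term(2 ** 64)
print("effective blocks for a=3, m=2:", effective_blocks(3, 2))
print("bound term for q_d = 2^64:", float(term))
```

With τ = n the two fractions merge into 5q_d/2^n, so the term only becomes non-negligible when q_d approaches 2^n.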
The security of Romulus-N crucially relies on the n × n matrix G defined over GF(2). Let $G^{(i)}$ be an n × n matrix that is equal to G except for the (i + 1)-st to n-th rows, which are set to all zero. Here, $G^{(0)}$ is the zero matrix and $G^{(n)} = G$, and for $X \in \{0,1\}^n$, $G^{(i)}(X) = \mathrm{lsb}_i(G(X)) \,\|\, 0^{n-i}$ for all i = 0, 8, 16, . . . , n; note that all variables are byte strings, and $\mathrm{lsb}_i(X)$ is the leftmost i/8 bytes (Section 2). Let I denote the n × n identity matrix. We say G is sound if (1) G is regular (i.e., invertible) and (2) $G^{(i)} + I$ is regular for all i = 8, 16, . . . , n. The above security bounds hold as long as G is sound. The proofs are similar to those for iCOFB [14]. We have verified the soundness of our G, for a range of n including n = 64 and n = 128, by a computer program.
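The soundness conditions can be checked mechanically. The sketch below is not the authors' verification program: it is a minimal Python illustration that computes ranks over GF(2) with integer bitmasks and tests conditions (1) and (2) for n = 128. The byte-wise single-feedback matrix `GS_ROWS` is an assumption made for illustration, since this section does not spell out the entries of G, and the rotation matrix serves only as a counter-example.

```python
def rank_gf2(rows):
    """Rank over GF(2); each row is an integer bitmask (bit c set = column c)."""
    pivots = {}  # leading-bit position -> reduced row with that leading bit
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]  # clear the leading bit and keep reducing
    return len(pivots)


def block_diag(byte_rows, n=128):
    """n x n matrix that applies the same 8 x 8 matrix to every byte."""
    return [byte_rows[j % 8] << (8 * (j // 8)) for j in range(n)]


def is_sound(G, n=128):
    """Check (1) G regular and (2) G^(i) + I regular for i = 8, 16, ..., n."""
    if rank_gf2(G) != n:
        return False
    for i in range(8, n + 1, 8):
        G_i = G[:i] + [0] * (n - i)  # keep the first i rows, zero the rest
        if rank_gf2([row ^ (1 << j) for j, row in enumerate(G_i)]) != n:
            return False
    return True


# Assumed byte matrix: y_k = x_(k+1) for k < 7, and y_7 = x_0 xor x_7.
GS_ROWS = [1 << (k + 1) for k in range(7)] + [(1 << 0) | (1 << 7)]
# Counter-example: a pure bit rotation (a bit permutation, no feedback).
ROT_ROWS = [1 << ((k + 1) % 8) for k in range(8)]

print(is_sound(block_diag(GS_ROWS)), is_sound(block_diag(ROT_ROWS)))
```

Because the assumed G is block diagonal, each $G^{(i)} + I$ splits into copies of the 8 × 8 matrix plus identity and plain identity blocks, which is why a byte-level matrix suffices here.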
Then in the NR case, we have
$$\mathrm{Adv}^{\mathrm{auth}}_{\text{Romulus-M}}(\mathcal{B}) \le \mathrm{Adv}^{\mathrm{tprp}}_{\widetilde{E}}(\mathcal{B}') + \frac{5q_d}{2^n}.$$
In the NM case, we have
5. Features
The primary goal of Romulus is to provide a lightweight, yet highly-secure, highly-efficient AE based
on a TBC. Romulus has a number of desirable features. Below we detail some representative ones:
• Security margin. The Skinny family of tweakable block ciphers was published at CRYPTO 2016. Even though a thorough security analysis was provided by the authors in the original article, these primitives have attracted a lot of attention and third-party cryptanalysis in the past years. So far, the Skinny functions still offer a very comfortable security margin. For example, the Skinny members used in Romulus still have more than 50% security margin in the related-key related-tweakey model. The actual security margin is probably even higher, as these attacks cannot be directly applied to Skinny in the Romulus setting due to data limitations, limited tweak space, etc. Moreover, our security assumption on the internal primitive is only single-key, not related-key.
• Security proofs. Both Romulus-N and Romulus-M have provable security reductions to Skinny in the standard model. See [21] for the proofs. This is very important for high confidence in the security of Romulus, and it allows us to reduce the security of Romulus to that of Skinny, which has been extensively studied since its proposal in 2016.
• Beyond-birthday-bound security. The security bounds of Romulus shown in Section 4
are comparable to the state-of-the-art TBC modes of operation, namely ΘCB3 for NAE and
SCT for MRAE. In particular, Romulus-N and Romulus-M (under NR adversary) achieve
beyond-birthday-bound (BBB) security with respect to the block length. This level of security
is much stronger than the up-to-birthday-bound, n/2-bit security achieved by conventional
block cipher modes using n-bit block ciphers, e.g. GCM. Our provable security results are in
the standard model, where there is a reduction from the security of the entire modes to the
underlying primitive, Skinny, where the security of Skinny refers to the standard single-key
setting. This implies that, up to the security bounds, our schemes cannot be broken without
breaking the security of the underlying primitive in the single-key setting.
• Misuse resistance. Romulus-M is an MRAE mode which is secure against misuse (repetition) of nonces in encryption queries. More formally, it provides the best-possible security against nonce repetition in that ciphertexts do not give any information as long as the uniqueness of the input tuple (N, A, M ) is maintained. In contrast, popular nonce-based AE modes are often vulnerable to nonce repetition, and even a single repetition can be significant. For example, the famous nonce-repeat attack against GCM [19, 26] reveals its authentication key.
• Performances. Romulus-N is smaller than ΘCB3 in that it does not need an additional state beyond the internal TBC. Besides, it is faster as it processes (n + t)-bit AD blocks per TBC call. In general, it requires only $\lceil \frac{|A|-n}{n+t} \rceil + \lceil \frac{|M|}{n} \rceil + 1$ TBC calls, as opposed to ΘCB3, which requires $\lceil \frac{|A|}{n} \rceil + \lceil \frac{|M|}{n} \rceil + 1$. Although Romulus is serial in nature, i.e., not parallelizable, it was shown during the CAESAR competition that parallelizability does not lead to significant gains in hardware performance [17, 27, 29]. Moreover, parallelizability is not considered crucial in lightweight applications, so it is a small price to pay for a simple, small and fast design.
In Romulus-M, a plaintext is processed twice, once for generating a tag and once for encryption. Romulus-M inherits the overall design of Romulus-N, and thanks to the highly efficient tag generation, the efficiency loss is minimized. Romulus-M is only about 1.5 times slower than Romulus-N when the associated data is empty, and its speed approaches that of Romulus-N for long associated data.
• Simplicity/Small footprint. Romulus has a small footprint. For Romulus-N in particular, we essentially need only what is needed to implement the TBC Skinny itself. We remark that this becomes possible thanks to the permutation-based structure of Skinny’s tweakey schedule, which allows sharing the state registers used for storing input variables and for deriving round-key values. Thus, this feature is specific to our use of Skinny, though one can expect a similar effect with a TBC using a simple tweak(ey) schedule. There are no OCB-like masks applied to the primitive, and we do not need the inverse circuit for Skinny, which was needed for ΘCB3. A comparison in Section 6 (Table 6.1) shows that Romulus-N is small and especially efficient in terms of a combined metric of size and speed, compared with other schemes.
Romulus-M also has a small footprint due to the shared structure with Romulus-N.
• Small messages. Romulus-N has a small computational overhead, thus has a good perfor-
mance for small messages. For example, it just needs two TBC calls to encrypt one-block AD
and one-block message, i.e., 16 Bytes of AD and 16 Bytes of message. In particular, in the
authentication part, the first 16 Bytes of AD can be processed for free in that it is processed
without calling the TBC.
• Flexibility. Romulus is highly flexible. It is defined as a generic mode for TBCs, and the provable security reduction in the standard model contributes to high confidence in the scheme when it is combined with a secure TBC.
• Side channels and Fault Attacks. Romulus does not inherently guarantee security against Side Channel Analysis (SCA) and Fault Attacks. However, standard countermeasures are easily adaptable to Romulus, e.g. Fresh Rekeying [32], Masking [33], etc. Moreover, powerful fault attacks that require a small number of faults and pairs of faulty and non-faulty ciphertexts, such as DFA, are not applicable to Romulus-N without violating the security model, i.e., repeating the nonce or releasing unverified plaintexts.
We also note that in Romulus-N1, Romulus-N2, Romulus-M1 and Romulus-M3, we do not require the full tweakey size of Skinny-128-384, so a potential countermeasure to both SCA and DFA is to randomize the round keys by adding a random value to each TBC call. The downside of this idea is that the randomness needs to be synchronized for correct decryption, but this property is shared with most randomized SCA and DFA countermeasures. We plan to analyze this idea, and other ideas to make Romulus resistant to such attacks, in detail in subsequent works.
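The TBC-call counts quoted under Performances and Small messages above can be sketched as follows. This is a small Python illustration, not part of the specification: block counts are taken as ceilings of the bit lengths, the helper names are hypothetical, and the tweak length t = 128 is an assumed example value.

```python
def ceil_div(x: int, y: int) -> int:
    """Ceiling of x / y for integers, y > 0."""
    return -(-x // y)


def romulus_n_calls(ad_bits: int, msg_bits: int, n: int = 128, t: int = 128) -> int:
    """ceil((|A| - n)/(n + t)) + ceil(|M|/n) + 1; the first n AD bits need no TBC call."""
    ad_calls = max(0, ceil_div(ad_bits - n, n + t))  # clamp for very short AD
    return ad_calls + ceil_div(msg_bits, n) + 1


def thetacb3_calls(ad_bits: int, msg_bits: int, n: int = 128) -> int:
    """ceil(|A|/n) + ceil(|M|/n) + 1, as quoted for ThetaCB3."""
    return ceil_div(ad_bits, n) + ceil_div(msg_bits, n) + 1


# One 16-byte AD block and one 16-byte message block.
print(romulus_n_calls(128, 128), thetacb3_calls(128, 128))
# 128 bytes of AD and 128 bytes of message.
print(romulus_n_calls(1024, 1024), thetacb3_calls(1024, 1024))
```

For one AD block and one message block this gives the two TBC calls mentioned in the text, against three for ΘCB3.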
6. Design Rationale
6.1 Overview
Romulus is designed with the following goals in mind:
1. Have a very small area compared to other TBC/BC based AEAD modes.
2. Have relatively high efficiency in general.
3. Have smaller overhead and fewer TBC calls for the AD processing.
4. Use the underlying TBC as a black box, with the standard security reduction to the TBC.
Rationale of MRAE Mode. Romulus-M is designed as an MRAE mode following the structure of SIV [39] and SCT [36]. Romulus-M reuses the components of Romulus-N as much as possible to inherit its implementation advantages and its security. In fact, this brings us several advantages (not only in implementation aspects) over SIV/SCT. Compared with SCT, Romulus-M needs fewer primitive calls thanks to the faster MAC part. Moreover, Romulus-M has a smaller state than SCT because of the single-state encryption part taken from Romulus-N (SCT employs a variant of counter mode). The provable security of Romulus-M is equivalent to that of SCT: the security depends on the maximum number of repetitions of a nonce in encryption (r), and if r = 1 (i.e., an NR adversary) we have the full n-bit security. Security gradually decreases as r increases, which is also known as “graceful degradation”, and even if r equals the number of encryption queries, implying nonces are fixed, we maintain the birthday-bound, n/2-bit security.
ZAE [23] is another TBC-based MRAE. Although it is faster than SCT, its state size is much larger than that of SCT and Romulus-M.
Table 6.1: Features of Romulus-N members compared to ΘCB3 and other lightweight AEAD
algorithms: λ is the bit security level of a mode. Here, (n, k)-BC is a block cipher of n-bit block
and k-bit key, (n, t, k)-TBC is a TBC of n-bit block and k-bit key and t-bit tweak, and n-Perm is
an n-bit cryptographic permutation.
Rationale of TBC. We chose some of the members of the Skinny family of tweakable block ciphers [2] as our internal TBC primitives. Skinny was published at CRYPTO 2016 and has received a lot of attention since its proposal. In particular, a lot of third-party cryptanalysis has been provided (in part motivated by the organization of cryptanalysis competitions of Skinny by the designers), and this was a crucial point in our primitive choice. Besides, our mode requires a lightweight tweakable block cipher, and Skinny is the main such primitive. It is very efficient and lightweight, while providing a very comfortable security margin. Provable constructions that turn a block cipher into a tweakable block cipher were considered, but they are usually not lightweight, not efficient, and often only guarantee birthday-bound security.
Table 6.2: Features of Romulus-M members compared to other MRAE modes: λ is the bit security level of a mode. Here, (n, k)-BC is a block cipher of n-bit block and k-bit key, (n, t, k)-TBC is a TBC of n-bit block and k-bit key and t-bit tweak. Security is for a nonce-respecting adversary.
‡ SCT [36]      $\lceil \frac{|A|+|M|}{n} \rceil + \lceil \frac{|M|}{n} \rceil + 1$      (n, n, k)-TBC, n = k      n      4n = 4λ      1/2      8λ      Yes
SUNDAE [1]      $\lceil \frac{|A|+|M|}{n} \rceil + \lceil \frac{|M|}{n} \rceil + 1$      (n, k)-BC, n = k      n/2      2n = 4λ      1/2      8λ      Yes
♯ ZAE [23]      $\lceil \frac{|A|+|M|}{2n} \rceil + \lceil \frac{|M|}{n} \rceil + 6$      (n, n, k)-TBC, n = k      n      7n = 7λ      1/2      14λ      Yes
either kept without change, updated with the TBC round output (which includes a single round of the key scheduling algorithm) or the output of a simple linear transformation, which consists of ρ/ρ−1 , the unrolled inverse key schedule and the block counter. To estimate the hardware cost of the Romulus-N1 mode, we consider the round-based implementation with an n/4-bit input/output bus:
• 4 XOR gates for computing G.
• 64 XOR gates for computing ρ.
• 67 XOR gates for the correction of the tweakey and counting.
• 56 multiplexers to select whether to choose to increment the counter or not.
• 320 multiplexers to select between the output of the Skinny round and lt.
This adds up to 135 XOR gates and 376 multiplexers. For estimation purposes, assume an XOR gate costs 2.25 GEs and a multiplexer costs 2.75 GEs, which gives a mode overhead of 1,337.75 GEs. In the original Skinny paper [2], the authors reported that Skinny-128-384 requires 4,268 GEs, so the total is ∼5,605 GEs. This is ∼1.4 KGEs smaller than the round-based implementation of Ascon [18]. Moreover, a smart design can make use of the fact that 64 bits of the tweakey of Skinny-128-384 are not used, replacing 64 flip-flops with 64 multiplexers and saving an extra ∼200 GEs. To design a combined encryption/decryption circuit, we show below that decryption costs only an extra 32 multiplexers and ∼32 OR gates, or ∼100 GEs. A similar analysis for Romulus-N2 and Romulus-N3 estimates that they would cost 1,217 and 1,073 GEs, respectively, on top of their corresponding Skinny variants, or 5,485 and 4,385 GEs in total, respectively.
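The arithmetic behind the Romulus-N1 estimate can be reproduced with a few lines of Python. The gate counts, per-gate costs and the Skinny-128-384 figure are the ones quoted above; the variable names are only illustrative.

```python
XOR_GE = 2.25  # assumed cost of one XOR gate, in gate equivalents
MUX_GE = 2.75  # assumed cost of one multiplexer, in gate equivalents

xor_gates = {
    "G": 4,
    "rho": 64,
    "tweakey correction and counting": 67,
}
multiplexers = {
    "counter increment select": 56,
    "Skinny round output vs. lt": 320,
}

n_xor = sum(xor_gates.values())
n_mux = sum(multiplexers.values())
mode_overhead = n_xor * XOR_GE + n_mux * MUX_GE

SKINNY_128_384_GE = 4268  # round-based figure reported in the Skinny paper [2]
total = SKINNY_128_384_GE + mode_overhead

print(f"{n_xor} XORs, {n_mux} MUXes, overhead {mode_overhead} GE, total {total} GE")
```

The same arithmetic applied to the quoted Romulus-N3 figures (1,073 GEs of overhead, 4,385 GEs in total) implies 3,312 GEs for the underlying Skinny-128-256 core.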
These estimates show that Romulus-N is not just competitive theoretically, but can also be a very attractive option in practice for low-area applications. For example, the 8-bit implementation of ACORN, the smallest implementation publicly available among all the round 3 candidates of the CAESAR competition, costs 5,900 GEs, as shown in [29]. If we assume ∼1,000 GEs as the cost of the CAESAR Hardware API included in that design, as reported in [18], then Romulus-N3 is still smaller than that. Besides, we believe the area can be even lower using serial implementations of Skinny, which cost ∼3,000 GEs for Skinny-128-384 and ∼2,000 GEs for Skinny-128-256, a gain of more than 1,000 GEs compared to the round-based implementation.
Another possible optimization follows from the fact that most of the area of Skinny comes from the storage elements. Hence, we can almost double the speed of Romulus by using a simple two-round unrolling, which costs ∼1,000 GEs, as only the logic part of Skinny needs replication. This is an increase of less than 20% in terms of area.
Romulus-M is estimated to have almost the same area as Romulus-N, except for an additional set
of multiplexers in order to use the tag as an initial vector for the encryption part. This indicates
that it can be a very lightweight choice for high security applications.
For the serial implementations, we followed the currently popular bit-sliding framework [24] with minor tweaks. The state of Skinny is represented as a feedback shift register which typically operates on 8 bits at a time, while allowing the 32-bit MixColumns operation, as shown in Figure 6.2. It can be seen in Figure 6.2 that several careful design choices, such as a lightweight serializable ρ function without the need for any extra storage and a lightweight padding/truncation scheme, allow the low-area implementations to use a very small number of multiplexers on top of the Skinny circuit for the state update: three 8-bit multiplexers to be exact, two of which have a constant zero input, and ∼22 XORs for the ρ function and block counter. For the key update functions, we did several experiments on how to serialize the operations, and we found that the best trade-off is to design a parallel/serial register for every tweakey, where the key schedule and mode operations are done in the same manner as in the round-based implementation, while the AddRoundKey operation of Skinny is done serially, as shown in Figure 6.2.
[Figure, two panels. (a) Overview of the round-based architecture of Skinny. (b) Overview of the round-based architecture of Romulus. lt: the linear transformation that includes ρ, the block counter and the inverse key schedule.]
[Figure. Labels: state bytes S0 to Sf, SBox, RC, RTK, ρ, len, input, output.]
6.4 Software Implementations
We refer to the Skinny document for discussions on software implementations of the various Skinny versions. The Romulus mode will have little impact on the global performance of Skinny in software as long as serial implementations are used. We expect very little increase in ROM or RAM when compared to the Skinny benchmarks. The very performant micro-controller implementations reported in the Skinny document were benchmarked without assuming parallel cipher calls, and without any pre-processing. Therefore, Romulus will present a performance profile very similar to the numbers reported on micro-controllers. Generally, using only a small amount of RAM, Skinny is easy and efficient to implement with a simple table-based approach.
For high-end platforms, such as the latest Intel processors, very efficient highly-parallel bitsliced implementations of Skinny using SSE, AVX, AVX2 instructions on XMM/YMM registers will not be directly applicable, as our Romulus mode is serial in nature. However, in the classical case of a server communicating with many lightweight devices, we note that it would be possible to consider bitslicing the key schedule [8] of Skinny (which is relatively simple to compute) or using scheduling strategies [10]. A classical table-based implementation of Skinny will ensure acceptable performance even on legacy platforms, while Vector Permute (vperm) might lead to better results on medium-range platforms by parallelizing the computation of the Sbox.
Smaller D for Romulus-N3. The goal of Romulus-N3 is to fit the Romulus algorithm in Skinny-
128-256, which is faster and smaller than Skinny-128-384. In most lightweight applications, the
amount of data to be sent under the same key is small. Hence, Romulus-N3 represents a variant
targeted at such applications that is faster and smaller than the other variants and can encrypt up
to ∼ 256 MBs of data.
Tag Generation. Considering hardware simplicity, the tag is the final output state (i.e., the
same way as the ciphertext blocks), as opposed to the final state S of the TBC. In order to avoid
branching when it comes to the output of the circuit, the tag is generated as G(S) instead of S.
In hardware, this can be implemented as ρ(S, 0n ), i.e., similar to the encryption of a zero vector.
Consequently, the output bus is always connected to the output of ρ and a multiplexer is avoided.
Padding. The padding function used in Romulus is chosen so that the padding information is
always inserted in the most significant byte of the last block of the message/AD. Hence, it reduces
the number of decisions for each byte to only two decisions (either the input byte or a zero byte,
except the most significant byte which is either the input byte or the byte length of that block).
Besides, it is also the case when the input is treated as a string of words (16-, 32-, 64- or 128-bit
words). This is much simpler than the classical 10∗ padding approach, where every word has a
lot of different possibilities when it comes to the location of the padding string. Besides, usually
implementations maintain the length of the message in a local variable/register, which means that
the padding information is already available, just a matter of placing it in the right place in the
message, as opposed to the decoder required to convert the message length into 10∗ padding.
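The padding rule described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the normative definition (which belongs to Section 2): the block is 16 bytes, a full block is returned unchanged, and the byte the text calls most significant is taken to be the last byte of the block.

```python
def pad(block: bytes, n_bytes: int = 16) -> bytes:
    """Pad a possibly partial block: input bytes, then zero bytes, then the byte length."""
    if len(block) > n_bytes:
        raise ValueError("block longer than n_bytes")
    if len(block) == n_bytes:
        return block  # full blocks are left untouched
    zeros = bytes(n_bytes - len(block) - 1)
    return block + zeros + bytes([len(block)])  # last byte holds the byte length


print(pad(b"\x01\x02").hex())
```

Each output byte is either an input byte, a zero byte, or (in the last position) the length, which mirrors the small number of per-byte decisions mentioned in the text.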
Padding Circuit for Decryption. One of the main features of Romulus is that it is inverse
free and both the encryption and decryption algorithms are almost the same. However, it can be
tricky to understand the behavior of decryption when the last ciphertext block has length < n. In
order to understand padding in the decryption algorithm, we look at the ρ and ρ−1 functions when
the input plaintext/ciphertext is partial. The ρ function applied to a partial plaintext block is shown in Equation (6.1). If ρ−1 is directly applied to $\mathrm{pad}_n(C)$, the corresponding output will be incorrect, due to the truncation of the last ciphertext block. Hence, before applying ρ−1 we need to regenerate the truncated bits. It can be verified that $C' = \mathrm{pad}_n(C) \oplus \mathrm{msb}_{n-|C|}(G(S))$. Once $C'$ is regenerated, ρ−1 can be computed as shown in Equation (6.2):
$$\begin{bmatrix} S' \\ C' \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ G & 1 \end{bmatrix} \begin{bmatrix} S \\ \mathrm{pad}_n(M) \end{bmatrix} \quad \text{and} \quad C = \mathrm{lsb}_{|M|}(C'). \tag{6.1}$$
$$C' = \mathrm{pad}_n(C) \oplus \mathrm{msb}_{n-|C|}(G(S)) \quad \text{and} \quad \begin{bmatrix} S' \\ M \end{bmatrix} = \begin{bmatrix} 1 \oplus G & 1 \\ G & 1 \end{bmatrix} \begin{bmatrix} S \\ C' \end{bmatrix}. \tag{6.2}$$
While this looks like a special padding function, in practice it is simple. First of all, G(S) needs to be calculated anyway. Besides, the whole operation can be implemented in two steps:
$$M = C \oplus \mathrm{lsb}_{|C|}(G(S)),$$
$$S' = \mathrm{pad}_n(M) \oplus S,$$
which can have a very simple hardware implementation, as discussed in the next paragraph.
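The two-step decryption of a partial block can be checked against the encryption direction with a small Python sketch. It is an illustration only: the byte-wise map `g_byte` is an assumed single-feedback instance of G, `pad` follows the padding sketch described in this section (length in the last byte), and lsb is taken as the leftmost bytes, as stated in Section 4.

```python
N_BYTES = 16


def g_byte(x: int) -> int:
    """Assumed byte map: shift right by one bit, new top bit = x0 xor x7."""
    return (x >> 1) | (((x ^ (x >> 7)) & 1) << 7)


def G(state: bytes) -> bytes:
    return bytes(g_byte(b) for b in state)


def pad(block: bytes) -> bytes:
    if len(block) == N_BYTES:
        return block
    return block + bytes(N_BYTES - len(block) - 1) + bytes([len(block)])


def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def rho(S: bytes, M: bytes):
    """Encryption side: S' = S xor pad(M), C = lsb_|M|(pad(M) xor G(S))."""
    C = xor(M, G(S)[:len(M)])
    return xor(S, pad(M)), C


def rho_inv(S: bytes, C: bytes):
    """Decryption side in two steps: M = C xor lsb_|C|(G(S)), then S' = pad(M) xor S."""
    M = xor(C, G(S)[:len(C)])
    return xor(S, pad(M)), M


S = bytes(range(16))
M = b"partial"  # 7-byte final block
S_enc, C = rho(S, M)
S_dec, M_dec = rho_inv(S, C)
print(M_dec == M, S_dec == S_enc)
```

Both directions compute G(S) and the same state update, so neither an inverse of G nor an inverse of the TBC is needed.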
Choice of the G Matrix. We chose the position of G so that it is applied to the output state. This removes the need for G in AD processing, which improves software performance. In Section 6.2, we listed the security condition for G, and we choose our matrix G so that it meets these conditions and is well suited to various hardware and software platforms.
We noticed that for lightweight applications, most implementations use an input/output bus of width ≤ 32. Hence, we expect the implementation of ρ to be serialized depending on the bus size. Consequently, the matrix used in iCOFB can be inefficient, as it needs a feedback operation over 4 bytes, which requires up to 32 extra flip-flops in order to be serialized, something we are trying to avoid in Romulus. Moreover, the serial operation of ρ differs for each byte, which requires additional multiplexers.
However, we observed that if the input block is interpreted in a different order, both problems can be avoided. First, it is impossible to satisfy the security requirements of G without any feedback signals, i.e., when G is a bit permutation.
• If G is a bit permutation with at least one bit going to itself, then there is at least one
non-zero value on the diagonal, so I + G has at least 1 row that is all 0s.
• If G is a bit permutation without any bit going to itself, then every column in I + G has
exactly two 1’s. The sum of all rows in such matrix is the 0 vector, which means the rows are
linearly dependent. Hence, I + G is not invertible.
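The two bullet points above can be confirmed exhaustively for small widths. The following Python sketch enumerates every bit permutation on w bits and checks that I + G is singular over GF(2); it then shows that adding a single feedback bit to a rotation is enough to make both G and I + G invertible. The matrix `GS_ROWS` is an assumed example of such a single-feedback matrix, not a value taken from this section.

```python
from itertools import permutations


def rank_gf2(rows):
    """Rank over GF(2); each row is an integer bitmask (bit c set = column c)."""
    pivots = {}
    for row in rows:
        while row:
            top = row.bit_length() - 1
            if top not in pivots:
                pivots[top] = row
                break
            row ^= pivots[top]
    return len(pivots)


def identity_plus(rows):
    """Rows of I + M over GF(2)."""
    return [row ^ (1 << j) for j, row in enumerate(rows)]


def every_permutation_fails(w: int) -> bool:
    """True if I + P is singular for every w x w permutation matrix P."""
    return all(
        rank_gf2(identity_plus([1 << p[j] for j in range(w)])) < w
        for p in permutations(range(w))
    )


# Assumed single-feedback matrix: a one-bit rotation plus one extra XOR input.
GS_ROWS = [1 << (k + 1) for k in range(7)] + [(1 << 0) | (1 << 7)]

print(every_permutation_fails(8))
print(rank_gf2(GS_ROWS), rank_gf2(identity_plus(GS_ROWS)))
```

The exhaustive loop covers all 8! = 40,320 permutations for w = 8, so it is a check for this width only; the general statement is the linear-dependence argument given in the bullets.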
However, the number of feedback signals can be adjusted to our requirements, starting from only 1 feedback signal. Second, we noticed that the input block/state of length n bits can be treated as n/w independent sub-blocks of w bits each. Hence, it is enough to design a matrix Gs of size w × w bits and apply it independently n/w times, once to each sub-block. The operation applied to each sub-block in this case is the same (i.e., we can distribute the feedback bits evenly across the input block). Unfortunately, the choice of w and Gs that provides the optimal results depends on the implementation architecture. However, we found that the best trade-off across different architectures is obtained when w = 8 and Gs uses a single-bit feedback.
In order to verify our observations, we generated a family of matrices with different values of w and Gs , and measured the cost of implementing each of them on different architectures.
7. Implementations
In this section we provide implementation results and estimates. Source code can be found on our GitHub page: https://github.com/romulusae
Table 7.1: ASIC Implementations of Romulus-N1 using the TSMC 65nm standard cell library.
Power and Energy are estimated at 10 MHz. Energy is for 1 TBC call.
† Minimum Area;
‡1 GHz;
Acknowledgments
The second and fourth authors are supported by the Temasek Labs grant (DSOCL16194).
Bibliography
[1] Banik, S., Bogdanov, A., Luykx, A., Tischhauser, E.: SUNDAE: Small Universal Deterministic
Authenticated Encryption for the Internet of Things. IACR Trans. Symmetric Cryptol. 2018(3)
(2018) 1–35
[2] Beierle, C., Jean, J., Kölbl, S., Leander, G., Moradi, A., Peyrin, T., Sasaki, Y., Sasdrich, P.,
Sim, S.M.: The SKINNY Family of Block Ciphers and Its Low-Latency Variant MANTIS.
In: CRYPTO 2016 (2). Volume 9815 of Lecture Notes in Computer Science., Springer (2016)
123–153
[3] Beierle, C., Jean, J., Kölbl, S., Leander, G., Moradi, A., Peyrin, T., Sasaki, Y., Sasdrich, P.,
Sim, S.M.: The SKINNY Family of Block Ciphers and its Low-Latency Variant MANTIS.
IACR Cryptology ePrint Archive 2016 (2016) 660
[4] Beierle, C., Jean, J., Kölbl, S., Leander, G., Moradi, A., Peyrin, T., Sasaki, Y., Sasdrich,
P., Sim, S.M.: SKINNY-AEAD and SKINNY-HASH. Submission to NIST Lightweight
Cryptography Project (2019)
[5] Bellare, M., Boldyreva, A., Palacio, A.: An Uninstantiable Random-Oracle-Model Scheme
for a Hybrid-Encryption Problem. In: EUROCRYPT 2004. Volume 3027 of Lecture Notes in
Computer Science., Springer (2004) 171–188
[6] Bellare, M., Namprempre, C.: Authenticated Encryption: Relations among Notions and
Analysis of the Generic Composition Paradigm. J. Cryptology 21(4) (2008) 469–491
[7] Bellare, M., Rogaway, P., Wagner, D.A.: The EAX Mode of Operation. In: FSE 2004. Volume
3017 of Lecture Notes in Computer Science., Springer (2004) 389–407
[8] Benadjila, R., Guo, J., Lomné, V., Peyrin, T.: Implementing Lightweight Block Ciphers on x86
Architectures. In: SAC 2013. Volume 8282 of Lecture Notes in Computer Science., Springer
(2013) 324–351
[9] Bertoni, G., Daemen, J., Peeters, M., Assche, G.V.: Duplexing the Sponge: Single-Pass
Authenticated Encryption and Other Applications. In: SAC 2011. Volume 7118 of Lecture
Notes in Computer Science., Springer (2011) 320–337
[10] Bogdanov, A., Lauridsen, M.M., Tischhauser, E.: Comb to Pipeline: Fast Software Encryption
Revisited. In Leander, G., ed.: FSE 2015. Volume 9054 of Lecture Notes in Computer Science.,
Springer (2015) 150–171
[11] Canetti, R., Goldreich, O., Halevi, S.: The Random Oracle Methodology, Revisited (Preliminary
Version). In: STOC, ACM (1998) 209–218
[12] Chakraborti, A., Datta, N., Nandi, M., Yasuda, K.: Beetle Family of Lightweight and Secure
Authenticated Encryption Ciphers. IACR Trans. Cryptogr. Hardw. Embed. Syst. 2018(2)
(2018) 218–241
[13] Chakraborti, A., Iwata, T., Minematsu, K., Nandi, M.: Blockcipher-Based Authenticated
Encryption: How Small Can We Go? In: CHES 2017. Volume 10529 of Lecture Notes in
Computer Science., Springer (2017) 277–298
[14] Chakraborti, A., Iwata, T., Minematsu, K., Nandi, M.: Blockcipher-based Authenticated
Encryption: How Small Can We Go? (Full version of [13]). IACR Cryptology ePrint Archive
2017 (2017) 649
[15] Cogliati, B., Lee, J., Seurin, Y.: New Constructions of MACs from (Tweakable) Block Ciphers.
IACR Trans. Symmetric Cryptol. 2017(2) (2017) 27–58
[16] Dobraunig, C., Eichlseder, M., Mendel, F., Schläffer, M.: Ascon v1. 2. Submission to the
CAESAR Competition (2016)
[17] George Mason University: ATHENa: Automated Tools for Hardware EvaluatioN. https://cryptography.gmu.edu/athena/ (2017)
[18] Groß, H., Wenger, E., Dobraunig, C., Ehrenhöfer, C.: Suit up!–Made-to-Measure Hardware
Implementations of ASCON. In: 2015 Euromicro Conference on Digital System Design, IEEE
(2015) 645–652
[19] Handschuh, H., Preneel, B.: Key-Recovery Attacks on Universal Hash Function Based MAC
Algorithms. In: CRYPTO 2008. Volume 5157 of Lecture Notes in Computer Science., Springer
(2008) 144–161
[20] Hirose, S.: Some Plausible Constructions of Double-Block-Length Hash Functions. In: FSE
2006. Volume 4047 of Lecture Notes in Computer Science., Springer (2006) 210–225
[21] Iwata, T., Khairallah, M., Minematsu, K., Peyrin, T.: Duel of the Titans: The Romulus and
Remus Families of Lightweight AEAD Algorithms. IACR Cryptology ePrint Archive 2019
(2019) 992
[22] Iwata, T., Khairallah, M., Minematsu, K., Peyrin, T.: Remus v1. Submission to NIST
Lightweight Cryptography Project (2019)
[23] Iwata, T., Minematsu, K., Peyrin, T., Seurin, Y.: ZMAC: A Fast Tweakable Block Cipher
Mode for Highly Secure Message Authentication. In: CRYPTO 2017 (3). Volume 10403 of
Lecture Notes in Computer Science., Springer (2017) 34–65
[24] Jean, J., Moradi, A., Peyrin, T., Sasdrich, P.: Bit-Sliding: A Generic Technique for Bit-Serial
Implementations of SPN-based Primitives - Applications to AES, PRESENT and SKINNY. In:
CHES 2017. Volume 10529 of Lecture Notes in Computer Science., Springer (2017) 687–707
[25] Jean, J., Nikolic, I., Peyrin, T.: Tweaks and Keys for Block Ciphers: The TWEAKEY
Framework. In: ASIACRYPT 2014 (2). Volume 8874 of Lecture Notes in Computer Science.,
Springer (2014) 274–288
[26] Joux, A.: Authentication Failures in NIST Version of GCM. Comments submitted to NIST Modes of Operation Process (2006) Available at http://csrc.nist.gov/groups/ST/toolkit/BCM/documents/comments/800-38_Series-Drafts/GCM/Joux_comments.pdf.
[27] Khairallah, M., Chattopadhyay, A., Peyrin, T.: Looting the LUTs: FPGA Optimization of
AES and AES-like Ciphers for Authenticated Encryption. In: INDOCRYPT 2017. Volume
10698 of Lecture Notes in Computer Science., Springer (2017) 282–301
[28] Krovetz, T., Rogaway, P.: The Software Performance of Authenticated-Encryption Modes. In:
FSE 2011. Volume 6733 of Lecture Notes in Computer Science., Springer (2011) 306–327
[29] Kumar, S., Haj-Yihia, J., Khairallah, M., Chattopadhyay, A.: A Comprehensive Performance
Analysis of Hardware Implementations of CAESAR Candidates. IACR Cryptology ePrint
Archive 2017 (2017) 1261
[30] Liskov, M., Rivest, R.L., Wagner, D.A.: Tweakable Block Ciphers. In: CRYPTO 2002. Volume
2442 of Lecture Notes in Computer Science., Springer (2002) 31–46
[31] Liu, G., Ghosh, M., Song, L.: Security Analysis of SKINNY under Related-Tweakey Settings
(Long Paper). IACR Trans. Symmetric Cryptol. 2017(3) (2017) 37–72
[32] Medwed, M., Standaert, F., Großschädl, J., Regazzoni, F.: Fresh Re-keying: Security against
Side-Channel and Fault Attacks for Low-Cost Devices. In: AFRICACRYPT 2010. Volume
6055 of Lecture Notes in Computer Science., Springer (2010) 279–296
[33] Messerges, T.S.: Securing the AES Finalists Against Power Analysis Attacks. In: FSE 2000.
Volume 1978 of Lecture Notes in Computer Science., Springer (2000) 150–164
[34] Moradi, A., Poschmann, A., Ling, S., Paar, C., Wang, H.: Pushing the Limits: A Very
Compact and a Threshold Implementation of AES. In: EUROCRYPT 2011. Volume 6632 of
Lecture Notes in Computer Science., Springer (2011) 69–88
[35] Naito, Y., Sugawara, T.: Lightweight Authenticated Encryption Mode of Operation for
Tweakable Block Ciphers. IACR Cryptology ePrint Archive 2019 (2019) 339
[36] Peyrin, T., Seurin, Y.: Counter-in-Tweak: Authenticated Encryption Modes for Tweakable
Block Ciphers. In: CRYPTO 2016 (1). Volume 9814 of Lecture Notes in Computer Science.,
Springer (2016) 33–63
[37] Rogaway, P.: Efficient Instantiations of Tweakable Blockciphers and Refinements to Modes
OCB and PMAC. In: ASIACRYPT 2004. Volume 3329 of Lecture Notes in Computer Science.,
Springer (2004) 16–31
[38] Rogaway, P.: Nonce-Based Symmetric Encryption. In: FSE 2004. Volume 3017 of Lecture
Notes in Computer Science., Springer (2004) 348–359
[39] Rogaway, P., Shrimpton, T.: A Provable-Security Treatment of the Key-Wrap Problem. In:
EUROCRYPT 2006. Volume 4004 of Lecture Notes in Computer Science., Springer (2006)
373–390
[40] Sadeghi, S., Mohammadi, T., Bagheri, N.: Cryptanalysis of Reduced round SKINNY Block
Cipher. IACR Trans. Symmetric Cryptol. 2018(3) (2018) 124–162
A. Appendix
Table A.1: Domain separation byte B of Romulus. Bits b7 and b6 are to be set to the appropriate
value according to the parameter sets.
variant    b7 b6 b5 b4 b3 b2 b1 b0  int(B)  case
Romulus-N  -  -  0  0  1  0  0  0   8       A main
Romulus-N  -  -  0  1  1  0  0  0   24      A last unpadded
Romulus-N  -  -  0  1  1  0  1  0   26      A last padded
Romulus-N  -  -  0  0  0  1  0  0   4       M main
Romulus-N  -  -  0  1  0  1  0  0   20      M last unpadded
Romulus-N  -  -  0  1  0  1  0  1   21      M last padded
Romulus-M  -  -  1  0  1  0  0  0   40      A main
Romulus-M  -  -  1  0  1  1  0  0   44      M auth main
Romulus-M  -  -  1  1  1  1  1  1   63      w: (even,even,padded,padded)
Romulus-M  -  -  1  1  1  1  1  0   62      w: (even,even,padded,unpadded)
Romulus-M  -  -  1  1  1  1  0  1   61      w: (even,even,unpadded,padded)
Romulus-M  -  -  1  1  1  1  0  0   60      w: (even,even,unpadded,unpadded)
Romulus-M  -  -  1  1  1  0  1  1   59      w: (even,odd,padded,padded)
Romulus-M  -  -  1  1  1  0  1  0   58      w: (even,odd,padded,unpadded)
Romulus-M  -  -  1  1  1  0  0  1   57      w: (even,odd,unpadded,padded)
Romulus-M  -  -  1  1  1  0  0  0   56      w: (even,odd,unpadded,unpadded)
Romulus-M  -  -  1  1  0  1  1  1   55      w: (odd,even,padded,padded)
Romulus-M  -  -  1  1  0  1  1  0   54      w: (odd,even,padded,unpadded)
Romulus-M  -  -  1  1  0  1  0  1   53      w: (odd,even,unpadded,padded)
Romulus-M  -  -  1  1  0  1  0  0   52      w: (odd,even,unpadded,unpadded)
Romulus-M  -  -  1  1  0  0  1  1   51      w: (odd,odd,padded,padded)
Romulus-M  -  -  1  1  0  0  1  0   50      w: (odd,odd,padded,unpadded)
Romulus-M  -  -  1  1  0  0  0  1   49      w: (odd,odd,unpadded,padded)
Romulus-M  -  -  1  1  0  0  0  0   48      w: (odd,odd,unpadded,unpadded)
Romulus-M  -  -  1  0  0  1  0  0   36      M enc main
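The sixteen "w" rows of Table A.1 follow a regular bit pattern: b5 and b4 are both 1, and b3 to b0 encode the four tuple components in order. The following Python sketch reproduces the int(B) column from that pattern. The function and parameter names are hypothetical and not part of the specification. The tuple components are treated purely by position, since the table does not name them. The value of b7 and b6 is passed in as an assumed parameter `variant_bits`, because the table leaves those bits to the parameter set.

```python
def w_low_bits(first_even, second_even, third_padded, fourth_padded):
    """Low six bits (b5..b0) of B for a Romulus-M 'w' row of Table A.1.

    Arguments mirror the four tuple components of the 'w' rows by position:
    (even/odd, even/odd, padded/unpadded, padded/unpadded).
    """
    return (
        0b110000                      # b5 = b4 = 1 in every 'w' row
        | (int(first_even) << 3)      # b3: first component is "even"
        | (int(second_even) << 2)     # b2: second component is "even"
        | (int(third_padded) << 1)    # b1: third component is "padded"
        | int(fourth_padded)          # b0: fourth component is "padded"
    )


def domain_byte(low_bits, variant_bits):
    """Combine b5..b0 from Table A.1 with b7 b6 (assumed 2-bit parameter)."""
    return ((variant_bits & 0b11) << 6) | (low_bits & 0b111111)


if __name__ == "__main__":
    # Print the sixteen 'w' rows in the same descending order as the table.
    for fe in (True, False):
        for se in (True, False):
            for tp in (True, False):
                for fp in (True, False):
                    print(w_low_bits(fe, se, tp, fp), (fe, se, tp, fp))
```

With b7 = b6 = 0, `domain_byte` returns the int(B) value listed in the table unchanged.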
B. Changelog