Trial encoding algorithms ensemble

This paper proposes trial algorithms for some basic components in cryptography and lossless bit compression. The symmetric encryption is accomplished by mixing up randomizations and scrambling with hashing of the key playing an essential role. The digital signature is adapted from the Hill cipher with the verification key matrices incorporating un-invertible parts to hide the signature matrix. The hash is a straight running summation (addition chain) of data bytes plus some randomization. One simplified version can be burst error correcting code. The lossless bit compressor is the Shannon-Fano coding that is less optimal than the later Huffman and Arithmetic coding, but can be conveniently implemented without the use of a tree structure and improvable with bytes concatenation.


Introduction
This paper proposes trial encoding algorithms for some common primitives in cryptography and lossless bit compression. The symmetric encryption is slightly faster than some common block ciphers. It consist of randomization + scrambling (permutation) + randomization. The randomization keys (seeds) are the hashes of the key and the scrambling key is the key itself. The IV is hidden (a costly practice, thus optional) to compensate for possible unevenness in the IV/initial generator state relation. The digital signature is an adaptation of the Hill cipher by hiding the signature key matrix with un-invertible verification matrices and turned around into a signature scheme. They are substantially faster than some common schemes. Key pairs can be derived straight from any user specified code for example, user password. The hash has similar speed to some existing ones. Its collision resistance comes from a substantial mixture of nonlinear functions to thwart differential cryptanalysis. One non-cryptographic version can be burst error correcting code. The Shannon-Fano bit compressor is not always optimal like Huffman coding, but simple enough that a medium array may be used as the data structure in place of a binary tree. It can also be complemented with bytes concatenation and Rice coding for better compression. The cryptography part generally follows the guideline of essential cryptographic practices as illustrated in the article "An Overview of Cryptography" (Kessler 2013). Miscellaneous details of relevance including source codes are in "An Encoding Kit" (Lipin Bill 2013).
The test/coding platform for all benchmarks in this document is shown in Table 1.

Symmetric encryption
First generate a random number as the IV (initialization vector). IV is encrypted because among other issues, it is an input parameter which is partially under the control of the adversary (Erik 2007). Data is randomized (stream cipher like), followed by scrambling (block cipher like), and ending with another round of randomization. For convenience, this is named RSC (randomization scrambling cipher).

The algorithm
The random key stream generation is based on the cyclic piling up of modular addition, XOR, AND, increment, and byte permutation of the key (seed).
This generator is compared with the secure random generator from C# system call. No differences have been detected in some statistical randomness tests (frequency, gap, runs, poker and serial) (Lipin Bill 2013;Lott 2005). It also passed all NIST randomness tests (Andrew et al. 2010), as shown in Table 2: bit sequence length = 100,000 (except 1,000,000 for Universal Statistical test), pass decision = p-value > 0.01 (significance level).
The bit scrambling (permutation) is a bit by bit swap with a bit decided by the key.
Putting it altogether: (6 bytes) Key is K = (K1 K2 K3 K4 K5 K6). Data is D = D1 D2 D3 … Generate random R = (R1 R2 R3 R4 R5 R6) as the IV (initialization vector). Get H = hash of K. Any efficient hash can be used, but better a dedicated one (Erik 2007 Continue with the pseudorandom generator to get another random R. Now set D2 = D2^R. Repeat this to the last data byte. Scramble the data either as a whole or block by block for better performance with K as the scrambling key. When the data bit index reaches 6 x 4 = 24 bytes = 192 bits, bit scrambling is switched to byte scrambling since 24! > 2^48, because bit scrambling is expensive. Get H = hash of H.  Do another round of randomization with the same pseudorandom generator and H as the seed, but set D = D + R. Concatenate IV and D to wrap up encryption. Decryption is the natural reverse of encryption.

Security aspect
The initial inner state of the key stream generator is the hash of K + IV, and should be indistinguishable from random (Erik 2007). Since theoretically hash does not meet that criterion, IV is randomly generated to begin with, also encryption of IV is intended to mitigate the potential problem assuming that encrypted IV is no less random than plain IV. Evidently the goal is to break any exploitable predictability of the initial state from the encrypted IV. Statistical tests were thus conducted to help determining whether this is effective. For a given E1/E2 differential, no statistical deviation from random has been found in the H1/H2 differentials, where E is an encrypted IV and H is the corresponding initial state (Lipin Bill 2013). These differentials also passed all NIST randomness tests (Andrew et al. 2010). As a rule of thumb, the key/ IV size complexity should be 4 times the length of data to be encrypted. The size of the inner state is 2 times the key size, but can be increased (at the expense of performance), as there are indications in the literature that 4 times is safer. The IV encryption is made of a XOR and a modular addition with a bit scrambling sandwiched in between. The bit scrambling results in brute force complexity of the order of n!/((n/2)! (n/2)!) ≈ 2 n , where n is the key size in bits. However, IV encryption is only optional. Data encryption is similar to IV encryption: 1 round of XOR key stream randomization and 1 round of modular addition key stream randomization with 1 round of scrambling sandwiched in between. Confusion obviously comes from the combined actions of all 3 rounds while diffusion is accomplished by the scrambling round which is optional, as OTP and its weaker versions, stream ciphers are provably/ acceptably secure with confusion only. The brute force complexity is similar to that of IV encryption, with the effective key size being the lesser of data and key. Key space of the randomizations are made of 2 hashes of the key. The scrambling key is the key itself. In addition, because the IV is not a mere additive to 1st block of plaintext, it has to be factored into the linear approximation of a known plaintext encryption. In other words, linear cryptanalysis will have to start with IV cracking (except for man-inthe-middle attack where the attacker could inter-cept the communications and change the IV on the fly (Frederic 2004)), followed by modification of the key with the IV addition and the subsequent hashing. Another issue is that the random IV is not a nonce, its birthday problem (if IV is deemed as not long enough) is thus rectified with the removal of XOR malleability by mixing up randomizations and scrambling.
As input has no bearing on key stream generation, plaintext input differential cryptanalysis is not applicable to RSC. The main issue here is with the key differential. Since a key stream is primarily made of modular addition and XOR, obviously these relations between a pair of keys K1 and K2 will be carried over to their respective key streams, that is, had it not been for hashing and IV. The seeds to the key streams are H(K1 + IV1) for one and H(K2 + IV2) for the other. The K1/K2 differential is not preserved in H1/H2. So such attacks cannot be trivially effective. Similarly, some stream ciphers were subject to plaintext IV attacks on the key, for example, it was shown that two IV's with some given difference may produce the same key stream (Eli & Orr 2007). However, because the IV1/ IV2 relation does not go to H1/H2, this cannot be effectively applied to RSC either, although more study is still needed on whether the initial generator state is random enough relative to the (encrypted) IV. Table 3.The performance is basically independent of the key size and slightly faster than some of the well-known block ciphers. This is conceivably because RSC does mostly byte scrambling (if data is over 10 times bigger than the key) rather than bit scrambling (permutation). But the initial stage is slowed down by random IV generation, IV encryption, key hashing and bit scrambling.

Symmetric encryption summary
As is well known, this is the black/fine art of striking a balance between the conflicting demands of security and performance. This is true of both the key stream generation and scrambling. The general concept is to produce just enough randomness with minimum operations to ward off some known attacks. But it comes with a costly initial stage where IV is hidden (contrary to common practice, although optional).

Digital Signature
This algorithm is an adaptation of the Hill Cipher (Murray 1999;John & Matthew 2010;Aldar 2009), thus named HDS (Hill digital signature) for convenience.
With half the matrix missing, the signature key is quite clearly not derivable from the verification key. Hence presumably the only viable way to fake a signature is to attack the verification formula. This amounts to solving the linear But this cannot be done because V1 and V2 are un-invertible matrices. The verification key size is the matrix size which can be for example, 2 x 2 x 32 bytes = 128 bytes. The signature faking complexity is assumed to be 1 / 2 of the missing matrix size, or 1 / 4 of the matrix size = 128 / 4 = 32 bytes. Thus the ratio of verification key size to effective key size is 128 / 32 = 4 times. The data size expansion after signature is typically about 4 times.

Performance aspect
AHC benchmark is shown in Table 4, indicating faster speed than the other algorithms tested. Its operation time is proportional to the key size.

Digital Signature summary
Matrix pairs are generated by a pseudorandom generator with a user specified seed (thus keys can be generated on the fly). Then with uninvertible matrices as part of the verification key, Hill cipher is adapted into a digital signature scheme. However, it has an overhead in bigger key size and data size expansion after signature.

Hash functions
This hash tallies running modular summations (addition chain) of data bytes. Another round of running XOR's is tacked on, mixed with more non-linear functions, and named matterof-factly RSH (running summation hash) for convenience.

The algorithm
Optionally, salting may be added to prevent precalculated rainbow table attacks.

Security aspect
In essence, the pre-image resistance is to match the hash size. The collision resistance is to match the inherent birthday problem's one half of the hash size. Pre-image resistance alludes to hash being one way, given a specific message M, it's computationally infeasible to derive M or any of its peers (collisions, if M is bigger than H) from H (M). Collision resistance is the same definition with the word "specific" changed to "arbitrary". From a practical point of view, the only difference between the two is that birthday attacks can be applied to finding collisions, but not to pre-images.
RSH's collision resistance relies on the assumed hardness to solve a system of non-linear equations. If we replace all modular additions by XOR's in a hash function operating on n-bit words (bytes) and originally containing r modular additions, we get probability(3/4) r (n -1) that the output is the same as in the original hash function (Nieke 2011). So linearization cannot help much with cryptanalysis, as this is not much different from the brute force complexity in order of magnitude.
However, all hashes are subject to differential cryptanalysis somewhat like all symmetric ciphers except OTP are subject to linear cryptanalysis. While linear cryptanalysis cannot be applied to hashes (unless the non-linear part of hashing is really weak) because there is no such thing as a "partial" collision, there is no escape from exploitation of uneven differentials among hash inputs and outputs. Assuming for example, that the hash is strictly linear with respect to XOR (just for convenience of discussion, not realistic because pre-images can be directly solved and all collisions found), then for a given input A, any input B such that H(A^B) = 0 would be a collision. The implication is that however poor a linear approximation may be, a differential input pair would give a related output pair (not entirely random). As a realistic example, if a hash is made of modular addition and XOR, then it can be attacked with combined differentials of modular addition and XOR (Nieke 2011).
RSH's approach is to "maximize" non-linearity without losing performance too much. The non-linearity comes from mixing up modular addition and XOR, modular multiplication of the running summation with the running index of looping in postProc(), bit permutation in postPermute() and some AND's at the end. Therefore, combined differentials attacks have to include modular multiplication, bit permutation and AND.
The optimum birthday attack resistance is n/2 bits, where n is the hash size. This is equivalent to the hash output being evenrandom input leads to random output, so does perfectly ordered input. No statistical deviation from random has been found in tests conducted for this (Lipin Bill 2013). These outputs also passed all NIST randomness tests (Andrew et al. 2010).

Performance aspect
RSH benchmark is shown in Table 5. Its speed is basically independent of the hash size and quite similar to some of the hashes currently available except for shorter hashes.
Non-cryptographic hasherror correcting code The 1 pass only version is tailor made for data error detection and correction by solving the quasi-linear equations set of erroneous data bytes (if they are contiguous) out of the hash. If there is a unique solution, then the error can be corrected.
A few extra bytes of encodings are attached to the hash in order to get a unique error correcting because the equations to be solved are quasi-linear (overflow discarded). If the (recalculated) hash is changed upon data reception indicating errors, then it is possible to restore the corrupted bytes from the hash and the attached bytes.

Summary of hash functions
2 rounds of running summations and/or XOR's form the backbone of hashing. With additional non-linear functions packed into the hashing, methodically cracking the collision resistance may be no faster than brute force birthday attacks. The 1 pass only version can be used for burst data error detection and correction.

Bit compressor
This is a simpler precursor of Huffman coding. In essence, it is just a bit representation that reflects the actual probability of each number in the data. This is in contrast to the "regular" bits that assume all possible numbers are equally likely in the data. For example, byte 21 is equal to or less than 127, therefore the 1st bit is 0, then it is compared against 63 and so on, thus arriving at 00010101. But for a typical piece of text data, some bytes would be more frequent than the others. Therefore, the 1st comparison would be with something less than 127, the next would be less than 63, and so on. Thus, it presumably would result in shorter bits along the same line of the later and more optimal Huffman coding, as proposed by Shannon and Fano, and therefore named SFC (Shannon-Fano coding) for convenience.
As a simple illustration, assuming there are 6 different bytes in the data with the following counts in nonascending order: Cnt  …, thus arriving at the medium array, M[] = {0, 0, 1, 1, 2, 2, 3, 3, 4, 5}. Some rearrangement is made to this array to shorten it before eventually attaching it to the compressed data. But this is not critical, as this array is typically quite short compared with the data. There is also a Boolean array associated with the medium array, E[] = {0, 1, 0, 1, 0, 1, 0, 1, 0, 1}, indicating if the element is at the end (of a search).
Byte 0  Decompression reverses this process, mapping the bits back to 0, 1, 2, …, then back to 15, 76, 59, 123, 68, 154. Depending on the text size, some common words may show up quite often. So concatenation of data bytes into bigger units may pay off in achieving better compression. For better performance, the principle of Rice coding is borrowed here. That is, the compression is done not on the data itself, but on the number of bits in each data unit. The (transformed) data stripped of the top bit (must be 1) is sent as it is. Decompression extracts the recorded number of bits back into each unit. Thus compression is primarily data mapping, with bit conversion playing a lesser role. This (recursive) concatenation scheme is somewhat like dictionary coding, such as LZW, in that it can also be applied to the likes of GIF, PNG and clip arts.
Performance SFC benchmark is shown in Table 6. Compression and speed are near identical with Huffman coding for common texts, unless data frequency distribution is significantly "undividable".

Summary of Bit compressor
This calculates the frequency of each data byte just like Huffman coding. But instead of a binary tree as in Huffman coding, an easier to handle medium array is constructed. The downside of this is that the bit conversion is not always optimal. This happens when a medium deviates significantly from evenly dividing a set of data. Conversion of a data byte to (shorter) bits is done by comparing it with the medium array which is then replaced by a mapping table for speed. Where appropriate, data bytes may be concatenated into bigger units for better compression while largely maintaining its speed.

Conclusions
Some trial encoding algorithms were proposed. Their implementation is quite simple relative to some comparable encoders. Performance comparisons with some well-known similar algorithms were conducted. The results show better or no worse performance: the symmetric cipher has a slight edge over some common block ciphers, the digital signature can run considerably faster than some common schemes, the hash and the bit compressor are near the same with some of the most common. To sum up the approaches: optional encryption of the IV at the outset, followed by 2 rounds of data randomization with 1 round of data scrambling sandwiched in between in the symmetric cipher, adaptation of Hill cipher by use of un-invertible matrices as part of the verification keys in the signature scheme, chaining of addition/XOR/multiplication, followed by bit permutation and ending with some AND's in the hash. The bit compressor is slightly different from Huffman coding that uses an array rather than a binary tree as the data structure, with an advantage and a disadvantage. Compression may be enhanced through bytes concatenation and Rice coding. A non-cryptographic version of the hash is for burst error correction.

A.1 CSPRNG proposal
All encryptions and signatures require randomization. The C# CSPRNG (cryptographically secure pseudorandom number generator) can be used for this purpose. Alternatively, the following generator is proposed for better performance.
This is the 2 rounds of randomizations in RSC wrapped into one. Like an individual round, it also passed all NIST randomness tests (Andrew et al. 2010).
To function as a stream cipher, let key1 be the IV randomized secret key and key2 be the hash of key1.

A.2 RSC with nonce
Rather than random IV's, a variation of RSC can use IV that is nonce (non-repeating number), the simplest of which is a plain sequence, 0, 1, 2, 3, … Nonce ensures each encrypted message can only be valid once without incorporating a time stamp or unique ID in the plain message, thus cannot be re-used over the internet by hijackers.
IV is transmitted in the open (no encryption).
2nd randomization key = RK2 = PRF(RK1). The PRF here should be computationally indistinguishable from random, leading to a costly initial stage in the encryption. But depending on the strength of the underlying key stream generator, this requirement can be relaxed for better efficiency (Erik 2007). In lieu of a dedicated PRF that is yet to be found in the literature, the following series of RSH based key generators is proposed. The idea is to list them in increasing complexity (thus decreasing efficiency), with the optimum choice being the one with best efficiency, yet enough security for a particular situation. Given that the in-distinguishability from random is a generally sufficient, but not always necessary condition for a viable PRF, all proposals are therefore tested (for plain sequence input) with the NIST randomness test suite, as a general reference to their relative strengths. keyHash_1() (2nd round removed) and keyHash_2() (modular multiplications removed) are simplified versions of postProc(). They are the only ones failing the NIST randomness tests, but nonetheless likely sufficient for the initial generator state setup given random IVs and possibly for ordered nonce IVs too, although more study is needed for that. So it appears that in general, keyHash_3() is the best tradeoff between security and efficiency. In fact, it is the one used in RSC. On the other hand, keyHash_4() and keyHash_5() may be "overkills" in typical situations, as postProc() alone passed all NIST randomness tests for plain sequence input.

A.3 RSC scrambling
This is to break any exploitable correlation between input and output (diffusion). For testing purposes, the input is kept simple, 10 random bytes, followed by 0's. Bit scrambling is shown, so is byte scrambling which is sufficient to accomplish diffusion if input size is substantially larger than key size. Actual scrambling is mixed.

A.4 RSC linear cryptanalysis
For simplicity, only the 1st round of randomization is taken into consideration here (just for illustration, as 1 round of randomization can easily be cracked with divide and conquer, rather than the pretentious method below): C1 = P1^S1, C2 = P2^S2, …, Cn = Pn^Sn, where C is a cipher byte, P is a plain byte, S is a key stream element and n is the key size in bytes. From the key stream algorithm, S1 = (K1^Kn) + Kn, S2 = (K2^S1) + S1, S3 = (K3^S2) + S2, …, therefore the total number of modular additions in these equations is effectively n. If we replace all the modular additions with XOR, then the accuracy of the resulting linear approximation is (3/4) n (8 -1) (Nieke 2011), or (3/4)  (1/2) 50 , if n = 16 bytes. An effective attack thus requires the order of 2 50 known plaintexts, hence is generally impractical.
In addition, there is another round of randomization and the effect of one bit of IV change leading to an entirely different randomization key, H(K + IV).

A.5 Matrix pair generator
HDS uses a fast matrix pair generator that creates a matrix as (I + Δ1) x (I + Δ2) x (I + Δ3) x … where I is the identity matrix, Δ is a random matrix with all 0 except one non-diagonal element is 1 or -1. All the random numbers used in the matrix generation can be from a pseudorandom generator with a user specified seed. The inverse is by virtue of the property ( . The inverse of each component is the same except that the non-zero non-diagonal element is negated. Besides keeping the matrix simple, it is faster than inverting an existing matrix when matrix dimension reaches 20 × 20.

A.6 HDS Signature Faking
Faking is obtained by solving the linear equation, V x E = D, or (C x A) x E = D. If the matrix dimension is 2 x 2, this reduces to a1 x e1 + a2 x e2 = k1, where k1 = d1 + d2. The expected fake resistance is 1 / 4 of the matrix size. Test results indicate similar order of magnitude between the actual and expected fake resistance, as shown in Table 7.

A.8 RSH divide and conquer
One potentially viable approach to crack the collision resistance is to attack the 2 rounds of running summations/XOR's separately. In this approach, if 2 messages have identical output from both rounds, then they would be collisions, thus bypassing the post processing made of modular multiplication, bit permutation and AND. While it's quite feasible to find messages M1 and M2, such that either R1(M1) = R1(M2), or R2(M1) = R2(M2), the outcome of R1 has no bearing on that of R2, and vice versa. Looking for common solutions this way, is therefore effectively no different from a brute force birthday attack. In fact, it can be shown that in the simple case of m = 3 and h = 2, where m is the number of message bytes and h is the number of hash bytes, the solutions for the 2 rounds are mutually exclusive, out of the 256 R1 collisions for a given R1 hash, no pair collide in R2.

A.9 RSH Differential Cryptanalysis
This is to find m, Δm with Δm ≠ 0 and Δh(m) = 0, where Δm = m^m' and Δh(m) = h(m)^h(m'), where^can  In the latter case, if Δm = m + m' = 1, then m/m' can be 0/1 or 2/3 leading to h(m)/h(m') = 0/3 or 2/1, or Δh(m) = h(m) + h(m') = 3, thus distinctively uneven. RSH is beyond these, the inaccuracy of the linear approximation of modular multiplication, bit permutation and AND is no less than the hash size complexity, thus negating the feasibility of this approach.

A.10 RSH Dry Run
Data: For people still clinging to decent jobs, the challenge is more complicated.
Data: For people still clinging to decent jobs, the challenge is more complicated, If the number of corrupted data bytes is k and k < = n, where n is the number of hash bytes, then the corrupted bytes can be solved from k quasi-linear equations out of the hash. But because any overflow beyond 1 byte is lost from the hash, the solution has some collisions. So Σ (di x i) is stored in a 2 bytes integer, with i ranging from 1 to m, where m is the total number of data bytes contained in the hash. It is attached to the hash along with 1 byte total sum and 1 byte total XOR. The collisions are matched with the attachment to single out the real solution. Obviously this can only work if the hash is intact. So duplicates of hash may be attached to both ends of the data. Original