Discrete and Algorithmic Mathematics (Red de Matemática Discreta y Algorítmica)

Recent trends II: Codes, finite fields and skew polynomials

Polynomials are used in virtually every area of Mathematics, and Coding Theory is no exception. Among the most widely used error-correcting codes in practice are Reed-Solomon codes [13], defined as subspaces whose vectors are evaluations of polynomials (over a finite field $\mathbb{F}_q$) up to a certain degree: $\mathcal{C} = \{ (f(a_1),\ldots, f(a_n)) : \deg(f) < k \} \subseteq \mathbb{F}_q^n$, for distinct $a_1, \ldots, a_n \in \mathbb{F}_q$. As in other applications, the error- and erasure-correcting properties of such codes stem from basic properties of polynomials: they can be multiplied, the degree of a product is the sum of the degrees, the number of zeros of a polynomial is upper bounded by its degree, etc. Using these properties, one can easily show that Reed-Solomon codes are optimal for error correction. In other words, they are maximum distance separable, or MDS, that is, their correction capability (measured by the Hamming distance) attains the Singleton bound.
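
As a small illustration, here is a minimal Python sketch (with arbitrary illustrative parameters $q = 7$, $n = 6$, $k = 3$, not taken from the text) that builds a Reed-Solomon code over $\mathbb{F}_7$ by brute force and checks that its minimum Hamming distance attains the Singleton bound $d \le n - k + 1$.

```python
# Minimal sketch (illustrative parameters): a Reed-Solomon code over F_7 with
# n = 6 and k = 3, built by evaluating all polynomials of degree < k.
from itertools import product

q, n, k = 7, 6, 3
points = list(range(1, n + 1))   # distinct evaluation points a_1, ..., a_n in F_7

def evaluate(coeffs, a):
    """Evaluate f(x) = coeffs[0] + coeffs[1]*x + ... at the point a, modulo q."""
    return sum(c * pow(a, i, q) for i, c in enumerate(coeffs)) % q

# The code: evaluation vectors of all polynomials of degree < k.
codewords = [tuple(evaluate(f, a) for a in points)
             for f in product(range(q), repeat=k)]

# Since the code is linear, its minimum Hamming distance is the minimum weight
# of a nonzero codeword; it attains the Singleton bound d <= n - k + 1 (MDS).
d = min(sum(c != 0 for c in w) for w in codewords if any(w))
print(d, n - k + 1)   # 4 4
```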

As the reader knows, polynomials can be generalized in a myriad of ways: multivariate polynomials, algebraic functions over a curve, power series, etc. Over a finite field $\mathbb{F}_{q^m}$, one may consider linearized polynomials, i.e., those of the form $f = a_0x + a_1x^q + a_2x^{q^2} + \cdots + a_d x^{q^d}$, where $a_0, a_1, \ldots, a_d \in \mathbb{F}_{q^m}$. These polynomials behave essentially like classical polynomials: if we define the degree as $\deg(f) = d$ whenever $a_d \neq 0$ and the (non-commutative) product as the composition $f \circ g$, then such a product is again a linearized polynomial and $\deg(f \circ g) = \deg(f) + \deg(g)$. Now, evaluations yield $\mathbb{F}_q$-linear maps (hence the adjective linearized), and the dimension of the subspace of zeros is once again upper bounded by the degree of the polynomial. Using these linearized polynomials, one obtains the well-known Gabidulin codes [3], which have found applications, among other places, in error correction for Network Coding in the IEEE award-winning paper [6].
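
The following minimal Python sketch illustrates these two facts in the smallest interesting setting (the choices $q = 2$, $m = 3$, i.e., $\mathbb{F}_8 = \mathbb{F}_2[t]/(t^3 + t + 1)$, and the polynomial $f = x + x^2$ are illustrative assumptions, not taken from the text): evaluating a linearized polynomial is an $\mathbb{F}_2$-linear map, and its zeros form a subspace of dimension at most its degree.

```python
# Minimal sketch over F_8 = F_2[t]/(t^3 + t + 1) (so q = 2, m = 3): field
# elements are 3-bit integers, and a linearized polynomial is a coefficient
# list [a_0, a_1, ...] representing a_0*x + a_1*x^2 + a_2*x^4 + ...
MOD = 0b1011                     # t^3 + t + 1, irreducible over F_2

def gf_mul(a, b):
    """Multiply two elements of F_8 (carry-less product reduced modulo t^3 + t + 1)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= MOD
    return r

def lin_eval(coeffs, x):
    """Evaluate the linearized polynomial sum_i coeffs[i] * x^(2^i) at x."""
    r = 0
    for i, c in enumerate(coeffs):
        xi = x
        for _ in range(i):       # x^(2^i) by repeated squaring (Frobenius)
            xi = gf_mul(xi, xi)
        r ^= gf_mul(c, xi)
    return r

f = [1, 1]                       # f(x) = x + x^2, of degree 1

# Evaluation is an F_2-linear map: f(x + y) = f(x) + f(y) (addition in F_8 is XOR).
assert all(lin_eval(f, x ^ y) == lin_eval(f, x) ^ lin_eval(f, y)
           for x in range(8) for y in range(8))

# Its zeros form an F_2-subspace of dimension at most deg(f) = 1.
zeros = [x for x in range(8) if lin_eval(f, x) == 0]
print(zeros)                     # [0, 1], a subspace of dimension 1
```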

Such polynomials over finite fields have been known since the nineteenth century [11], and were thought to be $q$-analogs of classical polynomials (they look similar, but need not be particular cases of a common general class of polynomials). However, it turns out that a common general theory of polynomials and their zeros does exist. In the 1930s, Ore [12] asked which products on the vector space of polynomials yield a (possibly non-commutative) ring structure such that $\deg(f \cdot g) = \deg(f) + \deg(g)$. Interestingly, such products are exactly those satisfying $x a = \sigma(a) x + \delta(a)$, where $x$ is the variable, $a$ is an arbitrary element of the field, $\sigma$ is a field endomorphism, and $\delta$ is a so-called $\sigma$-derivation (e.g., a classical derivation if $\sigma$ is the identity). He called them non-commutative polynomials, later known as skew polynomials. The classical case (and the only commutative one) is given by $\sigma$ the identity and $\delta = 0$, whereas linearized polynomials over the finite field $\mathbb{F}_{q^m}$ are obtained when $\sigma(a) = a^q$ and $\delta = 0$ (over finite fields one can always assume $\delta = 0$).
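
A minimal Python sketch of Ore's product is given below, again over $\mathbb{F}_8$ with $\sigma(a) = a^2$ and $\delta = 0$ (the field encoding is the same as in the previous sketch, and the two polynomials being multiplied are arbitrary illustrative choices): applying the commutation rule $x a = \sigma(a) x$ coefficient by coefficient, the degree of a product is the sum of the degrees, but the product is no longer commutative.

```python
# Minimal sketch of Ore's skew product over F_8 (same field encoding as above),
# with sigma(a) = a^2 (the Frobenius) and delta = 0, so that x * a = sigma(a) * x.
MOD = 0b1011                     # F_8 = F_2[t]/(t^3 + t + 1)

def gf_mul(a, b):
    """Multiply two elements of F_8 (as in the previous sketch)."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:
            a ^= MOD
    return r

def sigma(a, i=1):
    """Apply the Frobenius a -> a^2 exactly i times."""
    for _ in range(i):
        a = gf_mul(a, a)
    return a

def skew_mul(f, g):
    """Multiply skew polynomials given as coefficient lists [c_0, c_1, ...] (c_i x^i)."""
    h = [0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            h[i + j] ^= gf_mul(a, sigma(b, i))   # (a x^i)(b x^j) = a sigma^i(b) x^(i+j)
    return h

f = [3, 0, 1]                    # f = 3 + x^2   (degree 2)
g = [5, 1]                       # g = 5 + x     (degree 1)
print(skew_mul(f, g))            # degree 3 = deg(f) + deg(g)
print(skew_mul(f, g) == skew_mul(g, f))   # False: the ring is non-commutative
```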

However, it was not until the 1980s that the zeros of such general polynomials were studied (after the invention of Gabidulin codes [3]). In a series of papers, and in the language of Vandermonde matrices, Lam and Leroy [7, 8] studied the structure of the zeros of skew polynomials in general. More concretely, their zeros can be seen as tuples or lists of vector subspaces, and the sum of their dimensions is upper bounded by the degree of the skew polynomial. In the case of classical polynomials, each subspace in the list corresponds to a field element and has dimension either $0$ or $1$: a $1$ means a zero and a $0$ means a non-zero (thus the sum of dimensions is simply the number of zeros). In the case of linearized polynomials, there is a single subspace and its dimension gives the number of zeros.

In Coding Theory, general skew polynomials yield the so-called linearized Reed-Solomon codes [9], which, as expected, constitute a common generalization of Reed-Solomon codes and Gabidulin codes. The structure of zeros of skew polynomials studied by Lam and Leroy implies that linearized Reed-Solomon codes are optimal for error and erasure correction with respect to a new metric, called the sum-rank metric (corresponding to the sums of dimensions mentioned above). This new metric is also a common generalization of the Hamming metric and the rank metric.
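To make the sum-rank metric concrete, the following minimal Python sketch computes the sum-rank weight of a vector over $\mathbb{F}_8$ (elements encoded as 3-bit integers, as in the sketches above; the vector and the block partitions are arbitrary illustrative choices): each block is expanded into a binary matrix and the weight is the sum of the ranks of those matrices. Blocks of size one recover the Hamming weight, and a single block recovers the rank weight.

```python
# Minimal sketch of the sum-rank weight over F_8 (elements as 3-bit integers):
# each block of the vector is expanded into a 3 x n_i binary matrix and the
# weight is the sum of the F_2-ranks of those matrices.
def f2_rank(rows):
    """Rank over F_2 of a list of rows given as integer bitmasks."""
    basis = []                             # independent rows found so far
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)              # cancel b's leading bit from r if present
        if r:
            basis.append(r)
    return len(basis)

def sum_rank_weight(vector, block_sizes, m=3):
    """Sum of the ranks of the matrices obtained by expanding each block over F_2."""
    weight, start = 0, 0
    for n_i in block_sizes:
        block = vector[start:start + n_i]
        # Row b of the matrix collects bit b of every coordinate in the block.
        rows = [sum(((x >> b) & 1) << j for j, x in enumerate(block)) for b in range(m)]
        weight += f2_rank(rows)
        start += n_i
    return weight

v = [3, 3, 6, 0]                           # a vector over F_8 (illustrative choice)
print(sum_rank_weight(v, [1, 1, 1, 1]))    # blocks of size 1 -> Hamming weight: 3
print(sum_rank_weight(v, [4]))             # one single block -> rank weight: 2
print(sum_rank_weight(v, [2, 2]))          # intermediate case -> sum-rank weight: 2
```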

Even though these codes are very recent (2018) and their full potential is not yet known, they have already been used to (partially) close some open problems in Coding Theory. For example, in 2021 Cai et al. [2] and Gopi and Guruswami [5] independently provided the first constructions of locally repairable codes that are also maximally recoverable and have the smallest possible finite-field sizes. These constructions are based on linearized Reed-Solomon codes and the properties of the zeros of skew polynomials. Finding such codes with the smallest field sizes (and thus computationally efficient) was a problem posed independently over a decade ago by researchers from Microsoft [4] and IBM [1], with applications to the reliability of distributed storage of Big Data.

This semester (spring of 2024), the Simons Institute for the Theory of Computing (University of California, Berkeley) is organizing the “Error-Correcting Codes: Theory and Practice” semester program. Gopi and Guruswami (authors of [5]) are co-organizers of this event. In 2022, a monograph covering the codes discussed in this article appeared in the journal Foundations and Trends in Communications and Information Theory [10].


References

[1] M. Blaum, J. L. Hafner, and S. Hetzler. Partial-MDS codes and their application to RAID type of architectures. IEEE Transactions on Information Theory, 59(7):4510-4519, 2013.

[2] H. Cai, Y. Miao, M. Schwartz, and X. Tang. A construction of maximally recoverable codes with order-optimal field size. IEEE Transactions on Information Theory, 68(1):204-212, 2021.

[3] E. M. Gabidulin. Theory of codes with maximum rank distance. Problems of Information Transmission, 21(1):1-12, 1985.

[4] P. Gopalan, C. Huang, B. Jenkins, and S. Yekhanin. Explicit maximally recoverable codes with locality. IEEE Transactions on Information Theory, 60(9):5245-5256, 2014.

[5] S. Gopi and V. Guruswami. Improved maximally recoverable LRCs using skew polynomials. IEEE Transactions on Information Theory, 68(11):7198-7214, 2022.

[6] R. Kötter and F. R. Kschischang. Coding for errors and erasures in random network coding. IEEE Transactions on Information Theory, 54(8):3579-3591, 2008.

[7] T. Y. Lam. A general theory of Vandermonde matrices. Expositiones Mathematicae, 4:193-215, 1986.

[8] T. Y. Lam and A. Leroy. Vandermonde and Wronskian matrices over division rings. Journal of Algebra, 119(2):308-336, 1988.

[9] U. Martínez-Peñas. Skew and linearized Reed-Solomon codes and maximum sum rank distance codes over any division ring. Journal of Algebra, 504:587-612, 2018.

[10] U. Martínez-Peñas, M. Shehadeh, and F. R. Kschischang. Codes in the Sum-Rank Metric: Fundamentals and Applications. Foundations and Trends in Communications and Information Theory, 19(5):814-1031, 2022.

[11] E. H. Moore. A two-fold generalization of Fermat’s theorem. Bulletin of the American Mathematical Society, 2(7):189-199, 1896.

[12] O. Ore. Theory of non-commutative polynomials. Annals of Mathematics, 34(3):480-508, 1933.

[13] I. S. Reed and G. Solomon. Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2):300-304, 1960.


The author of this post, Umberto Martínez-Peñas, is a Profesor Contratado Doctor (permanent Lecturer) at the University of Valladolid and a member of the SINGACOM group.