Polynomials’ Roots Dataset


Finding the arbitrary roots (real or complex) of a given polynomial is a fundamental task in many areas of science and engineering. Root-finding problems arise in, e.g., control and communication systems, filter design, signal and image processing, and the encoding and decoding of information.

Most of the methods available in the literature are based on Newton’s method or derived from it, and rely on the deflation technique to sequentially find the roots of a given polynomial. However, this leads to the accumulation of rounding errors and, as a consequence, inaccurate results. Besides that, most of these methods require good initial approximations in order to converge.
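To illustrate the approach described above, the following is a minimal Python sketch of Newton's method combined with deflation. It is not the implementation of any specific method from the literature. The function names (synthetic_division, newton_root, roots_by_deflation), the starting point x0 = 0 and the tolerances are illustrative assumptions. The sketch uses real arithmetic only, so it is limited to polynomials whose roots are all real.

```python
def synthetic_division(coeffs_desc, x0):
    """Divide p(x) by (x - x0); coefficients are in descending order.

    Returns (quotient coefficients, remainder). The remainder equals p(x0).
    """
    out = [coeffs_desc[0]]
    for c in coeffs_desc[1:]:
        out.append(c + out[-1] * x0)
    return out[:-1], out[-1]


def newton_root(coeffs_desc, x0, tol=1e-12, max_iter=100):
    """Newton iteration for one real root, starting from x0 (assumed value)."""
    x = x0
    for _ in range(max_iter):
        quotient, value = synthetic_division(coeffs_desc, x)
        # If p(t) = (t - x) q(t) + p(x), then p'(x) = q(x).
        _, slope = synthetic_division(quotient, x)
        if slope == 0:
            break  # flat tangent: this simple sketch just gives up
        step = value / slope
        x -= step
        if abs(step) < tol:
            break
    return x


def roots_by_deflation(coeffs, x0=0.0):
    """coeffs = [a_0, ..., a_n], the ordering used in this dataset's header."""
    work = list(reversed(coeffs))  # descending order for synthetic division
    roots = []
    while len(work) > 2:
        r = newton_root(work, x0)
        roots.append(r)
        # Deflate: divide out the root just found. Any error in r is
        # carried into the quotient used for all remaining roots.
        work, _ = synthetic_division(work, r)
    roots.append(-work[1] / work[0])  # the last factor is linear
    return roots


# (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
example_roots = roots_by_deflation([-6.0, 11.0, -6.0, 1.0])
```

Each deflation step divides by an approximate root, so the error in that root propagates into every later quotient. This is the accumulation effect noted above.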

This dataset was built for testing and comparing tools that compute the roots of polynomials. We have used it to test artificial intelligence tools (Artificial Neural Networks and Particle Swarm Optimization) and to compare our solution with other methods available in the literature (e.g., the Durand–Kerner method).
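For readers unfamiliar with the Durand–Kerner method mentioned above, here is a minimal Python sketch of the textbook iteration. It is not the code used in our comparison. The function name durand_kerner, the initial guesses (powers of 0.4 + 0.9i, a common choice), the iteration cap and the tolerance are all assumptions made for illustration.

```python
def durand_kerner(coeffs, iterations=200, tol=1e-12):
    """Approximate all roots of a polynomial at once.

    coeffs = [a_0, ..., a_n] (ascending order, a_n != 0).
    Returns a list of n complex approximations.
    """
    n = len(coeffs) - 1
    monic = [c / coeffs[-1] for c in coeffs]  # normalise so that a_n = 1

    def p(x):
        # Horner evaluation, highest-degree coefficient first
        result = 0
        for c in reversed(monic):
            result = result * x + c
        return result

    # Distinct starting points that are neither all real nor roots of unity
    roots = [(0.4 + 0.9j) ** k for k in range(n)]

    for _ in range(iterations):
        max_step = 0.0
        for i in range(n):
            denom = 1
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            step = p(roots[i]) / denom
            roots[i] -= step
            max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    return roots


# x^2 - 3x + 2 has roots 1 and 2
approx = durand_kerner([2.0, -3.0, 1.0])
```

Unlike Newton's method with deflation, this iteration updates all root estimates simultaneously, so no polynomial division is involved.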


There are two main directories, one for polynomials with only real roots (named real) and the other for polynomials with both real and complex roots (named real_complex). These directories store the coefficients (files named deg_n_coef, with n being the degree of the polynomials) and the roots (files named deg_n_roots) of each polynomial entry.

For both cases, this data set only considers real univariate polynomials of degrees 5, 10, 15, 20 and 25. The files were saved in CSV format, with the first line being the header of the data set.

In the header, coefficients are denoted by a_i, where i (i=0,1,...,n) is the index of the coefficient within the polynomial (a_n is the coefficient of the highest-degree term). For polynomials with only real roots, roots are identified by alpha_j, where j (j=1,2,...,n) indicates the j-th root of the polynomial. The notation changes slightly for polynomials with both real and complex roots, where re_alpha_j and im_alpha_j denote, respectively, the real and the imaginary part of the j-th root.
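The header convention above can be read with a few lines of Python. This is a sketch under stated assumptions: the files are comma-separated, the column names appear exactly as written above (a_0, alpha_1, re_alpha_1, im_alpha_1, ...), and the numeric fields are in a form that Python's float() accepts. The helper names parse_coefficients and parse_roots are hypothetical. The in-memory samples use degree 2 only to keep the example short; the dataset itself contains degrees 5 to 25.

```python
import csv
import io


def parse_coefficients(row, degree):
    """Return [a_0, ..., a_n] from one row of a coefficients file."""
    return [float(row[f"a_{i}"]) for i in range(degree + 1)]


def parse_roots(row, degree, complex_roots):
    """Return the n roots stored in one row of a roots file."""
    roots = []
    for j in range(1, degree + 1):
        if complex_roots:
            re = float(row[f"re_alpha_{j}"])
            im = float(row[f"im_alpha_{j}"])
            roots.append(complex(re, im))
        else:
            roots.append(float(row[f"alpha_{j}"]))
    return roots


# Toy stand-ins for a coefficients file and a roots file (not real data)
coef_file = io.StringIO("a_0,a_1,a_2\n0.5,1.0,1.0\n")
roots_file = io.StringIO(
    "re_alpha_1,im_alpha_1,re_alpha_2,im_alpha_2\n-0.5,0.5,-0.5,-0.5\n"
)

coef_row = next(csv.DictReader(coef_file))
roots_row = next(csv.DictReader(roots_file))

coeffs = parse_coefficients(coef_row, degree=2)
roots = parse_roots(roots_row, degree=2, complex_roots=True)
```

With the real files, the StringIO objects would be replaced by open() calls on the corresponding deg_n_coef and deg_n_roots files.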

To generate the polynomials with only real roots, two algorithms were used: (i) one that generates real roots for any polynomial degree, and (ii) one that, given a set of real roots, computes the corresponding coefficients. For polynomials with both real and complex roots, the opposite strategy was employed: the coefficients were generated first and, from these, the exact solutions (i.e., the roots, which can be real or complex) were calculated.
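The real-root strategy (roots first, then coefficients) can be sketched in Python as follows. This is an illustration only, not the Mathematica code used to build the files. The function names, the uniform sampling of roots in [-1, 1] and the choice of a monic polynomial (a_n = 1) are assumptions.

```python
import random


def random_real_roots(degree, rng):
    """Draw `degree` real roots from [-1, 1] (uniform sampling is assumed)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(degree)]


def coefficients_from_roots(roots):
    """Expand (x - r_1)(x - r_2)...(x - r_n).

    Returns [a_0, ..., a_n] in ascending order, with a_n = 1.
    """
    coeffs = [1.0]  # the constant polynomial 1
    for r in roots:
        # Multiply the current polynomial by (x - r)
        new = [0.0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] -= r * c      # contribution of the -r term
            new[k + 1] += c      # contribution of the x term
        coeffs = new
    return coeffs


rng = random.Random(0)  # fixed seed so the sketch is reproducible
roots = random_real_roots(5, rng)
coeffs = coefficients_from_roots(roots)
```

For example, coefficients_from_roots([1.0, 2.0]) gives [2.0, -3.0, 1.0], i.e. x^2 - 3x + 2.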

It is also important to point out that these files were generated using the Mathematica software with double-precision arithmetic.

Attribute information

For polynomials with only real roots, the roots were generated in the closed interval [-1, 1]. For polynomials with both real and complex roots, the coefficients were generated in the closed interval [0, 1].

This is a multivariate data set, with the number of attributes being equal to n (for polynomials with only real roots) or n * 2 (for polynomials with both real and complex roots).

General information

There are 100 000 instances per polynomial degree, and the associated recommended task is regression. Besides that, there are no missing values.

Citation request

If you use this data set, please cite the following paper: A Neural Network-Based Approach for Approximating the Arbitrary Roots of Polynomials, to be submitted to IEEE Access.

Other Datasets

Soon we will release other datasets regarding Covid-19 and Football.

To request data

To request the data, please fill in the form below (contact form) and let us know who you are and what you are planning to use the data for. Do not forget to include your institution and institutional e-mail address.