This data comes from a Number Field Sieve calculation to factor the RSA-155 challenge number (512 bits), using CADO-NFS revision 47b84c3. The c155.purged.gz file contains the matrix output by the purge step (input of the merge step), while the other files correspond to the matrix output by the merge/replay steps (input of the linear algebra step).

The c155.purged.gz file contains a matrix defined over GF(2), with 17117966 rows and 17117806 columns; the column indices range from 0 to 56385930. After a header line containing that information, each row is encoded on one line: the pair a,b and a colon, followed by the list of column indices for that row, written as hexadecimal numbers between 0 and 56385930.
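As an illustration of the row encoding described above, here is a minimal Python sketch that splits one row line into its a,b label and its column indices. The function name parse_purged_row is hypothetical, and the sketch assumes that the hexadecimal indices are separated by commas, which the description above does not state; the a,b part is kept as an uninterpreted string. The header line must be skipped by the caller.

```python
def parse_purged_row(line):
    """Split one row line into (label, column_indices).

    Assumption: the line looks like "a,b:h1,h2,...", where the h's are
    comma-separated hexadecimal column indices.
    """
    # Everything before the first colon is the a,b label.
    label, _, indices = line.strip().partition(":")
    # Convert each hexadecimal token to an integer, skipping empty tokens.
    cols = [int(tok, 16) for tok in indices.split(",") if tok]
    return label, cols
```

With gzip.open("c155.purged.gz", "rt") from the standard library, the file can be read line by line without decompressing it to disk first.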

The matrix represented by the other files is also defined over GF(2). It has 3854889 rows and 3854728 columns, with 170 non-zero coefficients per row on average. It is stored in binary as a sequence of rows, with no header. Each row consists of a 32-bit little endian integer giving the row weight, followed by that many 32-bit little endian integers giving the column indices of the non-zero elements.
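The binary row layout described above can be read with a few lines of Python. This is only a sketch: the name iter_rows is hypothetical, and the integers are assumed to be unsigned, which the description does not say explicitly.

```python
import struct

def iter_rows(stream):
    """Yield each row of the binary matrix as a list of column indices.

    Assumption: weights and indices are unsigned 32-bit little endian.
    """
    while True:
        head = stream.read(4)
        if not head:
            return  # clean end of file
        if len(head) < 4:
            raise ValueError("truncated row weight")
        (weight,) = struct.unpack("<I", head)
        body = stream.read(4 * weight)
        if len(body) != 4 * weight:
            raise ValueError("truncated row")
        # "<%dI" unpacks `weight` little endian 32-bit unsigned integers.
        yield list(struct.unpack("<%dI" % weight, body))
```

Typical use would be: with open("c155.sparse.bin", "rb") as f, then for row in iter_rows(f). Rows are produced one at a time, so the whole file never needs to be held in memory.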

As is fairly common with the Number Field Sieve factorization algorithm, the densest columns of the relation matrix produced by the algorithm are put aside before linear algebra processing, and the required correction is performed later on. Here, 32 columns were put aside; they are stored in the c155.dense.bin file.

The row weight and column weight files, for each matrix, are stored simply as lists of 32-bit little endian integers.
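A weight file in the format just described can be decoded as follows. The function name decode_weights is hypothetical, and the integers are again assumed to be unsigned.

```python
import struct

def decode_weights(data):
    """Decode the raw bytes of a weight file into a list of integers.

    Assumption: each entry is an unsigned 32-bit little endian integer.
    """
    if len(data) % 4 != 0:
        raise ValueError("length is not a multiple of 4 bytes")
    # iter_unpack yields one 1-tuple per 4-byte entry.
    return [w for (w,) in struct.iter_unpack("<I", data)]
```

As a sanity check, the sum of the row weights of a matrix should equal its number of non-zero coefficients, and so should the sum of its column weights.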

  • c155.purged.gz
        1.0G    17117966x17117806, where the column indices go from 0 to 56385930
  • c155.sparse.bin
        2.6G    3854889x3854728, 655331417 non-zero coefficients
        15M     row weights
        15M     column weights
  • c155.dense.bin
        108M    3854889x32, 48173007 non-zero coefficients
        15M     row weights
        <1k     column weights