Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Computing echelon form of smaller matrix takes longer than larger matrix? #2129

Open
user202729 opened this issue Dec 25, 2024 · 5 comments
Open

Comments

@user202729
Copy link

user202729 commented Dec 25, 2024

Use flint's algorithm through SageMath's interface (which uses fmpq_mat_rref under the hood).

sage: entry_size = 10000
....: num_col = 20
....: num_row = 20
....: A = matrix(QQ, [[randint(1, 2^entry_size) for col in range(num_col)] for row in range(num_row)
....: ])
....: t = walltime()
....: A.echelonize(algorithm="flint")
....: t = walltime(t)
....: t
1.3925843238830566
sage: entry_size = 10000
....: num_col = 40
....: num_row = 40
....: A = matrix(QQ, [[randint(1, 2^entry_size) for col in range(num_col)] for row in range(num_row)
....: ])
....: t = walltime()
....: A.echelonize(algorithm="flint")
....: t = walltime(t)
....: t
0.022356271743774414

Why is it that in the first case with a 20 × 20 matrix it takes >1 second while in the second case it is instant?

In both cases the matrix is invertible, thus the echelon form is identity.

(larger context: sagemath/sage#39197 , if flint is the fastest in all cases it would be easiest to just switch to flint, but this is not the case at the moment)

@albinahlback
Copy link
Collaborator

Can you replicate this in C, that is, make a MWE? I am not familiar with how Python works under the hood.

@albinahlback
Copy link
Collaborator

Perhaps the tuning parameters could be changed for fmpz_mat_rref. Nonetheless, they should probably be tuned on a hardware-level.

@user202729
Copy link
Author

Sure:

#include <flint/flint.h>
#include <flint/fmpz.h>
#include <flint/fmpz_mat.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

void random_fmpz_mat(fmpz_mat_t mat, slong entry_size) {
    slong i, j;
    flint_rand_t state;
    flint_randinit(state);
    for (i = 0; i < fmpz_mat_nrows(mat); i++) {
        for (j = 0; j < fmpz_mat_ncols(mat); j++) {
            fmpz* tmp = fmpz_mat_entry(mat, i, j);
            fmpz_randbits(tmp, state, entry_size);
            fmpz_add_ui(tmp, tmp, 1);
        }
    }
}

int main() {
    slong entry_size = 10000;
    slong num_col = 20, num_row = 20;
    //slong num_col = 40, num_row = 40;

    fmpz_mat_t A;
    fmpz_mat_init(A, num_row, num_col);

    random_fmpz_mat(A, entry_size);

    clock_t start = clock();
    fmpz den;
    fmpz_init(&den);
    fmpz_mat_rref(A, &den, A);
    clock_t end = clock();

    double time_taken = (double)(end - start) / CLOCKS_PER_SEC;
    printf("Time taken: %f seconds\n", time_taken);

    fmpz_mat_clear(A);

    return 0;
}

Change the parameter between 20 and 40 at the commented out line.

@albinahlback
Copy link
Collaborator

I would guess fmpz_mat_rref is suboptimal in the sense that it does not take the entry sizes into consideration. I don't know about the complexity of the operation, but it looks like the the entry sizes where not taken into consideration.

@albinahlback
Copy link
Collaborator

I suppose @fredrik-johansson is the expert here. Perhaps he can answer when he is back from vacation and has time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants