Skip to content

olepoeschl/NQueensFAF

Repository files navigation

NQueensFAF

Highly optimized Solvers for the N queens problem, one for GPUs (definitely try this one) and one for CPUs. Also provides useful utilities for implementing custom N queens problem solving algorithms.
Features are:

  1. supports GPU-computing and also CPU-Multithreading
  2. distribution among multiple GPUs (soon done automatically in proportion to performance)
  3. auto progress save and continuing from save file
  4. some advanced settings, see explanation of the command line usage

Can run on Windows, Linux and also Mac (for download and installation see below).
The standard version uses a command line interface, which is easy to use (examples below). However we also linked a GUI program of an older version.
Built with Java 17.
NOTE: currently only works for NVIDIA and integrated Intel GPUs.

1 Distributed Computing

We recently started a distributed computing project for solving the N Queens problem! The goals are:

  1. Solve N=27 and confirm the results of the TU Dresden. (currently in progress)
    You can check the current progress on the 27 queens problem here.
  2. Solve N=28 and set the new world record.

1.1 How to contribute

The project is currently offline due to further development!

2 Local Computing

2.1 Download and Installation

The following sections always refer to the most recent version of NQueensFAF. In case you want to try out older versions, just visit the Releases section. The installation process remains the same as described below.

The simplest Way

  1. Choose the Latest Stable Release or the Latest Nightly Build and download the zip file that fits your Operating System.
  2. Unpack the zip file and open a console in the unpacked directory.
  3. Run the application by typing ./nqueensfaf-cli -n 16 (Linux) or nqueensfaf-cli -n 16 (Windows).
  4. Check out section 5 Usage for an overview of all possible commands including examples.
    NOTE: If you can not run the program try the command chmod +x nqueensfaf-cli between steps 2 and 3.

The Java Way

  1. First install Java for your OS.
  2. Choose the Latest Stable Release or the Latest Nightly Build and download the file "nqueensfaf-cli-***.jar" (NOTE the -cli suffix).
  3. open a console in the directory where the jar is located.
  4. run the jar by typing java -jar nqueensfaf-cli.jar -n 16 cpu (same command for all operating systems).
  5. Check out the section [Usage]('docs/5 Usage) for an overview of all possible commands including examples.

A GUI Version

In case you prefer a graphical user interface, you have to wait until version 3.0.0 is released =)

3 News

  • We are currently developing a new solver which is based on a completely new method. Solving N=22 on the 12600k (single-threaded) takes only 2h25min, which corresponds to a speedup factor of more than 40 compared to the present solver. Additionally, the method possesses much better scaling. At N=24 the speedup factor is already 100. This is still a work in progress and there are lots of optimizations that have to be implemented, so lets see how far we can go. The new solver will be included in the repository as soon as it is finished. However, this may take some time.

Older News

  • The distributed computing for solving the 27 Queens problem has begun.
    Download the client and get started (see 1 Distributed Computing)!
  • We are excited to announce that we have successfully verified the number of solutions for the 26-Queens problem.
    The computation was performed using 3 GPUs (2x3070, 1x3060ti) and it took slightly more than 3 weeks to finish.
    27 - Here we come!

4 Current Benchmarks

During the time we have spent developing NQueensFAF, we have been able to continuously expand our available hardware. Especially the newer graphics cards show the potential of our program.

GPUs

Board size N 18 19 20 21 22 23 24 25
RTX 3080 FE 0.03s 0.77s 5.85s 0:48m 6:56m 1:02h 9:45h 4d 7h
RTX 3060 Ti FE 0.10s 1.26s 10.18s 1:23m 12:10m 1:49h 17:50h 7d 2h
GTX 1650 Ti 0.40s 3.62s 29.08s 4:02m 35:21m not measured not measured not measured
Intel UHD 770 4.71s 32.98s 4:18m 36:13m not measured not measured not measured not measured
RX 6650 XT 0.28s 2.00s 16.60s 2:13m 19:14m 3:03h not measured not measured

CPUs

Board size N 16 17 18 19 20 21 22
i5 - 12600k single 1.12s 7.04s 49.92s 6:21m 57:47m not measured not measured
i5 - 12600k multi 0.203s 0.79s 4.91s 37.1s 4:59m 42:20m 6:09h
i5 - 9300h single 1.32s 8.95s 1:05m 8:20m 1:10h not measured not measured
i5 - 9300h multi 0.25s 1.75s 12.5s 1:35m 13:05m 1:52h 16:18h
Ryzen 5800X single 0.91s 6.09s 44.3s 5:38m 45:24m mot measured not measured
Ryzen 5800X multi 0.28s 0.70s 4.06s 30.3s 4:04m 33:53m not measured

Single stands for single-threaded and multi for Multi-threaded. The CPUs and the GPUs are used with stock settings.

Attention: Your graphics card may go into another power state when running the program. To check this and to avoid this, you can use a tool such as "nvidiainfo".

5 Usage

Show the general help message by using nqueensfaf-cli -hand the device specific help messages by using either nqueensfaf-cli -n 20 gpu -h or nqueensfaf-cli -n 20 cpu -h.
If you just want to get started maybe read 5.2 Extended Explanation with Examples first.

5.1 Compact Explanation

The command format reads as follows:
nqueensfaf-cli [-u=<update-interval>] [-s=<auto-save-interval>] (-n=<N> | -r=<path-to-save-file>) (cpu | gpu) [<extra device options>] [-h] The symbol "|" means that either the first or second option (exclusively, not both) can be specified.

Explanation of the Options:

  • -s=<value> ⟶ auto-save interval as a decimal, for example -s=0.05 for auto-saving in 5% intervals
  • -u=<value> ⟶ update time and solution and progress after <value> milliseconds
  • -n=<N> ⟶ substitute the board size for starting a new computation OR
  • -r=<path-to-save-file> ⟶ path to a save-file to continue a computation from the last checkpoint, for example ./20-queens.faf
  • cpu | gpu ⟶ write cpu for choosing cpu and gpu for choosing gpu (device specific options see below)
  • -h ⟶ print device specific help message
    NOTE: You must enable auto-saving again each time you resume from a save-file.

Device options for the CPU: nqueensfaf [...] 20 cpu [-t=<threadcount>] [-p=<pre-queens>] [-h]

  • -t=<value> ⟶ use <value> threads
  • -p=<value> ⟶ default is 6. A higher number means more but smaller tasks by setting additional queens before sending to the solver device. Most of the time 6 is the best option.
  • -h ⟶ print CPU specific help message

Device options for GPUs: nqueensfaf [...] 20 gpu [-p=<pre-queens>] [-h]

  • -h ⟶ print GPU specific help message

5.2 Extended Explanation with Examples

Depending on your way of installation you start the command with

  • nqueensfaf-cli (Windows)
  • ./nqueensfaf-cli (Linux and Mac)
  • java -jar nqueensfaf-cli.jar (Java). Here we always use nqueensfaf-cli.
    The board size (N) and the device (cpu or gpu) must always be specified.

Explanation for CPU

  • N=16 on CPU with 1 thread
    nqueensfaf-cli -n=16 cpu
  • N=18 on CPU with 8 threads
    nqueensfaf-cli -n=18 cpu -t=8
  • N=20 with 8 threads and auto-saves in 5% steps
    nqueensfaf-cli -n=20 -s=0.05 cpu -t=8
  • continue the solution of the 20 queens problem from the save-file 20-queens.faf
    nqueensfaf-cli -s=0.05 -r=./20-queens.faf cpu -t=8

Explanation for GPUs

  • compute N=20 on the default GPU
    nqueensfaf-cli -n=20 gpu

Selecting GPUs

When choosing GPU mode, you will see a list of all available GPUs provided with indices. You can select which GPUs should be used by entering their indices, separated by commata. For example: 0,1,3 if you have minimum 4 GPUs available and just don't want to use the third one.

Multiple GPU option flags can be set, separated by :. Possible flags are

  • ws ⟶ workgroup size on the GPU, standard option 64 is best for NVIDIA GPUs. Only set it to 24 for integrated Intel GPUs. (also automatically set)
  • bm ⟶ represents the benchmark and is required, but only takes effect if multiple GPUs are used, each one with its own benchmark score. A lower score shifts more work towards a GPU.
    NOTE: A good way to choose the bm value is to solve the same board size with all wanted GPUs and use the rounded time as the benchmark value.

Some Examples:

  • For GPU with index 0 (default GPU) with the workgroup size 128, use
    0:ws128
  • In case you have 3 GPUs and all should contribute equally, use
    0,1,2
  • In case you have 2 GPUs with different performance and the one with index 0 should get twice as much work as the other one, use
    0:bm1, 1:bm2

5.3 Java usage

CpuSolver cs = new CpuSolver();
cs.onInit(() -> System.out.println("Starting Solver for board size " + cs.getN() + "..."))
cs.onFinish(() -> System.out.println("Found " + cs.getSolutions() + " solutions in " + cs.getDuration() + " ms"))
cs.setN(16)
cs.solve();

GpuSolver gs = new GpuSolver();
List<Gpu> availableGpus = gs.getAvailableGpus();
gs.gpuSelection().add(availableGpus.get(0).getId());
gs.setN(18);
gs.solve();

5.4 Implement your own algorithm

The abstract class Solver provides a good structure and handy features for your own N Queens Problem solution algorithm. Just extend it and fill the abstract methods with your code.
The method names are self explanatory.

6 References

This solution is based on three ideas, especially the first two:

7 Contact

If you have a comment, question, idea or whatever, we will be happy to answer! Mail: [email protected]